Re: [GENERAL] Preventing OOM kills
On 05/25/2011 03:01 AM, John R Pierce wrote:
> On 05/24/11 5:50 PM, Andrej wrote:
>> Add more RAM? Look at tunables for other processes on the machine? At the
>> end of the day, making the kernel shoot anything out of despair shouldn't
>> be the done thing.
> somehow, 'real' unix has neither an OOM killer nor does it flat out die
> under heavy loads, it just degrades gracefully. I've seen Solaris and AIX
> and BSD servers happily chugging along with load factors in the 100s,
> significant portions of memory paging, etc, without completely crumbling
> to a halt. Sometimes I wonder why Linux even pretends to support virtual
> memory, as you sure don't want it to be paging.

http://developers.sun.com/solaris/articles/subprocess/subprocess.html

"Some operating systems (such as Linux, IBM AIX, and HP-UX) have a feature called memory overcommit (also known as lazy swap allocation). In a memory overcommit mode, malloc() does not reserve swap space and always returns a non-NULL pointer, regardless of whether there is enough VM on the system to support it or not. The memory overcommit feature has advantages and disadvantages."

(the page goes on with some interesting info) [*]

It appears that, by your definition, neither Linux, AIX nor HP-UX is 'real' Unix. Oh, wait, FreeBSD overcommits too, so it can't be 'real' either. /me wonders now what a 'real' Unix is. :) It must be something related to 'true' SysV derivatives. If memory serves me well, that's where the word 'thrashing' originated, right? Actually, in my experience nothing 'thrashes' better than a SysV, Solaris included.

The solution to the OP's problem is to keep the system from reaching the OOM state in the first place. That is necessary even with overcommitting turned off: PG not performing its job because malloc() keeps failing isn't really a solution.

.TM.

[*] One missing piece is that overcommitting actually prevents or delays the OOM state. The article does mention that system memory can be "used more flexibly and efficiently" without really elaborating further. It means that, given the same amount of memory (RAM+swap), a non-overcommitting system reaches OOM well before an overcommitting one does. Also, it is rarely a good idea, when running low on memory, to switch to an allocation policy that is _less_ efficient, memory-wise.
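P.S. Overcommit is easy to watch in action from Python. A minimal sketch, assuming a 64-bit Linux box (the sizes are arbitrary, and mode 0 behaviour is a heuristic, hence "usually"):

import mmap

# Untouched anonymous mappings consume no physical pages, so with the
# default vm.overcommit_memory=0 the kernel usually keeps saying yes well
# past RAM+swap; with strict accounting (vm.overcommit_memory=2) it starts
# refusing once the commit limit is reached.
GB = 1024 ** 3
maps = []
try:
    for i in range(64):
        maps.append(mmap.mmap(-1, GB))   # reserve, but never write
        print("reserved %d GB of address space" % (i + 1))
except (ValueError, OSError) as e:
    print("allocation refused: %s" % e)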
Re: [GENERAL] Pl/Python error when import from __future__
On 03/31/2011 09:58 PM, Davi Duarte wrote:
> Hello, I'm using PL/Python in PostgreSQL 9.0.3 and Python 2.6.5. I want to use a feature of Python 3: Python has an option to import a feature from a future version of Python. In my case, I want Python 3 division, because it returns a float when dividing integers by default, while in Python 2.x it only returns an integer, so I need the Python 3 division. To import a feature from a future version, simply:
>
> from __future__ import modulename
>
> In my case I'm importing division:
>
> from __future__ import division
>
> Python requires that the import be on the first line of code, and that is my problem, e.g. when running the function below:
>
> CREATE OR REPLACE FUNCTION test() RETURNS text AS $$
> from __future__ import division
> a = 8/5
> return eval(a)
> $$ LANGUAGE plpython2u;
>
> it returns the following error:
>
> SyntaxError: from __future__ imports must occur at the beginning of the file
>
> But "from __future__ ..." is on the first line of the code. Has anyone had this error? What may it be? A bug in PL/Python? How can I fix this error?
> Thanks, Davi Duarte.

AFAIK, a CREATE FUNCTION in plpython2u adds a python function definition right before the code you provide (which is the function body, or 'suite' in python parlance):

def function_name(parameter_list):
    your_suite

(it does some other mangling to correct indentation, BTW). As far as the python interpreter is concerned, the 'from __future__ import ...' statement is not at the beginning of the program.

I started looking into this some years ago, but at the time the only solution I managed to design was using some internal interpreter API, unsupported and subject to change. My idea was to mangle the parse tree, not the program source: in theory it would be possible to spot the 'from __future__ import ...' statement and move it to the top. That also avoids any problem with indentation. At the time there was no proper way to do that, tho. I'm afraid there's no solution for your problem at the moment.

Well, a workaround is to multiply one of the division arguments by 1.0, effectively casting it to float. Note this isn't the same as applying float() to the result, which would already have been truncated by then:

>>> a = 1.0*8/5
>>> a
1.6

.TM.
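P.S. You can reproduce the error outside PostgreSQL. A minimal sketch of the wrapping (the generated function name here is made up; the real one is a PL/Python internal detail):

body = "from __future__ import division\nreturn 8/5\n"

# indent the suite one level and wrap it in a def, roughly what
# plpython2u does with the function body you supply
src = "def __plpython_func():\n" + "".join(
    "    " + line + "\n" for line in body.splitlines())

compile(src, "<plpython>", "exec")
# SyntaxError: from __future__ imports must occur at the beginning of the file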
Re: [GENERAL] pg_restore
On 03/31/2011 01:08 AM, Mike Orr wrote:
> That might be a better solution. I was hoping to use the same pg_dump file for this that I also use for routine offline backup, but maybe this is such a special case that a separate dump file would be better.

Why don't you post the exact mysqldump/mysql commands you are using? It would be easier to provide equivalent pg_dump/psql commands. Meanwhile, have a look at the pg_restore man page, the -t and/or -L options. Maybe -c as well. Oh, and -1.

.TM.
Re: [GENERAL] Linux x Windows LOCALE/ENCODING compatibility
On 11/09/2010 02:31 AM, Carlos Henrique Reimer wrote:
> Hi, I'm currently in the process of moving the data from the Windows server to the new Linux box but facing some problems with the encoding. Additional configuration information: Windows is running PG 8.3 and the new Linux box is PG 8.4.
>
> Windows dump command: pg_dump -U postgres -Fc -v -f f:\backup
> Linux restore command: pg_restore -v -n brasil -d mapas /backup
>
> pg_restore error:
>
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 3574; 0 40805 TABLE DATA cidade postgres
> pg_restore: [archiver (db)] COPY failed: ERROR: character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
> CONTEXT: COPY cidade, line 6
>
> I also tried to dump using pg_dump -E UTF8 but then I got:
>
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
> pg_dump: The command was: COPY brasil.cidade (gid, municpio, municpi0, uf, longitude, latitude, the_geom) TO stdout;
> pg_dump: *** aborted because of error
>
> How can I fix this error?

Well, that's pretty much evidence that the text you have on Windows is NOT win1252-encoded. Or some parts of it, at least. According to this page:

http://en.wikipedia.org/wiki/Windows-1252

0x81 is undefined in win1252.

Please note that if the old DB has SQL_ASCII encoding, you may have mixed-encoding text in the same database (either on purpose or by mistake), and you have to either keep using SQL_ASCII in the new DB (which means keeping potentially invalid data around), or do a proper conversion to UTF-8 (which possibly has to be done differently for different tables, or even for different rows in the same table) BEFORE importing it into the new DB.

To convert from one encoding to another, you can use, surprisingly, the 'convert' function. The tricky part is that once you have text of unknown encoding, you can only guess. That is, try to convert it from some reasonable encodings to UTF-8, and look at the result. If it looks right, that may be the right encoding. :)

BTW, 0x81 is a rather weird char, it's invalid in many common encodings. It's valid in win-1251, where it represents U+0403 (CYRILLIC CAPITAL LETTER GJE), Ѓ (if you don't have the font, you won't be able to see it), but I think it's an unlikely candidate. In both CP437 and CP850 (old DOS encodings for western languages) it represents U+00FC (LATIN SMALL LETTER U WITH DIAERESIS), ü (again, I hope you can see it). That's a better candidate: it's possible someone accessed the DB, either directly or via a web application, from a Windows ME or older system.

Unfortunately, AFAIK, PostgreSQL doesn't support the cp850 encoding. Maybe you can dump the data (even a single table) in text format and use a text editor (or shell utils like perl, sed, tr) to replace the offending characters with their proper win-1252 counterparts (e.g., 0x81 can be replaced with 0xfc, which is u with diaeresis in win-1252).

I hope it helps.

.TM.
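P.S. The guessing game is easy to script. A minimal sketch in Python, trying just the candidate encodings named above:

raw = b"\x81"
for enc in ("cp1252", "cp1251", "cp850", "utf-8"):
    try:
        print("%-7s -> %r" % (enc, raw.decode(enc)))
    except UnicodeDecodeError:
        print("%-7s -> invalid" % enc)

# and the byte-level fix suggested above, applied to a text dump:
# fixed = dump_bytes.replace(b"\x81", b"\xfc")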
Re: [GENERAL] Linux
On 11/04/2010 04:00 PM, Michael Gould wrote:
> I know that this is probably a religious issue but we are looking to move Postgres to a Linux server. We currently have a Windows 2008 R2 active directory and all of the other servers are virtualized via VMWare ESXi. One of the reasons is that we want to use a 64 bit Postgres server and the UUID processing contrib module does not provide a 64 bit version for Windows. I would also assume that the database, when properly tuned, will probably run faster in a *nix environment. What and why should I look at certain distributions? It appears from what I read, Ubuntu is a good desktop but not a server.
> Best Regards

Just find one that ships with the latest PG, to save you some work. Unless you plan to compile and install PG manually; in that case, any major distribution would do. For production use, how long your version will be supported (security updates) is likely to be the most important item on your checklist. I use CentOS.

.TM.
Re: [GENERAL] MySQL versus Postgres
On 12/08/2010 03:43, Tom Lane wrote:
> Marco Colombo <pg...@esiway.net> writes:
>> It's a matter of correctness: I see PG as a high performance database system. Allowing it to start in awfully suboptimal conditions is no different from allowing '0000-00-00' as a date: it may give you the idea you did the right thing, but most of the time you didn't.
> Unfortunately, there are quite a few of us for whom "correctness" doesn't mean "automatically try to eat all the resources available". Your view of what is useful behavior is far too narrow-minded ...
> regards, tom lane

Well, my idea was more along the lines of "automatically try to acquire a reasonable amount of the available resources". You know, you don't jump directly from 0.1% to 100%. There's a lot in between. In medio stat virtus.

.TM.
Re: [GENERAL] MySQL versus Postgres
On 11/08/2010 04:01, Greg Smith wrote:
>> 3. The default configuration settings for PostgreSQL are not optimal for performance. Can there be a recommended configuration file in the installation (assuming a certain amount of RAM and processor type)?
> This doesn't work because there are many different types of database applications, and what's optimal even as a starting configuration for each type is very different. Also, hardware changes pretty fast; you'd be hard pressed to write down useful generic recommendations (or insert them into the core database code) that are still relevant at all after a release has been out a few years.

Well, many defaults are hardcoded into a file now. I'd like to see 'auto' among the possible values of the parameters, e.g.:

max_connections = auto
shared_buffers = auto
work_mem = auto

with PG wild-guessing reasonable values based on the system specs. It may be an awful piece of code (getting system info is very platform-specific), and sometimes the guess may be wrong. Anyway, nothing prevents PG from having a postgresql_failsafe.conf. Not that I'm advocating it. Complex systems need well-thought-out configuration.

.TM.
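P.S. Just to give an idea of the kind of guessing involved, a toy sketch (Linux-only, which is exactly the portability problem; the 25%-of-RAM figure is only the usual rule of thumb):

def mem_total_kb():
    # MemTotal is reported in kB in /proc/meminfo on Linux
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise RuntimeError("no MemTotal line found")

print("shared_buffers = %dMB" % (mem_total_kb() // 4 // 1024))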
Re: [GENERAL] MySQL versus Postgres
On 11/08/2010 17:34, Greg Smith wrote:
> The problem here is that the amount of shared memory a system can allocate is hard to discover any other way than starting the server and seeing if it works. So doing what you advise will leave the database unable to start on any system that hasn't gotten the right OS kernel tweaks done first.

Well, is that true under Windows, too? I think we need to cover Windows here. Under Unix, having PostgreSQL start correctly is a concern of the distribution vendor. Even if the guessing isn't bullet-proof, the vendor either knows how to configure the kernel to make the 'auto' thing work, or is able to provide its own postgresql.conf. Sure, there are people who download and compile, but I don't think they are afraid of editing postgresql.conf should the server fail to start.

Also, I'd say this is a case where it's much better to fail with a message like "listen buddy, your server has 64GB of RAM installed but your kernel is configured for 20MB of shared memory only, you should really increase it", rather than start successfully but with very poor performance.

It's a matter of correctness: I see PG as a high performance database system. Allowing it to start in awfully suboptimal conditions is no different from allowing '0000-00-00' as a date: it may give you the idea you did the right thing, but most of the time you didn't.

.TM.
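P.S. The check itself would be trivial, at least on Linux. A sketch (paths are Linux-specific, which is exactly the portability objection; SysV shared memory is what PG of this era allocates its buffers from):

# compare the SysV shared memory limit against installed RAM
shmmax = int(open("/proc/sys/kernel/shmmax").read())
mem_kb = int(open("/proc/meminfo").readline().split()[1])  # MemTotal line
if shmmax < mem_kb * 1024 // 4:
    print("kernel.shmmax is only %d MB on a %d MB box, consider raising it"
          % (shmmax // 2**20, mem_kb // 1024))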
Re: [GENERAL] solaris slow
On 02/08/2010 21:14, John R Pierce wrote:
> Another factor, if your linux system was using LVM (it's the default storage configuration on many distributions), there's a pretty good chance the device mapper is ignoring write barriers, which greatly speeds up random writes at the expense of reliable commits.

A lot has been discussed about how well LVM plays with PostgreSQL. But for sure, write barriers are not related to commits. A reliable commit is about disks synchronously flushing their caches to the platters (and of course, before that, the OS cache being flushed to the disk cache, which is the easy part). In ACID terms, write barriers help with C, not with D.

Actually, a write barrier is kind of the opposite of a flush: it's a "don't flush this before that". It enforces an order on writes, but doesn't enforce their immediate execution. Of course, if the disk _doesn't_ support write barriers, the poor man's workaround is to enforce a complete flush instead.

AFAIK, there's no POSIX API to request a write barrier at application level. There's only the sync (i.e. flush) related API. And I'm pretty sure PostgreSQL issues the right syncs at the right times.

LVM correctly reports that it doesn't support write barriers. The FS then knows what to do (issue a full flush instead), and no harm is done (because not all disks support wbs, the code to handle that case has to be there in the FS anyway). The FS just runs slower. But it's just as safe as without LVM.

And since PostgreSQL issues syncs (or uses synchronous APIs), there's little point in discussing wbs here, since it's all about flushing. Wbs are issued by the FS in order to _avoid_ a flush: the FS's major concern is C, not D; but when it's the application requesting a flush, the flush can't be avoided, and wbs are almost useless.

.TM.
Re: [GENERAL] Storing HTML: HTML entities being rendered in that raw form
linnewbie wrote:
> On Apr 9, 1:00 pm, st...@blighty.com (Steve Atkins) wrote:
>> On Apr 9, 2009, at 9:27 AM, linnewbie wrote:
>>> Hi all, I have stored HTML in a text field that I subsequently render on the web. However, when I retrieve and render this data on the web, the entities are being rendered in their raw form, i.e. instead of getting the '&' symbol when '&amp;' is stored, I'm getting the raw '&amp;'. I would be grateful if anyone can point out how I can get around this.
>> It's a problem in your code, not the database. You're probably escaping it one time too many or unescaping it one time too few.
>> Cheers, Steve
>
> To clarify, I am not escaping the string in any way. Say the page I am saving to the database is the "about us" page from a company website. First, I make a form to create the "about us" page with a textarea field, then I copy the HTML from my text editor and paste it into this textarea form. I then have a cgi script which takes the contents of the textarea field and stores it in the database. What I have on disk would be:
>
> ... <p>Bonnie &amp; Clyde</p>
>
> which would usually be rendered as:
>
> Bonnie & Clyde
>
> but this is not happening, it's being rendered as:
>
> Bonnie &amp; Clyde

That's because, as someone else suggested, something is quoting the '&'. In order to be rendered as 'Bonnie &amp; Clyde' by the browser, it needs to be 'Bonnie &amp;amp; Clyde' in the HTML (just view the HTML source from inside the browser). You haven't provided any detail on the rendering phase, BTW.

.TM.
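P.S. The double escape is easy to demonstrate. A sketch with Python's stdlib (html.escape is Python 3; old CGI code would use cgi.escape the same way):

from html import escape

stored = "Bonnie &amp; Clyde"   # what's in the text column, already escaped once
print(escape(stored))           # 'Bonnie &amp;amp; Clyde' -- escaped a second
                                # time, which the browser then displays as the
                                # raw 'Bonnie &amp; Clyde'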
Re: [GENERAL] Maximum transaction rate
Markus Wanner wrote:
> Hi,
> Martijn van Oosterhout wrote:
>> And fsync better do what you're asking (how fast is just a performance issue, just as long as it's done).
> Where are we on this issue? I've read all of this thread and the one on the linux-lvm mailing list as well, but still don't feel confident. In the following scenario:
>
> fsync -> filesystem -> physical disk
>
> I'm assuming the filesystem correctly issues a blkdev_issue_flush() on the physical disk upon fsync(), to do what it's told: flush the cache(s) to disk. Further, I'm also assuming the physical disk is flushable (i.e. it correctly implements the blkdev_issue_flush() call). Here we can be pretty certain that fsync works as advertised, I think. The unanswered question to me is: what's happening if I add LVM in between, as follows?
>
> fsync -> filesystem -> device mapper (lvm) -> physical disk(s)
>
> Again, assume the filesystem issues a blkdev_issue_flush() to the lower layer and the physical disks are all flushable (and implement that correctly). How does the device mapper behave? I'd expect it to forward the blkdev_issue_flush() call to all affected devices and only return after the last one has confirmed and completed flushing its caches. Is that the case?
>
> I've also read about the newish write barriers and about filesystems implementing fsync with such write barriers. That seems fishy to me and would of course break in combination with LVM (which doesn't completely support write barriers, AFAIU). However, that's clearly the filesystem side of the story and has not much to do with whether fsync lies on top of LVM or not.
>
> Help in clarifying this issue greatly appreciated.
> Kind Regards
> Markus Wanner

Well, AFAIK, the summary would be:

1) adding LVM to the chain makes no difference;
2) you still need to disable the write-back cache on IDE/SATA disks for fsync() to work properly;
3) without LVM and with the write-back cache enabled, due to current(?) limitations in the Linux kernel, with some journaled filesystems (but not ext3 in data=writeback or data=ordered mode; I'm not sure about data=journal), you may be less vulnerable if you use fsync() (or O_SYNC). "Less vulnerable" means that all pending changes are committed to disk but the very last one.

So:

- write-back cache + ext3 = unsafe
- write-back cache + other fs = (depending on the fs)[*] safer, but not 100% safe
- write-back cache + LVM + any fs = unsafe
- write-thru cache + any fs = safe
- write-thru cache + LVM + any fs = safe

[*] the fs must use (directly or indirectly via journal commit) a write barrier on fsync(). Ext3 doesn't (it does when the inode changes, but that happens only once a second).

If you want both speed and safety, use a battery-backed controller (and write-thru cache on the disks, but the controller should enforce that when you plug the disks in). It's the usual "Fast, Safe, Cheap: choose two".

This is an interesting article:

http://support.microsoft.com/kb/234656/en-us/

note how for all three kinds of disk (IDE/SATA/SCSI) they say: "Disk caching should be disabled in order to use the drive with SQL Server." They don't mention write barriers.

.TM.
Re: [GENERAL] Maximum transaction rate
Martijn van Oosterhout wrote:
> True, but the relative wakeup order of two different processes is not important since by definition they are working on different transactions. As long as the WAL writes for a single transaction (in a single process) are not reordered you're fine.

I'm not totally sure, but I think I understand what you mean here: independent transactions, by definition, don't care about relative ordering.

.TM.
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Marco Colombo wrote:
>> Yes, but we knew it already, didn't we? It's always been like that, with IDE disks and write-back cache enabled, fsync just waits for the disk reporting completion and disks lie about [that]
> I've looked hard, and I have yet to see a disk that lies.

No, "lie" in the sense that they report completion before the data hit the platters. Of course, that's the expected behaviour with write-back caches.

> ext3, OTOH seems to lie.

ext3 simply doesn't know: it interfaces with a block device, which does the caching (OS level) and the reordering (e.g. elevator algorithm). ext3 doesn't directly send commands to the disk, nor does it manage the OS cache. When software RAID and device mapper come into play, you have virtual block devices built on top of other block devices. My home desktop has ext3 on top of a dm device (/dev/mapper/something, a LV set up by LVM in this case), on top of a raid1 device (/dev/mdX), on top of /dev/sdaX and /dev/sdbX, which, in a way, on their own are block devices built on others, /dev/sda and /dev/sdb (you don't actually send commands to partitions, do you? although the mapping "sector offset relative to partition -> real sector on disk" is trivial). Each of these layers potentially caches writes and reorders them; that's the job of a block device, although it makes sense at most only for the last one, the one that controls the disk. Anyway, there isn't much ext3 can do but post wb and flush requests to the block device at the top of the stack.

> IDE drives happily report whether they support write barriers or not, which you can see with the command:
> %hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT

Of course a write barrier is not a cache flush. A flush is synchronous, a write barrier asynchronous. The disk supports flushing, not write barriers. Well, technically, if you can control the ordering of the requests, that's barriers proper. With SCSI you can, IIRC. But a cache flush is, well, a flush.

> Linux kernels since 2005 or so check for this feature. It'll happily tell you which of your devices don't support it.
> %dmesg | grep 'disabling barriers'
> JBD: barrier-based sync failed on md1 - disabling barriers
> And for devices that do, it will happily send IDE FLUSH CACHE commands to IDE drives that support the feature. At the same time Linux kernels started sending the very similar SCSI SYNCHRONIZE CACHE commands.

Anyway, it's the block device's job to control disk caches. A filesystem is just a client to the block device: it posts a flush request, and what happens depends on the block device code. The FS doesn't talk to disks directly. And a write barrier is not a flush request, it's a "please do not reorder" request. On fsync(), ext3 issues a flush request to the block device, that's all it's expected to do.

> But AFAICT ext3 fsync() only tells the block device to flush disk caches if the inode was changed.

No, ext3 posts a write barrier request when the inode changes and it commits the journal, which is not a flush. [*]

> Or, at least empirically if I modify a file and do fsync(fd); on ext3 it does not wait until the disk spun to where it's supposed to spin. But if I put a couple fchmod()'s right before the fsync() it does.

If you were right, and ext3 didn't wait, it would make no difference to have the disk cache enabled or not on fsync. My test shows a 50x speedup when turning the disk cache on. So for sure ext3 is waiting for the block device to report completion. It's the block device that - on flush - doesn't issue a FLUSH command to the disk.

.TM.

[*] A barrier ends up in a FLUSH for the disk, but that doesn't mean it's synchronous, like a real flush. Even journal updates done with barriers don't mean "hit the disk now", they just mean "keep order when writing". If you turn off automatic page cache flushing and you have zero memory pressure, a write request with a barrier may stay forever in the OS cache, at least in theory. Imagine you don't have bdflush and nothing reclaims resources: days of activity may stay in RAM, as far as write barriers are concerned. Now someone types 'sync' as root. The block device starts flushing dirty pages, reordering writes but honoring barriers, that is: it reorders anything up to the first barrier, posts those write requests to the disk, issues a FLUSH command, then waits until the flush is completed. Then it consumes the barrier, and starts processing writes again, reordering them up to the next barrier, and so on. So yes, a barrier turns into a FLUSH command for the disk. But in this scenario, days have passed since the original write/barrier request from the filesystem. Compare with a fsync(). Even in the above scenario, a fsync() should end up in a FLUSH command to the disk, and wait for the request to complete, before awakening the process that issued it. So the filesystem has to request a flush operation from the block device, not a barrier. And so it does. If it turns out that the block device just issues writes
Re: [GENERAL] Maximum transaction rate
Greg Smith wrote:
> On Wed, 18 Mar 2009, Marco Colombo wrote:
>> If you fsync() after each write you want ordered, there can't be any subsequent I/O (unless there are many different processes concurrently writing to the file w/o synchronization).
> Inside PostgreSQL, each of the database backend processes ends up writing blocks to the database disk, if they need to allocate a new buffer and the one they are handed is dirty. You can easily have several of those writing to the same 1GB underlying file on disk. So that prerequisite is there.
>
> The main potential for a problem here would be if a stray unsynchronized write from one of those backends happened in a way that wasn't accounted for by the WAL+checkpoint design.

Wow, that would be quite a bug. That's why I wrote "w/o synchronization". Stray + unaccounted + concurrent smells like the recipe for an explosive to me :)

> What I was suggesting is that the way that synchronization happens in the database provides some defense from running into problems in this area.

I hope it's a full defence. If you have two processes doing write(); fsync(); on the same file at the same time, either there are no ordering requirements, or it will go boom sooner or later... fsync() works inside a single process, but any system call may put the process to sleep, and who knows when it will be awakened and what other processes did to that file meanwhile. I'm pretty confident that PG code protects access to shared resources with synchronization primitives.

Anyway, I was referring to WAL writes... due to the nature of a log, it's hard to think of many unordered writes and of concurrent access w/o synchronization. But inside a critical region there can be more than one single write, and you may need to enforce an order, but no more than that, before the final fsync(). If so, userland-originated barriers instead of full fsync()'s may help with performance. But I'm speculating.

> The way backends handle writes themselves is also why your suggestion about the database being able to utilize barriers isn't really helpful. Those trickle out all the time, and normally you don't even have to care about ordering them. The only time you do need to care is at checkpoint time, and there only a hard line is really practical--all writes up to that point, period. Trying to implement ordered writes for everything that happened before then would complicate the code base, which isn't going to happen for such a platform+filesystem specific feature, one that really doesn't offer much acceleration from the database's perspective.

I don't know the internals of WAL writing, so I can't really reply on that.

>> only when the journal wraps around is there an (extremely) small window of vulnerability. You need to write a carefully crafted torture program to get any chance to observe that... such a program exists, and triggers the problem
> Yeah, I've been following all that. The PostgreSQL WAL design works on ext2 filesystems with no journal at all. Some people even put their pg_xlog directory onto ext2 filesystems for best performance, relying on the WAL to be the journal. As long as fsync is honored correctly, the WAL writes should be re-writing already allocated space, which makes this category of journal mayhem not so much of a problem. But when I read about fsync doing unexpected things, that gets me more concerned.

Well, that's highly dependent on your expectations :) I don't expect a fsync to trigger a journal commit if metadata hasn't changed. That's obviously true for metadata-only journals (like most of them, with the notable exception of ext3 in data=journal mode). Yet, if you're referring to this:

http://article.gmane.org/gmane.linux.file-systems/21373

well, that seems to me the same usual thing/bug: fsync() allows disks to lie when it comes to caching writes. Nothing new under the sun. Barriers don't change much, because they don't replace a flush. They're about consistency, not durability. So even with full barriers support, a fsync implementation needs to end up in a disk cache flush, to be fully compliant with its own semantics.

.TM.
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Marco Colombo wrote:
>> Ron Mayer wrote:
>>> Greg Smith wrote:
>>>> There are some known limitations to Linux fsync that I remain somewhat concerned about, independently of LVM, like "ext3 fsync() only does a journal commit when the inode has changed" (see http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 )
>>> I wonder if there should be an optional fsync mode in postgres that would turn fsync() into fchmod (fd, 0644); fchmod (fd, 0664);
> 'course I meant:
> fchmod (fd, 0644); fchmod (fd, 0664); fsync(fd);
> to work around this issue.
>> Question is... why do you care if the journal is not flushed on fsync? Only the file data blocks need to be, if the inode is unchanged.
> You don't - but ext3 fsync won't even push the file data blocks through a disk cache unless the inode was changed. The point is that ext3 only does the write barrier processing that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI) commands on inode changes, not data changes. And with no FLUSH CACHE or SYNCHRONIZE CACHE the data blocks may sit in the disk's cache after the fsync() as well.

Yes, but we knew it already, didn't we? It's always been like that: with IDE disks and write-back cache enabled, fsync just waits for the disk reporting completion, and disks lie about that. Write barriers enforce ordering: WHEN writes are committed to disk, they will be in order, but that doesn't mean NOW. Ordering is enough for a FS journal, whose only requirement is consistency.

Anyway, it's the block device's job to control disk caches. A filesystem is just a client to the block device: it posts a flush request, and what happens depends on the block device code. The FS doesn't talk to disks directly. And a write barrier is not a flush request, it's a "please do not reorder" request. On fsync(), ext3 issues a flush request to the block device, that's all it's expected to do.

Now, some block devices may implement write barriers by issuing FLUSH commands to the disk, but that's another matter. A FS shouldn't rely on that. You can replace a barrier with a flush (not as efficiently), but not the other way around. If a block device driver issues a FLUSH for a barrier, and doesn't issue a FLUSH for a flush, well, it's a buggy driver, IMHO.

.TM.
Re: [GENERAL] Maximum transaction rate
Martijn van Oosterhout wrote:
> Generally PG uses O_SYNC on open, so it's only one system call, not two. And the file it's writing to is generally preallocated (not always, though).

It has to wait for I/O completion on write(), then: it has to go to sleep. If two different processes do a write(), you don't know which will be awakened first. Preallocation doesn't mean much here, since with O_SYNC you expect a physical write to be done (with the whole sleep/HW interrupt/SW interrupt/awake dance). It's true that you may expect the writes to be carried out in order, and that might be enough. I'm not sure, tho.

>> Well, that's highly dependent on your expectations :) I don't expect a fsync to trigger a journal commit if metadata hasn't changed. That's obviously true for metadata-only journals (like most of them, with the notable exception of ext3 in data=journal mode).
> Really the only thing needed is that the WAL entry reaches disk before the actual data does. AIUI, as long as you have that, the situation is recoverable. Given that the actual data probably won't be written for a while, it'd need to go pretty wonky before you see an issue.

You're giving up Durability here. In a closed system that doesn't mean much, but when you report "payment accepted" to third parties, you can't forget about it later. The requirement you stated is for Consistency only. That's what a journaled FS cares about, i.e. no need for fsck (internal consistency checks) after a crash.

It may be acceptable for a remote standby backup: you replay as much of the WAL as is available after the crash (the part you managed to copy, that is). But you know there can be lost transactions. That may be acceptable or not. Sometimes it's not. Sometimes you must be sure the data is on the platters before you report "committed". Sometimes when you say "fsync!" you mean "I want data flushed to disk NOW, and I really mean it!". :)

.TM.
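P.S. For reference, the one-syscall pattern looks like this. A minimal sketch (the file name is made up, and whether the write is truly durable still depends on the disk cache discussion elsewhere in this thread):

import os

# with O_SYNC, write() itself blocks until the data has been handed to
# stable storage (or to whatever the block device claims is stable)
fd = os.open("walfile", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
os.write(fd, b"wal record")
os.close(fd)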
Re: [GENERAL] Maximum transaction rate
John R Pierce wrote:
> Stefan Kaltenbrunner wrote:
>> So in my understanding LVM is safe on disks that have the write cache disabled, or that behave as if they did (like a controller with a battery backed cache).
> what about drive write caches on battery backed raid controllers? do the controllers ensure the drive cache gets flushed prior to releasing the cached write blocks?

If LVM/dm is lying about fsync(), all this is moot. There's no point talking about disk caches.

BTW, this discussion is continuing on the linux-lvm mailing list:

https://www.redhat.com/archives/linux-lvm/2009-March/msg00025.html

I have some PG databases on LVM systems, so I need to know for sure whether I have to move them elsewhere. It seemed to me the right place for asking about the issue. Someone there pointed out that fsync() is not LVM's responsibility. Correct. For sure, there's an API (or more than one) a filesystem uses to force a flush on the underlying block device, and for sure it has to be called while inside the fsync() system call. So "lying to fsync()" may be more correct than "lying about fsync()".

.TM.
Re: [GENERAL] Maximum transaction rate
Greg Smith wrote:
> On Tue, 17 Mar 2009, Marco Colombo wrote:
>> If LVM/dm is lying about fsync(), all this is moot. There's no point talking about disk caches.
> I decided to run some tests to see what's going on there, and it looks like some of my quick criticism of LVM might not actually be valid--it's only the performance that is problematic, not necessarily the reliability. Appears to support fsync just fine. I tested with kernel 2.6.22, so certainly not before the recent changes to LVM behavior improving this area, but with the bugs around here from earlier kernels squashed (like crummy HPA support circa 2.6.18-2.6.19, see https://launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/82314 )

I've run tests too, you can see them here:

https://www.redhat.com/archives/linux-lvm/2009-March/msg00055.html

in case you're looking for something trivial (a write/fsync loop; a Python version of the same idea is sketched at the end of this message).

> You can do a quick test of fsync rate using sysbench; got the idea from http://www.mysqlperformanceblog.com/2006/05/03/group-commit-and-real-fsync/ (their command has some typos, fixed one below). If fsync is working properly, you'll get something near the RPM rate of the disk. If it's lying, you'll see a much higher number.

Same results: -W1 gives a 50x speedup, so with -W0 it must be waiting for something at the disk level.

[...]

> Based on this test, it looks to me like fsync works fine on LVM. It must be passing that down to the physical disk correctly or I'd still be seeing inflated rates. If you've got a physical disk that lies about fsync, and you put a database on it, you're screwed whether or not you use LVM; nothing different on LVM than in the regular case. A battery-backed caching controller should also handle fsync fine if it turns off the physical disk cache, which most of them do--and, again, you're no more or less exposed to that particular problem with LVM than a regular filesystem.

That was my initial understanding.

> The thing that barriers help out with is that they make it possible to optimize flushing ext3 journal metadata when combined with hard drives that support the appropriate cache flushing mechanism (what hdparm calls FLUSH CACHE EXT; see http://forums.opensuse.org/archives/sls-archives/archives-suse-linux/archives-desktop-environments/379681-barrier-sync.html ). That way you can prioritize flushing just the metadata needed to prevent filesystem corruption while still fully caching less critical regular old writes. In that situation, performance could be greatly improved over turning off caching altogether. However, in the PostgreSQL case, the fsync hammer doesn't appreciate this optimization anyway--all the database writes are going to get forced out by that no matter what before the database considers them reliable. Proper barriers support might be helpful in the case where you're using a database on a shared disk that has other files being written to as well, basically allowing caching on those while forcing the database blocks to physical disk, but that presumes the Linux fsync implementation is more sophisticated than I believe it currently is.

This is the same conclusion I came to. Moreover, once you have barriers passed down to the disks, it would be nice to have a userland API to send them to the kernel. Any application managing a 'journal' or 'log' type of object would benefit from that. I'm not familiar with PG internals, but it's likely you have some records you just want to be ordered, and you could do something like write-barrier-write-barrier-...-fsync instead of write-fsync-write-fsync-... Currently, fsync() (and friends O_SYNC, fdatasync(), O_DSYNC) is the only way to enforce ordering on writes from userland.

> Far as I can tell, the main open question I didn't directly test here is whether LVM does any write reordering that can impact database use because it doesn't handle write barriers properly. According to https://www.redhat.com/archives/linux-lvm/2009-March/msg00026.html it does not, and I never got the impression that was impacted by the LVM layer before. The concern is nicely summarized by the comment from Xman at http://lwn.net/Articles/283161/ : "fsync will block until the outstanding requests have been sync'd to disk, but it doesn't guarantee that subsequent I/O's to the same fd won't potentially also get completed, and potentially ahead of the I/O's submitted prior to the fsync. In fact it can't make such guarantees without functioning barriers."

Sure, but from userland you can't set barriers. If you fsync() after each write you want ordered, there can't be any subsequent I/O (unless there are many different processes concurrently writing to the file w/o synchronization). Since we know LVM does not have functioning barriers, this would seem to be one area where PostgreSQL would be vulnerable. But since ext3 doesn't have barriers turned on by default either (except on some recent SuSE systems), it's not unique to an LVM setup
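P.S. The write/fsync loop mentioned above, roughly, in Python. A sketch (os.pwrite needs Python 3.3+; with an honest fsync on a single 7200rpm disk you'd expect on the order of a hundred loops/sec, while thousands mean something in the stack is acknowledging writes early):

import os
import time

fd = os.open("testfile", os.O_RDWR | os.O_CREAT, 0o644)
n = 200
t0 = time.time()
for i in range(n):
    os.pwrite(fd, b"x", 0)   # rewrite the same byte every time
    os.fsync(fd)             # and force it to stable storage
print("%.0f fsyncs/sec" % (n / (time.time() - t0)))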
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Greg Smith wrote:
>> There are some known limitations to Linux fsync that I remain somewhat concerned about, independently of LVM, like "ext3 fsync() only does a journal commit when the inode has changed" (see http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 ). The way files are preallocated, the PostgreSQL WAL is supposed to function just fine even if you're using fdatasync after WAL writes, which also wouldn't touch the journal (last time I checked fdatasync was implemented as a full fsync on Linux). Since the new ext4 is more
> Indeed it does. I wonder if there should be an optional fsync mode in postgres that would turn fsync() into fchmod (fd, 0644); fchmod (fd, 0664); to work around this issue.

Question is... why do you care if the journal is not flushed on fsync? Only the file data blocks need to be, if the inode is unchanged.

> For example this program below will show one write per disk revolution if you leave the fchmod() in there, and run many times faster (i.e. lying) if you remove it. This with ext3 on a standard IDE drive with the write cache enabled, and no LVM or anything between them.
>
> ==
> /*
> ** based on http://article.gmane.org/gmane.linux.file-systems/21373
> ** http://thread.gmane.org/gmane.linux.kernel/646040
> */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
>     if (argc < 2) {
>         printf("usage: fs <filename>\n");
>         exit(1);
>     }
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
>     int i;
>     for (i = 0; i < 100; i++) {
>         char byte;
>         pwrite(fd, &byte, 1, 0);
>         fchmod(fd, 0644);
>         fchmod(fd, 0664);
>         fsync(fd);
>     }
> }
> ==

I ran the program above, w/o the fchmod()s:

$ time ./test2 testfile
real    0m0.056s
user    0m0.001s
sys     0m0.008s

This is with ext3+LVM+raid1+sata disks with hdparm -W1. With -W0 I get:

$ time ./test2 testfile
real    0m1.014s
user    0m0.000s
sys     0m0.008s

Big difference. The fsync() there does its job. The same program runs with a 3x slowdown with the fchmod()s, but that's expected: it's doing twice the writes, and in different places.

.TM.
Re: [GENERAL] Maximum transaction rate
Joshua D. Drake wrote:
> I understand but disabling cache is not an option for anyone I know. So I need to know the other :)
> Joshua D. Drake

Come on, how many people/organizations do you know who really need 30+ MB/s of sustained write throughput in the disk subsystem but can't afford a battery-backed controller at the same time? Something must take care of writing data from the disk cache to permanent storage; write-thru caches, battery-backed controllers and write barriers are all alternatives, choose the one you like most.

The problem here is fsync(). We know that not fsync()'ing gives you a big performance boost, but that's not the point. I want to choose, and I want a true fsync() when I ask for one. Because if the data doesn't make it to the disk cache, the whole point about wt, bb and wb is moot.

.TM.
Re: [GENERAL] Maximum transaction rate
Joshua D. Drake wrote:
> On Sat, 2009-03-14 at 05:25 +0100, Marco Colombo wrote:
>> Also see: http://lkml.org/lkml/2008/2/26/41 but it seems to me that all this discussion is under the assumption that disks have write-back caches. "The alternative is to disable the disk write cache." says it all.
> If this applies to raid-based caches as well then performance is going to completely tank. For users of Linux + PostgreSQL using LVM.
> Joshua D. Drake

Yet that's not the point. The point is safety. I may have a lightly loaded database, with a low write rate, but I still want it to be reliable. I just want to know whether disabling the caches makes it reliable or not. People on lkml seem to think it does. And it seems to me they may have a point. fsync() is a flush operation on the block device, not a write barrier. LVM doesn't pass write barriers down, but that doesn't mean it doesn't perform a flush when requested to.

.TM.
Re: [GENERAL] Maximum transaction rate
Scott Marlowe wrote:
> On Fri, Mar 6, 2009 at 2:22 PM, Ben Chobot be...@silentmedia.com wrote:
>> On Fri, 6 Mar 2009, Greg Smith wrote:
>>> On Fri, 6 Mar 2009, Tom Lane wrote:
>>>> Otherwise you need to reconfigure your drive to not cache writes. I forget the incantation for that but it's in the PG list archives.
>>> There's a discussion of this in the docs now, http://www.postgresql.org/docs/8.3/interactive/wal-reliability.html
>> How does turning off write caching on the disk stop the problem with LVM? It still seems like you have to get the data out of the OS buffer, and if fsync() doesn't do that for you
> I think he was saying otherwise (if you're not using LVM and you still have this super high transaction rate) you'll need to turn off the drive's write caches. I kinda wondered at it for a second too.

And I'm still wondering. The problem with LVM, AFAIK, is missing support for write barriers. Once you disable the write-back cache on the disk, you no longer need write barriers. So I'm missing something: what else does LVM do to break fsync()? It was my understanding that disabling disk caches was enough.

.TM.
Re: [GENERAL] Maximum transaction rate
Tom Lane wrote:
> Marco Colombo pg...@esiway.net writes:
>> And I'm still wondering. The problem with LVM, AFAIK, is missing support for write barriers. Once you disable the write-back cache on the disk, you no longer need write barriers. So I'm missing something, what else does LVM do to break fsync()?
> I think you're imagining that the disk hardware is the only source of write reordering, which isn't the case ... various layers in the kernel can reorder operations before they get sent to the disk.
> regards, tom lane

You mean some layer (LVM) is lying about the fsync()?

write(A); fsync();
...
write(B); fsync();
...
write(C); fsync();

You mean that the process may be awakened after the first fsync() while A is still somewhere in OS buffers and not sent to disk yet, so it's possible that B gets to the disk BEFORE A? And if the system crashes, A never hits the platters while B (possibly) does. Is that what you mean by write reordering?

But doesn't this break any application with transactional-like behavior, such as sendmail? The problem being third parties: if sendmail declares "ok, I saved the message" to the SMTP client (*after* a fsync()), it's actually lying, 'cause the message hasn't hit the platters yet. The same applies to an IMAP/POP server, say. Well, it applies to anything using fsync(). I mean, all this with disk caches in write-thru mode? It's the OS lying, not the disks? Wait, this breaks all journaled FSes as well: a DM device is just a block device to them, and if it's lying about synchronous writes, the whole purpose of the journal is defeated... I find it hard to believe, I have to say.

.TM.
Re: [GENERAL] Maximum transaction rate
Scott Marlowe wrote:
> On Fri, Mar 13, 2009 at 1:09 PM, Christophe x...@thebuild.com wrote:
>> So, if the software calls fsync, but fsync doesn't actually push the data to the controller, you are still at risk... right?
> Ding!

I've been doing some googling, and now I'm not sure that not supporting barriers implies not supporting (or lying about) blkdev_issue_flush(). It seems that it's pretty common (and well-defined) for block devices to report -EOPNOTSUPP for BIO_RW_BARRIER requests. device mapper apparently falls into this category. See:

http://lkml.org/lkml/2007/5/25/71

this is an interesting discussion on barriers and flushing. It seems to me that PostgreSQL needs both ordered and synchronous writes, maybe at different times (not that EVERY write must be both ordered and synchronous). You can emulate "ordered" with "single+synchronous", although at a price. You can't emulate synchronous writes with just barriers.

OPTIMAL: write-barrier-write-barrier-write-barrier-flush
SUBOPTIMAL: write-flush-write-flush-write-flush

As I understand it, fsync() should always issue a real flush: it's unrelated to the barriers issue. There's no API to issue ordered writes (or barriers) at user level, AFAIK. (Uhm... O_DIRECT, maybe, implies that?) FS code may internally issue barrier requests to the block device, for its own purposes (e.g. journal updates), but there's no userland API for that.

Yet, there's no reference to DM not supporting flush correctly in the whole thread... actually there are references to the opposite: DM devices are defined as FLUSHABLE.

Also see: http://lkml.org/lkml/2008/2/26/41 but it seems to me that all this discussion is under the assumption that disks have write-back caches. "The alternative is to disable the disk write cache." says it all.

.TM.
Re: [GENERAL] php4 and postgresql 8.3
Tom Lane wrote:
> shadrack shadke...@hotmail.com writes:
>> My basic question is... are php4 and postgresql 8.3 compatible? I'm running Linux Redhat 3.4.6, php4.3.9, and postgresql 8.3. I know some of those versions are old... it's government, and I unfortunately don't have control over the version.
> Er ... Red Hat *what*? I don't think they ever used such a version number. If they did it was a very long time ago (for calibration, they were just about to release RHL 7.3 when I joined the company, in 2001). You could probably scare the powers-that-be into letting you update the thing just on the grounds of it still being full of Y2K bugs.

Well, the first Red Hat Linux I used was 4.1 (I still remember the code name, Vanderbilt). And that was '97, I think. So if you're *really* thinking of RHL 3-something, Wikipedia says this about early RH revision history:

# 1.0 (Mother's Day), November 3, 1994 (Linux 1.2.8)
# 1.1 (Mother's Day+0.1), August 1, 1995 (Linux 1.2.11)
# 2.0, September 20, 1995 (Linux 1.2.13-2)
# 2.1, November 23, 1995 (Linux 1.2.13)
# 3.0.3 (Picasso), May 1, 1996 - first release supporting DEC Alpha
# 4.0 (Colgate), October 3, 1996 (Linux 2.0.18) - first release supporting SPARC
# 4.1 (Vanderbilt), February 3, 1997 (Linux 2.0.27)

which reminds me, I used Colgate a bit too. Back in those years I also used Postgres95 (but on Slackware, I suspect). Anyway, if I'm not mistaken, the birth of PostgreSQL proper (6.0, I think) was at that time, '96.

The OP must be referring to RHEL 3-something; it can't be RHL 3. He'd have a tyrannosaur in his backyard, and he's crying for help because it's got a broken leg. "We'll go send pictures!" Quite seriously, I'd really like to see a picture of a running RHL 3 still in production. Maybe even the folks at RH wouldn't mind one. Too bad uptime wouldn't show more than 497 days...

.TM.
Re: [GENERAL] Good Delimiter for copy command
Tom Lane wrote:
> Andrew Gould andrewlylego...@gmail.com writes:
>> To the list: Does pg_dump escape characters that are the same as the delimiter?
> Yes. The OP has not actually explained why he needs to pick a nondefault delimiter, unless maybe it is that he wants to feed the dump to some program that is too dumb to deal with escaping.
> regards, tom lane

Which makes me wonder, does COPY accept UTF-8 input? Is it possible to use as delimiter some Unicode character which is unlikely to appear in the data set? Something like U+FFFC. Here I'm using U+25CF, for it's much more likely you can see it (it's a black circle).

$ python -c 'print u";".join("ABCD").encode("utf8")' > 1.txt
$ cat 1.txt
A;B;C;D
$ python -c 'print u"\u25cf".join("ABCD").encode("utf8")' > 2.txt
$ cat 2.txt
A●B●C●D
$ psql -c '\d test'
     Table "public.test"
 Column | Type | Modifiers
--------+------+-----------
 a      | text |
 b      | text |
 c      | text |
 d      | text |

$ psql -c "\copy test from '1.txt' delimiter ';'"
$ psql -c "\copy test from '2.txt' delimiter '●'"
ERROR:  COPY delimiter must be a single character
\copy: ERROR:  COPY delimiter must be a single character

It doesn't seem to work, but I need to stress that I'm using

$ psql --version
psql (PostgreSQL) 8.2.9

.TM.
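P.S. Presumably the problem is that U+25CF is not a single byte in UTF-8, while ';' is, and the delimiter check counts bytes. A quick check:

>>> len(u"\u25cf".encode("utf8"))
3
>>> len(u";".encode("utf8"))
1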
Re: [GENERAL] NATURAL JOINs
Scott Marlowe wrote:
> On Wed, Oct 15, 2008 at 10:17 AM, Richard Broersma [EMAIL PROTECTED] wrote:
>> On Wed, Oct 15, 2008 at 9:03 AM, Reg Me Please [EMAIL PROTECTED] wrote:
>>> Both are perfectly right, indeed. Nonetheless, in my opinion a NATURAL JOIN exploiting the FKs instead of the column names would be much more helpful and much less error prone! As far as I know there is no way to exploit FKs in JOINs, right?
>> Yes, AFAIK. This would make postgresql's implementation of natural join violate the SQL standard. Perhaps you could propose an UNNATURAL JOIN syntax extension. ;)
> Or a VERY VERY NATURAL JOIN syntax? :)

Yeah, while we're at it, why don't we ask Tom to implement the DO_THE_QUERY_I_WANT_BUT_CANT_EXPRESS_IN_SQL syntax? That would solve a whole class of problems the newbie DBAs face every day. :)

Seriously, I find NATURAL joins useful. While I understand the "design queries resilient to schema changes" argument ("explicit is better than implicit", in the Zen of Python), I think if you design your schema so that you can use NATURAL joins, you've done good (I like using the same identifier for the same object, especially pkeys, when used in different places, e.g. as foreign keys in another table, and conversely never using the same name for different objects). Breaking NATURAL joins acts like a safeguard for when you've done something bad to your schema.

Let's not forget SQL serves two purposes... embedding queries in applications (then, yes, those queries should be as resilient as possible) and querying a db interactively. That's why it has been designed with a syntax resembling natural language. When used interactively, natural joins rock. Less typing and less thinking about column names.

As for the OP's request, my answer is just "give FKs the same name as the PKs they refer to AND never use the same name for columns in different tables if they're not the same thing", which seems both... hmm, natural, to me.

.TM.
Re: [GENERAL] Starting PostgreSQL
admin wrote:
> Sorry folks, a perennial one I'm sure ... I have read the manual and Googled for a couple of hours but still can't connect to PostgreSQL 8.3.4 (the PGDG RPMs running on an up to date CentOS 5.2). I continually get this message:
>
> psql: could not connect to server: No such file or directory
> Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.0"?
>
> Yes, the server is running, as 'ps -aux' and 'netstat -l' and 'service postgresql status' all confirm.

Do you mean you have something like this in your netstat -l?

unix  2      [ ACC ]     STREAM     LISTENING     12587  /tmp/.s.PGSQL.5432

Note, this is on a Linux box with postgresql in standard configuration. Just look at the port number embedded in the socket name. I don't really think you can run a process on port 0. I think your psql is looking for the wrong socket. Try:

$ psql -p 5432 ...

If you don't see any unix socket for PG (I don't even think that's possible), then you need to use IP sockets:

$ psql -p 5432 -h localhost ...

> service postgresql start/stop/restart works without errors
> pg_ctl start/stop/restart works without errors
> There is no socket file in /tmp.

Oops, sorry, I missed this. Well, double check with netstat, but it's possible your PG is not configured for Unix sockets... even if I wouldn't know how to do that. I just checked a CentOS 5.2 running PG and there it is:

$ ls -l /tmp/.s.PGSQL.5432
srwxrwxrwx 1 postgres postgres 0 Oct 13 01:22 /tmp/.s.PGSQL.5432

.TM.
Re: [GENERAL] Frustrated...pg_dump/restore
Jeff Amiel wrote: Ahhh *looks at encoding* Well.. they are both the same... BUT... they are set to ENCODING = 'SQL_ASCII'; That explains a lot... they should probably be set to Unicode UTF8. Duh. Any way to change encoding without dumping/restoring database? You can change client encoding any time with the PGCLIENTENCODING environment variable. AFAIK, there's no way to change the encoding of a database: it's set at creation time. But I think SQL_ASCII makes it less picky about the input, so that might not be the source of your problem. You should look at the errors you see _before_ the invalid command \N. I suspect a slight schema mismatch... that could cause a COPY to fail, while an INSERT might still work. How did you create the 'schema-only database'? With a pg_dump --schema-only or with a different SQL script? You may also try and pg_dump --schema-only both databases and diff the output. .TM. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
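[A minimal sketch of setting the client encoding per session, the same knob PGCLIENTENCODING controls, using psycopg2; the database name "mydb" is a made-up example, not from the thread:]

    import psycopg2

    conn = psycopg2.connect(dbname="mydb")   # hypothetical database name
    conn.set_client_encoding("UTF8")         # same effect as PGCLIENTENCODING=UTF8
    cur = conn.cursor()
    cur.execute("SHOW client_encoding")
    print(cur.fetchone()[0])                 # -> UTF8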
Re: [GENERAL] Dumping/Restoring with constraints?
Phoenix Kiula wrote: Thanks Andrew. On the server (the DB to be dumped) everything is UTF8. On my home server (where I would like to mirror the DB), this is the output:

    =# \l
            List of databases
       Name    |     Owner      | Encoding
    -----------+----------------+-----------
     postgres  | postgres       | SQL_ASCII
     pkiula    | pkiula_pkiula  | UTF8
     template0 | postgres       | SQL_ASCII
     template1 | postgres       | SQL_ASCII
    (4 rows)

This is a fresh install as you can see. The database into which I am importing (pkiula) is in fact listed as UTF8! Is this not enough?

You said you're getting these errors:

    ERROR: invalid byte sequence for encoding UTF8: 0x80

those 0x80 bytes are inside the mydb.sql file; you may find it easier to look for them there and identify the offending string(s). Try (on the linux machine):

    zcat mydb.sql.gz | iconv -f utf8 > /dev/null

it should tell you something like: illegal input sequence at position xxx. BTW, 0x80 is usually found in windows encodings, such as windows-1250, where it stands for the EURO symbol:

    $ echo -n € | iconv -t windows-1250 | hexdump -C
    00000000  80                                                |.|
    00000001

FYI, you *can* get non UTF-8 data from a UTF-8 database, if (and only if) your client encoding is something different (either because you explicitly set it so, or because of your client defaults). Likewise, you can insert non UTF-8 data (such as your mydb.sql) into a UTF-8 database, provided you set your client encoding accordingly. PostgreSQL clients handle encoding conversions, but there's no way to guess (reliably) the encoding of a text file. OTOH, from a SQL_ASCII database you can get all sorts of data, even mixed-encoding text (which you need to fix somehow). If your mydb.sql contains data from a SQL_ASCII database, you simply know nothing about the encoding. I have seen SQL_ASCII databases containing data inserted from HTTP forms, both in UTF-8 and windows-1250 encoding. Displaying, dumping or restoring that correctly is impossible; you need to fix it somehow before processing it as text. .TM. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
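[If iconv isn't handy, a small Python sketch can locate every non-UTF-8 byte in the dump before loading it; "mydb.sql" is the hypothetical file from the thread:]

    # Report the offset and context of each byte that is not valid UTF-8.
    data = open("mydb.sql", "rb").read()
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break                            # the rest of the file is clean
        except UnicodeDecodeError as e:
            bad = pos + e.start              # e.start is relative to the slice
            print(f"invalid byte {data[bad]:#x} at offset {bad}: "
                  f"{data[max(0, bad - 20):bad + 20]!r}")
            pos = bad + 1                    # resume scanning after the bad byte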
Re: [GENERAL] Why LIMIT and OFFSET are commutative
Andrus wrote: Under what interpretation would the results differ? Results must differ for easy creation of a LinQ-PostgreSQL driver. If results are always the same, PostgreSQL should not allow both orders of the clauses. Nicholas explains: Assuming the ordering is the same on each of them (because Skip and Take make no sense without ordering, LINQ to SQL will create an order for you, which irritates me to no end, but that's a separate thread), they will produce different results. Say your query will produce the ordered set {1, 2, 3}. Let n = 1, m = 2. The first query:

    var query = query.Skip(n).Take(m);

converted to SELECT ... OFFSET n LIMIT m will return the ordered set {2, 3}, while the second query:

    var query = query.Take(m).Skip(n);

converted to SELECT ... LIMIT m OFFSET n will return the ordered set {2}. The reason for this is that in the first query, the Skip method skips one element, then takes the remaining two, while in the second query, the first two elements are taken, and then the first one is skipped.

This semantics implies subqueries. In SQL, LIMIT and OFFSET refer to the whole query (that's why in PG you can swap them). If you want to think OO, both are _attributes_ of the query object, whose default values are OFFSET 0 and LIMIT *inf*. They are not _operators_ on the query result, as you seem to imply. Take() and Skip() should not be real methods, they should just set internal instance variables. (Pardon my pythonic syntax - I know zero of LinQ)

    q = Query("SELECT ...")   # create a new query object
    q.limit = 2               # object properties
    q.offset = 1
    result = q.execute()      # perform the query

using wrapper methods:

    q = Query("SELECT ...")   # create a new query object
    q.limit(2)
    q.offset(1)
    result = q.execute()

Methods allow a more pythonic way (the same you use):

    result = Query("SELECT ...").offset(1).limit(2).execute()

which matches quite closely the SQL syntax, if you have those methods return self. @Erik Jones There's no real object-relational impedance mismatch here. The above is pure OO, yet:

    result = Query("SELECT ...").limit(2).offset(1).execute()   # methods swapped

is perfectly equivalent. Setting properties on an instance object is commutative in general, unless the set_property operation has side-effects, which I wouldn't call good programming style. The OP just maps LIMIT and OFFSET onto operations (OFFSET and LIMIT _operations_ are NOT commutative in general) instead of object properties. Once you do the correct mapping, objects behave like queries in SQL. As Gregory Stark pointed out, if you want LIMIT and OFFSET to work as operators, you need to nest queries. Let's drop the idea there's SQL behind the scenes, and let's think of a more abstract DB model:

    q = Query("SELECT ...").execute()   # executes the query, and returns an object you can perform other queries on
    q = q.limit(2).execute()            # again, the query is executed, and the result set is in turn queryable
    q = q.offset(1).execute()

Now, we can just make the execute() method implied. With this new semantics, we have:

    q1 = Query("SELECT ...")
    q1 = q1.limit(2)
    q1 = q1.offset(1)

    q2 = Query("SELECT ...")
    q2 = q2.offset(1)
    q2 = q2.limit(2)

and the results differ. I think that's what the OP meant. But we needed to drop the idea of SQL behind the scenes because there's no way in SQL to directly query the result set from a previous query. Here there's impedance mismatch. The closest thing is subqueries, but you don't store intermediate results anywhere, like we do above with objects.
One could implement the above by executing the query every time, but that's a nightmare for performance on big tables. The only way to have decent performance is to do lazy execution of the query, and use subqueries as Gregory suggested. Please note that there are ORM tools that do that. SQLAlchemy even allows you to build a query (much of the above is valid sqlalchemy) and then treat it as a list, even using array slices, which generate the convenient LIMIT/OFFSET clauses automagically:

    q = session.query(Table)
    q1 = q[1:5]   # adds LIMIT 4 OFFSET 1

of course the query is actually executed only when you start using the results. .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
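[To make the pseudo-code above concrete, here is a runnable toy version (not from the original thread) of the "limit/offset as attributes" Query object, working over an in-memory list instead of a real backend:]

    class Query:
        def __init__(self, rows):
            self.rows = list(rows)
            self._limit = None               # LIMIT defaults to "all"
            self._offset = 0                 # OFFSET defaults to 0

        def limit(self, n):
            self._limit = n
            return self                      # returning self allows chaining

        def offset(self, n):
            self._offset = n
            return self

        def execute(self):
            end = None if self._limit is None else self._offset + self._limit
            return self.rows[self._offset:end]

    # Order of the "setters" does not matter: both calls produce [2, 3],
    # just like swapping LIMIT and OFFSET in a single SQL query.
    print(Query([1, 2, 3]).offset(1).limit(2).execute())  # [2, 3]
    print(Query([1, 2, 3]).limit(2).offset(1).execute())  # [2, 3]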
Re: [GENERAL] Restart a sequence regularly
Scott Marlowe wrote:

    revoke all privs on the sequence to anyone but the user about to reset it
    reset it
    grant the options back

Quoting the OP: That means, when others want to access the sequence between 31-Dec 23:59:55 and 1-Jan 00:00:05, they are waiting instead of getting an error. If you remove the privs, clients will get an error, unless I'm missing something. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
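[Not from the thread: one way to get the "waiting instead of erroring" behaviour the OP asked for is an advisory lock (PostgreSQL 8.2+), sketched here with psycopg2. It only works if every client fetches ids through the same wrapper; the sequence name and lock id are made up:]

    import psycopg2

    LOCK_ID = 42                                 # arbitrary, application-chosen

    def next_id(conn):
        cur = conn.cursor()
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_ID,))
        try:
            cur.execute("SELECT nextval('yearly_seq')")   # hypothetical sequence
            return cur.fetchone()[0]
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_ID,))

    def reset_sequence(conn):
        cur = conn.cursor()
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_ID,))  # others now block
        try:
            cur.execute("ALTER SEQUENCE yearly_seq RESTART WITH 1")
            conn.commit()
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_ID,))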
Re: [GENERAL] Cannot declare record members NOT NULL
Cultural Sublimation wrote: Unfortunately for you, they are not different types. If the OCaml binding thinks they are, it's the binding's problem; especially since the binding seems to be using a completely lame method of trying to tell the difference. Hi, In OCaml and in other languages with strong type systems, "int4 never NULL" and "int4 possibly NULL" are definitely different types. I think the source of the problem here is that SQL has a different philosophy, one where type constraints are not seen as creating new types. There's no such thing as a 'type constraint' in SQL, and there's no point in defining a new type. Constraints are on table rows, sometimes not even on the values of columns per se, but on combinations of values... Think of something like (table.col1 < table.col2)... is that 'creating a new type'? How would you define this new type, even in OCaml, assuming that originally both are int4? Is '4' a valid value for that type? Now, some _table_ constraints may be similar to _type_ constraints, but that's a corner case, in SQL. It's much more than a different philosophy, we're speaking of apples and oranges here. Why should SQL recognize a very limited kind of constraint, and treat it specially by defining a new type? But anyway, if you think that checking pg_attribute is a lame method of obtaining type information, what do you suggest should be done instead? What would you do if it were you creating the bindings? I think the bindings get it right: the type *is* "int4 possibly NULL", because that's what the integer type in SQL means. The problem here is that not every language type maps perfectly onto a database type (and of course the converse is true). "int4 never NULL" may be stored into a table with appropriate constraints, but still some code is needed at application level to convert it back, because there's no such native type in PG. Think of dates and times: I believe no language bindings handle them in a way totally consistent with PG types (unless they define special-purpose types with the exact same semantics, which is hardly worth it). So, the application is wrong in expecting a SQL database to return values of type "int4 never NULL". Just write a small conversion layer, changing "int4 possibly NULL" into "int4 never NULL", after reading the data. .TM. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
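[The "small conversion layer" suggested above, sketched in Python rather than OCaml (the idea is the same: the narrowing from possibly-NULL to never-NULL happens explicitly in application code):]

    def not_null(value):
        """Convert a possibly-NULL column value into a never-NULL one."""
        if value is None:
            raise ValueError("NOT NULL column unexpectedly returned NULL")
        return value

    row = (42, None)                 # as fetched from a driver
    user_id = not_null(row[0])       # ok: 42
    nickname = row[1]                # stays optional, by design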
Re: [GENERAL] Delete/update with limit
Csaba Nagy wrote: First of all, thanks for all the suggestions. put a SERIAL primary key on the table Or: Maybe add OIDs to the table, and delete based on the OID number? No, this is not acceptable, it adds overhead to the insertions. Normally the overhead will be small enough, but on occasions it is noticeable.

How about using the following?

    delete from table where ctid in (select ctid from table limit num);

Here's a live example:

    db=> select count(*) from sometable;
     count
    -------
       381
    (1 row)

    db=> delete from sometable where ctid in (select ctid from sometable limit 5);
    DELETE 5

    db=> select count(*) from sometable;
     count
    -------
       376
    (1 row)

Does anyone see problems with the above delete?

Anyway, do you have figures for the overhead you mentioned? How fast is PG (with OIDs) and how does it compare with the alternatives you're using? In your original post you wrote: On other databases, it is possible to limit the delete to a maximum number of rows to be deleted. I don't know which other databases you're referring to, but are you sure they don't have anything similar to PG OIDs, without even you knowing it, and without any option to disable them? It's even possible that in the other databases you're already paying that overhead, and that makes it quite acceptable in PG, too. Or maybe there's some other kind of overhead, much bigger than the OIDs one? For example, if you're using a high overhead mechanism to consume rows (triggers on delete, inserting into another table), are you sure that in the other databases this doesn't slow all the inserts down much more than adding OIDs in PG would do? PG has MVCC, I guess that makes deletes and inserts on the same table play nice with each other, but how about the other databases? Do they need to acquire a lock on inserts/deletes? That would make your concurrent inserts/deletes much slower than just adding a column to the table. Maybe you could even add an index, and still be faster thanks to MVCC. Also, the trigger is fired once for each deleted row. Have you considered a single stored procedure that loops over the rows to be processed, instead of relying on deletes and triggers? .TM. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
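[A sketch of the ctid-based batched delete above, driven from the client until the table is empty; the table name, batch size and database are invented for the example, and under concurrent inserts the loop simply keeps going until a batch deletes nothing:]

    import psycopg2

    conn = psycopg2.connect(dbname="mydb")
    cur = conn.cursor()
    while True:
        cur.execute(
            "DELETE FROM sometable WHERE ctid IN "
            "(SELECT ctid FROM sometable LIMIT 5000)")
        conn.commit()                    # keep each transaction short
        if cur.rowcount == 0:            # nothing left to delete
            break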
Re: [GENERAL] unexpected shutdown
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] writes: My database has shutdown several times in the last couple days. I have no idea why. I am running centos and I have not rebooted the server or made any configuration changes. So in particular, you didn't disable memory overcommit? LOG: server process (PID 501) was terminated by signal 9 If you didn't issue a manual kill -9, then this is almost certainly a trace of the kernel OOM killer at work. Google for OOM kill to learn more, or see memory overcommit in the PG docs. Memory overcommit is evil on a server. regards, tom lane You guys were right: Jun 17 11:04:57 kernel: Out of Memory: Killed process 24928 (postmaster). I did not disable memory overcommit. I guess this is something I will have to do. I have actually never seen this before or heard of memory overcommit. I am surprised a setting like this comes enabled by default. I read a bit about it and it seems to make sense to disable it, but from practical experience do you know of any negative side effects? The consensus on using overcommit_memory = 2 is far from general. Your problem is a java application with memory issues, so I think you should address that directly first. Either run it elsewhere (and turn the host running PG into a dedicated one), or fix its memory leaks, or use resource limits provided by the OS to limit the java app. Linux kernel people aren't totally clueless about VM. If they chose to keep overcommitting and the OOM killer enabled by default, there are reasons. With overcommitting on, you save a lot of swap space from being allocated, leaving it for stuff that is actually used and not just potentially used. The overall system throughput is thus higher. When it comes to an OOM situation, with overcommitting off things aren't much better. First, OOM happens much sooner than with overcommitting on. This usually isn't perceived as a big disadvantage, since in 95% of the cases the OOM is caused by one runaway process, so sooner or later it will cause OOM either way. But in a correctly administered server, with OS limits configured, a single runaway process doesn't cause OOM. OOM may still happen under excessive load, and I'd rather see my system handle some high load spikes than go into an OOM situation. So lowering the threshold of what 'excessive load' means isn't necessarily a good idea. And OK, let's say you've hit OOM anyway. There's no win-win solution. Having PG processes SIGKILL'd is quite bad. But sitting in front of a keyboard watching your system die w/o being able to log in (OOM, so fork fails) isn't much better. You may be able to do something (sysrq, maybe) but the chances you manage to run a proper shutdown are quite thin, in the general case. So you have to choose between the risk of PG being SIGKILL'd (but the OOM killer _may_ pick the right process instead) and the risk of being forced into hitting the 'reset' button. Either way, your precious data isn't happy at all. So the bottom line is: avoid OOM by properly configuring OS resource limits. If you don't, then overcommit_memory = 2 is _definitely_ better. If you do, it's a hard call. If you think about it, the funny thing is that the more experienced the sysadmin you're talking to is, the less experience he has in handling OOM situations. By definition. :) .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
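[To illustrate the "OS resource limits" advice above: a hedged, Unix-only sketch that caps the address space of a child process, so a runaway allocator gets malloc() failures instead of driving the whole box into OOM. The java command line is hypothetical:]

    import resource
    import subprocess

    def limit_memory():
        # Cap the child's virtual address space at 1 GB (soft and hard limit).
        one_gb = 1 * 1024 ** 3
        resource.setrlimit(resource.RLIMIT_AS, (one_gb, one_gb))

    # preexec_fn runs in the child process, between fork() and exec().
    subprocess.Popen(["java", "-jar", "app.jar"], preexec_fn=limit_memory)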
Re: [GENERAL] unexpected shutdown
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] writes: My database has shutdown several times in the last couple days. I have no idea why. I am running centos and I have not rebooted the server or made any configuration changes. Oh, I forgot. You do have plenty of swap space compared to RAM, yes? If you're running w/o swap, or with little swap, setting overcommit_memory = 2 with the default overcommit_ratio will cut your available RAM by a factor of 2. This thread is interesting reading: http://www.mail-archive.com/pgsql-general@postgresql.org/msg97648.html Since disk space is usually cheap these days, my rule of thumb is (the old one): swap = 2 * ram. Read it this way: if you have 32GB of RAM, you can afford 64GB of disk storage. BTW, this is a good idea both with overcommit on and off, IMHO. .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] PostGreSQL for a small Desktop Application
Gabriele wrote: I'm going to develop a medium sized business desktop client server application which will be deployed mostly on small sized networks and later, eventually, hopefully, on medium sized networks. It will probably be developed using C#. I do need a solid DBMS which can work with the .Net framework. I do know PostGreSQL is a good DBMS in general (it sports most of the advanced DBMS features, transactions and stored procedures included) but I wonder if it is suited for my application. While PG has tons more features than SQLite, the major question here is: do you really need a database _server_? One thing that PG is designed for is handling many (as in 100) concurrent users. Database users, that is, meaning processes (running on different computers) opening a connection and issuing queries. Of course, it handles it very well also when those processes all run on a single server (and all connections are local connections), such as an HTTP server running, say, PHP. That model is very similar to the distributed one, since there's no state shared by the httpd/PHP processes. All shared state is inside the database server. It also happens to be persistent. Technically, that's not simply client/server, it's 3-tier, with httpd/PHP processes being multiple instances of a middle layer. As far as the database server (PG) is concerned, those are (multiple) clients. In this scenario PostgreSQL is at home, that being what it's designed for. To tell the truth, *any* serious RDBMS out there would do. SQLite won't, though, since it's not a server at all - it's just a library. But you mentioned using C#/.Net. AFAIK (but I'm no expert) that's yet a different model. You have a single process (although very likely multithreaded) which is able to hold shared state while serving concurrent clients. Here, a database is just a backend for persistent state (that is, state surviving reboots or crashes). Any good (thread-safe) library that writes to files would do. If you need/want SQL, SQLite comes into play. Actually, this is what it was designed for. It's much easier to install (it's all in a .dll) and administer (close to zero administration, I think) than PostgreSQL (or any RDBMS). For such a use, PG would surely do, but may be just overkill. PG still has advantages vs. SQLite, being more featured (do you need stored procedures?). But if you plan to use an ORM tool for .Net (see: http://www.google.com/search?q=ORM+.Net) you might even be able to switch between SQLite and PostgreSQL at any time w/o even noticing (be sure to choose one that supports both backends, of course). I'm a big fan of both PG and SQLite, and happily use both. When I design an application, I ask myself: is this going to be a strongly database oriented app, with potentially different implementations of the middle layer, or just a server that happens to need a solid and nice way to access data on disk? If you can answer that, the choice is natural: use different tools for different purposes. But also remember that PG can functionally replace SQLite anywhere, but not the other way around. If you have room enough in your toolbox for just one tool, go PostgreSQL. I think the best thing about PG is that it's a terrific general purpose tool: a full RDBMS, extremely reliable, with no compromises, almost covering anything you might need in the features area (even more if you consider how easy it is to extend it), yet light enough to be easily embeddable. .TM. ---(end of broadcast)--- TIP 6: explain analyze is your friend
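[A small illustration of the "switch backends later" point, in Python rather than C#: both drivers speak the same DB-API shape, so code written against the cursor interface barely changes. The file and database names are made up; note the parameter-style difference ("?" vs "%s") is exactly the kind of detail an ORM hides:]

    import sqlite3

    conn = sqlite3.connect("app.db")        # embedded, zero administration
    # import psycopg2
    # conn = psycopg2.connect(dbname="app") # client/server, same API shape

    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    cur.execute("INSERT INTO items VALUES (?, ?)", (1, "widget"))
    conn.commit()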
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: On Tue, 5 Jun 2007, Marco Colombo wrote: AFAIK, files in pg_xlog are first renamed (and only if and after the archive_command returned true) and later overwritten. Never deleted. No, they get deleted sometimes, too. Not often, but it can happen under heavy load if more segments get temporarily created than are normally needed. At checkpoint time, only 2*checkpoint_segments+1 xlog files are kept; if there are more than that, they are removed. Probably never happens on your system from what you've described of it, but it is a possibility. Ok, you're right. Anyway, neither renames nor deletes can happen during my backups. My archive_command prevents it. As Simon just pointed out, the danger with the approach you've taken comes from what happens if a checkpoint occurs in the middle of your backup. You've probably never seen that happen either. As long as that continues to be true, you might be OK for now, but you really need to get to where you're following the recommended procedure rather than trying to do something a little different. There are too many edge cases here that could^H^H^H^H^Hwill bite you one day. Let's say you use the standard procedure. Let's say that your archive_command starts failing at 1:00 AM (say, no space left in the archive directory). Let's say your backup starts at 2:00 AM. Later, at 8:00 AM, you solve the disk full problem, and PG resumes the archiving of WAL segments... as long as there's enough room in pg_xlog for the extra segments, PG is fine. Is your backup broken if a checkpoint happens during the backup? In my understanding, no. There's no need for archiving to happen immediately. My procedure just simulates an archiving failure during the backup, no more, no less. The only difference is that the "problem" is solved right after the backup. By that time, all WAL records created during the backup have already been saved. If you're going to archive WAL segments anyway, my procedure buys you almost nothing. It just saves you from using a trick to artificially fill a segment with garbage data just to have it archived right after the backup. It does so by using another trick, so no big deal. But it allows anyone not willing to take the burden, or face the danger, of actually archiving logs, to take file level backups instead of pg_dumps, without relying on external snapshotting abilities. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Simon Riggs wrote: On Tue, 2007-06-05 at 18:39 +0200, Marco Colombo wrote: I'm asking: what _exactly_ can go wrong? If a checkpoint occurs while taking the backup then the contents of the files will be overwritten...

The contents of which files - data files or WAL segments? My archive command prevents WAL segments from being recycled during the backup.

...and you will be unable to roll forward from before the backup until after the backup. This will give you the FATAL error message "WAL ends before end time of backup dump". You won't know this until you have attempted recovery using those files, even if the scripts give rc=0.

Well, my procedure currently produces two tar archives. One is the 'base backup' (a copy of the data files). One is a copy of the WAL segments, made right after the backup. Which one do you expect to be corrupted if a checkpoint happens during the backup? .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Simon Riggs wrote: Marco Colombo wrote: my method ...is dangerous Ok, but why? Once again, I'm asking: what _exactly_ can go wrong? so we don't get loads of new DBAs picking up this idea but missing the exact point of danger. I'm one of them. I _am_ missing the exact point of danger. Making the assumption that it's OK to archive WAL files *in* the pg_xlog directory exposes you to the risk of having them deleted by the archiver, which will invalidate your backup. I'm sorry, I'm really having a hard time following you here... what are "to archive" and "the archiver"? The archive_command? The tar in the backup procedure? What do you mean by "deleted"? AFAIK, files in pg_xlog are first renamed (and only if and after the archive_command returned true) and later overwritten. Never deleted. Anyway, how could that invalidate the backup? It's all about making a self-contained backup. What happens after that is irrelevant. Hey, I haven't come here proposing a new revolutionary way to perform backups! I've made it pretty clear it was for a not-so-common case. And anyway, I've just asked what may be wrong with my procedure, since it seems to fit _my_ needs and it makes _my_ life simpler, and _I_ don't see any flaw in it. It may be useful to others, _if_ it's correct. If not, I'd like to know why. Can you provide a simple failure scenario, please? That would help me understand what I'm missing... .TM. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: The way you're grabbing files directly from the xlog directory only works because your commit workload is so trivial that you can get away with it, and because you haven't then tried to apply future archive logs.

Well, it's only because I don't need future logs, just like I don't need future files. Backup is at 2:00 AM, any change after that is potentially lost. That includes e-mails, web contents, and database contents. The database contents are in no way different to us. It's the "your commit workload is so trivial that you can get away with it" part I don't really get, but more on this later.

In the general case, circumventing the archiving when the backup is going on won't guarantee everything is ordered just right for PITR to work correctly.

Generic PITR? You mean if backup is at 2:00 AM and the server crashes (all disks lost) at 2:00 PM, you want to be able to recover to some time like 11:00 AM, and be precise about it? That's PITR to me - and the precise part is key here... either the time or the transaction ID would do, the point is being able to draw a line and say anything before this is correct. Well, if that's what you mean by PITR, I never claimed my method would give you that ability. I'm pretty aware it won't do, in the general case. If you need that, you need to archive all the logs created after the backup, that's pretty obvious. But even under heavy write load, my method works, if the only point in time you want to be able to recover is 2:00 AM. It works for you too, it gives you a nice working backup. If you also need real PITR, your archive_command is going to be something like:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock && cp %p /my_archive_dir/%f'

I consider what you're doing a bad idea that you happen to be comfortable with the ramifications of, and given the circumstances I understand how you have ended up with that solution. I would highly recommend you consider switching at some point to the solution Simon threw out:

    create table xlog_switch as select '0123456789ABCDE' from generate_series(1,100);
    drop table xlog_switch;

Ok, now the segment gets rotated, and a copy of the file appears somewhere. What's the difference between having the archive_command store it and having your backup procedure store it? Let's say my archive_command is a cp to another directory, and let's say step 5) is a cp too. What exactly does forcing a segment switch with dummy data buy me, instead of doing a cp myself on the real segment data? I mean, both ways would do.

you should reconsider doing your PITR backup properly--where you never touch anything in the xlog directory and instead only work with what the archive_command is told.

Well, I'm copying files. That's exactly what a typical archive_command does. It's not special in any way, just a cp (or tar or rsync or whatever). Unless you mean I'm not supposed to copy a partially filled segment. There can be only one, the others would be full ones, and full ones are no problem. I think PG correctly handles the partial one if I drop it in pg_xlog at recover time. That segment you need to treat specially at recover time, if you use my procedure (in my case, I don't). If you have a later copy of it (most likely an archived one), you have to make it available to PG instead of the old one, if you want to make use of the rest of the archived segments. If you don't want to care about this, then I agree your method of forcing a segment switch is simpler. There's no partial segment at all. Anyway, it's running a psql -c at backup time vs.
a test -nt && rm at restore time, not a big deal in either case. .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: On Thu, 31 May 2007, Marco Colombo wrote:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock > /dev/null'

Under normal conditions (no backup running) this will trick PG into thinking that segments get archived. If I'm not mistaken, PG should behave exactly as if no archive_command is configured, and recycle them ASAP. That's correct. I don't think you even need the /dev/null in that command.

Ok, thanks. I've seen that /dev/null somewhere in the docs, and blindly copied it.

Should a WAL segment fill up during the backup (unlikely as it is, since the system is mostly idle AND the tar completes within a minute - but it's still possible), the test command would report failure in archiving the segment, and PG would keep it around in pg_xlog, ready to be tar'ed at step 5 (mind you - this is speculation since I had no time to actually test it). That's also correct. What you're doing will work for getting a useful backup.

Great, that's all I need.

However, recognize the limitations of the approach: this is a clever way to make a file-system level snapshot of your database without involving the archive logging process. You'll get a good backup at that point, but it won't provide you with any ability to do roll-forward recovery if the database gets screwed up in the middle of the day. Since that's a requirement of most PITR setups, I'm not sure your workaround accomplishes what you really want. More on why that is below.

Here's the original thread I started: http://archives.postgresql.org/pgsql-general/2007-05/msg00673.php Briefly, I don't need PITR proper, it may even be harmful in my case. The data in the db may be tied to the data on the filesystem in ways unknown to me... think of some kind of custom CMS. I'm able to restore .html, .php, .png or whatever files as they were at backup time (say, 2:00 AM). All I need to do with PG backups is restore db contents at the same time (almost). The only point in time I'm interested in is "backup time", so to say. Restore would be done the usual way, extracting both the archives, maybe adding WAL segments from the crashed pg_xlog. Whether I need to configure a fake restore command I have still to find out.

This won't work, and resolving it will require coming to grips with the full archive logging mechanism rather than working around it the way you suggest above.

This is interesting. Why won't it work, exactly? Let's say I trick PG into thinking it's a recovery from backup + archived WAL. It'll find all segments it needs (and no more) already in pg_xlog. I expect it to just use them. Maybe I'd need to configure /bin/false as restore_command. Or maybe just something like 'test -f /var/lib/pgsql/data/pg_xlog/%f' (true if the file is already there). I'll have to experiment, but I don't see any major problem right now. The files are already there.

Every time the server hits a checkpoint, it recycles old WAL segments--renames them and then overwrites them with new data. The first time your database hits a checkpoint after your backup is done, you will have lost segment files such that it's impossible to recover the current state of the database anymore. You'll have the first part of the series (from the base backup), the last ones (from the current pg_xlog), but will be missing some number in the middle (the recycled files).

Sure, now I see what you mean, but I was under the assumption of very low database activity; in my case, it's about 2 WAL segments/day.
I usually see files in my pg_xlog that are 2 days old, so there won't be any missing segments. And anyway, the ability to recover at some time after the backup is just a plus. I don't need it. In case of a complete crash, I'm going to restore the whole system as it was at backup time. And if only the PG datadir gets corrupted later, and I want to try and recover it as it was at that later time, I still have a 99% chance of being able to do so, due to very low write activity. And if that fails, because of some uncommon write activity right at that inconvenient time, I can just fall back to the case of a complete system crash. The chances of that happening are possibly lower than those of a system crash, so I'm not worried about it. I think all we want is a backup that is immediately usable, w/o waiting for the WAL segment it relies on to be archived. That is, if taken at 2:00 AM, it may be used to recover from a crash at 2:10 AM (assuming the backup process ended by that time, of course). If you need *both* a full backup *and* PITR, just add a real cp to the archive_command above. The important part is to return failure during the backup process, I think. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: [...] -Find something harmless I can execute in a loop that will generate WAL activity, run that until the segment gets archived. Haven't really thought of something good to use for that purpose yet.

Some time ago I started a thread about taking on-the-fly backups at file level on idle servers. The problem was much the same as yours. After posting that, I'm doing some research on my own (in spare time) now. Currently, I'm using the following procedure:

1) create a backup lockfile
2) issue pg_start_backup()
3) tar the data directory, excluding pg_xlog
4) issue pg_stop_backup()
5) tar pg_xlog
6) remove the lockfile

Meanwhile, a fake WAL archiving is active, which does pretty much nothing:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock > /dev/null'

Under normal conditions (no backup running) this will trick PG into thinking that segments get archived. If I'm not mistaken, PG should behave exactly as if no archive_command is configured, and recycle them ASAP. This saves me the burden of taking care of the archiving at all. Should a WAL segment fill up during the backup (unlikely as it is, since the system is mostly idle AND the tar completes within a minute - but it's still possible), the test command would report failure in archiving the segment, and PG would keep it around in pg_xlog, ready to be tar'ed at step 5 (mind you - this is speculation since I had no time to actually test it). So it ends up with two tar archives: one is the datafiles backup, the other the WAL segments. As an optimization, I should exclude WAL segments older than the lockfile in step 5), since I know they are older than the backup. What I really should do now is kill -STOP the tar at step 3), start some big write activity and see what exactly happens to the WAL segment when it fills up and PG tries to archive it. Restore would be done the usual way, extracting both the archives, maybe adding WAL segments from the crashed pg_xlog. Whether I need to configure a fake restore command I have still to find out. Hope it helps, .TM. ---(end of broadcast)--- TIP 6: explain analyze is your friend
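[For readers who want to try it, one possible driver for the six steps above, sketched in Python; the paths, backup label and tar invocations are illustrative assumptions, not part of the original post:]

    import os
    import subprocess
    import psycopg2

    PGDATA = "/var/lib/pgsql/data"
    LOCK = "/var/lib/pgsql/backup_lock"

    conn = psycopg2.connect(dbname="postgres")   # must be a superuser session
    cur = conn.cursor()

    open(LOCK, "w").close()                      # 1) create the lockfile
    try:
        cur.execute("SELECT pg_start_backup('nightly')")        # 2)
        subprocess.check_call(                   # 3) data dir without pg_xlog
            ["tar", "-C", PGDATA, "--exclude=pg_xlog",
             "-czf", "/backup/base.tar.gz", "."])
        cur.execute("SELECT pg_stop_backup()")                  # 4)
        subprocess.check_call(                   # 5) now the WAL segments
            ["tar", "-C", PGDATA, "-czf", "/backup/wal.tar.gz", "pg_xlog"])
    finally:
        os.unlink(LOCK)                          # 6) remove the lockfile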
Re: [GENERAL] a few questions on backup
Marco Colombo wrote: I'll try that out. Maybe my ideas are so far from the truth that I'm having a hard time explaining them to people who actually know how things work. I'll be back with results. Meanwhile, thanks for your time. I think I finally got it. Segment 34 in my pg_xlog got archived and recycled. It became segment 39, but PG is still working on segment 35, after some hours. Now pg_xlog contains 5 segments, named from 35 to 39, 35 being the most recently modified. 39 won't be used for about a couple of days yet. Now I see what you mean by "recycled": I thought it meant "marked free for later use", but it means "renamed for future use". My mistake was assuming that the rename part happens lazily when PG starts using the file. Instead, it happens right after (the eventual) archiving. That makes the strategy in my original post somehow unfeasible. Still, I was not completely wrong:

    # cmp /var/lib/pgsql/data/pg_xlog/000100010039 /u1/pg_wal_archive/000100010034 && echo Yes
    Yes

They do contain the same data, that of segment 34, and the *39 file will stay there, untouched, for quite a while after the backup. So the WAL segment I need *is* there, just with a different name. The only problem is figuring out which segment that data actually belongs to. I know only because I can compare it with the archived one. Now, I could still make some educated guesses, by looking at modification times, but definitely a guessing game is not something you want to play when restoring your precious data. :) Archiving the WAL segments and letting the recovery procedure handle them at restore time is easier anyway. Again, thanks a lot. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] a few questions on backup
Tom Lane wrote: Marco Colombo [EMAIL PROTECTED] writes: Good to know, thanks. I think I'll experiment a bit with archive_command. My point was that since I know (or better, assume) that old segments are going to stay in my pg_xlog for *days* before getting recycled, On what do you base that assumption? Once the system thinks they're not needed anymore, they'll be recycled immediately. regards, tom lane Well, now that you make me think of it, I do make some assumptions. One is that only one file in pg_xlog is the active segment. Two is that I can trust modification times (so that a file inside pg_xlog that looks old is actually old... and since postgresql does not run as root, it couldn't cheat on that even if it tried to). The best thing I can do is to configure archiving, and see what gets archived exactly. I'm making assumptions there too. I expect to find, for each file in pg_xlog, a copy in the archive directory (say archiving is done with cp), with one exception: the segment currently being written to. For that one there will be a file with the same name but different contents (and an older modification time). I'll try that out. Maybe my ideas are so far from the truth that I'm having a hard time explaining them to people who actually know how things work. I'll be back with results. Meanwhile, thanks for your time. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] a few questions on backup
Tom Lane wrote: No. You have to have an actual archive_command script copying the WAL segments somewhere else when told to. An asynchronous copy of the xlog directory will be nothing but garbage, because we recycle WAL segments as fast as we can (ie, as soon as the archive_command claims to have saved the data). Mmm, sorry, I'm not sure I'm following here. Maybe I should provide some background. In my pg_xlog directory I see five files, WAL segments I suppose. Only one (as I expected) is currently being used, the others are old (one a couple of days old). When PG performs a switch from one segment to another (I assume it recycles the oldest available), does it archive the recycled one (before starting to use it, of course) or the just-filled one? If it's the one being recycled, it means that in my setup it would take two days to archive a segment after it stopped being used. Am I missing something? 1) 2) and 3) are OK, but you need to use archive_command to collect the xlog segments. Actually ... given your low requirements, I wonder why you don't just stop the postmaster, tar the datadir, start the postmaster. Well, currently we do a pg_dump. The database mainly supports dynamic websites. It's very unlikely they get updated at the time the backup runs, and overall there is little updating even during the day, but I don't like stopping the postmaster because, even if the write load is negligible, the read one might not be. It's still small enough that a tar (to disk) might take only a minute or two to complete, but yet it's a minute of downtime for the web sites. If I can avoid that, why not? I'm not unsatisfied with pg_dump, and I agree that with my requirements the whole issue is academic. I just wanted to learn how it works exactly; such knowledge could prove useful for doing the Right Thing in case of trouble. Maybe it's the right time for me to have a look at the source... Hannes Dorbath wrote:

    lvcreate -s -L5G -nbackup /dev/foo/postgresql
    mount /dev/foo/backup /mnt/backup-snap
    tar jcpvf pg-backup-time_stamp.bz2 /mnt/backup-snap

You can't do much wrong with that, it's fast and easy to use. Been there, done that. In my environment (Fedora Core 6) it's fast and easy, but not reliable, unfortunately. Sometimes the snapshot won't get created, sometimes it won't get removed after the backup is done. .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] a few questions on backup
Richard Huxton wrote: It calls archive_command on the just-filled one. Good to know, thanks. I think I'll experiment a bit with archive_command. My point was that since I know (or better, assume) that old segments are going to stay in my pg_xlog for *days* before getting recycled, just copying them all after the call to pg_stop_backup() should be enough, in my case. It's more than I need, even. You do know that pg_dump gives you a guaranteed accurate snapshot of the database? It doesn't matter if it's in use. Yes, I know, that's why I'm using it. A dump is also useful in that you can restore it more easily on a different system. A text dump allows you to make changes before restoring, even with sed. I'm comfortable with it. The reasons for playing with WAL segments for backups are: 1) learning how it works; 2) everything else is backed up with cpio; 3) at restore time I need to have PG running already, and there may be something different at database initialization. Reason 1) is the driving one, by far. I can handle the rest (i.e. the compressed dump is saved as part of the cpio archive). .TM. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] a few questions on backup
Hello, I have a few questions on backing up a PostgreSQL server (let's say anything 8.x.x). I've read "Continuous Archiving and Point-In-Time Recovery (PITR)" in the manual, but I'm still missing something... well, actually I think I don't, but I've been debating this with a friend for a while, and there's something we don't seem to agree on, so I need some bits of clarification. :) Ok, let's say what I really need is poor man's PITR. That is, I perform only one daily full backup, and I need to be able to restore the database contents as they were at backup time. I don't even need to know the exact time (yes, that means I don't really care about which transactions turn out to be committed and which don't). Mmm, ok, that can't be rightfully called PITR at all, maybe. Am I right in assuming that the following procedure is ok?

1) issue pg_start_backup();
2) copy (or tar or cpio) the data dir, w/o pg_xlog/
3) issue pg_stop_backup();
4) copy (or tar or cpio) pg_xlog/ contents.

That's all. Please note that I'm doing NO WAL archiving. Whether this is going to work or not depends on what exactly pg_start_backup() does. According to him (my friend), it may simply prevent PostgreSQL from writing to data files until pg_stop_backup() is issued, thus kind of enforcing a snapshot on the data files themselves. Changes go to WAL only, and they are played on data files only after pg_stop_backup(). Hmm, I don't really like that. But if so, I don't even need step 4), since the tar backup would be consistent. I'm able to restore from that only, and data are those at the time of pg_start_backup(). Or (that's me), pg_start_backup() only ensures that full_page_writes is enabled during the backup phase, so that any change to data files can be undone/redone based on the contents of WAL segments. The tar archive contains potentially inconsistent data; that's why I need the WAL segments too. At restore time, PG eventually performs a crash recovery, and data are those at the time of step 4) (which is an interval really, some time between the start and the end of the copy, but I don't need to be able to tell the exact time anyway). BTW, I see a vulnerability in the above procedure... if there's enough write activity that PostgreSQL recycles WAL segments between 1) and 4), some changes may be lost. At step 4) I need to save _all_ WAL segments that have been produced during the backup time. I assume that activity at backup time is low enough that the event is extremely unlikely. Another doubt I have is about WAL checkpointing... is it possible that a checkpoint happens during step 2), and the tar archive gets data files both from before and from after the checkpoint (which I think is bad), or does pg_start_backup() prevent WAL checkpointing, too? Finally, if I'm missing something and the above is wrong, I think that the only way to perform a full backup on a live database at filesystem level is to enable WAL archiving as the first step of the backup procedure (assuming it's not usually on, of course), and later save all the WAL segments that were _archived_ during that time, including the one made at pg_stop_backup(), on the same backup medium. If I understand the documentation right, nothing breaks if the archive_command just returns OK w/o saving anything during regular operation, and starts saving segments only during the backup time... I mean, that prevents me from doing arbitrary PITR after the backup, but all I want to do is perform a full backup, w/o being able to do any partial backup after that.
If so, I may write a script that does nothing most of the time, and archives WAL segments only to be included in the full backup. TIA, (and sorry if the above sounds a bit messy) .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Views- Advantages and Disadvantages
Ashish Karalkar wrote: Hello All, Can anybody please point me to the advantages and disadvantages of using views? With Regards Ashish... Well, IMHO views are part of the business logic and not of the data model. You can also think of them as an API to access the data from applications (clients). By defining some nice views, you allow writing a client with little knowledge of the actual database design. And clients written by different people access the data consistently. However, this is a two-edged sword. An API is usually designed to be generic enough. One day you may find you need only part of the functionality, and that you could do that part more efficiently. That's especially true if the API is used to hide the details away from you. Normal clients may be given access only to the views and not to the actual tables. That's a pretty good design principle, but again it cuts both ways. Think of a database with a books table and an authors table, with a nice view that joins them. One day you are writing a client application and want to fetch just the list of book ids. Yes, you can select one column from the view, but why execute the join when you don't need it? But if you're given access only to the view, you can't do much about it. Of course this is not specific to views, it's true for any abstraction layer in any context. .TM. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
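[A hedged illustration of the "you can't skip the join" point: compare the plan for selecting ids from the base table with the plan for the same column through the joining view. The table, view and database names are invented for the example:]

    import psycopg2

    conn = psycopg2.connect(dbname="library")
    cur = conn.cursor()
    for sql in ("EXPLAIN SELECT book_id FROM books",
                "EXPLAIN SELECT book_id FROM books_with_authors"):
        cur.execute(sql)
        print(sql)
        for (line,) in cur.fetchall():   # EXPLAIN returns one plan line per row
            print("   ", line)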
Re: [GENERAL] Partial dates
On Wed, 2005-09-14 at 15:49 +1200, Brent Wood wrote: Sanitizing is one thing, inventing data to fit an incomplete value into a date datatype is not good practice. Choose another datatype, or make a new one, or split the date into columns. The type of your data is not a timestamp nor a date. In the first place, 0 is not NULL. So, even 1980-01-00 would be different from 1980-01-NULL. For example, assuming 1980-01-00 is defined to have some meaning, (1980-01-00 < 1980-01-02) is likely to be true, but (1980-01-NULL < 1980-01-02) definitely is not. You're just asking if there's a way to store a number of which the lower bits are ignored (considered NULL). Obviously, no: you need a different datatype or a different arrangement. Note: the string 1980-01-00 just *looks* like a possible value, but it definitely is not: there's simply no space (or time) between 1979-12-31 and 1980-01-01. It's much like trying to store sqrt(-1) into a real. I can hardly imagine how MySQL manages to store that (the 1980-01-00, I mean). .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
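[A sketch of the "make a new one" option: a dedicated partial-date type with the NULL-like comparison semantics argued above, where an unknown day makes some comparisons undecidable (None) - exactly what a plain date column cannot express:]

    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class PartialDate:
        year: int
        month: int
        day: Optional[int]               # None = unknown, not 0

        def earlier_than(self, other):
            """Three-valued comparison: True, False, or None for 'unknown'."""
            if (self.year, self.month) < (other.year, other.month):
                return True
            if (self.year, self.month) > (other.year, other.month):
                return False
            if self.day is None or other.day is None:
                return None              # same month, day unknown
            return self.day < other.day

    # 1980-01-?? vs 1980-01-02: undecidable, so the result is None.
    print(PartialDate(1980, 1, None).earlier_than(PartialDate(1980, 1, 2)))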
Re: [GENERAL] Block Size and various FS settings
On Wed, 2005-09-14 at 11:25 +0300, Michael Ben-Nes wrote: After a week of testing i decided to go with JFS as the FS for Postgres. im not an expert benchmarker so i hope i initiated the right parameters in bonnie. Any way here are the results of bonnie++ & pgbench: http://www.canaan.co.il/users/miki/stats/stats.html Cheers Have you tried the data=journal / data=ordered / data=writeback mount options for ext3? If so, did they make any difference? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] RAMFS with Postgres
On Fri, 2005-07-22 at 15:56 +0100, Alex Stapleton wrote: On 21 Jul 2005, at 17:02, Scott Marlowe wrote: On Thu, 2005-07-21 at 02:43, vinita bansal wrote: Hi, My application is database intensive. I am using 4 processes since I have 4 processors on my box. There are times when all the 4 processes write to the database at the same time and times when all of them will read all at once. The database is definitely not read only. Out of the entire database, there are a few tables which are accessed most of the time and they are the ones which seem to be the bottleneck. I am trying to get as much performance improvement as possible by putting some of these tables in RAM so that they dont have to be read to/written from hard disk as they will be directly available in RAM. Here's where slony comes into the picture, since we'll have to maintain a copy of the database somewhere before running our application (everything in RAM will be lost if there's a power failure or anything else goes wrong). My concern is how good Slony is? How much time does it take to replicate the database? If the time taken to replicate is much more than the perf. improvement we are getting by putting tables in memory, then there's no point in going in for such a solution. Do I have an alternative? My feeling is that you may be going about this the wrong way. Most likely the issue so far has been I/O contention. Have you tested your application using a fast, battery backed caching RAID controller on top of, say, a 10 disk RAID 1+0 array? Or even RAID 0 with another machine as the slony slave? Isn't that slightly cost prohibitive? Even basic memory has enormously fast access/throughput these days, and for a fraction of the price. We are comparing a RAM + network solution vs. a RAM + disk solution. RAM alone is not enough, since the OP wants 100% safety of data. Then you need a network solution, and it has to be synchronous if you want 100% safety. No network is going to beat a directly attached disk array on the basis of performance/price. Slony, by the way, is quite capable, but using a RAMFS master and a disk drive based slave is kind of a recipe for disaster in ANY replication system under heavy load, since it is quite possible that the master could get very far ahead of the slave, since Slony is asynchronous replication. At some point you could have more data waiting to be replicated than your ramfs can hold and have some problems. If a built in RAID controller with battery backed caching isn't enough, you might want to look at a large, external storage array then. many hosting centers offer these as a standard part of their package, so rather than buying one, you might want to just rent one, so to speak. Again with the *money* RAM = Cheap. Disks = Expensive. At least when you look at speed/$. You're right about replicating to disk and to ram though, that is pretty likely to result in horrible problems if you don't keep load down. For some workloads though, I can see it working. As long as the total amount of data doesn't get larger than your RAMFS it could probably survive. Ever heard of the page cache? If your data fits your RAMFS, it would fit the OS cache just the same. For reads, the effect is exactly the same. And just disable fsync if writes are a problem. It's anyway safer than RAMFS, even if not 100% safe. Face it, if you want 100% safety (losing nothing in case of power failure), you need to synchronously write to _some_ disk platter. Where this disk is attached is a matter of convenience.
_If_ disk write throughput _is_ the problem, you have to fix it. Be it on the local host, or on a remote replica server, the disk system has to be fast enough. Consider:

1) PostgreSQL -> RAM -> disk
2) PostgreSQL -> RAM -> network, network -> RAM -> disk

no matter whether you choose 1) or 2), the "disk" part has to be fast enough. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] RAMFS with Postgres
On Thu, 2005-07-21 at 07:43 +0000, vinita bansal wrote: Hi, My application is database intensive. I am using 4 processes since I have 4 processors on my box. There are times when all the 4 processes write to the database at the same time and times when all of them will read all at once. The database is definitely not read only. Out of the entire database, there are a few tables which are accessed most of the time and they are the ones which seem to be the bottleneck. I am trying to get as much performance improvement as possible by putting some of these tables in RAM so that they dont have to be read to/written from hard disk as they will be directly available in RAM. Here's where slony comes into the picture, since we'll have to maintain a copy of the database somewhere before running our application (everything in RAM will be lost if there's a power failure or anything else goes wrong). My concern is how good Slony is? How much time does it take to replicate the database? If the time taken to replicate is much more than the perf. improvement we are getting by putting tables in memory, then there's no point in going in for such a solution. Do I have an alternative? Regards, Vinita Bansal You see, if those frequently accessed tables are mostly read-only, there's no need at all to use RAMFS. They already are cached for sure in either PostgreSQL buffers or the underlying OS page cache. If you don't have enough RAM for that, increase it. Using RAMFS only makes things worse. The OS page cache really knows about frequently accessed data, usually much better than you do. If there are frequent writes on those tables, still RAMFS is not the answer. Have a look at the documentation and disable sync on writes, and you'll get a similar effect (writes are in RAM and will be synced on disk in blocks, much more efficiently). Of course you lose the safety of data this way, in case of power failure, but it's still much better than RAMFS; at least most of the data is on disk. As for Slony, or other solutions, consider this: the _only_ way to have data safety is to return 'OK' to the application only after you're _sure_ that the data is on some disk platter. So, even with a replica server, data has to be transferred over the net, committed remotely, the commit notification has to come back over the net, and _then_ the database says 'OK' to the application. This is going to be quite slow, possibly slower than synchronous writing on the local disks (notice that the protocol is the same: send data to the disk, wait for a write completed notification, say 'OK' to the application). My advice is: tune your PostgreSQL the best you can, _with_ sync write enabled. Review and optimize your SQL. Do not use RAMFS. Upgrade your hardware if that's not enough. Consider distributing the load on different servers (you'll need a multi-master solution for that, search the archives), that is, upgrade your hardware in number not in size. I hope it helps, .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] RAMFS with Postgres
On Tue, 2005-07-19 at 16:45 +0000, vinita bansal wrote: "Hi, I am trying the RAMFS solution with Postgres, wherein I am pushing the most heavily used tables into RAM."

Why? I mean, what problem are you trying to solve?

"I have 32GB RAM on a 64-bit Opteron machine. My database size is 40GB. I think Linux allows a max. of 16GB (half of available RAM) to be used directly to push tables to it. I am concerned about reliability here (what if there is a power failure). What are the things that need to be considered and what can be done to ensure that there is no data loss in case something goes wrong. What steps must be taken to ensure data recovery. I am planning to use Slony replication to replicate my database to a diff node so that in case something goes wrong, I can restore it from the replication node and start my runs on that data again. The only problem here is that I need to run engines from the beginning. Is there any other way of doing the same thing, or is such a thing good enough given the fact that a failure like this happens very rarely. The most imp. thing for me is the **data**, which should not be lost under any circumstances."

Then don't use RAMFS. Slony may be a good idea, but it's hard to tell if you don't provide more info. What is the database used for? - heavy, long-running, CPU-bound, read-only queries? - many simple queries, but over the whole dataset (thus I/O-bound)? - many INSERTs/UPDATEs? Is the database accessed by many concurrent users? How many of them are mostly read-only and how many perform writes? Each problem in each scenario may have a different solution...

"Has anyone used Slony replication before. How good is it. Is there anything else available which is better than Slony replication?"

'Better' is meaningless w/o a context. There are tasks for which Slony may be the best tool in the world, and others that require a totally different approach. First you have to define what your problem is, and why the obvious solution (a normal PostgreSQL server, with a standard filesystem) does not work/fit. Then you choose a solution.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to create unique constraint on NULL columns
On Fri, 2005-07-15 at 13:46 +0300, Andrus wrote: "I have table CREATE TABLE test( col1 CHAR NOT NULL, col2 CHAR, UNIQUE (col1, col2) ); This table allows inserting duplicate rows if col2 is NULL: INSERT INTO test VALUES ( '1', NULL ); INSERT INTO test VALUES ( '1', NULL ); does NOT cause an error!"

The two rows are not duplicates. NULL means 'any value' in that context, so you can only say that the values in the first column are equal. Nothing can be said about the values in the second column. The database can't say they are the same, and can't say they are different either. http://en.wikipedia.org/wiki/Null_%28SQL%29

"How to create a constraint so that NULL values are treated as equal and the second insert is rejected?"

I think you can do that with special operators (such as IS DISTINCT FROM), but you're using NULL as a normal value, which is not what it's meant to be in the first place, and I advise being careful: http://www.postgresql.org/docs/8.0/interactive/functions-comparison.html - please read the paragraphs on ... = NULL, IS NULL, and IS DISTINCT FROM.

NULL is meant to represent lack of knowledge (unknown). If you ever find yourself treating NULL as a real value (i.e. comparing it to other values or to other NULLs), you must think twice about your design. IMVHO, the only place for IS NULL and IS DISTINCT FROM is meta-operations on data, administrative tasks, and so on. There should be no need to use them in normal queries. Unless you're coding quick and dirty hacks, when you really know what you're doing but don't care about the correctness of your design, of course.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
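[That said, if a design really does need NULLs-treated-as-equal, one possible workaround is a unique index over an expression that maps NULL to a sentinel, with all the caveats about sentinel values discussed in this thread. A minimal sketch using the pgdb driver; the temp table and the empty-string sentinel are made up for illustration:

import pgdb  # PyGreSQL's DB-API module

db = pgdb.connect()
curs = db.cursor()
curs.execute("CREATE TEMP TABLE test (col1 CHAR NOT NULL, col2 CHAR)")
# COALESCE folds NULL into the sentinel '', so two ('1', NULL) rows collide
curs.execute("CREATE UNIQUE INDEX test_uniq ON test (col1, (COALESCE(col2, '')))")
curs.execute("INSERT INTO test VALUES ('1', NULL)")
try:
    curs.execute("INSERT INTO test VALUES ('1', NULL)")  # now rejected
except pgdb.DatabaseError, e:
    print "second insert rejected:", e

Expression indexes of this form need PostgreSQL 7.4 or later, and of course the sentinel must be a value that can never legitimately appear in col2.]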
Re: [GENERAL] automating backup ?
On Sun, 2005-06-26 at 11:18 +0200, Zlatko Matic wrote: "How to automate backup, so that Postgres automatically backs up, for example, once a week? The same question about vacuum. Concerning backup, how to prevent that someone makes a copy (for example pg_dumpall) of a database,"

pg_dumpall is no special case. It just runs some SQL queries. Your users are subject to the normal permission checking.

"then installs a new instance of Postgres, creates the same user account that was the original owner, and then restores the database. In that case all restrictions would be overridden, right?"

No, because pg_dumpall doesn't override any restriction. Of course, if someone gets his hands on _your_ backups (made with full permissions), he can access everything, unless you encrypted them.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
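[The scheduling half of the question is ordinary cron territory, since pg_dumpall and vacuumdb are plain commands. A hypothetical crontab for the postgres user; paths and times are placeholders, and note that % must be escaped as \% inside crontab entries:

# weekly dump, Sunday 03:00
0 3 * * 0  pg_dumpall > /var/backups/pg/dumpall-$(date +\%Y\%m\%d).sql
# nightly VACUUM ANALYZE of all databases
0 2 * * *  vacuumdb --all --analyze --quiet
]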
Re: [GENERAL] Win32 users?
On Sun, 2005-06-26 at 10:59 -0700, Bob Pawley wrote: "I'll date myself so you all may know from where I am coming. My first lesson in binary math took place in a classroom in 1958. Since then I have witnessed a lot that has since swept under the bridge. This thread reminds me of the discussion that surrounded the complexity of using the first spreadsheets (the precursor to today's Databases)"

What? Well, I admit I was not there in the 60's, but AFAIK databases predate spreadsheets by far.

[...] "In my 47 years of being somewhat aligned to the software industry this story was repeated over and over again. There is a lesson to be learned from the Wang experience. People want tools to do tasks. They do not want to spend their own time (or their own money to hire others) to build, design, repair or change the tools that they need just in order to accomplish their own work - the work they know best and from which they make a living. A good tool - a perfect tool - is like a hammer. Its use is immediately known and it can be deployed quickly, accurately and with little or no specialized training."

Pardon me, but the last sentence sounds ridiculous. Think of a master smith making a perfect katana. Guess which tool he's going to use mostly? The hammer. Explain to me how "its use is immediately known and it can be deployed quickly, accurately and with little or no specialized training" applies here. Especially the last part. Do you _really_ think that just looking at even a remarkably simple tool such as a hammer makes you able to accomplish (accurately!) _anything_ that can be done with it? Before answering, think about Michelangelo or Da Vinci.

"So I caution all to not make light of newbies who are searching for good tools (not even perfect tools - yet) to do the work that needs doing. The world will not sit by and continue to pay for Wang operators."

There are tasks that require the human brain in order to overcome the lack of raw performance of computers. Those tasks are not complex; it's the computer that is slow. Give the computer industry 20 years of advances, and those tasks will require no human brain at all, for the computers will be fast enough. But not all tasks are like that. Some will grow with the computers: the bigger the processing power, the more data you want to process. Some are just complex at human-brain level, _no matter what tool they involve_. We have word processors these days, but they don't turn us all into great poets and writers, even if they are terribly better compared to quills. There is some (brain) complexity in computer systems in general, and there is in databases. It's just that some tasks are not for newbies, be the tool a hammer or an RDBMS. When the tool is aimed mostly at such tasks, there's little need to make it too newbie-friendly.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] vulnerability/SSL
On Wed, 2005-06-08 at 10:00 -0700, dong changyu wrote: "Hi, A possible countermeasure on Windows platform, inspired by Magnus. Thanks ;) First we remove the passphrase from the key file, making it plain. Windows provides a feature, the Encrypting File System (EFS), which provides transparent encryption/decryption. We can log on using the account we run Postgres with and encrypt the plaintext key file. Then we log on using another non-admin account, and start Postgres as a service using 'runas'. Therefore the file is encrypted; only the Postgres account and the recovery agent (built-in administrator by default) can read/modify it. The file will remain encrypted when restored from backup. I've tested it on my computer and it works. cheers, Changyu"

You mean that every process that runs as postgres has the ability to read the file _without typing any password_? Or does PostgreSQL prompt for one when you start it? Can the administrator read it _without knowing the password_? I may be missing something, but what's the difference with a file like this:

-r-------- 1 postgres postgres 50 Jan 15 21:15 akey

in any Unix system? Only postgres and root can read it. How about backups? Does the backup process (I assume it runs as administrator) store the key in cleartext?

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
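[For reference, the Unix-side arrangement being compared to is just standard ownership and permission bits; a sketch, with an illustrative path:

chown postgres:postgres /var/lib/pgsql/data/server.key   # owned by the postgres account
chmod 0400 /var/lib/pgsql/data/server.key                # read-only, owner only

which yields exactly the '-r--------' listing shown above: readable by postgres and root, and by nobody else.]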
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 02:59 -0700, Changyu Dong wrote: "Hi Marco, The problem I described in the first mail is that, for some unknown reason, if you save the server.key file with a passphrase, you will be prompted to enter the passphrase every time you start the server AND a client makes a connection, which effectively forbids us from using a passphrase to protect the key file; therefore the key file has to be saved in plaintext, without encryption. EFS is a feature provided by Windows which will encrypt any selected file using a symmetric algorithm; the symmetric key is encrypted by the user's public key and the recovery agent's public key, and the encrypted key is saved within the file header."

As long as the 'postgres' user has access to it w/o typing any password, that's only a detail. Unless someone physically steals your disk, the fact that it's stored encrypted is irrelevant. The only thing that matters is who can access it, and how.

"Thus only the user and the recovery agent can decrypt it. And another user cannot even open it (but can delete it). So we can ensure no one can read or modify it."

That's how the permission bits work in Unix. No need to encrypt the file; we know permission bits actually work as expected under Unix. In this case encryption adds no extra level of security on a running system.

"Decryption is transparent to users and applications. The operating system will do it automatically if it can find the appropriate private key. The difference between this and -r-------- 1 postgres postgres 50 Jan 15 21:15 is that the file is encrypted using EFS, while the latter remains plaintext."

I fail to see the difference. On Windows, the 'postgres' user can read it without a password; 'Administrator' has access to it, too. On Unix, with 400 permissions, the 'postgres' user can read it without a password; 'root' has access to it, too.

"When you backup the file, it remains encrypted."

Then the backup is useless. If the secret key of the user 'postgres' is lost (and it can be, since it is stored elsewhere, I think buried somewhere where 'Administrator' can find it, maybe in the user profile), you'll never recover the content of the file.

"If you restore the file to a file system which doesn't support EFS (non-NTFS), it will be corrupt, else it will remain encrypted."

Now THAT puzzles me a lot. I could imagine it being restored in plain. I could imagine it being restored encrypted. I have no way to justify the file contents being lost only because they were restored on FAT. Anyway, that's not the point here. The point is: on Windows, if someone breaks into your 'postgres' account, he can read the key. If someone breaks into your 'administrator' account, he can read the key. But other users cannot read it. This level of protection is exactly the same as that provided by the 400 permissions above under Unix: if someone breaks into the 'postgres' account, he can read the key; if someone breaks into the 'root' account, he can read the key; but other users cannot read it. I fail to see any difference in the above scenarios.

Encrypting the key in the .pem file (as supported by openssl) is completely different! No one, ever, can access it w/o knowing the password. That's why it takes an operator to type the password in. Backups are safe, too, and just as useful as the file itself: they can be restored anywhere. If someone forgets the password, the contents are lost, but that's true for the file itself as well. The backup is just what you expect it to be, a copy. You restore it, and get a _working_ copy of the file, on every filesystem.
The .pem key can even be sent by email as-is (since it's base64 encoded). The daemon should ask for the password only once; we agree on that. Storing the key encrypted (in the openssl sense) doesn't help much against root if he's able to trick the operator into typing the password again. If you're able to avoid that, that is, you're in a highly secure environment with operators trained _not_ to type the password in just to have the server restarted, .pem encryption adds a lot to your security. The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the 'postgres' user and 'administrator' have _transparent_ access to the file contents. No password required.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
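[For completeness: protecting a key "in the openssl sense" as discussed above is a one-liner; openssl prompts for the passphrase and writes out a DES3-encrypted PEM file. File names are placeholders:

openssl rsa -in server.key -des3 -out server.key.protected

The server then needs the passphrase whenever it reads the key, which, per this thread, the 8.0.3-era PostgreSQL unfortunately does at every connection rather than only at startup.]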
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 13:54 +0200, Magnus Hagander wrote: "'The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the postgres user and administrator have _transparent_ access to the file contents. No password required.' While most of what you wrote is definitely correct, you missed a few things about EFS."

[...stuff on EFS deleted...]

I agree on that. I'm not saying EFS is a bad idea; I'm only considering the server key case. Anyway, protecting against a malicious superuser is the subject of a chapter of its own. There are many ways for a superuser to hijack a user account. I'm not a Windows guy, but with a similar setup under Unix, I'd do:

su - user -c 'cp encrypted_file /tmp/unencrypted'

or anything equivalent. That is, at C level, open(2) the file, read(2) it (the system gives me the plaintext contents), write(2) it to another file. The OP said he starts PostgreSQL with 'runas', which I gather is the Windows relative of 'su' (or 'cron'?). If Administrator can run programs as the user, he can read the key file.

"That's the big difference. So it does offer a bit of extra security. Just to protect the key used to set up the SSL sessions, I'm not sure it's worth it. Because again, if they hack your admin account, they can get to your files *without* going through getting into the SSL stream."

I think you're missing two points:

1) The purpose of getting the key is _not_ to be able to decrypt the SSL stream (even if that's a nice consequence). The worst you can do is _sign_ things with the key, that is, impersonate the server, mount man-in-the-middle attacks, and so on, without anyone noticing it for a while. The whole point of encrypting the key is that a compromised key is worse than a compromised system (and way more likely to go unnoticed).

2) There's no need for the intruder to escalate to administrator powers. If they break the 'postgres' account, they read the key. Even if they break the server at SQL level, i.e. they gain superuser for the database, they may be able to read it with a COPY SQL command, since it's likely the key is accessible to the server (I'm not claiming the latter is feasible; just that, in theory, it's all they need).

The problem reported by the OP is a real one, since it prevents the use of an encrypted key. Not a big one, but still one. The solution the OP posted later does not solve the problem at all. BTW, even when the key is encrypted (the .pem way), the cleartext copy must stay in the server's memory space (possibly in a locked area). It may be available to root (debugging the server), to the user, or to the process itself, of course, if they manage to execute arbitrary code. There's no way to make it 100% safe.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 05:21 -0700, Changyu Dong wrote: --- Marco Colombo [EMAIL PROTECTED] wrote: "As long as the 'postgres' user has access to it w/o typing any password, that's only a detail. Unless someone physically steals your disk, the fact that it's stored encrypted is irrelevant. The only thing that matters is who can access it, and how. That's how the permission bits work in Unix. No need to encrypt the file; we know permission bits actually work as expected under Unix. In this case encryption adds no extra level of security on a running system. I fail to see the difference. On Windows, the 'postgres' user can read it without a password; 'Administrator' has access to it, too. On Unix, with 400 permissions, the 'postgres' user can read it without a password; 'root' has access to it, too."

"Then how about restoring it from a backup to another system? In this way the permission is bypassed but EFS still works."

Either the Windows backup contains the private key of the user or it doesn't. If not, the backup is incomplete and useless (for getting the file contents). You may get other files from it, but that's not the point: you may just not include the key file in _that_ backup. If you have two backups, one normal and another safe, just put the key file on the safe one, along with the other private keys. You can do the same under Unix, of course. If your single backup contains the user's private key, EFS is bypassed as well. This is going off-topic: the EFS approach is no different from any encrypted filesystem, nothing new under the sun. It shares the weakness of any system that lets you access the data at runtime w/o a password.

"'Then the backup is useless. If the secret key of the user postgres is lost (and it can be, since it is stored elsewhere, I think buried somewhere where Administrator can find it, maybe in the user profile), you'll never recover the content of the file.' Right, but the user's private key can be exported into a password-protected pem file."

Then save the server key in the same way. Put the server key and the user key together.

[...] "If an intruder can break the postgres or root account, he can read everything, as has been discussed, not only the key but also the data files. So in this situation, it's useless to protect the key only."

Yes, it has been discussed: the purpose of the key is not protecting the data, but protecting your identity. If the key is compromised, they can impersonate you. Generally, that is a much bigger damage. They can create fake data, _signed by you_.

[...] Yes, the .pem file can be kept for distribution and backup, but the working copy has to be plain.

"'The daemon should ask for the password only once, we agree on that.' Yes, that's the ultimate solution. So we can use an encrypted key without any outside mechanism."

We agree on that. That's the _only_ solution if you want that kind of security.

"'Storing the key encrypted (in the openssl sense) doesn't help much against root, if he's able to trick the operator into typing the password again. If you're able to avoid it, that is, you're in a highly secure environment with operators trained _not_ to type the password in just to have the server restarted, .pem encryption adds a lot to your security.' I'm not sure, but Windows is beginning to support smart card logon, therefore no password will be needed or stored."

That changes nothing. Somehow the key has to be given, unencrypted, to the server. Be it an operator typing a password or inserting a smartcard, a patched server can store the key in cleartext anywhere.
You have to teach your operators to think twice before performing anything that lets the server access the key. With your solution, you're letting the server access the key automatically.

"'The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the postgres user and administrator have _transparent_ access to the file contents. No password required.' At least it makes it impossible to restore the plain key from backup."

The safety of that backup lies only in its incompleteness. If you include the user key in the same backup, there's no security. If you don't include the user key (and thus create an incomplete backup), it's easier not to include the server key either, and put it in the same place you put the user key. They are both private keys. Including a useless copy of the server key encrypted with the user key (stored elsewhere) is just a perverse way to gain nothing. But I agree that sometimes perverse systems make perverse things look natural. :-)

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l.
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 15:04 +0200, Magnus Hagander wrote: [...] "Yes, that is correct - runas is similar to su. But in order to do runas, you need the service account's password. Once you are root on a unix system, you can do su - user *without* the password. That's a big difference. (You can also use the postgres account's smartcard, if you are using smartcard logins, but the deal is that you need *something* that is normally private to the account - even if you are an administrator)"

Is that at application level or at system level? You know, I can install a patched su that asks root for passwords as well, but the problem is with the seteuid() system call, not with su. You can (with SELinux) limit root powers a lot, but that's not the point. [...]

"I guess we could read in the password ourselves and drop it in our shared memory segment to pass to subprocesses - though that means they can get to the password easier as well. Assuming OpenSSL has the APIs for that; I haven't checked that. I'm unconvinced it makes enough of a difference to be worthwhile, though. (BTW, am I correct in reading this as a problem that only appears on win32, because of the exec nature of the backend, right? Or does it show up on Unix as well?)"

Is the Unix version much different? I think the postmaster just forks and execs the backends. But aren't connections handled by the postmaster? All the SSL negotiation should happen before the fork, I think. Is the Windows model different? Do backends handle the SSL negotiation?

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] vulnerability/SSL
On Wed, 2005-06-08 at 16:08 +0200, Magnus Hagander wrote: "'Hi, I'm using PostgreSQL with SSL these days. The version I'm using is 8.0.3. I found that it's impossible to use an encrypted key file. When you use a protected server.key file, you will be prompted to input your passphrase EVERY TIME IT'S USED, not only when you start the server but also when a client makes a connection. So you have to leave the key file unprotected. I think it's a serious vulnerability since the security relies on the secrecy of the private key. Without encryption, the only thing we can use to protect the private key is the access control mechanism provided by the OS. Any comments on this issue?' If you don't trust the access control provided by the OS, why are you putting sensitive data on it? If one can break your access control in the OS they can read all your data anyway - they don't even need to sniff the wire and decrypt it using the key. Or they can just change the passwords of your users and connect - or *change* the key."

Yes and no. They can't change the key: it's tied to the certificate, which is signed. They'd need to get a signed certificate from a trusted CA, and put the associated private key on your server after they cracked it. Which is much like leaving a big banner reading "Yes, it was me!", signed by you, at the crime scene. :-)

But overall I agree. If they gained enough privilege to read the key file, it's possible they're able to access the data as well. They might be able to patch the server and have the password that protects the key logged somewhere the next time you type it in. OTOH, I see no advantage in reading the key at connection time instead of at startup time (like every other daemon does).

An encrypted key has an interesting significance with backups. Someone may be able to steal one of your backups. They'll get old data (maybe you don't care much about that), _and_ the key. You don't want them to be able to sign stuff or impersonate your servers with it.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] PostgreSQL vs. InnoDB performance
On Fri, 2005-06-03 at 11:38 +0200, Peter Eisentraut wrote: "[On Friday, 3 June 2005 at 00:36, Peter Eisentraut wrote:] 'On a particular system, loading 1 million rows (100 bytes, nothing fancy) into PostgreSQL one transaction at a time takes about 90 minutes. Doing the same in MySQL/InnoDB takes about 3 minutes. InnoDB is supposed to have a similar level of functionality as far as the storage manager is concerned, so I'm puzzled about how this can be. Does anyone know whether InnoDB is taking some kind of questionable shortcuts it doesn't tell me about?' So here's another little gem about our friends from Uppsala: if you create a table with InnoDB storage and your server does not have InnoDB configured, it falls back to MyISAM without telling you."

Silently falling back to something unexpected seems to be quite common there. For sure it's not the only case. :-|

"As it turns out, the test done with PostgreSQL vs. real InnoDB results in just about identical timings (90 min). The test done using PostgreSQL with fsync off vs. MyISAM also results in about identical timings (3 min). The hardware seems to be the bottleneck."

Try improving the performance of your disk systems. It's very unlikely to get _exactly_ the same figures from two such different RDBMSs. You'd expect them to be close, but not identical. BTW, make sure the test correctly emulates multiple clients (say 25, 50 or 100). There's little point in stressing the transaction support of an RDBMS when there's only one single actor in the system, and therefore no contention. Transaction code always takes the fast path that way, and you're testing the less important part of it. Check out some performance tuning pages; you may need to adjust some OS and PostgreSQL configuration parameters to allow and effectively handle 100+ connections (shared buffers come to mind). I believe the same is true for MySQL.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] PostgreSQL vs. InnoDB performance
On Fri, 2005-06-03 at 08:43 -0400, Christopher Browne wrote: "After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] (Marco Colombo) belched out: 'The hardware seems to be the bottleneck. Try improving the performance of your disk systems. It's very unlikely to get _exactly_ the same figures from two such different RDBMSs. You'd expect them to be close, but not identical.' If the bottleneck is in the identical place, and they are otherwise well-tuned, it is actually *not* that surprising that the timings for PostgreSQL vs real InnoDB would be pretty close. If both are being bottlenecked by the same notion of 'how fast does the disk spin', then the differences in performance won't be dramatic."

That's my point. If the purpose of the test is to compare PostgreSQL vs. real InnoDB, there should not be any bottleneck other than the software itself.

"'BTW, make sure the test correctly emulates multiple clients (say 25, 50 or 100). There's little point in stressing the transaction support of an RDBMS when there's only one single actor in the system, and therefore no contention. Transaction code always takes the fast path that way, and you're testing the less important part of it.' Actually, if you can demonstrate near-identical performance under a common set of conditions, that's a really useful datum to start with. It would then certainly be interesting to see how the behaviour changes as various stresses are introduced..."

I take it the purpose of the test is also to measure performance under transactional load. Otherwise, inserting 1,000,000 rows one transaction at a time is just silly. I was able to do 12,000 rows/s with COPY on very cheap hardware (that's 1,000,000 rows in about 90 seconds, not minutes). I think that if you benchmark how things perform in doing silly things, you should expect silly results... So, if you want transactions, make them _real_ transactions. One client sequentially issuing transactions means nothing in this context. Transactions are all about contention. You need a bunch of concurrent client processes, at least, and possibly an N-way system on the server (to measure real contention at OS level, too). Otherwise, you'd better measure many inserts per transaction (and COPY in PostgreSQL) on a single client (which does make sense on its own).

.TM. (who has not been to Arrakis recently)
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
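[To make the "one transaction at a time" point concrete, here is a sketch of the two loading strategies, using psycopg, one of the drivers discussed on this list; the DSN, table and row count are made up, and the actual numbers will depend entirely on the disk:

import time
import psycopg  # any DB-API 2.0 driver would do

db = psycopg.connect('dbname=test')  # hypothetical database
curs = db.cursor()
curs.execute("CREATE TABLE load_test (id integer, payload text)")
db.commit()

rows = [(i, 'x' * 100) for i in range(100000)]

# strategy 1: one commit per row, so one synchronous disk write per row
t0 = time.time()
for r in rows:
    curs.execute("INSERT INTO load_test VALUES (%s, %s)", r)
    db.commit()
print "per-row commits:", time.time() - t0, "s"

# strategy 2: one big transaction, a single commit (one fsync) at the end
t0 = time.time()
for r in rows:
    curs.execute("INSERT INTO load_test VALUES (%s, %s)", r)
db.commit()
print "one transaction:", time.time() - t0, "s"

With fsync enabled, strategy 1 is bounded by the disk's rotational latency, which is exactly the 90-minutes-vs-3-minutes effect discussed above; COPY would beat both.]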
Re: [GENERAL] Adventures in Quest for GUI RAD
On Tue, 2005-05-10 at 01:51 -0400, [EMAIL PROTECTED] wrote: "Wolfgang, thanks! I am very persuaded by your arguments regarding Python. What you have written makes me look at Python in a different light. I happened to find a download of Python 2.2 which I installed at work but have not tried out. I wish I could find detailed instructions on WHICH Python to download from WHERE, and what I would need to download to access Postgresql from Python, and then some simple examples of sending a query to postgres and processing the results."

2.2? The latest Python is 2.4.1. The first place to look is obviously http://www.python.org/ - see the Download section. To access PostgreSQL from Python, you have some choices; pros and cons have already been discussed on this list. Here's a partial list of options:

http://www.druid.net/pygresql/ - I've used this, happily, under Linux. It may already be included in the PostgreSQL Windows port, I don't know.

http://initd.org/projects/psycopg1 - I've used this too, happily, again under Linux. My programs were simple enough that I could switch from one driver to the other w/o touching the rest of the application, since I'm using the DB-API 2.0 interface. I have no idea if there's a Win port of this driver.

There are other drivers, but I haven't used them. Have a look at http://www.python.org/pypi?%3Aaction=browse (under the Database section). For DBI, or DB-API, see http://www.python.org/peps/pep-0249.html - it's a reference manual, not a tutorial, and it's not PostgreSQL specific. While we're at it (and completely unrelated and off-topic), have a look at: http://www.pcpm.ucl.ac.be/~gustin/win32_ports/

"I would be perfectly happy to work with it in that funny DOS window; I would not require that it be a Windows GUI app. Just to be able to read and write to Postgresql. Do you think there is such a tutorial/documentation?"

Python for Windows has a nice console window (IDLE, I think). You should be able to find some examples via Google.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
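[Along the lines of the "simple example" the poster asked for, a minimal DB-API 2.0 session looks like this. A sketch only: the connection parameters are placeholders, and pg_user is just a handy system view that always exists to query:

import pgdb  # PyGreSQL's DB-API 2.0 module; psycopg works the same way

# adjust host/database/user for your installation
db = pgdb.connect(host='localhost', database='mydb', user='myuser')
curs = db.cursor()

curs.execute("SELECT usename, usesysid FROM pg_user")
for row in curs.fetchall():   # each row is a sequence of column values
    print row[0], row[1]

db.close()
]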
Re: [GENERAL] Python DB-API 2.0 oddity (was: I receieved an example of
On Mon, 2005-05-02 at 11:56 +0200, Karsten Hilbert wrote: "A note on what I think is a strangeness in the Python DB-API 2.0. Please correct me if I am wrong. (Note that I am not trying to throw off the OP but simply use his example to point out an oddity about that API. The point is to make sure it *is* an oddity so I can raise it with the appropriate forum, e.g. the Python community.) Observe the following notes by someone learning the DB-API:

# Get a cursor. We do this by connecting to the database: the
# (button,) arguments just connect to the database that the form is
# running in.
cursor = RekallPYDBI.connect (button, '').cursor()

So far so good... But - getting a cursor without knowing the SELECT query whose results it stands for?"

AFAIK cursors are not limited to SELECTs.

"# Execute a query. This only gets people whose ages are 21 or above.
# This is the important bit, we are accessing the database directly.
cursor.execute ('select surname from users where age >= ?', [21])

Ah, the query is set *after* getting a cursor for it - seems odd, but hey, as long as it's set before retrieving rows ... 'The key is getting the cursor. Once you have a cursor you can do inserts, updates and deletes, like ...' Huh? Pardon me? Doing inserts, updates and deletes via a cursor? The PostgreSQL documentation clearly says that the query part of a cursor definition must be a SELECT: http://www.postgresql.org/docs/7.4/static/sql-declare.html"

But what makes you think that the Python DBI was designed to be PostgreSQL specific? From http://www.python.org/peps/pep-0249.html: ".cursor() - Return a new Cursor Object using the connection. If the database does not provide a direct cursor concept, the module will have to emulate cursors using other means to the extent needed by this specification." It's up to the module implementation to use real SQL cursors when possible. AFAIK, it's not done automagically for PostgreSQL. In practice, DBI cursor objects and SQL cursors have little in common in the PostgreSQL drivers I'm aware of (PyGreSQL and psycopg). A DBI cursor is just a handle for executing SQL commands.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
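[To illustrate that last sentence: all of the following are perfectly legal through a DB-API cursor, with no SELECT and no SQL-level DECLARE CURSOR involved. A sketch with a made-up table:

import pgdb

db = pgdb.connect()
curs = db.cursor()   # just a statement handle, no query attached yet

curs.execute("CREATE TEMP TABLE users (surname text, age integer)")
curs.execute("INSERT INTO users VALUES (%(n)s, %(a)s)", {'n': 'Smith', 'a': 30})
curs.execute("UPDATE users SET age = age + 1 WHERE surname = %(n)s", {'n': 'Smith'})
curs.execute("DELETE FROM users WHERE age > %(a)s", {'a': 100})
db.commit()
]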
Re: [GENERAL] Python DB-API 2.0 oddity (was: I receieved an example of
On Mon, 2005-05-02 at 11:56 +0200, Karsten Hilbert wrote: "'The key is getting the cursor. Once you have a cursor you can do inserts, updates and deletes, like ...' Huh? Pardon me? Doing inserts, updates and deletes via a cursor? The PostgreSQL documentation clearly says that the query part of a cursor definition must be a SELECT: http://www.postgresql.org/docs/7.4/static/sql-declare.html (I am well aware that SELECT queries may have side effects that change data in the backend, such as in 'select add_new_customer()' etc.)"

BTW, look at this page (with the Oracle driver): http://www.zope.org/Members/matt/dco2/dco2doc

cursor.execute("INSERT INTO TEST (name, id) VALUES (:name, :id)", name="Matt Kromer", id=1)

I believe there are databases that allow you to send SQL statements (of any kind, not only SELECTs) only within a cursor (either implicit or explicit), hence the name of the cursor object.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Persistent Connections in Webserver Environment
On Mon, 2005-05-02 at 14:45 +0200, Hannes Dorbath wrote: "Hi, as the subject says I need some advice on setting up connection handling to PG in a webserver environment. It's a typical dual Xeon FreeBSD box running Apache2 with mod_php5 and PG 8. About 20 different applications (ecommerce systems) will be running on this box. Each app resides in its own schema inside a single database. As far as I understand, persistent connections from Apache processes can only be reused if the authentication information of the already existing connection is the same. So in case an Apache process holds a persistent connection to database 'test', auth'ed with username 'user1', and another app wants to connect as 'user2', the connection can't be reused and a new one will be spawned. So what we are doing atm is telling all apps to use the user 'apache', granting access for this user to all schemas, and firing 'SET search_path TO app_schema;' at the startup of each app / script. It works, but I really would like to have a dedicated user for each app / schema for security reasons. The next better idea I came up with was to fire 'SET SESSION AUTHORIZATION user;' at each app / script startup, but for this to work I would need to initially connect as superuser - and I really dislike the idea of having a webserver connecting as superuser :/ Any ideas? I can't be the first person on earth with that problem ;/"

Have you measured the real gain of using persistent connections at all? In my experience, it's just a CPU vs. RAM tradeoff. Before you go through the pain of setting up a weird authentication mechanism, try and consider whether you really need persistent connections. Search the lists; it has been discussed in the past. I remember this thread: http://archives.postgresql.org/pgsql-php/2005-02/msg9.php There may be others, too.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
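[For what it's worth, the "dedicated user per app" setup the poster would prefer is plain SQL. A hypothetical sketch for two apps, to be run once as a database superuser; role names and the password are made up, and a real setup would also need GRANTs on the tables inside each schema:

import pgdb

db = pgdb.connect(database='test')   # connect as a superuser for the setup
curs = db.cursor()
for app in ('shop1', 'shop2'):
    curs.execute("CREATE USER %s PASSWORD 'secret'" % app)
    curs.execute("GRANT USAGE ON SCHEMA %s TO %s" % (app, app))
    # pin the schema at login time, so the app no longer needs
    # a SET search_path at every script startup
    curs.execute("ALTER USER %s SET search_path TO %s" % (app, app))
db.commit()

The tradeoff, as discussed in this thread, is that with pg_pconnect each distinct user multiplies the pool of idle backends.]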
Re: [GENERAL] Persistent Connections in Webserver Environment
On Mon, 2005-05-02 at 17:32 +0200, Hannes Dorbath wrote: "On 02.05.2005 16:41, Marco Colombo wrote: 'Have you measured the real gain of using persistent connections at all?' As simple as possible:

<?php
require_once('Benchmark/Timer.php');
$timer = new Benchmark_Timer();
$timer->start();
pg_pconnect('host=myhost dbname=database user=user');
pg_query("SET search_path TO myschema;");
$q = "SELECT u.login FROM users WHERE u.user_id = 1;";
$qr = pg_query($q);
print_r(pg_fetch_all($qr));
$timer->setMarker('Database');
$timer->stop();
$timer->display();
?>

Results: pconnect: 0.001435995101928, connect: 0.016793966293335. It's a factor of 10 on such simple things on the BSD box."

OK, but the difference is going to go unnoticed; that's not the point at all. The question was: have you measured any difference in the server load? I did in the past, and wasn't really able to measure any, with more than 300 http processes active.

"The web server load is _way_ lower than the db server's. Currently we're at about 100 processes (but with pconnect) and: (web) load average: 0.31, 0.27, 0.21 (db) load average: 0.24, 0.21, 0.18, and I know that switching to simple connect won't change much, as page load time is dominated by the time spent in the queries (and the overhead of 1/100 or 1/1000 of a second in the startup time goes completely unnoticed)."

With any modern operating system, the overhead is very low (15ms is very good, actually). In my experience, pconnect may cause RAM problems: the number of processes is uselessly high. You have to make provisions for a large number of backends, and that means little RAM for each single backend. My advice is: use pconnect only when you have CPU problems, unless your case is a very degenerate one (your db host being on the other side of the globe). And, in my experience again, the first causes of CPU problems on the database server are:

- wrong/missing vacuum/analyze (or similar);
- bad coding on the application side (placing silly load on the server);
- bad queries (misuse/lack of indexes);
- bad tuning of PostgreSQL (especially RAM);
...
...
- connect overhead.

I've never managed to reach the last item in the list in real-world cases. I think it is by far the least important item. #1 Golden Rule of optimizing: - Don't. (Especially when it causes _real_ troubles elsewhere.) Have a nice day,

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] PRIMARY KEY on a *group* of columns imply that each column is
On Wed, 2005-04-27 at 17:00 +0200, Stephane Bortzmeyer wrote: "On Wed, Apr 27, 2005 at 09:36:57AM -0500, Scott Marlowe [EMAIL PROTECTED] wrote a message of 18 lines which said: 'Often the best bet here, btw, is to declare it not null then use something other than null to represent null, like the text characters NA or something.' Yes, but it defeats the purpose of NULL. And what should I use as a pseudo-NULL value for INET? 127.0.0.1? 0.0.0.0? Special values are well-known for the problems they raise. That's why many languages have NULL-like solutions (None in Python, undef in Perl, Maybe types in Haskell, etc)."

No. NULL is NOT 'None', nor 'undef', and definitely not NULL as in C. Those are perfectly defined values, although special ones. Only 'undef' is quite misleading, but nevertheless it is just _one_ value. That is, given a variable A, you can always write a boolean expression that evaluates to True or False to test whether A is _equal_ to None/undef/NULL (in C):

$ python -c "a = None; print a == None"
True
$ perl -e 'print "a" == undef, "\n"'
1
$ cat p.c
#include <stdio.h>
int main(int argc, char *argv[])
{
    char *a = NULL;
    printf("%d\n", a == NULL);
}
$ cc p.c
$ ./a.out
1

About Haskell, I don't know. From what I understand from a quick read of the manual, I'd say Nothing is similar: you _can_ tell whether something is equal to Nothing.

In databases, NULL has a completely different meaning. It doesn't mean _no value_ (which would be just a special, single value), it means _unknown value_. You simply can't compare it with any single value (even special ones) and expect a boolean answer. The only possible answer is 'I don't know', which is NULL in boolean. Notice that boolean NULL is _not_ the same as False. So you get:

marco=# select 2 = 2;
 ?column?
----------
 t
(1 row)

marco=# select 2 = 3;
 ?column?
----------
 f
(1 row)

marco=# select 2 = NULL;
 ?column?
----------

(1 row)

that is, neither true nor false.

Back to your example: you can compare ('a', 2) with ('a', 2); the result is 't' and thus you've managed to identify the right row (it works as a primary key). Also, ('a', '3') is different from ('a', '2'), so you can tell the two rows are different. But what if you allow ('a', NULL)? ('a', NULL) is neither the same as _nor different from_ ('a', 2). The result of the comparison is NULL, no matter how you test it:

marco=# select ('a', 2) = ('a', NULL);
 ?column?
----------

(1 row)

marco=# select ('a', 2) <> ('a', NULL);
 ?column?
----------

(1 row)

see? _Neither_ one is true. This would completely defeat the purpose of the primary key. And of course, comparing ('a', NULL) with ('a', NULL) gives exactly the same result:

marco=# select ('a', NULL) = ('a', NULL);
 ?column?
----------

(1 row)

marco=# select ('a', NULL) <> ('a', NULL);
 ?column?
----------

(1 row)

That's why NULLs are not allowed in primary keys. The key simply won't work. NULL in databases is not _one_ special value. It's _any_ value, since it's unknown. The boolean expression 2 = NULL might be true or might be false, since NULL could be _any_ integer in that expression. This is completely different from the semantics of None/undef/NULL in most programming languages. You wrote: "Special values are well-known for the problems they raise." Then NULL is definitely _not_ the value you're looking for. Everything can be said of NULL, except that it is well-known. In your case, by choosing (name, address) as the primary key, you're saying 'I need to know both the name and the address to be able to retrieve a datum from the table'.
This implies that if you have partial knowledge (you don't know the address), you can't naturally retrieve a single datum (or insert one). Depending on what you're trying to achieve, you may need to split the table ('normalization', the theorists call it). I don't like theory much, but its conclusions sometimes just make a lot of sense. :-) Review your design: maybe either the table schema or the choice of the primary key is not natural for your database.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] optimal hardware for postgres?
On Tue, 2005-04-26 at 01:32 -0700, William Yu wrote: "Linux 2.6 does have NUMA support. But whether it's actually a win for Postgres is debatable due to the architecture. First let's take a look at how NUMA makes things run faster in a 2x Opteron system. The idea is that processes running on CPU0 can access memory attached to that CPU a lot faster than memory attached to CPU1. So in a NUMA-aware system, a process running on CPU0 can request all its memory be located in memory bank0. [...]"

This is only part of the truth. You should compare it with real SMP solutions. The idea is that CPU0 can access directly attached memory faster than it would on an SMP system, given equivalent-or-so technology, of course. So NUMA has a fast path and a slow path, while SMP has only one, uniform, medium path. The whole point is where the SMP path lies. If it's close to the fast (local) path in NUMA, then NUMA won't pay off (performance-wise) unless the application is NUMA-aware _and_ NUMA-friendly (which depends on how the application is written, assuming the underlying problem _can_ be solved in a NUMA-friendly way). If the SMP path is close to the slow (remote) path in NUMA (for example, because they have to keep the caches coherent, and obey memory barriers and locks), then NUMA has little to lose with NUMA-unaware or NUMA-unfriendly applications (the worst case), and a lot to gain when some NUMA-aware optimizations kick in. I've read some articles that refer to the name SUMO (sufficiently uniform memory organization), which AMD reportedly uses to describe their NUMA; it seems to imply that their worst case is sufficiently close to the usual SMP timing.

There are other interesting issues in SMP scaling, on the software side. Scaling with N > 8 might force partitioning at the software level anyway, in order to reduce the number of locks, both as software objects (reducing software complexity) and as hardware events (reducing time spent in useless synchronizations). See: http://www.bitmover.com/llnl/smp.pdf This also affects ccNUMA, of course; I'm not saying NUMA avoids this in any way. But it's a price _both_ have to pay, moving their numbers towards the worst case anyway (which makes the worst case not so bad).

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Postgres source (tar file) for Fedora Core OS?
On Tue, 2005-04-26 at 09:43 +0530, Dinesh Pandey wrote: "From where can I download the latest Postgres source (tar file) for Fedora Core OS? Regards Dinesh Pandey"

This is a question for the Fedora mailing lists. Anyway, FC distributes sources in RPM format. Assuming you're referring to the latest update, you can find the sources at: http://download.fedora.redhat.com/pub/fedora/linux/core/updates/3/SRPMS/ The current latest PostgreSQL source RPM is postgresql-7.4.7-3.FC3.1.src.rpm, which contains the original source tarball plus patches. To view the RPM contents, type:

rpm2cpio postgresql-7.4.7-3.FC3.1.src.rpm | cpio -itv

See the rpmbuild(8) manual page for details on how to build binary RPMs from the source one.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
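[In practice, rebuilding binary packages from that source RPM boils down to one command, assuming the build dependencies are installed; on an FC3-era system the resulting RPMs land under /usr/src/redhat/RPMS:

rpmbuild --rebuild postgresql-7.4.7-3.FC3.1.src.rpm
]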
Re: [GENERAL] Filesystem options for storing pg_data
[I've got a private reply from Scott, which I won't quote here, and which can be fairly (I hope) summarized as "search the pgsql-performance list". Well, I've done it, and I feel it's due to bring the issue back in public. So if it seems I'm replying to myself, I'm not: I'm replying to Scott. I realized the reply was private only just before sending this out.]

On Wed, 2005-04-20 at 12:07, Marco Colombo wrote: On Wed, 2005-04-20 at 11:18 -0500, Scott Marlowe wrote: "Generally XFS and JFS are considered superior to ext2/3." Do you mind posting a reference? I'm really interested in the comparison, but every time I've asked for a pointer, I've got no valid resource, so far. [...]

Well, my point being that the ones I find lead to the conclusion that ext3 is considered superior to XFS and JFS. One for all: http://www.oracle.com/technology/oramag/webcolumns/2002/techarticles/scalzo_linux02.html "It's reassuring when various industry-standard benchmarks yield similar results. In case you're wondering, I obtained similar results with Benchmark Factory's other half dozen or so database benchmarks - so for me, it'll be ext3." Have a look at the graphs: ext3 is almost twice as fast in these (database) benchmarks. Another one is: http://www.kerneltraffic.org/kernel-traffic/kt20020401_160.html#8 Again ext3 is the winner (among journalled filesystems), but by a small edge only. And again, there are a lot of variables. Using, for example, data=journal with a big journal file on a different disk would be extremely interesting, just as using a different disk for WALs is at the PostgreSQL level (the result might be the same).

All the other benchmarks I've found, with a simple search for 'filesystem benchmark' on the pgsql-performance list, either are the usual Bonnie/iozone irrelevant benchmarks, or don't seem to care about tuning ext3 mount options and use the defaults (thus comparing apples to oranges). I'm not stating that ext3 is better. My opinion on the matter is that you shouldn't care about the filesystem much (ext3, JFS and XFS being the same for _most_ purposes with PostgreSQL). That is, it's a small little spot in the big picture of performance tuning. You'd better look at the big picture. I'm only countering your claim: "Generally XFS and JFS are considered superior to ext2/3." There's no general agreement on the lists about that, so just handwaving and saying "look at the lists" isn't enough. Mind posting a pointer to _any_ serious PostgreSQL-based (or any database, at least) benchmark that consistently shows XFS and JFS as superior? One that cares to show ext3 noatime, data=ordered, data=writeback and data=journal results, too? If I were to choose based on the results posted on the list (that I've managed to find), ext3 would be the winner. Maybe I've missed something.

"Having used ext3 quite a bit, I'd say it's fairly stable and reliable, but I have seen references here to known, possibly unfixable bugs."

Again, mind posting a reference? [...] I've searched for 'ext3 bug' but got nothing. I'm (loosely) following l-k, and have never heard of possibly unfixable bugs in ext3 from any developer. Care to post any real reference? There have been bugs, of course, but that holds true for everything, XFS and JFS included. Having re-read many, many messages just now, I'm under an even stronger impression that _all_ negative comments on both the stability and the performance of ext3 start with "I've heard that...", with almost no one providing direct experience.
Many comments display little understanding of the subject: some don't know about the data= mount option (there's little point in comparing to XFS if you don't use data=writeback), some have misconceptions about what the option does, and about what difference it makes when the application keeps _syncing_ the files (I don't know the details well either). See the data=journal case.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
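[For reference, the mount options in question go in /etc/fstab; a hypothetical line for a dedicated PostgreSQL data partition (device and mount point are made up):

/dev/sdb1  /var/lib/pgsql  ext3  noatime,data=writeback  1 2

data= can be writeback, ordered (the default) or journal; a fair ext3-vs-XFS comparison should state which one was used.]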
Re: [GENERAL] Filesystem options for storing pg_data
"generate a lot of unnecessary disk activity. XFS tries to cache as much data in memory as possible, and generally only writes things out to disk when memory pressure dictates that it do so."

So, if a benchmark shows XFS is faster, it's a matter of better caching, right? But it comes at the price of possible (data) corruption... Thankfully, that's pretty useless to us, with every write followed by a sync.

I'm sorry, but with the links _you_ selected, applying my filter conditions 1) and 2), which are necessary for a fair comparison, one could say there's general consensus on ext3 being far superior to other filesystems, not the opposite. Note that I'm not interested in supporting such a claim. As I already wrote, I think FS selection generally has a minimal impact on PostgreSQL performance. But again, what was your original claim, "Generally XFS and JFS are considered superior to ext2/3", based upon?

I apologize if I sound somewhat harsh; it's not really intended. But next time, please assume that: - I'm able to do a 10-minute search; - I've got some work to do, too, but I'm willing to spend more than 10 minutes on this research (it already took me more than 2 hours, actually, of my spare time); - if I say I've searched the lists and read many messages, I've really done so.

You're absolutely entitled to your opinion. If you like XFS and JFS, go ahead and use them: because of their names, the names of their sponsors (IBM and SGI), their features, your personal experience, or whatever. Just please don't claim that's the general consensus of the pgsql lists. There's _no_ general consensus. There's _no_ clear winner. And if you do want a winner anyway, it's ext3, so far. This "ext3 is not as good as XFS and JFS" is a recurring subject, along with "ext3 is buggy". _Every single time_ I've asked for references to back up such claims, nothing valuable was presented. On the contrary, the only references I've found are on the "ext3 is equal or better" side. Now, feel free to prove me wrong.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] PostgreSQL as a filesystem
On Mon, 2005-04-18 at 17:18 -0400, Tom Lane wrote: "Christopher Nelson [EMAIL PROTECTED] writes: 'I'm developing a hobby OS and I'm looking into file systems. I've thought about writing my own, and that appeals, but I'm also very interested in the database-as-a-filesystem paradigm. It would be nice to not have to write all of the stuff that goes into the DBMS (e.g. parsers, query schedulers, etc) myself. So I was wondering what sort of filesystem requirements Postgres has.' There are DBs you could use for this, but Postgres (not Postgre, please, there is no such animal) isn't one of them :-(. We really assume we are sitting on top of a full-spec file system --- we want space management for variable-size files, robust storage of directory information, etc."

I've been thinking of it, too. I think no filesystem out there is really optimized for a steady write load with many fsyncs, that is, really transaction-oriented on the data side (journalled ones may implement real transactions, but only for metadata). Out of curiosity, do you have any feedback from filesystem people? Are they interested in optimizing for the kind of workload (especially on writes) a database generates? I ask because it seems to me it's a corner case to them, or even a degenerate one. I'm not aware of _any_ comparative benchmark among different filesystems that is based on write+fsync load, for one.

Using a DB as a filesystem at OS level is a different matter, of course. Christopher, you may want to have a look at FUSE: http://fuse.sourceforge.net/ It may help both in developing a new filesystem and in understanding how one works under Linux (with a nice separation of userspace and kernelspace). I think you could even write one based on PostgreSQL, but it won't help much, since PostgreSQL needs a filesystem to work. But if your OS has TCP/IP, it could be interesting anyway. Note that I'm not aware of any way to access PostgreSQL other than sockets, so you need those at least. There's no standalone library you can link to in order to access the database files, AFAIK.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] [HACKERS] plPHP in core?
[Cc: list purged a bit] On Sun, 3 Apr 2005, Jim C. Nasby wrote: On Sun, Apr 03, 2005 at 08:41:15PM -0700, Joshua D. Drake wrote: None on the server side (except PostgreSQL) which makes the argument all that more powerful :) So what you're saying is that no database sounds complete because no database includes PHP as a procedural language. Sorry, but I don't buy it. I do. Actually I think no database is complete because no one includes LISP as a procedural language (pun on procedural intended). (BTW, I have no idea if a pl/LISP module ever existed.) .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Sim Zacks wrote: The only ?issue? that I have found with it is similar to an issue I posted about multiline in general, which does not seem to be considered a bug. I've posted similar concerns in the past. The whole point is that there are two possible approaches: 1) treat text as binary - as we do now; 2) do on-the-wire conversion - like FTP ASCII mode. Both have disadvantages, and both lead to unexpected results. As I wrote before, 2) is more problematic. You'll have to reject any file with a bare \n from a Windows client, otherwise you won't be able to process it correctly. I think if you do:

insert into test (sometext) values ('Line one\nLine two\r\n'); -- with the literal chars, not the escape sequences

you're expecting exactly the same on output. If the server converts it to the Unix form: 'Line one\nLine two\n' for storing and then converts back to the Windows form, when you do:

select sometext from test; -- from a Windows client

you get: Line one\r\nLine two\r\n which is not the same you entered. I doubt FTP ASCII mode handles this correctly. As for the examples you made (the python functions), it's a problem with python string literals (just don't use them). Let's try this:

-- CUT HERE ---8<---
#!/usr/bin/env python

import pgdb

db = pgdb.connect()
curs = db.cursor()

# this is only to emulate PGAdmin under Windows (I don't have it)
# (I ran the script with these uncommented on Linux)
#q = "create temp table test1(f1 varchar(50))"
#curs.execute(q)
#q = "insert into test1 values('this is a multi line string\r\nline2\r\nline3\r\n')"
#curs.execute(q)

# real test
# embedded in string literal
q1 = """select count(f1) from test1 where f1 = 'this is a multi line string
line2
line3
'"""

# escapes (expanded by python)
q2 = "select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"

# escapes (expanded by PostgreSQL)
q3 = r"select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"
curs.execute(q3)

# stating the obvious
print "Comparisons:"
print "%-10s%-10s%-10s" % ("q1 == q2", "q1 == q3", "q2 == q3")
print "%-10s%-10s%-10s" % (q1 == q2, q1 == q3, q2 == q3)

print "\nRunning tests..."
curs.execute(q1)
print "Test 1 (string literal):", curs.fetchone()[0]
curs.execute(q2)
print "Test 2 (Python escapes):", curs.fetchone()[0]
curs.execute(q3)
print "Test 3 (PG escapes):", curs.fetchone()[0]

# in case someone wonders, let's try using query parameters
astring = """this is a multi line string
line2
line3
"""
q = "select count(f1) from test1 where f1 = %(mystr)s"
curs.execute(q, { "mystr": astring })
print "Test 4 (parameters):", curs.fetchone()[0]
---8<--- CUT HERE --

This is the output (on Linux):

Comparisons:
q1 == q2  q1 == q3  q2 == q3
False     False     False

Running tests...
Test 1 (string literal): 0
Test 2 (Python escapes): 1
Test 3 (PG escapes): 1
Test 4 (parameters): 0

Which is consistent with your examples, that is, it works _only_ with explicit escapes (either at Python level or at PostgreSQL level). If client-side python works this way, why are you expecting server-side python to work differently? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Marco Colombo wrote:

# escapes (expanded by PostgreSQL)
q3 = r"select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"
curs.execute(q3)
^^^^^^^^^^^^^^^^

This line (no. 28) is useless (but harmless), please ignore it (just a cut&paste error). .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] multi line text data/query ?bug?
On Wed, 23 Mar 2005, Sim Zacks wrote: While I would agree with you that from a purely technical standpoint, the user inserted into the database a CRLF and a query with just an LF does not exactly match that, from a user's and more practical perspective, that does not make sense at all. That is why I surrounded the word bug in ??. I would say that from a user's perspective it qualifies as a bug because they did not put in specific binary characters. They want a newline. From a database standards perspective, I would argue that any database that allows connections from a client without qualifying a required operating system should be OS neutral. I would say it is a bug from a user's perspective because the exact same query works differently from different clients. Since the user does not choose what binary characters to put in, they are invisible to the user. Anything that is completely invisible to the user should not be considered valid qualifying data. As there is no postgresql database standard, such as "all newlines are unix newlines", it is impossible to write a client that will necessarily return the data that you want. This is the exact problem we are having with Python right now, as a Windows client cannot write a python function to be run on a linux server.

Unfortunately, it's not that simple. There are problems with python when _both_ the client and the server are Windows. Python itself _always_ uses \n, even on Windows. So the only solution is to pythonize the input (convert to \n), no matter what. For the more general problem of handling text, see my comments in this thread: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php There are interesting problems with multiline text, as a datatype. Think of digital signatures and checksums. Think of a simple function, len(text): should it count line separators as characters? In theory, the only way to get cross-platform consistent behaviour is to _ignore_ line separators when counting or checksumming. But the real world solution is to treat textfiles as binary and let the users or the application handle the conversion. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
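(A small sketch of the len(text) problem; the counts are for these particular strings:)

unix_text = "line one\nline two\nline three\n"
win_text = "line one\r\nline two\r\nline three\r\n"

print len(unix_text)    # 29
print len(win_text)     # 32 -- the "same" text, three characters longer

# ignoring line separators gives a platform-neutral count:
print sum(len(l) for l in unix_text.splitlines())   # 26
print sum(len(l) for l in win_text.splitlines())    # 26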
Re: [GENERAL] multi line text data/query ?bug?
On Wed, 23 Mar 2005, Sim Zacks wrote: In any case, there are 2 correct solutions to the problem for the case of postgresql. 1) Database standard - postgresql chooses a newline standard and every client must support that if they support postgresql. Either put the onus on the client developers to override the OS standard and support the postgresql standard, or have the db convert the incoming newlines into the db standard. 2) Server's OS - the server must convert any newlines coming in to the standard that it expects. This is similar to an ftp client that converts newlines on the transfer. That means that data sent to a Windows server with an LF will be converted to a CRLF and vice versa. The data restore function will also have to follow the above procedures to make sure you can take data from one server to the other without compromising integrity. Without one of these solutions, PostGreSQL is not compatible between servers and clients. A query written on any client should return the same result. The query being the visible appearance on the screen. That is what the users would expect to have returned.

Yeah, those were my points. The _open_ problems are:
- what about storing a signed document? It's possible that newline conversion makes the signature invalid. How would you restore the original document? Before you answer, think of: a) a client running on a platform different from the one that inserted the document; b) a document with _mixed_ newline types, such as a Windows text with embedded bare \n or \r. [*]
- what about any other function that may be affected by newline style? I mean, the user may insert a text that he knows is 1000 chars long, and find that PG thinks it's only 980. Is this consistent? What if the user selects for the messages longer than 990? What is the expected answer, from the user standpoint?

There's no easy solution, I think. [*] This is _way_ more common than you'd think. RFC2822, internet message format, says lines are CRLF separated. It happens sometimes that a message contains an NL or a CR alone. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
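(Solution 1), with the db doing the conversion, boils down to something like this sketch -- and it is exactly the lossy step the signed-document objection above is about:)

def normalize_newlines(s):
    # order matters: collapse CRLF first, then stray CRs (old Mac style)
    return s.replace('\r\n', '\n').replace('\r', '\n')

print repr(normalize_newlines('a\r\nb\rc\nd'))   # 'a\nb\nc\nd'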
Re: [GENERAL] New user: Windows, Postgresql, Python
Michael Fuhr wrote: On Tue, Mar 15, 2005 at 10:46:09PM +0000, Paul Moore wrote: The long and short of it is that I believe you just use \n to delimit lines on Windows, just like anywhere else. Many thanks -- your test results contain the info we've been seeking. Thanks a lot Paul. Michael, you were right. It seems the python documentation is plain wrong, or I'm not able to read it at all: http://docs.python.org/ref/physical.html "A physical line ends in whatever the current platform's convention is for terminating lines. On Unix, this is the ASCII LF (linefeed) character. On Windows, it is the ASCII sequence CR LF (return followed by linefeed). On Macintosh, it is the ASCII CR (return) character." This is the language _reference_ manual, btw. I'm very surprised to hear python on windows is so broken. Anyway, that makes life simpler for us. plpython programs are \n separated, no matter what platform the server runs on. Client applications just need to comply, which is what I suggested some time ago. I'm glad to hear there's nothing to change on the server side. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Marco Colombo [EMAIL PROTECTED] writes: Right now I'm parsing the string first, changing the resulting parse tree adding missing nodes (def, INDENT, DEDENT) and then compiling it. Hmmm ... is this really going to be simpler or more robust than lexing the string carefully enough to insert tabs at the right places? The impression I had so far was that you'd only need to understand about Python's string-literal conventions to get that right ... and that's something that's not likely to change. I'm not so sure that parse trees can be regarded as an immutable API. regards, tom lane

I've completed a proof of concept, I think I can answer:
- simpler? Not at all. It requires understanding of how the parser works. The whole thing is about 50 lines long, but quite a bit of parser magic is going on. And I'm far from the point I can be confident about it always doing the right thing. I still have to handle (de)allocations correctly.
- more robust - yes. The only way to make sure we're lexing the string the same way python does is to use its lexer. Every single difference, however subtle, would sooner or later show up as a bug. And it's re-inventing the wheel. But there's no way to work at lexer level that I'm aware of. That is, to add tokens before sending them to the parser. So you have to work on the parser output.
- I have no idea if the node API is immutable at all. For sure, the interface I'm using is one or two levels below the current one, and yes, it's more likely to change. I share your concerns here.

If our problem is only string literals, maybe we can handle them with a dedicated lexer. Python string literals are quite complex (compared to other languages): http://docs.python.org/ref/strings.html but not that hard. Still, my first concern is that one day we find another corner case in python syntax that makes our blind tab adding fail. And we're back to square one. BTW, I'm not preparing a patch for now, I'm working with a test program. As soon as I finish it, either I'll post it or prepare a patch against plpython.c, for consideration. I won't say it is ready for inclusion until someone else more knowledgeable than me on both PostgreSQL and python embedding looks at it, anyway. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
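(At the Python level -- not the C API the thread is about -- the standard tokenize module shows what "using Python's own lexer" buys you; a sketch:)

import token
import tokenize
from StringIO import StringIO

# the triple-quoted literal comes back as one STRING token, so the
# tokenizer, unlike blind tab-adding, knows where logical lines really end
src = 'a = """one\ntwo"""\nb = 1\n'
for tok in tokenize.generate_tokens(StringIO(src).readline):
    print token.tok_name[tok[0]], repr(tok[1])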
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote: OMG! It's indenting the function body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I understand that once you do it you'll have to shift the code by an indentation level. Presumably because it wants to create a function, which can later be called. Since python is sensitive to whitespace it has to indent the code to make it work. Seems like we have to upgrade that thing to have a complete understanding of Python lexical rules --- at least enough to know where the line boundaries are. Which is pretty much exactly the same as knowing which CRs to strip out. So I guess we have a candidate place for a solution. Anyone want to code it up? I don't know enough Python to do it ...

I'm no expert but I'll look into it. Unless someone else already tried it, I want to investigate first if it's possible to create a callable object w/o using 'def', which alters the name space and (the thing we're interested in) needs an extra indentation level. At first sight, what we do now (at function creation time) is: 1) execute a function definition; 2) compile a function call, and save the resulting code object for later use. I'm wondering if we can save one step, and use a python callable object. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
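(A sketch of why the 'def' is hard to avoid: a bare body with 'return' won't even compile, so the handler wraps and indents it. __plpython_fn is a made-up name; the real handler uses the function's own name:)

body = "return 8 / 5"

# stand-alone compilation fails: 'return' is only legal inside a def
try:
    compile(body, "<body>", "exec")
except SyntaxError, e:
    print "stand-alone:", e.msg      # 'return' outside function

# so wrap it: prefix a def and indent every body line by one level
wrapped = "def __plpython_fn():\n" + "".join(
    "\t" + line + "\n" for line in body.splitlines())
ns = {}
exec wrapped in ns
print ns["__plpython_fn"]()          # 1 (integer division in Python 2)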
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Michael Fuhr wrote: On Thu, Mar 17, 2005 at 09:48:51PM -0500, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. Unfortunately, I don't think that proves anything, because according to earlier discussion Python will do newline-munging when it reads a file (including a script file). The question that we have to deal with is what are the rules for a string fed to PyRun_String ... and it seems those rules are not the same. Marco, you've stated that you're against munging the code because it's not our job to 'fix' data coming from the client. But I'm suggesting that we think about the code in a different way than the current implementation does: not as a literal that we pass untouched to the Python interpreter, but rather as code that Python would munge anyway if it had read that code from a file. We could still store the code exactly as received and have the language handler munge it on the fly, as we've discovered it's already doing. Comments? Have I overlooked anything? Could munging CRs have effects that a Python programmer wouldn't expect if the same code had been read from a file? Since it mimics Python's own behavior with code read from a file, can anybody justify not doing it? If you put it that way, I'm 100% with you. Just let me suggest not to mimic its behaviour, but to use the Real Thing if we manage to. That is, directly use the Universal Line Support code provided by python itself, so that we don't even have to think about being compatible. Unfortunately, I'm new to python embedding. I think I'll ask on python lists about the function definition code. Actually, we are kind of a corner case: we ask the user to provide the function body w/o the initial def declaration. We're dealing with partial, incomplete python code, not a well-formed program, so we have to munge it anyway. I have no idea if and how the python C API lets you control such low level details. I think what we really want is to create a callable (anonymous) object from the source of its body. I'm experimenting a bit, trying to write a PLy_procedure_compile() that does not require source munging. I'm aiming at removing the need for extra indentation. The \r\n thing is another beast, and I'm not sure it belongs to the same place in our code. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: On Fri, Mar 18, 2005 at 12:35:07AM -0500, Tom Lane wrote: So that part of it can be solved fairly easily. Should I submit a patch? It should be only a few additional lines in PLy_procedure_munge_source(). Would you apply it only to HEAD, or would it be considered a bug fix that REL8_0_STABLE could get as well? It might be nice to have it in 8.0.2, whenever that comes out. I think it would be reasonable to back-patch a small fix to convert CRLF. The sort of rewrite Marco is considering, I wouldn't back-patch. regards, tom lane

Short update: it isn't possible to get a callable object directly from the source provided by the user, since 'return' is not valid outside function definitions in Python. Right now I'm parsing the string first, changing the resulting parse tree adding the missing nodes (def, INDENT, DEDENT) and then compiling it. The grammar definition for a python function is:

funcdef: [decorators] 'def' NAME parameters ':' suite
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

What we get from the user is stmt+ (a sequence of stmt). The INDENT and DEDENT tokens are generated by the parser only when the indentation level _changes_. My plan is to generate two parse trees, one from this code (with the right fname of course):

def fname():
    pass

and one from the function definition. Then, we attach the root of the second tree where the 'pass' node is in the first tree. We should get a parse tree ready for compilation. I wish I could push the right tokens in the right places, but it seems it's not possible. Stay tuned. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Wed, 16 Mar 2005, Michael Fuhr wrote: [I've changed the Subject back to the thread that started this discussion.] On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote: I'm against any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. Yeah, that's a reasonable argument against modifying the function source code before storing it in pg_proc. But I expect this problem will come up again, and some people might not care about being able to reconstruct the original input if it's just a matter of stripped carriage returns, especially if the function logic doesn't use literal carriage return characters that would be missed. For those people, the validator hack might be an acceptable way to deal with a client interface that inserts carriage returns that the programmer didn't intend anyway. Not necessarily as part of the core PostgreSQL code or even distributed with PostgreSQL, but as something they could install if they wanted to.

Agreed.

I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's an undocumented side effect, and bad practice. Whether it's an undocumented side effect depends on the language, and whether it's bad practice is a matter of opinion. Sure. I mean, we may just state that, per spec. Program data should be \n-delimited, full stop. It sounds sensible to me. Just put it somewhere in the docs, problem solved. We're losing nothing. I'm just proposing to add that to the docs/specs.

In any case, that's the language's concern and not something PostgreSQL should judge or enforce. PostgreSQL shouldn't have to know or care about a procedural language's syntax -- a function's source code should be an opaque object that PostgreSQL stores and passes to the language's handler without caring about its contents. Syntax enforcement should be in the language's validator or handler according to the language's own rules. That's what we do now. My point being it's not our job to fix data coming from the client. If a client app creates a plpython function the wrong way, fix it. Why should we place a paper bag on a client bug?

Speaking of code munging and syntax enforcement, have a look at this:

CREATE FUNCTION foo() RETURNS text AS $$
return """line 1
line 2
line 3"""
$$ LANGUAGE plpythonu;

SELECT foo();
      foo
----------------
 line 1
        line 2
        line 3
(1 row)

Eh? Where'd those leading tabs come from? Why, they came from PLy_procedure_munge_source() in src/pl/plpython/plpython.c:

mrc = PLy_malloc(mlen);
plen = snprintf(mrc, mlen, "def %s():\n\t", name);
Assert(plen >= 0 && plen < mlen);

sp = src;
mp = mrc + plen;

while (*sp != '\0')
{
    if (*sp == '\n')
    {
        *mp++ = *sp++;
        *mp++ = '\t';
    }
    else
        *mp++ = *sp++;
}
*mp++ = '\n';
*mp++ = '\n';
*mp = '\0';

How about them apples? The PL/Python handler is already doing some fixup behind the scenes (and potentially causing problems, as the example illustrates). OMG! It's indenting the function body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I understand that once you do it you'll have to shift the code by an indentation level. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l.
_/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
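(The same munging, sketched in Python, shows exactly how the tabs leak into a triple-quoted literal in the body:)

def munge(name, src):
    # mirror PLy_procedure_munge_source(): prefix a def, then add a tab
    # after every newline -- including newlines inside string literals
    return "def %s():\n\t%s\n\n" % (name, src.replace("\n", "\n\t"))

body = 'return """line 1\nline 2\nline 3"""'
ns = {}
exec munge("foo", body) in ns
print repr(ns["foo"]())   # 'line 1\n\tline 2\n\tline 3' -- tabs leaked in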
Re: [GENERAL] pg_dump large-file support 16GB
On Thu, 17 Mar 2005, Rafael Martinez Guerrero wrote: My question is why is this limit (16GB) there, when my OS does not have that limit? Is it possible to take it away in an easy way? It looks like pg_dump is compiled with large-file support because it can work with files bigger than 4GB. More ideas? :)

Things to try:

a) shell redirection:

$ pg_dump ... > outfile

b) some pipes:

$ pg_dump ... | cat > outfile
$ pg_dump ... | dd of=outfile

a) may fail if there's something wrong with pg_dump and large files. b) is different in that it's the right side of the pipe that outputs to the filesystem. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] New user: Windows, Postgresql, Python
On Wed, 16 Mar 2005, Michael Fuhr wrote: On Wed, Mar 16, 2005 at 01:46:23PM +0100, Marco Colombo wrote: It seems the python documentation is plain wrong, or I'm not able to read it at all: http://docs.python.org/ref/physical.html "A physical line ends in whatever the current platform's convention is for terminating lines. On Unix, this is the ASCII LF (linefeed) character. On Windows, it is the ASCII sequence CR LF (return followed by linefeed). On Macintosh, it is the ASCII CR (return) character." Perhaps the Python documentation could use some clarification about when the platform's convention is required and when it isn't. The Embedding Python documentation shows embedded code with lines ending in \n and without saying anything about requiring the platform's convention: http://docs.python.org/ext/high-level-embedding.html This is the language _reference_ manual, btw. I'm very surprised to hear python on windows is so broken. Anyway, that makes life simpler for us. plpython programs are \n separated, no matter what platform the server runs on. That the behavior makes life simpler is an argument against it being broken (although it would be even less broken if it were more flexible about what line endings it allows).

broken == 'not conforming to the specifications or the documentation'

The fact it helps us is just a side effect.

A detailed response would be getting off-topic for PostgreSQL, but I'll stand by what I said earlier: I would find it bizarre if embedded Python code had to use different line endings on different platforms. That would mean the programmer couldn't simply do this: PyRun_SimpleString("x = 1\nprint x\n"); Instead, the programmer would have to do a compile-time or run-time check and build the string in a platform-dependent manner. What problem would the language be solving by requiring that?

This one:

aprogram = "x = 1\nprint x\n";
printf(aprogram);
PyRun_SimpleString(aprogram);

See? THIS program requires compile-time or run-time checks. You can't run it on Windows, or Mac: it'll write garbage to the screen (something that looks like garbage, that is). Make it more general:

aprogram = get_program_from_somewhere();
PyRun_SimpleString(aprogram);
write_program_to_somefile_possibly_stdout(aprogram);

What if get_program_from_somewhere() reads user input? On Windows lines will be \r\n separated. Now, should this program make platform checks? Why not simply read a file (or stdin) in text mode, and pass the result to PyRun_SimpleString()? The same applies to output, of course.

Now something strikes me... in his tests, Paul tried my program and the output looks identical to Linux. Now... I was expecting program1 (the one with just \n) to display badly under Windows. Am I missing something? Does the C runtime on Windows convert \n into \r\n automatically in printf()? If so, I'm on the wrong track. It may do the same with scanf() and other stdio functions. I must say I wasn't expecting my program to run just fine, with all those \n I used in it. Starting from printf("Initialized.\n"); Paul, can you please tell me which compiler you used under Windows to compile my program and if you used some weird compiling options? TIA. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] New user: Windows, Postgresql, Python
On Wed, 16 Mar 2005, Michael Fuhr wrote: On Wed, Mar 16, 2005 at 04:17:51PM +0100, Marco Colombo wrote: aprogram = "x = 1\nprint x\n"; printf(aprogram); PyRun_SimpleString(aprogram); See? THIS program requires compile-time or run-time checks. You can't run it on Windows, or Mac: it'll write garbage to the screen (something that looks like garbage, that is). Are you sure about that? It's been forever since I programmed in a Microsoft environment, but as I recall, I/O streams opened in text mode do automatic translations between \n and \r\n.

No, I wasn't sure, and I actually was wrong. I've never programmed under Windows. I've just learned something. Apparently, as far as Python is concerned, the platform presents \n at C level, so it makes sense for PyRun_SimpleString() to expect \n as line terminator. Still, I don't understand when the lexer would use \r\n as physical line ending on Windows, but I can live with it. :-)

It seems that any client application under Windows is likely to use only \n-delimited text, as long as it uses stdio functions and text mode. Problems arise when it gets text from some other source. But since at C level text is expected to be \n-delimited, the application should take care of the conversion as soon as it receives the data. I think that if we want to be conservative, any input that is supposed to be treated (actively) as text by the server should be \n-delimited. That includes any function source. I'm against any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's an undocumented side effect, and bad practice. Now that I learned that C programs on Windows are expected to handle \n-delimited text, I can't think of any reason why an application should send \r\n-delimited text via libpq as a function definition, unless the programmer forgot to perform the standard \r\n to \n conversion somewhere. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
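(The text-mode translation is easy to see from Python, too; run this sketch on Windows and the two reads differ, on Unix they are identical -- the file name is invented:)

f = open("crlf.txt", "wb")
f.write("line 1\r\nline 2\r\n")
f.close()

print repr(open("crlf.txt", "r").read())    # text mode: '\r\n' -> '\n' on Windows
print repr(open("crlf.txt", "rb").read())   # binary mode: raw '\r\n' everywhere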
Re: [GENERAL] plpython function problem workaround
The following is how I understand it, but please let's delay further discussion until someone tests the program under Windows. On Mon, 14 Mar 2005, Michael Fuhr wrote: Hmmm...I think that would be inconsistent with previous reports. For example, in the following message, the poster said that everything (PostgreSQL, pgAdmin) was running on Windows 2003: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00066.php I'm sorry, he's wrong. The initial report was by Hong Yuan: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php later he clarified: http://archives.postgresql.org/pgsql-general/2005-01/msg00858.php "I am using pgAdmin III Version 1.2.0 under Chinese Windows XP, while the database is 7.4.6 under Linux." BTW I just noticed someone else provided a simpler example: http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php someone should try and compile those under windows.

I suggested that he strip the CRs from pg_proc.prosrc and he said it worked: http://archives.postgresql.org/pgsql-interfaces/2005-03/msg00014.php It's not clear that the test in the second message was run on a Windows server (apparently pgAdmin was run on a Windows client), but I think the beginning of the message is saying that he didn't reply for so long because he didn't have access to a Windows server. From that I infer that he tried my suggestion and posted the results when he finally did get access to a customer's Windows server. I could be misreading that, however. I have no idea of where Michele Bendazzoli ran that code. He's not the original poster, tho.

A couple of months ago Stuart Bishop brought this issue up in python-dev. Most of the thread is along the lines of strip the carriage returns: http://mail.python.org/pipermail/python-dev/2005-January/051203.html Sorry again, but he's assuming Unix on the server side, and Windows or Mac on the client side. If anyone manages to compile the following code on Windows... ... I bet on windows the first program fails and the second is ok. Hopefully somebody will do a Windows test of the code you posted. I'd find it bizarre that the Python code embedded in a C program had to care whether it was running on *nix or Windows.

I find it perfectly consistent! Face it, _any_ C program that's handling data of type _text_ has to know which platform it is running on. If you don't like the behaviour of Python functions, think of printf(). C programmers under Windows are used to writing:

printf("Hello World!\r\n");

as much as (old) Mac programmers write:

printf("Hello World!\r");

and Unix programmers write:

printf("Hello World!\n");

_ANY_ C program that processes multiline text input has to know which platform it is running on, otherwise it reads or produces garbage. Python just requires the input to be text, which seems reasonable to me, since, by design, a python program is more than just a sequence of ';'-separated statements (a la C or Perl) with _optional_ intermixed whitespace. White space ('eol' included) has a meaning in Python syntax. BTW, this attitude reminds me of PostgreSQL's strict validation of input, compared to the more relaxed checking made by MySQL. I really don't feel the need to enumerate the merits of input validation on this list. Python functions want 'valid text' and the definition is platform dependent, unfortunately. Why should it be relaxed, and accept invalid text as input?
If you are to compile any unix text processing utility, such as grep, on a mac, you'll have to change the source and have the program split lines at \r (under windows, splitting lines at \n still works, but leaves a spurious \r around). Python is the same, since it expects programs as _text_. The universal newline support is totally another matter. And it applies to files only (it's a wrapper around file functions). It's a tool for lazy programmers. It is possible that in the future the Python lexer is changed to recognize different line endings (since there's no valid case for a \r at the end of a line that I can think of). But requiring the input to be text is not 'bizarre' at all. The issue of text representation affects _any_ application. Treating text as binary data is plain wrong, IMHO, and will always lead to problems. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Tom Lane wrote: Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is

x = 'foo
bar'

Whether you think this is good style is not the question --- is it allowed by the language?

You can, with triple-quoting and by escaping it with a backslash. The following code, admittedly ugly, is valid python:

a = 'a\
bc'
print a
b = '''a
bc'''
print b

and produces:

abc
a
bc

as output. A \newline in any non-raw literal is allowed and ignored, while a bare newline in a triple-quoted string literal is allowed and retained. Moreover, this is not an exercise in bad style only. It's customary to write docstrings as multiline triple-quoted string literals:

def afunction(a, b, c):
    """This is a function.

    Its arguments are:
    a - first argument
    b - second argument
    c - third argument.

    It does and returns nothing.
    """
    pass

It's more or less the recommended way to document a function (or class or module or whatever). See PEP 257 for more examples: http://www.python.org/peps/pep-0257.html So, to answer your question, newlines are more than allowed in string literals. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Michael Fuhr wrote: [...] That somebody was me. Ok, sorry. I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. Yeah, there's no point in discussing until we have some real world data. I can't compile on windows, so I'll have to wait for someone else to do that. I'm basing my opinions on the Python documentation only. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
On Mon, 14 Mar 2005, Michael Fuhr wrote: Would we? My understanding is that code passed to PyRun_String() and friends must be free of line-ending CRs on all platforms, and that the code that reads a normal Python script takes care of that (i.e., normalizes line endings to be LF only). Can anybody confirm or deny? I'm not sure of that. I suspect you'll need to pass CRs on windows. If anyone manages to compile the following code on Windows...

#include <Python.h>

void
run_program(const char *program)
{
    PyObject *ret, *globals, *locals;

    printf(" running:\n%s\n", program);
    globals = PyDict_New();
    locals = PyDict_New();
    ret = PyRun_String(program, Py_file_input, globals, locals);
    if (ret) {
        Py_DECREF(ret);
        printf("\n");
    } else {
        PyErr_Print();
    }
    Py_DECREF(locals);
    Py_DECREF(globals);
    printf(" end\n\n");
}

int
main(int argc, char *argv[])
{
    const char *program1 = "print 1\nprint 2\n";
    const char *program2 = "print 1\r\nprint 2\r\n";

    Py_Initialize();
    printf(" Initialized.\n");
    printf(" Python %s\n", Py_GetVersion());

    run_program(program1);
    run_program(program2);

    Py_Finalize();
    printf(" Finalized.\n");
}

On my Fedora Core 2, I need to compile it with the following command:

gcc -I/usr/include/python2.3 -L/usr/lib/python2.3/config py-test.c -o py-test \
    -lpython2.3 -ldl -lm -lpthread -lutil

This is my first attempt to embed python, so I may be missing something... On Linux, you get:

$ ./py-test 2>&1 | cat -v
 Initialized.
 Python 2.3.3 (#1, May 7 2004, 10:31:40) [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)]
 running:
print 1
print 2

1
2
 end

 running:
print 1^M
print 2^M

  File "<string>", line 1
    print 1^M
           ^
SyntaxError: invalid syntax
 end

 Finalized.

I bet on windows the first program fails and the second is ok. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] postgresql vs mysql performance comparison
On Tue, 8 Mar 2005, Scott Marlowe wrote: On Tue, 2005-03-08 at 09:06, Shelby Cain wrote: --- Howard Cole [EMAIL PROTECTED] wrote: Although not appropriate for a speed comparison, you might want to note that the use of Mysql versions 4.0 upward now requires a commercial license for clients, which are no longer LGPL, whereas Postgres is free (BSD license). This makes transactions per dollar an interesting statistic when comparing Postgres and MySql! Reading over their site that doesn't appear true for every case. The client libraries are under the GPL and thus any application that links to them would also be covered under the GPL. No commercial license is required unless the terms of the GPL (ie: if you distribute a binary to someone you must also be willing to distribute your source code if asked) are a problem. There have been some statements from MySQL in the past that implied they might be taking a narrower view of what distribution meant than what the GPL was saying. Also, it was impossible for PHP to be packaged with MySQL libs due to incompatibilities with the GPL'd mysql connection libs. It seems MySQL AB has clarified both on these pages: http://www.mysql.com/company/legal/licensing/ http://www.mysql.com/company/legal/licensing/foss-exception.html http://www.mysql.com/company/legal/licensing/faq.html However, Fedora Core 2 still includes MySQL V 3.xx.yy because of the issues with V4.xx.yy's licensing. However, Suse does include the latest version. So there's some difference of opinion on the issue from different distros.

Or different policies. One of the biggest problems of their dual licensing policy is that no one is really interested in providing them with patches. In other words, they cannot accept third party contributions so easily. _My_ patches are going to be, likely, GPL-only. So they can't use them in their commercial product, unless they make two different lines (which they claim they don't), or they get a (commercial) licence from _me_ allowing _them_ to sell a work including _my_ patches. So in order to accept patches from me, they need a lot of paperwork (not to mention money, they're gonna pay for being able to sell my work). Not practical. This is not the case with truly GPL software, such as the Linux kernel. Patches, being a derived work, are GPL and they can include them anytime.

Note that client libraries are optional. As long as the protocol is openly defined (we have open specs), you can write your own client layer, and still connect to the GPL server. Which is _the_ thing. Protecting the client library (switching the licence from LGPL to GPL) makes little sense, IMHO. It greatly reduces the number of potential users, and protects little value. If I want to develop a commercial application that: - runs under Linux - I can; - uses HTTP as protocol, and connects to a GPL-ed web server - I can; - uses MySQL as a database backend - I can't, unless I rewrite the client library, or buy a commercial licence from them. Why? With PostgreSQL you don't have to think about these issues. A big win. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] [Auth] ident method and LDAP user accounts
On Thu, 3 Mar 2005, Stephane Bortzmeyer wrote: On Thu, Mar 03, 2005 at 10:04:32AM +0100, Florian G. Pflug [EMAIL PROTECTED] wrote a message of 114 lines which said: Might it be that the postgres user is not allowed to read /etc/ldap.conf - or however your nss_ldap config file is called? myriam:~ % ls -ld /etc/*ldap* drwxr-xr-x 2 root root 4096 Oct 18 17:17 /etc/ldap -rw--- 1 root root 13 Oct 18 17:19 /etc/ldap.secret -rw-r--r-- 1 root root 8442 Oct 18 17:27 /etc/libnss-ldap.conf -rw-r--r-- 1 root root 7070 Oct 18 17:19 /etc/pam_ldap.conf I'd try su-ing to the postgres user, and check if everything (ls -l /home, ... - you get the idea) works as expected. It does: myriam:~ % id uid=104(postgres) gid=108(postgres) groups=108(postgres) myriam:~ % ls -l /home/bortzmeyer total 68 drwxr-sr-x 3 bortzmeyer staff 4096 Nov 19 11:47 AFGNIC While bortzmeyer is not on /etc/passwd, only in LDAP. So, we still have a mystery :-( Does Debian include and activate SELinux? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] pgadmin3 / postgresql newbie question
On Wed, 2 Mar 2005, Jonathan Schreiter wrote: hi all, running amd64 fedora core 3 w/ default postgresql 7.4.7-3. did a su -, su postgres and createdb mydb as explained in the postgresql tutorial. installed the latest pgadmin3 and am trying to connect to this database. as i wasn't sure what the FC3 default password was for postgres, i changed it to something i could remember. i can't seem to connect to the new database using pgadmin3. i have the correct IP address of the local computer, default port 5432, mydb as the initial db, postgres as the username, and my new password as the password. i keep getting the error Error connecting to the server: could not connect to server: Connection refused Is the server running on host 192.168.1.24 and accepting TCP/IP connections on port 5432? i also verified the postgresql service is running, and that i've added the following to /etc/raddb/postgresql.conf:

This file is part of the freeradius package, and despite the name, has nothing to do with your PostgreSQL configuration. The default path for the real PostgreSQL configuration file is: /var/lib/pgsql/data/postgresql.conf I don't know if running TCP/IP is a requirement for pgadmin3, but if you need to access the _local_ PostgreSQL server, most clients would do w/o configuring TCP/IP support at all.

login = postgres password = mynewpassword and right underneath it: tcpip = true

These do not belong to the PostgreSQL server configuration. It's RADIUS stuff.

i've also disabled my local firewall and SELINUX just for kicks. and yes, i did a reboot. so...anyone know what else i can look at?

1) make sure postgresql is running (use ps - look for a postmaster process)

2) if it's not there, run the following command as root:

service postgresql start

3) if you want it to run automatically at boot, and it doesn't, run the following command as root:

chkconfig postgresql on

this won't start the process if it's not running. It just sets a flag for the next boot.

4) I don't get what you mean by changing the postgres password. To switch to the postgres user, I usually switch to root first, and then to postgres. Root can switch to any user w/o password. Actually, it's good security practice not to assign any password to system pseudo-accounts (postgres is one of them) and leave them locked. If you need a different access method, I strongly suggest looking at the PostgreSQL way to authenticate users and stop using the 'ident' method (see pg_hba.conf), which forces you to run clients with a certain Unix user id.

5) try to access the db with the psql client first. Use the same connection method you're using with pgadmin3, and run it under the same user you run pgadmin3 with. E.g.:

psql -h localhost -p 5432 -U postgres mydb

see the psql manual for details. If you successfully get to 5), it's likely it's a pgadmin3 problem. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
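(For reference, enabling TCP/IP on a 7.4-era server means something like the following in /var/lib/pgsql/data/ -- a sketch from memory, double-check against the 7.4 docs; 8.0 replaced tcpip_socket with listen_addresses:)

# postgresql.conf: accept TCP/IP connections (off by default in 7.4)
tcpip_socket = true

# pg_hba.conf: allow md5 password logins for user postgres from the LAN
host    mydb    postgres    192.168.1.0/24    md5

# then, as root: service postgresql restart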
Re: [GENERAL] Clay Shirky observation regarding MySQL
On Mon, 28 Feb 2005, Martijn van Oosterhout wrote: On Mon, Feb 28, 2005 at 01:46:16PM -0600, [EMAIL PROTECTED] wrote: Hello! Clay Shirky made a comment about MySQL that I thought the PostgreSQL community should be aware of: http://www.shirky.com/writings/situated_software.html It's the section (mostly toward the bottom) entitled, The Nature of Programming, and the Curious Case of MySQL. The whole article is, as normal, interesting and thought-provoking. Interesting article, but w.r.t. the MySQL statement, I read: If you don't need any of the things that databases are good for (ACID, transactions, triggers, views) then MySQL is an acceptable choice.

Interesting article, but I'm not much forgiving when a teacher is so imprecise, especially on software history and qualities. It is plain wrong that Apache made it easy, and that before Apache a web server was rocket science. [...] "In the mid-90s, getting a web server running was such a messy endeavor that it was a project goal in and of itself. Then Apache came along, and so simplified the process that the web server became a simple building block for larger things." I've used both cernd and NCSA httpd, and when I finally switched to Apache the choice was based on project activity: faster evolution, more features, better stability, certainly not because it was easier. Apache is quite a beast to configure from scratch even nowadays, I'd even say it's _more_ complicated than it used to be (think of the different server models it supports, and the large number of modules available). For sure, running cernd was not any harder than running Apache is today. Mr. Shirky completely missed the point here.

[...] "MySQL makes the job much easier, so much easier in fact that after MySQL, it becomes a different kind of job. There are complicated technical arguments for and against using MySQL vs. other databases, but none of those arguments matter anymore." Same goes for MySQL. I don't think MySQL is easier to install, configure and administer: the tasks are almost the same as for other open source databases. And _definitely_ it's not easier to use! I'm not following their development much, but when I had to use it I've _always_ found that missing features do make it harder to use MySQL. Hell, even NATURAL JOIN is nice sugar for small projects, where you can name columns and tables so that you write queries naturally, hardly having to stop and think. So are foreign keys. These are _not_ advanced features that only SQL gurus use: they reduce development time and save headaches for _most_ application programmers. They make programming faster, safer, easier. So, MySQL's success is _not_ based on user-friendliness. Again, wrong example.

Back in '95, on the Microsoft side, they didn't even know about TCP/IP. Linux and i386/NetBSD were quite hard to get and install, not to mention completely unknown to the public. No wonder running a web server was not for everyone. Today, we have distributions that come with a ready-to-run web server. On the Microsoft side, they turned to Unix (NT/2000/2003 is POSIX, and even Bill used to claim NT is Unix), and to Internet services. You can run many Unix daemons, and they have their own Web server and SQL server. So, Mr. Shirky is right, installing and running a web server, or an RDBMS, today is a matter of a few mouse clicks. But _not thanks to Apache and MySQL_ (and to do that _professionally_ is still totally another matter). They're only small bricks in the comfortable house built by the open source movement as a whole.
Development teams behind projects such as Apache (which today is a lot more than an HTTP server) and PostgreSQL pursue the goal of making good products for _professionals_ to use effectively. Most of the burden of making technologies available to as many non-guru users as possible is on distribution makers. If Mr. Shirky wants to set a date, and say "before that" and "after that", it's the day open source distributions hit the masses. Certainly there's no "after Apache" and no "after MySQL". .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Lost rows/data corruption?
On Wed, 16 Feb 2005, Andrew Hall wrote: fsync is on for all these boxes. Our customers run their own hardware with many different specifications of hardware in use. Many of our customers don't have a UPS, although their power is probably pretty reliable (normal city based utilities), but of course I can't guarantee they don't get an outage once in a while with a thunderstorm etc.

I see. Well, I can't help much, then, I don't run PG on XFS. I suggest testing on a different FS, to exclude XFS problems. But with fsync on, the FS has very little to do with reliability, unless it _lies_ about fsync(). Any FS should return from fsync only after data is on disk, journal or not (there might be issues with meta-data, but that's hardly a problem with PG). It's more likely the hardware (IDE disks) lies about data being on the platter. But again that's only in case of sudden power-offs.

[...] this condition. I'd be really surprised if XFS is the problem as I know there are plenty of other people across the world using it reliably with PG.

This is kind of OT, but I don't follow your logic here. I don't see why plenty of success stories of XFS+PG suggest to you the culprit is PG. To me it's still 50% - 50%. :-) Moreover, XFS is continuously updated (it follows the normal, fast linux kernel release cycle, like any other linux FS), so it's hard to make a data point unless someone else is using _exactly_ the same versions as you do. For example, in the kernel changelog from 2.6.7 to 2.6.10 you can read:

[XFS] Fix a race condition in the undo-delayed-write buffer routine.
[XFS] Fix up memory allocators to be more resilient.
[XFS] Fix a possible data loss issue after an unaligned unwritten extent write.
[XFS] handle inode creating race

(only a few of them) Now, I don't have even the faintest idea if those might have affected you or not, but still the point is that the linux kernel changes a lot. And vendors tend to customize their kernels a lot, too. On the PostgreSQL side, releases are slowly-paced, so it's easier. Anyway, I agree your problem is weird, and that it must be something on the server side. No matter what you do on the client side (pool manager, JDBC driver, servlet engine), in no way should the DB get corrupted with duplicated primary keys. I know this is a silly question, but when you write 'We do nothing with any indexes' do you mean indexes are never, _never_ touched (I mean explicitly, as in drop/create index), i.e. they are created at schema creation time and then left alone? Just to make sure... .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Lost rows/data corruption?
On Wed, 16 Feb 2005, Scott Marlowe wrote: I know there are write modes in ext3 that will allow corruption on power loss (I think it's writeback). I know little of XFS in a production environment, as I run ext3, warts and all.

Yeah, but even in writeback mode, ext3 doesn't lie about fsync. No FS does. Since PG can't expect any data to be on disk _before_ fsync completes, it doesn't really make a difference. You can lose data in writeback mode _if_ the application is not fsync-ing it (XFS's only mode is similar to writeback). I'm not aware of any case in which the system can lie about fsync(), unless the hardware is lying in turn. One question for gurus: does PG use fsync() on dirty data pages when they are flushed to disk at checkpoint time? Does it fsync() the directory in case of file creation/deletion/rename? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
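(For reference, the general POSIX recipe the closing question is about: fsync the file for its data, then fsync the containing directory for the new entry. A sketch of the technique, not of what PG actually does:)

import os

def durable_create(path, data):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0644)
    try:
        os.write(fd, data)
        os.fsync(fd)        # flush the file's data blocks to disk
    finally:
        os.close(fd)
    # the new directory entry lives in the directory, not the file:
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)     # flush the directory entry itself
    finally:
        os.close(dirfd)

durable_create("/tmp/newfile", "some data\n")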