Re: [GENERAL] Preventing OOM kills
On 05/25/2011 03:01 AM, John R Pierce wrote:
> On 05/24/11 5:50 PM, Andrej wrote:
>> Add more RAM? Look at tunables for other processes on the machine? At the
>> end of the day, making the kernel shoot anything out of despair shouldn't
>> be the done thing.
> somehow, 'real' unix has neither an OOM killer nor does it flat out die
> under heavy loads, it just degrades gracefully. I've seen Solaris and AIX
> and BSD servers happily chugging along with load factors in the 100s,
> significant portions of memory paging, etc, without completely crumbling
> to a halt. Sometimes I wonder why Linux even pretends to support virtual
> memory, as you sure don't want it to be paging.

http://developers.sun.com/solaris/articles/subprocess/subprocess.html

"Some operating systems (such as Linux, IBM AIX, and HP-UX) have a feature called memory overcommit (also known as lazy swap allocation). In a memory overcommit mode, malloc() does not reserve swap space and always returns a non-NULL pointer, regardless of whether there is enough VM on the system to support it or not. The memory overcommit feature has advantages and disadvantages."

(the page goes on with some interesting info) [*]

It appears that, by your definition, neither Linux, AIX nor HP-UX is 'real' Unix. Oh, wait, FreeBSD overcommits too, so it can't be 'real' either. /me wonders now what a 'real' Unix is. :) It must be something related to 'true' SysV derivatives. If memory serves me well, that's where the word 'thrashing' originated, right? Actually, in my experience nothing 'thrashes' better than a SysV, Solaris included.

The solution to the OP's problem is to keep the system from reaching the OOM state in the first place. That is necessary even with overcommitting turned off: PG not performing its job because malloc() keeps failing isn't really a solution.

.TM.

[*] One missing piece is that overcommitting actually prevents or delays the OOM state. The article does mention that system memory can be "used more flexibly and efficiently" without really elaborating further. It means that, given the same amount of memory (RAM+swap), a non-overcommitting system reaches OOM well before an overcommitting one does. Also, it is rarely a good idea, when running low on memory, to switch to an allocation policy that is _less_ efficient, memory-wise.
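P.S. Overcommit is easy to watch in action from Python. A minimal sketch, assuming a 64-bit Linux box (the sizes are arbitrary, and mode 0 behaviour is a heuristic, hence "usually"):

import mmap

# Untouched anonymous mappings consume no physical pages, so with the
# default vm.overcommit_memory=0 the kernel usually keeps saying yes well
# past RAM+swap; with strict accounting (vm.overcommit_memory=2) it starts
# refusing once the commit limit is reached.
GB = 1024 ** 3
maps = []
try:
    for i in range(64):
        maps.append(mmap.mmap(-1, GB))   # reserve, but never write
        print("reserved %d GB of address space" % (i + 1))
except (ValueError, OSError) as e:
    print("allocation refused: %s" % e)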
Re: [GENERAL] Pl/Python error when import from __future__
On 03/31/2011 09:58 PM, Davi Duarte wrote:
> Hello, I'm using PL/Python in PostgreSQL 9.0.3 and Python 2.6.5. I want to use a feature of Python 3: Python has an option to import a feature from a future version of Python. In my case, I want Python 3 division, because it returns a float when dividing integers by default, while in Python 2.x it only returns an integer, so I need the Python 3 division. To import a feature from a future version, simply:
>
> from __future__ import modulename
>
> In my case I'm importing division:
>
> from __future__ import division
>
> Python requires that the import be on the first line of code, and that is my problem, e.g. when running the function below:
>
> CREATE OR REPLACE FUNCTION test() RETURNS text AS $$
> from __future__ import division
> a = 8/5
> return eval(a)
> $$ LANGUAGE plpython2u;
>
> it returns the following error:
>
> SyntaxError: from __future__ imports must occur at the beginning of the file
>
> But "from __future__ ..." is on the first line of the code. Has anyone had this error? What may it be? A bug in PL/Python? How can I fix this error?
> Thanks, Davi Duarte.

AFAIK, a CREATE FUNCTION in plpython2u adds a python function definition right before the code you provide (which is the function body, or 'suite' in python parlance):

def function_name(parameter_list):
    your_suite

(it does some other mangling to correct indentation, BTW). As far as the python interpreter is concerned, the 'from __future__ import ...' statement is not at the beginning of the program.

I started looking into this some years ago, but at the time the only solution I managed to design was using some internal interpreter API, unsupported and subject to change. My idea was to mangle the parse tree, not the program source: in theory it would be possible to spot the 'from __future__ import ...' statement and move it to the top. That also avoids any problem with indentation. At the time there was no proper way to do that, tho. I'm afraid there's no solution for your problem at the moment.

Well, a workaround is to multiply one of the division arguments by 1.0, effectively casting it to float. Note this isn't the same as applying float() to the result, which would already have been truncated by then:

>>> a = 1.0*8/5
>>> a
1.6

.TM.
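P.S. You can reproduce the error outside PostgreSQL. A minimal sketch of the wrapping (the generated function name here is made up; the real one is a PL/Python internal detail):

body = "from __future__ import division\nreturn 8/5\n"

# indent the suite one level and wrap it in a def, roughly what
# plpython2u does with the function body you supply
src = "def __plpython_func():\n" + "".join(
    "    " + line + "\n" for line in body.splitlines())

compile(src, "<plpython>", "exec")
# SyntaxError: from __future__ imports must occur at the beginning of the file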
Re: [GENERAL] pg_restore
On 03/31/2011 01:08 AM, Mike Orr wrote:
> That might be a better solution. I was hoping to use the same pg_dump file for this that I also use for routine offline backup, but maybe this is such a special case that a separate dump file would be better.

Why don't you post the exact mysqldump/mysql commands you are using? It would be easier to provide equivalent pg_dump/psql commands. Meanwhile, have a look at the pg_restore man page, the -t and/or -L options. Maybe -c as well. Oh, and -1.

.TM.
Re: [GENERAL] Linux x Windows LOCALE/ENCODING compatibility
On 11/09/2010 02:31 AM, Carlos Henrique Reimer wrote:
> Hi, I'm currently in the process of moving the data from the Windows server to the new Linux box but facing some problems with the encoding. Additional configuration information: Windows is running PG 8.3 and the new Linux box is PG 8.4.
>
> Windows dump command: pg_dump -U postgres -Fc -v -f f:\backup
> Linux restore command: pg_restore -v -n brasil -d mapas /backup
>
> pg_restore error:
>
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 3574; 0 40805 TABLE DATA cidade postgres
> pg_restore: [archiver (db)] COPY failed: ERROR: character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
> CONTEXT: COPY cidade, line 6
>
> I also tried to dump using pg_dump -E UTF8 but then I got:
>
> pg_dump: SQL command failed
> pg_dump: Error message from server: ERROR: character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
> pg_dump: The command was: COPY brasil.cidade (gid, municpio, municpi0, uf, longitude, latitude, the_geom) TO stdout;
> pg_dump: *** aborted because of error
>
> How can I fix this error?

Well, that's pretty much evidence that the text you have on Windows is NOT win1252-encoded. Or some parts of it, at least. According to this page:

http://en.wikipedia.org/wiki/Windows-1252

0x81 is undefined in win1252.

Please note that if the old DB has SQL_ASCII encoding, you may have mixed-encoding text in the same database (either on purpose or by mistake), and you have to either keep using SQL_ASCII in the new DB (which means keeping potentially invalid data around), or do a proper conversion to UTF-8 (which possibly has to be done differently for different tables, or even for different rows in the same table) BEFORE importing it into the new DB.

To convert from one encoding to another, you can use, surprisingly, the 'convert' function. The tricky part is that once you have text of unknown encoding, you can only guess. That is, try to convert it from some reasonable encodings to UTF-8, and look at the result. If it looks right, that may be the right encoding. :)

BTW, 0x81 is a rather weird char, it's invalid in many common encodings. It's valid in win-1251, where it represents U+0403 (CYRILLIC CAPITAL LETTER GJE), Ѓ (if you don't have the font, you won't be able to see it), but I think it's an unlikely candidate. In both CP437 and CP850 (old DOS encodings for western languages) it represents U+00FC (LATIN SMALL LETTER U WITH DIAERESIS), ü (again, I hope you can see it). That's a better candidate: it's possible someone accessed the DB, either directly or via a web application, from a Windows ME or older system.

Unfortunately, AFAIK, PostgreSQL doesn't support the cp850 encoding. Maybe you can dump the data (even a single table) in text format and use a text editor (or shell utils like perl, sed, tr) to replace the offending characters with their proper win-1252 counterparts (e.g., 0x81 can be replaced with 0xfc, which is u with diaeresis in win-1252).

I hope it helps.

.TM.
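P.S. The guessing game is easy to script. A minimal sketch in Python, trying just the candidate encodings named above:

raw = b"\x81"
for enc in ("cp1252", "cp1251", "cp850", "utf-8"):
    try:
        print("%-7s -> %r" % (enc, raw.decode(enc)))
    except UnicodeDecodeError:
        print("%-7s -> invalid" % enc)

# and the byte-level fix suggested above, applied to a text dump:
# fixed = dump_bytes.replace(b"\x81", b"\xfc")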
Re: [GENERAL] Linux
On 11/04/2010 04:00 PM, Michael Gould wrote:
> I know that this is probably a religious issue but we are looking to move Postgres to a Linux server. We currently have a Windows 2008 R2 active directory and all of the other servers are virtualized via VMWare ESXi. One of the reasons is that we want to use a 64 bit Postgres server and the UUID processing contrib module does not provide a 64 bit version for Windows. I would also assume that the database, when properly tuned, will probably run faster in a *nix environment. What and why should I look at certain distributions? It appears from what I read, Ubuntu is a good desktop but not a server.
> Best Regards

Just find one that ships with the latest PG, to save you some work. Unless you plan to compile and install PG manually; in that case, any major distribution would do. For production use, how long your version will be supported (security updates) is likely to be the most important item on your checklist. I use CentOS.

.TM.
Re: [GENERAL] MySQL versus Postgres
On 12/08/2010 03:43, Tom Lane wrote:
> Marco Colombo <pg...@esiway.net> writes:
>> It's a matter of correctness: I see PG as a high performance database system. Allowing it to start in awfully suboptimal conditions is no different from allowing '0000-00-00' as a date: it may give you the idea you did the right thing, but most of the time you didn't.
> Unfortunately, there are quite a few of us for whom "correctness" doesn't mean "automatically try to eat all the resources available". Your view of what is useful behavior is far too narrow-minded ...
> regards, tom lane

Well, my idea was more along the lines of "automatically try to acquire a reasonable amount of the available resources". You know, you don't jump directly from 0.1% to 100%. There's a lot in between. In medio stat virtus.

.TM.
Re: [GENERAL] MySQL versus Postgres
On 11/08/2010 04:01, Greg Smith wrote:
>> 3. The default configuration settings for PostgreSQL are not optimal for performance. Can there be a recommended configuration file in the installation (assuming a certain amount of RAM and processor type)?
> This doesn't work because there are many different types of database applications, and what's optimal even as a starting configuration for each type is very different. Also, hardware changes pretty fast; you'd be hard pressed to write down useful generic recommendations (or insert them into the core database code) that are still relevant at all after a release has been out a few years.

Well, many defaults are hardcoded into a file now. I'd like to see 'auto' among the possible values of the parameters, e.g.:

max_connections = auto
shared_buffers = auto
work_mem = auto

with PG wild-guessing reasonable values based on the system specs. It may be an awful piece of code (getting system info is very platform-specific), and sometimes the guess may be wrong. Anyway, nothing prevents PG from having a postgresql_failsafe.conf. Not that I'm advocating it. Complex systems need well-thought-out configuration.

.TM.
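P.S. Just to give an idea of the kind of guessing involved, a toy sketch (Linux-only, which is exactly the portability problem; the 25%-of-RAM figure is only the usual rule of thumb):

def mem_total_kb():
    # MemTotal is reported in kB in /proc/meminfo on Linux
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise RuntimeError("no MemTotal line found")

print("shared_buffers = %dMB" % (mem_total_kb() // 4 // 1024))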
Re: [GENERAL] MySQL versus Postgres
On 11/08/2010 17:34, Greg Smith wrote:
> The problem here is that the amount of shared memory a system can allocate is hard to discover any other way than starting the server and seeing if it works. So doing what you advise will leave the database unable to start on any system that hasn't gotten the right OS kernel tweaks done first.

Well, is that true under Windows, too? I think we need to cover Windows here. Under Unix, having PostgreSQL start correctly is a concern of the distribution vendor. Even if the guessing isn't bullet-proof, the vendor either knows how to configure the kernel to make the 'auto' thing work, or is able to provide its own postgresql.conf. Sure, there are people who download and compile, but I don't think they are afraid of editing postgresql.conf should the server fail to start.

Also, I'd say this is a case where it's much better to fail with a message like "listen buddy, your server has 64GB of RAM installed but your kernel is configured for 20MB of shared memory only, you should really increase it", rather than start successfully but with very poor performance.

It's a matter of correctness: I see PG as a high performance database system. Allowing it to start in awfully suboptimal conditions is no different from allowing '0000-00-00' as a date: it may give you the idea you did the right thing, but most of the time you didn't.

.TM.
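P.S. The check itself would be trivial, at least on Linux. A sketch (paths are Linux-specific, which is exactly the portability objection; SysV shared memory is what PG of this era allocates its buffers from):

# compare the SysV shared memory limit against installed RAM
shmmax = int(open("/proc/sys/kernel/shmmax").read())
mem_kb = int(open("/proc/meminfo").readline().split()[1])  # MemTotal line
if shmmax < mem_kb * 1024 // 4:
    print("kernel.shmmax is only %d MB on a %d MB box, consider raising it"
          % (shmmax // 2**20, mem_kb // 1024))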
Re: [GENERAL] solaris slow
On 02/08/2010 21:14, John R Pierce wrote:
> Another factor, if your linux system was using LVM (it's the default storage configuration on many distributions), there's a pretty good chance the device mapper is ignoring write barriers, which greatly speeds up random writes at the expense of reliable commits.

A lot has been discussed about how well LVM plays with PostgreSQL. But for sure, write barriers are not related to commits. A reliable commit is about disks synchronously flushing their caches to the platters (and of course, before that, the OS cache being flushed to the disk cache, which is the easy part). In ACID terms, write barriers help with C, not with D.

Actually, a write barrier is kind of the opposite of a flush: it's a "don't flush this before that". It enforces an order on writes, but doesn't enforce their immediate execution. Of course, if the disk _doesn't_ support write barriers, the poor man's workaround is to enforce a complete flush instead.

AFAIK, there's no POSIX API to request a write barrier at application level. There's only the sync (i.e. flush) related API. And I'm pretty sure PostgreSQL issues the right syncs at the right times.

LVM correctly reports that it doesn't support write barriers. The FS then knows what to do (issue a full flush instead), and no harm is done (because not all disks support wbs, the code to handle that case has to be there in the FS anyway). The FS just runs slower. But it's just as safe as without LVM.

And since PostgreSQL issues syncs (or uses synchronous APIs), there's little point in discussing wbs here, since it's all about flushing. Wbs are issued by the FS in order to _avoid_ a flush: the FS's major concern is C, not D; but when it's the application requesting a flush, the flush can't be avoided, and wbs are almost useless.

.TM.
Re: [GENERAL] Storing HTML: HTML entities being rendered in that raw form
linnewbie wrote:
> On Apr 9, 1:00 pm, st...@blighty.com (Steve Atkins) wrote:
>> On Apr 9, 2009, at 9:27 AM, linnewbie wrote:
>>> Hi all, I have stored HTML in a text field that I subsequently render on the web. However, when I retrieve and render this data on the web, the entities are being rendered in their raw form, i.e. instead of getting the '&' symbol when '&amp;' is stored, I'm getting the raw '&amp;'. I would be grateful if anyone can point out how I can get around this.
>> It's a problem in your code, not the database. You're probably escaping it one time too many or unescaping it one time too few.
>> Cheers, Steve
>
> To clarify, I am not escaping the string in any way. Say the page I am saving to the database is the "about us" page from a company website. First, I make a form to create the "about us" page with a textarea field, then I copy the HTML from my text editor and paste it into this textarea form. I then have a cgi script which takes the contents of the textarea field and stores it in the database. What I have on disk would be:
>
> ... <p>Bonnie &amp; Clyde</p>
>
> which would usually be rendered as:
>
> Bonnie & Clyde
>
> but this is not happening, it's being rendered as:
>
> Bonnie &amp; Clyde

That's because, as someone else suggested, something is quoting the '&'. In order to be rendered as 'Bonnie &amp; Clyde' by the browser, it needs to be 'Bonnie &amp;amp; Clyde' in the HTML (just view the HTML source from inside the browser). You haven't provided any detail on the rendering phase, BTW.

.TM.
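P.S. The double escape is easy to demonstrate. A sketch with Python's stdlib (html.escape is Python 3; old CGI code would use cgi.escape the same way):

from html import escape

stored = "Bonnie &amp; Clyde"   # what's in the text column, already escaped once
print(escape(stored))           # 'Bonnie &amp;amp; Clyde' -- escaped a second
                                # time, which the browser then displays as the
                                # raw 'Bonnie &amp; Clyde'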
Re: [GENERAL] Maximum transaction rate
Markus Wanner wrote:
> Hi,
> Martijn van Oosterhout wrote:
>> And fsync better do what you're asking (how fast is just a performance issue, just as long as it's done).
> Where are we on this issue? I've read all of this thread and the one on the linux-lvm mailing list as well, but still don't feel confident. In the following scenario:
>
> fsync -> filesystem -> physical disk
>
> I'm assuming the filesystem correctly issues a blkdev_issue_flush() on the physical disk upon fsync(), to do what it's told: flush the cache(s) to disk. Further, I'm also assuming the physical disk is flushable (i.e. it correctly implements the blkdev_issue_flush() call). Here we can be pretty certain that fsync works as advertised, I think. The unanswered question to me is: what's happening if I add LVM in between, as follows?
>
> fsync -> filesystem -> device mapper (lvm) -> physical disk(s)
>
> Again, assume the filesystem issues a blkdev_issue_flush() to the lower layer and the physical disks are all flushable (and implement that correctly). How does the device mapper behave? I'd expect it to forward the blkdev_issue_flush() call to all affected devices and only return after the last one has confirmed and completed flushing its caches. Is that the case?
>
> I've also read about the newish write barriers and about filesystems implementing fsync with such write barriers. That seems fishy to me and would of course break in combination with LVM (which doesn't completely support write barriers, AFAIU). However, that's clearly the filesystem side of the story and has not much to do with whether fsync lies on top of LVM or not.
>
> Help in clarifying this issue greatly appreciated.
> Kind Regards
> Markus Wanner

Well, AFAIK, the summary would be:

1) adding LVM to the chain makes no difference;
2) you still need to disable the write-back cache on IDE/SATA disks for fsync() to work properly;
3) without LVM and with the write-back cache enabled, due to current(?) limitations in the Linux kernel, with some journaled filesystems (but not ext3 in data=writeback or data=ordered mode; I'm not sure about data=journal), you may be less vulnerable if you use fsync() (or O_SYNC). "Less vulnerable" means that all pending changes are committed to disk but the very last one.

So:

- write-back cache + ext3 = unsafe
- write-back cache + other fs = (depending on the fs)[*] safer, but not 100% safe
- write-back cache + LVM + any fs = unsafe
- write-thru cache + any fs = safe
- write-thru cache + LVM + any fs = safe

[*] the fs must use (directly or indirectly via journal commit) a write barrier on fsync(). Ext3 doesn't (it does when the inode changes, but that happens only once a second).

If you want both speed and safety, use a battery-backed controller (and write-thru cache on the disks, but the controller should enforce that when you plug the disks in). It's the usual "Fast, Safe, Cheap: choose two".

This is an interesting article:

http://support.microsoft.com/kb/234656/en-us/

note how for all three kinds of disk (IDE/SATA/SCSI) they say: "Disk caching should be disabled in order to use the drive with SQL Server." They don't mention write barriers.

.TM.
Re: [GENERAL] Maximum transaction rate
Martijn van Oosterhout wrote:
> True, but the relative wakeup order of two different processes is not important since by definition they are working on different transactions. As long as the WAL writes for a single transaction (in a single process) are not reordered you're fine.

I'm not totally sure, but I think I understand what you mean here: independent transactions, by definition, don't care about relative ordering.

.TM.
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Marco Colombo wrote:
>> Yes, but we knew it already, didn't we? It's always been like that, with IDE disks and write-back cache enabled, fsync just waits for the disk reporting completion and disks lie about [that]
> I've looked hard, and I have yet to see a disk that lies.

No, "lie" in the sense that they report completion before the data hit the platters. Of course, that's the expected behaviour with write-back caches.

> ext3, OTOH seems to lie.

ext3 simply doesn't know: it interfaces with a block device, which does the caching (OS level) and the reordering (e.g. elevator algorithm). ext3 doesn't directly send commands to the disk, nor does it manage the OS cache. When software RAID and device mapper come into play, you have virtual block devices built on top of other block devices. My home desktop has ext3 on top of a dm device (/dev/mapper/something, a LV set up by LVM in this case), on top of a raid1 device (/dev/mdX), on top of /dev/sdaX and /dev/sdbX, which, in a way, on their own are block devices built on others, /dev/sda and /dev/sdb (you don't actually send commands to partitions, do you? although the mapping "sector offset relative to partition -> real sector on disk" is trivial). Each of these layers potentially caches writes and reorders them; that's the job of a block device, although it makes sense at most only for the last one, the one that controls the disk. Anyway, there isn't much ext3 can do but post wb and flush requests to the block device at the top of the stack.

> IDE drives happily report whether they support write barriers or not, which you can see with the command:
> %hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT

Of course a write barrier is not a cache flush. A flush is synchronous, a write barrier asynchronous. The disk supports flushing, not write barriers. Well, technically, if you can control the ordering of the requests, that's barriers proper. With SCSI you can, IIRC. But a cache flush is, well, a flush.

> Linux kernels since 2005 or so check for this feature. It'll happily tell you which of your devices don't support it.
> %dmesg | grep 'disabling barriers'
> JBD: barrier-based sync failed on md1 - disabling barriers
> And for devices that do, it will happily send IDE FLUSH CACHE commands to IDE drives that support the feature. At the same time Linux kernels started sending the very similar SCSI SYNCHRONIZE CACHE commands.

Anyway, it's the block device's job to control disk caches. A filesystem is just a client to the block device: it posts a flush request, and what happens depends on the block device code. The FS doesn't talk to disks directly. And a write barrier is not a flush request, it's a "please do not reorder" request. On fsync(), ext3 issues a flush request to the block device, that's all it's expected to do.

> But AFAICT ext3 fsync() only tells the block device to flush disk caches if the inode was changed.

No, ext3 posts a write barrier request when the inode changes and it commits the journal, which is not a flush. [*]

> Or, at least empirically if I modify a file and do fsync(fd); on ext3 it does not wait until the disk spun to where it's supposed to spin. But if I put a couple fchmod()'s right before the fsync() it does.

If you were right, and ext3 didn't wait, it would make no difference to have the disk cache enabled or not on fsync. My test shows a 50x speedup when turning the disk cache on. So for sure ext3 is waiting for the block device to report completion. It's the block device that - on flush - doesn't issue a FLUSH command to the disk.

.TM.

[*] A barrier ends up in a FLUSH for the disk, but that doesn't mean it's synchronous, like a real flush. Even journal updates done with barriers don't mean "hit the disk now", they just mean "keep order when writing". If you turn off automatic page cache flushing and you have zero memory pressure, a write request with a barrier may stay forever in the OS cache, at least in theory. Imagine you don't have bdflush and nothing reclaims resources: days of activity may stay in RAM, as far as write barriers are concerned. Now someone types 'sync' as root. The block device starts flushing dirty pages, reordering writes but honoring barriers, that is: it reorders anything up to the first barrier, posts those write requests to the disk, issues a FLUSH command, then waits until the flush is completed. Then it consumes the barrier, and starts processing writes again, reordering them up to the next barrier, and so on. So yes, a barrier turns into a FLUSH command for the disk. But in this scenario, days have passed since the original write/barrier request from the filesystem. Compare with a fsync(). Even in the above scenario, a fsync() should end up in a FLUSH command to the disk, and wait for the request to complete, before awakening the process that issued it. So the filesystem has to request a flush operation from the block device, not a barrier. And so it does. If it turns out that the block device just issues writes
Re: [GENERAL] Maximum transaction rate
Greg Smith wrote:
> On Wed, 18 Mar 2009, Marco Colombo wrote:
>> If you fsync() after each write you want ordered, there can't be any subsequent I/O (unless there are many different processes concurrently writing to the file w/o synchronization).
> Inside PostgreSQL, each of the database backend processes ends up writing blocks to the database disk, if they need to allocate a new buffer and the one they are handed is dirty. You can easily have several of those writing to the same 1GB underlying file on disk. So that prerequisite is there.
>
> The main potential for a problem here would be if a stray unsynchronized write from one of those backends happened in a way that wasn't accounted for by the WAL+checkpoint design.

Wow, that would be quite a bug. That's why I wrote "w/o synchronization". Stray + unaccounted + concurrent smells like the recipe for an explosive to me :)

> What I was suggesting is that the way that synchronization happens in the database provides some defense from running into problems in this area.

I hope it's a full defence. If you have two processes doing write(); fsync(); on the same file at the same time, either there are no ordering requirements, or it will go boom sooner or later... fsync() works inside a single process, but any system call may put the process to sleep, and who knows when it will be awakened and what other processes did to that file meanwhile. I'm pretty confident that PG code protects access to shared resources with synchronization primitives.

Anyway, I was referring to WAL writes... due to the nature of a log, it's hard to think of many unordered writes and of concurrent access w/o synchronization. But inside a critical region there can be more than one single write, and you may need to enforce an order, but no more than that, before the final fsync(). If so, userland-originated barriers instead of full fsync()'s may help with performance. But I'm speculating.

> The way backends handle writes themselves is also why your suggestion about the database being able to utilize barriers isn't really helpful. Those trickle out all the time, and normally you don't even have to care about ordering them. The only time you do need to care is at checkpoint time, and there only a hard line is really practical--all writes up to that point, period. Trying to implement ordered writes for everything that happened before then would complicate the code base, which isn't going to happen for such a platform+filesystem specific feature, one that really doesn't offer much acceleration from the database's perspective.

I don't know the internals of WAL writing, so I can't really reply on that.

>> only when the journal wraps around is there an (extremely) small window of vulnerability. You need to write a carefully crafted torture program to get any chance to observe that... such a program exists, and triggers the problem
> Yeah, I've been following all that. The PostgreSQL WAL design works on ext2 filesystems with no journal at all. Some people even put their pg_xlog directory onto ext2 filesystems for best performance, relying on the WAL to be the journal. As long as fsync is honored correctly, the WAL writes should be re-writing already allocated space, which makes this category of journal mayhem not so much of a problem. But when I read about fsync doing unexpected things, that gets me more concerned.

Well, that's highly dependent on your expectations :) I don't expect a fsync to trigger a journal commit if metadata hasn't changed. That's obviously true for metadata-only journals (like most of them, with the notable exception of ext3 in data=journal mode). Yet, if you're referring to this:

http://article.gmane.org/gmane.linux.file-systems/21373

well, that seems to me the same usual thing/bug: fsync() allows disks to lie when it comes to caching writes. Nothing new under the sun. Barriers don't change much, because they don't replace a flush. They're about consistency, not durability. So even with full barriers support, a fsync implementation needs to end up in a disk cache flush, to be fully compliant with its own semantics.

.TM.
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Marco Colombo wrote:
>> Ron Mayer wrote:
>>> Greg Smith wrote:
>>>> There are some known limitations to Linux fsync that I remain somewhat concerned about, independently of LVM, like "ext3 fsync() only does a journal commit when the inode has changed" (see http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 )
>>> I wonder if there should be an optional fsync mode in postgres that would turn fsync() into fchmod (fd, 0644); fchmod (fd, 0664);
> 'course I meant:
> fchmod (fd, 0644); fchmod (fd, 0664); fsync(fd);
> to work around this issue.
>> Question is... why do you care if the journal is not flushed on fsync? Only the file data blocks need to be, if the inode is unchanged.
> You don't - but ext3 fsync won't even push the file data blocks through a disk cache unless the inode was changed. The point is that ext3 only does the write barrier processing that issues the FLUSH CACHE (IDE) or SYNCHRONIZE CACHE (SCSI) commands on inode changes, not data changes. And with no FLUSH CACHE or SYNCHRONIZE CACHE the data blocks may sit in the disk's cache after the fsync() as well.

Yes, but we knew it already, didn't we? It's always been like that: with IDE disks and write-back cache enabled, fsync just waits for the disk reporting completion, and disks lie about that. Write barriers enforce ordering: WHEN writes are committed to disk, they will be in order, but that doesn't mean NOW. Ordering is enough for a FS journal, whose only requirement is consistency.

Anyway, it's the block device's job to control disk caches. A filesystem is just a client to the block device: it posts a flush request, and what happens depends on the block device code. The FS doesn't talk to disks directly. And a write barrier is not a flush request, it's a "please do not reorder" request. On fsync(), ext3 issues a flush request to the block device, that's all it's expected to do.

Now, some block devices may implement write barriers by issuing FLUSH commands to the disk, but that's another matter. A FS shouldn't rely on that. You can replace a barrier with a flush (not as efficiently), but not the other way around. If a block device driver issues a FLUSH for a barrier, and doesn't issue a FLUSH for a flush, well, it's a buggy driver, IMHO.

.TM.
Re: [GENERAL] Maximum transaction rate
Martijn van Oosterhout wrote:
> Generally PG uses O_SYNC on open, so it's only one system call, not two. And the file it's writing to is generally preallocated (not always, though).

It has to wait for I/O completion on write(), then: it has to go to sleep. If two different processes do a write(), you don't know which will be awakened first. Preallocation doesn't mean much here, since with O_SYNC you expect a physical write to be done (with the whole sleep/HW interrupt/SW interrupt/awake dance). It's true that you may expect the writes to be carried out in order, and that might be enough. I'm not sure, tho.

>> Well, that's highly dependent on your expectations :) I don't expect a fsync to trigger a journal commit if metadata hasn't changed. That's obviously true for metadata-only journals (like most of them, with the notable exception of ext3 in data=journal mode).
> Really the only thing needed is that the WAL entry reaches disk before the actual data does. AIUI, as long as you have that, the situation is recoverable. Given that the actual data probably won't be written for a while, it'd need to go pretty wonky before you see an issue.

You're giving up Durability here. In a closed system that doesn't mean much, but when you report "payment accepted" to third parties, you can't forget about it later. The requirement you stated is for Consistency only. That's what a journaled FS cares about, i.e. no need for fsck (internal consistency checks) after a crash.

It may be acceptable for a remote standby backup: you replay as much of the WAL as is available after the crash (the part you managed to copy, that is). But you know there can be lost transactions. That may be acceptable or not. Sometimes it's not. Sometimes you must be sure the data is on the platters before you report "committed". Sometimes when you say "fsync!" you mean "I want data flushed to disk NOW, and I really mean it!". :)

.TM.
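P.S. For reference, the one-syscall pattern looks like this. A minimal sketch (the file name is made up, and whether the write is truly durable still depends on the disk cache discussion elsewhere in this thread):

import os

# with O_SYNC, write() itself blocks until the data has been handed to
# stable storage (or to whatever the block device claims is stable)
fd = os.open("walfile", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
os.write(fd, b"wal record")
os.close(fd)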
Re: [GENERAL] Maximum transaction rate
John R Pierce wrote:
> Stefan Kaltenbrunner wrote:
>> So in my understanding LVM is safe on disks that have the write cache disabled, or that behave as if they did (like a controller with a battery backed cache).
> what about drive write caches on battery backed raid controllers? do the controllers ensure the drive cache gets flushed prior to releasing the cached write blocks?

If LVM/dm is lying about fsync(), all this is moot. There's no point talking about disk caches.

BTW, this discussion is continuing on the linux-lvm mailing list:

https://www.redhat.com/archives/linux-lvm/2009-March/msg00025.html

I have some PG databases on LVM systems, so I need to know for sure whether I have to move them elsewhere. It seemed to me the right place for asking about the issue. Someone there pointed out that fsync() is not LVM's responsibility. Correct. For sure, there's an API (or more than one) a filesystem uses to force a flush on the underlying block device, and for sure it has to be called while inside the fsync() system call. So "lying to fsync()" may be more correct than "lying about fsync()".

.TM.
Re: [GENERAL] Maximum transaction rate
Greg Smith wrote:
> On Tue, 17 Mar 2009, Marco Colombo wrote:
>> If LVM/dm is lying about fsync(), all this is moot. There's no point talking about disk caches.
> I decided to run some tests to see what's going on there, and it looks like some of my quick criticism of LVM might not actually be valid--it's only the performance that is problematic, not necessarily the reliability. Appears to support fsync just fine. I tested with kernel 2.6.22, so certainly not before the recent changes to LVM behavior improving this area, but with the bugs around here from earlier kernels squashed (like crummy HPA support circa 2.6.18-2.6.19, see https://launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/82314 )

I've run tests too, you can see them here:

https://www.redhat.com/archives/linux-lvm/2009-March/msg00055.html

in case you're looking for something trivial (a write/fsync loop; a Python version of the same idea is sketched at the end of this message).

> You can do a quick test of fsync rate using sysbench; got the idea from http://www.mysqlperformanceblog.com/2006/05/03/group-commit-and-real-fsync/ (their command has some typos, fixed one below). If fsync is working properly, you'll get something near the RPM rate of the disk. If it's lying, you'll see a much higher number.

Same results: -W1 gives a 50x speedup, so with -W0 it must be waiting for something at the disk level.

[...]

> Based on this test, it looks to me like fsync works fine on LVM. It must be passing that down to the physical disk correctly or I'd still be seeing inflated rates. If you've got a physical disk that lies about fsync, and you put a database on it, you're screwed whether or not you use LVM; nothing different on LVM than in the regular case. A battery-backed caching controller should also handle fsync fine if it turns off the physical disk cache, which most of them do--and, again, you're no more or less exposed to that particular problem with LVM than a regular filesystem.

That was my initial understanding.

> The thing that barriers help out with is that they make it possible to optimize flushing ext3 journal metadata when combined with hard drives that support the appropriate cache flushing mechanism (what hdparm calls FLUSH CACHE EXT; see http://forums.opensuse.org/archives/sls-archives/archives-suse-linux/archives-desktop-environments/379681-barrier-sync.html ). That way you can prioritize flushing just the metadata needed to prevent filesystem corruption while still fully caching less critical regular old writes. In that situation, performance could be greatly improved over turning off caching altogether. However, in the PostgreSQL case, the fsync hammer doesn't appreciate this optimization anyway--all the database writes are going to get forced out by that no matter what before the database considers them reliable. Proper barriers support might be helpful in the case where you're using a database on a shared disk that has other files being written to as well, basically allowing caching on those while forcing the database blocks to physical disk, but that presumes the Linux fsync implementation is more sophisticated than I believe it currently is.

This is the same conclusion I came to. Moreover, once you have barriers passed down to the disks, it would be nice to have a userland API to send them to the kernel. Any application managing a 'journal' or 'log' type of object would benefit from that. I'm not familiar with PG internals, but it's likely you have some records you just want to be ordered, and you could do something like write-barrier-write-barrier-...-fsync instead of write-fsync-write-fsync-... Currently, fsync() (and friends O_SYNC, fdatasync(), O_DSYNC) is the only way to enforce ordering on writes from userland.

> Far as I can tell, the main open question I didn't directly test here is whether LVM does any write reordering that can impact database use because it doesn't handle write barriers properly. According to https://www.redhat.com/archives/linux-lvm/2009-March/msg00026.html it does not, and I never got the impression that was impacted by the LVM layer before. The concern is nicely summarized by the comment from Xman at http://lwn.net/Articles/283161/ : "fsync will block until the outstanding requests have been sync'd to disk, but it doesn't guarantee that subsequent I/O's to the same fd won't potentially also get completed, and potentially ahead of the I/O's submitted prior to the fsync. In fact it can't make such guarantees without functioning barriers."

Sure, but from userland you can't set barriers. If you fsync() after each write you want ordered, there can't be any subsequent I/O (unless there are many different processes concurrently writing to the file w/o synchronization). Since we know LVM does not have functioning barriers, this would seem to be one area where PostgreSQL would be vulnerable. But since ext3 doesn't have barriers turned on by default either (except on some recent SuSE systems), it's not unique to an LVM setup
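P.S. The write/fsync loop mentioned above, roughly, in Python. A sketch (os.pwrite needs Python 3.3+; with an honest fsync on a single 7200rpm disk you'd expect on the order of a hundred loops/sec, while thousands mean something in the stack is acknowledging writes early):

import os
import time

fd = os.open("testfile", os.O_RDWR | os.O_CREAT, 0o644)
n = 200
t0 = time.time()
for i in range(n):
    os.pwrite(fd, b"x", 0)   # rewrite the same byte every time
    os.fsync(fd)             # and force it to stable storage
print("%.0f fsyncs/sec" % (n / (time.time() - t0)))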
Re: [GENERAL] Maximum transaction rate
Ron Mayer wrote:
> Greg Smith wrote:
>> There are some known limitations to Linux fsync that I remain somewhat concerned about, independently of LVM, like "ext3 fsync() only does a journal commit when the inode has changed" (see http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 ). The way files are preallocated, the PostgreSQL WAL is supposed to function just fine even if you're using fdatasync after WAL writes, which also wouldn't touch the journal (last time I checked fdatasync was implemented as a full fsync on Linux). Since the new ext4 is more
> Indeed it does. I wonder if there should be an optional fsync mode in postgres that would turn fsync() into fchmod (fd, 0644); fchmod (fd, 0664); to work around this issue.

Question is... why do you care if the journal is not flushed on fsync? Only the file data blocks need to be, if the inode is unchanged.

> For example this program below will show one write per disk revolution if you leave the fchmod() in there, and run many times faster (i.e. lying) if you remove it. This with ext3 on a standard IDE drive with the write cache enabled, and no LVM or anything between them.
>
> ==
> /*
> ** based on http://article.gmane.org/gmane.linux.file-systems/21373
> ** http://thread.gmane.org/gmane.linux.kernel/646040
> */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[])
> {
>     if (argc < 2) {
>         printf("usage: fs <filename>\n");
>         exit(1);
>     }
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
>     int i;
>     for (i = 0; i < 100; i++) {
>         char byte;
>         pwrite(fd, &byte, 1, 0);
>         fchmod(fd, 0644);
>         fchmod(fd, 0664);
>         fsync(fd);
>     }
> }
> ==

I ran the program above, w/o the fchmod()s:

$ time ./test2 testfile
real    0m0.056s
user    0m0.001s
sys     0m0.008s

This is with ext3+LVM+raid1+sata disks with hdparm -W1. With -W0 I get:

$ time ./test2 testfile
real    0m1.014s
user    0m0.000s
sys     0m0.008s

Big difference. The fsync() there does its job. The same program runs with a 3x slowdown with the fchmod()s, but that's expected: it's doing twice the writes, and in different places.

.TM.
Re: [GENERAL] Maximum transaction rate
Joshua D. Drake wrote:
> I understand but disabling cache is not an option for anyone I know. So I need to know the other :)
> Joshua D. Drake

Come on, how many people/organizations do you know who really need 30+ MB/s of sustained write throughput in the disk subsystem but can't afford a battery-backed controller at the same time? Something must take care of writing data from the disk cache to permanent storage; write-thru caches, battery-backed controllers and write barriers are all alternatives, choose the one you like most.

The problem here is fsync(). We know that not fsync()'ing gives you a big performance boost, but that's not the point. I want to choose, and I want a true fsync() when I ask for one. Because if the data doesn't make it to the disk cache, the whole point about wt, bb and wb is moot.

.TM.
Re: [GENERAL] Maximum transaction rate
Joshua D. Drake wrote:
> On Sat, 2009-03-14 at 05:25 +0100, Marco Colombo wrote:
>> Also see: http://lkml.org/lkml/2008/2/26/41 but it seems to me that all this discussion is under the assumption that disks have write-back caches. "The alternative is to disable the disk write cache." says it all.
> If this applies to raid-based caches as well then performance is going to completely tank. For users of Linux + PostgreSQL using LVM.
> Joshua D. Drake

Yet that's not the point. The point is safety. I may have a lightly loaded database, with a low write rate, but I still want it to be reliable. I just want to know whether disabling the caches makes it reliable or not. People on lkml seem to think it does. And it seems to me they may have a point. fsync() is a flush operation on the block device, not a write barrier. LVM doesn't pass write barriers down, but that doesn't mean it doesn't perform a flush when requested to.

.TM.
Re: [GENERAL] Maximum transaction rate
Scott Marlowe wrote:
> On Fri, Mar 6, 2009 at 2:22 PM, Ben Chobot be...@silentmedia.com wrote:
>> On Fri, 6 Mar 2009, Greg Smith wrote:
>>> On Fri, 6 Mar 2009, Tom Lane wrote:
>>>> Otherwise you need to reconfigure your drive to not cache writes. I forget the incantation for that but it's in the PG list archives.
>>> There's a discussion of this in the docs now, http://www.postgresql.org/docs/8.3/interactive/wal-reliability.html
>> How does turning off write caching on the disk stop the problem with LVM? It still seems like you have to get the data out of the OS buffer, and if fsync() doesn't do that for you
> I think he was saying otherwise (if you're not using LVM and you still have this super high transaction rate) you'll need to turn off the drive's write caches. I kinda wondered at it for a second too.

And I'm still wondering. The problem with LVM, AFAIK, is missing support for write barriers. Once you disable the write-back cache on the disk, you no longer need write barriers. So I'm missing something: what else does LVM do to break fsync()? It was my understanding that disabling disk caches was enough.

.TM.
Re: [GENERAL] Maximum transaction rate
Tom Lane wrote:
> Marco Colombo pg...@esiway.net writes:
>> And I'm still wondering. The problem with LVM, AFAIK, is missing support for write barriers. Once you disable the write-back cache on the disk, you no longer need write barriers. So I'm missing something, what else does LVM do to break fsync()?
> I think you're imagining that the disk hardware is the only source of write reordering, which isn't the case ... various layers in the kernel can reorder operations before they get sent to the disk.
> regards, tom lane

You mean some layer (LVM) is lying about the fsync()?

write(A); fsync();
...
write(B); fsync();
...
write(C); fsync();

You mean that the process may be awakened after the first fsync() while A is still somewhere in OS buffers and not sent to disk yet, so it's possible that B gets to the disk BEFORE A? And if the system crashes, A never hits the platters while B (possibly) does. Is that what you mean by write reordering?

But doesn't this break any application with transactional-like behavior, such as sendmail? The problem being third parties: if sendmail declares "ok, I saved the message" to the SMTP client (*after* a fsync()), it's actually lying, 'cause the message hasn't hit the platters yet. The same applies to an IMAP/POP server, say. Well, it applies to anything using fsync(). I mean, all this with disk caches in write-thru mode? It's the OS lying, not the disks? Wait, this breaks all journaled FSes as well: a DM device is just a block device to them, and if it's lying about synchronous writes, the whole purpose of the journal is defeated... I find it hard to believe, I have to say.

.TM.
Re: [GENERAL] Maximum transaction rate
Scott Marlowe wrote:
> On Fri, Mar 13, 2009 at 1:09 PM, Christophe x...@thebuild.com wrote:
>> So, if the software calls fsync, but fsync doesn't actually push the data to the controller, you are still at risk... right?
> Ding!

I've been doing some googling, and now I'm not sure that not supporting barriers implies not supporting (or lying about) blkdev_issue_flush(). It seems that it's pretty common (and well-defined) for block devices to report -EOPNOTSUPP for BIO_RW_BARRIER requests. device mapper apparently falls into this category. See:

http://lkml.org/lkml/2007/5/25/71

this is an interesting discussion on barriers and flushing. It seems to me that PostgreSQL needs both ordered and synchronous writes, maybe at different times (not that EVERY write must be both ordered and synchronous). You can emulate "ordered" with "single+synchronous", although at a price. You can't emulate synchronous writes with just barriers.

OPTIMAL: write-barrier-write-barrier-write-barrier-flush
SUBOPTIMAL: write-flush-write-flush-write-flush

As I understand it, fsync() should always issue a real flush: it's unrelated to the barriers issue. There's no API to issue ordered writes (or barriers) at user level, AFAIK. (Uhm... O_DIRECT, maybe, implies that?) FS code may internally issue barrier requests to the block device, for its own purposes (e.g. journal updates), but there's no userland API for that.

Yet, there's no reference to DM not supporting flush correctly in the whole thread... actually there are references to the opposite: DM devices are defined as FLUSHABLE.

Also see: http://lkml.org/lkml/2008/2/26/41 but it seems to me that all this discussion is under the assumption that disks have write-back caches. "The alternative is to disable the disk write cache." says it all.

.TM.
Re: [GENERAL] php4 and postgresql 8.3
Tom Lane wrote:
> shadrack shadke...@hotmail.com writes:
>> My basic question is... are php4 and postgresql 8.3 compatible? I'm running Linux Redhat 3.4.6, php4.3.9, and postgresql 8.3. I know some of those versions are old... it's government, and I unfortunately don't have control over the version.
> Er ... Red Hat *what*? I don't think they ever used such a version number. If they did it was a very long time ago (for calibration, they were just about to release RHL 7.3 when I joined the company, in 2001). You could probably scare the powers-that-be into letting you update the thing just on the grounds of it still being full of Y2K bugs.

Well, the first Red Hat Linux I used was 4.1 (I still remember the code name, Vanderbilt). And that was '97, I think. So if you're *really* thinking of RHL 3-something, Wikipedia says this about early RH revision history:

# 1.0 (Mother's Day), November 3, 1994 (Linux 1.2.8)
# 1.1 (Mother's Day+0.1), August 1, 1995 (Linux 1.2.11)
# 2.0, September 20, 1995 (Linux 1.2.13-2)
# 2.1, November 23, 1995 (Linux 1.2.13)
# 3.0.3 (Picasso), May 1, 1996 - first release supporting DEC Alpha
# 4.0 (Colgate), October 3, 1996 (Linux 2.0.18) - first release supporting SPARC
# 4.1 (Vanderbilt), February 3, 1997 (Linux 2.0.27)

which reminds me, I used Colgate a bit too. Back in those years I also used Postgres95 (but on Slackware, I suspect). Anyway, if I'm not mistaken, the birth of PostgreSQL proper (6.0, I think) was at that time, '96.

The OP must be referring to RHEL 3-something; it can't be RHL 3. He'd have a tyrannosaur in his backyard, and he's crying for help because it's got a broken leg. "We'll go send pictures!" Quite seriously, I'd really like to see a picture of a running RHL 3 still in production. Maybe even the folks at RH wouldn't mind one. Too bad uptime wouldn't show more than 497 days...

.TM.
Re: [GENERAL] Good Delimiter for copy command
Tom Lane wrote:
> Andrew Gould andrewlylego...@gmail.com writes:
>> To the list: Does pg_dump escape characters that are the same as the delimiter?
> Yes. The OP has not actually explained why he needs to pick a nondefault delimiter, unless maybe it is that he wants to feed the dump to some program that is too dumb to deal with escaping.
> regards, tom lane

Which makes me wonder, does COPY accept UTF-8 input? Is it possible to use as delimiter some Unicode character which is unlikely to appear in the data set? Something like U+FFFC. Here I'm using U+25CF, for it's much more likely you can see it (it's a black circle).

$ python -c 'print u";".join("ABCD").encode("utf8")' > 1.txt
$ cat 1.txt
A;B;C;D
$ python -c 'print u"\u25cf".join("ABCD").encode("utf8")' > 2.txt
$ cat 2.txt
A●B●C●D
$ psql -c '\d test'
     Table "public.test"
 Column | Type | Modifiers
--------+------+-----------
 a      | text |
 b      | text |
 c      | text |
 d      | text |

$ psql -c "\copy test from '1.txt' delimiter ';'"
$ psql -c "\copy test from '2.txt' delimiter '●'"
ERROR:  COPY delimiter must be a single character
\copy: ERROR:  COPY delimiter must be a single character

It doesn't seem to work, but I need to stress that I'm using

$ psql --version
psql (PostgreSQL) 8.2.9

.TM.
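P.S. Presumably the problem is that U+25CF is not a single byte in UTF-8, while ';' is, and the delimiter check counts bytes. A quick check:

>>> len(u"\u25cf".encode("utf8"))
3
>>> len(u";".encode("utf8"))
1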
Re: [GENERAL] NATURAL JOINs
Scott Marlowe wrote:
> On Wed, Oct 15, 2008 at 10:17 AM, Richard Broersma [EMAIL PROTECTED] wrote:
>> On Wed, Oct 15, 2008 at 9:03 AM, Reg Me Please [EMAIL PROTECTED] wrote:
>>> Both are perfectly right, indeed. Nonetheless, in my opinion a NATURAL JOIN exploiting the FKs instead of the column names would be much more helpful and much less error prone! As far as I know there is no way to exploit FKs in JOINs, right?
>> Yes, AFAIK. This would make postgresql's implementation of natural join violate the SQL standard. Perhaps you could propose an UNNATURAL JOIN syntax extension. ;)
> Or a VERY VERY NATURAL JOIN syntax? :)

Yeah, while we're at it, why don't we ask Tom to implement the DO_THE_QUERY_I_WANT_BUT_CANT_EXPRESS_IN_SQL syntax? That would solve a whole class of problems the newbie DBAs face every day. :)

Seriously, I find NATURAL joins useful. While I understand the "design queries resilient to schema changes" argument ("explicit is better than implicit", in the Zen of Python), I think if you design your schema so that you can use NATURAL joins, you've done good (I like using the same identifier for the same object, especially pkeys, when used in different places, e.g. as foreign keys in another table, and conversely never using the same name for different objects). Breaking NATURAL joins acts like a safeguard for when you've done something bad to your schema.

Let's not forget SQL serves two purposes... embedding queries in applications (then, yes, those queries should be as resilient as possible) and querying a db interactively. That's why it has been designed with a syntax resembling natural language. When used interactively, natural joins rock. Less typing and less thinking about column names.

As for the OP's request, my answer is just "give FKs the same name as the PKs they refer to AND never use the same name for columns in different tables if they're not the same thing", which seems both... hmm, natural, to me.

.TM.
Re: [GENERAL] Starting PostgreSQL
admin wrote:
> Sorry folks, a perennial one I'm sure ... I have read the manual and Googled for a couple of hours but still can't connect to PostgreSQL 8.3.4 (the PGDG RPMs running on an up to date CentOS 5.2). I continually get this message:
>
> psql: could not connect to server: No such file or directory
> Is the server running locally and accepting connections on Unix domain socket "/tmp/.s.PGSQL.0"?
>
> Yes, the server is running, as 'ps -aux' and 'netstat -l' and 'service postgresql status' all confirm.

Do you mean you have something like this in your netstat -l?

unix  2      [ ACC ]     STREAM     LISTENING     12587  /tmp/.s.PGSQL.5432

Note, this is on a Linux box with postgresql in standard configuration. Just look at the port number embedded in the socket name. I don't really think you can run a process on port 0. I think your psql is looking for the wrong socket. Try:

$ psql -p 5432 ...

If you don't see any unix socket for PG (I don't even think that's possible), then you need to use IP sockets:

$ psql -p 5432 -h localhost ...

> service postgresql start/stop/restart works without errors
> pg_ctl start/stop/restart works without errors
> There is no socket file in /tmp.

Oops, sorry, I missed this. Well, double check with netstat, but it's possible your PG is not configured for Unix sockets... even if I wouldn't know how to do that. I just checked a CentOS 5.2 running PG and there it is:

$ ls -l /tmp/.s.PGSQL.5432
srwxrwxrwx 1 postgres postgres 0 Oct 13 01:22 /tmp/.s.PGSQL.5432

.TM.
Re: [GENERAL] Frustrated...pg_dump/restore
Jeff Amiel wrote: Ahhh *looks at encoding* Well.. they are both the same... BUT... they are set to ENCODING = 'SQL_ASCII'; That explains a lot... they should probably be set to Unicode UTF8. Duh. Any way to change encoding without dumping/restoring database? You can change client encoding any time with the PGCLIENTENCODING environment variable. AFAIK, there's no way to change the encoding of a database: it's set at creation time. But I think SQL_ASCII makes it less picky about the input, so that might not be the source of your problem. You should look at the errors you see _before_ the invalid command \N. I suspect a slight schema mismatch... that could cause a COPY to fail, while an INSERT might still work. How did you create the 'schema-only database'? With a pg_dump --schema-only or with a different SQL script? You may also try and pg_dump --schema-only both databases and diff the output. .TM. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
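[A minimal sketch of setting the client encoding per session, the same knob PGCLIENTENCODING controls, using psycopg2; the database name "mydb" is a made-up example, not from the thread:]

    import psycopg2

    conn = psycopg2.connect(dbname="mydb")   # hypothetical database name
    conn.set_client_encoding("UTF8")         # same effect as PGCLIENTENCODING=UTF8
    cur = conn.cursor()
    cur.execute("SHOW client_encoding")
    print(cur.fetchone()[0])                 # -> UTF8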
Re: [GENERAL] Dumping/Restoring with constraints?
Phoenix Kiula wrote: Thanks Andrew. On the server (the DB to be dumped) everything is UTF8. On my home server (where I would like to mirror the DB), this is the output:

    =# \l
            List of databases
       Name    |     Owner      | Encoding
    -----------+----------------+-----------
     postgres  | postgres       | SQL_ASCII
     pkiula    | pkiula_pkiula  | UTF8
     template0 | postgres       | SQL_ASCII
     template1 | postgres       | SQL_ASCII
    (4 rows)

This is a fresh install as you can see. The database into which I am importing (pkiula) is in fact listed as UTF8! Is this not enough?

You said you're getting these errors:

    ERROR: invalid byte sequence for encoding UTF8: 0x80

those 0x80 bytes are inside the mydb.sql file; you may find it easier to look for them there and identify the offending string(s). Try (on the linux machine):

    zcat mydb.sql.gz | iconv -f utf8 > /dev/null

it should tell you something like: illegal input sequence at position xxx. BTW, 0x80 is usually found in windows encodings, such as windows-1250, where it stands for the EURO symbol:

    $ echo -n € | iconv -t windows-1250 | hexdump -C
    00000000  80                                                |.|
    00000001

FYI, you *can* get non UTF-8 data from a UTF-8 database, if (and only if) your client encoding is something different (either because you explicitly set it so, or because of your client defaults). Likewise, you can insert non UTF-8 data (such as your mydb.sql) into a UTF-8 database, provided you set your client encoding accordingly. PostgreSQL clients handle encoding conversions, but there's no way to guess (reliably) the encoding of a text file. OTOH, from a SQL_ASCII database you can get all sorts of data, even mixed-encoding text (which you need to fix somehow). If your mydb.sql contains data from a SQL_ASCII database, you simply know nothing about the encoding. I have seen SQL_ASCII databases containing data inserted from HTTP forms, both in UTF-8 and windows-1250 encoding. Displaying, dumping or restoring that correctly is impossible; you need to fix it somehow before processing it as text. .TM. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
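[If iconv isn't handy, a small Python sketch can locate every non-UTF-8 byte in the dump before loading it; "mydb.sql" is the hypothetical file from the thread:]

    # Report the offset and context of each byte that is not valid UTF-8.
    data = open("mydb.sql", "rb").read()
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break                            # the rest of the file is clean
        except UnicodeDecodeError as e:
            bad = pos + e.start              # e.start is relative to the slice
            print(f"invalid byte {data[bad]:#x} at offset {bad}: "
                  f"{data[max(0, bad - 20):bad + 20]!r}")
            pos = bad + 1                    # resume scanning after the bad byte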
Re: [GENERAL] Why LIMIT and OFFSET are commutative
Andrus wrote: Under what interpretation would the results differ? Results must differ for easy creation of a LinQ-PostgreSQL driver. If results are always the same, PostgreSQL should not allow both orders of the clauses. Nicholas explains: Assuming the ordering is the same on each of them (because Skip and Take make no sense without ordering, LINQ to SQL will create an order for you, which irritates me to no end, but that's a separate thread), they will produce different results. Say your query will produce the ordered set {1, 2, 3}. Let n = 1, m = 2. The first query:

    var query = query.Skip(n).Take(m);

converted to SELECT ... OFFSET n LIMIT m will return the ordered set {2, 3}, while the second query:

    var query = query.Take(m).Skip(n);

converted to SELECT ... LIMIT m OFFSET n will return the ordered set {2}. The reason for this is that in the first query, the Skip method skips one element, then takes the remaining two, while in the second query, the first two elements are taken, and then the first one is skipped.

This semantics implies subqueries. In SQL, LIMIT and OFFSET refer to the whole query (that's why in PG you can swap them). If you want to think OO, both are _attributes_ of the query object, whose default values are OFFSET 0 and LIMIT *inf*. They are not _operators_ on the query result, as you seem to imply. Take() and Skip() should not be real methods, they should just set internal instance variables. (Pardon my pythonic syntax - I know zero of LinQ)

    q = Query("SELECT ...")   # create a new query object
    q.limit = 2               # object properties
    q.offset = 1
    result = q.execute()      # perform the query

using wrapper methods:

    q = Query("SELECT ...")   # create a new query object
    q.limit(2)
    q.offset(1)
    result = q.execute()

Methods allow a more pythonic way (the same you use):

    result = Query("SELECT ...").offset(1).limit(2).execute()

which matches quite closely the SQL syntax, if you have those methods return self. @Erik Jones There's no real object-relational impedance mismatch here. The above is pure OO, yet:

    result = Query("SELECT ...").limit(2).offset(1).execute()   # methods swapped

is perfectly equivalent. Setting properties on an instance object is commutative in general, unless the set_property operation has side-effects, which I wouldn't call good programming style. The OP just maps LIMIT and OFFSET onto operations (OFFSET and LIMIT _operations_ are NOT commutative in general) instead of object properties. Once you do the correct mapping, objects behave like queries in SQL. As Gregory Stark pointed out, if you want LIMIT and OFFSET to work as operators, you need to nest queries. Let's drop the idea there's SQL behind the scenes, and let's think of a more abstract DB model:

    q = Query("SELECT ...").execute()   # executes the query, and returns an object you can perform other queries on
    q = q.limit(2).execute()            # again, the query is executed, and the result set is in turn queryable
    q = q.offset(1).execute()

Now, we can just make the execute() method implied. With this new semantics, we have:

    q1 = Query("SELECT ...")
    q1 = q1.limit(2)
    q1 = q1.offset(1)

    q2 = Query("SELECT ...")
    q2 = q2.offset(1)
    q2 = q2.limit(2)

and the results differ. I think that's what the OP meant. But we needed to drop the idea of SQL behind the scenes because there's no way in SQL to directly query the result set from a previous query. Here there's impedance mismatch. The closest thing is subqueries, but you don't store intermediate results anywhere, like we do above with objects.
One could implement the above by executing the query every time, but that's a nightmare for performance on big tables. The only way to have decent performance is to do lazy execution of the query, and use subqueries as Gregory suggested. Please note that there are ORM tools that do that. SQLAlchemy even allows you to build a query (much of the above is valid sqlalchemy) and then treat it as a list, even using array slices, which generate the convenient LIMIT/OFFSET clauses automagically:

    q = session.query(Table)
    q1 = q[1:5]   # adds LIMIT 4 OFFSET 1

of course the query is actually executed only when you start using the results. .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
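[To make the pseudo-code above concrete, here is a runnable toy version (not from the original thread) of the "limit/offset as attributes" Query object, working over an in-memory list instead of a real backend:]

    class Query:
        def __init__(self, rows):
            self.rows = list(rows)
            self._limit = None               # LIMIT defaults to "all"
            self._offset = 0                 # OFFSET defaults to 0

        def limit(self, n):
            self._limit = n
            return self                      # returning self allows chaining

        def offset(self, n):
            self._offset = n
            return self

        def execute(self):
            end = None if self._limit is None else self._offset + self._limit
            return self.rows[self._offset:end]

    # Order of the "setters" does not matter: both calls produce [2, 3],
    # just like swapping LIMIT and OFFSET in a single SQL query.
    print(Query([1, 2, 3]).offset(1).limit(2).execute())  # [2, 3]
    print(Query([1, 2, 3]).limit(2).offset(1).execute())  # [2, 3]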
Re: [GENERAL] Restart a sequence regularly
Scott Marlowe wrote:

    revoke all privs on the sequence to anyone but the user about to reset it
    reset it
    grant the options back

Quoting the OP: That means, when others want to access the sequence between 31-Dec 23:59:55 and 1-Jan 00:00:05, they are waiting instead of getting an error. If you remove the privs, clients will get an error, unless I'm missing something. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
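[Not from the thread: one way to get the "waiting instead of erroring" behaviour the OP asked for is an advisory lock (PostgreSQL 8.2+), sketched here with psycopg2. It only works if every client fetches ids through the same wrapper; the sequence name and lock id are made up:]

    import psycopg2

    LOCK_ID = 42                                 # arbitrary, application-chosen

    def next_id(conn):
        cur = conn.cursor()
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_ID,))
        try:
            cur.execute("SELECT nextval('yearly_seq')")   # hypothetical sequence
            return cur.fetchone()[0]
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_ID,))

    def reset_sequence(conn):
        cur = conn.cursor()
        cur.execute("SELECT pg_advisory_lock(%s)", (LOCK_ID,))  # others now block
        try:
            cur.execute("ALTER SEQUENCE yearly_seq RESTART WITH 1")
            conn.commit()
        finally:
            cur.execute("SELECT pg_advisory_unlock(%s)", (LOCK_ID,))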
Re: [GENERAL] Cannot declare record members NOT NULL
Cultural Sublimation wrote: Unfortunately for you, they are not different types. If the OCaml binding thinks they are, it's the binding's problem; especially since the binding seems to be using a completely lame method of trying to tell the difference. Hi, In OCaml and in other languages with strong type systems, "int4 never NULL" and "int4 possibly NULL" are definitely different types. I think the source of the problem here is that SQL has a different philosophy, one where type constraints are not seen as creating new types. There's no such thing as a 'type constraint' in SQL, and there's no point in defining a new type. Constraints are on table rows, sometimes not even on the values of columns per se, but on combinations of values... Think of something like (table.col1 < table.col2)... is that 'creating a new type'? How would you define this new type, even in OCaml, assuming that originally both are int4? Is '4' a valid value for that type? Now, some _table_ constraints may be similar to _type_ constraints, but that's a corner case, in SQL. It's much more than a different philosophy, we're speaking of apples and oranges here. Why should SQL recognize a very limited kind of constraint, and treat it specially by defining a new type? But anyway, if you think that checking pg_attribute is a lame method of obtaining type information, what do you suggest should be done instead? What would you do if it were you creating the bindings? I think the bindings get it right: the type *is* "int4 possibly NULL", because that's what the integer type in SQL means. The problem here is that not every language type maps perfectly onto a database type (and of course the converse is true). "int4 never NULL" may be stored into a table with appropriate constraints, but still some code is needed at application level to convert it back, because there's no such native type in PG. Think of dates and times: I believe no language bindings handle them in a way totally consistent with PG types (unless they define special-purpose types with the exact same semantics, which is hardly worth it). So, the application is wrong in expecting a SQL database to return values of type "int4 never NULL". Just write a small conversion layer, changing "int4 possibly NULL" into "int4 never NULL", after reading the data. .TM. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
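[The "small conversion layer" suggested above, sketched in Python rather than OCaml (the idea is the same: the narrowing from possibly-NULL to never-NULL happens explicitly in application code):]

    def not_null(value):
        """Convert a possibly-NULL column value into a never-NULL one."""
        if value is None:
            raise ValueError("NOT NULL column unexpectedly returned NULL")
        return value

    row = (42, None)                 # as fetched from a driver
    user_id = not_null(row[0])       # ok: 42
    nickname = row[1]                # stays optional, by design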
Re: [GENERAL] Delete/update with limit
Csaba Nagy wrote: First of all, thanks for all the suggestions. put a SERIAL primary key on the table Or: Maybe add OIDs to the table, and delete based on the OID number? No, this is not acceptable, it adds overhead to the insertions. Normally the overhead will be small enough, but on occasions it is noticeable.

How about using the following?

    delete from table where ctid in (select ctid from table limit num);

Here's a live example:

    db=> select count(*) from sometable;
     count
    -------
       381
    (1 row)

    db=> delete from sometable where ctid in (select ctid from sometable limit 5);
    DELETE 5

    db=> select count(*) from sometable;
     count
    -------
       376
    (1 row)

Does anyone see problems with the above delete?

Anyway, do you have figures for the overhead you mentioned? How fast is PG (with OIDs) and how does it compare with the alternatives you're using? In your original post you wrote: On other databases, it is possible to limit the delete to a maximum number of rows to be deleted. I don't know which other databases you're referring to, but are you sure they don't have anything similar to PG OIDs, without even you knowing it, and without any option to disable them? It's even possible that in the other databases you're already paying that overhead, and that makes it quite acceptable in PG, too. Or maybe there's some other kind of overhead, much bigger than the OIDs one? For example, if you're using a high overhead mechanism to consume rows (triggers on delete, inserting into another table), are you sure that in the other databases this doesn't slow all the inserts down much more than adding OIDs in PG would do? PG has MVCC, I guess that makes deletes and inserts on the same table play nice with each other, but how about the other databases? Do they need to acquire a lock on inserts/deletes? That would make your concurrent inserts/deletes much slower than just adding a column to the table. Maybe you could even add an index, and still be faster thanks to MVCC. Also, the trigger is fired once for each deleted row. Have you considered a single stored procedure that loops over the rows to be processed, instead of relying on deletes and triggers? .TM. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
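[A sketch of the ctid-based batched delete above, driven from the client until the table is empty; the table name, batch size and database are invented for the example, and under concurrent inserts the loop simply keeps going until a batch deletes nothing:]

    import psycopg2

    conn = psycopg2.connect(dbname="mydb")
    cur = conn.cursor()
    while True:
        cur.execute(
            "DELETE FROM sometable WHERE ctid IN "
            "(SELECT ctid FROM sometable LIMIT 5000)")
        conn.commit()                    # keep each transaction short
        if cur.rowcount == 0:            # nothing left to delete
            break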
Re: [GENERAL] unexpected shutdown
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] writes: My database has shutdown several times in the last couple days. I have no idea why. I am running centos and I have not rebooted the server or made any configuration changes. So in particular, you didn't disable memory overcommit? LOG: server process (PID 501) was terminated by signal 9 If you didn't issue a manual kill -9, then this is almost certainly a trace of the kernel OOM killer at work. Google for OOM kill to learn more, or see memory overcommit in the PG docs. Memory overcommit is evil on a server. regards, tom lane You guys were right: Jun 17 11:04:57 kernel: Out of Memory: Killed process 24928 (postmaster). I did not disable memory overcommit. I guess this is something I will have to do. I have actually never seen this before or heard of memory overcommit. I am surprised a setting like this comes enabled by default. I read a bit about it and it seems to make sense to disable it, but from practical experience do you know of any negative side effects? The consensus on using overcommit_memory = 2 is far from general. Your problem is a java application with memory issues, so I think you should address that directly first. Either run it elsewhere (and turn the host running PG into a dedicated one), or fix its memory leaks, or use resource limits provided by the OS to limit the java app. Linux kernel people aren't totally clueless about VM. If they chose to keep overcommitting and the OOM killer enabled by default, there are reasons. With overcommitting on, you save a lot of swap space from being allocated, leaving it for stuff that is actually used and not just potentially used. The overall system throughput is thus higher. When it comes to an OOM situation, with overcommitting off things aren't much better. First, OOM happens much sooner than with overcommitting on. This usually isn't perceived as a big disadvantage, since in 95% of the cases the OOM is caused by one runaway process, so sooner or later it will cause OOM either way. But in a correctly administered server, with OS limits configured, a single runaway process doesn't cause OOM. OOM may still happen under excessive load, and I'd rather see my system handle some high load spikes than go into an OOM situation. So lowering the threshold of what 'excessive load' means isn't necessarily a good idea. And OK, let's say you've hit OOM anyway. There's no win-win solution. Having PG processes SIGKILL'd is quite bad. But sitting in front of a keyboard watching your system die w/o being able to log in (OOM, so fork fails) isn't much better. You may be able to do something (sysrq, maybe) but the chances you manage to run a proper shutdown are quite thin, in the general case. So you have to choose between the risk of PG being SIGKILL'd (but the OOM killer _may_ pick the right process instead) and the risk of being forced into hitting the 'reset' button. Either way, your precious data isn't happy at all. So the bottom line is: avoid OOM by properly configuring OS resource limits. If you don't, then overcommit_memory = 2 is _definitely_ better. If you do, it's a hard call. If you think about it, the funny thing is that the more experienced the sysadmin you're talking to is, the less experience he has in handling OOM situations. By definition. :) .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
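[To illustrate the "OS resource limits" advice above: a hedged, Unix-only sketch that caps the address space of a child process, so a runaway allocator gets malloc() failures instead of driving the whole box into OOM. The java command line is hypothetical:]

    import resource
    import subprocess

    def limit_memory():
        # Cap the child's virtual address space at 1 GB (soft and hard limit).
        one_gb = 1 * 1024 ** 3
        resource.setrlimit(resource.RLIMIT_AS, (one_gb, one_gb))

    # preexec_fn runs in the child process, between fork() and exec().
    subprocess.Popen(["java", "-jar", "app.jar"], preexec_fn=limit_memory)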
Re: [GENERAL] unexpected shutdown
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] writes: My database has shutdown several times in the last couple days. I have no idea why. I am running centos and I have not rebooted the server or made any configuration changes. Oh, I forgot. You do have plenty of swap space compared to RAM, yes? If you're running w/o swap, or with little swap, setting overcommit_memory = 2 with the default overcommit_ratio will cut your available RAM by a factor of 2. This thread is interesting reading: http://www.mail-archive.com/pgsql-general@postgresql.org/msg97648.html Since disk space is usually cheap these days, my rule of thumb is (the old one): swap = 2 * ram. Read it this way: if you have 32GB of RAM, you can afford 64GB of disk storage. BTW, this is a good idea both with overcommit on and off, IMHO. .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] PostGreSQL for a small Desktop Application
Gabriele wrote: I'm going to develop a medium sized business desktop client server application which will be deployed mostly on small sized networks and later, eventually, hopefully, on medium sized networks. It will probably be developed using C#. I do need a solid DBMS which can work with the .Net framework. I do know PostGreSQL is a good DBMS in general (it sports most of the advanced DBMS features, transactions and stored procedures included) but I wonder if it is suited for my application. While PG has tons more features than SQLite, the major question here is: do you really need a database _server_? One thing that PG is designed for is handling many (as in 100) concurrent users. Database users, that is, meaning processes (running on different computers) opening a connection and issuing queries. Of course, it handles it very well also when those processes all run on a single server (and all connections are local connections), such as an HTTP server running, say, PHP. That model is very similar to the distributed one, since there's no state shared by the httpd/PHP processes. All shared state is inside the database server. It also happens to be persistent. Technically, that's not simply client/server, it's 3-tier, with httpd/PHP processes being multiple instances of a middle layer. As far as the database server (PG) is concerned, those are (multiple) clients. In this scenario PostgreSQL is at home, that being what it's designed for. To tell the truth, *any* serious RDBMS out there would do. SQLite won't, though, since it's not a server at all - it's just a library. But you mentioned using C#/.Net. AFAIK (but I'm no expert) that's yet a different model. You have a single process (although very likely multithreaded) which is able to hold shared state while serving concurrent clients. Here, a database is just a backend for persistent state (that is, state surviving reboots or crashes). Any good (thread-safe) library that writes to files would do. If you need/want SQL, SQLite comes into play. Actually, this is what it was designed for. It's much easier to install (it's all in a .dll) and administer (close to zero administration, I think) than PostgreSQL (or any RDBMS). For such a use, PG would surely do, but may be just overkill. PG still has advantages vs. SQLite, being more featured (do you need stored procedures?). But if you plan to use an ORM tool for .Net (see: http://www.google.com/search?q=ORM+.Net) you might even be able to switch between SQLite and PostgreSQL at any time w/o even noticing (be sure to choose one that supports both backends, of course). I'm a big fan of both PG and SQLite, and happily use both. When I design an application, I ask myself: is this going to be a strongly database oriented app, with potentially different implementations of the middle layer, or just a server that happens to need a solid and nice way to access data on disk? If you can answer that, the choice is natural: use different tools for different purposes. But also remember that PG can functionally replace SQLite anywhere, but not the other way around. If you have room enough in your toolbox for just one tool, go PostgreSQL. I think the best thing about PG is that it's a terrific general purpose tool: a full RDBMS, extremely reliable, with no compromises, almost covering anything you might need in the features area (even more if you consider how easy it is to extend it), yet light enough to be easily embeddable. .TM. ---(end of broadcast)--- TIP 6: explain analyze is your friend
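[A small illustration of the "switch backends later" point, in Python rather than C#: both drivers speak the same DB-API shape, so code written against the cursor interface barely changes. The file and database names are made up; note the parameter-style difference ("?" vs "%s") is exactly the kind of detail an ORM hides:]

    import sqlite3

    conn = sqlite3.connect("app.db")        # embedded, zero administration
    # import psycopg2
    # conn = psycopg2.connect(dbname="app") # client/server, same API shape

    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    cur.execute("INSERT INTO items VALUES (?, ?)", (1, "widget"))
    conn.commit()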
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: On Tue, 5 Jun 2007, Marco Colombo wrote: AFAIK, files in pg_xlog are first renamed (and only if and after the archive_command returned true) and later overwritten. Never deleted. No, they get deleted sometimes, too. Not often, but it can happen under heavy load if more segments get temporarily created than are normally needed. At checkpoint time, only 2*checkpoint_segments+1 xlog files are kept; if there are more than that, they are removed. Probably never happens on your system from what you've described of it, but it is a possibility. Ok, you're right. Anyway, neither renames nor deletes can happen during my backups. My archive_command prevents it. As Simon just pointed out, the danger with the approach you've taken comes from what happens if a checkpoint occurs in the middle of your backup. You've probably never seen that happen either. As long as that continues to be true, you might be OK for now, but you really need to get to where you're following the recommended procedure rather than trying to do something a little different. There are too many edge cases here that could^H^H^H^H^Hwill bite you one day. Let's say you use the standard procedure. Let's say that your archive_command starts failing at 1:00 AM (say, no space left in the archive directory). Let's say your backup starts at 2:00 AM. Later, at 8:00 AM, you solve the disk full problem, and PG resumes the archiving of WAL segments... as long as there's enough room in pg_xlog for the extra segments, PG is fine. Is your backup broken if a checkpoint happens during the backup? In my understanding, no. There's no need for archiving to happen immediately. My procedure just simulates an archiving failure during the backup, no more, no less. The only difference is that the "problem" is solved right after the backup. By that time, all WAL records created during the backup have already been saved. If you're going to archive WAL segments anyway, my procedure buys you almost nothing. It just saves you from using a trick to artificially fill a segment with garbage data just to have it archived right after the backup. It does so by using another trick, so no big deal. But it allows anyone not willing to take the burden, or face the danger, of actually archiving logs, to take file level backups instead of pg_dumps, without relying on external snapshotting abilities. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Simon Riggs wrote: On Tue, 2007-06-05 at 18:39 +0200, Marco Colombo wrote: I'm asking: what _exactly_ can go wrong? If a checkpoint occurs while taking the backup then the contents of the files will be overwritten...

The contents of which files - data files or WAL segments? My archive command prevents WAL segments from being recycled during the backup.

...and you will be unable to roll forward from before the backup until after the backup. This will give you the FATAL error message "WAL ends before end time of backup dump". You won't know this until you have attempted recovery using those files, even if the scripts give rc=0.

Well, my procedure currently produces two tar archives. One is the 'base backup' (a copy of the data files). One is a copy of the WAL segments, made right after the backup. Which one do you expect to be corrupted if a checkpoint happens during the backup? .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Simon Riggs wrote: Marco Colombo wrote: my method ...is dangerous Ok, but why? Once again, I'm asking: what _exactly_ can go wrong? so we don't get loads of new DBAs picking up this idea but missing the exact point of danger. I'm one of them. I _am_ missing the exact point of danger. Making the assumption that it's OK to archive WAL files *in* the pg_xlog directory exposes you to the risk of having them deleted by the archiver, which will invalidate your backup. I'm sorry, I'm really having a hard time following you here... what are "to archive" and "the archiver"? The archive_command? The tar in the backup procedure? What do you mean by "deleted"? AFAIK, files in pg_xlog are first renamed (and only if and after the archive_command returned true) and later overwritten. Never deleted. Anyway, how could that invalidate the backup? It's all about making a self-contained backup. What happens after that is irrelevant. Hey, I haven't come here proposing a new revolutionary way to perform backups! I've made it pretty clear it was for a not-so-common case. And anyway, I've just asked what may be wrong with my procedure, since it seems to fit _my_ needs and it makes _my_ life simpler, and _I_ don't see any flaw in it. It may be useful to others, _if_ it's correct. If not, I'd like to know why. Can you provide a simple failure scenario, please? That would help me understand what I'm missing... .TM. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: The way you're grabbing files directly from the xlog directory only works because your commit workload is so trivial that you can get away with it, and because you haven't then tried to apply future archive logs.

Well, it's only because I don't need future logs, just like I don't need future files. Backup is at 2:00 AM, any change after that is potentially lost. That includes e-mails, web contents, and database contents. The database contents are in no way different to us. It's the "your commit workload is so trivial that you can get away with it" part I don't really get, but more on this later.

In the general case, circumventing the archiving when the backup is going on won't guarantee everything is ordered just right for PITR to work correctly.

Generic PITR? You mean if backup is at 2:00 AM and the server crashes (all disks lost) at 2:00 PM, you want to be able to recover to some time like 11:00 AM, and be precise about it? That's PITR to me - and the precise part is key here... either the time or the transaction ID would do, the point is being able to draw a line and say anything before this is correct. Well, if that's what you mean by PITR, I never claimed my method would give you that ability. I'm pretty aware it won't do, in the general case. If you need that, you need to archive all the logs created after the backup, that's pretty obvious. But even under heavy write load, my method works, if the only point in time you want to be able to recover is 2:00 AM. It works for you too, it gives you a nice working backup. If you also need real PITR, your archive_command is going to be something like:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock && cp %p /my_archive_dir/%f'

I consider what you're doing a bad idea that you happen to be comfortable with the ramifications of, and given the circumstances I understand how you have ended up with that solution. I would highly recommend you consider switching at some point to the solution Simon threw out:

    create table xlog_switch as select '0123456789ABCDE' from generate_series(1,100);
    drop table xlog_switch;

Ok, now the segment gets rotated, and a copy of the file appears somewhere. What's the difference between having the archive_command store it and having your backup procedure store it? Let's say my archive_command is a cp to another directory, and let's say step 5) is a cp too. What exactly does forcing a segment switch with dummy data buy me, instead of doing a cp myself on the real segment data? I mean, both ways would do.

you should reconsider doing your PITR backup properly--where you never touch anything in the xlog directory and instead only work with what the archive_command is told.

Well, I'm copying files. That's exactly what a typical archive_command does. It's not special in any way, just a cp (or tar or rsync or whatever). Unless you mean I'm not supposed to copy a partially filled segment. There can be only one, the others would be full ones, and full ones are no problem. I think PG correctly handles the partial one if I drop it in pg_xlog at recover time. That segment you need to treat specially at recover time, if you use my procedure (in my case, I don't). If you have a later copy of it (most likely an archived one), you have to make it available to PG instead of the old one, if you want to make use of the rest of the archived segments. If you don't want to care about this, then I agree your method of forcing a segment switch is simpler. There's no partial segment at all. Anyway, it's running a psql -c at backup time vs.
a test -nt && rm at restore time, not a big deal in either case. .TM. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: On Thu, 31 May 2007, Marco Colombo wrote:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock > /dev/null'

Under normal conditions (no backup running) this will trick PG into thinking that segments get archived. If I'm not mistaken, PG should behave exactly as if no archive_command is configured, and recycle them ASAP. That's correct. I don't think you even need the /dev/null in that command.

Ok, thanks. I've seen that /dev/null somewhere in the docs, and blindly copied it.

Should a WAL segment fill up during the backup (unlikely as it is, since the system is mostly idle AND the tar completes within a minute - but it's still possible), the test command would report failure in archiving the segment, and PG would keep it around in pg_xlog, ready to be tar'ed at step 5 (mind you - this is speculation since I had no time to actually test it). That's also correct. What you're doing will work for getting a useful backup.

Great, that's all I need.

However, recognize the limitations of the approach: this is a clever way to make a file-system level snapshot of your database without involving the archive logging process. You'll get a good backup at that point, but it won't provide you with any ability to do roll-forward recovery if the database gets screwed up in the middle of the day. Since that's a requirement of most PITR setups, I'm not sure your workaround accomplishes what you really want. More on why that is below.

Here's the original thread I started: http://archives.postgresql.org/pgsql-general/2007-05/msg00673.php Briefly, I don't need PITR proper, it may even be harmful in my case. The data in the db may be tied to the data on the filesystem in ways unknown to me... think of some kind of custom CMS. I'm able to restore .html, .php, .png or whatever files as they were at backup time (say, 2:00 AM). All I need to do with PG backups is restore db contents at the same time (almost). The only point in time I'm interested in is "backup time", so to say. Restore would be done the usual way, extracting both the archives, maybe adding WAL segments from the crashed pg_xlog. Whether I need to configure a fake restore command I have still to find out.

This won't work, and resolving it will require coming to grips with the full archive logging mechanism rather than working around it the way you suggest above.

This is interesting. Why won't it work, exactly? Let's say I trick PG into thinking it's a recovery from backup + archived WAL. It'll find all segments it needs (and no more) already in pg_xlog. I expect it to just use them. Maybe I'd need to configure /bin/false as restore_command. Or maybe just something like 'test -f /var/lib/pgsql/data/pg_xlog/%f' (true if the file is already there). I'll have to experiment, but I don't see any major problem right now. The files are already there.

Every time the server hits a checkpoint, it recycles old WAL segments--renames them and then overwrites them with new data. The first time your database hits a checkpoint after your backup is done, you will have lost segment files such that it's impossible to recover the current state of the database anymore. You'll have the first part of the series (from the base backup), the last ones (from the current pg_xlog), but will be missing some number in the middle (the recycled files).

Sure, now I see what you mean, but I was under the assumption of very low database activity; in my case, it's about 2 WAL segments/day.
I usually see files in my pg_xlog that are 2 days old, so there won't be any missing segments. And anyway, the ability to recover at some time after the backup is just a plus. I don't need it. In case of a complete crash, I'm going to restore the whole system as it was at backup time. And if only the PG datadir gets corrupted later, and I want to try and recover it as it was at that later time, I still have a 99% chance of being able to do so, due to very low write activity. And if that fails, because of some uncommon write activity right at that inconvenient time, I can just fall back to the case of a complete system crash. The chances of that happening are possibly lower than those of a system crash, so I'm not worried about it. I think all we want is a backup that is immediately usable, w/o waiting for the WAL segment it relies on to be archived. That is, if taken at 2:00 AM, it may be used to recover from a crash at 2:10 AM (assuming the backup process ended by that time, of course). If you need *both* a full backup *and* PITR, just add a real cp to the archive_command above. The important part is to return failure during the backup process, I think. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] PITR Base Backup on an idle 8.1 server
Greg Smith wrote: [...] -Find something harmless I can execute in a loop that will generate WAL activity, run that until the segment gets archived. Haven't really thought of something good to use for that purpose yet.

Some time ago I started a thread about taking on-the-fly backups at file level on idle servers. The problem was much the same as yours. After posting that, I'm doing some research on my own (in spare time) now. Currently, I'm using the following procedure:

1) create a backup lockfile
2) issue pg_start_backup()
3) tar the data directory, excluding pg_xlog
4) issue pg_stop_backup()
5) tar pg_xlog
6) remove the lockfile

Meanwhile, a fake WAL archiving is active, which does pretty much nothing:

    archive_command = 'test ! -f /var/lib/pgsql/backup_lock > /dev/null'

Under normal conditions (no backup running) this will trick PG into thinking that segments get archived. If I'm not mistaken, PG should behave exactly as if no archive_command is configured, and recycle them ASAP. This saves me the burden of taking care of the archiving at all. Should a WAL segment fill up during the backup (unlikely as it is, since the system is mostly idle AND the tar completes within a minute - but it's still possible), the test command would report failure in archiving the segment, and PG would keep it around in pg_xlog, ready to be tar'ed at step 5 (mind you - this is speculation since I had no time to actually test it). So it ends up with two tar archives: one is the datafiles backup, the other the WAL segments. As an optimization, I should exclude WAL segments older than the lockfile in step 5), since I know they are older than the backup. What I really should do now is kill -STOP the tar at step 3), start some big write activity and see what exactly happens to the WAL segment when it fills up and PG tries to archive it. Restore would be done the usual way, extracting both the archives, maybe adding WAL segments from the crashed pg_xlog. Whether I need to configure a fake restore command I have still to find out. Hope it helps, .TM. ---(end of broadcast)--- TIP 6: explain analyze is your friend
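[For readers who want to try it, one possible driver for the six steps above, sketched in Python; the paths, backup label and tar invocations are illustrative assumptions, not part of the original post:]

    import os
    import subprocess
    import psycopg2

    PGDATA = "/var/lib/pgsql/data"
    LOCK = "/var/lib/pgsql/backup_lock"

    conn = psycopg2.connect(dbname="postgres")   # must be a superuser session
    cur = conn.cursor()

    open(LOCK, "w").close()                      # 1) create the lockfile
    try:
        cur.execute("SELECT pg_start_backup('nightly')")        # 2)
        subprocess.check_call(                   # 3) data dir without pg_xlog
            ["tar", "-C", PGDATA, "--exclude=pg_xlog",
             "-czf", "/backup/base.tar.gz", "."])
        cur.execute("SELECT pg_stop_backup()")                  # 4)
        subprocess.check_call(                   # 5) now the WAL segments
            ["tar", "-C", PGDATA, "-czf", "/backup/wal.tar.gz", "pg_xlog"])
    finally:
        os.unlink(LOCK)                          # 6) remove the lockfile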
Re: [GENERAL] a few questions on backup
Marco Colombo wrote: I'll try that out. Maybe my ideas are so far from the truth that I'm having a hard time explaining them to people who actually know how things work. I'll be back with results. Meanwhile, thanks for your time. I think I finally got it. Segment 34 in my pg_xlog got archived and recycled. It became segment 39, but PG is still working on segment 35, after some hours. Now pg_xlog contains 5 segments, named from 35 to 39, 35 being the most recently modified. 39 won't be used for about a couple of days yet. Now I see what you mean by "recycled": I thought it meant "marked free for later use", but it means "renamed for future use". My mistake was assuming that the rename part happens lazily when PG starts using the file. Instead, it happens right after (the eventual) archiving. That makes the strategy in my original post somehow unfeasible. Still, I was not completely wrong:

    # cmp /var/lib/pgsql/data/pg_xlog/000100010039 /u1/pg_wal_archive/000100010034 && echo Yes
    Yes

They do contain the same data, that of segment 34, and the *39 file will stay there, untouched, for quite a while after the backup. So the WAL segment I need *is* there, just with a different name. The only problem is figuring out which segment that data actually belongs to. I know only because I can compare it with the archived one. Now, I could still make some educated guesses, by looking at modification times, but definitely a guessing game is not something you want to play when restoring your precious data. :) Archiving the WAL segments and letting the recovery procedure handle them at restore time is easier anyway. Again, thanks a lot. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] a few questions on backup
Tom Lane wrote: Marco Colombo [EMAIL PROTECTED] writes: Good to know, thanks. I think I'll experiment a bit with archive_command. My point was that since I know (or better, assume) that old segments are going to stay in my pg_xlog for *days* before getting recycled, On what do you base that assumption? Once the system thinks they're not needed anymore, they'll be recycled immediately. regards, tom lane Well, now that you make me think of it, I do make some assumptions. One is that only one file in pg_xlog is the active segment. Two is that I can trust modification times (so that a file inside pg_xlog that looks old is actually old... and since postgresql does not run as root, it couldn't cheat on that even if it tried to). The best thing I can do is to configure archiving, and see what gets archived exactly. I'm making assumptions there too. I expect to find, for each file in pg_xlog, a copy in the archive directory (say archiving is done with cp), with one exception: the segment currently being written to. For that one there will be a file with the same name but different contents (and an older modification time). I'll try that out. Maybe my ideas are so far from the truth that I'm having a hard time explaining them to people who actually know how things work. I'll be back with results. Meanwhile, thanks for your time. .TM. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] a few questions on backup
Tom Lane wrote: No. You have to have an actual archive_command script copying the WAL segments somewhere else when told to. An asynchronous copy of the xlog directory will be nothing but garbage, because we recycle WAL segments as fast as we can (ie, as soon as the archive_command claims to have saved the data). Mmm, sorry, I'm not sure I'm following here. Maybe I should provide some background. In my pg_xlog directory I see five files, WAL segments I suppose. Only one (as I expected) is currently being used, the others are old (one a couple of days old). When PG performs a switch from one segment to another (I assume it recycles the oldest available), does it archive the recycled one (before starting to use it, of course) or the just-filled one? If it's the one being recycled, it means that in my setup it would take two days to archive a segment after it stopped being used. Am I missing something? 1) 2) and 3) are OK, but you need to use archive_command to collect the xlog segments. Actually ... given your low requirements, I wonder why you don't just stop the postmaster, tar the datadir, start the postmaster. Well, currently we do a pg_dump. The database mainly supports dynamic websites. It's very unlikely they get updated at the time the backup runs, and overall there is little updating even during the day, but I don't like stopping the postmaster because, even if the write load is negligible, the read one might not be. It's still small enough that a tar (to disk) might take only a minute or two to complete, but yet it's a minute of downtime for the web sites. If I can avoid that, why not? I'm not unsatisfied with pg_dump, and I agree that with my requirements the whole issue is academic. I just wanted to learn how it works exactly; such knowledge could prove useful for doing the Right Thing in case of trouble. Maybe it's the right time for me to have a look at the source... Hannes Dorbath wrote:

    lvcreate -s -L5G -nbackup /dev/foo/postgresql
    mount /dev/foo/backup /mnt/backup-snap
    tar jcpvf pg-backup-time_stamp.bz2 /mnt/backup-snap

You can't do much wrong with that, it's fast and easy to use. Been there, done that. In my environment (Fedora Core 6) it's fast and easy, but not reliable, unfortunately. Sometimes the snapshot won't get created, sometimes it won't get removed after the backup is done. .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] a few questions on backup
Richard Huxton wrote: It calls archive_command on the just-filled one. Good to know, thanks. I think I'll experiment a bit with archive_command. My point was that since I know (or better, assume) that old segments are going to stay in my pg_xlog for *days* before getting recycled, just copying them all after the call to pg_stop_backup() should be enough, in my case. It's more than I need, even. You do know that pg_dump gives you a guaranteed accurate snapshot of the database? It doesn't matter if it's in use. Yes, I know, that's why I'm using it. A dump is also useful in that you can restore it more easily on a different system. A text dump allows you to make changes before restoring, even with sed. I'm comfortable with it. The reasons for playing with WAL segments for backups are: 1) learning how it works; 2) everything else is backed up with cpio; 3) at restore time I need to have PG running already, and there may be something different at database initialization. Reason 1) is the driving one, by far. I can handle the rest (i.e. the compressed dump is saved as part of the cpio archive). .TM. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] a few questions on backup
Hello, I have a few questions on backing up a PostgreSQL server (let's say anything 8.x.x). I've read "Continuous Archiving and Point-In-Time Recovery (PITR)" in the manual, but I'm still missing something... well, actually I think I don't, but I've been debating this with a friend for a while, and there's something we don't seem to agree on, so I need some bits of clarification. :) Ok, let's say what I really need is poor man's PITR. That is, I perform only one daily full backup, and I need to be able to restore the database contents as they were at backup time. I don't even need to know the exact time (yes, that means I don't really care about which transactions turn out to be committed and which don't). Mmm, ok, that can't be rightfully called PITR at all, maybe. Am I right in assuming that the following procedure is ok?

1) issue pg_start_backup();
2) copy (or tar or cpio) the data dir, w/o pg_xlog/
3) issue pg_stop_backup();
4) copy (or tar or cpio) pg_xlog/ contents.

That's all. Please note that I'm doing NO WAL archiving. Whether this is going to work or not depends on what exactly pg_start_backup() does. According to him (my friend), it may simply prevent PostgreSQL from writing to data files until pg_stop_backup() is issued, thus kind of enforcing a snapshot on the data files themselves. Changes go to WAL only, and they are played on data files only after pg_stop_backup(). Hmm, I don't really like that. But if so, I don't even need step 4), since the tar backup would be consistent. I'm able to restore from that only, and data are those at the time of pg_start_backup(). Or (that's me), pg_start_backup() only ensures that full_page_writes is enabled during the backup phase, so that any change to data files can be undone/redone based on the contents of WAL segments. The tar archive contains potentially inconsistent data; that's why I need the WAL segments too. At restore time, PG eventually performs a crash recovery, and data are those at the time of step 4) (which is an interval really, some time between the start and the end of the copy, but I don't need to be able to tell the exact time anyway). BTW, I see a vulnerability in the above procedure... if there's enough write activity that PostgreSQL recycles WAL segments between 1) and 4), some changes may be lost. At step 4) I need to save _all_ WAL segments that have been produced during the backup time. I assume that activity at backup time is low enough that the event is extremely unlikely. Another doubt I have is about WAL checkpointing... is it possible that a checkpoint happens during step 2), and the tar archive gets data files both from before and from after the checkpoint (which I think is bad), or does pg_start_backup() prevent WAL checkpointing, too? Finally, if I'm missing something and the above is wrong, I think that the only way to perform a full backup on a live database at filesystem level is to enable WAL archiving as the first step of the backup procedure (assuming it's not usually on, of course), and later save all the WAL segments that were _archived_ during that time, including the one made at pg_stop_backup(), on the same backup medium. If I understand the documentation right, nothing breaks if the archive_command just returns OK w/o saving anything during regular operation, and starts saving segments only during the backup time... I mean, that prevents me from doing arbitrary PITR after the backup, but all I want to do is perform a full backup, w/o being able to do any partial backup after that.
If so, I may write a script that does nothing most of the time, and archives WAL segments only to be included in the full backup. TIA, (and sorry if the above sounds a bit messy) .TM. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Views- Advantages and Disadvantages
Ashish Karalkar wrote: Hello All, Can anybody please point me to the advantages and disadvantages of using views? With Regards Ashish... Well, IMHO views are part of the business logic and not of the data model. You can also think of them as an API to access the data from applications (clients). By defining some nice views, you allow writing a client with little knowledge of the actual database design. And clients written by different people access the data consistently. However, this is a two-edged sword. An API is usually designed to be generic enough. One day you may find you need only part of the functionality, and that you could do that part more efficiently. That's especially true if the API is used to hide the details away from you. Normal clients may be given access only to the views and not to the actual tables. That's a pretty good design principle, but again it cuts both ways. Think of a database with a books table and an authors table, with a nice view that joins them. One day you are writing a client application and want to fetch just the list of book ids. Yes, you can select one column from the view, but why execute the join when you don't need it? But if you're given access only to the view, you can't do much about it. Of course this is not specific to views, it's true for any abstraction layer in any context. .TM. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
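[A hedged illustration of the "you can't skip the join" point: compare the plan for selecting ids from the base table with the plan for the same column through the joining view. The table, view and database names are invented for the example:]

    import psycopg2

    conn = psycopg2.connect(dbname="library")
    cur = conn.cursor()
    for sql in ("EXPLAIN SELECT book_id FROM books",
                "EXPLAIN SELECT book_id FROM books_with_authors"):
        cur.execute(sql)
        print(sql)
        for (line,) in cur.fetchall():   # EXPLAIN returns one plan line per row
            print("   ", line)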
Re: [GENERAL] Partial dates
On Wed, 2005-09-14 at 15:49 +1200, Brent Wood wrote: Sanitizing is one thing, inventing data to fit an incomplete value into a date datatype is not good practice. Choose another datatype, or make a new one, or split the date into columns. The type of your data is not a timestamp nor a date. In the first place, 0 is not NULL. So, even 1980-01-00 would be different from 1980-01-NULL. For example, assuming 1980-01-00 is defined to have some meaning, (1980-01-00 < 1980-01-02) is likely to be true, but (1980-01-NULL < 1980-01-02) definitely is not. You're just asking if there's a way to store a number of which the lower bits are ignored (considered NULL). Obviously, no: you need a different datatype or a different arrangement. Note: the string 1980-01-00 just *looks* like a possible value, but it definitely is not: there's simply no space (or time) between 1979-12-31 and 1980-01-01. It's much like trying to store sqrt(-1) into a real. I can hardly imagine how MySQL manages to store that (the 1980-01-00, I mean). .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
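[A sketch of the "make a new one" option: a dedicated partial-date type with the NULL-like comparison semantics argued above, where an unknown day makes some comparisons undecidable (None) - exactly what a plain date column cannot express:]

    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class PartialDate:
        year: int
        month: int
        day: Optional[int]               # None = unknown, not 0

        def earlier_than(self, other):
            """Three-valued comparison: True, False, or None for 'unknown'."""
            if (self.year, self.month) < (other.year, other.month):
                return True
            if (self.year, self.month) > (other.year, other.month):
                return False
            if self.day is None or other.day is None:
                return None              # same month, day unknown
            return self.day < other.day

    # 1980-01-?? vs 1980-01-02: undecidable, so the result is None.
    print(PartialDate(1980, 1, None).earlier_than(PartialDate(1980, 1, 2)))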
Re: [GENERAL] Block Size and various FS settings
On Wed, 2005-09-14 at 11:25 +0300, Michael Ben-Nes wrote: After a week of testing i decided to go with JFS as the FS for Postgres. im not an expert benchmarker so i hope i initiated the right parameters in bonnie. Any way here are the results of bonnie++ & pgbench: http://www.canaan.co.il/users/miki/stats/stats.html Cheers Have you tried the data=journal / data=ordered / data=writeback mount options for ext3? If so, did they make any difference? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] RAMFS with Postgres
On Fri, 2005-07-22 at 15:56 +0100, Alex Stapleton wrote: On 21 Jul 2005, at 17:02, Scott Marlowe wrote: On Thu, 2005-07-21 at 02:43, vinita bansal wrote: Hi, My application is database intensive. I am using 4 processes since I have 4 processors on my box. There are times when all the 4 processes write to the database at the same time and times when all of them will read all at once. The database is definitely not read only. Out of the entire database, there are a few tables which are accessed most of the time and they are the ones which seem to be the bottleneck. I am trying to get as much performance improvement as possible by putting some of these tables in RAM so that they dont have to be read to/written from hard disk as they will be directly available in RAM. Here's where slony comes into the picture, since we'll have to maintain a copy of the database somewhere before running our application (everything in RAM will be lost if there's a power failure or anything else goes wrong). My concern is how good Slony is? How much time does it take to replicate the database? If the time taken to replicate is much more than the perf. improvement we are getting by putting tables in memory, then there's no point in going in for such a solution. Do I have an alternative? My feeling is that you may be going about this the wrong way. Most likely the issue so far has been I/O contention. Have you tested your application using a fast, battery backed caching RAID controller on top of, say, a 10 disk RAID 1+0 array? Or even RAID 0 with another machine as the slony slave? Isn't that slightly cost prohibitive? Even basic memory has enormously fast access/throughput these days, and for a fraction of the price. We are comparing a RAM + network solution vs. a RAM + disk solution. RAM alone is not enough, since the OP wants 100% safety of data. Then you need a network solution, and it has to be synchronous if you want 100% safety. No network is going to beat a directly attached disk array on the basis of performance/price. Slony, by the way, is quite capable, but using a RAMFS master and a disk drive based slave is kind of a recipe for disaster in ANY replication system under heavy load, since it is quite possible that the master could get very far ahead of the slave, since Slony is asynchronous replication. At some point you could have more data waiting to be replicated than your ramfs can hold and have some problems. If a built in RAID controller with battery backed caching isn't enough, you might want to look at a large, external storage array then. many hosting centers offer these as a standard part of their package, so rather than buying one, you might want to just rent one, so to speak. Again with the *money* RAM = Cheap. Disks = Expensive. At least when you look at speed/$. You're right about replicating to disk and to ram though, that is pretty likely to result in horrible problems if you don't keep load down. For some workloads though, I can see it working. As long as the total amount of data doesn't get larger than your RAMFS it could probably survive. Ever heard of the page cache? If your data fits your RAMFS, it would fit the OS cache just the same. For reads, the effect is exactly the same. And just disable fsync if writes are a problem. It's anyway safer than RAMFS, even if not 100% safe. Face it, if you want 100% safety (losing nothing in case of power failure), you need to synchronously write to _some_ disk platter. Where this disk is attached is a matter of convenience.
_If_ disk write throughput _is_ the problem, you have to fix it. Be it on the local host, or on a remote replica server, the disk system has to be fast enough. Consider:

1) PostgreSQL -> RAM -> disk
2) PostgreSQL -> RAM -> network, network -> RAM -> disk

no matter whether you choose 1) or 2), the "disk" part has to be fast enough. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] RAMFS with Postgres
On Thu, 2005-07-21 at 07:43 +0000, vinita bansal wrote: Hi, My application is database intensive. I am using 4 processes since I have 4 processors on my box. There are times when all the 4 processes write to the database at the same time and times when all of them will read all at once. The database is definitely not read only. Out of the entire database, there are a few tables which are accessed most of the time and they are the ones which seem to be the bottleneck. I am trying to get as much performance improvement as possible by putting some of these tables in RAM so that they dont have to be read to/written from hard disk as they will be directly available in RAM. Here's where slony comes into the picture, since we'll have to maintain a copy of the database somewhere before running our application (everything in RAM will be lost if there's a power failure or anything else goes wrong). My concern is how good Slony is? How much time does it take to replicate the database? If the time taken to replicate is much more than the perf. improvement we are getting by putting tables in memory, then there's no point in going in for such a solution. Do I have an alternative? Regards, Vinita Bansal You see, if those frequently accessed tables are mostly read-only, there's no need at all to use RAMFS. They already are cached for sure in either PostgreSQL buffers or the underlying OS page cache. If you don't have enough RAM for that, increase it. Using RAMFS only makes things worse. The OS page cache really knows about frequently accessed data, usually much better than you do. If there are frequent writes on those tables, still RAMFS is not the answer. Have a look at the documentation and disable sync on writes, and you'll get a similar effect (writes are in RAM and will be synced on disk in blocks, much more efficiently). Of course you lose the safety of data this way, in case of power failure, but it's still much better than RAMFS; at least most of the data is on disk. As for Slony, or other solutions, consider this: the _only_ way to have data safety is to return 'OK' to the application only after you're _sure_ that the data is on some disk platter. So, even with a replica server, data has to be transferred over the net, committed remotely, the commit notification has to come back over the net, and _then_ the database says 'OK' to the application. This is going to be quite slow, possibly slower than synchronous writing on the local disks (notice that the protocol is the same: send data to the disk, wait for a write completed notification, say 'OK' to the application). My advice is: tune your PostgreSQL the best you can, _with_ sync write enabled. Review and optimize your SQL. Do not use RAMFS. Upgrade your hardware if that's not enough. Consider distributing the load on different servers (you'll need a multi-master solution for that, search the archives), that is, upgrade your hardware in number not in size. I hope it helps, .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] RAMFS with Postgres
On Tue, 2005-07-19 at 16:45 +0000, vinita bansal wrote: "Hi, I am trying the RAMFS solution with Postgres, wherein I am pushing the most heavily used tables into RAM."

Why? I mean, what problem are you trying to solve?

"I have 32GB RAM on a 64-bit Opteron machine. My database size is 40GB. I think Linux allows a max. of 16GB (half of available RAM) to be used directly to push tables to it. I am concerned about reliability here (what if there is a power failure). What are the things that need to be considered and what can be done to ensure that there is no data loss in case something goes wrong. What steps must be taken to ensure data recovery. I am planning to use Slony replication to replicate my database to a diff node so that in case something goes wrong, I can restore it from the replication node and start my runs on that data again. The only problem here is that I need to run engines from the beginning. Is there any other way of doing the same thing, or is such a thing good enough given the fact that a failure like this happens very rarely. The most imp. thing for me is the **data**, which should not be lost under any circumstances."

Then don't use RAMFS. Slony may be a good idea, but it's hard to tell if you don't provide more info. What is the database used for? - heavy, long-running, CPU-bound, read-only queries? - many simple queries, but over the whole dataset (thus I/O-bound)? - many INSERTs/UPDATEs? Is the database accessed by many concurrent users? How many of them are mostly read-only and how many perform writes? Each problem in each scenario may have a different solution...

"Has anyone used Slony replication before. How good is it. Is there anything else available which is better than Slony replication?"

'Better' is meaningless w/o a context. There are tasks for which Slony may be the best tool in the world, and others that require a totally different approach. First you have to define what your problem is, and why the obvious solution (a normal PostgreSQL server, with a standard filesystem) does not work/fit. Then you choose a solution.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to create unique constraint on NULL columns
On Fri, 2005-07-15 at 13:46 +0300, Andrus wrote: "I have table CREATE TABLE test( col1 CHAR NOT NULL, col2 CHAR, UNIQUE (col1, col2) ); This table allows inserting duplicate rows if col2 is NULL: INSERT INTO test VALUES ( '1', NULL ); INSERT INTO test VALUES ( '1', NULL ); does NOT cause an error!"

The two rows are not duplicates. NULL means 'any value' in that context, so you can only say that the values in the first column are equal. Nothing can be said about the values in the second column. The database can't say they are the same, and can't say they are different either. http://en.wikipedia.org/wiki/Null_%28SQL%29

"How to create a constraint so that NULL values are treated as equal and the second insert is rejected?"

I think you can do that with special operators (such as IS DISTINCT FROM), but you're using NULL as a normal value, which is not what it's meant to be in the first place, and I advise being careful: http://www.postgresql.org/docs/8.0/interactive/functions-comparison.html - please read the paragraphs on ... = NULL, IS NULL, and IS DISTINCT FROM.

NULL is meant to represent lack of knowledge (unknown). If you ever find yourself treating NULL as a real value (i.e. comparing it to other values or to other NULLs), you must think twice about your design. IMVHO, the only place for IS NULL and IS DISTINCT FROM is meta-operations on data, administrative tasks, and so on. There should be no need to use them in normal queries. Unless you're coding quick and dirty hacks, when you really know what you're doing but don't care about the correctness of your design, of course.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
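[That said, if a design really does need NULLs-treated-as-equal, one possible workaround is a unique index over an expression that maps NULL to a sentinel, with all the caveats about sentinel values discussed in this thread. A minimal sketch using the pgdb driver; the temp table and the empty-string sentinel are made up for illustration:

import pgdb  # PyGreSQL's DB-API module

db = pgdb.connect()
curs = db.cursor()
curs.execute("CREATE TEMP TABLE test (col1 CHAR NOT NULL, col2 CHAR)")
# COALESCE folds NULL into the sentinel '', so two ('1', NULL) rows collide
curs.execute("CREATE UNIQUE INDEX test_uniq ON test (col1, (COALESCE(col2, '')))")
curs.execute("INSERT INTO test VALUES ('1', NULL)")
try:
    curs.execute("INSERT INTO test VALUES ('1', NULL)")  # now rejected
except pgdb.DatabaseError, e:
    print "second insert rejected:", e

Expression indexes of this form need PostgreSQL 7.4 or later, and of course the sentinel must be a value that can never legitimately appear in col2.]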
Re: [GENERAL] automating backup ?
On Sun, 2005-06-26 at 11:18 +0200, Zlatko Matic wrote: "How to automate backup, so that Postgres automatically backs up, for example, once a week? The same question about vacuum. Concerning backup, how to prevent that someone makes a copy (for example pg_dumpall) of a database,"

pg_dumpall is no special case. It just runs some SQL queries. Your users are subject to the normal permission checking.

"then installs a new instance of Postgres, creates the same user account that was the original owner, and then restores the database. In that case all restrictions would be overridden, right?"

No, because pg_dumpall doesn't override any restriction. Of course, if someone gets his hands on _your_ backups (made with full permissions), he can access everything, unless you encrypted them.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
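[The scheduling half of the question is ordinary cron territory, since pg_dumpall and vacuumdb are plain commands. A hypothetical crontab for the postgres user; paths and times are placeholders, and note that % must be escaped as \% inside crontab entries:

# weekly dump, Sunday 03:00
0 3 * * 0  pg_dumpall > /var/backups/pg/dumpall-$(date +\%Y\%m\%d).sql
# nightly VACUUM ANALYZE of all databases
0 2 * * *  vacuumdb --all --analyze --quiet
]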
Re: [GENERAL] Win32 users?
On Sun, 2005-06-26 at 10:59 -0700, Bob Pawley wrote: "I'll date myself so you all may know from where I am coming. My first lesson in binary math took place in a classroom in 1958. Since then I have witnessed a lot that has since swept under the bridge. This thread reminds me of the discussion that surrounded the complexity of using the first spreadsheets (the precursor to today's Databases)"

What? Well, I admit I was not there in the 60's, but AFAIK databases predate spreadsheets by far.

[...] "In my 47 years of being somewhat aligned to the software industry this story was repeated over and over again. There is a lesson to be learned from the Wang experience. People want tools to do tasks. They do not want to spend their own time (or their own money to hire others) to build, design, repair or change the tools that they need just in order to accomplish their own work - the work they know best and from which they make a living. A good tool - a perfect tool - is like a hammer. Its use is immediately known and it can be deployed quickly, accurately and with little or no specialized training."

Pardon me, but the last sentence sounds ridiculous. Think of a master smith making a perfect katana. Guess which tool he's going to use mostly? The hammer. Explain to me how "its use is immediately known and it can be deployed quickly, accurately and with little or no specialized training" applies here. Especially the last part. Do you _really_ think that just looking at even a remarkably simple tool such as a hammer makes you able to accomplish (accurately!) _anything_ that can be done with it? Before answering, think about Michelangelo or Da Vinci.

"So I caution all to not make light of newbies who are searching for good tools (not even perfect tools - yet) to do the work that needs doing. The world will not sit by and continue to pay for Wang operators."

There are tasks that require the human brain in order to overcome the lack of raw performance of computers. Those tasks are not complex; it's the computer that is slow. Give the computer industry 20 years of advances, and those tasks will require no human brain at all, for the computers will be fast enough. But not all tasks are like that. Some will grow with the computers: the bigger the processing power, the more data you want to process. Some are just complex at human-brain level, _no matter what tool they involve_. We have word processors these days, but they don't turn us all into great poets and writers, even if they are terribly better compared to quills. There is some (brain) complexity in computer systems in general, and there is in databases. It's just that some tasks are not for newbies, be the tool a hammer or an RDBMS. When the tool is aimed mostly at such tasks, there's little need to make it too newbie-friendly.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] vulnerability/SSL
On Wed, 2005-06-08 at 10:00 -0700, dong changyu wrote: "Hi, A possible countermeasure on Windows platform, inspired by Magnus. Thanks ;) First we remove the passphrase from the key file, making it plain. Windows provides a feature, the Encrypting File System (EFS), which provides transparent encryption/decryption. We can log on using the account we run Postgres with and encrypt the plaintext key file. Then we log on using another non-admin account, and start Postgres as a service using 'runas'. Therefore the file is encrypted; only the Postgres account and the recovery agent (built-in administrator by default) can read/modify it. The file will remain encrypted when restored from backup. I've tested it on my computer and it works. cheers, Changyu"

You mean that every process that runs as postgres has the ability to read the file _without typing any password_? Or does PostgreSQL prompt for one when you start it? Can the administrator read it _without knowing the password_? I may be missing something, but what's the difference with a file like this:

-r-------- 1 postgres postgres 50 Jan 15 21:15 akey

in any Unix system? Only postgres and root can read it. How about backups? Does the backup process (I assume it runs as administrator) store the key in cleartext?

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
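[For reference, the Unix-side arrangement being compared to is just standard ownership and permission bits; a sketch, with an illustrative path:

chown postgres:postgres /var/lib/pgsql/data/server.key   # owned by the postgres account
chmod 0400 /var/lib/pgsql/data/server.key                # read-only, owner only

which yields exactly the '-r--------' listing shown above: readable by postgres and root, and by nobody else.]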
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 02:59 -0700, Changyu Dong wrote: "Hi Marco, The problem I described in the first mail is that, for some unknown reason, if you save the server.key file with a passphrase, you will be prompted to enter the passphrase every time you start the server AND a client makes a connection, which effectively forbids us from using a passphrase to protect the key file; therefore the key file has to be saved in plaintext, without encryption. EFS is a feature provided by Windows which will encrypt any selected file using a symmetric algorithm; the symmetric key is encrypted by the user's public key and the recovery agent's public key, and the encrypted key is saved within the file header."

As long as the 'postgres' user has access to it w/o typing any password, that's only a detail. Unless someone physically steals your disk, the fact that it's stored encrypted is irrelevant. The only thing that matters is who can access it, and how.

"Thus only the user and the recovery agent can decrypt it. And another user cannot even open it (but can delete it). So we can ensure no one can read or modify it."

That's how the permission bits work in Unix. No need to encrypt the file; we know permission bits actually work as expected under Unix. In this case encryption adds no extra level of security on a running system.

"Decryption is transparent to users and applications. The operating system will do it automatically if it can find the appropriate private key. The difference between this and -r-------- 1 postgres postgres 50 Jan 15 21:15 is that the file is encrypted using EFS, while the latter remains plaintext."

I fail to see the difference. On Windows, the 'postgres' user can read it without a password; 'Administrator' has access to it, too. On Unix, with 400 permissions, the 'postgres' user can read it without a password; 'root' has access to it, too.

"When you backup the file, it remains encrypted."

Then the backup is useless. If the secret key of the user 'postgres' is lost (and it can be, since it is stored elsewhere, I think buried somewhere where 'Administrator' can find it, maybe in the user profile), you'll never recover the content of the file.

"If you restore the file to a file system which doesn't support EFS (non-NTFS), it will be corrupt, else it will remain encrypted."

Now THAT puzzles me a lot. I could imagine it being restored in plain. I could imagine it being restored encrypted. I have no way to justify the file contents being lost only because they were restored on FAT. Anyway, that's not the point here. The point is: on Windows, if someone breaks into your 'postgres' account, he can read the key. If someone breaks into your 'administrator' account, he can read the key. But other users cannot read it. This level of protection is exactly the same as that provided by the 400 permissions above under Unix: if someone breaks into the 'postgres' account, he can read the key; if someone breaks into the 'root' account, he can read the key; but other users cannot read it. I fail to see any difference in the above scenarios.

Encrypting the key in the .pem file (as supported by openssl) is completely different! No one, ever, can access it w/o knowing the password. That's why it takes an operator to type the password in. Backups are safe, too, and just as useful as the file itself: they can be restored anywhere. If someone forgets the password, the contents are lost, but that's true for the file itself as well. The backup is just what you expect it to be, a copy. You restore it, and get a _working_ copy of the file, on every filesystem.
The .pem key can even be sent by email as-is (since it's base64 encoded). The daemon should ask for the password only once; we agree on that. Storing the key encrypted (in the openssl sense) doesn't help much against root if he's able to trick the operator into typing the password again. If you're able to avoid that, that is, you're in a highly secure environment with operators trained _not_ to type the password in just to have the server restarted, .pem encryption adds a lot to your security. The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the 'postgres' user and 'administrator' have _transparent_ access to the file contents. No password required.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
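[For completeness: protecting a key "in the openssl sense" as discussed above is a one-liner; openssl prompts for the passphrase and writes out a DES3-encrypted PEM file. File names are placeholders:

openssl rsa -in server.key -des3 -out server.key.protected

The server then needs the passphrase whenever it reads the key, which, per this thread, the 8.0.3-era PostgreSQL unfortunately does at every connection rather than only at startup.]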
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 13:54 +0200, Magnus Hagander wrote: "'The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the postgres user and administrator have _transparent_ access to the file contents. No password required.' While most of what you wrote is definitely correct, you missed a few things about EFS."

[...stuff on EFS deleted...]

I agree on that. I'm not saying EFS is a bad idea; I'm only considering the server key case. Anyway, protecting against a malicious superuser is the subject of a chapter of its own. There are many ways for a superuser to hijack a user account. I'm not a Windows guy, but with a similar setup under Unix, I'd do:

su - user -c 'cp encrypted_file /tmp/unencrypted'

or anything equivalent. That is, at C level, open(2) the file, read(2) it (the system gives me the plaintext contents), write(2) it to another file. The OP said he starts PostgreSQL with 'runas', which I gather is the Windows relative of 'su' (or 'cron'?). If Administrator can run programs as the user, he can read the key file.

"That's the big difference. So it does offer a bit of extra security. Just to protect the key used to set up the SSL sessions, I'm not sure it's worth it. Because again, if they hack your admin account, they can get to your files *without* going through getting into the SSL stream."

I think you're missing two points:

1) The purpose of getting the key is _not_ to be able to decrypt the SSL stream (even if that's a nice consequence). The worst you can do is _sign_ things with the key, that is, impersonate the server, mount man-in-the-middle attacks, and so on, without anyone noticing it for a while. The whole point of encrypting the key is that a compromised key is worse than a compromised system (and way more likely to go unnoticed).

2) There's no need for the intruder to escalate to administrator powers. If they break the 'postgres' account, they read the key. Even if they break the server at SQL level, i.e. they gain superuser for the database, they may be able to read it with a COPY SQL command, since it's likely the key is accessible to the server (I'm not claiming the latter is feasible; just that, in theory, it's all they need).

The problem reported by the OP is a real one, since it prevents the use of an encrypted key. Not a big one, but still one. The solution the OP posted later does not solve the problem at all. BTW, even when the key is encrypted (the .pem way), the cleartext copy must stay in the server's memory space (possibly in a locked area). It may be available to root (debugging the server), to the user, or to the process itself, of course, if they manage to execute arbitrary code. There's no way to make it 100% safe.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 05:21 -0700, Changyu Dong wrote: --- Marco Colombo [EMAIL PROTECTED] wrote: "As long as the 'postgres' user has access to it w/o typing any password, that's only a detail. Unless someone physically steals your disk, the fact that it's stored encrypted is irrelevant. The only thing that matters is who can access it, and how. That's how the permission bits work in Unix. No need to encrypt the file; we know permission bits actually work as expected under Unix. In this case encryption adds no extra level of security on a running system. I fail to see the difference. On Windows, the 'postgres' user can read it without a password; 'Administrator' has access to it, too. On Unix, with 400 permissions, the 'postgres' user can read it without a password; 'root' has access to it, too."

"Then how about restoring it from a backup to another system? In this way the permission is bypassed but EFS still works."

Either the Windows backup contains the private key of the user or it doesn't. If not, the backup is incomplete and useless (for getting the file contents). You may get other files from it, but that's not the point: you may just not include the key file in _that_ backup. If you have two backups, one normal and another safe, just put the key file on the safe one, along with the other private keys. You can do the same under Unix, of course. If your single backup contains the user's private key, EFS is bypassed as well. This is going off-topic: the EFS approach is no different from any encrypted filesystem, nothing new under the sun. It shares the weakness of any system that lets you access the data at runtime w/o a password.

"'Then the backup is useless. If the secret key of the user postgres is lost (and it can be, since it is stored elsewhere, I think buried somewhere where Administrator can find it, maybe in the user profile), you'll never recover the content of the file.' Right, but the user's private key can be exported into a password-protected pem file."

Then save the server key in the same way. Put the server key and the user key together.

[...] "If an intruder can break the postgres or root account, he can read everything, as has been discussed, not only the key but also the data files. So in this situation, it's useless to protect the key only."

Yes, it has been discussed: the purpose of the key is not protecting the data, but protecting your identity. If the key is compromised, they can impersonate you. Generally, that is a much bigger damage. They can create fake data, _signed by you_.

[...] Yes, the .pem file can be kept for distribution and backup, but the working copy has to be plain.

"'The daemon should ask for the password only once, we agree on that.' Yes, that's the ultimate solution. So we can use an encrypted key without any outside mechanism."

We agree on that. That's the _only_ solution if you want that kind of security.

"'Storing the key encrypted (in the openssl sense) doesn't help much against root, if he's able to trick the operator into typing the password again. If you're able to avoid it, that is, you're in a highly secure environment with operators trained _not_ to type the password in just to have the server restarted, .pem encryption adds a lot to your security.' I'm not sure, but Windows is beginning to support smart card logon, therefore no password will be needed or stored."

That changes nothing. Somehow the key has to be given, unencrypted, to the server. Be it an operator typing a password or inserting a smartcard, a patched server can store the key in cleartext anywhere.
You have to teach your operators to think twice before performing anything that lets the server access the key. With your solution, you're letting the server access the key automatically.

"'The EFS encryption as you described it adds nothing but a false sense of security (and the ability to use some more buzzwords). The level of protection is just the same as a Unix file with the right permissions. The key point here is that both the postgres user and administrator have _transparent_ access to the file contents. No password required.' At least it makes it impossible to restore the plain key from backup."

The safety of that backup lies only in its incompleteness. If you include the user key in the same backup, there's no security. If you don't include the user key (and thus create an incomplete backup), it's easier not to include the server key either, and put it in the same place you put the user key. They are both private keys. Including a useless copy of the server key encrypted with the user key (stored elsewhere) is just a perverse way to gain nothing. But I agree that sometimes perverse systems make perverse things look natural. :-)

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l.
Re: [GENERAL] vulnerability/SSL
On Thu, 2005-06-09 at 15:04 +0200, Magnus Hagander wrote: [...] "Yes, that is correct - runas is similar to su. But in order to do runas, you need the service account's password. Once you are root on a unix system, you can do su - user *without* the password. That's a big difference. (You can also use the postgres account's smartcard, if you are using smartcard logins, but the deal is that you need *something* that is normally private to the account - even if you are an administrator)"

Is that at application level or at system level? You know, I can install a patched su that asks root for passwords as well, but the problem is with the seteuid() system call, not with su. You can (with SELinux) limit root powers a lot, but that's not the point. [...]

"I guess we could read in the password ourselves and drop it in our shared memory segment to pass to subprocesses - though that means they can get to the password easier as well. Assuming OpenSSL has the APIs for that; I haven't checked that. I'm unconvinced it makes enough of a difference to be worthwhile, though. (BTW, am I correct in reading this as a problem that only appears on win32, because of the exec nature of the backend, right? Or does it show up on Unix as well?)"

Is the Unix version much different? I think the postmaster just forks and execs the backends. But aren't connections handled by the postmaster? All the SSL negotiation should happen before the fork, I think. Is the Windows model different? Do backends handle the SSL negotiation?

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] vulnerability/SSL
On Wed, 2005-06-08 at 16:08 +0200, Magnus Hagander wrote: "'Hi, I'm using PostgreSQL with SSL these days. The version I'm using is 8.0.3. I found that it's impossible to use an encrypted key file. When you use a protected server.key file, you will be prompted to input your passphrase EVERY TIME IT'S USED, not only when you start the server but also when a client makes a connection. So you have to leave the key file unprotected. I think it's a serious vulnerability since the security relies on the secrecy of the private key. Without encryption, the only thing we can use to protect the private key is the access control mechanism provided by the OS. Any comments on this issue?' If you don't trust the access control provided by the OS, why are you putting sensitive data on it? If one can break your access control in the OS they can read all your data anyway - they don't even need to sniff the wire and decrypt it using the key. Or they can just change the passwords of your users and connect - or *change* the key."

Yes and no. They can't change the key: it's tied to the certificate, which is signed. They'd need to get a signed certificate from a trusted CA, and put the associated private key on your server after they cracked it. Which is much like leaving a big banner reading "Yes, it was me!", signed by you, at the crime scene. :-)

But overall I agree. If they gained enough privilege to read the key file, it's possible they're able to access the data as well. They might be able to patch the server and have the password that protects the key logged somewhere the next time you type it in. OTOH, I see no advantage in reading the key at connection time instead of at startup time (like every other daemon does).

An encrypted key has an interesting significance with backups. Someone may be able to steal one of your backups. They'll get old data (maybe you don't care much about that), _and_ the key. You don't want them to be able to sign stuff or impersonate your servers with it.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] PostgreSQL vs. InnoDB performance
On Fri, 2005-06-03 at 11:38 +0200, Peter Eisentraut wrote: "[On Friday, 3 June 2005 at 00:36, Peter Eisentraut wrote:] 'On a particular system, loading 1 million rows (100 bytes, nothing fancy) into PostgreSQL one transaction at a time takes about 90 minutes. Doing the same in MySQL/InnoDB takes about 3 minutes. InnoDB is supposed to have a similar level of functionality as far as the storage manager is concerned, so I'm puzzled about how this can be. Does anyone know whether InnoDB is taking some kind of questionable shortcuts it doesn't tell me about?' So here's another little gem about our friends from Uppsala: if you create a table with InnoDB storage and your server does not have InnoDB configured, it falls back to MyISAM without telling you."

Silently falling back to something unexpected seems to be quite common there. For sure it's not the only case. :-|

"As it turns out, the test done with PostgreSQL vs. real InnoDB results in just about identical timings (90 min). The test done using PostgreSQL with fsync off vs. MyISAM also results in about identical timings (3 min). The hardware seems to be the bottleneck."

Try improving the performance of your disk systems. It's very unlikely to get _exactly_ the same figures from two such different RDBMSs. You'd expect them to be close, but not identical. BTW, make sure the test correctly emulates multiple clients (say 25, 50 or 100). There's little point in stressing the transaction support of an RDBMS when there's only one single actor in the system, and therefore no contention. Transaction code always takes the fast path that way, and you're testing the less important part of it. Check out some performance tuning pages; you may need to adjust some OS and PostgreSQL configuration parameters to allow and effectively handle 100+ connections (shared buffers come to mind). I believe the same is true for MySQL.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] PostgreSQL vs. InnoDB performance
On Fri, 2005-06-03 at 08:43 -0400, Christopher Browne wrote: "After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] (Marco Colombo) belched out: 'The hardware seems to be the bottleneck. Try improving the performance of your disk systems. It's very unlikely to get _exactly_ the same figures from two such different RDBMSs. You'd expect them to be close, but not identical.' If the bottleneck is in the identical place, and they are otherwise well-tuned, it is actually *not* that surprising that the timings for PostgreSQL vs real InnoDB would be pretty close. If both are being bottlenecked by the same notion of 'how fast does the disk spin', then the differences in performance won't be dramatic."

That's my point. If the purpose of the test is to compare PostgreSQL vs. real InnoDB, there should not be any bottleneck other than the software itself.

"'BTW, make sure the test correctly emulates multiple clients (say 25, 50 or 100). There's little point in stressing the transaction support of an RDBMS when there's only one single actor in the system, and therefore no contention. Transaction code always takes the fast path that way, and you're testing the less important part of it.' Actually, if you can demonstrate near-identical performance under a common set of conditions, that's a really useful datum to start with. It would then certainly be interesting to see how the behaviour changes as various stresses are introduced..."

I take it the purpose of the test is also to measure performance under transactional load. Otherwise, inserting 1,000,000 rows one transaction at a time is just silly. I was able to do 12,000 rows/s with COPY on very cheap hardware (that's 1,000,000 rows in about 90 seconds, not minutes). I think that if you benchmark how things perform in doing silly things, you should expect silly results... So, if you want transactions, make them _real_ transactions. One client sequentially issuing transactions means nothing in this context. Transactions are all about contention. You need a bunch of concurrent client processes, at least, and possibly an N-way system on the server (to measure real contention at OS level, too). Otherwise, you'd better measure many inserts per transaction (and COPY in PostgreSQL) on a single client (which does make sense on its own).

.TM. (who has not been to Arrakis recently)
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
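[To make the "one transaction at a time" point concrete, here is a sketch of the two loading strategies, using psycopg, one of the drivers discussed on this list; the DSN, table and row count are made up, and the actual numbers will depend entirely on the disk:

import time
import psycopg  # any DB-API 2.0 driver would do

db = psycopg.connect('dbname=test')  # hypothetical database
curs = db.cursor()
curs.execute("CREATE TABLE load_test (id integer, payload text)")
db.commit()

rows = [(i, 'x' * 100) for i in range(100000)]

# strategy 1: one commit per row, so one synchronous disk write per row
t0 = time.time()
for r in rows:
    curs.execute("INSERT INTO load_test VALUES (%s, %s)", r)
    db.commit()
print "per-row commits:", time.time() - t0, "s"

# strategy 2: one big transaction, a single commit (one fsync) at the end
t0 = time.time()
for r in rows:
    curs.execute("INSERT INTO load_test VALUES (%s, %s)", r)
db.commit()
print "one transaction:", time.time() - t0, "s"

With fsync enabled, strategy 1 is bounded by the disk's rotational latency, which is exactly the 90-minutes-vs-3-minutes effect discussed above; COPY would beat both.]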
Re: [GENERAL] Adventures in Quest for GUI RAD
On Tue, 2005-05-10 at 01:51 -0400, [EMAIL PROTECTED] wrote: "Wolfgang, thanks! I am very persuaded by your arguments regarding Python. What you have written makes me look at Python in a different light. I happened to find a download of Python 2.2 which I installed at work but have not tried out. I wish I could find detailed instructions on WHICH Python to download from WHERE, and what I would need to download to access Postgresql from Python, and then some simple examples of sending a query to postgres and processing the results."

2.2? The latest Python is 2.4.1. The first place to look is obviously http://www.python.org/ - see the Download section. To access PostgreSQL from Python, you have some choices; pros and cons have already been discussed on this list. Here's a partial list of options:

http://www.druid.net/pygresql/ - I've used this, happily, under Linux. It may already be included in the PostgreSQL Windows port, I don't know.

http://initd.org/projects/psycopg1 - I've used this too, happily, again under Linux. My programs were simple enough that I could switch from one driver to the other w/o touching the rest of the application, since I'm using the DB-API 2.0 interface. I have no idea if there's a Win port of this driver.

There are other drivers, but I haven't used them. Have a look at http://www.python.org/pypi?%3Aaction=browse (under the Database section). For DBI, or DB-API, see http://www.python.org/peps/pep-0249.html - it's a reference manual, not a tutorial, and it's not PostgreSQL specific. While we're at it (and completely unrelated and off-topic), have a look at: http://www.pcpm.ucl.ac.be/~gustin/win32_ports/

"I would be perfectly happy to work with it in that funny DOS window; I would not require that it be a Windows GUI app. Just to be able to read and write to Postgresql. Do you think there is such a tutorial/documentation?"

Python for Windows has a nice console window (IDLE, I think). You should be able to find some examples via Google.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
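[Along the lines of the "simple example" the poster asked for, a minimal DB-API 2.0 session looks like this. A sketch only: the connection parameters are placeholders, and pg_user is just a handy system view that always exists to query:

import pgdb  # PyGreSQL's DB-API 2.0 module; psycopg works the same way

# adjust host/database/user for your installation
db = pgdb.connect(host='localhost', database='mydb', user='myuser')
curs = db.cursor()

curs.execute("SELECT usename, usesysid FROM pg_user")
for row in curs.fetchall():   # each row is a sequence of column values
    print row[0], row[1]

db.close()
]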
Re: [GENERAL] Python DB-API 2.0 oddity (was: I receieved an example of
On Mon, 2005-05-02 at 11:56 +0200, Karsten Hilbert wrote: "A note on what I think is a strangeness in the Python DB-API 2.0. Please correct me if I am wrong. (Note that I am not trying to throw off the OP but simply use his example to point out an oddity about that API. The point is to make sure it *is* an oddity so I can raise it with the appropriate forum, e.g. the Python community.) Observe the following notes by someone learning the DB-API:

# Get a cursor. We do this by connecting to the database: the
# (button,) arguments just connect to the database that the form is
# running in.
cursor = RekallPYDBI.connect (button, '').cursor()

So far so good... But - getting a cursor without knowing the SELECT query whose results it stands for?"

AFAIK cursors are not limited to SELECTs.

"# Execute a query. This only gets people whose ages are 21 or above.
# This is the important bit, we are accessing the database directly.
cursor.execute ('select surname from users where age >= ?', [21])

Ah, the query is set *after* getting a cursor for it - seems odd, but hey, as long as it's set before retrieving rows ... 'The key is getting the cursor. Once you have a cursor you can do inserts, updates and deletes, like ...' Huh? Pardon me? Doing inserts, updates and deletes via a cursor? The PostgreSQL documentation clearly says that the query part of a cursor definition must be a SELECT: http://www.postgresql.org/docs/7.4/static/sql-declare.html"

But what makes you think that the Python DBI was designed to be PostgreSQL specific? From http://www.python.org/peps/pep-0249.html: ".cursor() - Return a new Cursor Object using the connection. If the database does not provide a direct cursor concept, the module will have to emulate cursors using other means to the extent needed by this specification." It's up to the module implementation to use real SQL cursors when possible. AFAIK, it's not done automagically for PostgreSQL. In practice, DBI cursor objects and SQL cursors have little in common in the PostgreSQL drivers I'm aware of (PyGreSQL and psycopg). A DBI cursor is just a handle for executing SQL commands.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
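[To illustrate that last sentence: all of the following are perfectly legal through a DB-API cursor, with no SELECT and no SQL-level DECLARE CURSOR involved. A sketch with a made-up table:

import pgdb

db = pgdb.connect()
curs = db.cursor()   # just a statement handle, no query attached yet

curs.execute("CREATE TEMP TABLE users (surname text, age integer)")
curs.execute("INSERT INTO users VALUES (%(n)s, %(a)s)", {'n': 'Smith', 'a': 30})
curs.execute("UPDATE users SET age = age + 1 WHERE surname = %(n)s", {'n': 'Smith'})
curs.execute("DELETE FROM users WHERE age > %(a)s", {'a': 100})
db.commit()
]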
Re: [GENERAL] Python DB-API 2.0 oddity (was: I receieved an example of
On Mon, 2005-05-02 at 11:56 +0200, Karsten Hilbert wrote: "'The key is getting the cursor. Once you have a cursor you can do inserts, updates and deletes, like ...' Huh? Pardon me? Doing inserts, updates and deletes via a cursor? The PostgreSQL documentation clearly says that the query part of a cursor definition must be a SELECT: http://www.postgresql.org/docs/7.4/static/sql-declare.html (I am well aware that SELECT queries may have side effects that change data in the backend, such as in 'select add_new_customer()' etc.)"

BTW, look at this page (with the Oracle driver): http://www.zope.org/Members/matt/dco2/dco2doc

cursor.execute("INSERT INTO TEST (name, id) VALUES (:name, :id)", name="Matt Kromer", id=1)

I believe there are databases that allow you to send SQL statements (of any kind, not only SELECTs) only within a cursor (either implicit or explicit), hence the name of the cursor object.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Persistent Connections in Webserver Environment
On Mon, 2005-05-02 at 14:45 +0200, Hannes Dorbath wrote: "Hi, as the subject says I need some advice on setting up connection handling to PG in a webserver environment. It's a typical dual Xeon FreeBSD box running Apache2 with mod_php5 and PG 8. About 20 different applications (ecommerce systems) will be running on this box. Each app resides in its own schema inside a single database. As far as I understand, persistent connections from Apache processes can only be reused if the authentication information of the already existing connection is the same. So in case an Apache process holds a persistent connection to database 'test', auth'ed with username 'user1', and another app wants to connect as 'user2', the connection can't be reused and a new one will be spawned. So what we are doing atm is telling all apps to use the user 'apache', granting access for this user to all schemas, and firing 'SET search_path TO app_schema;' at the startup of each app / script. It works, but I really would like to have a dedicated user for each app / schema for security reasons. The next better idea I came up with was to fire 'SET SESSION AUTHORIZATION user;' at each app / script startup, but for this to work I would need to initially connect as superuser - and I really dislike the idea of having a webserver connecting as superuser :/ Any ideas? I can't be the first person on earth with that problem ;/"

Have you measured the real gain of using persistent connections at all? In my experience, it's just a CPU vs. RAM tradeoff. Before you go through the pain of setting up a weird authentication mechanism, try and consider whether you really need persistent connections. Search the lists; it has been discussed in the past. I remember this thread: http://archives.postgresql.org/pgsql-php/2005-02/msg9.php There may be others, too.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
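[For what it's worth, the "dedicated user per app" setup the poster would prefer is plain SQL. A hypothetical sketch for two apps, to be run once as a database superuser; role names and the password are made up, and a real setup would also need GRANTs on the tables inside each schema:

import pgdb

db = pgdb.connect(database='test')   # connect as a superuser for the setup
curs = db.cursor()
for app in ('shop1', 'shop2'):
    curs.execute("CREATE USER %s PASSWORD 'secret'" % app)
    curs.execute("GRANT USAGE ON SCHEMA %s TO %s" % (app, app))
    # pin the schema at login time, so the app no longer needs
    # a SET search_path at every script startup
    curs.execute("ALTER USER %s SET search_path TO %s" % (app, app))
db.commit()

The tradeoff, as discussed in this thread, is that with pg_pconnect each distinct user multiplies the pool of idle backends.]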
Re: [GENERAL] Persistent Connections in Webserver Environment
On Mon, 2005-05-02 at 17:32 +0200, Hannes Dorbath wrote: "On 02.05.2005 16:41, Marco Colombo wrote: 'Have you measured the real gain of using persistent connections at all?' As simple as possible:

<?php
require_once('Benchmark/Timer.php');
$timer = new Benchmark_Timer();
$timer->start();
pg_pconnect('host=myhost dbname=database user=user');
pg_query("SET search_path TO myschema;");
$q = "SELECT u.login FROM users WHERE u.user_id = 1;";
$qr = pg_query($q);
print_r(pg_fetch_all($qr));
$timer->setMarker('Database');
$timer->stop();
$timer->display();
?>

Results: pconnect: 0.001435995101928, connect: 0.016793966293335. It's a factor of 10 on such simple things on the BSD box."

OK, but the difference is going to go unnoticed; that's not the point at all. The question was: have you measured any difference in the server load? I did in the past, and wasn't really able to measure any, with more than 300 http processes active.

"The web server load is _way_ lower than the db server's. Currently we're at about 100 processes (but with pconnect) and: (web) load average: 0.31, 0.27, 0.21 (db) load average: 0.24, 0.21, 0.18, and I know that switching to simple connect won't change much, as page load time is dominated by the time spent in the queries (and the overhead of 1/100 or 1/1000 of a second in the startup time goes completely unnoticed)."

With any modern operating system, the overhead is very low (15ms is very good, actually). In my experience, pconnect may cause RAM problems: the number of processes is uselessly high. You have to make provisions for a large number of backends, and that means little RAM for each single backend. My advice is: use pconnect only when you have CPU problems, unless your case is a very degenerate one (your db host being on the other side of the globe). And, in my experience again, the first causes of CPU problems on the database server are:

- wrong/missing vacuum/analyze (or similar);
- bad coding on the application side (placing silly load on the server);
- bad queries (misuse/lack of indexes);
- bad tuning of PostgreSQL (especially RAM);
...
...
- connect overhead.

I've never managed to reach the last item in the list in real-world cases. I think it is by far the least important item. #1 Golden Rule of optimizing: - Don't. (Especially when it causes _real_ troubles elsewhere.) Have a nice day,

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] PRIMARY KEY on a *group* of columns imply that each column is
On Wed, 2005-04-27 at 17:00 +0200, Stephane Bortzmeyer wrote: "On Wed, Apr 27, 2005 at 09:36:57AM -0500, Scott Marlowe [EMAIL PROTECTED] wrote a message of 18 lines which said: 'Often the best bet here, btw, is to declare it not null then use something other than null to represent null, like the text characters NA or something.' Yes, but it defeats the purpose of NULL. And what should I use as a pseudo-NULL value for INET? 127.0.0.1? 0.0.0.0? Special values are well-known for the problems they raise. That's why many languages have NULL-like solutions (None in Python, undef in Perl, Maybe types in Haskell, etc)."

No. NULL is NOT 'None', nor 'undef', and definitely not NULL as in C. Those are perfectly defined values, although special ones. Only 'undef' is quite misleading, but nevertheless it is just _one_ value. That is, given a variable A, you can always write a boolean expression that evaluates to True or False to test whether A is _equal_ to None/undef/NULL (in C):

$ python -c "a = None; print a == None"
True
$ perl -e 'print "a" == undef, "\n"'
1
$ cat p.c
#include <stdio.h>
int main(int argc, char *argv[])
{
    char *a = NULL;
    printf("%d\n", a == NULL);
}
$ cc p.c
$ ./a.out
1

About Haskell, I don't know. From what I understand from a quick read of the manual, I'd say Nothing is similar: you _can_ tell whether something is equal to Nothing.

In databases, NULL has a completely different meaning. It doesn't mean _no value_ (which would be just a special, single value), it means _unknown value_. You simply can't compare it with any single value (even special ones) and expect a boolean answer. The only possible answer is 'I don't know', which is NULL in boolean. Notice that boolean NULL is _not_ the same as False. So you get:

marco=# select 2 = 2;
 ?column?
----------
 t
(1 row)

marco=# select 2 = 3;
 ?column?
----------
 f
(1 row)

marco=# select 2 = NULL;
 ?column?
----------

(1 row)

that is, neither true nor false.

Back to your example: you can compare ('a', 2) with ('a', 2); the result is 't' and thus you've managed to identify the right row (it works as a primary key). Also, ('a', '3') is different from ('a', '2'), so you can tell the two rows are different. But what if you allow ('a', NULL)? ('a', NULL) is neither the same as _nor different from_ ('a', 2). The result of the comparison is NULL, no matter how you test it:

marco=# select ('a', 2) = ('a', NULL);
 ?column?
----------

(1 row)

marco=# select ('a', 2) <> ('a', NULL);
 ?column?
----------

(1 row)

see? _Neither_ one is true. This would completely defeat the purpose of the primary key. And of course, comparing ('a', NULL) with ('a', NULL) gives exactly the same result:

marco=# select ('a', NULL) = ('a', NULL);
 ?column?
----------

(1 row)

marco=# select ('a', NULL) <> ('a', NULL);
 ?column?
----------

(1 row)

That's why NULLs are not allowed in primary keys. The key simply won't work. NULL in databases is not _one_ special value. It's _any_ value, since it's unknown. The boolean expression 2 = NULL might be true or might be false, since NULL could be _any_ integer in that expression. This is completely different from the semantics of None/undef/NULL in most programming languages. You wrote: "Special values are well-known for the problems they raise." Then NULL is definitely _not_ the value you're looking for. Everything can be said of NULL, except that it is well-known. In your case, by choosing (name, address) as the primary key, you're saying 'I need to know both the name and the address to be able to retrieve a datum from the table'.
This implies that if you have partial knowledge (you don't know the address), you can't naturally retrieve a single datum (or insert one). Depending on what you're trying to achieve, you may need to split the table ('normalization', the theorists call it). I don't like theory much, but its conclusions sometimes just make a lot of sense. :-) Review your design: maybe either the table schema or the choice of the primary key is not natural for your database.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] optimal hardware for postgres?
On Tue, 2005-04-26 at 01:32 -0700, William Yu wrote: "Linux 2.6 does have NUMA support. But whether it's actually a win for Postgres is debatable due to the architecture. First let's take a look at how NUMA makes things run faster in a 2x Opteron system. The idea is that processes running on CPU0 can access memory attached to that CPU a lot faster than memory attached to CPU1. So in a NUMA-aware system, a process running on CPU0 can request all its memory be located in memory bank0. [...]"

This is only part of the truth. You should compare it with real SMP solutions. The idea is that CPU0 can access directly attached memory faster than it would on an SMP system, given equivalent-or-so technology, of course. So NUMA has a fast path and a slow path, while SMP has only one, uniform, medium path. The whole point is where the SMP path lies. If it's close to the fast (local) path in NUMA, then NUMA won't pay off (performance-wise) unless the application is NUMA-aware _and_ NUMA-friendly (which depends on how the application is written, assuming the underlying problem _can_ be solved in a NUMA-friendly way). If the SMP path is close to the slow (remote) path in NUMA (for example, because they have to keep the caches coherent, and obey memory barriers and locks), then NUMA has little to lose with NUMA-unaware or NUMA-unfriendly applications (the worst case), and a lot to gain when some NUMA-aware optimizations kick in. I've read some articles that refer to the name SUMO (sufficiently uniform memory organization), which AMD reportedly uses to describe their NUMA; it seems to imply that their worst case is sufficiently close to the usual SMP timing.

There are other interesting issues in SMP scaling, on the software side. Scaling with N > 8 might force partitioning at the software level anyway, in order to reduce the number of locks, both as software objects (reducing software complexity) and as hardware events (reducing time spent in useless synchronizations). See: http://www.bitmover.com/llnl/smp.pdf This also affects ccNUMA, of course; I'm not saying NUMA avoids this in any way. But it's a price _both_ have to pay, moving their numbers towards the worst case anyway (which makes the worst case not so bad).

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] Postgres source (tar file) for Fedora Core OS?
On Tue, 2005-04-26 at 09:43 +0530, Dinesh Pandey wrote: "From where can I download the latest Postgres source (tar file) for Fedora Core OS? Regards Dinesh Pandey"

This is a question for the Fedora mailing lists. Anyway, FC distributes sources in RPM format. Assuming you're referring to the latest update, you can find the sources at: http://download.fedora.redhat.com/pub/fedora/linux/core/updates/3/SRPMS/ The current latest PostgreSQL source RPM is postgresql-7.4.7-3.FC3.1.src.rpm, which contains the original source tarball plus patches. To view the RPM contents, type:

rpm2cpio postgresql-7.4.7-3.FC3.1.src.rpm | cpio -itv

See the rpmbuild(8) manual page for details on how to build binary RPMs from the source one.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
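[In practice, rebuilding binary packages from that source RPM boils down to one command, assuming the build dependencies are installed; on an FC3-era system the resulting RPMs land under /usr/src/redhat/RPMS:

rpmbuild --rebuild postgresql-7.4.7-3.FC3.1.src.rpm
]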
Re: [GENERAL] Filesystem options for storing pg_data
[I've got a private reply from Scott, which I won't quote here, and which can be fairly (I hope) summarized as "search the pgsql-performance list". Well, I've done it, and I feel it's due to bring the issue back in public. So if it seems I'm replying to myself, I'm not: I'm replying to Scott. I realized the reply was private only just before sending this out.]

On Wed, 2005-04-20 at 12:07, Marco Colombo wrote: On Wed, 2005-04-20 at 11:18 -0500, Scott Marlowe wrote: "Generally XFS and JFS are considered superior to ext2/3." Do you mind posting a reference? I'm really interested in the comparison, but every time I've asked for a pointer, I've got no valid resource, so far. [...]

Well, my point being that the ones I find lead to the conclusion that ext3 is considered superior to XFS and JFS. One for all: http://www.oracle.com/technology/oramag/webcolumns/2002/techarticles/scalzo_linux02.html "It's reassuring when various industry-standard benchmarks yield similar results. In case you're wondering, I obtained similar results with Benchmark Factory's other half dozen or so database benchmarks - so for me, it'll be ext3." Have a look at the graphs: ext3 is almost twice as fast in these (database) benchmarks. Another one is: http://www.kerneltraffic.org/kernel-traffic/kt20020401_160.html#8 Again ext3 is the winner (among journalled filesystems), but by a small edge only. And again, there are a lot of variables. Using, for example, data=journal with a big journal file on a different disk would be extremely interesting, just as using a different disk for WALs is at the PostgreSQL level (the result might be the same).

All the other benchmarks I've found, with a simple search for 'filesystem benchmark' on the pgsql-performance list, either are the usual Bonnie/iozone irrelevant benchmarks, or don't seem to care about tuning ext3 mount options and use the defaults (thus comparing apples to oranges). I'm not stating that ext3 is better. My opinion on the matter is that you shouldn't care about the filesystem much (ext3, JFS and XFS being the same for _most_ purposes with PostgreSQL). That is, it's a small little spot in the big picture of performance tuning. You'd better look at the big picture. I'm only countering your claim: "Generally XFS and JFS are considered superior to ext2/3." There's no general agreement on the lists about that, so just handwaving and saying "look at the lists" isn't enough. Mind posting a pointer to _any_ serious PostgreSQL-based (or any database, at least) benchmark that consistently shows XFS and JFS as superior? One that cares to show ext3 noatime, data=ordered, data=writeback and data=journal results, too? If I were to choose based on the results posted on the list (that I've managed to find), ext3 would be the winner. Maybe I've missed something.

"Having used ext3 quite a bit, I'd say it's fairly stable and reliable, but I have seen references here to known, possibly unfixable bugs."

Again, mind posting a reference? [...] I've searched for 'ext3 bug' but got nothing. I'm (loosely) following l-k, and have never heard of possibly unfixable bugs in ext3 from any developer. Care to post any real reference? There have been bugs, of course, but that holds true for everything, XFS and JFS included. Having re-read many, many messages just now, I'm under an even stronger impression that _all_ negative comments on both the stability and the performance of ext3 start with "I've heard that...", with almost no one providing direct experience.
Many comments display little understanding of the subject: some don't know about the data= mount option (there's little point in comparing to XFS if you don't use data=writeback), some have misconceptions about what the option does, and about what difference it makes when the application keeps _syncing_ the files (I don't know the details well either). See the data=journal case.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
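[For reference, the mount options in question go in /etc/fstab; a hypothetical line for a dedicated PostgreSQL data partition (device and mount point are made up):

/dev/sdb1  /var/lib/pgsql  ext3  noatime,data=writeback  1 2

data= can be writeback, ordered (the default) or journal; a fair ext3-vs-XFS comparison should state which one was used.]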
Re: [GENERAL] Filesystem options for storing pg_data
"generate a lot of unnecessary disk activity. XFS tries to cache as much data in memory as possible, and generally only writes things out to disk when memory pressure dictates that it do so."

So, if a benchmark shows XFS is faster, it's a matter of better caching, right? But it comes at the price of possible (data) corruption... Thankfully, that's pretty useless to us, with every write followed by a sync.

I'm sorry, but with the links _you_ selected, applying my filter conditions 1) and 2), which are necessary for a fair comparison, one could say there's general consensus on ext3 being far superior to other filesystems, not the opposite. Note that I'm not interested in supporting such a claim. As I already wrote, I think FS selection generally has a minimal impact on PostgreSQL performance. But again, what was your original claim, "Generally XFS and JFS are considered superior to ext2/3", based upon?

I apologize if I sound somewhat harsh; it's not really intended. But next time, please assume that: - I'm able to do a 10-minute search; - I've got some work to do, too, but I'm willing to spend more than 10 minutes on this research (it already took me more than 2 hours, actually, of my spare time); - if I say I've searched the lists and read many messages, I've really done so.

You're absolutely entitled to your opinion. If you like XFS and JFS, go ahead and use them: because of their names, the names of their sponsors (IBM and SGI), their features, your personal experience, or whatever. Just please don't claim that's the general consensus of the pgsql lists. There's _no_ general consensus. There's _no_ clear winner. And if you do want a winner anyway, it's ext3, so far. This "ext3 is not as good as XFS and JFS" is a recurring subject, along with "ext3 is buggy". _Every single time_ I've asked for references to back up such claims, nothing valuable was presented. On the contrary, the only references I've found are on the "ext3 is equal or better" side. Now, feel free to prove me wrong.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] PostgreSQL as a filesystem
On Mon, 2005-04-18 at 17:18 -0400, Tom Lane wrote: "Christopher Nelson [EMAIL PROTECTED] writes: 'I'm developing a hobby OS and I'm looking into file systems. I've thought about writing my own, and that appeals, but I'm also very interested in the database-as-a-filesystem paradigm. It would be nice to not have to write all of the stuff that goes into the DBMS (e.g. parsers, query schedulers, etc) myself. So I was wondering what sort of filesystem requirements Postgres has.' There are DBs you could use for this, but Postgres (not Postgre, please, there is no such animal) isn't one of them :-(. We really assume we are sitting on top of a full-spec file system --- we want space management for variable-size files, robust storage of directory information, etc."

I've been thinking of it, too. I think no filesystem out there is really optimized for a steady write load with many fsyncs, that is, really transaction-oriented on the data side (journalled ones may implement real transactions, but only for metadata). Out of curiosity, do you have any feedback from filesystem people? Are they interested in optimizing for the kind of workload (especially on writes) a database generates? I ask because it seems to me it's a corner case to them, or even a degenerate one. I'm not aware of _any_ comparative benchmark among different filesystems that is based on write+fsync load, for one.

Using a DB as a filesystem at OS level is a different matter, of course. Christopher, you may want to have a look at FUSE: http://fuse.sourceforge.net/ It may help both in developing a new filesystem and in understanding how one works under Linux (with a nice separation of userspace and kernelspace). I think you could even write one based on PostgreSQL, but it won't help much, since PostgreSQL needs a filesystem to work. But if your OS has TCP/IP, it could be interesting anyway. Note that I'm not aware of any way to access PostgreSQL other than sockets, so you need those at least. There's no standalone library you can link to in order to access the database files, AFAIK.

.TM.
-- Marco Colombo, Technical Manager, ESI s.r.l. [EMAIL PROTECTED]
---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] [HACKERS] plPHP in core?
[Cc: list purged a bit] On Sun, 3 Apr 2005, Jim C. Nasby wrote: On Sun, Apr 03, 2005 at 08:41:15PM -0700, Joshua D. Drake wrote: None on the server side (except PostgreSQL) which makes the argument all that more powerful :) So what you're saying is that no database sounds complete because no database includes PHP as a procedural language. Sorry, but I don't buy it. I do. Actually I think no database is complete because no one includes LISP as a procedural language (pun on procedural intended). (BTW, I have no idea if a pl/LISP module ever existed.) .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Sim Zacks wrote: The only ?issue? that I have found with it is similar to an issue I posted about multiline in general, which does not seem to be considered a bug. I've posted similar concerns in the past. The whole point is that there are two possible approaches: 1) treat text as binary - as we do now; 2) do on-the-wire conversion - like FTP ASCII mode. Both have disadvantages, and both lead to unexpected results. As I wrote before, 2) is more problematic. You'll have to reject any file with a bare \n from a Windows client, otherwise you won't be able to process it correctly. I think if you do:

insert into test (sometext) values ('Line one\nLine two\r\n'); -- with the literal chars, not the escape sequences

you're expecting exactly the same on output. If the server converts it to the Unix form: 'Line one\nLine two\n' for storing and then converts back to the Windows form, when you do:

select sometext from test; -- from a Windows client

you get: Line one\r\nLine two\r\n which is not the same you entered. I doubt FTP ASCII mode handles this correctly. As for the examples you made (the python functions), it's a problem with python string literals (just don't use them). Let's try this:

-- CUT HERE ---8<---
#!/usr/bin/env python

import pgdb

db = pgdb.connect()
curs = db.cursor()

# this is only to emulate PGAdmin under Windows (I don't have it)
# (I ran the script with these uncommented on Linux)
#q = "create temp table test1(f1 varchar(50))"
#curs.execute(q)
#q = "insert into test1 values('this is a multi line string\r\nline2\r\nline3\r\n')"
#curs.execute(q)

# real test
# embedded in string literal
q1 = """select count(f1) from test1 where f1 = 'this is a multi line string
line2
line3
'"""

# escapes (expanded by python)
q2 = "select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"

# escapes (expanded by PostgreSQL)
q3 = r"select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"
curs.execute(q3)

# stating the obvious
print "Comparisons:"
print "%-10s%-10s%-10s" % ("q1 == q2", "q1 == q3", "q2 == q3")
print "%-10s%-10s%-10s" % (q1 == q2, q1 == q3, q2 == q3)

print "\nRunning tests..."
curs.execute(q1)
print "Test 1 (string literal):", curs.fetchone()[0]
curs.execute(q2)
print "Test 2 (Python escapes):", curs.fetchone()[0]
curs.execute(q3)
print "Test 3 (PG escapes):", curs.fetchone()[0]

# in case someone wonders, let's try using query parameters
astring = """this is a multi line string
line2
line3
"""
q = "select count(f1) from test1 where f1 = %(mystr)s"
curs.execute(q, { "mystr": astring })
print "Test 4 (parameters):", curs.fetchone()[0]
---8<--- CUT HERE --

This is the output (on Linux):

Comparisons:
q1 == q2  q1 == q3  q2 == q3
False     False     False

Running tests...
Test 1 (string literal): 0
Test 2 (Python escapes): 1
Test 3 (PG escapes): 1
Test 4 (parameters): 0

Which is consistent with your examples, that is, it works _only_ with explicit escapes (either at Python level or at PostgreSQL level). If client-side python works this way, why are you expecting server-side python to work differently? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Marco Colombo wrote:

# escapes (expanded by PostgreSQL)
q3 = r"select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n'"
curs.execute(q3)
^^^^^^^^^^^^^^^^

This line (no. 28) is useless (but harmless), please ignore it (just a cut&paste error). .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] multi line text data/query ?bug?
On Wed, 23 Mar 2005, Sim Zacks wrote: While I would agree with you that from a purely technical standpoint, the user inserted into the database a CRLF and a query with just an LF does not exactly match that, from a user's and more practical perspective, that does not make sense at all. That is why I surrounded the word bug in ??. I would say that from a user's perspective it qualifies as a bug because they did not put in specific binary characters. They want a newline. From a database standards perspective, I would argue that any database that allows connections from a client without qualifying a required operating system should be OS neutral. I would say it is a bug from a user's perspective because the exact same query works differently from different clients. Since the user does not choose what binary characters to put in, they are invisible to the user. Anything that is completely invisible to the user should not be considered valid qualifying data. As there is no postgresql database standard, such as "all newlines are unix newlines", it is impossible to write a client that will necessarily return the data that you want. This is the exact problem we are having with Python right now, as a Windows client cannot write a python function to be run on a linux server.

Unfortunately, it's not that simple. There are problems with python when _both_ the client and the server are Windows. Python itself _always_ uses \n, even on Windows. So the only solution is to pythonize the input (convert to \n), no matter what. For the more general problem of handling text, see my comments in this thread: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php There are interesting problems with multiline text, as a datatype. Think of digital signatures and checksums. Think of a simple function, len(text): should it count line separators as characters? In theory, the only way to get cross-platform consistent behaviour is to _ignore_ line separators when counting or checksumming. But the real world solution is to treat textfiles as binary and let the users or the application handle the conversion. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
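(A small sketch of the len(text) problem; the counts are for these particular strings:)

unix_text = "line one\nline two\nline three\n"
win_text = "line one\r\nline two\r\nline three\r\n"

print len(unix_text)    # 29
print len(win_text)     # 32 -- the "same" text, three characters longer

# ignoring line separators gives a platform-neutral count:
print sum(len(l) for l in unix_text.splitlines())   # 26
print sum(len(l) for l in win_text.splitlines())    # 26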
Re: [GENERAL] multi line text data/query ?bug?
On Wed, 23 Mar 2005, Sim Zacks wrote: In any case, there are 2 correct solutions to the problem for the case of postgresql. 1) Database standard - postgresql chooses a newline standard and every client must support that if they support postgresql. Either put the onus on the client developers to override the OS standard and support the postgresql standard, or have the db convert the incoming newlines into the db standard. 2) Server's OS - the server must convert any newlines coming in to the standard that it expects. This is similar to an ftp client that converts newlines on the transfer. That means that data sent to a Windows server with an LF will be converted to a CRLF and vice versa. The data restore function will also have to follow the above procedures to make sure you can take data from one server to the other without compromising integrity. Without one of these solutions, PostGreSQL is not compatible between servers and clients. A query written on any client should return the same result. The query being the visible appearance on the screen. That is what the users would expect to have returned.

Yeah, those were my points. The _open_ problems are:
- what about storing a signed document? It's possible that newline conversion makes the signature invalid. How would you restore the original document? Before you answer, think of: a) a client running on a platform different from the one that inserted the document; b) a document with _mixed_ newline types, such as a Windows text with embedded bare \n or \r. [*]
- what about any other function that may be affected by newline style? I mean, the user may insert a text that he knows is 1000 chars long, and find that PG thinks it's only 980. Is this consistent? What if the user selects for the messages longer than 990? What is the expected answer, from the user standpoint?

There's no easy solution, I think. [*] This is _way_ more common than you'd think. RFC2822, internet message format, says lines are CRLF separated. It happens sometimes that a message contains an NL or a CR alone. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
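(Solution 1), with the db doing the conversion, boils down to something like this sketch -- and it is exactly the lossy step the signed-document objection above is about:)

def normalize_newlines(s):
    # order matters: collapse CRLF first, then stray CRs (old Mac style)
    return s.replace('\r\n', '\n').replace('\r', '\n')

print repr(normalize_newlines('a\r\nb\rc\nd'))   # 'a\nb\nc\nd'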
Re: [GENERAL] New user: Windows, Postgresql, Python
Michael Fuhr wrote: On Tue, Mar 15, 2005 at 10:46:09PM +0000, Paul Moore wrote: The long and short of it is that I believe you just use \n to delimit lines on Windows, just like anywhere else. Many thanks -- your test results contain the info we've been seeking. Thanks a lot Paul. Michael, you were right. It seems the python documentation is plain wrong, or I'm not able to read it at all: http://docs.python.org/ref/physical.html "A physical line ends in whatever the current platform's convention is for terminating lines. On Unix, this is the ASCII LF (linefeed) character. On Windows, it is the ASCII sequence CR LF (return followed by linefeed). On Macintosh, it is the ASCII CR (return) character." This is the language _reference_ manual, btw. I'm very surprised to hear python on windows is so broken. Anyway, that makes life simpler for us. plpython programs are \n separated, no matter what platform the server runs on. Client applications just need to comply, which is what I suggested some time ago. I'm glad to hear there's nothing to change on the server side. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Marco Colombo [EMAIL PROTECTED] writes: Right now I'm parsing the string first, changing the resulting parse tree adding missing nodes (def, INDENT, DEDENT) and then compiling it. Hmmm ... is this really going to be simpler or more robust than lexing the string carefully enough to insert tabs at the right places? The impression I had so far was that you'd only need to understand about Python's string-literal conventions to get that right ... and that's something that's not likely to change. I'm not so sure that parse trees can be regarded as an immutable API. regards, tom lane

I've completed a proof of concept, I think I can answer:
- simpler? Not at all. It requires understanding of how the parser works. The whole thing is about 50 lines long, but quite a bit of parser magic is going on. And I'm far from the point I can be confident about it always doing the right thing. I still have to handle (de)allocations correctly.
- more robust - yes. The only way to make sure we're lexing the string the same way python does is to use its lexer. Every single difference, however subtle, would sooner or later show up as a bug. And it's re-inventing the wheel. But there's no way to work at lexer level that I'm aware of. That is, to add tokens before sending them to the parser. So you have to work on the parser output.
- I have no idea if the node API is immutable at all. For sure, the interface I'm using is one or two levels below the current one, and yes, it's more likely to change. I share your concerns here.

If our problem is only string literals, maybe we can handle them with a dedicated lexer. Python string literals are quite complex (compared to other languages): http://docs.python.org/ref/strings.html but not that hard. Still, my first concern is that one day we find another corner case in python syntax that makes our blind tab adding fail. And we're back to square one. BTW, I'm not preparing a patch for now, I'm working with a test program. As soon as I finish it, either I'll post it or prepare a patch against plpython.c, for consideration. I won't say it is ready for inclusion until someone else more knowledgeable than me on both PostgreSQL and python embedding looks at it, anyway. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
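(At the Python level -- not the C API the thread is about -- the standard tokenize module shows what "using Python's own lexer" buys you; a sketch:)

import token
import tokenize
from StringIO import StringIO

# the triple-quoted literal comes back as one STRING token, so the
# tokenizer, unlike blind tab-adding, knows where logical lines really end
src = 'a = """one\ntwo"""\nb = 1\n'
for tok in tokenize.generate_tokens(StringIO(src).readline):
    print token.tok_name[tok[0]], repr(tok[1])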
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote: OMG! It's indenting the function body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I understand that once you do it you'll have to shift the code by an indentation level. Presumably because it wants to create a function, which can later be called. Since python is sensitive to whitespace it has to indent the code to make it work. Seems like we have to upgrade that thing to have a complete understanding of Python lexical rules --- at least enough to know where the line boundaries are. Which is pretty much exactly the same as knowing which CRs to strip out. So I guess we have a candidate place for a solution. Anyone want to code it up? I don't know enough Python to do it ...

I'm no expert but I'll look into it. Unless someone else already tried it, I want to investigate first if it's possible to create a callable object w/o using 'def', which alters the name space and (the thing we're interested in) needs an extra indentation level. At first sight, what we do now (at function creation time) is: 1) execute a function definition; 2) compile a function call, and save the resulting code object for later use. I'm wondering if we can save one step, and use a python callable object. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
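(A sketch of why the 'def' is hard to avoid: a bare body with 'return' won't even compile, so the handler wraps and indents it. __plpython_fn is a made-up name; the real handler uses the function's own name:)

body = "return 8 / 5"

# stand-alone compilation fails: 'return' is only legal inside a def
try:
    compile(body, "<body>", "exec")
except SyntaxError, e:
    print "stand-alone:", e.msg      # 'return' outside function

# so wrap it: prefix a def and indent every body line by one level
wrapped = "def __plpython_fn():\n" + "".join(
    "\t" + line + "\n" for line in body.splitlines())
ns = {}
exec wrapped in ns
print ns["__plpython_fn"]()          # 1 (integer division in Python 2)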
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Michael Fuhr wrote: On Thu, Mar 17, 2005 at 09:48:51PM -0500, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. Unfortunately, I don't think that proves anything, because according to earlier discussion Python will do newline-munging when it reads a file (including a script file). The question that we have to deal with is what are the rules for a string fed to PyRun_String ... and it seems those rules are not the same. Marco, you've stated that you're against munging the code because it's not our job to 'fix' data coming from the client. But I'm suggesting that we think about the code in a different way than the current implementation does: not as a literal that we pass untouched to the Python interpreter, but rather as code that Python would munge anyway if it had read that code from a file. We could still store the code exactly as received and have the language handler munge it on the fly, as we've discovered it's already doing. Comments? Have I overlooked anything? Could munging CRs have effects that a Python programmer wouldn't expect if the same code had been read from a file? Since it mimics Python's own behavior with code read from a file, can anybody justify not doing it? If you put it that way, I'm 100% with you. Just let me suggest not to mimic its behaviour, but to use the Real Thing if we manage to. That is, directly use the Universal Line Support code provided by python itself, so that we don't even have to think about being compatible. Unfortunately, I'm new to python embedding. I think I'll ask on python lists about the function definition code. Actually, we are kind of a corner case: we ask the user to provide the function body w/o the initial def declaration. We're dealing with partial, incomplete python code, not a well-formed program, so we have to munge it anyway. I have no idea if and how the python C API lets you control such low level details. I think what we really want is to create a callable (anonymous) object from the source of its body. I'm experimenting a bit, trying to write a PLy_procedure_compile() that does not require source munging. I'm aiming at removing the need for extra indentation. The \r\n thing is another beast, and I'm not sure it belongs to the same place in our code. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: On Fri, Mar 18, 2005 at 12:35:07AM -0500, Tom Lane wrote: So that part of it can be solved fairly easily. Should I submit a patch? It should be only a few additional lines in PLy_procedure_munge_source(). Would you apply it only to HEAD, or would it be considered a bug fix that REL8_0_STABLE could get as well? It might be nice to have it in 8.0.2, whenever that comes out. I think it would be reasonable to back-patch a small fix to convert CRLF. The sort of rewrite Marco is considering, I wouldn't back-patch. regards, tom lane

Short update: it isn't possible to get a callable object directly from the source provided by the user, since 'return' is not valid outside function definitions in Python. Right now I'm parsing the string first, changing the resulting parse tree adding the missing nodes (def, INDENT, DEDENT) and then compiling it. The grammar definition for a python function is:

funcdef: [decorators] 'def' NAME parameters ':' suite
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

What we get from the user is stmt+ (a sequence of stmt). The INDENT and DEDENT tokens are generated by the parser only when the indentation level _changes_. My plan is to generate two parse trees, one from this code (with the right fname of course):

def fname():
    pass

and one from the function definition. Then, we attach the root of the second tree where the 'pass' node is in the first tree. We should get a parse tree ready for compilation. I wish I could push the right tokens in the right places, but it seems it's not possible. Stay tuned. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Wed, 16 Mar 2005, Michael Fuhr wrote: [I've changed the Subject back to the thread that started this discussion.] On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote: I'm against any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. Yeah, that's a reasonable argument against modifying the function source code before storing it in pg_proc. But I expect this problem will come up again, and some people might not care about being able to reconstruct the original input if it's just a matter of stripped carriage returns, especially if the function logic doesn't use literal carriage return characters that would be missed. For those people, the validator hack might be an acceptable way to deal with a client interface that inserts carriage returns that the programmer didn't intend anyway. Not necessarily as part of the core PostgreSQL code or even distributed with PostgreSQL, but as something they could install if they wanted to.

Agreed.

I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's an undocumented side effect, and bad practice. Whether it's an undocumented side effect depends on the language, and whether it's bad practice is a matter of opinion. Sure. I mean, we may just state that, per spec. Program data should be \n-delimited, full stop. It sounds sensible to me. Just put it somewhere in the docs, problem solved. We're losing nothing. I'm just proposing to add that to the docs/specs.

In any case, that's the language's concern and not something PostgreSQL should judge or enforce. PostgreSQL shouldn't have to know or care about a procedural language's syntax -- a function's source code should be an opaque object that PostgreSQL stores and passes to the language's handler without caring about its contents. Syntax enforcement should be in the language's validator or handler according to the language's own rules. That's what we do now. My point being it's not our job to fix data coming from the client. If a client app creates a plpython function the wrong way, fix it. Why should we place a paper bag on a client bug?

Speaking of code munging and syntax enforcement, have a look at this:

CREATE FUNCTION foo() RETURNS text AS $$
return """line 1
line 2
line 3"""
$$ LANGUAGE plpythonu;

SELECT foo();
      foo
----------------
 line 1
        line 2
        line 3
(1 row)

Eh? Where'd those leading tabs come from? Why, they came from PLy_procedure_munge_source() in src/pl/plpython/plpython.c:

mrc = PLy_malloc(mlen);
plen = snprintf(mrc, mlen, "def %s():\n\t", name);
Assert(plen >= 0 && plen < mlen);

sp = src;
mp = mrc + plen;

while (*sp != '\0')
{
    if (*sp == '\n')
    {
        *mp++ = *sp++;
        *mp++ = '\t';
    }
    else
        *mp++ = *sp++;
}
*mp++ = '\n';
*mp++ = '\n';
*mp = '\0';

How about them apples? The PL/Python handler is already doing some fixup behind the scenes (and potentially causing problems, as the example illustrates). OMG! It's indenting the function body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I understand that once you do it you'll have to shift the code by an indentation level. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l.
_/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
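(The same munging, sketched in Python, shows exactly how the tabs leak into a triple-quoted literal in the body:)

def munge(name, src):
    # mirror PLy_procedure_munge_source(): prefix a def, then add a tab
    # after every newline -- including newlines inside string literals
    return "def %s():\n\t%s\n\n" % (name, src.replace("\n", "\n\t"))

body = 'return """line 1\nline 2\nline 3"""'
ns = {}
exec munge("foo", body) in ns
print repr(ns["foo"]())   # 'line 1\n\tline 2\n\tline 3' -- tabs leaked in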
Re: [GENERAL] pg_dump large-file support 16GB
On Thu, 17 Mar 2005, Rafael Martinez Guerrero wrote: My question is why is this limit (16GB) there, when my OS does not have that limit? Is it possible to take it away in an easy way? It looks like pg_dump is compiled with large-file support because it can work with files bigger than 4GB. More ideas? :)

Things to try:

a) shell redirection:

$ pg_dump ... > outfile

b) some pipes:

$ pg_dump ... | cat > outfile
$ pg_dump ... | dd of=outfile

a) may fail if there's something wrong with pg_dump and large files. b) is different in that it's the right side of the pipe that outputs to the filesystem. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] New user: Windows, Postgresql, Python
On Wed, 16 Mar 2005, Michael Fuhr wrote: On Wed, Mar 16, 2005 at 01:46:23PM +0100, Marco Colombo wrote: It seems the python documentation is plain wrong, or I'm not able to read it at all: http://docs.python.org/ref/physical.html "A physical line ends in whatever the current platform's convention is for terminating lines. On Unix, this is the ASCII LF (linefeed) character. On Windows, it is the ASCII sequence CR LF (return followed by linefeed). On Macintosh, it is the ASCII CR (return) character." Perhaps the Python documentation could use some clarification about when the platform's convention is required and when it isn't. The Embedding Python documentation shows embedded code with lines ending in \n and without saying anything about requiring the platform's convention: http://docs.python.org/ext/high-level-embedding.html This is the language _reference_ manual, btw. I'm very surprised to hear python on windows is so broken. Anyway, that makes life simpler for us. plpython programs are \n separated, no matter what platform the server runs on. That the behavior makes life simpler is an argument against it being broken (although it would be even less broken if it were more flexible about what line endings it allows).

broken == 'not conforming to the specifications or the documentation'

The fact it helps us is just a side effect.

A detailed response would be getting off-topic for PostgreSQL, but I'll stand by what I said earlier: I would find it bizarre if embedded Python code had to use different line endings on different platforms. That would mean the programmer couldn't simply do this: PyRun_SimpleString("x = 1\nprint x\n"); Instead, the programmer would have to do a compile-time or run-time check and build the string in a platform-dependent manner. What problem would the language be solving by requiring that?

This one:

aprogram = "x = 1\nprint x\n";
printf(aprogram);
PyRun_SimpleString(aprogram);

See? THIS program requires compile-time or run-time checks. You can't run it on Windows, or Mac: it'll write garbage to the screen (something that looks like garbage, that is). Make it more general:

aprogram = get_program_from_somewhere();
PyRun_SimpleString(aprogram);
write_program_to_somefile_possibly_stdout(aprogram);

What if get_program_from_somewhere() reads user input? On Windows lines will be \r\n separated. Now, should this program make platform checks? Why not simply read a file (or stdin) in text mode, and pass the result to PyRun_SimpleString()? The same applies to output, of course.

Now something strikes me... in his tests, Paul tried my program and the output looks identical to Linux. Now... I was expecting program1 (the one with just \n) to display badly under Windows. Am I missing something? Does the C runtime on Windows convert \n into \r\n automatically in printf()? If so, I'm on the wrong track. It may do the same with scanf() and other stdio functions. I must say I wasn't expecting my program to run just fine, with all those \n I used in it. Starting from printf("Initialized.\n"); Paul, can you please tell me which compiler you used under Windows to compile my program and if you used some weird compiling options? TIA. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] New user: Windows, Postgresql, Python
On Wed, 16 Mar 2005, Michael Fuhr wrote: On Wed, Mar 16, 2005 at 04:17:51PM +0100, Marco Colombo wrote: aprogram = "x = 1\nprint x\n"; printf(aprogram); PyRun_SimpleString(aprogram); See? THIS program requires compile-time or run-time checks. You can't run it on Windows, or Mac: it'll write garbage to the screen (something that looks like garbage, that is). Are you sure about that? It's been forever since I programmed in a Microsoft environment, but as I recall, I/O streams opened in text mode do automatic translations between \n and \r\n.

No, I wasn't sure, and I actually was wrong. I've never programmed under Windows. I've just learned something. Apparently, as far as Python is concerned, the platform presents \n at C level, so it makes sense for PyRun_SimpleString() to expect \n as line terminator. Still, I don't understand when the lexer would use \r\n as physical line ending on Windows, but I can live with it. :-)

It seems that any client application under Windows is likely to use only \n-delimited text, as long as it uses stdio functions and text mode. Problems arise when it gets text from some other source. But since at C level text is expected to be \n-delimited, the application should take care of the conversion as soon as it receives the data. I think that if we want to be conservative, any input that is supposed to be treated (actively) as text by the server should be \n-delimited. That includes any function source. I'm against any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's an undocumented side effect, and bad practice. Now that I learned that C programs on Windows are expected to handle \n-delimited text, I can't think of any reason why an application should send \r\n-delimited text via libpq as a function definition, unless the programmer forgot to perform the standard \r\n to \n conversion somewhere. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
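(The text-mode translation is easy to see from Python, too; run this sketch on Windows and the two reads differ, on Unix they are identical -- the file name is invented:)

f = open("crlf.txt", "wb")
f.write("line 1\r\nline 2\r\n")
f.close()

print repr(open("crlf.txt", "r").read())    # text mode: '\r\n' -> '\n' on Windows
print repr(open("crlf.txt", "rb").read())   # binary mode: raw '\r\n' everywhere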
Re: [GENERAL] plpython function problem workaround
The following is how I understand it, but please let's delay further discussion until someone tests the program under Windows. On Mon, 14 Mar 2005, Michael Fuhr wrote: Hmmm...I think that would be inconsistent with previous reports. For example, in the following message, the poster said that everything (PostgreSQL, pgAdmin) was running on Windows 2003: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00066.php I'm sorry, he's wrong. The initial report was by Hong Yuan: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php later he clarified: http://archives.postgresql.org/pgsql-general/2005-01/msg00858.php "I am using pgAdmin III Version 1.2.0 under Chinese Windows XP, while the database is 7.4.6 under Linux." BTW I just noticed someone else provided a simpler example: http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php someone should try and compile those under windows.

I suggested that he strip the CRs from pg_proc.prosrc and he said it worked: http://archives.postgresql.org/pgsql-interfaces/2005-03/msg00014.php It's not clear that the test in the second message was run on a Windows server (apparently pgAdmin was run on a Windows client), but I think the beginning of the message is saying that he didn't reply for so long because he didn't have access to a Windows server. From that I infer that he tried my suggestion and posted the results when he finally did get access to a customer's Windows server. I could be misreading that, however. I have no idea of where Michele Bendazzoli ran that code. He's not the original poster, tho.

A couple of months ago Stuart Bishop brought this issue up in python-dev. Most of the thread is along the lines of strip the carriage returns: http://mail.python.org/pipermail/python-dev/2005-January/051203.html Sorry again, but he's assuming Unix on the server side, and Windows or Mac on the client side. If anyone manages to compile the following code on Windows... ... I bet on windows the first program fails and the second is ok. Hopefully somebody will do a Windows test of the code you posted. I'd find it bizarre that the Python code embedded in a C program had to care whether it was running on *nix or Windows.

I find it perfectly consistent! Face it, _any_ C program that's handling data of type _text_ has to know which platform it is running on. If you don't like the behaviour of Python functions, think of printf(). C programmers under Windows are used to writing:

printf("Hello World!\r\n");

as much as (old) Mac programmers write:

printf("Hello World!\r");

and Unix programmers write:

printf("Hello World!\n");

_ANY_ C program that processes multiline text input has to know which platform it is running on, otherwise it reads or produces garbage. Python just requires the input to be text, which seems reasonable to me, since, by design, a python program is more than just a sequence of ';'-separated statements (a la C or Perl) with _optional_ intermixed whitespace. White space ('eol' included) has a meaning in Python syntax. BTW, this attitude reminds me of PostgreSQL's strict validation of input, compared to the more relaxed checking made by MySQL. I really don't feel the need to enumerate the merits of input validation on this list. Python functions want 'valid text' and the definition is platform dependent, unfortunately. Why should it be relaxed, and accept invalid text as input?
If you are to compile any unix text processing utility, such as grep, on a mac, you'll have to change the source and have the program split lines at \r (under windows, splitting lines at \n still works, but leaves a spurious \r around). Python is the same, since it expects programs as _text_. The universal newline support is totally another matter. And it applies to files only (it's a wrapper around file functions). It's a tool for lazy programmers. It is possible that in the future the Python lexer is changed to recognize different line endings (since there's no valid case for a \r at the end of a line that I can think of). But requiring the input to be text is not 'bizarre' at all. The issue of text representation affects _any_ application. Treating text as binary data is plain wrong, IMHO, and will always lead to problems. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Tom Lane wrote: Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is

x = 'foo
bar'

Whether you think this is good style is not the question --- is it allowed by the language?

You can, with triple-quoting and by escaping it with a backslash. The following code, admittedly ugly, is valid python:

a = 'a\
bc'
print a
b = '''a
bc'''
print b

and produces:

abc
a
bc

as output. A \newline in any non-raw literal is allowed and ignored, while a bare newline in a triple-quoted string literal is allowed and retained. Moreover, this is not an exercise in bad style only. It's customary to write docstrings as multiline triple-quoted string literals:

def afunction(a, b, c):
    """This is a function.

    Its arguments are:
    a - first argument
    b - second argument
    c - third argument.

    It does and returns nothing.
    """
    pass

It's more or less the recommended way to document a function (or class or module or whatever). See PEP 257 for more examples: http://www.python.org/peps/pep-0257.html So, to answer your question, newlines are more than allowed in string literals. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Michael Fuhr wrote: [...] That somebody was me. Ok, sorry. I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. Yeah, there's no point in discussing until we have some real world data. I can't compile on windows, so I'll have to wait for someone else to do that. I'm basing my opinions on the Python documentation only. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
On Mon, 14 Mar 2005, Michael Fuhr wrote: Would we? My understanding is that code passed to PyRun_String() and friends must be free of line-ending CRs on all platforms, and that the code that reads a normal Python script takes care of that (i.e., normalizes line endings to be LF only). Can anybody confirm or deny? I'm not sure of that. I suspect you'll need to pass CRs on windows. If anyone manages to compile the following code on Windows...

#include <Python.h>

void
run_program(const char *program)
{
    PyObject *ret, *globals, *locals;

    printf(" running:\n%s\n", program);
    globals = PyDict_New();
    locals = PyDict_New();
    ret = PyRun_String(program, Py_file_input, globals, locals);
    if (ret) {
        Py_DECREF(ret);
        printf("\n");
    } else {
        PyErr_Print();
    }
    Py_DECREF(locals);
    Py_DECREF(globals);
    printf(" end\n\n");
}

int
main(int argc, char *argv[])
{
    const char *program1 = "print 1\nprint 2\n";
    const char *program2 = "print 1\r\nprint 2\r\n";

    Py_Initialize();
    printf(" Initialized.\n");
    printf(" Python %s\n", Py_GetVersion());

    run_program(program1);
    run_program(program2);

    Py_Finalize();
    printf(" Finalized.\n");
}

On my Fedora Core 2, I need to compile it with the following command:

gcc -I/usr/include/python2.3 -L/usr/lib/python2.3/config py-test.c -o py-test \
    -lpython2.3 -ldl -lm -lpthread -lutil

This is my first attempt to embed python, so I may be missing something... On Linux, you get:

$ ./py-test 2>&1 | cat -v
 Initialized.
 Python 2.3.3 (#1, May 7 2004, 10:31:40) [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)]
 running:
print 1
print 2

1
2
 end

 running:
print 1^M
print 2^M

  File "<string>", line 1
    print 1^M
           ^
SyntaxError: invalid syntax
 end

 Finalized.

I bet on windows the first program fails and the second is ok. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] postgresql vs mysql performance comparison
On Tue, 8 Mar 2005, Scott Marlowe wrote: On Tue, 2005-03-08 at 09:06, Shelby Cain wrote: --- Howard Cole [EMAIL PROTECTED] wrote: Although not appropriate for a speed comparison, you might want to note that the use of Mysql versions 4.0 upward now requires a commercial license for clients, which are no longer LGPL, whereas Postgres is free (BSD license). This makes transactions per dollar an interesting statistic when comparing Postgres and MySql! Reading over their site that doesn't appear true for every case. The client libraries are under the GPL and thus any application that links to them would also be covered under the GPL. No commercial license is required unless the terms of the GPL (ie: if you distribute a binary to someone you must also be willing to distribute your source code if asked) are a problem. There have been some statements from MySQL in the past that implied they might be taking a narrower view of what distribution meant than what the GPL was saying. Also, it was impossible for PHP to be packaged with MySQL libs due to incompatibilities with the GPL'd mysql connection libs. It seems MySQL AB has clarified both on these pages: http://www.mysql.com/company/legal/licensing/ http://www.mysql.com/company/legal/licensing/foss-exception.html http://www.mysql.com/company/legal/licensing/faq.html However, Fedora Core 2 still includes MySQL V 3.xx.yy because of the issues with V4.xx.yy's licensing. However, Suse does include the latest version. So there's some difference of opinion on the issue from different distros.

Or different policies. One of the biggest problems of their dual licensing policy is that no one is really interested in providing them with patches. In other words, they cannot accept third party contributions so easily. _My_ patches are going to be, likely, GPL-only. So they can't use them in their commercial product, unless they make two different lines (which they claim they don't), or they get a (commercial) licence from _me_ allowing _them_ to sell a work including _my_ patches. So in order to accept patches from me, they need a lot of paperwork (not to mention money, they're gonna pay for being able to sell my work). Not practical. This is not the case with truly GPL software, such as the Linux kernel. Patches, being a derived work, are GPL and they can include them anytime.

Note that client libraries are optional. As long as the protocol is openly defined (we have open specs), you can write your own client layer, and still connect to the GPL server. Which is _the_ thing. Protecting the client library (switching the licence from LGPL to GPL) makes little sense, IMHO. It greatly reduces the number of potential users, and protects little value. If I want to develop a commercial application that: - runs under Linux - I can; - uses HTTP as protocol, and connects to a GPL-ed web server - I can; - uses MySQL as a database backend - I can't, unless I rewrite the client library, or buy a commercial licence from them. Why? With PostgreSQL you don't have to think about these issues. A big win. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] [Auth] ident method and LDAP user accounts
On Thu, 3 Mar 2005, Stephane Bortzmeyer wrote: On Thu, Mar 03, 2005 at 10:04:32AM +0100, Florian G. Pflug [EMAIL PROTECTED] wrote a message of 114 lines which said: Might it be that the postgres user is not allowed to read /etc/ldap.conf - or however your nss_ldap config file is called? myriam:~ % ls -ld /etc/*ldap* drwxr-xr-x 2 root root 4096 Oct 18 17:17 /etc/ldap -rw--- 1 root root 13 Oct 18 17:19 /etc/ldap.secret -rw-r--r-- 1 root root 8442 Oct 18 17:27 /etc/libnss-ldap.conf -rw-r--r-- 1 root root 7070 Oct 18 17:19 /etc/pam_ldap.conf I'd try su-ing to the postgres user, and check if everything (ls -l /home, ... - you get the idea) works as expected. It does: myriam:~ % id uid=104(postgres) gid=108(postgres) groups=108(postgres) myriam:~ % ls -l /home/bortzmeyer total 68 drwxr-sr-x 3 bortzmeyer staff 4096 Nov 19 11:47 AFGNIC While bortzmeyer is not on /etc/passwd, only in LDAP. So, we still have a mystery :-( Does Debian include and activate SELinux? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] pgadmin3 / postgresql newbie question
On Wed, 2 Mar 2005, Jonathan Schreiter wrote: hi all, running amd64 fedora core 3 w/ default postgresql 7.4.7-3. did a su -, su postgres and createdb mydb as explained in the postgresql tutorial. installed the latest pgadmin3 and am trying to connect to this database. as i wasn't sure what the FC3 default password was for postgres, i changed it to something i could remember. i can't seem to connect to the new database using pgadmin3. i have the correct IP address of the local computer, default port 5432, mydb as the initial db, postgres as the username, and my new password as the password. i keep getting the error Error connecting to the server: could not connect to server: Connection refused Is the server running on host 192.168.1.24 and accepting TCP/IP connections on port 5432? i also verified the postgresql service is running, and that i've added the following to /etc/raddb/postgresql.conf:

This file is part of the freeradius package, and despite the name, has nothing to do with your PostgreSQL configuration. The default path for the real PostgreSQL configuration file is: /var/lib/pgsql/data/postgresql.conf I don't know if running TCP/IP is a requirement for pgadmin3, but if you need to access the _local_ PostgreSQL server, most clients would do w/o configuring TCP/IP support at all.

login = postgres password = mynewpassword and right underneath it: tcpip = true

These do not belong to the PostgreSQL server configuration. It's RADIUS stuff.

i've also disabled my local firewall and SELINUX just for kicks. and yes, i did a reboot. so...anyone know what else i can look at?

1) make sure postgresql is running (use ps - look for a postmaster process)

2) if it's not there, run the following command as root:

service postgresql start

3) if you want it to run automatically at boot, and it doesn't, run the following command as root:

chkconfig postgresql on

this won't start the process if it's not running. It just sets a flag for the next boot.

4) I don't get what you mean by changing the postgres password. To switch to the postgres user, I usually switch to root first, and then to postgres. Root can switch to any user w/o password. Actually, it's good security practice not to assign any password to system pseudo-accounts (postgres is one of them) and leave them locked. If you need a different access method, I strongly suggest looking at the PostgreSQL way to authenticate users and stop using the 'ident' method (see pg_hba.conf), which forces you to run clients with a certain Unix user id.

5) try to access the db with the psql client first. Use the same connection method you're using with pgadmin3, and run it under the same user you run pgadmin3 with. E.g.:

psql -h localhost -p 5432 -U postgres mydb

see the psql manual for details. If you successfully get to 5), it's likely it's a pgadmin3 problem. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
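(For reference, enabling TCP/IP on a 7.4-era server means something like the following in /var/lib/pgsql/data/ -- a sketch from memory, double-check against the 7.4 docs; 8.0 replaced tcpip_socket with listen_addresses:)

# postgresql.conf: accept TCP/IP connections (off by default in 7.4)
tcpip_socket = true

# pg_hba.conf: allow md5 password logins for user postgres from the LAN
host    mydb    postgres    192.168.1.0/24    md5

# then, as root: service postgresql restart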
Re: [GENERAL] Clay Shirky observation regarding MySQL
On Mon, 28 Feb 2005, Martijn van Oosterhout wrote: On Mon, Feb 28, 2005 at 01:46:16PM -0600, [EMAIL PROTECTED] wrote: Hello! Clay Shirky made a comment about MySQL that I thought the PostgreSQL community should be aware of: http://www.shirky.com/writings/situated_software.html It's the section (mostly toward the bottom) entitled, The Nature of Programming, and the Curious Case of MySQL. The whole article is, as normal, interesting and thought-provoking. Interesting article, but w.r.t. the MySQL statement, I read: If you don't need any of the things that databases are good for (ACID, transactions, triggers, views) then MySQL is an acceptable choice.

Interesting article, but I'm not much forgiving when a teacher is so imprecise, especially on software history and qualities. It is plain wrong that Apache made it easy, and that before Apache a web server was rocket science. [...] "In the mid-90s, getting a web server running was such a messy endeavor that it was a project goal in and of itself. Then Apache came along, and so simplified the process that the web server became a simple building block for larger things." I've used both cernd and NCSA httpd, and when I finally switched to Apache the choice was based on project activity: faster evolution, more features, better stability, certainly not because it was easier. Apache is quite a beast to configure from scratch even nowadays, I'd even say it's _more_ complicated than it used to be (think of the different server models it supports, and the large number of modules available). For sure, running cernd was not any harder than running Apache is today. Mr. Shirky completely missed the point here.

[...] "MySQL makes the job much easier, so much easier in fact that after MySQL, it becomes a different kind of job. There are complicated technical arguments for and against using MySQL vs. other databases, but none of those arguments matter anymore." Same goes for MySQL. I don't think MySQL is easier to install, configure and administer: the tasks are almost the same as for other open source databases. And _definitely_ it's not easier to use! I'm not following their development much, but when I had to use it I've _always_ found that missing features do make it harder to use MySQL. Hell, even NATURAL JOIN is nice sugar for small projects, where you can name columns and tables so that you write queries naturally, hardly having to stop and think. So are foreign keys. These are _not_ advanced features that only SQL gurus use: they reduce development time and save headaches for _most_ application programmers. They make programming faster, safer, easier. So, MySQL's success is _not_ based on user-friendliness. Again, wrong example.

Back in '95, on the Microsoft side, they didn't even know about TCP/IP. Linux and i386/NetBSD were quite hard to get and install, not to mention completely unknown to the public. No wonder running a web server was not for everyone. Today, we have distributions that come with a ready-to-run web server. On the Microsoft side, they turned to Unix (NT/2000/2003 is POSIX, and even Bill used to claim NT is Unix), and to Internet services. You can run many Unix daemons, and they have their own Web server and SQL server. So, Mr. Shirky is right, installing and running a web server, or an RDBMS, today is a matter of a few mouse clicks. But _not thanks to Apache and MySQL_ (and to do that _professionally_ is still totally another matter). They're only small bricks in the comfortable house built by the open source movement as a whole.
Development teams behind projects such as Apache (which today is a lot more than an HTTP server) and PostgreSQL pursue the goal of making good products for _professionals_ to use effectively. Most of the burden of making technologies available to as many non-guru users as possible is on distribution makers. If Mr. Shirky wants to set a date, and say "before that" and "after that", it's the day open source distributions hit the masses. Certainly there's no "after Apache" and no "after MySQL". .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Lost rows/data corruption?
On Wed, 16 Feb 2005, Andrew Hall wrote: fsync is on for all these boxes. Our customers run their own hardware with many different specifications of hardware in use. Many of our customers don't have a UPS, although their power is probably pretty reliable (normal city based utilities), but of course I can't guarantee they don't get an outage once in a while with a thunderstorm etc.

I see. Well, I can't help much, then, I don't run PG on XFS. I suggest testing on a different FS, to exclude XFS problems. But with fsync on, the FS has very little to do with reliability, unless it _lies_ about fsync(). Any FS should return from fsync only after data is on disk, journal or not (there might be issues with meta-data, but that's hardly a problem with PG). It's more likely the hardware (IDE disks) lies about data being on the platter. But again that's only in case of sudden power-offs.

[...] this condition. I'd be really surprised if XFS is the problem as I know there are plenty of other people across the world using it reliably with PG.

This is kind of OT, but I don't follow your logic here. I don't see why plenty of success stories of XFS+PG suggest to you the culprit is PG. To me it's still 50% - 50%. :-) Moreover, XFS is continuously updated (it follows the normal, fast linux kernel release cycle, like any other linux FS), so it's hard to make a data point unless someone else is using _exactly_ the same versions as you do. For example, in the kernel changelog from 2.6.7 to 2.6.10 you can read:

[XFS] Fix a race condition in the undo-delayed-write buffer routine.
[XFS] Fix up memory allocators to be more resilient.
[XFS] Fix a possible data loss issue after an unaligned unwritten extent write.
[XFS] handle inode creating race

(only a few of them) Now, I don't have even the faintest idea if those might have affected you or not, but still the point is that the linux kernel changes a lot. And vendors tend to customize their kernels a lot, too. On the PostgreSQL side, releases are slowly-paced, so it's easier. Anyway, I agree your problem is weird, and that it must be something on the server side. No matter what you do on the client side (pool manager, JDBC driver, servlet engine), in no way should the DB get corrupted with duplicated primary keys. I know this is a silly question, but when you write 'We do nothing with any indexes' do you mean indexes are never, _never_ touched (I mean explicitly, as in drop/create index), i.e. they are created at schema creation time and then left alone? Just to make sure... .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Lost rows/data corruption?
On Wed, 16 Feb 2005, Scott Marlowe wrote: I know there are write modes in ext3 that will allow corruption on power loss (I think it's writeback). I know little of XFS in a production environment, as I run ext3, warts and all.

Yeah, but even in writeback mode, ext3 doesn't lie about fsync. No FS does. Since PG can't expect any data to be on disk _before_ fsync completes, it doesn't really make a difference. You can lose data in writeback mode _if_ the application is not fsync-ing it (XFS's only mode is similar to writeback). I'm not aware of any case in which the system can lie about fsync(), unless the hardware is lying in turn. One question for gurus: does PG use fsync() on dirty data pages when they are flushed to disk at checkpoint time? Does it fsync() the directory in case of file creation/deletion/rename? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
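(For reference, the general POSIX recipe the closing question is about: fsync the file for its data, then fsync the containing directory for the new entry. A sketch of the technique, not of what PG actually does:)

import os

def durable_create(path, data):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0644)
    try:
        os.write(fd, data)
        os.fsync(fd)        # flush the file's data blocks to disk
    finally:
        os.close(fd)
    # the new directory entry lives in the directory, not the file:
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)     # flush the directory entry itself
    finally:
        os.close(dirfd)

durable_create("/tmp/newfile", "some data\n")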