Re: [HACKERS] TODO-Item: B-tree fillfactor control
Bruce Momjian pgman@candle.pha.pa.us wrote:

> > - Is fillfactor useful for hash and gist indexes?
> >   I think hash does not need it, but gist might need it. Not sure.
>
> We don't know what type of index a GIST will be, so we have no way of
> knowing. I am thinking we can implement just btree now and the GIST folks
> can add it later if they want. My guess is that each GIST is going to
> behave differently for different fill-factors, so if we allow it to be set
> for GIST, GIST developers can pull the value if they want.

My understanding about hash was wrong: it uses a hard-coded fill factor of
75%. GIST, on the other hand, currently has no ability to control its fill
factor at all. I'm trying to add fill factors to hash and gist, so I'll ask
the index developers to review a patch in the future.

> > - Is it appropriate to use GUC variables to control fillfactors?
> >   Is it better to extend the CREATE INDEX / REINDEX grammar?
>
> I think it has to be part of CREATE INDEX and ALTER INDEX.

The SQL standard has no rules for indexes, so I referred to other databases:

  - Oracle and DB2 :  CREATE INDEX index ON table (...) PCTFREE 30;
  - MS SQL Server  :  CREATE INDEX index ON table (...) WITH FILLFACTOR = 70;

PCTFREE seems to be the more common spelling, so I'll extend the DDL to use
the PCTFREE syntax. The following two forms will both be usable:

  1. SET btree_free_percent = 30;
     CREATE INDEX index ON table (...);
     SET btree_free_percent = 10;  -- revert

  2. CREATE INDEX index ON table (...) PCTFREE 30;

Form 1 would be useful for a compatible pg_dump output format, per a
suggestion from Tom.

> Is there a use for separate node and leaf settings?

We should use different settings for leaf and node pages, but exposing both
may confuse users. So I'll simplify the setting as follows:

  node_free_percent = Min(30%, 3 * leaf_free_percent)

When leaf_free_percent is 10%, node_free_percent is 30% -- the same values
as the current implementation.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
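The proposed leaf/node relationship above can be sketched as follows (a toy
Python helper of mine for illustration, not code from the patch):

```python
def node_free_percent(leaf_free_percent: int) -> int:
    """Derive the internal-node free-space percentage from the leaf
    setting, per the proposed rule: node pages get three times the
    leaf slack, capped at 30%."""
    return min(30, 3 * leaf_free_percent)

# The default: a 10% leaf setting reproduces the current hard-coded
# 30% node setting.
assert node_free_percent(10) == 30
# A denser leaf setting also tightens the node pages.
assert node_free_percent(5) == 15
# Large leaf settings are capped.
assert node_free_percent(20) == 30
```

The cap means only small leaf settings ever produce a node setting below the
current behavior, which keeps node splits rare.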
Re: [HACKERS] Question about ALTER TABLE SET TABLESPACE locking
On Wednesday, 2006-02-01 at 18:08, Tom Lane wrote:
> Hannu Krosing [EMAIL PROTECTED] writes:
> > Does ALTER TABLE SET TABLESPACE lock the table?
>
> It had better ... see the nearby discussion about relaxing locking for
> TRUNCATE.

Is it some recent discussion?

> Exactly the same problem: at commit we'd be cutting the ground out from
> under any concurrent query.

Can't we just keep READ locks and reopen the data file from its new location
before releasing them?

Does our shared memory page cache implementation track logical or physical
pages? If it's just logical pages, then moving the physical storage around
underneath should not affect it. Of course there are problems with WAL/PITR,
which *have* to deal with physical storage, but this should be mostly
unrelated.

--
Hannu

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose
an index scan if your joining column's datatypes do not match
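The logical-vs-physical question above can be illustrated with a toy model
(my own sketch; the class and field names are illustrative, not PostgreSQL's
actual buffer-manager structures): if cached pages are keyed by (relation
oid, block number) rather than by file path, moving the underlying storage
need not invalidate them.

```python
# Toy model of a shared buffer cache keyed by *logical* page identity.
class BufferCache:
    def __init__(self):
        self.pages = {}      # (rel_oid, blockno) -> page bytes
        self.rel_path = {}   # rel_oid -> current physical location

    def put(self, rel_oid, blockno, data):
        self.pages[(rel_oid, blockno)] = data

    def get(self, rel_oid, blockno):
        return self.pages.get((rel_oid, blockno))

    def move_tablespace(self, rel_oid, new_path):
        # Only the physical location changes; cached pages stay valid
        # because their keys never mentioned the path.
        self.rel_path[rel_oid] = new_path

cache = BufferCache()
cache.put(16384, 0, b"page zero")
cache.move_tablespace(16384, "/mnt/fast/ts1/16384")
assert cache.get(16384, 0) == b"page zero"
```

Under this model the remaining hazard is exactly the one named in the email:
anything that addresses storage physically (WAL replay, PITR) still sees the
move.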
Re: [HACKERS] Question about ALTER TABLE SET TABLESPACE locking
Hannu Krosing wrote:
> On Wednesday, 2006-02-01 at 18:08, Tom Lane wrote:
> > Hannu Krosing [EMAIL PROTECTED] writes:
> > > Does ALTER TABLE SET TABLESPACE lock the table?
> >
> > It had better ... see the nearby discussion about relaxing locking for
> > TRUNCATE.
>
> Is it some recent discussion?

It is "[PERFORM] partitioning and locking problems" (on the performance
list).

Regards,

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Multiple logical databases
I am working on an issue that I deal with a lot. There is of course a
standard answer, but maybe it is something to think about for PostgreSQL 9.0
or something. I think I finally understand what I have been fighting for a
number of years: when I have been grousing about postgresql configuration,
this is what I have been fighting.

One of the problems with the current PostgreSQL design is that all the
databases operated by one postmaster server process are interlinked at some
core level. They all share the same system tables. If one database becomes
corrupt because of a disk problem or something, the whole cluster is
affected. If one db is REALLY REALLY huge and doesn't change, and a few
others are small and change often, pg_dumpall will spend most of its time
dumping the unchanging data.

Now, the answer, obviously, is to create multiple postgresql database
clusters and run a postmaster for each logical group of databases, right?
That really is a fine idea, but say, in psql, I do this:

  \c newdb

It will only find the databases that I have in that logical group. If
another postmaster is running, obviously, psql doesn't know anything about
it.

From the DB admin perspective, maybe there should be some hierarchical
structure to this. What if there were a program, maybe a special parent
postmaster process, I don't know, that started a list of child postmasters
based on some site config? The parent postmaster would hold all the
configuration parameters of the child postmaster processes, so there would
only be one postgresql.conf. This also answers "how do we get postgresql
options in a database": because the parent postmaster only needs to
bootstrap the others, it can be configured to run lean and mean, and the
real settings can be inspected and changed at will. A trigger would send a
HUP to child postmasters when their settings change. The parent postmaster
only needs one connection for each child and one admin, right?

Does anyone see this as useful?
---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Multiple logical databases
On Thu, 2006-02-02 at 10:23 -0500, Mark Woodward wrote:
> If one db is REALLY REALLY huge and doesn't change, and a few others are
> small and change often, pg_dumpall will spend most of its time dumping
> the unchanging data.

My usual backup strategy does "pg_dumpall -g" to get the (tiny) global data,
and then a pg_dump for each individual database. Quite apart from anything
else, I prefer to have custom-format dumps anyway, but I think this should
meet your need for less frequent dumping of some constant database.

cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
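The strategy Andrew describes can be sketched as a command plan (my own toy
helper; the database names, file names, and skip policy are illustrative,
not from his setup):

```python
def backup_plan(databases, skip=()):
    """Return shell commands for a cluster backup that dumps the global
    objects once (pg_dumpall -g) and each database separately in custom
    format, skipping databases whose contents are known not to change."""
    cmds = ["pg_dumpall -g > globals.sql"]
    for db in databases:
        if db not in skip:
            cmds.append(f"pg_dump -Fc {db} > {db}.dump")
    return cmds

plan = backup_plan(["sales", "archive", "stats"], skip={"archive"})
assert plan == [
    "pg_dumpall -g > globals.sql",
    "pg_dump -Fc sales > sales.dump",
    "pg_dump -Fc stats > stats.dump",
]
```

The huge unchanging database simply goes in the skip set (or on a slower
schedule), so the nightly run no longer spends most of its time on it.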
Re: [HACKERS] Multiple logical databases
Mark Woodward [EMAIL PROTECTED] writes:
> One of the problems with the current PostgreSQL design is that all the
> databases operated by one postmaster server process are interlinked at
> some core level. They all share the same system tables. If one database
> becomes corrupt because of disk or something, the whole cluster is
> affected.

This problem is not as large as you paint it, because most of the system
catalogs are *not* shared.

> Does anyone see this as useful?

No...

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Multiple logical databases
> Mark Woodward [EMAIL PROTECTED] writes:
> > One of the problems with the current PostgreSQL design is that all the
> > databases operated by one postmaster server process are interlinked at
> > some core level. They all share the same system tables. If one database
> > becomes corrupt because of disk or something, the whole cluster is
> > affected.
>
> This problem is not as large as you paint it, because most of the system
> catalogs are *not* shared.
>
> > Does anyone see this as useful?
>
> No...

Seriously? No use at all? You don't see any purpose in controlling and
managing multiple postgresql postmaster processes from one central point?
Sure you don't want to think about this a little?

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Multiple logical databases
Mark Woodward wrote:
> > > One of the problems with the current PostgreSQL design is that all the
> > > databases operated by one postmaster server process are interlinked at
> > > some core level. They all share the same system tables. If one
> > > database becomes corrupt because of disk or something, the whole
> > > cluster is affected.
> >
> > This problem is not as large as you paint it, because most of the system
> > catalogs are *not* shared.
> >
> > > Does anyone see this as useful?
>
> Seriously? No use at all? You don't see any purpose in controlling and
> managing multiple postgresql postmaster processes from one central point?

pgAdmin does so. IMHO it's totally sufficient to handle this on the client
side.

Regards,
Andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
[HACKERS] streamlined standby process
Hi all,

After spending some time (~2 days to fully(?) understand the process)
setting up a standby machine using WAL shipping, and experimenting to be
fairly sure it is working reliably, I started thinking about how I would
like it to be done. My dream process for setting up a standby would be:

 - start up the postmaster on an empty directory (except maybe the
   postgresql.conf and pg_hba.conf to be used after switching to active
   mode), with a "build standby" option;
 - the postmaster would start in a special standby mode, where it connects
   to the primary server using a normal connection (which in turn must be
   configured, e.g. on the command line, just as with psql);
 - the standby would issue a STANDBY command to the primary (possibly
   implemented at a low level, i.e. it doesn't have to be a SQL command but
   could be at the protocol level);
 - the primary would mark the start of the backup, just as today with the
   pg_start_backup function;
 - the primary would start to stream all the files of the database on the
   connection, interleaved with all the committed WAL records just as they
   commit (buffered somehow, of course, so it won't affect performance too
   much);
 - the standby would save the received files and store the WAL records for
   later replay;
 - when all the db files have been transmitted, the primary would mark the
   end of the backup, just as today with pg_stop_backup, and signal this to
   the standby;
 - the standby can now start replaying the WAL records;
 - when all the saved WAL records up to the pg_stop_backup mark have been
   replayed, the standby is operational in the sense that it can be
   switched on in normal operation mode;
 - in the meantime the primary keeps sending WAL records as they become
   available, and the standby replays them as they arrive. This makes sure
   the standby is always as up to date as possible;
 - bringing the standby up would be a matter of losing the connection to
   the master server, or some kind of local signalling;
 - the standby would then continue to work as a normal postmaster, using
   the configuration file which must exist in the standby directory when
   the process starts.

This process would only require the user to know where he wants the standby
and where the master is... no more scripting and worrying about partial WAL
files. And it could provide some degree of automatic failover, if relying
on connection failure is a good enough sign of the primary server being
down -- but of course this is not mandatory.

Is this scenario viable? I realize it needs a lot of coding in the
postmaster, and it must somehow get all WAL records as they commit. I'm
mostly ignorant regarding the postgres code structure, so I have no idea
how much work that would involve... so can you gurus please comment? I
really have no experience with C, but I would start doing some coding if
it's deemed viable.

Thanks,
Csaba.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose
an index scan if your joining column's datatypes do not match
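The staged bootstrap described above can be sketched as a tiny state
machine (my own toy model; the state and event names are mine, not from any
actual patch):

```python
# Each transition: (current state, event) -> next state. Any event not in
# the table leaves the state unchanged.
TRANSITIONS = {
    ("empty", "standby_cmd_sent"): "receiving_base",      # STANDBY issued
    ("receiving_base", "backup_stopped"): "replaying_saved_wal",
    ("replaying_saved_wal", "caught_up"): "streaming",    # operational
    ("streaming", "connection_lost"): "active",           # failover/promote
}

def step(state, event):
    return TRANSITIONS.get((state, event), state)

state = "empty"
for event in ["standby_cmd_sent", "backup_stopped", "caught_up",
              "connection_lost"]:
    state = step(state, event)
assert state == "active"
```

The key property the email argues for is visible here: the user only
triggers the first event; everything up to "streaming" is driven by the
primary, and promotion is just the last transition.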
Re: [HACKERS] Multiple logical databases
Mark Woodward wrote:
> Seriously? No use at all? You don't see any purpose in controlling and
> managing multiple postgresql postmaster processes from one central point?

I'd rather spend effort on fixing the problems that arise from big
databases; for example, Hannu's patch for concurrent vacuum attacks one of
the problems that IMHO are important. More elaborate partitioning does too.

Anyway, if you're very excited about it, I don't think it's impossible to
code a super-postmaster that would redirect a client to the real
postmaster. I even think it can be done without modifying the regular
postmaster.

--
Alvaro Herrera                  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
Re: [HACKERS] Multiple logical databases
On Thu, 2 Feb 2006, Mark Woodward wrote:
> [proposal for a parent postmaster managing child postmasters snipped]
>
> Does anyone see this as useful?

Not as described above, no. Perhaps with a more concrete plan that actually
talks about these things in more detail. For example, you posit the \c
thing as an issue -- I don't personally agree -- but you also don't address
it with a solution.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
Re: [HACKERS] Some platform-specific MemSet research
I wanted to chime in that I also see this speedup using XLC 6.0 (IBM's cc),
even in 32-bit mode. I have tested on AIX 5.2 and 5.1.

I think this would be good to include in the regular release. I'm not sure
how many people running older versions of AIX would want a new version of
postgres.

	-rocco

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Bruce Momjian
Sent: Wednesday, February 01, 2006 12:11 PM
To: Seneca Cunningham
Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Some platform-specific MemSet research

My guess is that there is some really fast assembler for memory copy on
AIX, and only libc memset() has it. If you want, we can make
MEMSET_LOOP_LIMIT in c.h a configure value, and allow template/aix to set
it to zero, causing memset() to always be used.

> Are you prepared to make this optimization decision for all AIX users
> using gcc, or only for certain versions?

---------------------------------------------------------------------------

Seneca Cunningham wrote:
> Martijn van Oosterhout wrote:
> > On Tue, Jan 24, 2006 at 05:24:28PM -0500, Seneca Cunningham wrote:
> > > After reading the post on -patches proposing that MemSet be changed
> > > to use long instead of int32, on the grounds that a pair of x86-64
> > > linux boxes took less time to execute the long code 64*10^6 times[1],
> > > I took a look at how the test code performed on AIX with gcc. While
> > > the switch to long did result in a minor performance improvement,
> > > dropping the MemSetLoop in favour of the native memset resulted in
> > > the tests taking ~25% of the time of the MemSetLoop-like int loop.
> > > The 32-bit linux system I ran the expanded tests on showed that, for
> > > the buffer size range where postgres can use the looping MemSet
> > > instead of memset (size <= 1024 bytes), MemSet generally had better
> > > performance.
> >
> > Could you please check the asm output to see what's going on. We've
> > had tests like these produce odd results in the past because the
> > compiler optimised away stuff that didn't have any effect. Since every
> > memset after the first is a no-op, you want to make sure it's still
> > actually doing the work...
>
> Well, on both linux and AIX, all 30 of the 6400-iteration loops from the
> source exist (10 int, 10 long, 10 memset). According to my understanding
> of the assembler, memset itself is only called for values >= 64 bytes on
> both platforms, and the memset is called in each iteration. The
> assembler for the 64-byte loops follows, with prepended line numbers --
> the first loop is the MemSetLoop int variant, the second loop memset,
> the third loop the MemSetLoop long variant.

64-bit AIX:

  419   addi 3,1,112
  420   li 4,0
  421   bl .gettimeofday
  422   nop
  423   lis 10,0x3d0
  424   cmpld 6,26,16
  425   li 11,0
  426   ori 10,10,36864
  427 L..41:
  428   bge 6,L..42
  429   mr 9,26
  430   li 0,0
  431 L..44:
  432   stw 0,0(9)
  433   addi 9,9,4
  434   cmpld 7,16,9
  435   bgt 7,L..44
  436 L..42:
  437   addi 0,11,1
  438   extsw 11,0
  439   cmpw 7,11,10
  440   bne+ 7,L..41
  441   li 4,0
  442   mr 3,22
  443   lis 25,0x3d0
  444   li 28,0
  445   bl .gettimeofday
  446   nop
  447   li 4,64
  448   addi 5,1,112
  449   ld 3,LC..9(2)
  450   mr 6,22
  451   ori 25,25,36864
  452   bl .print_time
  453   addi 3,1,112
  454   li 4,0
  455   bl .gettimeofday
  456   nop
  457 L..46:
  458   mr 3,26
  459   li 4,0
  460   li 5,64
  461   bl .memset
  462   nop
  463   addi 0,28,1
  464   extsw 28,0
  465   cmpw 7,28,25
  466   bne+ 7,L..46
  467   li 4,0
  468   mr 3,22
  469   bl .gettimeofday
  470   nop
  471   li 4,64
  472   addi 5,1,112
  473   ld 3,LC..11(2)
  474   mr 6,22
  475   bl .print_time
  476   addi 3,1,112
  477   li 4,0
  478   bl .gettimeofday
  479   nop
  480   lis 10,0x3d0
  481   cmpld 6,26,16
  482   li 11,0
  483   ori 10,10,36864
  484 L..48:
  485   bge 6,L..49
  486   mr 9,26
  487   li 0,0
  488 L..51:
  489   std 0,0(9)
  490   addi 9,9,8
  491   cmpld 7,9,16
  492   blt 7,L..51
  493 L..49:
  494   addi 0,11,1
  495   extsw 11,0
  496   cmpw 7,11,10
  497   bne+ 7,L..48
  498   li 4,0
  499   mr 3,22
  500   bl .gettimeofday
  501   nop
  502   li 4,64
  503   addi 5,1,112
  504   ld 3,LC..13(2)
  505   mr 6,22
  506   bl .print_time

32-bit Linux:

  387   popl %ecx
  388   popl %edi
  389   pushl $0
  390
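The tradeoff being benchmarked above is a threshold dispatch: below some
MEMSET_LOOP_LIMIT-style cutoff, zero the memory with an inline loop;
above it, call the library memset. This Python analogue of mine only
mirrors the dispatch logic (it is not the actual MemSet macro from c.h),
but it shows why setting the limit to zero, as Bruce suggests for AIX,
makes memset always be used:

```python
MEMSET_LOOP_LIMIT = 64  # bytes; the per-platform value under debate

def pg_memset_zero(buf: bytearray) -> str:
    """Zero buf; return which strategy was used."""
    if len(buf) <= MEMSET_LOOP_LIMIT:
        # Stand-in for the word-at-a-time MemSet loop.
        for i in range(len(buf)):
            buf[i] = 0
        return "loop"
    # Stand-in for the libc memset call.
    buf[:] = bytes(len(buf))
    return "memset"

small, big = bytearray(b"x" * 64), bytearray(b"x" * 1024)
assert pg_memset_zero(small) == "loop" and small == bytearray(64)
assert pg_memset_zero(big) == "memset" and big == bytearray(1024)
```

With MEMSET_LOOP_LIMIT set to 0, every nonempty buffer takes the "memset"
branch, which is exactly the behavior the AIX measurements favor.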
Re: [HACKERS] TODO-Item: B-tree fillfactor control
ITAGAKI Takahiro wrote:
> My understanding about hash was wrong: it uses a hard-coded fill factor
> of 75%. GIST, on the other hand, currently has no ability to control its
> fill factor. I'm trying to add fill factors to hash and gist, so I'll
> ask the index developers to review a patch in the future.

OK.

> PCTFREE seems to be the more common spelling, so I'll extend the DDL to
> use the PCTFREE syntax. [...] Form 1 would be useful for a compatible
> pg_dump output format, per a suggestion from Tom.

I personally like FILLFACTOR, but I understand the desire to match Oracle.
PCTFREE seems too abbreviated for me, but it would match the GUC better,
so maybe it is the best.

> > Is there a use for separate node and leaf settings?
>
> We should use different settings for leaf and node pages, but exposing
> both may confuse users. So I'll simplify the setting as follows:
>
>   node_free_percent = Min(30%, 3 * leaf_free_percent)
>
> When leaf_free_percent is 10%, node_free_percent is 30% -- the same
> values as the current implementation.

Yes, I think that is ideal.

--
  Bruce Momjian                   |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us          |  (610) 359-1001
  + If your life is a hard drive, |  13 Roberts Road
  + Christ can be your backup.    |  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
Re: [HACKERS] Some platform-specific MemSet research
Rocco Altier wrote:
> I wanted to chime in that I also see this speedup using XLC 6.0 (IBM's
> cc), even in 32-bit mode. I have tested on AIX 5.2 and 5.1.
>
> I think this would be good to include in the regular release. I'm not
> sure how many people running older versions of AIX would want a new
> version of postgres.

OK, perfect. I will work on making it platform-specific and report back.

---------------------------------------------------------------------------

[snip: remainder of the quoted thread, including the assembler listings,
unchanged from upthread]
Re: [HACKERS] Multiple logical databases
On Thu, 2 Feb 2006, Mark Woodward wrote:
> > [proposal for a parent postmaster managing child postmasters snipped]
> >
> > Does anyone see this as useful?
>
> Not as described above, no. Perhaps with a more concrete plan that
> actually talks about these things in more detail. For example, you posit
> the \c thing as an issue -- I don't personally agree -- but you also
> don't address it with a solution.

While I understand that it is quite a vague suggestion, I guess I was
brainstorming more than detailing an actual set of features.

My issue is this (and this is NOT a slam on PostgreSQL): I have a number of
physical databases on one machine, on ports 5432, 5433, and 5434, all
running the same version -- in fact, the same installation -- of
PostgreSQL. Even though they run on the same machine, run the same version
of the software, and are used by the same applications, they have NO
interoperability.

For now, let's just accept that they need to be on separate physical
clusters because some need to be able to be started and stopped while
others need to remain running; there are other reasons, but one reason will
suffice for the discussion.

From an administration perspective, a single point of admin would seem like
a logical and valuable objective, no?

Beyond just the admin advantages, the utilities could be modified to handle
a root server that redirects to child servers. The psql program, when
handling a \c command, would query the root server to find the child server
and then connect to that. libpq could also be modified to handle this
without changing the applications. The child postmasters would query the
root postmaster when a DB is created or dropped, to keep it up to date.
Conflicts between two children can be managed either by some sort of
first-come-first-served rule, by disallowing creation of a duplicate name,
or by some other method.

So,

  conn = connect(host=localhost dbname=mydb);

will connect to the root server, find the actual server, and then connect
to it, completely hiding the different physical databases and creating one
very large logical installation. Perhaps this could even be extended to
cover large-scale clustering. Who knows?

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
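The "root postmaster" registry Mark describes can be sketched in a few
lines (my own toy model; the class, database names, and ports are all
hypothetical): children register their databases, a client asks the root
which port serves a given dbname, and duplicate names are rejected -- one
of the conflict policies suggested above.

```python
class RootRegistry:
    def __init__(self):
        self.db_to_port = {}   # dbname -> port of the child postmaster

    def register(self, dbname, port):
        # Disallow duplicate names across children.
        if dbname in self.db_to_port:
            raise ValueError(f"database {dbname!r} already registered")
        self.db_to_port[dbname] = port

    def resolve(self, dbname):
        """What a \\c or libpq connect would ask the root server."""
        return self.db_to_port[dbname]

root = RootRegistry()
root.register("mydb", 5433)
root.register("otherdb", 5434)
assert root.resolve("mydb") == 5433
```

A libpq-side implementation would resolve the name against the root and
then open a second connection to the returned port, hiding the redirection
from applications.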
Re: [HACKERS] Multiple logical databases
On Thu, Feb 02, 2006 at 02:05:03PM -0500, Mark Woodward wrote:
> My issue is this (and this is NOT a slam on PostgreSQL): I have a number
> of physical databases on one machine on ports 5432, 5433, 5434, all
> running the same version -- in fact, the same installation -- of
> PostgreSQL.

One way of achieving this would be to allow the PGHOST and/or PGPORT
variables to be lists; when you connect, it tries each combination until it
finds one that works. Maybe not as clean, but a lot easier to implement.
Unless of course you want "psql -l" to list all databases in all
clusters...

I think it would be better to put the intelligence into libpq rather than
trying to create more servers.

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

signature.asc
Description: Digital signature
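Martijn's list-valued PGHOST/PGPORT idea can be sketched as follows (my own
toy model; the connect function is injected so the fallback logic can be
shown without a live server, and none of these names come from libpq):

```python
def connect_first(hosts, ports, connect):
    """Try each (host, port) combination in order; return the first
    successful connection, or re-raise the last failure."""
    last_err = None
    for host in hosts:
        for port in ports:
            try:
                return connect(host, port)
            except ConnectionError as e:
                last_err = e
    raise last_err or ConnectionError("no candidates given")

def fake_connect(host, port):
    # Pretend only port 5434 is listening.
    if port == 5434:
        return (host, port)
    raise ConnectionError(f"{host}:{port} refused")

assert connect_first(["localhost"], [5432, 5433, 5434],
                     fake_connect) == ("localhost", 5434)
```

As the email notes, this keeps all the intelligence on the client side: no
extra server process, just a longer candidate list in the environment.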
Re: [HACKERS] Backslashes in string literals
On Wed, Feb 1, 2006 at 10:50 am, in message [EMAIL PROTECTED], Bruce
Momjian pgman@candle.pha.pa.us wrote:
> Kevin Grittner wrote:
> > We found a bug in the code from my first patch. Since it was a
> > low-frequency, non-destructive type of problem for us, I was able to
> > take my time and look over the task a little more closely. Attached is
> > a patch which should come close to implementing the TODO. In
> > particular, it is now implemented as a configurable option, which can
> > be set in the postgresql.conf file or at run time. There are some
> > remaining issues:
> >
> > (1) I couldn't figure out the best way to obtain a value for
> > standard_conforming_strings in the psql version of the scanner. For our
> > needs, we could just assume it is always on, so I left it that way.
> > Someone with a better handle on this issue can hopefully finish that
> > part. Alternatively, if you give me some direction, I might have time
> > to generalize it. As far as I can tell from some testing today,
> > everything works fine issuing statements through a connection, but psql
> > isn't settled down.
>
> Sounds like you made great progress!

Thanks. It was actually pretty easy once I took the time to learn flex. I'd
kinda winged it in my emergency compile-time version. I'm pretty sure that
what I've done works; my biggest concern is over what I've missed. For
example, I was using pg_dump and pg_restore today and realized that these,
and other applications, likely need some kind of work to support the
feature.

> The proper way to do (1) is to call libpq's pqSaveParameterStatus() from
> psql. Take a look for psql's session_username(). It is called every time
> the prompt is printed, if the username is required. One great feature of
> using pqSaveParameterStatus() is that it reads server packets and keeps
> the tracked value updated for you without query overhead.

I'll take a look at it. If I feel confident that I get it, I'll do the work
and post another patch. Would you prefer that I resend the whole works, or
just the delta?

Also, since we're doing this out of a need to fix the issue on our
production system, I'm compelled to work on the stable branch. Is it OK to
post patches from the tip of that branch, or should I really check out the
trunk (HEAD) too, and replicate it there for my patch posts?

-Kevin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
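What the standard_conforming_strings setting changes can be illustrated
with a toy scanner (my own sketch, not the flex rules from the patch; only
a few escapes are modeled): with the setting on, a backslash inside a
string literal is an ordinary character per the SQL standard, while with it
off, historical PostgreSQL escape processing applies.

```python
# Historical backslash escapes (a small subset, for illustration).
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'"}

def scan_literal(body: str, standard_conforming_strings: bool) -> str:
    """Interpret the body of a single-quoted SQL string literal."""
    if standard_conforming_strings:
        return body  # backslashes pass through untouched
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

# The same literal 'a\nb' yields different strings under the two modes.
assert scan_literal(r"a\nb", True) == "a\\nb"   # backslash, then 'n'
assert scan_literal(r"a\nb", False) == "a\nb"   # a newline character
```

This is why every client-side scanner (psql, pg_dump, etc.) has to track
the server's current value, as discussed above: the correct tokenization of
one and the same literal depends on it.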
Re: [HACKERS] Multiple logical databases
Mark,

> Even though they run on the same machine, run the same version of the
> software, and are used by the same applications, they have NO
> interoperability.
>
> For now, let's just accept that they need to be on separate physical
> clusters because some need to be able to be started and stopped while
> others need to remain running; there are other reasons, but one reason
> will suffice for the discussion.

Well, to answer your original question, I personally would not see your
general idea as useful at all. I admin 9 or 10 PostgreSQL servers currently
and have never run across a need, or even a desire, to do what you are
doing. In fact, if there's any general demand, it's to go the opposite way:
patches to lock down the system tables and prevent switching databases, to
support ISPs and other shared-hosting situations.

For an immediate solution to what you are encountering, have you looked at
pgpool?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
[HACKERS] Persistent error
Dear hackers, I have downloaded and installed pgAdmin2 (and pgAdmin3, corrected for my Windows 98 Second Edition). When I try to connect to my web server, where I have an AVL program to nurse and inspect, an error message comes up and I cannot proceed with the inspection. The error is: Column "datpath" does not exist. And now? How to proceed??? Please help me. Flavio Caiuby -- São Paulo - Brazil
Re: [HACKERS] Multiple logical databases
Mark Woodward wrote: My issue is this, (and this is NOT a slam on PostgreSQL), I have a number of physical databases on one machine on ports 5432, 5433, 5434. All running the same version and in fact, installation of PostgreSQL. Even though they run on the same machine, run the same version of the software, and are used by the same applications, they have NO interoperability. For now, let's just accept that they need to be on separate physical clusters because some need to be able to be started and stopped while others need to remain running; there are other reasons, but one reason will suffice for the discussion. Hmmm - do you really need to start and stop them? Or are you just doing that to forbid user access whilst doing data loads etc.? If so, then you might get more buy-in by requesting enhancements that work with the design of Pg a little more (or I hope they do anyway), e.g.: 1/ Enable/disable (temporarily) user access to individual databases via a simple admin command (tho 'ALTER DATABASE xxx CONNECTION LIMIT 0' will suffice if you do loads with a superuser role). 2/ Restrict certain users to certain databases via simple admin commands (editing pg_hba.conf is not always convenient or possible). 3/ Make cross-db relation references a little more transparent (e.g. maybe introduce SYNONYM for this). Other related possibilities come to mind, like being able to segment the buffer cache on a database level (e.g. bigdb gets 90% of the shared buffers, not 100%, as I want to keep smalldb's tables cached always). Cheers Mark ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
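As a concrete illustration of how far suggestion (1) can already be taken today, here is a minimal sketch of the 'ALTER DATABASE xxx CONNECTION LIMIT 0' approach mentioned above (the database name is hypothetical):

```sql
-- Block new non-superuser connections to one database while leaving
-- the rest of the cluster running:
ALTER DATABASE streetmap CONNECTION LIMIT 0;

-- ... perform the bulk load under a superuser role, which is not
-- subject to the per-database connection limit ...

-- Re-open the database to normal traffic (-1 means "no limit"):
ALTER DATABASE streetmap CONNECTION LIMIT -1;
```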
Re: [HACKERS] Multiple logical databases
Mark Woodward wrote: From an administration perspective, a single point of admin would seem like a logical and valuable objective, no? I don't understand why you are going out of your way to separate your databases (for misinformed reasons, it appears) and then want to design a way to centrally control them so they can all fail together. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Passing arguments to views
I've written some complicated queries that I'd like to save inside the server and then call from my clients using a short name. For the queries that require no external information, views are perfect. For queries that *do* require external information (like a search date range), I've used functions (written in the SQL language, because I'm just writing plain SQL queries but using $1, $2 tokens for passed-in arguments). When I use these functions, I'm typically joining the results of the function with other tables. Since much of the work is being done inside the function, the planner can't use both the inside-function and outside-function query information when generating a query plan. Instead, it has to do Function Scans (planning and executing the SQL inside the function at each execution, I'm assuming) and then manipulate the output. Ideally, I'd like to be able to write queries that contain $n tokens that will be substituted at execution time, save them on the server, and let the query planner plan the whole query before it's executed. Basically, writing views with arguments. For example, a sales_figures view that requires start_date and end_date parameters could be used like this: CREATE VIEW sales_figures($1, $2) AS SELECT ... FROM ... WHERE purchase_date BETWEEN $1 AND $2; SELECT * FROM sales_figures('2005-08-22', '2005-09-14') sf JOIN customers c ON (sf.customer_id = c.customer_id) What do you think? Is this an interesting feature? Is this the right way to go about it, or should I try to get the planner to see through SQL function boundaries (e.g., enhance the function system instead of enhancing the view system)? Would this be a good project for a newbie to the code? I can see that the syntax used above would be problematic: how would it distinguish that from a call to a sales_figures() function? Any comments about alternative syntax would be welcome, too! Thanks! - Chris
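For readers following along, the function-based approach described above looks roughly like this; a sketch only, with hypothetical table and column names:

```sql
-- The "view with arguments" approximated today: a function written in
-- the SQL language, using $n tokens for the passed-in arguments.
CREATE FUNCTION sales_figures(date, date)
RETURNS SETOF sales AS $$
    SELECT * FROM sales
    WHERE purchase_date BETWEEN $1 AND $2
$$ LANGUAGE sql STABLE;

-- Used in FROM and joined like a table; the planner sees only a
-- Function Scan here, which is exactly the limitation being discussed.
SELECT *
FROM sales_figures('2005-08-22', '2005-09-14') sf
JOIN customers c ON (sf.customer_id = c.customer_id);
```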
Re: [HACKERS] Proposal: new pg_dump options --copy-delimiter and
On Fri, Jan 27, 2006 at 01:12:35PM -0500, Greg Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: David, I don't get this... what are you copying from/to that you wouldn't just script? If you throw it into a script you can change the delimiter on the fly using translation. The problems are a) there's no guarantee the data is safe to put through your hypothetical tab-destroying copy/paste anyways. There could be tabs or other unsafe characters in the data. b) We have no way of knowing which characters are or aren't safe in your hypothetical copy/paste system. Why is tab unsafe in the first place? Personally I find anything that would encourage people to use anything other than tabs evil anyways. All those people who think | is somehow a reasonable choice, or want to use commas and then get all confused trying to escape them and invent ever more confused syntaxes for escaping the escape characters. Just use tab-separated data like man was meant to. I'd say that the multiplicity of strong opinions on The Right Delimiter is a pretty strong argument for having a switch to control it. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
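For context, the delimiter in question is the one the COPY command already accepts; a brief sketch (the table name is hypothetical):

```sql
-- COPY's text format is tab-delimited by default:
COPY orders TO STDOUT;

-- The delimiter is already switchable per command; the proposed
-- pg_dump option would expose the same choice at dump time:
COPY orders TO STDOUT WITH DELIMITER '|';
```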
Re: [HACKERS] Backslashes in string literals
Kevin Grittner wrote: to generalize it. As far as I can tell from some testing today, everything works fine issuing statements through a connection, but psql isn't settled down. Sounds like you made great progress! Thanks. It was actually pretty easy once I took the time to learn flex. I'd kinda winged it in my emergency compile-time version. I'm pretty sure that what I've done works; my biggest concern is over what I've missed. For example, I was using pg_dump and pg_restore today and realized that these, and other applications, likely need some kind of work to support the feature. Right, I will look over the rest of the code and fix any places you missed. I think most of it centers around ESCAPE_STRING_SYNTAX usage. The proper way to do (1) is to call libpq's pqSaveParameterStatus() from psql. Take a look for psql's session_username(). It is called every time the prompt is printed if the username is required. One great feature of using pqSaveParameterStatus() is that it reads server packets and keeps the tracked value updated for you without query overhead. I'll take a look at it. If I feel confident that I get it, I'll do the work and post another patch. Would you prefer that I resend the whole works, or just the delta? I would like the whole patch rather than just an additional one. Again, I will review it and polish whatever you don't do. Also, since we're doing this out of need to fix the issue on our production system, I'm compelled to work on the stable branch. Is it OK to post patches from the tip of that branch, or should I really check out the trunk (HEAD), too, and replicate it there for my patch posts? The branch is fine. I will merge any changes into HEAD. --
  Bruce Momjian                   |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us          |  (610) 359-1001
  + If your life is a hard drive, |  13 Roberts Road
  + Christ can be your backup.    |  Newtown Square, Pennsylvania 19073
---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Persistent error
I believe pgAdmin only supports PostgreSQL 7.3 and above. Chris Flavio Caiuby wrote: Dear hackers, I have downloaded and installed pgAdmin2 (and pgAdmin3, corrected for my Windows 98 Second Edition). When I try to connect to my web server, where I have an AVL program to nurse and inspect, an error message comes up and I cannot proceed with the inspection. The error is: Column "datpath" does not exist. And now? How to proceed??? Please help me. Flavio Caiuby -- São Paulo - Brazil ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Krb5 multiple DB connections
Greetings, I've just run smack-dab into the bug described here: http://archives.postgresql.org/pgsql-interfaces/2002-05/msg00083.php and it's somewhat frustrating since the end of that thread is a reasonably small patch which fixes the problem: http://archives.postgresql.org/pgsql-interfaces/2002-05/msg00083.php Or did for that user, anyway, but it sure looks like it'd solve my problem as well (which is also a case of using mod_auth_krb5, etc., though it breaks even for the same user since apache now, correctly, deletes and recreates the cache with a different filename for each connection). I realize it's not entirely fair (given that it was years ago) to ask this, but does anyone happen to know why the patch wasn't accepted? It almost patched cleanly against current HEAD even. I went ahead and made the few changes by hand that didn't apply cleanly and it compiled just fine; the attached patch is against current HEAD and should apply cleanly. I'd really love to get this fixed. Thanks (off to go implement Tom's suggestion for pg_restore :)! Stephen

? src/interfaces/libpq/.fe-auth.c.rej.swp
Index: src/interfaces/libpq/fe-auth.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-auth.c,v
retrieving revision 1.110
diff -c -r1.110 fe-auth.c
*** src/interfaces/libpq/fe-auth.c	26 Dec 2005 14:58:05 -0000	1.110
--- src/interfaces/libpq/fe-auth.c	3 Feb 2006 02:12:49 -0000
***************
*** 101,122 ****
   * Various krb5 state which is not connection specific, and a flag to
   * indicate whether we have initialised it yet.
   */
  static int	pg_krb5_initialised;
  static krb5_context pg_krb5_context;
  static krb5_ccache pg_krb5_ccache;
  static krb5_principal pg_krb5_client;
  static char *pg_krb5_name;
  
  static int
! pg_krb5_init(char *PQerrormsg)
  {
  	krb5_error_code retval;
! 
  	if (pg_krb5_initialised)
  		return STATUS_OK;
! 
  	retval = krb5_init_context(&pg_krb5_context);
  	if (retval)
  	{
  		snprintf(PQerrormsg, PQERRORMSG_LENGTH,
--- 101,133 ----
   * Various krb5 state which is not connection specific, and a flag to
   * indicate whether we have initialised it yet.
   */
+ /*
  static int	pg_krb5_initialised;
  static krb5_context pg_krb5_context;
  static krb5_ccache pg_krb5_ccache;
  static krb5_principal pg_krb5_client;
  static char *pg_krb5_name;
+ */
+ 
+ struct krb5_info
+ {
+ 	int			pg_krb5_initialised;
+ 	krb5_context	pg_krb5_context;
+ 	krb5_ccache	pg_krb5_ccache;
+ 	krb5_principal	pg_krb5_client;
+ 	char		*pg_krb5_name;
+ };
  
  static int
! pg_krb5_init(char *PQerrormsg, struct krb5_info *info)
  {
  	krb5_error_code retval;
! 
  	if (info->pg_krb5_initialised)
  		return STATUS_OK;
! 
  	retval = krb5_init_context(&(info->pg_krb5_context));
  	if (retval)
  	{
  		snprintf(PQerrormsg, PQERRORMSG_LENGTH,
***************
*** 125,170 ****
  		return STATUS_ERROR;
  	}
  
! 	retval = krb5_cc_default(pg_krb5_context, &pg_krb5_ccache);
  	if (retval)
  	{
  		snprintf(PQerrormsg, PQERRORMSG_LENGTH,
  				 "pg_krb5_init: krb5_cc_default: %s\n",
  				 error_message(retval));
! 		krb5_free_context(pg_krb5_context);
  		return STATUS_ERROR;
  	}
  
! 	retval = krb5_cc_get_principal(pg_krb5_context, pg_krb5_ccache,
! 								   &pg_krb5_client);
  	if (retval)
  	{
  		snprintf(PQerrormsg, PQERRORMSG_LENGTH,
  				 "pg_krb5_init: krb5_cc_get_principal: %s\n",
  				 error_message(retval));
! 		krb5_cc_close(pg_krb5_context, pg_krb5_ccache);
! 		krb5_free_context(pg_krb5_context);
  		return STATUS_ERROR;
  	}
  
! 	retval = krb5_unparse_name(pg_krb5_context, pg_krb5_client, &pg_krb5_name);
  	if (retval)
  	{
  		snprintf(PQerrormsg, PQERRORMSG_LENGTH,
  				 "pg_krb5_init: krb5_unparse_name: %s\n",
  				 error_message(retval));
! 		krb5_free_principal(pg_krb5_context, pg_krb5_client);
! 		krb5_cc_close(pg_krb5_context, pg_krb5_ccache);
! 		krb5_free_context(pg_krb5_context);
  		return STATUS_ERROR;
  	}
  
! 	pg_krb5_name = pg_an_to_ln(pg_krb5_name);
  
! 	pg_krb5_initialised = 1;
  	return STATUS_OK;
  }
  
  /*
   * pg_krb5_authname -- returns a pointer to static space containing whatever
--- 136,191 ----
  		return STATUS_ERROR;
  	}
  
! 	retval = krb5_cc_default(info->pg_krb5_context, &(info->pg_krb5_ccache));
  	if
Re: [HACKERS] Krb5 multiple DB connections
* Stephen Frost ([EMAIL PROTECTED]) wrote: and it's somewhat frustrating since the end of that thread is a reasonably small patch which fixes the problem: http://archives.postgresql.org/pgsql-interfaces/2002-05/msg00083.php Erp, not quite sure how I managed that, the end of the thread is *here*: http://archives.postgresql.org/pgsql-interfaces/2002-05/msg00090.php Sorry about that. Stephen
Re: [HACKERS] Multiple logical databases
Mark Woodward wrote: From an administration perspective, a single point of admin would seem like a logical and valuable objective, no? I don't understand why you are going out of your way to separate your databases (for misinformed reasons, it appears) and then want to design a way to centrally control them so they can all fail together. Oh come on, misinformed? Is that really called for? Think about a website that (and I have one) has the U.S.A. Streetmap database, the freedb CD database, and a slew of sites based on phpbb and drupal. Maybe one should put them all in one database cluster, but... The street database is typically generated and QAed in the lab. It is then uploaded to the server. It has many millions of rows and about a half dozen indexes. To dump and reload takes almost a day. Compressing the DB and uploading it into the site, uncompressing it, stopping the current postgresql process, swapping the data directory, and restarting it can be done in about an hour. One cannot do this if the street map database is part of the standard database cluster. The same thing happens with the freedb database. Unless you can tell me how to insert live data and indexes into a cluster without having to reload the data and recreate the indexes, then I hardly think I am misinformed. The ad hominem attack wasn't necessary. I have no problem with disagreement, but I take exception to insult. If no one sees a way to manage multiple physical database clusters as one logical cluster as something worth doing, then so be it. I have a practical example of a valid reason how this would make PostgreSQL easier to work with. Yes, there are workarounds. Yes, it is workable currently. It is just that it could be better. As I mentioned earlier, I have been dealing with this sort of problem for a number of years now, and I think this is the cool solution to the problem. 
---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Where to execute the compiled psql
Greetings, I am writing some plugin for pq. After running ./configure and make, shall I just go to src/bin/psql to execute my executable psql -U (username) dbname? It seems that my updates in the file parse_expr.c are not reflected in this executable? For example, I changed one line from

    if (NULL == tup)
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_OBJECT),
                 errmsg("key value not found")));

to

    if (NULL == tup)
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_OBJECT),
                 errmsg("key value not found while building relation")));

But the program still reports "key value not found" when it runs. Best wishes, John ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Passing arguments to views
Chris Campbell [EMAIL PROTECTED] writes: What do you think? Is this an interesting feature? Is this the right way to go about it, or should I try to get the planner to see through SQL function boundaries The right way to go about this in the original abstract set-theoretic mindset of SQL is to code the view to retrieve all the rows and then apply further WHERE clause restrictions to the results of the view. So for example this: CREATE VIEW sales_figures($1, $2) AS SELECT ... FROM ... WHERE purchase_date BETWEEN $1 AND $2; Becomes: CREATE VIEW sales_figures AS SELECT ... FROM ... And then you query it with SELECT * FROM sales_figures WHERE purchase_date BETWEEN $1 AND $2 sales_figures could have any number of joins and complex where clauses built-in. It could even be an aggregate grouped by some column (like purchase_date). This relies on the SQL optimizer to push the WHERE clause down into the view to the appropriate depth. Postgres isn't always capable of doing so but it does a pretty decent job. -- greg ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
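Greg's rewrite, spelled out end to end as a sketch (table and column names hypothetical):

```sql
-- The view retrieves all rows; the restrictions live in the caller.
CREATE VIEW sales_figures AS
    SELECT customer_id, purchase_date, amount
    FROM sales;

-- Because views are expanded inline, the planner can push this WHERE
-- clause down into the view's underlying scan:
SELECT *
FROM sales_figures sf
JOIN customers c ON (sf.customer_id = c.customer_id)
WHERE sf.purchase_date BETWEEN '2005-08-22' AND '2005-09-14';
```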
Re: [HACKERS] Passing arguments to views
On Feb 2, 2006, at 23:33, Greg Stark wrote: The right way to go about this in the original abstract set-theoretic mindset of SQL is to code the view to retrieve all the rows and then apply further WHERE clause restrictions to the results of the view. So for example this: CREATE VIEW sales_figures($1, $2) AS SELECT ... FROM ... WHERE purchase_date BETWEEN $1 AND $2; Becomes: CREATE VIEW sales_figures AS SELECT ... FROM ... And then you query it with SELECT * FROM sales_figures WHERE purchase_date BETWEEN $1 AND $2 That was a very simplistic example and didn't illustrate my point -- I apologize. I was trying to think of something succinct and illustrative for a quick mailing list post but came up short. Maybe a better example would be a situation where you want to do substitutions in places other than the WHERE clause? There's no way to push that out to the calling query. But even in this simple case, the easier-to-grok syntax of making a view look like a function (and codifying the options for restricting the results as arguments to the view) is a nice win in terms of readability and maintainability. I was hoping that people would overlook my bad example because they've felt the need for a view-with-arguments tool in their own work, and the conversation would just be about how it could be implemented. :) I'll try to distill a better example from some of the projects I'm working on. Thanks! - Chris ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Persistent error
Flavio Caiuby [EMAIL PROTECTED] writes: The error is: Column "datpath" does not exist. You need a newer version of pgadmin --- pg_database.datpath went away in PG 8.0, but older versions of pgadmin don't know about that. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Where to execute the compiled psql
John wrote: I am writing some plugin for pq. After running ./configure and make, shall I just go to src/bin/psql to execute my executable psql -U (username) dbname? It seems that my updates in the file parse_expr.c are not reflected in this executable? Evidently you're not (only) writing a plugin but changing the source code. In that case, follow the normal installation instructions. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Passing arguments to views
Chris Campbell schrieb: ... That was a very simplistic example and didn't illustrate my point -- I apologize. I was trying to think of something succinct and illustrative for a quick mailing list post but came up short. Maybe a better example would be a situation where you want to do substitutions in places other than the WHERE clause? There's no way to push that out to the calling query. But even in this simple case, the easier-to-grok syntax of making a view look like a function (and codifying the options for restricting the results as arguments to the view) is a nice win in terms of readability and maintainability. Well, if the view does not suit your needs, why don't you use a set-returning function instead? Inside it you can do all the magic you want and still use it similarly to a table or view. Regards Tino ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
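A sketch of the set-returning-function idea Tino suggests (table and column names hypothetical); unlike a view, it can parameterize parts of the query other than the WHERE clause, such as a LIMIT:

```sql
CREATE FUNCTION recent_sales(date, integer)
RETURNS SETOF sales AS $$
    SELECT *
    FROM sales
    WHERE purchase_date >= $1
    ORDER BY purchase_date DESC
    LIMIT $2    -- a substitution a plain view cannot accept
$$ LANGUAGE sql STABLE;

-- Called like a table or view:
SELECT * FROM recent_sales('2006-01-01', 10);
```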
Re: [HACKERS] Multiple logical databases
Mark Woodward schrieb: ... Unless you can tell me how to insert live data and indexes into a cluster without having to reload the data and recreate the indexes, then I hardly think I am misinformed. The ad hominem attack wasn't necessary. I see you had a use case for something like pg_diff and pg_patch ;) ... If no one sees a way to manage multiple physical database clusters as one logical cluster as something worth doing, then so be it. I have a practical example of a valid reason how this would make PostgreSQL easier to work with. Yes, there are workarounds. Yes, it is workable currently. I don't see your problem, really ;) 1) if you have very big and very workloaded databases, you often have them on different physical boxes anyway 2) you can run any number of postmasters on the same box - just set them to listen on different ip:port combinations. Now to the management - you say cddb and geodb are managed off-host. So they are not managed on the live server, and so you don't need to switch your psql console to them. And yeah, it's really not a problem to quit psql and connect to a different server anyway :-) If you don't like to type -p otherport, you can either create aliases with all the arguments or use something like pgAdmin3, which enables you to easily switch from database to database, from host to host as you like. Now is there any use case I have missed which you still would like to have addressed? Kind regards Tino Wildenhain ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org