Re: [HACKERS] New XML section for documentation
On 8/26/06, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Bruce Momjian wrote:
> > Validation
> > ----------
> > /contrib/xml2 has a function called xml_valid() that can be used in
> > a CHECK constraint to enforce that a field contains valid XML. It
> > does not support validation against a specific XML schema.
>
> Then this is not validation but only checking for well-formedness.
> The xml2 README says so, in fact.

Exactly. contrib/xml2 mixes up the terms here; xml_valid() should be a
different function, one that takes two kinds of data, an XML value and a
corresponding XML schema, and validates the XML data against the schema.
In fact, the latest version of the SQL/XML standard includes such a
function (XMLVALIDATE).

If you decide to mention contrib/xml2 in the docs, I can propose a patch
for this module. The patch would rename that function to xml_check() and
add xml_array() (an item from the current TODO). Or is it too late for 8.2?

Also, I would add a short introduction to XML terms (from the XML
standards) to this documentation section.

--
Best regards,
Nikolay

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match
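The distinction drawn above, that xml_valid() only tests well-formedness rather than validity against a schema, can be sketched outside the database. This is a hypothetical helper using Python's stdlib parser purely as an illustration; it is not the contrib/xml2 implementation.

```python
import xml.etree.ElementTree as ET

def is_well_formed(doc: str) -> bool:
    """Roughly what contrib/xml2's xml_valid() checks: the document parses."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# Well-formed, so an xml_valid()-style check accepts it, even though a
# schema requiring, say, an <author> element would reject it:
doc = "<book><title>PostgreSQL</title></book>"
print(is_well_formed(doc))               # True
print(is_well_formed("<book><title>"))   # False: unclosed tags
```

True schema validation (what XMLVALIDATE describes) needs the second input, the schema, which a well-formedness check like this never consults.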
Re: [Open Item] Re: [HACKERS] Autovacuum on by default?
Jim C. Nasby wrote:
> I thought we had agreed it would be a good idea to turn autovac_delay
> on?

We had not, because there was no experience available about where to put
the default numbers.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] New XML section for documentation
Bruce Momjian wrote:
> XML Document Support
>
> XML support is not one capability, but a variety of features
> supported by a database.

database system

> Storage
> -------
> PostgreSQL stores XML documents as ordinary text documents.

It is "possible" to do that, but this sounds like it's done automatically
or implicitly. Maybe: "PostgreSQL does not have a specialized XML data
type. The recommended way is to store XML documents as text."

> Import/Export
> -------------
> Because XML documents are stored as normal text documents, they can
> be imported/exported with little complexity.

Import/export refers to exporting schema data with XML decorations. Of
course you can export column data trivially, but that's not what this is
about.

> Validation
> ----------
> /contrib/xml2 has a function called xml_valid() that can be used in
> a CHECK constraint to enforce that a field contains valid XML. It
> does not support validation against a specific XML schema.

Then this is not validation but only checking for well-formedness. The
xml2 README says so, in fact.

> Indexing

I think the expression index capability combined with contrib/xml2 is
more relevant here than the full-text search capability.

> Transforming
>
> /contrib/xml2 supports XSL transformations.

That's XSLT.

> XML to SQL Mapping
> ------------------
> This involves converting XML data to and from relational structures.
> PostgreSQL has no internal support for such mapping, and relies on
> external tools to do such conversions.

Are there instances of such tools?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] [DOCS] New XML section for documentation
> Indexing
>
> Because XML documents are stored as text, full-text indexing tool
> /contrib/tsearch2 can be used to index XML documents. Of
> course, the searches are text searches, with no XML
> awareness, but tsearch2 can be used with other XML
> capabilities to dramatically reduce the amount of data
> processed at the XML level.

You can also use a functional index and /contrib/xml2 to do limited
XPath indexing. (You can't make it "subtree-aware", for example, unless
you are willing to change your queries, but you can index specific
XPath nodes.)

//Magnus
Re: [HACKERS] Coding style for emacs
I am still waiting for an actual patch to the developer's FAQ for this.

---------------------------------------------------------------------------

Gregory Stark wrote:
> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>
> > I use more or less what is in the developers' FAQ (not surprising, since I
> > contributed it). It works just fine for me. See
> > http://www.postgresql.org/docs/faqs.FAQ_DEV.html#item1.9
> >
> > We should probably bring the docs in line with that, unless someone with
> > more emacs-fu than me has something to add.
>
> Oh, I've been meaning to email about this. There's a file in the source tree
> but it's buggy. The code in your FAQ seems ok, but I prefer the way I've done
> it:
>
> (add-hook 'c-mode-hook
>           (function
>            (lambda nil
>              (if (string-match "pgsql" buffer-file-name)
>                  (progn
>                    (c-set-style "bsd")
>                    (setq c-basic-offset 4)
>                    (setq tab-width 4)
>                    (c-set-offset 'case-label '+)
>                    (setq indent-tabs-mode t))))))
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Is there anything to do for 8.2 here?

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> This is additional information.
>
> I wrote:
> > If we want to resolve the problem fundamentally, we might have to
> > improve SubTrans using a better buffer management algorithm or so.
>
> The above is maybe wrong. I checked each lwlock of pg_subtrans's buffers.
> All lwlocks are uniformly acquired and I could not see any differences
> among buffers. So the cause seems not to be the buffer management
> algorithm, but simply a lack of SLRU buffer pages.
>
> NUM_SUBTRANS_BUFFERS is defined as 32 in HEAD. If we increase it,
> we can avoid the old-transaction problem for a certain time.
> However, it doesn't help much under high load -- for example, on a
> workload of 2000 tps, we will use up 1000 pg_subtrans pages in 15
> minutes. I suppose that is not enough for a mix of online and
> batch/maintenance work.
>
> Also, the simple scanning in SLRU will likely cause another performance
> issue if we greatly increase the number of buffers. SLRU uses a
> sequential scan, so it will not work well with many buffers.
>
> I hope for some care at an upper layer -- snapshots, hint bits, or
> something -- as discussed in the recent thread.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] pgcrypto deprecated functions?
Michael Fuhr wrote:
> In README.pgcrypto, Section 2.3 "Deprecated functions" says that
> digest_exists(), hmac_exists(), and cipher_exists() are planned to
> be removed in PostgreSQL 8.2. Those functions still exist -- should
> they be removed or does that section need updating?

Yes, I see this text:

    The `digest_exists()`, `hmac_exists()` and `cipher_exists()`
    functions are deprecated. The plan is to remove them in
    PostgreSQL 8.2.

Would someone address this?

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] VALUES clause memory optimization
Has this been addressed?

---------------------------------------------------------------------------

Joe Conway wrote:
> Tom Lane wrote:
> > The reason we could safely list_free inside transformInsertRow is that
> > we know all its callers have just built the passed-in list and so there
> > are no other pointers to it. That doesn't apply in the general case of
> > grammar output.
>
> What about for the specific case of an InsertStmt? It seems that we
> could at least get away with freeing the raw-expression list in that case.
>
> In terms of freeing an entire arbitrary node, could we create a
> backend/nodes/freefuncs.c file that does a recursive freeObject()
> similar to the way copyObject() does in backend/nodes/copyfuncs.c?
>
> > My advice is to get that low-hanging fruit
> > in transformInsertRow and leave the other ideas for 8.3.
>
> OK. This should be safe also, correct?
>
> Thanks,
>
> Joe
>
> 8<-------------------------------------------------------------
> diff -c -r1.341 analyze.c
> *** src/backend/parser/analyze.c    2 Aug 2006 01:59:46 -    1.341
> --- src/backend/parser/analyze.c    2 Aug 2006 05:13:20 -
> ***************
> *** 2191,2196 ****
> --- 2196,2202 ----
>       for (i = 0; i < sublist_length; i++)
>       {
>           coltypes[i] = select_common_type(coltype_lists[i], "VALUES");
> +         list_free(coltype_lists[i]);
>       }
>
>       newExprsLists = NIL;

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Further reduction of bufmgr lock contention
Is this being kept for 8.3?

---------------------------------------------------------------------------

Tom Lane wrote:
> A couple months ago we were discussing partitioning the buffer mapping
> table so as to reduce contention for the BufMappingLock. The discussion
> stalled after I complained about the uncertainty of shared memory usage
> arising from having a separate hashtable for each partition (since
> different numbers of buffers might fall into each partition at different
> times). I've been thinking about it more after seeing an example from
> Tatsuo that seems to be exactly the same problem as Gavin saw.
>
> We could fix the uncertain-usage objection if we stick with a single
> hashtable and ensure that concurrent access to different partitions of
> it is safe. I believe this is easily doable, if we make hashtables
> used in this mode allocate the maximum allowed number of buckets
> immediately during hashtable initialization. (Since shared hashtables
> already have a fixed maximum directory size, they already have an upper
> bound on the number of buckets, so this loses no flexibility.) Then
> there will be no on-the-fly bucket splits, and that means that accesses
> to different hash buckets operate completely independently. Therefore,
> we can regard the hashtable as logically partitioned on the basis of any
> classification of entries that will uniquely assign hash buckets to
> classes --- taking the low order bits of entry hash codes will do fine.
>
> The only changeable state that is shared across all buckets is the entry
> freelist and the "nentries" counter. We could protect these with a
> spinlock (one spinlock is sufficient since changes to nentries go along
> with addition or removal of freelist entries).
>
> Usage of a partitioned hash table would then be like
>
>     compute hashtable lookup key;
>     entryhashcode = calc_hash(lookup key);
>     partitionnumber = entryhashcode % NumPartitions;
>     LWLockAcquire(PartitionLock[partitionnumber], ...);
>     manipulate hashtable;
>     LWLockRelease(PartitionLock[partitionnumber]);
>
> We could do this without changing the API of hash_search, but then we'd
> be computing the entry hashcode twice, so I'm inclined to provide an
> additional entry point that takes a precalculated hashcode.
>
> Potential downsides of applying this idea to the buffer mapping table:
>
> 1. Reassigning a buffer to a new page will (usually) require two cycles
> of LWLockAcquire/Release for the two different partitions involved.
> Since this path also requires at least a read() kernel call (maybe a
> write() too), I don't think there'll be any meaningful slowdown.
>
> 2. The current logic for reassigning a buffer attempts to make a
> hashtable entry for its new page number (to detect collisions) before
> releasing the old hashtable entry. This would only be safe if we held
> both partition LWLocks concurrently; which seems bad for concurrency,
> plus avoiding deadlock would complicate the code significantly. I'm
> inclined to release the old entry and then try to insert the new one,
> holding only one lock at a time. If the insertion fails (because
> someone was concurrently loading the same page), we'd have to throw the
> buffer onto the freelist rather than allowing it to retain its previous
> valid data. Which is slightly annoying, but in practice the case
> probably doesn't happen often enough to be worth worrying about.
>
> 3. Taking the freelist spinlock is new computation that wasn't there
> before. But, again, it's only needed in code paths that will also be
> doing a kernel call.
>
> If we do this we should probably also handle the lmgr lock tables the
> same way (partially reverting my partition-the-LockMgrLock patch of a
> couple months ago). However, downside #3 might be a stronger objection
> for lmgr, since it can create or destroy lock objects without
> necessarily doing any I/O.
>
> Thoughts, objections?
>
> regards, tom lane

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
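The partitioning scheme in Tom's pseudocode can be sketched in executable form. This is an illustrative model only (NUM_PARTITIONS, partition_locks, and the helper names are invented for the sketch, and Python locks stand in for LWLocks); it is not PostgreSQL's shared hash table code.

```python
import threading

NUM_PARTITIONS = 16  # a power of two, so low-order hash bits pick the partition
partition_locks = [threading.Lock() for _ in range(NUM_PARTITIONS)]
buckets = {}         # one shared hash table, as in the proposal

def partition_of(key):
    # "taking the low order bits of entry hash codes will do fine"
    return hash(key) % NUM_PARTITIONS

def lookup_or_insert(key, value):
    # Lock only the partition that owns this hash bucket; operations on
    # keys falling in other partitions proceed concurrently.
    p = partition_of(key)
    with partition_locks[p]:
        return buckets.setdefault(key, value)
```

Note how downside #1 falls out of the model: moving a buffer from one page to another touches two keys that usually hash to different partitions, so it costs two acquire/release cycles rather than one.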
Re: [HACKERS] New XML section for documentation
Bruce Momjian wrote:
> Here is a new XML section for our SGML documentation. It explains the
> various XML capabilities, whether we support them, and how to use them.
>
> Comments?

+1. Users often ask about this on the mailing lists. Where do you want
to put it? I'd suggest the FAQ. What do you all think?

> Missing Features
>
> o XQuery
> o SQL/XML syntax (ISO/IEC 9075-14)
> o XML data type optimized for XML storage

Another section in TODO?

--
Euler Taveira de Oliveira
http://www.timbira.com/
Re: [HACKERS] integration of pgcluster into postgresql
On Fri, Aug 25, 2006 at 05:40:09AM -0700, Chahine Hamila wrote:
> Hi there,
>
> I guess many - if not most - here have tried
> pgcluster. For those who didn't, postgresql is pretty

I think you need to guess again. :)

> much the equivalent of pgcluster configured without
> a load balancer or replicator, in read-write standalone
> mode. From a user's point of view, that's at most three
> additional configuration files, which can be set to
> default values and distributed as-is in the standard
> distribution (making installs/upgrades transparent to
> non-cluster environments). From a developer's point of
> view, the pgcluster code is quite easy to get a hold of.
>
> pgcluster still has quite a few pending issues, but
> it's good enough for many users. Integrating it as
> part of the standard postgresql distribution would
> likely not disrupt standard postgresql functioning,
> while giving it the replication features it lacks as
> is. It's also likely to accelerate its maturing through
> more widespread adoption, and as a result overcome most
> of its issues.
>
> If the idea of its integration into the main postgresql
> code is of any interest to the postgresql team, I'm
> willing to invest some effort in it.

First, you need to review all the past discussion about the very
intentional decision not to build any replication into the core
database.

Second, pgcluster is (AFAIK) command-based replication, which has some
very, very serious drawbacks. If PostgreSQL were to include a
replication solution, I'd certainly hope it wouldn't be command-based.

Finally, pgcluster is very out-of-date. The last version uses 8.0.1 and
was released on Mar. 7, 2005. If the author can't find the time to
maintain it, I don't see why that burden should be put on the shoulders
of this community.

--
Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com   work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
Re: [HACKERS] New XML section for documentation
Euler Taveira de Oliveira wrote:
> Bruce Momjian wrote:
>
> > Here is a new XML section for our SGML documentation. It explains the
> > various XML capabilities, whether we support them, and how to use them.
> >
> > Comments?
>
> +1. Users often ask about this on the mailing lists. Where do you want
> to put it? I'd suggest the FAQ. What do you all think?

Our main documentation. Once it is there, people will find it there
rather than in the FAQ.

> > Missing Features
> >
> > o XQuery
> > o SQL/XML syntax (ISO/IEC 9075-14)
> > o XML data type optimized for XML storage
>
> Another section in TODO?

Perhaps, yea.

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Tricky bugs in concurrent index build
On Fri, Aug 25, 2006 at 11:25:43AM -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > The problem is that what the qualifier is doing is modifying the
> > operation itself, not the properties of the index to be created, like
> > UNIQUE, which modifies the index.
>
> Right, which was the same point Bruce made earlier. And really you
> can't respect that difference while putting them into the same place in
> the word order. So I'm starting to feel like maybe we should leave
> well enough alone.

Since we might eventually have other 'concurrent commands', perhaps
CONCURRENT CREATE INDEX ... would be best.

BTW, if we started to consider lazy vacuum a concurrent command we could
ditch the use of FULL, which is always confusing if you're talking about
database-wide vacuums. I know it'd take many versions to fully make that
change, but it seems worth it to me to reduce confusion. There's also
the issue of newbies thinking they should use VACUUM FULL regularly
because it's somehow better than lazy vacuum.

--
Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com   work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
Re: [HACKERS] Tricky bugs in concurrent index build
On Fri, Aug 25, 2006 at 06:57:58PM +0100, Gregory Stark wrote:
> I'll use this opportunity to plug that feature again. I think most people
> should use autocommit off with on_error_rollback on for most of their daily
> use. Being able to double-check the results of my ad-hoc updates before
> committing them saved me more headaches than I can count with Oracle.
> Autocommit off only became practical for interactive use with postgres when
> savepoints showed up.

+1

--
Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com   work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
Re: [HACKERS] default child of partition master
On Sun, Aug 20, 2006 at 12:58:31PM -0400, Gene wrote:
> Keep in mind that if you have multiple rules on a master table, it won't
> return the number of affected rows you might expect. This screws up
> Hibernate, which I'm using for my application; it checks the return
> value to make sure the row was inserted properly. Luckily I only need
> one rule, which puts rows into the "current" child table (I'm
> partitioning on the current timestamp). I suppose I could get around
> this by using a stored procedure or something, but that would not be as
> portable. I'm looking forward to future versions of PG which automate
> more of the partitioning features :) keep up the good work!

Sounds like a bug in Hibernate. It should be checking for errors instead.

--
Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com   work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
Re: [Open Item] Re: [HACKERS] Autovacuum on by default?
On Fri, Aug 25, 2006 at 12:16:33PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > "Matthew T. O'Connor" writes:
> > > Peter Eisentraut wrote:
> > >> - Leave base thresholds alone (pending further analysis that might
> > >> remove them altogether?)
> > >
> > > While there is talk of removing this altogether, I think it was also
> > > agreed that as long as these values are there, they should be reduced.
> > > I think the defaults in 8.1 are 1000/500; I think 200/100 was suggested.
> >
> > ISTM that if we don't want to remove the thresholds immediately,
> > we should make them default to zero for a release or two and see how
> > well it works.
> >
> > At the moment I can't find the thread that discussed removing them,
> > but IIRC there were some good arguments why the thresholds should
> > always be zero.
>
> I can't find it either, but I think the bug reported here is related:
>
> http://archives.postgresql.org/pgsql-general/2006-06/thrd2.php#00951
>
> On the other hand, I don't think we completely resolved this, so I
> proposed this be added to the "Open Items" list.

Yeah, I think there are reasons we can't go to zero. 200/100 or even
20/10 would probably be a good compromise. I agree that dropping to 0.08
might be a bit much, but it would be good if we started recommending
that value to folks to see how well it works.

I thought we had agreed it would be a good idea to turn autovac_delay
on? I know there was a question as to what a good value would be, but
5-10ms seems pretty reasonable. I think it'd also be good to up the cost
threshold and the dirty-page cost, though I don't have much data to back
that up (I did testing at one customer on a drive array and found 300
and 30 were good values).

If we've got command stats turned on by default now, I'll have a hard
time buying performance as any reason to turn the others off. I think we
should turn them all on and let those who are trying to eke the last few
percent of performance out of a system turn them off.

--
Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com   work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
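The arithmetic behind the threshold debate above is worth making concrete. Autovacuum triggers a vacuum when dead tuples exceed base_threshold + scale_factor * reltuples; the sketch below uses the 1000 base and 0.4 scale factor mentioned in the thread as 8.1-era defaults (treat the exact numbers as the thread's claim, not gospel).

```python
def vacuum_trigger(base_threshold: int, scale_factor: float, reltuples: int) -> float:
    """Dead tuples needed before autovacuum fires on a table of reltuples rows."""
    return base_threshold + scale_factor * reltuples

# With a base of 1000, a 100-row table needs 1040 dead rows before it is
# vacuumed -- in practice it never is. Lowering the base to 200 (or 0)
# is what lets small, busy tables get vacuumed at all.
print(vacuum_trigger(1000, 0.4, 100))  # 1040.0
print(vacuum_trigger(200, 0.4, 100))   # 240.0
```

For large tables the base threshold is noise next to the scale-factor term, which is why the argument centers on small tables.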
Re: [DOCS] [HACKERS] New XML section for documentation
David Fetter wrote:
> On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:
> > Here is a new XML section for our SGML documentation. It explains
> > the various XML capabilities, whether we support them, and how to
> > use them.
> >
> > Comments?
>
> This looks hauntingly similar to Peter's presentation at the

I used the SQL/XML and validation part from his talk, but the rest was
from earlier email discussions.

> conference. :) I'd add http://wiscorp.com/SQLStandards.html to the

This seems to be the best URL, but it seems too detailed:

http://wiscorp.com/H2-2005-197-SC32N1293-WG3_Presentation_for_SC32_20050418.pdf

> reference section.
>
> Speaking of other parts of the SQL:2003 standard, how about one
> section each that mentions them? There's
>
> Part 4: SQL/PSM (Persistent Stored Modules)
> Part 9: SQL/MED (Management of External Data) (my favorite)
> Part 10: SQL/OLB (Object Language Binding)
> Part 11: SQL/Schemata
> Part 13: SQL/JRT (Java Routines and Types)

I don't know anything about them.

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] New XML section for documentation
On Fri, Aug 25, 2006 at 07:46:57PM -0400, Bruce Momjian wrote:
> Here is a new XML section for our SGML documentation. It explains
> the various XML capabilities, whether we support them, and how to use
> them.
>
> Comments?

This looks hauntingly similar to Peter's presentation at the
conference. :) I'd add http://wiscorp.com/SQLStandards.html to the
reference section.

Speaking of other parts of the SQL:2003 standard, how about one section
each that mentions them? There's

Part 4: SQL/PSM (Persistent Stored Modules)
Part 9: SQL/MED (Management of External Data) (my favorite)
Part 10: SQL/OLB (Object Language Binding)
Part 11: SQL/Schemata
Part 13: SQL/JRT (Java Routines and Types)

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778   AIM: dfetter666   Skype: davidfetter

Remember to vote!
[HACKERS] New XML section for documentation
Here is a new XML section for our SGML documentation. It explains the
various XML capabilities, whether we support them, and how to use them.

Comments?

---------------------------------------------------------------------------

XML Document Support

XML support is not one capability, but a variety of features supported
by a database. These capabilities include storage, import/export,
validation, indexing, efficiency of modification, searching,
transforming, and XML to SQL mapping. PostgreSQL supports some but not
all of these XML capabilities. Future releases of PostgreSQL will
continue to improve XML support.

Storage
-------
PostgreSQL stores XML documents as ordinary text documents. It does not
split XML documents apart into their component parts and store each
element separately. You can use middleware solutions to do that, but
once done, the data becomes relational and has to be processed
accordingly.

Import/Export
-------------
Because XML documents are stored as normal text documents, they can be
imported/exported with little complexity. A simple TEXT field can hold
up to 1 gigabyte of text, and large objects are available for larger
documents.

Validation
----------
/contrib/xml2 has a function called xml_valid() that can be used in a
CHECK constraint to enforce that a field contains valid XML. It does
not support validation against a specific XML schema. A server-side
language with XML capabilities could be used to do schema-specific XML
checks.

Indexing
--------
Because XML documents are stored as text, the full-text indexing tool
/contrib/tsearch2 can be used to index XML documents. Of course, the
searches are text searches, with no XML awareness, but tsearch2 can be
used with other XML capabilities to dramatically reduce the amount of
data processed at the XML level.

Modification
------------
If an UPDATE does not modify an XML field, the XML data is shared
between the old and new rows. However, if the UPDATE modifies an XML
field, a complete, modified copy of the XML field must be created
internally.

Searching
---------
XPath searches are implemented using /contrib/xml2. It processes XML
text documents and returns results based on the requested query.

Transforming
------------
/contrib/xml2 supports XSL Transformations (XSLT).

XML to SQL Mapping
------------------
This involves converting XML data to and from relational structures.
PostgreSQL has no internal support for such mapping, and relies on
external tools to do such conversions.

Missing Features
----------------
o XQuery
o SQL/XML syntax (ISO/IEC 9075-14)
o XML data type optimized for XML storage

See also http://www.rpbourret.com/xml/XMLAndDatabases.htm

--
Bruce Momjian [EMAIL PROTECTED]
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
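The "Searching" capability above can be illustrated outside the database: contrib/xml2's xpath functions evaluate expressions like the one below over XML kept in a text column. This sketch uses Python's stdlib ElementTree and its limited XPath subset purely for illustration; it is not the contrib/xml2 code path.

```python
import xml.etree.ElementTree as ET

# Stand-in for an XML document stored in a TEXT column:
stored_xml = """
<library>
  <book><title>SQL/XML</title><year>2003</year></book>
  <book><title>XQuery Basics</title><year>2005</year></book>
</library>
"""

root = ET.fromstring(stored_xml)
# Equivalent in spirit to evaluating the XPath /library/book/title
titles = [t.text for t in root.findall("./book/title")]
print(titles)  # ['SQL/XML', 'XQuery Basics']
```

Combined with full-text search as described under "Indexing", a query would first narrow the candidate rows with tsearch2 and only then run XPath extraction like this over the survivors.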
Re: [HACKERS] [GENERAL] invalid byte sequence ?
Tom Lane wrote:
> Martijn van Oosterhout writes:
> > And as a counter-example: pg_dump should absolutely not use the client
> > locale; it should always dump in the same encoding as the server...
>
> Sure, but pg_dump should set that explicitly. I'm prepared to believe
> that looking at the locale is sane for all normal clients.

What are "normal clients"? I would think that programs in PHP or Perl
have their own idea of the correct encoding (JDBC already has one).

> It might be worth providing a way to set the client_encoding through a
> PQconnectdb connection-string keyword, just in case the override-via-
> PGCLIENTENCODING dodge doesn't suit someone. The priority order
> would presumably be connection string, then PGCLIENTENCODING, then
> locale.

This sounds like a good idea anyway...

--
Alvaro Herrera   http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [GENERAL] invalid byte sequence ?
Martijn van Oosterhout writes:
> And as a counter-example: pg_dump should absolutely not use the client
> locale; it should always dump in the same encoding as the server...

Sure, but pg_dump should set that explicitly. I'm prepared to believe
that looking at the locale is sane for all normal clients.

It might be worth providing a way to set the client_encoding through a
PQconnectdb connection-string keyword, just in case the override-via-
PGCLIENTENCODING dodge doesn't suit someone. The priority order would
presumably be connection string, then PGCLIENTENCODING, then locale.

regards, tom lane
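The priority order Tom proposes is simple enough to state as code. This is an illustrative sketch only; the function name and dict-based inputs are invented for the example and are not libpq's actual API.

```python
def choose_client_encoding(conninfo: dict, env: dict, locale_encoding: str) -> str:
    # Proposed precedence: connection string, then PGCLIENTENCODING,
    # then the locale-derived encoding as the fallback.
    if "client_encoding" in conninfo:
        return conninfo["client_encoding"]
    if "PGCLIENTENCODING" in env:
        return env["PGCLIENTENCODING"]
    return locale_encoding

# The connection-string keyword wins even when the environment is set:
print(choose_client_encoding({"client_encoding": "UTF8"},
                             {"PGCLIENTENCODING": "LATIN1"},
                             "SQL_ASCII"))  # UTF8
```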
Re: [HACKERS] [GENERAL] invalid byte sequence ?
On Fri, Aug 25, 2006 at 08:13:39PM +0200, Peter Eisentraut wrote:
> > I agree with Martijn that putting these into libpq's API
> > seems like useless clutter.
>
> Where else to put it? We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Is that so? I thought we were only talking about psql. Even then, I'm
wondering if we should alter the current behaviour at all if stdout is
not a tty (i.e. run as a pipe).

And as a counter-example: pg_dump should absolutely not use the client
locale; it should always dump in the same encoding as the server...

Have a nice day,
--
Martijn van Oosterhout   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
Re: [HACKERS] [GENERAL] invalid byte sequence ?
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
> > I agree with Martijn that putting these into libpq's API
> > seems like useless clutter.
>
> Where else to put it? We need it in libpq anyway if we want this
> behavior in all client applications (by default).

Having the code in libpq doesn't necessarily mean exposing it to the
outside world. I can't see a reason for these to be in the API at all.

Possibly we could avoid the duplication-of-source-code issue by putting
the code in libpgport, or someplace, whence both initdb and libpq could
get at it?

regards, tom lane
Re: [HACKERS] [GENERAL] invalid byte sequence ?
Tom Lane wrote:
> Um, but initdb doesn't use libpq, so it's going to need its own copy
> anyway.

initdb certainly links against libpq.

> I agree with Martijn that putting these into libpq's API
> seems like useless clutter.

Where else to put it? We need it in libpq anyway if we want this
behavior in all client applications (by default).

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
On 8/25/06, Tom Lane <[EMAIL PROTECTED]> wrote: "Martin Atukunda" <[EMAIL PROTECTED]> writes: > On 8/25/06, Tom Lane <[EMAIL PROTECTED]> wrote: >> There's probably no way to get Apple's libedit to not try the fchmod, >> so what do we want to do here? Maybe special-case the string >> "/dev/null"? > If this is OK, I can come up with a patch that special cases /dev/null as a > HISTFILE if libedit is found. I was thinking of basically a one-liner addition to write_history to skip the whole thing if strcmp(fname, DEVNULL) == 0. Should be reasonably inoffensive on anyone's machine. I guess you meant saveHistory instead of write_history here. :) something like the attached diff - Martin - special_case_DEVNULL.diff Description: Binary data ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Tricky bugs in concurrent index build
"Zeugswetter Andreas DCP SD" <[EMAIL PROTECTED]> writes: >> That was what the patch originally used, but it was changed >> because it made it difficult for psql to auto-complete that. > That is imho not enough of a reason to divert. My recollection is that the principal argument against ONLINE was that it didn't convey the function of the option to anyone who didn't already know Oracle's usage of the term. Also, psql's problem is not with auto-completion, it's with detecting whether the command is allowed inside a transaction block. That's not a function we can just blow off. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Tricky bugs in concurrent index build
Alvaro Herrera <[EMAIL PROTECTED]> writes: > That was what the patch originally used, but it was changed because it > made it difficult for psql to auto-complete that. Psql has to be able to parse it not for auto-completion but because it needs to know that it's not a transactional command. The regular CREATE INDEX can be run from within a transaction but online index builds use two transactions on their own so psql has to know not to insert a BEGIN and savepoint around it. I'll use this opportunity to plug that feature again. I think most people should use autocommit off with on_error_rollback on for most of their daily use. Being able to double check the results of my ad-hoc updates before committing them saved me more headaches than I can count with Oracle. Autocommit off only became practical for interactive use with postgres when savepoints showed up. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tricky bugs in concurrent index build
> > precedent syntax (Oracle, Informix) uses the keyword ONLINE > at the end: > > CREATE INDEX blabla_x0 ON blabla (a,b) ONLINE; > > That was what the patch originally used, but it was changed > because it made it difficult for psql to auto-complete that. That is imho not enough of a reason to divert. Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tricky bugs in concurrent index build
"Zeugswetter Andreas DCP SD" <[EMAIL PROTECTED]> writes: > precedent syntax (Oracle, Informix) uses the keyword ONLINE at the end: > CREATE INDEX blabla_x0 ON blabla (a,b) ONLINE; We rejected that one already ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
"Martin Atukunda" <[EMAIL PROTECTED]> writes: > On 8/25/06, Tom Lane <[EMAIL PROTECTED]> wrote: >> There's probably no way to get Apple's libedit to not try the fchmod, >> so what do we want to do here? Maybe special-case the string >> "/dev/null"? > If this is OK, I can up with a patch that special cases /dev/null as a > HISTFILE if libedit is found. I was thinking of basically a one-liner addition to write_history to skip the whole thing if strcmp(fname, DEVNULL) == 0. Should be reasonably inoffensive on anyone's machine. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Tricky bugs in concurrent index build
Zeugswetter Andreas DCP SD wrote: > > > > What bothers me about what we have now is that we have optional > > > keywords before and after INDEX, rather than only between > > CREATE and INDEX. > > > > Yeah, putting them both into that space seems consistent to > > me, and it will fix the problem of making an omitted index > > name look like a valid command. > > > > I'm not sure I should be opening this can of worms, but do we > > want to use a different keyword than CONCURRENTLY to make it > > read better there? > > precedent syntax (Oracle, Informix) uses the keyword ONLINE at the end: > CREATE INDEX blabla_x0 ON blabla (a,b) ONLINE; That was what the patch originally used, but it was changed because it made it difficult for psql to auto-complete that. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Performance testing of COPY (SELECT) TO
Hi, Böszörményi Zoltán wrote: Böszörményi Zoltán <[EMAIL PROTECTED]> writes: With PostgreSQL 8.1.4, I used this: begin; select ... into temp myquery1; copy myquery1 to stdout csv delimiter '|'; rollback; The performance of this would doubtless vary a lot with the temp_buffers setting. Did you try different values? Yes, I did, but now checked back with 8.2CVS. The previously quoted result was achieved with temp_buffers = 1000 on both 8.1.4 and 8.2CVS. On 8.2CVS with temp_buffers = 4096, the 10 client case kills the machine with swapping, but the 3 client runtime with COPY(SELECT) went down to 2:41. The SELECT INTO TEMP case went down to 3:36. It'd also be interesting to time the same way (with a temp table) in devel. I don't remember whether we did any performance work on the COPY CSV data path in this cycle, or whether that was all present in 8.1. In any case it'd be worth proving that the COPY SELECT patch isn't degrading performance of the copy-a-relation case. I will report back with that, say on Monday. It seems my previous mail hasn't reached the hackers list, I answer here. In the export, there is a largish table that has both many columns and rows. With COPY(SELECT) patch applied: time psql -c "copy (select * from table) to 'file'" dbx COPY 886046 real 0m13.253s user 0m0.000s sys 0m0.000s time psql -c "copy table to 'file'" dbx COPY 886046 real 0m13.234s user 0m0.000s sys 0m0.000s time psql -c "copy table to stdout" dbx >file real 0m15.155s user 0m0.540s sys 0m0.450s time psql -c "copy (select * from table) to stdout" dbx >file real 0m15.079s user 0m0.540s sys 0m0.590s Surprisingly, without the COPY(SELECT) patch it's slower; this is the lowest from five runs, e.g.
with warm caches: time psql -c "copy table to 'file'" dbx real 0m20.464s user 0m0.000s sys 0m0.010s time psql -c "copy table to stdout" dbx >file real 0m25.753s user 0m0.570s sys 0m0.460s With the original settings, temp_buffers = 1000 on 8.2CVS, the export runtime with one client looks like this: first run 1:44, second run 1:12, third run 1:04. It seems it's a bit faster both on startup and on subsequent runs. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Tricky bugs in concurrent index build
> > What bothers me about what we have now is that we have optional > > keywords before and after INDEX, rather than only between > CREATE and INDEX. > > Yeah, putting them both into that space seems consistent to > me, and it will fix the problem of making an omitted index > name look like a valid command. > > I'm not sure I should be opening this can of worms, but do we > want to use a different keyword than CONCURRENTLY to make it > read better there? precedent syntax (Oracle, Informix) uses the keyword ONLINE at the end: CREATE INDEX blabla_x0 ON blabla (a,b) ONLINE; I'd stick with that. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM init
> > On Friday, 25 August 2006 16:31, Reinhard Max wrote: > > But shouldn't mountpoints always have 000 permissions to prevent > > writing into the directory as long as nothing is mounted to it? > > That's an interesting point, but in practice nobody does > that. And we're > trying to defend exactly against the case where someone has > set up a mount > point manually. > It had never occurred to me, but I'm definitely going to start doing it now. So it will be in practice, at least around here. Regards, Paul Bort ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
On 8/25/06, Tom Lane <[EMAIL PROTECTED]> wrote: > When I set HISTFILE to /dev/null I get the following: > could not save history to file "/dev/null": Operation not permitted Hm. ktrace shows this happening: 23279 psql CALL open(0x302d70,0x601,0x1b6) 23279 psql NAMI "/dev/null" 23279 psql RET open 3 23279 psql CALL fchmod(0x3,0x180) 23279 psql RET fchmod -1 errno 1 Operation not permitted 23279 psql CALL close(0x3) 23279 psql RET close 0 23279 psql CALL write(0x2,0xb180,0x44) 23279 psql GIO fd 2 wrote 68 bytes "could not save history to file "/dev/null": Operation not permitted " 23279 psql RET write 68/0x44 23279 psql CALL exit(0) There's probably no way to get Apple's libedit to not try the fchmod, so what do we want to do here? Maybe special-case the string "/dev/null"? If this is OK, I can come up with a patch that special cases /dev/null as a HISTFILE if libedit is found. - Martin - ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Replication
On Fri, 2006-08-25 at 11:23 +0200, Markus Schiltknecht wrote: > Jeff Davis wrote: > > Which doesn't work very well in the case of two groups of servers set up > > in two physical locations. I can see two possibilities: > > (1) You require a quorum to be effective, in which case your cluster of > > databases is only as reliable as the location which holds more servers. > > (2) You have another central authority that determines which databases > > are up, and which are down. Then your cluster is only as reliable as > > that central authority. > > Right, the ideal here would be two sync clusters at both locations, > connected via async replication :-) > > > Even if you have a large number of nodes at different locations, then > > you end up with strange decisions to make if the network connections are > > intermittent or very slow. A temporary slowdown of many nodes could > > cause them to be degraded until some kind of human intervention brought > > them back. Until that time you might not be able to determine which > > nodes make up an authoritative group. > > Side note: in such a case, I think a GCS will just choose only one node > to be the 'authoritative group'. Because most systems cannot afford to > have long waits for such decisions. For database replication I also > think it's better to have at least one node running than none. > > > This kind of degradation could > > happen in the case of a DDoS attack, or perhaps a worm moving around the > > internet. > > Well, sync replication in general needs a good, low latency and secure > interconnect. The internet does not seem to be a good fit here. > > > In practice everyone can find a solution that works for them. However, > > synchronous replication is not perfect, and there are many failure > > scenarios which need to be resolved in a way that fits your business. I > > think synchronous replication is inherently less available than > > asynchronous. > > This surely depends on the environment. With a dedicated (i.e. low > latency and secure) interconnect sync replication is surely more > available because your arguments above don't apply. And because sync > replication guarantees you won't lose committed transactions. > > If however you want or have to replicate over the internet it depends. > Your arguments above also apply to async replication. Only that because > of the conflict resolution, async replication systems can continue to > operate on all the disconnected nodes and merge their work later on as > the network is up again. But then again, async still has the danger of > losing transactions. > > So I probably agree: if you are on an unreliable network and if you have > conflict resolution correctly set up then async replication is more > available, but less secure. > > As I said above, sync replication needs a reliable interconnect, better > even have two interconnects, because it's a SPOF for a clustered > database system. > Ok, I agree with your statements. Async is convenient in many ways, but has less durable transactions (at least for transactions committed recently). Sync has some limitations, and is harder to get right (at least if you want good availability as well) but provides more durable transactions and consistency between systems. Regards, Jeff Davis ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
"Martin Atukunda" <[EMAIL PROTECTED]> writes: > On 8/25/06, Peter Eisentraut <[EMAIL PROTECTED]> wrote: >> Please elaborate on "doesn't work". > without any .psqlrc file I get the following error when quitting a psql > session: > could not save history to file "/Users/matlads/.psql_history": Invalid > argument That is fixed in CVS HEAD. The current coding looks like: /* * return value of write_history is not standardized across GNU * readline and libedit. Therefore, check for errno becoming set * to see if the write failed. */ errno = 0; (void) write_history(fname); if (errno == 0) return true; psql_error("could not save history to file \"%s\": %s\n", fname, strerror(errno)); > When I set HISTFILE to /dev/null I get the following: > could not save history to file "/dev/null": Operation not permitted Hm. ktrace shows this happening: 23279 psql CALL open(0x302d70,0x601,0x1b6) 23279 psql NAMI "/dev/null" 23279 psql RET open 3 23279 psql CALL fchmod(0x3,0x180) 23279 psql RET fchmod -1 errno 1 Operation not permitted 23279 psql CALL close(0x3) 23279 psql RET close 0 23279 psql CALL write(0x2,0xb180,0x44) 23279 psql GIO fd 2 wrote 68 bytes "could not save history to file "/dev/null": Operation not permitted " 23279 psql RET write 68/0x44 23279 psql CALL exit(0) There's probably no way to get Apple's libedit to not try the fchmod, so what do we want to do here? Maybe special-case the string "/dev/null"? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[Open Item] Re: [HACKERS] Autovacuum on by default?
Tom Lane wrote: > "Matthew T. O'Connor" writes: > > Peter Eisentraut wrote: > >> - Leave base thresholds alone (pending further analysis that might remove > >> them > >> altogether?) > > > While there is talk of removing this all together, I think it was also > > agreed that as long as these values are there, they should be reduced. > > I think the defaults in 8.1 are 1000/500, I think 200/100 was suggested. > > ISTM that if we don't want to remove the thresholds immediately, > we should make them default to zero for a release or two and see how > well it works. > > At the moment I can't find the thread that discussed removing them, > but IIRC there were some good arguments why the thresholds should always > be zero. I can't find it either, but I think the bug reported here is related: http://archives.postgresql.org/pgsql-general/2006-06/thrd2.php#00951 On the other hand, I don't think we completely resolved this, so I proposed this be added to the "Open Items" list. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] invalid byte sequence ?
Peter Eisentraut <[EMAIL PROTECTED]> writes: > On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote: >> Umm, why export all these functions. For starters, does this even need >> to be in libpq? > Where else would you put it? > ... > initdb has different requirements. Let me know if you have a different way to refactor it that satisfies initdb. Um, but initdb doesn't use libpq, so it's going to need its own copy anyway. I agree with Martijn that putting these into libpq's API seems like useless clutter. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Autovacuum on by default?
Tom Lane wrote: "Matthew T. O'Connor" writes: While there is talk of removing this all together, I think it was also agreed that as long as these values are there, they should be reduced. I think the defaults in 8.1 are 1000/500, I think 200/100 was suggested. ISTM that if we don't want to remove the thresholds immediately, we should make them default to zero for a release or two and see how well it works. At the moment I can't find the thread that discussed removing them, but IIRC there were some good arguments why the thresholds should always be zero. I don't have any significant objection to removing them, it just seemed to me that we are late in release cycle and that might be more than we want to do at the moment. If others think it's OK, then it's OK with me. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
On 8/25/06, Peter Eisentraut <[EMAIL PROTECTED]> wrote: On Friday, 25 August 2006 17:03, Martin Atukunda wrote: > hmm, setting HISTFILE to /dev/null doesn't work on my MacOSX here. Please elaborate on "doesn't work". without any .psqlrc file I get the following error when quitting a psql session: could not save history to file "/Users/matlads/.psql_history": Invalid argument When I set HISTFILE to /dev/null I get the following: could not save history to file "/dev/null": Operation not permitted - Martin - ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Autovacuum on by default?
"Matthew T. O'Connor" writes: > Peter Eisentraut wrote: >> - Leave base thresholds alone (pending further analysis that might remove >> them >> altogether?) > While there is talk of removing this all together, I think it was also > agreed that as long as these values are there, they should be reduced. > I think the defaults in 8.1 are 1000/500, I think 200/100 was suggested. ISTM that if we don't want to remove the thresholds immediately, we should make them default to zero for a release or two and see how well it works. At the moment I can't find the thread that discussed removing them, but IIRC there were some good arguments why the thresholds should always be zero. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] invalid byte sequence ?
On Fri, Aug 25, 2006 at 05:38:20PM +0200, Peter Eisentraut wrote: > > In fact, the only thing you need is PQsetClientEncodingFromLocale(), > > anything else is just sugar. Why would the user care about what the OS > > calls it? We have a "pg_enc" enum, so lets use it. > > initdb has different requirements. Let me know if you have a different way to refactor it that satisfies initdb. Well, check_encodings_match(pg_enc,ctype) is simply a short way of saying: if(find_matching_encoding(ctype) != pg_enc ) { error }. And get_encoding_from_locale() is not used outside of those functions. So the only thing initdb actually needs is an implementation of find_matching_encoding(ctype), which returns a value of "enum pg_enc". check_encodings_match() stays in initdb, and get_encoding_from_locale() becomes internal to libpq. How does that sound? Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate.
Re: [HACKERS] [GENERAL] invalid byte sequence ?
On Friday, 25 August 2006 17:30, Martijn van Oosterhout wrote: > Umm, why export all these functions. For starters, does this even need > to be in libpq? Where else would you put it? > In fact, the only thing you need is PQsetClientEncodingFromLocale(), > anything else is just sugar. Why would the user care about what the OS > calls it? We have a "pg_enc" enum, so lets use it. initdb has different requirements. Let me know if you have a different way to refactor it that satisfies initdb. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Tricky bugs in concurrent index build
Gregory Stark wrote: Bruce Momjian <[EMAIL PROTECTED]> writes: The original thinking was to use CONCURRENT, and CREATE CONCURRENT INDEX sounded like a different type of index, not a different way to build the index. I don't think CONCURRENTLY has that problem, so CREATE CONCURRENTLY INDEX sounds good. To read in English, it would be read as CREATE CONCURRENTLY, INDEX ii. That doesn't sound like English at all to me. Fwiw, I think the best option was what Tom did. The gotcha I tripped on seems pretty minor to me. It's a form of construction my father (who was a noted orator) loved to use, maybe a little too much. It is arguably slightly archaic, but nevertheless quite grammatical ;-) I agree that these days it is more idiomatic to defer the adverb until after the object of the verb in most cases. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Autovacuum on by default?
On Friday, 25 August 2006 17:32, Matthew T. O'Connor wrote: > While there is talk of removing this all together, I think it was also > agreed that as long as these values are there, they should be reduced. > I think the defaults in 8.1 are 1000/500, I think 200/100 was suggested. I'm thinking leaving them higher would offset the jump in the scale factor a bit. But if the idea is to get rid of the base threshold, then that's not really a forward-looking strategy, I suppose. I don't mind if we go down with the scale factor a little less and move the threshold down more, but at least tying the scale factor to the fill factor doesn't make all those numbers too random. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Autovacuum on by default?
Peter Eisentraut wrote: Summarizing this thread, I see support for the following: - autovacuum set to on by default in 8.2. Yes. - stats_row_level also defaults to on. Yes. (Perhaps stats_block_level should also default to on so it's not inconsistent, seeing that everything else in on, too.) I haven't followed this closely, but are all the other stats commands other than block_level now on by default? In general I would vote to keep it off if not needed just for performance reasons, though I haven't measured the effect of block_level turned on. Anyone measured this? - Delayed vacuum and delayed autovacuum will stay disabled. Unfortunately. - Scale factor set to 0.08 (vacuum) and 0.04 (analyze) (?) (formerly 0.4 and 0.2) That seems a big jump. BTW, I know .08 and .04 were suggested, but I didn't see confirmation that it was a good idea. I know my initial values were grossly over-conservative, but I am concerned about bogging down the server with lots of vacuums, especially since we don't have the delay settings on by default, nor do we have a maintenance windows yet. - Leave base thresholds alone (pending further analysis that might remove them altogether?) While there is talk of removing this all together, I think it was also agreed that as long as these values are there, they should be reduced. I think the defaults in 8.1 are 1000/500, I think 200/100 was suggested. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [GENERAL] invalid byte sequence ?
On Fri, Aug 25, 2006 at 05:07:03PM +0200, Peter Eisentraut wrote: > I got started on this and just wanted to post an intermediate patch. I have > taken the logic from initdb and placed it into libpq and refined the API a > bit. At this point, there should be no behaviorial change. It remains to > make libpq use this stuff if PGCLIENTENCODING is not set. Unless someone > beats me, I'll figure that out later. Umm, why export all these functions. For starters, does this even need to be in libpq? I wouldn't have thought so the first time round, especially not three functions. The only thing you need is to take a locale name and return the charset you can pass to PQsetClientEncoding. In fact, the only thing you need is PQsetClientEncodingFromLocale(), anything else is just sugar. Why would the user care about what the OS calls it? We have a "pg_enc" enum, so lets use it. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Tricky bugs in concurrent index build
Alvaro Herrera <[EMAIL PROTECTED]> writes: > The problem is that what the qualifier is doing is modifying the > operation itself, not the properties of the index to be created, like > UNIQUE, which modifies the index. Right, which was the same point Bruce made earlier. And really you can't respect that difference while putting them into the same place in the word order. So I'm starting to feel like maybe we should leave well enough alone. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > What bothers me about what we have now is that we have optional keywords > > before and after INDEX, rather than only between CREATE and INDEX. > > Yeah, putting them both into that space seems consistent to me, and > it will fix the problem of making an omitted index name look like > a valid command. > > I'm not sure I should be opening this can of worms, but do we want to > use a different keyword than CONCURRENTLY to make it read better there? The problem is that what the qualifier is doing is modifying the operation itself, not the properties of the index to be created, like UNIQUE, which modifies the index. So the qualifier should be something that modifies the CREATE, that is, an adverb (AFAIK). -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
On Friday, 25 August 2006 17:03, Martin Atukunda wrote: > hmm, setting HISTFILE to /dev/null doesn't work on my MacOSX here. Please elaborate on "doesn't work". -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tricky bugs in concurrent index build
Bruce Momjian <[EMAIL PROTECTED]> writes: > What bothers me about what we have now is that we have optional keywords > before and after INDEX, rather than only between CREATE and INDEX. Yeah, putting them both into that space seems consistent to me, and it will fix the problem of making an omitted index name look like a valid command. I'm not sure I should be opening this can of worms, but do we want to use a different keyword than CONCURRENTLY to make it read better there? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Safer auto-initdb for RPM init script
On Friday, 25 August 2006 16:54, Tom Lane wrote: > No, it doesn't. (The Red Hat RPMs in fact did that ... for about > a week ... until I was told in no uncertain terms that we don't > start unnecessary daemons by default.) Well, there seem to be philosophical differences between the various operating systems -- "We won't install unnecessary packages." vs. "We won't start unnecessary daemons in unnecessarily installed packages." -- in which case your solution doesn't sound all that bad. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM initscript
On Friday, 25 August 2006 17:17, Enver ALTIN wrote: > Am I the only one who believes that the PostgreSQL project is not supposed > to fix (or include workarounds for) some other systems that actually > don't work very well? Yes. > If NFS is causing trouble, let it be. NFS is not the trouble. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM initscript
Hello, On Fri, Aug 25, 2006 at 10:30:56AM -0400, Andrew Dunstan wrote: > It is extremely careful. The point is that the NFS mount will hide the > existing datadir from initdb. Am I the only one who believes that the PostgreSQL project is not supposed to fix (or include workarounds for) some other systems that actually don't work very well? If NFS is causing trouble, let it be. Thanks, -- Enver
Re: [HACKERS] Tricky bugs in concurrent index build
Gregory Stark wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > > The original thinking was to use CONCURRENT, and CREATE CONCURRENT INDEX > > sounded like a different type of index, not a different way to build the > > index. I don't think CONCURRENTLY has that problem, so CREATE > > CONCURRENTLY INDEX sounds good. To read in English, it would be read as > > CREATE CONCURRENTLY, INDEX ii. > > That doesn't sound like English at all to me. > > Fwiw, I think the best option was what Tom did. The gotcha I tripped on seems > pretty minor to me. What bothers me about what we have now is that we have optional keywords before and after INDEX, rather than only between CREATE and INDEX. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] invalid byte sequence ?
On Thursday, 24 August 2006 00:52, Tom Lane wrote: > A possible solution therefore is to have psql or libpq drive the > client_encoding off the client's locale environment instead of letting > it default to equal the server_encoding. I got started on this and just wanted to post an intermediate patch. I have taken the logic from initdb and placed it into libpq and refined the API a bit. At this point, there should be no behavioral change. It remains to make libpq use this stuff if PGCLIENTENCODING is not set. Unless someone beats me to it, I'll figure that out later. -- Peter Eisentraut http://developer.postgresql.org/~petere/ codeset-refactor.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Tricky bugs in concurrent index build
Bruce Momjian <[EMAIL PROTECTED]> writes: > The original thinking was to use CONCURRENT, and CREATE CONCURRENT INDEX > sounded like a different type of index, not a different way to build the > index. I don't think CONCURRENTLY has that problem, so CREATE > CONCURRENTLY INDEX sounds good. To read in English, it would be read as > CREATE CONCURRENTLY, INDEX ii. That doesn't sound like English at all to me. Fwiw, I think the best option was what Tom did. The gotcha I tripped on seems pretty minor to me. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] psql 'none' as a HISTFILE special case
On 8/21/06, Bruce Momjian <[EMAIL PROTECTED]> wrote: Tom Lane wrote: > Neil Conway <[EMAIL PROTECTED]> writes: > > On Mon, 2006-08-21 at 19:27 +0300, Martin Atukunda wrote: > >> If readline is used by psql, a history file is automatically used. > >> This patch adds the special file name 'none', which if set as the > >> HISTFILE parameter, will cause psql not to use the history file. > > > I think it would be cleaner to use a separate \set variable to control > > whether a history file is written, rather than needlessly overloading > > the meaning of HISTFILE. > > Why is this useful at all? There's already the -n (don't use readline) > switch. Seems he wants readline without history, perhaps for security. Doesn't setting HISTFILE to /dev/null work? hmm, setting HISTFILE to /dev/null doesn't work on my MacOSX here. so I whipped up this patch. - Martin - ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Tricky bugs in concurrent index build
Bruce Momjian <[EMAIL PROTECTED]> writes: > The original thinking was to use CONCURRENT, and CREATE CONCURRENT INDEX > sounded like a different type of index, not a different way to build the > index. I don't think CONCURRENTLY has that problem, so CREATE > CONCURRENTLY INDEX sounds good. To read in English, it would be read as > CREATE CONCURRENTLY, INDEX ii. OK, we've got two votes for that, so I'll make it so. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane wrote: > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > I see we have: > > CREATE index_opt_unique INDEX CONCURRENTLY index_name ... > > which explains how this error occurs. > > Maybe to you, but I'm still caffeine-deprived and don't exactly see what > it was that Greg mistyped. AFAICS he'd have to type CONCURRENTLY twice > to get into a scenario where the proposed warning would fire. > > > But might it not be better to have this instead? > > CREATE CONCURRENTLY index_opt_unique INDEX index_name ... > > When I was fooling with gram.y I was thinking that actually > > CREATE [UNIQUE] INDEX indexname [CONCURRENTLY] ... > > would be the most grammatical thing. But I can live with putting The original thinking was to use CONCURRENT, and CREATE CONCURRENT INDEX sounded like a different type of index, not a different way to build the index. I don't think CONCURRENTLY has that problem, so CREATE CONCURRENTLY INDEX sounds good. To read in English, it would be read as CREATE CONCURRENTLY, INDEX ii. > it right after CREATE, too. Or there was the proposal to put it > first: > > [CONCURRENTLY] CREATE [UNIQUE] INDEX indexname ... I think this suggested the command was CONCURRENTLY, which isn't good. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Safer auto-initdb for RPM init script
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Am Freitag, 25. August 2006 16:20 schrieb Tom Lane: >> It eats rather a lot of disk space for a package that might just be >> getting loaded as part of a system install, with no likelihood of >> actually being used. > Wouldn't the system install start the init script at the end of the > installation process anyway? No, it doesn't. (The Red Hat RPMs in fact did that ... for about a week ... until I was told in no uncertain terms that we don't start unnecessary daemons by default.) regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Safer auto-initdb for RPM init script
Tom Lane <[EMAIL PROTECTED]> writes: > Comments? Anyone see a better way? Well, the truly bulletproof mechanism would be to check every data file on every open. You could have a header with some kind of unique tag generated at initdb time and the backend could ensure it matches the same tag in the control file whenever you open a data file. That might be too expensive, though I don't see data files getting opened all that frequently. You could do the same thing for free by putting the tag in the file names though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
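The tag idea above can be sketched in a few lines of shell: generate a unique tag at "initdb" time, record it both in the control file and in each data file, and refuse to open a file whose tag differs. The file names (pg_control.tag, datafile.tag) and the cmp-based check are invented for this illustration; they are not PostgreSQL's actual on-disk layout.

```shell
set -e
dir=$(mktemp -d)
tag="initdb-$(date +%s)-demo"   # unique per initdb run (illustrative)

# "initdb": record the cluster tag in the control file and stamp a data file
echo "$tag" > "$dir/pg_control.tag"
echo "$tag" > "$dir/datafile.tag"

# "open": compare a data file's tag against the control file's tag
check_open() {
    if cmp -s "$dir/pg_control.tag" "$1"; then
        echo "tag matches: open allowed"
    else
        echo "tag mismatch: refuse to open"
    fi
}

check_open "$dir/datafile.tag"            # file from the same cluster
echo "other-cluster" > "$dir/stale.tag"   # file left over from another initdb
check_open "$dir/stale.tag"
```

A data file that reappears when an offline mount comes back would carry the old cluster's tag and fail the check, which is exactly the KABOOM scenario the init-script thread is trying to avoid.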
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM init script
Reinhard Max writes: > Another flaw of the flag-file method is, that PGDATA might have been > changed by the sysadmin between installing the RPM and calling the > init script for the first time. What problem do you see there? With either of these methods, a manual change in PGDATA would require a manual initdb before the postmaster would start. That seems like a good conservative thing to me. (Actually, with the flag-file method you could get the initscript to run initdb for you by hand-creating the flag file, but it seems unlikely people would do that in practice.) > But shouldn't mountpoints always have 000 permissions to prevent > writing into the directory as long as nothing is mounted to it? Not sure that that helps much given that the initscript runs as root. And in any case the point here is to protect against human error, not to assume that the installation is managed according to very best practices. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM init
On Friday, 25 August 2006 at 16:31, Reinhard Max wrote: > But shouldn't mountpoints always have 000 permissions to prevent > writing into the directory as long as nothing is mounted to it? That's an interesting point, but in practice nobody does that. And we're trying to defend exactly against the case where someone has set up a mount point manually. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Safer auto-initdb for RPM init script
On Friday, 25 August 2006 at 16:20, Tom Lane wrote: > It eats rather a lot of disk space for a package that might just be > getting loaded as part of a system install, with no likelihood of > actually being used. Wouldn't the system install start the init script at the end of the installation process anyway? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM init
On Fri, 25 Aug 2006 at 10:20, Tom Lane wrote: > If this were a bulletproof solution then I'd consider it anyway, but > AFAICS it's got the very same vulnerabilities as the flag-file > method, ie, if you RPM install or upgrade while your mountable data > directory is offline, you can still get screwed. Another flaw of the flag-file method is that PGDATA might have been changed by the sysadmin between installing the RPM and calling the init script for the first time. But shouldn't mountpoints always have 000 permissions to prevent writing into the directory as long as nothing is mounted to it? cu Reinhard ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM initscript
Sander Steffann wrote: Hi, If this were a bulletproof solution then I'd consider it anyway, but AFAICS it's got the very same vulnerabilities as the flag-file method, ie, if you RPM install or upgrade while your mountable data directory is offline, you can still get screwed. Isn't the most bulletproof solution to make initdb more careful about overwriting an existing data directory? It is extremely careful. The point is that the NFS mount will hide the existing datadir from initdb. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [Pgsqlrpms-hackers] [HACKERS] Safer auto-initdb for RPM initscript
Hi, > If this were a bulletproof solution then I'd consider it anyway, but > AFAICS it's got the very same vulnerabilities as the flag-file method, > ie, if you RPM install or upgrade while your mountable data directory > is offline, you can still get screwed. Isn't the most bulletproof solution to make initdb more careful about overwriting an existing data directory? - Sander ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [Pgsqlrpms-hackers] Safer auto-initdb for RPM init script
Tom Lane wrote: We've seen more than one report of corruption of PG databases that seemed to be due to the willingness of the RPM init script to run initdb if it thinks the data directory isn't there. This is pretty darn risky on an NFS volume, for instance, which might be offline at the instant the script looks. The failure case is - script doesn't see data directory - script runs initdb and starts postmaster - offline volume comes online - KABOOM Been there, done exactly that... I can still imagine ways for this to fail, eg if you run an RPM install or upgrade while your mountable data directory is offline. But it ought to be an order of magnitude safer than things are now. (Hm, maybe the %post script should only run during an RPM install, not an upgrade.) That's probably a good plan. Comments? Anyone see a better way? I can't think of any offhand that aren't too expensive. We ended up putting a root-owned empty data directory beneath the mount point, but that can't be automated. We also decided to turn off the init script execution entirely. The DBAs were more comfortable with a manual database startup for a production machine anyway (this is the way they typically handle Oracle databases also). They get paged if the server ever goes down unplanned, and in that event they like to check things out before bringing the db back up. For planned outages, database startup is simply part of the plan. Joe ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Safer auto-initdb for RPM init script
"Magnus Hagander" <[EMAIL PROTECTED]> writes: >> I don't really want to remove the auto-initdb feature from the >> script, because it's important not to drive away newbies by making >> Postgres hard to start for the first time. But I think we'd better >> think about ways to make it more bulletproof. > Why does initdb have to happen on startup? Wouldn't it be much more > logical to do it at install time? It eats rather a lot of disk space for a package that might just be getting loaded as part of a system install, with no likelihood of actually being used. In CVS tip a just-initdb'd data directory seems to be a shade under 30MB, which I guess isn't a huge amount these days but it compares unfavorably with the installed footprint of the code itself (postgresql-server RPM looks to be about 4MB). If this were a bulletproof solution then I'd consider it anyway, but AFAICS it's got the very same vulnerabilities as the flag-file method, ie, if you RPM install or upgrade while your mountable data directory is offline, you can still get screwed. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Performance testing of COPY (SELECT) TO
> It'd also be interesting to time the same way (with a temp table) in > devel. I don't remember whether we did any performance work on the > COPY CSV data path in this cycle, or whether that was all present in > 8.1. In any case it'd be worth proving that the COPY SELECT patch isn't > degrading performance of the copy-a-relation case.

In the export, there is a largish table that has both many columns and rows. With the COPY (SELECT) patch applied:

time psql -c "copy (select * from table) to 'file'" dbx
COPY 886046

real    0m13.253s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to 'file'" dbx
COPY 886046

real    0m13.234s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to stdout" dbx >file

real    0m15.155s
user    0m0.540s
sys     0m0.450s

time psql -c "copy (select * from table) to stdout" dbx >file

real    0m15.079s
user    0m0.540s
sys     0m0.590s

Surprisingly, it's slower without the COPY (SELECT) patch; these are the lowest of five runs, i.e. with warm caches:

time psql -c "copy table to 'file'" dbx

real    0m20.464s
user    0m0.000s
sys     0m0.010s

time psql -c "copy table to stdout" dbx >file

real    0m25.753s
user    0m0.570s
sys     0m0.460s

With the original settings, temp_buffers = 1000 on 8.2CVS, the one-client case looks like this: first run 1:44, second run 1:12, third run 1:04. It seems a bit faster both on startup and on subsequent runs. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Safer auto-initdb for RPM init script
> I don't really want to remove the auto-initdb feature from the > script, because it's important not to drive away newbies by making > Postgres hard to start for the first time. But I think we'd better > think about ways to make it more bulletproof. Why does initdb have to happen on startup? Wouldn't it be much more logical to do it at install time? (like we do in the win32 installer, for example, I'm sure there are other examples as well) //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane wrote: > Andrew Dunstan <[EMAIL PROTECTED]> writes: >> I see we have: >> CREATE index_opt_unique INDEX CONCURRENTLY index_name ... >> which explains how this error occurs. > > Maybe to you, but I'm still caffeine-deprived and don't exactly see what > it was that Greg mistyped. AFAICS he'd have to type CONCURRENTLY twice > to get into a scenario where the proposed warning would fire.

I guess Greg is talking about something like this (i.e. just forgetting to name the index):

devel=# create index concurrently on foo ( a);
CREATE INDEX
devel=# \d foo
      Table "public.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | text    |
 c      | integer |
 d      | text    |
Indexes:
    "concurrently" btree (a)

Stefan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: I see we have: CREATE index_opt_unique INDEX CONCURRENTLY index_name ... which explains how this error occurs. Maybe to you, but I'm still caffeine-deprived and don't exactly see what it was that Greg mistyped. AFAICS he'd have to type CONCURRENTLY twice to get into a scenario where the proposed warning would fire. AIUI, he left off the index name so the first rule was matched rather than the second, with "concurrently" being the index name. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tricky bugs in concurrent index build
Andrew Dunstan <[EMAIL PROTECTED]> writes: > I see we have: > CREATE index_opt_unique INDEX CONCURRENTLY index_name ... > which explains how this error occurs. Maybe to you, but I'm still caffeine-deprived and don't exactly see what it was that Greg mistyped. AFAICS he'd have to type CONCURRENTLY twice to get into a scenario where the proposed warning would fire. > But might it not be better to have this instead? > CREATE CONCURRENTLY index_opt_unique INDEX index_name ... When I was fooling with gram.y I was thinking that actually CREATE [UNIQUE] INDEX indexname [CONCURRENTLY] ... would be the most grammatical thing. But I can live with putting it right after CREATE, too. Or there was the proposal to put it first: [CONCURRENTLY] CREATE [UNIQUE] INDEX indexname ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Performance testing of COPY (SELECT) TO
> Böszörményi Zoltán <[EMAIL PROTECTED]> writes: >> With PostgreSQL 8.1.4, I used this: > >> begin; >> select ... into temp myquery1; >> copy myquery1 to stdout csv delimiter '|'; >> rollback; > > The performance of this would doubtless vary a lot with the temp_buffers > setting. Did you try different values? Yes, I did, but now checked back with 8.2CVS. The previously quoted result was achieved with temp_buffers = 1000 on both 8.1.4 and 8.2CVS. On 8.2CVS with temp_buffers = 4096, the 10 client case kills the machine with swapping, but the 3 client runtime with COPY(SELECT) went down to 2:41. The SELECT INTO TEMP case went down to 3:36. > It'd also be interesting to time the same way (with a temp table) in > devel. I don't remember whether we did any performance work on the > COPY CSV data path in this cycle, or whether that was all present in > 8.1. In any case it'd be worth proving that the COPY SELECT patch isn't > degrading performance of the copy-a-relation case. I will report back with that, say on Monday. In the meantime, I documented the COPY (SELECT) case and modified parser/analyze.c and tcop/utility.c so neither of them calls anything from under another directory. I think it's cleaner now. Also, I tried to implement more closely what printtup() does. Please, review. Best regards, Zoltán Böszörményi pgsql-copyselect-7.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Safer auto-initdb for RPM init script
On Friday, 25 August 2006 at 15:19, Tom Lane wrote: > I don't really want to remove the auto-initdb feature from the script, > because it's important not to drive away newbies by making Postgres > hard to start for the first time. But I think we'd better think about > ways to make it more bulletproof. Why not run initdb in the %post and not in the init script at all? That should be newbie-friendly as well. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Tricky bugs in concurrent index build
Gregory Stark wrote:

Do we want something like this? I just made this error myself so unless I'm special (pauses for jokes) I imagine others would be prone to it as well. I would normally be pretty leery of code like this but it seems unlikely anyone would actually want an index named "concurrently" and the consequences if you get it wrong in a production environment are pretty dire. We might even consider making it an outright error.

--- gram.y	25 Aug 2006 10:14:17 +0100	2.558
+++ gram.y	25 Aug 2006 14:04:54 +0100
@@ -56,6 +56,7 @@
 #include "commands/defrem.h"
 #include "nodes/makefuncs.h"
 #include "parser/gramparse.h"
+#include "parser/scansup.h"
 #include "storage/lmgr.h"
 #include "utils/date.h"
 #include "utils/datetime.h"
@@ -3653,6 +3654,12 @@
 			opt_definition OptTableSpace where_clause
 				{
 					IndexStmt *n = makeNode(IndexStmt);
+
+					if (!strcmp(downcase_truncate_identifier($4,20,false), "concurrently"))
+						ereport(WARNING,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("performing non-concurrent index build of index named \"concurrently\"")));
+
 					n->unique = $2;
 					n->concurrent = false;
 					n->idxname = $4;

I see we have:

CREATE index_opt_unique INDEX CONCURRENTLY index_name ...

which explains how this error occurs. But might it not be better to have this instead?

CREATE CONCURRENTLY index_opt_unique INDEX index_name ...

Then ISTM no ambiguity could arise (and it's also closer to grammatical English, if that matters). Just a thought. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Safer auto-initdb for RPM init script
We've seen more than one report of corruption of PG databases that seemed to be due to the willingness of the RPM init script to run initdb if it thinks the data directory isn't there. This is pretty darn risky on an NFS volume, for instance, which might be offline at the instant the script looks. The failure case is:

- script doesn't see data directory
- script runs initdb and starts postmaster
- offline volume comes online
- KABOOM

The initdb creates a "local" database that's physically on the root volume underneath the mountpoint directory for the intended volume. After the mountable volume comes online, these files are shadowed by the original database files. The problem is that by this point the postmaster has a copy of pg_control in memory from the freshly-initdb'd database, and that pg_control has a WAL end address and XID counter far less than is correct for the real database. Havoc ensues, very probably resulting in a hopelessly corrupt database.

I don't really want to remove the auto-initdb feature from the script, because it's important not to drive away newbies by making Postgres hard to start for the first time. But I think we'd better think about ways to make it more bulletproof. The first thought that comes to mind is to have the RPM install create the data directory if not present and create a flag file in it showing that it's safe to initdb. Then the script is allowed to initdb only if it finds the directory and the flag file but not PG_VERSION. Something like (untested off-the-cuff coding):

%post server
if [ ! -d $PGDATA ]; then
    mkdir $PGDATA
    touch $PGDATA/NO_DATABASE_YET
fi

and in initscript:

if [ -d $PGDATA -a -f $PGDATA/NO_DATABASE_YET -a ! -f $PGDATA/PG_VERSION ] ; then
    rm -f $PGDATA/NO_DATABASE_YET && initdb ...
fi

If the data directory is not mounted then the -d test would fail, unless the directory is itself the mount point, in which case it would be there but not contain the NO_DATABASE_YET file.
I can still imagine ways for this to fail, eg if you run an RPM install or upgrade while your mountable data directory is offline. But it ought to be an order of magnitude safer than things are now. (Hm, maybe the %post script should only run during an RPM install, not an upgrade.) Comments? Anyone see a better way? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
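For what it's worth, the flag-file dance proposed above can be exercised end-to-end in a few lines of shell. This is only a sketch of the proposed logic: initdb_stub, try_start, and the mktemp'd PGDATA path are invented for the demonstration and are not the actual RPM scriptlets.

```shell
# Hypothetical stand-ins: initdb_stub replaces the real initdb, and
# PGDATA points at a throwaway temporary directory.
set -e
PGDATA=$(mktemp -d)/data

initdb_stub() {
    # the real initdb would populate the directory; we only mark it
    mkdir -p "$PGDATA"
    echo "8.2" > "$PGDATA/PG_VERSION"
}

# %post-style step: create the directory and the safe-to-initdb flag once
if [ ! -d "$PGDATA" ]; then
    mkdir -p "$PGDATA"
    touch "$PGDATA/NO_DATABASE_YET"
fi

# initscript-style step: initdb only if the flag is present and no
# database exists yet
try_start() {
    if [ -d "$PGDATA" -a -f "$PGDATA/NO_DATABASE_YET" -a ! -f "$PGDATA/PG_VERSION" ]; then
        rm -f "$PGDATA/NO_DATABASE_YET" && initdb_stub
        echo "initdb ran"
    else
        echo "initdb skipped"
    fi
}

try_start   # first start: flag present, no PG_VERSION, so initdb runs
try_start   # any later start: PG_VERSION exists, so initdb is skipped
```

Note that the unmounted-mount-point case behaves as intended here: if the directory exists (as a bare mount point) but carries neither NO_DATABASE_YET nor PG_VERSION, the condition fails and initdb is skipped.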
Re: [HACKERS] Tricky bugs in concurrent index build
Do we want something like this? I just made this error myself so unless I'm special (pauses for jokes) I imagine others would be prone to it as well. I would normally be pretty leery of code like this but it seems unlikely anyone would actually want an index named "concurrently" and the consequences if you get it wrong in a production environment are pretty dire. We might even consider making it an outright error.

--- gram.y	25 Aug 2006 10:14:17 +0100	2.558
+++ gram.y	25 Aug 2006 14:04:54 +0100
@@ -56,6 +56,7 @@
 #include "commands/defrem.h"
 #include "nodes/makefuncs.h"
 #include "parser/gramparse.h"
+#include "parser/scansup.h"
 #include "storage/lmgr.h"
 #include "utils/date.h"
 #include "utils/datetime.h"
@@ -3653,6 +3654,12 @@
 			opt_definition OptTableSpace where_clause
 				{
 					IndexStmt *n = makeNode(IndexStmt);
+
+					if (!strcmp(downcase_truncate_identifier($4,20,false), "concurrently"))
+						ereport(WARNING,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("performing non-concurrent index build of index named \"concurrently\"")));
+
 					n->unique = $2;
 					n->concurrent = false;
 					n->idxname = $4;

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Performance testing of COPY (SELECT) TO
Böszörményi Zoltán <[EMAIL PROTECTED]> writes: > With PostgreSQL 8.1.4, I used this: > begin; > select ... into temp myquery1; > copy myquery1 to stdout csv delimiter '|'; > rollback; The performance of this would doubtless vary a lot with the temp_buffers setting. Did you try different values? It'd also be interesting to time the same way (with a temp table) in devel. I don't remember whether we did any performance work on the COPY CSV data path in this cycle, or whether that was all present in 8.1. In any case it'd be worth proving that the COPY SELECT patch isn't degrading performance of the copy-a-relation case. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Tricky bugs in concurrent index build
Gregory Stark <[EMAIL PROTECTED]> writes: > Because of the way the AM API works changing how the initial heap scan works > is a bit of a pain. It would require either having some global state or > passing the concurrent flag through the AM methods or alternatively having a > whole new AM method. Yeah, I dealt with that by adding a 'concurrent build' flag to IndexInfo. A bit grotty but it beats changing a lot of function signatures. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] integration of pgcluster into postgresql
Hi there, I guess many - if not most - here have tried pgcluster. For those who didn't, postgresql is pretty much the equivalent of pgcluster configured without load balancer or replicator, in read-write standalone mode. From a user point of view, that's at most three additional configuration files, which can basically be set to their default values and distributed as-is in the standard distribution (making installs/upgrades transparent to non-cluster environments). From a developer point of view, the pgcluster code is quite easy to get a hold of. pgcluster still has quite a few pending issues, but it's good enough for many users. Integrating it as part of a standard postgresql distribution would likely not disrupt standard postgresql functioning, while giving it the replication features it lacks as is. It's also likely to accelerate its maturing through more widespread adoption and, as a result, overcome most of its issues. If the idea of its integration in the main postgresql code is of any interest to the postgresql team, I'm willing to invest some effort on it. __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] RV: Problem at install / uninstall postgresql
On Wednesday, 23 August 2006 at 19:04, Miguel Pedreño Saura wrote: > Setting up postgresql-7.4 (7.4.12-3) ... > Error: cluster configuration already exists Report this issue to the Ubuntu packager. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Intervals within information_schema
>> Does anyone know what is supposed to go into the interval_type field? > INTERVAL_TYPE IN > ( 'YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND', > 'YEAR TO MONTH', 'DAY TO HOUR', 'DAY TO MINUTE', > 'DAY TO SECOND', 'HOUR TO MINUTE', > 'HOUR TO SECOND', 'MINUTE TO SECOND' ) > > Also, depending on which one of these is specified, the precision field > varies in meaning. I haven't done a complete analysis, but it might > not be as simple as your patch suggests. Yes, I saw some docs that hinted as much. I'm still not clear on exactly what interval_type is supposed to represent, or at least how it would ever apply to a Postgres attribute. It almost seems as though the spec is calling for interval_type of 'HOURS' and interval_precision of '2' for an example interval of '2 hours'? I'd normally say that we should leave it as NULL in such a case, but someone added a "FIXME" at some point, which led me to ask exactly what should go in there. Not a big deal, I suppose - it looks as though few, if any, other products are using those fields either. Thanks for the reply. -- Greg Sabino Mullane [EMAIL PROTECTED] End Point Corporation PGP Key: 0x14964AC8 200608250614 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] RV: Problem at install / uninstall postgresql
Hello! I’m installing PostgreSQL on Linux Ubuntu 6.06 and I get the following output during the installation process:

Setting up postgresql-7.4 (7.4.12-3) ...
Error: cluster configuration already exists

I can’t find information on the Internet about how to solve this problem, and I can’t install/uninstall anything.

[EMAIL PROTECTED]:/tmp# aptitude install postgresql
Reading package lists... Done
Building dependency tree... Done
Reading extended state information
Initializing package states... Done
Building tag database... Done
No packages will be installed, upgraded, or removed.
0 packages upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0B of archives. After unpacking 0B will be used.
Writing extended state information... Done
Setting up postgresql-7.4 (7.4.12-3) ...
Error: cluster configuration already exists
dpkg: error processing postgresql-7.4 (--configure):
 subprocess post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of postgresql:
 postgresql depends on postgresql-7.4; however:
  Package postgresql-7.4 is not configured yet.
dpkg: error processing postgresql (--configure):
…
 postgresql-pltcl-7.4 depends on postgresql-7.4; however:
  Package postgresql-7.4 is not configured yet.
dpkg: error processing postgresql-pltcl-7.4 (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 postgresql-7.4 postgresql postgresql-plperl-7.4 postgresql-plpython-7.4 postgresql-pltcl-7.4
E: Sub-process /usr/bin/dpkg returned an error code (1)
A package failed to install. Trying to recover:
Setting up postgresql-7.4 (7.4.12-3) ...
Error: cluster configuration already exists
dpkg: error processing postgresql-7.4 (--configure):
 subprocess post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of postgresql-pltcl-7.4:
 postgresql-pltcl-7.4 depends on postgresql-7.4; however:
…..
 Package postgresql-7.4 is not configured yet.
dpkg: error processing postgresql-plpython-7.4 (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 postgresql-7.4 postgresql-pltcl-7.4 postgresql postgresql-plperl-7.4 postgresql-plpython-7.4
[EMAIL PROTECTED]:/tmp#

Thanks, Miguel
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: >> Hmmm. Or is that true. The problem may be somewhat easier since at least you >> can be sure every tuple in the heap is in the index. So if you see a >> DELETE_IN_PROGRESS either it *was* a constraint violation prior to the delete >> and failing is reasonable or it's an update in which case maybe it's possible >> to detect that they're part of the same chain? > > Unless we are willing to lock every single tuple while we insert it, > this seems unfixable to me. Without a lock, the tuple could become > DELETE_IN_PROGRESS immediately after we look at it. I think there's some confusion here. This above paragraph was taken from some thoughts about Hannu's suggestion of having a separate ALTER INDEX SET UNIQUE command. That command might have an advantage over CREATE INDEX CONCURRENTLY because it knows the index is already complete; it doesn't have to worry about potential conflicts with tuples that it will only find later in the scan. Effectively this is equivalent to making CREATE UNIQUE INDEX CONCURRENTLY three phases. The first two phases would be a regular CREATE INDEX CONCURRENTLY and the third phase would be what ALTER INDEX SET UNIQUE does which is scan the index and verify that it's unique. ALTER INDEX SET UNIQUE would have to perform a similar two-transaction dance though. It would have to set the index unique, wait until everyone has seen the new constraint. Then verify that the property is indeed unique, possibly rolling back the constraint creation if it's not. That would make the whole process of creating a unique index quite long. On the plus side it would be a useful command in itself. Doing an index scan might be pretty slow but if the table is mostly clean of dead and recently dead tuples it won't have to visit the heap much and should still be much quicker than building a new index. And it would itself be a concurrent command. > Actually it's worse than that. 
We could examine a tuple, see that it's good, include it in the uniqueness check. Then someone updates the tuple and puts the new version near the end of the table. By the time we reach that version, it could be committed good. There is absolutely no way that we could notice an issue without applying extremely expensive tests to *every* apparently-good tuple. I think ALTER INDEX SET UNIQUE would not have this problem. It would only have to look at tuples using its own snapshot and see if there's a violation. If there isn't a violation as of its own snapshot then it can be sure later transactions will preserve this property, since the index was always complete and it waited after creating the constraint. > [ thinks for a bit... ] At least, it seems hopeless if we use SnapshotNow. Does it help if we use a real snapshot? I'm thinking pass 1 inserts exactly those tuples that are good according to a snap taken at its beginning, and then pass 2 considers only tuples that are good according to a snap taken at *its* beginning. But having consumed no caffeine yet this morning, I'm not sure I can spot any flaws that might exist in this idea. What about tuples that are inserted and committed in the window between the two phases? I.e., they're RECENTLY_DEAD but not in phase2's snapshot. Or do you mean we use SatisfiesVacuum to determine what to insert but SatisfiesSnapshot to determine whether to check uniqueness? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Autovacuum on by default?
Summarizing this thread, I see support for the following: - autovacuum set to on by default in 8.2. - stats_row_level also defaults to on. (Perhaps stats_block_level should also default to on so it's not inconsistent, seeing that everything else is on, too.) - Delayed vacuum and delayed autovacuum will stay disabled. - Scale factor set to 0.08 (vacuum) and 0.04 (analyze) (?) (formerly 0.4 and 0.2) - Leave base thresholds alone (pending further analysis that might remove them altogether?) -- Peter Eisentraut http://developer.postgresql.org/~petere/
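For reference, the settings under discussion map onto postgresql.conf parameters roughly as below. This is only a sketch of the proposed 8.2 defaults, not a committed configuration; the parameter names are the 8.1-era GUCs, and the commented-out lines mark items the thread left open:

```
autovacuum = on                         # proposed: on by default in 8.2
stats_row_level = on                    # required for autovacuum to work
#stats_block_level = on                 # possibly also on, for consistency
vacuum_cost_delay = 0                   # delayed (cost-based) vacuum stays disabled
autovacuum_vacuum_cost_delay = -1       # -1 = fall back to vacuum_cost_delay
autovacuum_vacuum_scale_factor = 0.08   # formerly 0.4
autovacuum_analyze_scale_factor = 0.04  # formerly 0.2
# base thresholds (autovacuum_vacuum_threshold etc.) left at their defaults
```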
Re: [HACKERS] Replication
Jeff Davis wrote: Which doesn't work very well in the case of two groups of servers set up in two physical locations. I can see two possibilities: (1) You require a quorum to be effective, in which case your cluster of databases is only as reliable as the location which holds more servers. (2) You have another central authority that determines which databases are up, and which are down. Then your cluster is only as reliable as that central authority. Right, the ideal here would be two sync clusters, one at each location, connected via async replication :-) Even if you have a large number of nodes at different locations, you end up with strange decisions to make if the network connections are intermittent or very slow. A temporary slowdown of many nodes could cause them to be degraded until some kind of human intervention brought them back. Until that time you might not be able to determine which nodes make up an authoritative group. Side note: in such a case, I think a GCS will just choose only one node to be the 'authoritative group', because most systems cannot afford long waits for such decisions. For database replication I also think it's better to have at least one node running than none. > This kind of degradation could happen in the case of a DDoS attack, or perhaps a worm moving around the internet. Well, sync replication in general needs a good, low-latency, secure interconnect. The internet does not seem to be a good fit here. In practice everyone can find a solution that works for them. However, synchronous replication is not perfect, and there are many failure scenarios which need to be resolved in a way that fits your business. I think synchronous replication is inherently less available than asynchronous. This surely depends on the environment. With a dedicated (i.e. low-latency and secure) interconnect, sync replication is surely more available because your arguments above don't apply.
And because sync replication guarantees you won't lose committed transactions. If, however, you want or have to replicate over the internet, it depends. Your arguments above also apply to async replication. The difference is that, thanks to conflict resolution, async replication systems can continue to operate on all the disconnected nodes and merge their work later on, once the network is up again. But then again, async still has the danger of losing transactions. So I probably agree: if you are on an unreliable network and you have conflict resolution correctly set up, then async replication is more available, but less secure. As I said above, sync replication needs a reliable interconnect, better yet two interconnects, because it's a SPOF for a clustered database system. Regards Markus
Re: [HACKERS] tsvector/tsearch equality and/or portability issue
Oops. Fixed. Tom Lane wrote: Teodor Sigaev <[EMAIL PROTECTED]> writes: Fixed in 8.1 and HEAD. Thank you This appears to have created a regression test failure:

*** ./expected/tsearch2.out	Sun Jun 18 12:55:28 2006
--- ./results/tsearch2.out	Thu Aug 24 14:30:02 2006
***
*** 2496,2503
  f|
  f| '345':1 'qwerti':2 'copyright':3
  f| 'qq':7 'bar':2,8 'foo':1,3,6 'copyright':9
- f| 'a':1A,2,3C 'b':5A,6B,7C,8B
  f| 'a':1A,2,3B 'b':5A,6A,7C,8
  f| '7w' 'ch' 'd7' 'eo' 'gw' 'i4' 'lq' 'o6' 'qt' 'y0'
  f| 'ar' 'ei' 'kq' 'ma' 'qa' 'qh' 'qq' 'qz' 'rx' 'st'
  f| 'gs' 'i6' 'i9' 'j2' 'l0' 'oq' 'qx' 'sc' 'xe' 'yu'
--- 2496,2503
  f|
  f| '345':1 'qwerti':2 'copyright':3
  f| 'qq':7 'bar':2,8 'foo':1,3,6 'copyright':9
  f| 'a':1A,2,3B 'b':5A,6A,7C,8
+ f| 'a':1A,2,3C 'b':5A,6B,7C,8B
  f| '7w' 'ch' 'd7'
Re: [HACKERS] Tricky bugs in concurrent index build
Tom Lane <[EMAIL PROTECTED]> writes: > At the moment it may be moot, because I've realized that validate_index doesn't work anyway. It is scanning the index and then assuming that any tuple inserted into the index subsequent to that scan will still be INSERT_IN_PROGRESS when the heapscan reaches it. EGADS! Boy, I feel stupid now. In fairness, I think what happened is that the original plan was, like your new plan, based on snapshots, and I only switched to using HeapSatisfiesVacuum after several iterations. I guess there were some assumptions in the original thinking that I never revisited. Because of the way the AM API works, changing how the initial heap scan works is a bit of a pain. It would require either having some global state, passing the concurrent flag through the AM methods, or alternatively having a whole new AM method. I'll have to read (and reread) your description again in the morning. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] autovacuum cannot start when connection is full
"Guillaume Smet" <[EMAIL PROTECTED]> wrote: > There is a mention in the autovacuum doc: http://www.postgresql.org/docs/8.1/interactive/maintenance.html#AUTOVACUUM The last sentence of the first paragraph is: "Also, it's important to allow a slot for the autovacuum process when choosing the value of superuser_reserved_connections." Thank you! It would be good to add the same mention to the description of superuser_reserved_connections as well. | Determines the number of connection "slots" that are reserved for connections | by PostgreSQL superusers, *including autovacuum*. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
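A minimal postgresql.conf illustration of the advice above (the values here are made up for the example, not recommendations):

```
max_connections = 100
# Reserve slots for superuser connections; size this to include one
# slot for the autovacuum process on top of any human superusers,
# so autovacuum can still start when all regular slots are taken.
superuser_reserved_connections = 3
```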
[HACKERS] Performance testing of COPY (SELECT) TO
Hi, we have a large export here, so I made an in-house benchmark between Informix, plain PostgreSQL-8.1.4 and 8.2devel+COPY(SELECT) using the same data and query. Find the results below for the two PostgreSQL versions. With PostgreSQL 8.1.4, I used this: begin; select ... into temp myquery1; copy myquery1 to stdout csv delimiter '|'; rollback; With 8.2devel, I simply used: copy (select ...) to stdout csv delimiter '|';

# of clients:       1*      3**     10**
PostgreSQL 8.1.4    1:33    10:58   55:46
PostgreSQL 8.2      1:19    4:55    18:28

* - average of 4 runs, the first with cold caches after reboot
** - 1 run, average of the clients' runtimes

The performance improvement between 8.1.4 and 8.2devel is interesting: 1 client: 15% 3 clients: 55.2% 10 clients: 66.9% The same machine was used for testing. Best regards, Zoltán Böszörményi
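As a sanity check on the reported percentages, converting the mm:ss timings to seconds and computing the relative improvement reproduces the figures in the post (the 1-client case rounds to 15.1%, which the post reports as 15%):

```python
def mmss_to_seconds(t: str) -> int:
    """Convert a 'mm:ss' timing string to seconds."""
    minutes, seconds = t.split(":")
    return int(minutes) * 60 + int(seconds)

# Runtimes from the benchmark for 1, 3, and 10 clients.
pg814 = ["1:33", "10:58", "55:46"]
pg82 = ["1:19", "4:55", "18:28"]

for clients, t_old, t_new in zip([1, 3, 10], pg814, pg82):
    old_s, new_s = mmss_to_seconds(t_old), mmss_to_seconds(t_new)
    gain = 100 * (1 - new_s / old_s)
    # prints 15.1%, 55.2%, 66.9% for 1, 3, and 10 clients
    print(f"{clients:>2} clients: {gain:.1f}% faster")
```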
[HACKERS] 8.2 new features
Maybe the following item would look good in a 'new features' list targeted at the less technically oriented audience: - LDAP support for authentication and centralized connection management. Yours, Laurenz Albe