[GENERAL] How do you handle shared memory corruption issues? (contrib/xml2)
PostgreSQL 8.1

I made a simple modification to contrib/xml2 to add the ability to process EXSLT tags... On the production side, the postmaster crashes when I try to process my EXSLT stylesheet. On my development machine, everything runs without crashing. There are a number of differences between the two: the development box has little or no load, has far less memory (and a PostgreSQL memory configuration to match), and has a different processor (P4 versus x86_64 in production). There are surely other differences at the compiler and library level as well. Any of these things could be contributing to the crashes.

Both machines run the same version of PostgreSQL, compiled with the same options, and the same versions of libxml2 and libxslt.

Incidentally, I installed the Perl bindings for libxml2 and libxslt and created a plperlu function which uses those libraries to do the XML transformations. It blows up in the same way.

So, I suspect that there is an issue with libxslt, but I have no idea how to nail it down. Please advise... I'm up against a deadline, so my main goal is to get this working. Timelines and paychecks being what they are, I'm not nearly as concerned about fixing the larger problem. Is there a different way to process XML and (E)XSLT at the database without using contrib/xml2 or XML::LibXSLT?
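For reference, the plperlu wrapper I mention looks roughly like this (the function name and signature here are only illustrative, and whether the EXSLT extension functions are actually registered depends on how libxslt and XML::LibXSLT were built):

CREATE OR REPLACE FUNCTION xslt_transform(doc text, stylesheet text) RETURNS text AS $$
    use XML::LibXML;
    use XML::LibXSLT;
    my ($doc, $style) = @_;
    my $parser = XML::LibXML->new();
    my $xslt   = XML::LibXSLT->new();
    # parse the document and the stylesheet passed in as text
    my $source    = $parser->parse_string($doc);
    my $style_doc = $parser->parse_string($style);
    my $sheet     = $xslt->parse_stylesheet($style_doc);
    # run the transformation and hand the result back as text
    my $result = $sheet->transform($source);
    return $sheet->output_string($result);
$$ LANGUAGE plperlu;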
Re: [GENERAL] How do you handle shared memory corruption issues? (contrib/xml2)
I compiled the libraries from the source tarballs. I configured using --with-crypt --with-debugger --with-python. It is always possible I missed something!

Do you think the RPMs will even be applicable, considering I'm not using Red Hat or Fedora?

--- On Wed, 6/27/07, Martin Gainty wrote:
> Hi CG
>
> It looks as if your 64-bit box needs 64-bit libraries instead of the default 32-bit ones.
> Did you check here?
> http://dries.studentenweb.org/apt/packages/perl-XML-LibXSLT/info.html
>
> Let us know
> M--
Re: [GENERAL] How do you handle shared memory corruption issues? (contrib/xml2)
...As we continue through this process, I just want to express my sincere thanks for your suggestions so far.

I use neither Red Hat nor Fedora. :( I also didn't look carefully at the RPMs you were offering me: they are for XML::LibXSLT, not libxslt. Sorry about that!

I originally built XML::LibXSLT using CPAN. For argument's sake, I dropped in just the binary from the Fedora Core 6 x86_64 RPM (the rest of the RPM's contents, according to diff, were identical). It blew up with a floating point error when I ran:

#!/usr/bin/perl
use XML::LibXSLT;
use XML::LibXML;

my $parser = XML::LibXML->new();
my $xslt = XML::LibXSLT->new();

my $source = $parser->parse_file('foo.xml');
my $style_doc = $parser->parse_file('bar.xsl');

my $stylesheet = $xslt->parse_stylesheet($style_doc);
my $results = $stylesheet->transform($source);

print $stylesheet->output_string($results);

The auto/XML/LibXSLT/LibXSLT.so library compiled on the machine on which it runs should be sufficient, right?

--- On Wed, 6/27/07, Martin Gainty wrote:
> Yes... go with the specific distro for your OS (either RH or Fedora).
> Keep us apprised!
> Thanks / M--
[GENERAL] xpath_string namespace issue...
I'm not sure what I'm missing here... :)

select xpath_string($xml$ baz $xml$, '//f:bar/text()')

This does not give me back "baz" as I was expecting it to... How does one clue in the xpath functions to the namespaces in the XML document?
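For illustration, here is a self-contained version of the kind of call I mean (the namespace URI is made up):

SELECT xpath_string('<foo xmlns:f="http://example.com/ns"><f:bar>baz</f:bar></foo>',
                    '//f:bar/text()');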
[GENERAL] xpath_* namespace bug
select xpath_string($xml$ baz $xml$, '//f:bar/text()');

This does not give me back "baz" as I was expecting it to... It seems like xpath is ignoring the namespace directives in the source XML document.
Re: [GENERAL] xpath_* namespace bug
--- On Mon, 7/9/07, Nikolay Samokhvalov wrote:
> AFAIK, contrib/xml2 doesn't support namespaces for XPath expressions at all.
> Wait for PostgreSQL 8.3 :-)

Can't wait... Got to have it now. :(

I patched xpath.c and created a function which takes a third parameter: the known namespaces, as a space-delimited list of namespace mappings. I based the patch on some example code from the libxml website, which works perfectly from the command line. However, in PostgreSQL it only works 65% of the time; the other 35% of the time it returns an empty string. I have no idea how to debug that kind of problem.

You make it sound like 8.3 will have full-featured xpath_* functions. Will it have full-featured XSLT support as well?
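For reference, the built-in xpath() function that arrives in 8.3 takes namespace mappings explicitly, as a third argument of {prefix, URI} pairs (URI invented here):

SELECT xpath('//f:bar/text()',
             '<foo xmlns:f="http://example.com/ns"><f:bar>baz</f:bar></foo>'::xml,
             ARRAY[ARRAY['f', 'http://example.com/ns']]);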
[GENERAL] Need help to clean up after failed CLUSTER
I'm using PostgreSQL 8.1 ... I had to terminate some clustering before it had completed. I think I have quite a bit of wasted disk space in half-baked table files. I need some suggestions for an easy way to find and clean out the files left over from the failed cluster operations. TIA!
[GENERAL] Automatic type conversion
I'm in the process of attempting to upgrade from 8.1 to 8.4, and I've been using the uniqueidentifier contrib module for my UUIDs... In 8.1, PostgreSQL was able to figure out what to do with statements like

# SELECT 'Your new UUID is ' || newid();
                        ?column?
-------------------------------------------------------
 Your new UUID is 5f1a4987-9005-42d6-8e62-d1ff14bb1433
(1 row)

but in 8.4 (and probably 8.3 also...):

# SELECT 'Your new UUID is ' || newid();
ERROR:  operator is not unique: unknown || uniqueidentifier
LINE 1: SELECT 'Your new UUID is ' || newid();
                                   ^
HINT:  Could not choose a best candidate operator. You might need to add explicit type casts.

But with the new internal UUID datatype, what is the magic recipe that allows this to run?

# SELECT 'Your new UUID is ' || uuid_generate_v4();
                        ?column?
-------------------------------------------------------
 Your new UUID is cd8d8f5f-3eea-4e96-be4f-f93daa174d8f
(1 row)

I could add the explicit type casts, but I'd rather find out the nature of the subtle (or not-so-subtle) difference I've stumbled upon.
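For what it's worth, the explicit cast the HINT asks for is presumably along these lines, and it does sidestep the ambiguity:

SELECT 'Your new UUID is ' || newid()::text;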
[GENERAL] best practice transitioning from one datatype to another
Another UUID question... I was hoping to put this off for a while, but it looks like it's going to need to happen sooner rather than later. Rats! I keep telling myself transitioning to core datatypes is good for the soul.

While transitioning from 8.1 to 8.4, I need to move to the internal UUID type in place of the contrib/uniqueidentifier module. I've built the database around uniqueidentifier, so nearly every table has one column of that data type. It's going to be tedious to

ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE uuid;
...repeat 600 times...

I'll also have to drop and reload the views and the rules on tables. It'll be tedious even if the tables have no data in them. Can anyone recommend a better/faster way to make the transition?
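The best semi-automatic approach I've come up with so far is to generate the statements from the system catalogs and feed the output back through psql; a sketch (it assumes the columns are still of type uniqueidentifier when it runs):

SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       || ' ALTER COLUMN ' || quote_ident(a.attname)
       || ' TYPE uuid USING ' || quote_ident(a.attname) || '::text::uuid;'
FROM pg_attribute a
JOIN pg_class c     ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_type t      ON t.oid = a.atttypid
WHERE t.typname = 'uniqueidentifier'
  AND c.relkind = 'r'
  AND a.attnum > 0
  AND NOT a.attisdropped;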
Re: [GENERAL] best practice transitioning from one datatype to another
Trying to fix a dump file with

cat backup.sql | sed -e "s/uniqueidentifier/uuid/g" > backup_fixed.sql

gives me a dump that won't import. It is hard digging through 30+ gigs of text data to find where sed ate a field delimiter, so I'm going to give Tom's idea a try. I didn't even know ALTER TYPE x RENAME TO y; was available, and I probably wouldn't have tried it if Tom hadn't suggested it. It takes a certain amount of chutzpah to make that kind of change before diving into the lengthy process of a database upgrade.

--- On Thu, 7/16/09, Arndt Lehmann wrote:
> Just an idea - don't know if it works, first try on a test server:
> 1. Dump the complete database into text format (pg_dump --format=t)
> 2. Do a search and replace from "uniqueidentifier" to "uuid"
> 3. Reimport

--- Andreas Wenk wrote:
> uh - --format=t means tar format. --format=p means plain text ...

--- Arndt Lehmann replied:
> You are perfectly right. My oversight.
Re: [GENERAL] best practice transitioning from one datatype to another
--- On Wed, 7/15/09, Tom Lane wrote:
> Couldn't you rename the type to uuid in the 8.1 database before you dump?

I'm stumped, at least for an easy way to do this.

When referencing the uniqueidentifier data type in PostgreSQL 8.1 (now uuid, thanks to an UPDATE on the system catalogs) in function definitions in schemas not in the search path, one must reference the data type as "public.uniqueidentifier" (ahem: "public.uuid"). This was done because the query planner couldn't determine type equality with 100% certainty (i.e. for foreign key constraints, comparisons in WHERE clauses...), so it decided to use sequential scans where index scans would have been most appropriate.

When I reload into PostgreSQL 8.4, it doesn't understand public.uuid. So, the only way I can think of to get those functions back into the database is to use a restore list, keep them from being restored in the first place, and then re-create them manually, one at a time. There must be a better way, though!

I'm sure this is only the second of several more hurdles to overcome before I'm finished with the transition. Your wisdom will be appreciated!

CG
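To illustrate the mismatch (the schema and function here are invented): the 8.1 dump emits definitions like the first form below, while 8.4 only accepts the second, because the built-in type lives in pg_catalog rather than public:

-- as dumped from 8.1; 8.4 rejects the public.uuid reference
CREATE FUNCTION audit.is_known(p_id public.uuid) RETURNS boolean
    AS 'SELECT EXISTS (SELECT 1 FROM packet WHERE packet_uuid = $1)' LANGUAGE sql;

-- what 8.4 accepts
CREATE FUNCTION audit.is_known(p_id uuid) RETURNS boolean
    AS 'SELECT EXISTS (SELECT 1 FROM packet WHERE packet_uuid = $1)' LANGUAGE sql;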
[GENERAL] anyelement and anynonarray inclusion
I use a custom datatype that's a good candidate for being included in the family of data types that fit in the "anynonarray" and "anyelement" categories. How can I get PostgreSQL to include that data type when it is considering selecting functions and operators that take the polymorphic types?
[GENERAL] ALTER TABLE ... DISABLE TRIGGERS Isolation leve
I'm using 8.4.1. I want to add a column to a table, but there are update triggers on it that will fire but don't need to for this operation. So I'd like to add the column with the triggers off. Normally this operation would take 10 or so seconds, so locking the table for that amount of time is not a big deal. I just want to make sure that no new data gets written to the table while the triggers are disabled.

BEGIN;
ALTER TABLE foo ADD COLUMN bar DISABLE TRIGGER USER;
COMMIT;

seems to leave the triggers disabled. My tests seem to show that

BEGIN;
ALTER TABLE foo DISABLE TRIGGER USER;   -- locks the table fully, then
ALTER TABLE foo ADD COLUMN bar;
ALTER TABLE foo ENABLE TRIGGER USER;
COMMIT;

gets the job done. I only pause because I figured that a DISABLE TRIGGER issued inside a transaction would have flopped back when the transaction committed. I was wrong about that. I only need a little bit of affirmation, or a kick in the right direction. Thanks, folks.
[GENERAL] information_schema.referential_constraints contains NULLs
PostgreSQL 9.1.0

For some of the referential constraints listed in my information_schema.referential_constraints view, the values of the fields unique_constraint_catalog, unique_constraint_schema, and unique_constraint_name are NULL. There doesn't seem to be any rhyme or reason to which ones have values and which ones don't. They should all have values, right? Any ideas what could be going on?
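For what it's worth, this is roughly how I'm listing the affected constraints:

SELECT constraint_schema, constraint_name,
       unique_constraint_schema, unique_constraint_name
FROM information_schema.referential_constraints
WHERE unique_constraint_name IS NULL;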
[GENERAL] Best way to debug user defined type
I'm trying to debug a C function that is used for the binary send/receive part of a user-defined type. I can debug the send part fine, but the receive part takes an input parameter of type "internal", and I can't seem to conjure up an "internal" value to feed to my function to test whether the output is working. Can you suggest a debugging strategy?
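The closest I've come is letting the server build the "internal" value itself by round-tripping the type through binary COPY, which exercises both the send and the receive function end-to-end (sketch only; the table and type names are placeholders):

-- push a few values out through the type's send function
COPY (SELECT udt_col FROM some_table LIMIT 100) TO '/tmp/udt_test.bin' WITH BINARY;
-- pull them back in through the receive function
CREATE TEMP TABLE udt_scratch (val my_udt);
COPY udt_scratch FROM '/tmp/udt_test.bin' WITH BINARY;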
[GENERAL] pg_dump and ON DELETE CASCADE problem
Hi all,

We're using PostgreSQL 8.4 ... We do our nightly database backups with pg_dump. I was doing a test restore, and during the reload I encountered data in a table that violated a foreign key constraint. I run my restores with the "-e" option to halt on errors, so this data halted the restore... I went to check the running database, and the row in question had been deleted.

I had defined the foreign key to cascade on delete, and I imagine that during the dump the delete occurred on the master table. Perhaps the keyed table had already been dumped, so when it came time to dump the master table, the referenced row was no longer there to be dumped. One would imagine that PostgreSQL would have protections for that sort of thing... Can you think of how I can protect against this in the future?

CG
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
The command's nothing out of the ordinary:

#!/bin/bash
export LD_LIBRARY_PATH=/usr/local/pgsql/lib
#
# Set Variables
#
DAY_NUM=`/bin/date +"%d"`
MON_NUM=`/bin/date +"%m"`
YEAR_NUM=`/bin/date +"%Y"`
/usr/local/pgsql/bin/pg_dump -h 192.168.1.5 -Upostgres -f backup.$YEAR_NUM$MON_NUM$DAY_NUM.pga -Fc -b data
#END

Curiouser and curiouser... Last night's dump failed to restore in the same way:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 7545; 2606 311883439 FK CONSTRAINT packet_search_trigram_puuid_fkey postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  insert or update on table "packet_search_trigram" violates foreign key constraint "packet_search_trigram_puuid_fkey"
DETAIL:  Key (packet_uuid)=(0ab44da9-544d-413a-9ab1-a1b442310b24) is not present in table "packet".
    Command was: ALTER TABLE ONLY packet_search_trigram ADD CONSTRAINT packet_search_trigram_puuid_fkey FOREIGN KEY (packet_uuid) REFERE...
pg_restore: *** aborted because of error
pg_restore: finished item 7545 FK CONSTRAINT packet_search_trigram_puuid_fkey
pg_restore: [archiver] worker process failed: exit code 1
pg_restore: *** aborted because of error

That was the same failure I got the previous night. I go to the live database and rows with that key are /not/ in either one of those tables. They /were/ in the tables at one point. I have an ON DELETE trigger that copies deleted rows into another table, so I can see that a row with that key once existed in those tables.

This may not be a pg_dump problem, but some sort of MVCC irregularity where pg_dump is able to dump rows that it shouldn't. I bet a VACUUM FULL would clean this up, but I have a live problem here. If I eradicate it, who knows when we'll see it again...

--- On Wed, 12/9/09, Craig Ringer wrote:
> pg_dump does all its work in a single serializable transaction to avoid this
> sort of problem. It doesn't see any changes made to the database after it starts.
>
> So, assuming you used pg_dump to dump the database as a whole rather than
> invoking it separately for a bunch of separate tables, that should not be
> your problem.
>
> How do you run pg_dump? Can you supply the script or command line?
>
>> One would imagine that PostgreSQL would have protections for that sort of thing...
>
> It does, which is what makes the issue you've encountered somewhat strange.
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
Thanks for the suggestion. I'm not sure what you mean when you say I should restore to a file. Do you mean I should dump the database to an SQL file instead of the "compressed" format? What do you think I will find?

In the database dump, it is including a row that should be marked as deleted. I can select on that key in the production database and get zero rows, and I can select on that key in the restored database and find the row. When I ignore errors, the data is restored, but the foreign key can't be created (and that is the only error I encounter). The presence of the data in the dump cannot be contested... :)

--- On Thu, 12/10/09, Adrian Klaver wrote:
> One thing that comes to mind is to restore the dump file to a file instead
> of a database and see what is being dumped from the live database.
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
That's really nifty! I didn't know you could do that! So I expanded it, I grepped for that UUID through the 46-gig file, and I found the row in the dump that shouldn't be there... It defies explanation.

--- On Thu, 12/10/09, John R Pierce wrote:
> he meant...
>
>     pg_restore -f outputfile.sql yourdumpfile
>
> this will convert the dumpfile to SQL...
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
--- On Fri, 12/11/09, Scott Marlowe wrote:
> This could be a corrupted index problem maybe? If you do this:
>
>     set enable_indexscan=off;
>     select * from table where key=value;
>
> does it still not show up?

Bingo. Showed right up. I did a reindex, and now the row shows up whether it is found via sequential scan or index scan.

So that's pretty scary, to have a corrupted index. Once I reindexed, I'm able to see /a lot/ of data I couldn't before. This is the first time in 9 years that I've been bitten by PostgreSQL, and this one HURT. PostgreSQL didn't crash, so there was no indication of failure until the dump-reload. To quote from the masters: "Although in theory this should never happen, in practice indexes may become corrupted due to software bugs or hardware failures."

I'm reasonably certain that the hardware for the server is sound. No crashes, no alarms... That leaves software bugs. We're running PostgreSQL 8.4.1. I don't see any smoking-gun bugfixes in 8.4.2, but we'll upgrade ASAP anyway...

What are your suggestions for how to proceed?
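For the archives, the check-and-repair sequence was roughly this (table and index names are the ones from the schema posted later in this thread):

SET enable_indexscan = off;
SET enable_bitmapscan = off;
-- the "missing" row appears once the planner is forced to a seqscan
SELECT * FROM packet WHERE packet_uuid = '0ab44da9-544d-413a-9ab1-a1b442310b24';
RESET enable_indexscan;
RESET enable_bitmapscan;
-- rebuild the suspect indexes
REINDEX TABLE packet;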
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
--- On Tue, 12/15/09, Adrian Klaver wrote:
> Interesting, though something is still bothering me. To quote from one of your posts upstream:
>
> "That was the same failure I got the previous night. I go to the live database
> and rows with that key are /not/ in either one of those tables. They /were/ in
> the tables at one point. I have an ON DELETE trigger that copies deleted rows
> into another table, so I can see that a row with that key once existed in those
> tables."
>
> It would seem that the rows were deleted and should not be there when the table
> was reindexed. Are the 'new' rows you are seeing also in the delete table?

select foo from bar where baz = 'key';

I was mistaken when I said that the row was not in the table. If I had an index on baz, and the index was corrupted, that SQL would return 0 rows, leading me to believe that there were no rows in the table with that key. And the reason that row remained in the database after its foreign-keyed parent row was deleted is that the delete operation depended on the index to find the rows to delete, and that index was corrupt. Of course, I had no idea that the index was corrupt when I made my first post.

On the table that has the "martian" row, there is no delete storage. Since the data in the table is trigger-generated for FTI searches, there's no reason to keep that data around.

I'm still faced with the unpleasant and unresolved issue of why the index was corrupted in the first place.
Re: [GENERAL] pg_dump and ON DELETE CASCADE problem
--- On Thu, 12/17/09, Adrian Klaver wrote:
> Would it be possible to see the table schemas and indices?

Sure (you asked for it!!):

CREATE TABLE packet
(
  id integer NOT NULL DEFAULT nextval('packet_id_seq'::regclass),
  packet_uuid uniqueidentifier NOT NULL DEFAULT newid(),
  username character varying(50) NOT NULL DEFAULT ''::character varying,
  pgroup_uuid uniqueidentifier DEFAULT newid(),
  orig_trans_uuid uniqueidentifier,
  user_reference_id character varying(50) DEFAULT ''::character varying,
  trans_data character varying(100) NOT NULL DEFAULT ''::character varying,
  trans_type character varying(50) NOT NULL DEFAULT 'unknown'::character varying,
  trans_date timestamp with time zone DEFAULT ('now'::text)::timestamp(6) with time zone,
  processor character varying(10),
  service character varying(10),
  CONSTRAINT packet_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=TRUE
);
ALTER TABLE packet OWNER TO postgres;
GRANT ALL ON TABLE packet TO postgres;
GRANT ALL ON TABLE packet TO adduser;

CREATE INDEX packet_otuuid_idx ON packet USING btree (orig_trans_uuid);
CREATE INDEX packet_pgroup_uuid_idx ON packet USING btree (pgroup_uuid);
CREATE INDEX packet_puuid_hash_uniq ON packet USING hash (packet_uuid);
CREATE UNIQUE INDEX packet_puuid_idx ON packet USING btree (packet_uuid);
CREATE INDEX packet_trans_date_idx ON packet USING btree (trans_date);
CREATE INDEX packet_user_idx ON packet USING btree (username);
CREATE INDEX packet_user_puuid_idx ON packet USING btree (username, packet_uuid);

CREATE OR REPLACE RULE packet_delete_rule AS
    ON DELETE TO packet DO
    INSERT INTO removed_packet (id, packet_uuid, username, pgroup_uuid, orig_trans_uuid,
                                user_reference_id, trans_data, trans_type, trans_date,
                                processor, service)
    SELECT packet.id, packet.packet_uuid, packet.username, packet.pgroup_uuid,
           packet.orig_trans_uuid, packet.user_reference_id, packet.trans_data,
           packet.trans_type, packet.trans_date, packet.processor, packet.service
    FROM packet
    WHERE packet.id = old.id;

CREATE TRIGGER packet_count_delete_trig BEFORE DELETE ON packet FOR EACH ROW EXECUTE PROCEDURE letter_count_trig();
CREATE TRIGGER packet_count_insert_trig AFTER INSERT ON packet FOR EACH ROW EXECUTE PROCEDURE letter_count_trig();
CREATE TRIGGER packet_delete_trig BEFORE DELETE ON packet FOR EACH ROW EXECUTE PROCEDURE packet_datalink_status_trig();
CREATE TRIGGER packet_insert_trig AFTER INSERT ON packet FOR EACH ROW EXECUTE PROCEDURE packet_ins_trig();

CREATE TABLE dpo.packet_search_trigram
(
  id integer NOT NULL DEFAULT nextval('packet_search_trigram_id_seq'::regclass),
  packet_uuid uniqueidentifier NOT NULL,
  trigram_vector tsvector NOT NULL,
  CONSTRAINT packet_search_trigram_id_pkey PRIMARY KEY (id),
  CONSTRAINT packet_search_trigram_puuid_fkey FOREIGN KEY (packet_uuid)
      REFERENCES dpo.packet (packet_uuid) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE CASCADE
)
WITH (
  OIDS=FALSE
);
ALTER TABLE dpo.packet_search_trigram OWNER TO postgres;
GRANT ALL ON TABLE dpo.packet_search_trigram TO postgres WITH GRANT OPTION;
GRANT ALL ON TABLE dpo.packet_search_trigram TO addgroup;

CREATE INDEX packet_search_trigram_packet_uuid_idx ON dpo.packet_search_trigram USING hash (packet_uuid);
CREATE INDEX packet_search_trigram_trigram_vector_idx ON dpo.packet_search_trigram USING gin (trigram_vector);
[GENERAL] Function written in C, hangs on one machine and not another...
PostgreSQL 7.4 ... Essentially, I've written a function in C for use with PostgreSQL. The debugger shows that the program is hanging on the part that writes data into its own STDIN.

[snip]
// Open up and hijack STDIN
int pipe_pair[2];
int pipe_rv = pipe(pipe_pair);
if (pipe_rv != 0) // Abort! Abort!
{
    close(pipe_pair[1]);
    pfree(param_1);
    pfree(param_2);
    PG_RETURN_NULL();
}
int newfd = dup2(pipe_pair[0], STDIN_FILENO);
if (newfd != 0) // Abort! Abort!
{
    close(pipe_pair[1]);
    pfree(param_1);
    pfree(param_2);
    PG_RETURN_NULL();
}
// Write param_1 to hijacked pipe
write(pipe_pair[1], param_1, param_1_len); // Hangs here...
[/snip]

It works on the machine I use for testing from within PostgreSQL, but it doesn't work on the machine which is the production server. I'd hate for this to matter, but I ought to disclose that the testing machine is a 1-way AMD box with a more recent version of the Linux 2.6 kernel and a more recent version of libc. The production machine is a 2-way Dell Xeon. Same version of PostgreSQL, compiled with the same flags (except with debugging symbols on the testing machine). You'd think, or at least I would, that simple code like this would compile and run on multiple platforms... I can perform the same STDIN hijacking on both machines in a standalone program, but it fails under PostgreSQL.

I'm completely stumped, and I need YOUR insight! Thank you!!

CGV
Re: [GENERAL] Function written in C, hangs on one machine and not another...
--- Martijn van Oosterhout wrote:
> Umm, what *are* you trying to do? Is this running in the backend?

Yes, running in the backend. I'm trying to use Adobe's FDF toolkit to parse the FDF files stored in my database. They distribute a C library that can be used to parse FDF files.

> Firstly, depending on the size of param_1, the write will block because it
> can't write all of it (usually PIPE_BUF). Perhaps recent kernel versions have
> changed to make it so no data is accepted until a reader appears, even if the
> data is smaller than that.
>
> Since apparently you want the read to happen in the same process as the write,
> you've just deadlocked yourself. The write won't happen till someone reads,
> and the read won't happen because you're stuck writing...

So it might be a kernel thing. What is different when the function is called from within PostgreSQL versus being called in a standalone program?

> Finally, this is insane, why would you want to change STDIN?

Insanity? I agree completely. The major issue is that the FDF Toolkit has only one function for reading in FDF data:

/* FDFOpen: Reads an FDF file into memory. Client should call FDFClose() when the FDF
   is no longer needed.
   Parameters:
   - fileName: Complete pathname (in Host encoding), or "-" to read from stdin.
   - howMany:  If fileName specifies stdin, then howMany should indicate the number of
               characters to read. Otherwise, it is unused. In a web server environment,
               this is available as the value of the CONTENT_LENGTH environment variable.
               In some servers executing cgi-bin scripts, if the script tries to read stdin
               until an EOF is reached, the script hangs. Thus this parameter.
   - pTheFDF:  If FDFOpen() returns FDFErcOK, then pTheFDF will point to an FDFDoc, which
               is needed for most other calls in the API.
   - Error codes: FDFErcBadParameter, FDFErcFileSysErr, FDFErcBadFDF, FDFErcInternalError
*/
FDFLIBAPI FDFErc FDFOpen(const char* fileName, ASInt32 howMany, FDFDoc* pTheFDF);

There's no other way to load data into the toolkit! (Can you /feel/ the insanity?) Does this give you any more insight into an alternate method of getting this thing done?
Re: [GENERAL] Function written in C, hangs on one machine and not another...
--- Martijn van Oosterhout wrote:
> Not entirely sure, but I'm sure the size of the write matters. For example,
> in your test program, did you check that the write actually wrote everything?

There are beginning and ending tokens in the FDF file spec. The toolkit complains if the data isn't all there...

> Firstly, instead of using stdin, you can pass /dev/fd/ as the filename
> (on Linux). This avoids stuffing with stdin.

That's a FANTASTIC idea. I'll give it a go. We'll cross our fingers, hold our breath, and hope that the blocking issue evaporates. :)

> ISTM the best idea: write the data to disk then read it back. Why be
> difficult when you can do it easily...

I was never supposed to have to do this sort of thing. The idea was never to pull individual pieces of data out of the FDFs. Now the bosses say I have to do some usage analysis, and the data is locked up tight in an FDF. I suppose I could write 100+ files to disk, read them back off, and then delete them. At the time, that seemed more insane to me than trying to pump data into stdin. I'm not so sure anymore... :)
Re: [GENERAL] Function written in C, hangs on one machine and not another...
Thanks to the great suggestions, I've at least gotten it to not hang... Martijn's hint about blocking led me to open those file handles in non-blocking mode. It appears that write() will write, at a maximum, only 4096 bytes when it is called from within PostgreSQL. I've tried to push data into it in <=4096-byte slugs, but after 4096 bytes it just won't take any more. Since (I think) using non-blocking mode could cause problems with thread safety, it's probably a lost cause.

I'm new to C, so this may seem extremely naive: I'm not sure how to use exec() to solve this problem. Could you give me a few pointers to get me started?

--- Douglas McNaught wrote:
> Martijn van Oosterhout writes:
>> The things that have screwed me up in the past with pulling tricks like this are:
>>
>> 1. Program has registered atexit() handlers. _exit() avoids this.
>> 2. Pending stdio output that gets flushed. The backend doesn't use stdio much
>>    so you might be fine here.
>> 3. Signals. Make sure you don't get sent signals that screw state. Might be
>>    wise to block them all, or reset them all to default.
>>
>> Truly, exec() is the cleanest way to solve all this, it simply replaces the
>> current process, lock, stock and barrel.
>
> Definitely. It would probably also be good to close all file descriptors
> (except for stdin/stdout/stderr) before exec(), just in case the other binary
> does something screwy with random file descriptors (which it obviously shouldn't).
[GENERAL] CLUSTERing on Insert
As I'm waiting for a CLUSTER operation to finish, it occurs to me that in a lot of cases the performance benefit of having one's data stored on disk in index order can outweigh the overhead involved in inserting data on disk in index order. Just an idea I thought I'd throw out. :)

Also, the CLUSTER operation is about as straightforward as one can get. It basically reads each row, one by one, in index order over to a new table, reindexes, then renames the new table to preserve references. I've been thinking about how to speed up the copy process. Perhaps taking contiguous blocks of data and moving them into place would save some I/O time.

Locking the table is another problem. Would it be impossible to perform the CLUSTER within the context of a READ COMMITTED transaction, and then pick up the leftover CRUD rows and put them at the end of the file? The existing code makes some assumptions that the table was not altered; there would be no more such assumptions.

I'm sure I'm not the first person to scratch his head thinking about CLUSTER. Maybe I just don't really understand the limitations that prevent these things from being implemented. But what else is there to do at 1 AM on a Sunday night, waiting for a 500MB table to CLUSTER? :)

CG
[GENERAL] Creating an index-type for LIKE '%value%'
Once upon a time there was an FTI contrib module that split up a varchar field into little bits and placed them into an FTI table to facilitate a full-text index search. It was like being able to do

SELECT * FROM table WHERE field LIKE '%value%';

and have it search an index! It was a great idea. What a pain it was to implement! You see, a trigger had to run on INSERT and UPDATE to split the varchar field into little pieces, on DELETE you had to clear the corresponding rows out of the FTI table, and when you wanted to use the FTI table in a SELECT you had to write your SQL to join against it and dig through it.

As I was exploring ways to optimize my application's use of the database, which has to run the horrid "LIKE '%value%'" in places, I thought this solution could be built upon to allow for an easier deployment. AFAICT, the "right" way to do this would be to create an index type which takes care of splitting the varchar field, and to have the query planner use that index in a SELECT when a "LIKE '%value%'" appears in the WHERE clause.

Tsearch2 is fantastic, but it works best for fields that contain words. I have to sift through alphanumeric identification numbers. Is the split-field FTI the best way to tackle my problem? What can I do to get better performance on

SELECT * FROM table WHERE field LIKE '%value%';

??

CG
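For anyone who never saw that module, the arrangement amounted to something like this (table and column names invented):

-- side table holding the pieces of the indexed field
CREATE TABLE item_fti (
    item_id integer NOT NULL REFERENCES item (id) ON DELETE CASCADE,
    piece   varchar(50) NOT NULL
);
CREATE INDEX item_fti_piece_idx ON item_fti (piece varchar_pattern_ops);

-- searches join through the side table with an anchored LIKE,
-- which the btree index can satisfy
SELECT i.*
FROM item i
JOIN item_fti f ON f.item_id = i.id
WHERE f.piece LIKE 'value%';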
[GENERAL] ltree + gist index performance degrades significantly over a night
PostgreSQL 8.1.3

I'm trying to collect some hard numbers to show just how much it degrades and over how long a time interval. All I have now is anecdotal evidence, and I was hoping to save myself some downtime by seeking advice early.

I have a search table which I use for partial-match text searches:

CREATE TABLE search
(
  id int4 NOT NULL DEFAULT nextval('search_id_seq'::regclass),
  item_id int8 NOT NULL,
  search_vector ltree NOT NULL,
  CONSTRAINT search_id_pkey PRIMARY KEY (id),
  CONSTRAINT search_item_id_fkey FOREIGN KEY (item_id)
      REFERENCES items (id) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE CASCADE
) WITH OIDS;

CREATE INDEX lsearch_vector_idx ON search USING gist (search_vector);

I have some triggers that insert rows into the search table as rows are inserted into "items". I implemented this yesterday, and the immediate effect was fantastic return times for partial text searches, in the sub-second range. By today, these queries sometimes take 10 minutes...

There are about 134000 rows in the table. The table gets analyzed nightly. Should the frequency be higher? About 1000 rows are added a day, only about 30 or so are removed, and nothing is ever updated. There's not that much turnover.

The search vectors are built like this: for a string "Hello World", the ltree is created as 'h.e.l.l.o.w.o.r.l.d'. If I wanted to find all rows with "orl" in them, I would construct an lquery like '*.o.r.l.*' and use the "~" operator in the WHERE clause, joining to the table "items" by item_id.

What could be making this go so wrong? Is there a better way to accomplish my task?

CG
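Concretely, the inserts and searches look like this (the item_id value is arbitrary):

INSERT INTO search (item_id, search_vector)
VALUES (42, text2ltree('h.e.l.l.o.w.o.r.l.d'));

SELECT item_id
FROM search
WHERE search_vector ~ '*.o.r.l.*'::lquery;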
Re: [GENERAL] ltree + gist index performance degrades significantly over a night
--- Scott Marlowe wrote:
> Are you vacuuming regularly, are your fsm settings high enough, and what does vacuum verbose say?

Autovacuum is running, but I do a nightly vacuum analyze. When I just do a vacuum analyze on the table I get:

data=# vacuum analyze verbose search;
INFO:  vacuuming "search"
INFO:  index "search_id_pkey" now contains 1344661 row versions in 5134 pages
DETAIL:  9 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.09s/0.07u sec elapsed 4.91 sec.
INFO:  index "search_vector_idx" now contains 1344672 row versions in 47725 pages
DETAIL:  9 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.77s/0.37u sec elapsed 407.55 sec.
INFO:  index "search_item_id_idx" now contains 1344690 row versions in 6652 pages
DETAIL:  9 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.24s/0.08u sec elapsed 45.62 sec.
INFO:  "search": removed 9 row versions in 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.05 sec.
INFO:  "letter_search": found 9 removable, 1344661 nonremovable row versions in 33984 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 141 unused item pointers.
0 pages are entirely empty.
CPU 2.41s/0.62u sec elapsed 483.06 sec.
INFO:  vacuuming "pg_toast.pg_toast_174918394"
INFO:  index "pg_toast_174918394_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  "pg_toast_174918394": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  analyzing "search"
INFO:  "search": scanned 3000 of 33984 pages, containing 119035 live rows and 0 dead rows; 3000 rows in sample, 1348428 estimated total rows
VACUUM

max_fsm_pages = 6           # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 2000    # min 100, ~70 bytes each
Re: [GENERAL] ltree + gist index performance degrades significantly over a night
--- Martijn van Oosterhout wrote:
> That's very odd. Like the other person said, do you vacuum and analyse?
> But my question is: is it using the index? What does EXPLAIN / EXPLAIN ANALYZE tell you?

data=# explain select * from search where search_vector ~ '*.o.r.l.*'::lquery;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on search  (cost=53.71..4566.65 rows=1345 width=161)
   Recheck Cond: (search_vector ~ '*.o.r.l.*'::lquery)
   ->  Bitmap Index Scan on search_vector_idx  (cost=0.00..53.71 rows=1467 width=0)
         Index Cond: (search_vector ~ '*.o.r.l.*'::lquery)
(4 rows)

data=# explain analyze select * from search where search_vector ~ '*.o.r.l.*'::lquery;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on search  (cost=53.71..4566.65 rows=1345 width=161) (actual time=183684.156..196997.278 rows=1655 loops=1)
   Recheck Cond: (search_vector ~ '*.o.r.l.*'::lquery)
   ->  Bitmap Index Scan on search_vector_idx  (cost=0.00..53.71 rows=1345 width=0) (actual time=183683.857..183683.857 rows=1655 loops=1)
         Index Cond: (search_vector ~ '*.o.r.l.*'::lquery)
 Total runtime: 197000.061 ms
(5 rows)

I appreciate you taking the time to help me out. Thank you all.
Re: [GENERAL] ltree + gist index performance degrades significantly over a night
--- "Jim C. Nasby" wrote:
> Is there some reason you can't use tsearch2? I suspect it would probably work
> better; if nothing else you'd probably get better support since a lot more
> people use it.

Tsearch2 searches for whole words, and is designed with language in mind, yes? I'm looking for consecutive characters within words or serial numbers, etc. As for support, the same guys who wrote Tsearch2 wrote ltree. Can't go wrong there!

Here's the solution to this problem: as usual, operator error. :( For some reason I thought it would be a good idea to cluster the table on the item_id index... What in the world was I thinking? When I clustered the search table on the search_vector index (which makes the most sense, yes?), it brought actual performance in line with the kind of performance I imagined I would get.

I could probably get even better performance out of the table, at the cost of a significant increase in table and index size, by chopping up the columns into smaller chunks. "Hello World" would yield

'h.e.l.l.o.w.o.r.l.d'
'e.l.l.o.w.o.r.l.d'
'l.l.o.w.o.r.l.d'
'l.o.w.o.r.l.d'
'o.w.o.r.l.d'
'w.o.r.l.d'
'o.r.l.d'
'r.l.d'

and using a wildcard search "search_vector ~ 'o.r.l.*'" would jump right to the vectors which start with "o.r.l"...

Thanks for all the responses! They did get my head pointed in the right direction.

CG
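For the archives, the fix was along these lines (8.1 syntax, using the index name from the EXPLAIN output I posted earlier):

CLUSTER search_vector_idx ON search;
ANALYZE search;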
Re: [GENERAL] ltree + gist index performance degrades significantly over a night
--- Bernhard Weisshuhn wrote:
> But with this approach you'd be fine with a normal varchar_ops btree index for
> text fields and searching using "like 'world%'", wouldn't you?
> Or is the ltree approach more efficient?
>
> I'm not trying to be smart-assed, it's a naive question, since I'm looking for
> an efficient substring search solution in PostgreSQL myself.

That would do the job, wouldn't it? :) I don't think it's a naive question at all. It's quite a good question, and the solution you suggest is a good option to have; it would probably work better than the single-vector ltree index for simple substring matching.

In my case, the ltree+gist index actually contains more pages than the table of data itself. I'd need to see whether the space required for the varchar+btree tables is comparable to, better than, or worse than the ltree+gist tables. Now that I think about it, building substrings out of ltree nodes would be incredible overkill compared to the effectiveness of varchar+btree.

The extra advantages of ltree are the ability to match and extract nodes in a path based not only on contents but also on proximity, and to aggregate on those characteristics. In my case this might be good for serial numbers where each digit or grouping of digits has a special meaning that would be used to aggregate on. The ltree method was suggested to me a while back when I was frustrated with the performance of "LIKE '%something%'"...
Re: [GENERAL] ltree + gist index performance degrades significantly over a night
--- Bernhard Weisshuhn <[EMAIL PROTECTED]> wrote:
> On Mon, Feb 27, 2006 at 10:27:20AM -0800, CG <[EMAIL PROTECTED]> wrote:
> > [...] I'd need to see if the space required for the varchar+btree tables is
> > comparable, better, or worse than the ltree+gist tables with regards to size.
>
> Please test this, I'm guessing (hoping actually) that having bazillions of
> combinations of 26 (or so) characters (ltree labels) might be consuming
> less space than having bazillions of substrings in the database.
>
> Or maybe some clever combination of both approaches?
>
> If you find out something interesting, please let me know.

Performance using varchar+btree, breaking the string up into distinct letter groups of >= 3 chars, is slightly better. The size of the varchar search-vector table is much bigger. Most of my fields are about 15-25 characters in length; expect even bigger tables for longer fields. The size of the btree index is smaller. The time to bootstrap the data into the tables was significantly longer. I used two triggers: one that normalized the search field before insert, and another that inserted a breakdown row after the insert (sketched below). There's a recursive effect built in to get down to the smallest unique element.

I'm sticking with ltree and setting up a vacuum analyze on a cron to keep the searches snappy.

Hope that helps you with your project!

CG
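A rough sketch of that two-trigger arrangement, applied to the hypothetical suffix_search table from the earlier example. All names are assumptions, and the stopping condition (keep suffixes of at least 3 characters) is inferred from the ">= 3 chars" remark above:

-- BEFORE trigger: normalize the incoming value (lower-case, strip non-alphanumerics).
CREATE OR REPLACE FUNCTION suffix_normalize() RETURNS trigger AS $$
BEGIN
    NEW.suffix := lower(regexp_replace(NEW.suffix, '[^A-Za-z0-9]', '', 'g'));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- AFTER trigger: insert the next-shorter suffix; that insert fires these same
-- triggers again, recursing until only a 3-character suffix is left.
CREATE OR REPLACE FUNCTION suffix_breakdown() RETURNS trigger AS $$
BEGIN
    IF length(NEW.suffix) > 3 THEN
        INSERT INTO suffix_search (item_id, suffix)
        VALUES (NEW.item_id, substr(NEW.suffix, 2));
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER suffix_search_normalize BEFORE INSERT ON suffix_search
    FOR EACH ROW EXECUTE PROCEDURE suffix_normalize();
CREATE TRIGGER suffix_search_breakdown AFTER INSERT ON suffix_search
    FOR EACH ROW EXECUTE PROCEDURE suffix_breakdown();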
[GENERAL] GiST index slower than seqscan
I'm still trying to wrap my brain around this one. Please forgive me if this is the proverbial "dead horse" that I'm beating.

In case you're unfamiliar with this particular horse, I'm using ltree to create a full text index on some <= 50 char long fields for a lookup table. The idea was to be able to tear through tons of data quickly finding case-insensitive substring matches. Here's my index...

CREATE INDEX letter_search_vector_idx ON letter_search USING gist (search_vector);

I know that the index is bigger than the table, but shouldn't it be able to quickly scan the few branches that matter? I've tried to do a varchar-based substring lookup table, and the size for that table+index is enormous compared to the ltree table + index (which is huge anyway). I'm thinking that I've created something or am using something incorrectly. I need to be set straight. Please advise!

[snip]
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

data=# analyze letter_search;
ANALYZE
data=# explain select * from letter_search where search_vector ~ charslquery('669344');
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on letter_search  (cost=59.14..4978.98 rows=1467 width=162)
   Recheck Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
   ->  Bitmap Index Scan on letter_search_vector_idx  (cost=0.00..59.14 rows=1467 width=0)
         Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
(4 rows)

data=# explain analyze select * from letter_search where search_vector ~ charslquery('669344');
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on letter_search  (cost=59.14..4978.98 rows=1467 width=162) (actual time=63061.402..63072.362 rows=2 loops=1)
   Recheck Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
   ->  Bitmap Index Scan on letter_search_vector_idx  (cost=0.00..59.14 rows=1467 width=0) (actual time=63058.094..63058.094 rows=2 loops=1)
         Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
 Total runtime: 63072.411 ms
(5 rows)

data=# set enable_bitmapscan=off;
SET
data=# explain select * from letter_search where search_vector ~ charslquery('669344');
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using letter_search_vector_idx on letter_search  (cost=0.00..5837.70 rows=1467 width=162)
   Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
(2 rows)

data=# explain analyze select * from letter_search where search_vector ~ charslquery('669344');
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using letter_search_vector_idx on letter_search  (cost=0.00..5837.70 rows=1467 width=162) (actual time=14582.619..162637.887 rows=2 loops=1)
   Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
 Total runtime: 162637.977 ms
(3 rows)

data=# set enable_indexscan=off;
SET
data=# explain select * from letter_search where search_vector ~ charslquery('669344');
                              QUERY PLAN
----------------------------------------------------------------------
 Seq Scan on letter_search  (cost=0.00..55232.18 rows=1467 width=162)
   Filter: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
(2 rows)

data=# explain analyze select * from letter_search where search_vector ~ charslquery('669344');
                              QUERY PLAN
----------------------------------------------------------------------
 Seq Scan on letter_search  (cost=0.00..55232.18 rows=1467 width=162) (actual time=4725.525..9428.087 rows=2 loops=1)
   Filter: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
 Total runtime: 9428.118 ms
(3 rows)
[/snip]
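The charslquery() function used in this session is never shown in the thread. For readers who want to replay the experiment, here is one guess at a definition consistent with the plans above; everything except the visible '*.6.6.9.3.4.4.*' output format is an assumption:

-- Hypothetical reconstruction: '669344' -> '*.6.6.9.3.4.4.*'
-- (lower-case, strip non-alphanumerics, dot between characters, wrap in wildcards).
CREATE OR REPLACE FUNCTION charslquery(text) RETURNS lquery AS $$
    SELECT ('*.' ||
            trim(both '.' from
                 regexp_replace(lower(regexp_replace($1, '[^A-Za-z0-9]', '', 'g')),
                                '(.)', E'\\1.', 'g'))
            || '.*')::lquery;
$$ LANGUAGE sql IMMUTABLE;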
Re: [GENERAL] GiST index slower than seqscan
--- Teodor Sigaev <[EMAIL PROTECTED]> wrote:
> > In case you're unfamiliar with this particular horse, I'm using ltree to create
> > a full text index on some <= 50 char long fields for a lookup table. The idea
> > was to be able to tear through tons of data quickly finding case insensitive
> > substring matches.
>
> Why it is a ltree, not a tsearch?

When I said full text, I meant substring. Please correct me if I am wrong, but tsearch would be useful for finding words in a paragraph, not characters in a word (or small group of words) ... If I had the fields 'Hello World!', 'Low Tide', and 'Following Day', they would all be hits for a search on 'low' ...

> > Index Cond: (search_vector ~ '*.6.6.9.3.4.4.*'::lquery)
>
> That's the problem. Queries which begin with '*' will be slow enough...

Indeed. Substring searches are quite costly... I was hoping that the hierarchical nature of ltree would allow me to sift quickly through the list, since every alpha or numeric character would be a branch on the tree.

> Try to reduce SIGLENINT in tsearch2/gistidx.h up to 8 (do not forget reindex !!)
> and try it

I bet you meant ltree/ltree.h ... I'll give that a try and see what happens! Thank you!
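For completeness, a minimal sketch of the rebuild step implied above. It assumes the signature-length constant has been reduced in the ltree sources and the module recompiled and reinstalled (the exact constant name in ltree's header may differ from tsearch2's SIGLENINT, so check the source); the GiST index then has to be rebuilt before the change takes effect:

-- Rebuild the GiST index so it is regenerated with the smaller signatures.
REINDEX INDEX letter_search_vector_idx;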
[GENERAL] Postmaster crashes after upgrade to 8.1.4!
nect to the database and repeat your command.
2006-05-25 08:30:50.089 EDT myuser mydata 10.0.1.1(4133) WARNING:  terminating connection because of crash of another server process
2006-05-25 08:30:50.089 EDT myuser mydata 10.0.1.1(4133) DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2006-05-25 08:30:50.089 EDT myuser mydata 10.0.1.1(4133) HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2006-05-25 08:30:50.091 EDT LOG:  all server processes terminated; reinitializing
2006-05-25 08:30:50.103 EDT LOG:  database system was interrupted at 2006-05-25 08:29:27 EDT
2006-05-25 08:30:50.103 EDT LOG:  checkpoint record is at 28/3C101AE8
2006-05-25 08:30:50.103 EDT LOG:  redo record is at 28/3C101AE8; undo record is at 0/0; shutdown TRUE
2006-05-25 08:30:50.103 EDT LOG:  next transaction ID: 204190698; next OID: 186879866
2006-05-25 08:30:50.103 EDT LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2006-05-25 08:30:50.103 EDT LOG:  database system was not properly shut down; automatic recovery in progress
2006-05-25 08:30:50.114 EDT LOG:  redo starts at 28/3C101B38

What could be wrong? I am panicking. Please advise!

CG
Re: [GENERAL] Postmaster crashes after upgrade to 8.1.4!
I didn't find a core dump. Perhaps I'm looking in the wrong spot or for the wrong file. The file should be called "core.32140", correct? ... I did a "find / -name core*" ... that found nothing useful.

--- Tom Lane <[EMAIL PROTECTED]> wrote:
> CG <[EMAIL PROTECTED]> writes:
> > 2006-05-25 08:30:50.076 EDT LOG:  server process (PID 32140) was terminated
> > by signal 11
>
> That should be leaving a core dump file (if not, restart the postmaster
> under "ulimit -c unlimited"). Get a stack trace with gdb to get some
> more info about what's going on.
>
> regards, tom lane
Re: [GENERAL] Postmaster crashes after upgrade to 8.1.4!
Okay, there was no core dump to be found. I had to revert back to 8.1.3, which seems to be running fine. I am /extremely/ thankful that there was no data corruption.

I took a 24-hour-old dump file of the database it was crashing on and restored it to a similar AMD64 box (SunFire x2100 instead of SunFire x4100) running 8.1.4 and tried to crash it the way the other was crashing. No joy. It seems to run. I'll leave it running and try to put a decent load on the box to get it to crash.

Since I would have to take down the production database to get a working copy, I won't be able to copy the offending data directory over to the test installation until my next maintenance window rolls around in a few weeks. That, or we have another outage of some type which would give me the chance to take the database down and copy the tree over. I wish I could've done more analysis while the server was crippled. I'll keep trying.

CG

--- Tom Lane <[EMAIL PROTECTED]> wrote:
> CG <[EMAIL PROTECTED]> writes:
> > 2006-05-25 08:30:50.076 EDT LOG:  server process (PID 32140) was terminated
> > by signal 11
>
> That should be leaving a core dump file (if not, restart the postmaster
> under "ulimit -c unlimited"). Get a stack trace with gdb to get some
> more info about what's going on.
>
> regards, tom lane