Re: [HACKERS] Correlation in cost_index()
On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane [EMAIL PROTECTED] wrote:
> I don't think it's really a good idea to expect users to pick among
> multiple cost functions

The idea is that PG is shipped with a default representing the best of our knowledge, and users are not encouraged to change it. When a user sends a "PG does not use my index" or "Why doesn't it scan sequentially?" message to one of the support lists, we advise her/him to set index_cost_algorithm to 3 (or whatever we feel appropriate) and watch the feedback we get. We don't risk anything if the default is the current behaviour.

Servus
 Manfred
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
> I'd certainly be willing to do some testing on my own data with them.

Great!

> Gotta patch?

Not yet.

> I've found that when the planner misses, sometimes it misses by HUGE
> amounts on large tables, and I have been running random page cost at 1
> lately, as well as running cpu_index_cost at 1/10th the default
> setting to get good results.

May I ask for more information? What are your settings for effective_cache_size and shared_buffers? And which version are you running?

Servus
 Manfred
Re: [HACKERS] (Fwd) Re: Any Oracle 9 users? A test please...
Tom Lane wrote:
> Has anyone done the corresponding experiments on the other DBMSes to
> identify exactly when they allow CURRENT_TIMESTAMP to advance?

This applies up to Oracle 8.1.6, maybe it helps. According to a co-worker, Oracle advances the time in transactions:

SQL> select to_char(sysdate, 'dd.mm.yyyy hh24:mi:ss') from dual;

TO_CHAR(SYSDATE,'DD
-------------------
03.10.2002 10:16:28

(wait ...)

SQL> r
  1* select to_char(sysdate, 'dd.mm.yyyy hh24:mi:ss') from dual

TO_CHAR(SYSDATE,'DD
-------------------
03.10.2002 10:17:41

It even advances within procedures/functions, example:

create or replace procedure foobar is
  s1 varchar(2000);
  s2 varchar(2000);
begin
  select to_char(sysdate, 'dd.mm.yyyy hh24:mi:ss') into s1 from dual;
  (... put long running query here ...)
  select to_char(sysdate, 'dd.mm.yyyy hh24:mi:ss') into s2 from dual;
  dbms_output.put_line(s1);
  dbms_output.put_line(s2);
end;
/
set serverout on
execute foobar;

Hope it helps.

Regards,
 Mario Weilguni
[HACKERS] anoncvs and diff
I've been waiting to see how a patched file differs from my version. The patch was added to the to-apply list last week, I think (it wasn't mine, btw), and I've been doing cvs diff to view the differences so I can tell when the patch has been applied. Additional information given by this is, of course, the revision number the comparison is against. This has stayed at 1.61 all the time I've been doing this cvs diff operation. Looking at the web interface to cvs, I see the file has a revision number of 1.64. I use the anoncvs server for my operations.

Am I being daft, or is there a problem with the anoncvs archive?

--
Nigel J. Andrews
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
> I'd certainly be willing to do some testing on my own data with them.
> Gotta patch?

Yes, see below. Disclaimer: apart from make; make check this is completely untested. Use at your own risk.

Have fun!

Servus
 Manfred

diff -ruN ../base/src/backend/optimizer/path/costsize.c src/backend/optimizer/path/costsize.c
--- ../base/src/backend/optimizer/path/costsize.c  2002-07-04 18:04:57.000000000 +0200
+++ src/backend/optimizer/path/costsize.c  2002-10-03 09:53:06.000000000 +0200
@@ -72,6 +72,7 @@
 double      cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double      cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double      cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
+int         index_cost_algorithm = DEFAULT_INDEX_COST_ALGORITHM;
 
 Cost        disable_cost = 100000000.0;
 
@@ -213,8 +214,8 @@
     Cost        indexStartupCost;
     Cost        indexTotalCost;
     Selectivity indexSelectivity;
-    double      indexCorrelation,
-                csquared;
+    double      indexCorrelation;
+    Cost        IO_cost;
     Cost        min_IO_cost,
                 max_IO_cost;
     Cost        cpu_per_tuple;
@@ -329,13 +330,62 @@
     min_IO_cost = ceil(indexSelectivity * T);
     max_IO_cost = pages_fetched * random_page_cost;
 
-    /*
-     * Now interpolate based on estimated index order correlation to get
-     * total disk I/O cost for main table accesses.
-     */
-    csquared = indexCorrelation * indexCorrelation;
+    switch (index_cost_algorithm) {
+        case 1: {
+            /*
+            ** Use abs(correlation) for linear interpolation
+            */
+            double absC = fabs(indexCorrelation);
+
+            IO_cost = absC * min_IO_cost + (1 - absC) * max_IO_cost;
+        }
+
+        case 2: {
+            /*
+            ** First estimate number of pages and cost per page,
+            ** then multiply the results.  min_IO_cost is used for
+            ** min_pages, because seq_page_cost = 1.
+            */
+            double absC = fabs(indexCorrelation);
+
+            double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
+            double estPCost = absC * 1 + (1 - absC) * random_page_cost;
+            IO_cost = estPages * estPCost;
+        }
+
+        case 3: {
+            /*
+            ** Interpolate based on independence squared, thus forcing the
+            ** result to be closer to min_IO_cost
+            */
+            double independence = 1 - fabs(indexCorrelation);
+            double isquared = independence * independence;
+
+            IO_cost = (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
+        }
+
+        case 4: {
+            /*
+            ** Interpolate geometrically
+            */
+            double absC = fabs(indexCorrelation);
+
+            IO_cost = exp(absC * log(min_IO_cost) +
+                          (1 - absC) * log(max_IO_cost));
+        }
+
+        default: {
+            /*
+             * Interpolate based on estimated index order correlation
+             * to get total disk I/O cost for main table accesses.
+             */
+            double csquared = indexCorrelation * indexCorrelation;
+
+            IO_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+        }
+    }
 
-    run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+    run_cost += IO_cost;
 
     /*
      * Estimate CPU costs per tuple.
diff -ruN ../base/src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- ../base/src/backend/utils/misc/guc.c   2002-07-20 17:27:19.000000000 +0200
+++ src/backend/utils/misc/guc.c   2002-10-03 10:03:37.000000000 +0200
@@ -644,6 +644,11 @@
     },
 
     {
+        { "index_cost_algorithm", PGC_USERSET }, &index_cost_algorithm,
+        DEFAULT_INDEX_COST_ALGORITHM, 0, INT_MAX, NULL, NULL
+    },
+
+    {
         { NULL, 0 }, NULL, 0, 0, 0, NULL, NULL
     }
 };

diff -ruN ../base/src/include/optimizer/cost.h src/include/optimizer/cost.h
--- ../base/src/include/optimizer/cost.h   2002-06-21 02:12:29.000000000 +0200
+++ src/include/optimizer/cost.h   2002-10-03 09:56:28.000000000 +0200
@@ -24,6 +24,7 @@
 #define DEFAULT_CPU_TUPLE_COST 0.01
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.001
 #define DEFAULT_CPU_OPERATOR_COST 0.0025
+#define DEFAULT_INDEX_COST_ALGORITHM 3
 
 /* defaults for function attributes used for expensive function calculations */
 #define BYTE_PCT 100
@@ -43,6 +44,7 @@
 extern double cpu_tuple_cost;
 extern double cpu_index_tuple_cost;
 extern double cpu_operator_cost;
+extern int index_cost_algorithm;
 extern Cost disable_cost;
 extern bool enable_seqscan;
 extern
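The five variants are easier to compare side by side outside the backend. Below is a small standalone sketch, not part of the patch: the input numbers (min_IO_cost 100, max_IO_cost 4000, pages_fetched 1000, random_page_cost 4) are made up for illustration, and each case returns directly, which sidesteps the missing break statements that a follow-up message below points out in the patch as posted.

/*
 * Standalone comparison of the five index_cost_algorithm variants.
 * Example inputs are invented; compile with: cc interp.c -lm
 */
#include <math.h>
#include <stdio.h>

static double
interpolate(int algorithm, double correlation,
            double min_IO_cost, double max_IO_cost,
            double pages_fetched, double random_page_cost)
{
    double absC = fabs(correlation);

    switch (algorithm)
    {
        case 1:             /* linear in abs(correlation) */
            return absC * min_IO_cost + (1 - absC) * max_IO_cost;
        case 2:             /* interpolate pages and cost-per-page separately */
        {
            double estPages = absC * min_IO_cost + (1 - absC) * pages_fetched;
            double estPCost = absC * 1.0 + (1 - absC) * random_page_cost;

            return estPages * estPCost;
        }
        case 3:             /* independence squared, biased toward min */
        {
            double isquared = (1 - absC) * (1 - absC);

            return (1 - isquared) * min_IO_cost + isquared * max_IO_cost;
        }
        case 4:             /* geometric interpolation */
            return exp(absC * log(min_IO_cost) + (1 - absC) * log(max_IO_cost));
        default:            /* current behaviour: correlation squared */
            return max_IO_cost +
                correlation * correlation * (min_IO_cost - max_IO_cost);
    }
}

int
main(void)
{
    double c;
    int alg;

    for (c = 0.0; c <= 1.0001; c += 0.25)
    {
        printf("correlation %.2f:", c);
        for (alg = 0; alg <= 4; alg++)
            printf("  alg%d=%8.1f", alg,
                   interpolate(alg, c, 100.0, 4000.0, 1000.0, 4.0));
        printf("\n");
    }
    return 0;
}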
[HACKERS] OT: Looking to Open Source the Flash training material
Hi everyone,

Have been thinking for a while now about viable ways to Open Source the Flash based training material that has been in development from last year. After discussing this with a number of people for suggestions, feedback, advice, etc, these are looking to be the general concepts that, as a whole, would likely work to greatest effect:

***

- Create a new Open Source license specifically for this. The er... DDPL (?). Digital Distribution Public License.
- Release the source code to all the Flashes developed thus far, through this license.

The DDPL would go something like this:

- People can use the source Flash files to create training content for any kind of software they so choose to. We of course heartily recommend Open Source Software. Everything released must also be under the DDPL.
- All content must be released unrestricted, etc, and the Flash source files must be available for all. It's allowed to be included with paid-for products. No restrictions, etc.
- All content and Flash source files under the DDPL must also be submitted to us, so we can decide whether or not to include them on the digitaldistribution.com site, etc. We can distribute through resellers, etc, and no royalties are applicable.

Additionally:

- We make a few points really clear and simple on the website. The primary reason for existence for the digitaldistribution.com site is to educate end users, and additionally to create a revenue stream that can be used to hire further developers for Open Source projects and be used to the benefit of the Open Source Community as need be (i.e. hire lawyers to fight against inappropriate patents, pay for advertisements and research studies, etc).
- Open up the translation interface and mechanisms on the digitaldistribution.com site so that people can come along and do translations for their language as they feel like it.
- Have a support mechanism (in a way that's fair) so that resellers of the tutorials are well funded to provide support for the communities of their native languages, etc.
- Will probably work in something about membership fees for the digitaldistribution.com site being based upon the GDP of a nation, so that, for example, a person coming from Thailand isn't charged anywhere near as much as a person in the US. Not sure how to make it workable, but it's the start for addressing an important issue.

***

Please remember this is just a start and might totally change or be dropped entirely, depending on whether it looks to be workable and beneficial, etc. Now looking for thoughts and feedback on this from a wider audience, so hoping people have good ideas, beneficial directions, etc. If everything is looking good, then we'll look to ensuring this is a workable Open Source license, etc. (www.opensource.org)

:-)

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there."
- Indira Gandhi
[HACKERS] Large databases, performance
Hi,

Today we concluded a test for database performance. Attached are results and the schema, for those who have missed the earlier discussion on this.

We have (almost) decided that we will partition the data across machines. The theme is: after every short interval, a burst of data will be entered in a new table in the database, indexed and vacuumed. The table(s) will be inherited so that a query on the base table will fetch results from all the children. The application has to consolidate all the data on a per-node basis. If the database is not PostgreSQL, the app has to consolidate data across partitions as well.

Now we need to investigate whether selecting on the base table to include children would use indexes created on the children tables.

It's estimated that when the entire data is gathered, the total number of children tables would be around 1K-1.1K across all machines. This is in view of the average rate of data insertion, i.e. 5K records/sec, and total data size, estimated to be 9 billion rows max, i.e. estimated database size is 900GB. Obviously it's impossible to keep the insertion rate on an indexed table high as data grows. So partitioning/inheritance looks the better approach.

PostgreSQL is not the final winner as yet. MySQL is in close range. I will keep you guys posted about the results. Let me know about any comments.

Bye
 Shridhar

--
Price's Advice: It's all a game -- play it to have fun.

Machine: Compaq Proliant Server ML 530, Intel Xeon 2.4 GHz processor x 4, 4 GB RAM, 5 x 72.8 GB SCSI HDD, RAID 0 (striping), Mandrake Linux 9.0. Cost: $13,500 ($1,350 for each additional 72GB HDD).

Performance Parameter                        MySQL 3.23.52       MySQL 3.23.52       PostgreSQL 7.2.2
                                             WITHOUT InnoDB      WITH InnoDB for     with built-in support
                                             for transactional   transactional       for transactions
                                             support             support

Complete Data:
Inserts + building a composite index
(40 GB data, 432,000,000 tuples, about
100 bytes each, schema on 'schema' sheet,
composite index on 3 fields
(esn, min, datetime))                        3738 secs           18720 secs          20628 secs
Load Speed                                   115570 tuples/sec   23076 tuples/sec    20942 tuples/sec
Database Size on Disk                        48 GB               87 GB               111 GB

Average per partition:
Inserts + building a composite index
(300MB data, 3,000,000 tuples, about
100 bytes each, composite index on
3 fields (esn, min, datetime))               28 secs             130 secs            150 secs
Select Query (based on equality match
of 2 fields (esn and min) - 4 concurrent
queries running)                             7 secs              7 secs              6 secs
Database Size on Disk                        341 MB              619 MB              788 MB

Schema:

Field Name   Field Type   Nullable   Indexed
type         int          no         no
esn          char (10)    no         yes
min          char (10)    no         yes
datetime     timestamp    no         yes
opc0         char (3)     no         no
opc1         char (3)     no         no
opc2         char (3)     no         no
dpc0         char (3)     no         no
dpc1         char (3)     no         no
dpc2         char (3)     no         no
npa          char (3)     no
Re: [HACKERS] pg_dump and large files - is this a problem?
At 11:06 AM 2/10/2002 -0400, Tom Lane wrote:
> It needs to get done; AFAIK no one has stepped up to do it. Do you
> want to?

My limited reading of off_t stuff now suggests that it would be brave to assume it is even a simple 64 bit number (or even 3 32 bit numbers).

One alternative, which I am not terribly fond of, is to have pg_dump write multiple files - when we get to 1 or 2GB, we just open another file, and record our file positions as a (file number, file position) pair. Low tech, but at least we know it would work.

Unless anyone knows of a documented way to get 64 bit uint/int file offsets, I don't see we have much choice.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371
Re: [HACKERS] pg_dump and large files - is this a problem?
> My limited reading of off_t stuff now suggests that it would be brave
> to assume it is even a simple 64 bit number (or even 3 32 bit
> numbers). One alternative, which I am not terribly fond of, is to
> have pg_dump write multiple files - when we get to 1 or 2GB, we just
> open another file, and record our file positions as a (file number,
> file position) pair. Low tech, but at least we know it would work.
> Unless anyone knows of a documented way to get 64 bit uint/int file
> offsets, I don't see we have much choice.

How common is fgetpos64? Linux supports it, but I don't know about other systems.
http://hpc.uky.edu/cgi-bin/man.cgi?section=all&topic=fgetpos64

Regards,
 Mario Weilguni
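To make the multi-file idea concrete, here is an untested sketch of such a (file number, file position) pair. Every name in it (SegPos, SEGMENT_SIZE, seg_open_at) is invented for illustration; none of this is pg_dump code, and it assumes a fixed 1 GB segment size so no single offset ever exceeds what a 32-bit off_t can hold.

/*
 * Sketch: address a byte position in a dump split across 1 GB segments
 * as (segment number, offset within segment).  Names are hypothetical.
 */
#include <stdio.h>

#define SEGMENT_SIZE (1024L * 1024L * 1024L)    /* 1 GB per file */

typedef struct
{
    int     segno;      /* which segment file: base.0, base.1, ... */
    long    offset;     /* position within that file, < SEGMENT_SIZE */
} SegPos;

/* Convert a running byte count into a segmented position. */
static SegPos
seg_pos_from_total(long long total_bytes_written)
{
    SegPos  pos;

    pos.segno = (int) (total_bytes_written / SEGMENT_SIZE);
    pos.offset = (long) (total_bytes_written % SEGMENT_SIZE);
    return pos;
}

/* Reopen the right segment file and seek within it. */
static FILE *
seg_open_at(const char *base, SegPos pos, const char *mode)
{
    char    name[1024];
    FILE   *fp;

    snprintf(name, sizeof(name), "%s.%d", base, pos.segno);
    fp = fopen(name, mode);
    if (fp != NULL && fseek(fp, pos.offset, SEEK_SET) != 0)
    {
        fclose(fp);
        return NULL;
    }
    return fp;
}

int
main(void)
{
    /* a position ~5 GB into the archive, well past any 32-bit off_t */
    SegPos  pos = seg_pos_from_total(5368709123LL);
    FILE   *fp;

    printf("segment %d, offset %ld\n", pos.segno, pos.offset);
    fp = seg_open_at("dumpfile", pos, "rb");
    if (fp == NULL)
        printf("segment not present (expected in this demo)\n");
    else
        fclose(fp);
    return 0;
}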
Re: [HACKERS] v7.2.3 - tag'd, packaged ... need it checked ...
Lamar Owen [EMAIL PROTECTED] writes:
> Builds fine here for RPM usage. Got an odd diff in the triggers
> regression test: did we drop a NOTICE? If so, the regression output
> should probably have been changed too.

No, we didn't change anything, and the 7.2 regression tests passed for me on Tuesday. Please investigate more closely.

			regards, tom lane
Re: [HACKERS] Anyone want to assist with the translation of the Advocacy site?
> Is anyone interested in translating the English version to other
> languages?

I don't have time for the translation, unfortunately, but I would suggest changing "worlds" to "world's" on the main page.

-tfo
Re: [HACKERS] Anyone want to assist with the translation of the Advocacy
Thomas O'Connell wrote:
> > Is anyone interested in translating the English version to other
> > languages?
>
> I don't have time for the translation, unfortunately, but I would
> suggest changing "worlds" to "world's" on the main page.

Um, doesn't "world's" mean "world is"? That wouldn't make sense then though. ?

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there."
- Indira Gandhi
Re: [HACKERS] Anyone want to assist with the translation of the Advocacy site?
> Um, doesn't "world's" mean "world is"?

In this situation, the 's denotes possession, as in "the most advanced open source database of the world". "worlds" here is basically saying "every world" "most advanced open source database" and does not, in any case, connote possession.

-tfo
Re: [HACKERS] Correlation in cost_index()
On Thu, 03 Oct 2002 12:40:20 +0200, I wrote:
> > Gotta patch?
> Yes, see below.

Oh, did I mention that inserting some break statements after the switch cases helps a lot? :-(

Cavus venter non laborat libenter ... (an empty belly does not work willingly)

Servus
 Manfred
Re: [HACKERS] Anyone want to assist with the translation of the
Thomas F. O'Connell wrote:
> > Um, doesn't "world's" mean "world is"?
>
> In this situation, the 's denotes possession, as in "the most advanced
> open source database of the world". "worlds" here is basically saying
> "every world" "most advanced open source database" and does not, in
> any case, connote possession.

Ok, updating it now. Thanks heaps Thomas. :-)

Regards and best wishes,

Justin Clift
Re: [HACKERS] Anyone want to assist with the translation of the Advocacy site?
> Um, doesn't "world's" mean "world is"?

I forgot to provide a real-world example: http://www.amazon.com/ -- "Earth's Biggest Selection"

-tfo
Re: [HACKERS] Correlation in cost_index()
On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
> I've found that when the planner misses, sometimes it misses by HUGE
> amounts on large tables,

Scott, yet another question: are multicolumn indices involved in your estimator problems?

Servus
 Manfred
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe wrote:
> > I'd certainly be willing to do some testing on my own data with them.
> Great!
> > Gotta patch?
> Not yet.
> > I've found that when the planner misses, sometimes it misses by HUGE
> > amounts on large tables, and I have been running random page cost at
> > 1 lately, as well as running cpu_index_cost at 1/10th the default
> > setting to get good results.
> May I ask for more information? What are your settings for
> effective_cache_size and shared_buffers? And which version are you
> running?

I'm running 7.2.2 in production and 7.3b2 in testing. Effective cache size is the default (i.e. commented out); shared buffers are at 4000. I've found that increasing shared buffers past 4000 (32 megs) to 16384 (128 megs) has no great effect on my machine's performance, but I've never really played with effective cache size.

I've got a couple of queries that join a 1M+ row table to itself and to a 50k row table, and the result sets are usually 100 rows at a time. Plus some other smaller datasets that return larger amounts (i.e. sometimes all rows) of data generally.
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002, Manfred Koizar wrote:
> On Wed, 2 Oct 2002 14:07:19 -0600 (MDT), scott.marlowe wrote:
> > I've found that when the planner misses, sometimes it misses by HUGE
> > amounts on large tables,
> Scott, yet another question: are multicolumn indices involved in your
> estimator problems?

No. Although I use them a fair bit, none of the problems I've encountered so far have involved them. But I'd be willing to set up some test indexes and get some numbers on them.
Re: [HACKERS] Correlation in cost_index()
On Thu, 3 Oct 2002 10:59:54 -0600 (MDT), scott.marlowe [EMAIL PROTECTED] wrote:
> > are multicolumn indices involved in your estimator problems?
> No. Although I use them a fair bit, none of the problems I've
> encountered so far have involved them. But I'd be willing to set up
> some test indexes and get some numbers on them.

Never mind! I just stumbled over those lines in selfuncs.c where indexCorrelation is calculated by dividing the correlation of the first index column by the number of index columns. I have a use case here where this clearly is not the right choice, and I was hoping to find some examples that help me investigate whether my case is somewhat uncommon ...

Servus
 Manfred
Re: [HACKERS] Trigger regression test output
On Thursday 03 October 2002 12:46 pm, Tom Lane wrote:
> Lamar Owen [EMAIL PROTECTED] writes:
> > Builds fine here for RPM usage. Got an odd diff in the triggers
> > regression test: did we drop a NOTICE? If so, the regression output
> > should probably have been changed too.
> > The diff:
> > *** ./expected/triggers.out	Sat Jan 15 14:18:23 2000
> > --- ./results/triggers.out	Thu Oct  3 00:16:09 2002
> > - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted
>
> After looking into this I have a theory about the cause: you must have
> built the contrib/spi/refint.c module without -DREFINT_VERBOSE. That
> flag is required to pass the regression tests, because it controls
> output of this debug notice. The normal build procedure for the
> regression tests does cause this to happen, but if you'd previously
> built the contrib subdirectory with default switches, I think the
> regress tests would use the existing refint.o and get a failure.

So the regression tests weren't really testing the actually built module, so to speak. Is there a good reason to leave the NOTICEs in the expected regression output?

As to the way it's built, the regression tests are built in the RPM set to allow post-install (that is, post _RPM_ install) regression testing on machines without make or compilers.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Re: [HACKERS] v7.2.3 - tag'd, packaged ... need it checked ...
On Thursday 03 October 2002 12:29 am, Lamar Owen wrote:
> RPMs will be uploaded either tonight or tomorrow morning after I get
> to work; it will depend on how much upload bandwidth I can get out of
> this dialup. It appears to be running OK, so I may let it run.

RPMs uploaded into the usual place, so the announcement can take that into account, Marc.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
Re: [HACKERS] Oracle beats up on Open Source Database(s) ... and
On Wed, 2002-10-02 at 16:14, Marc G. Fournier wrote:
> Just in case anyone enjoys these sorts of things :) It deals with the
> whole .org TLD assignment ...
> http://forum.icann.org/org-eval/gartner-report

I like this one:

| Unlike many of the conventional commercial databases, PostgreSQL has
| offered advanced Object Relational capabilities for years, including
| inheritance. Ms. Gelhausen is quite correct that these are important
| capabilities, finally available with the release of Oracle9i. We
| applaud Oracle's continued efforts to close the gap and stay
| competitive with this, and other open source database features.

cheers
-- vbi

--
secure email with gpg: http://fortytwo.ch/gpg
Re: [HACKERS] Correlation in cost_index()
Manfred Koizar [EMAIL PROTECTED] writes:
> Never mind! I just stumbled over those lines in selfuncs.c where
> indexCorrelation is calculated by dividing the correlation of the
> first index column by the number of index columns.

Yeah, I concluded later that that was bogus. I've been thinking of just using the correlation of the first index column and ignoring the rest; that would not be great, but it's probably better than what's there. Have you got a better formula?

			regards, tom lane
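Paraphrasing the two formulas under discussion side by side, from memory rather than as a verbatim quote of selfuncs.c:

/*
 * From-memory sketch of the formulas under discussion; NOT a verbatim
 * quote of selfuncs.c.
 */
#include <stdio.h>

/* the line Manfred stumbled over: first column's correlation,
 * divided by the number of index columns */
static double
index_correlation_current(double first_col_correlation, int nkeys)
{
    return first_col_correlation / nkeys;
}

/* Tom's stopgap: use the first column's correlation, ignore the rest */
static double
index_correlation_proposed(double first_col_correlation, int nkeys)
{
    (void) nkeys;               /* deliberately unused */
    return first_col_correlation;
}

int
main(void)
{
    /* a perfectly correlated first column in a 3-column index */
    printf("current: %.2f  proposed: %.2f\n",
           index_correlation_current(1.0, 3),    /* 0.33 */
           index_correlation_proposed(1.0, 3));  /* 1.00 */
    return 0;
}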
Re: [HACKERS] DROP COLUMN misbehaviour with multiple inheritance
On Thu, Oct 03, 2002 at 04:00:32PM -0400, Tom Lane wrote:
> Bruce Momjian [EMAIL PROTECTED] writes:
> > Where are we with this patch?
>
> It's done as far as I'm concerned ;-). Not sure if Hannu still wants
> to argue that the behavior is wrong ... it seems fine to me though ...

I still haven't submitted the ALTER TABLE/ADD COLUMN part. There's a little thing I want to change first. It's a different issue though (but related).

--
Alvaro Herrera (alvherre[a]atentus.com)
"El que vive para el futuro es un iluso, y el que vive para el pasado, un imbécil" (Luis Adler, "Los tripulantes de la noche")
["He who lives for the future is a dreamer, and he who lives for the past, a fool."]
[HACKERS] [GENERAL] Small patch for PL/Perl Misbehavior with Runtime Error Reporting
Good day,

I just stumbled across this peculiarity in PL/Perl today, writing a method to invoke Perl regexes from a function: if a run-time error is raised in an otherwise good function, the function will never run correctly again until the connection to the database is reset. I poked around in the code and it appears that it's because when elog() raises the ERROR, it doesn't first take action to erase the system error message ($@), and consequently every subsequent run has an error raised, even if it runs successfully.

For example:

-- This comparison works fine.
template1=# SELECT perl_re_match('test', 'test');
 perl_re_match
---------------
 t
(1 row)

-- This one dies, for obvious reasons.
template1=# SELECT perl_re_match('test', 't{1}+?');
ERROR:  plperl: error from function: (in cleanup) Nested quantifiers before HERE mark in regex m/t{1}+ << HERE ?/ at (eval 2) line 4.

-- This should work fine again, but we still have this error raised...!
template1=# SELECT perl_re_match('test', 'test');
ERROR:  plperl: error from function: (in cleanup) Nested quantifiers before HERE mark in regex m/t{1}+ << HERE ?/ at (eval 2) line 4.

I don't know if the following is the best way to solve it, but I got around it by modifying the error report in this part of PL/Perl to be a NOTICE, cleared the $@ variable, and then raised the fatal ERROR. A simple three line patch to plperl.c follows, and is attached.

src/pl/plperl/plperl.c:

443c443,445
<       elog(ERROR, "plperl: error from function: %s", SvPV(ERRSV, PL_na));
---
>       elog(NOTICE, "plperl: error from function: %s", SvPV(ERRSV, PL_na));
>       sv_setpv(perl_get_sv("@",FALSE),"");
>       elog(ERROR, "plperl: error was fatal.");

Best Regards,
Jw.

--
John Worsley - [EMAIL PROTECTED]
http://www.openvein.com/
Re: [HACKERS] Advice: Where could I be of help?
tom lane wrote:
> But more globally, I think that our worst problems these days have to
> do with planner misestimations leading to bad plans. The planner is
> usually *capable* of generating a good plan, but all too often it
> picks the wrong one. We need work on improving the cost modeling
> equations to be closer to reality. If that's at all close to your
> sphere of interest then I think it should be #1 priority --- it's both
> localized, which I think is important for a first project, and
> potentially a considerable win.

This seems like a very interesting problem. One of the ways that I thought would be interesting, and would solve the problem of trying to figure out the right numbers, is to have certain guesses for the actual values based on statistics gathered during vacuum and general running, and then have the planner run the best plan. Then during execution, if the planner turned out to be VERY wrong about certain assumptions, the execution system could update the stats that led to those wrong assumptions. That way the system would seek the correct values automatically. We could also gather the stats that the system produces for certain actual databases and then use those to make smarter initial guesses.

I've found that I can never predict costs. I always end up testing empirically and find myself surprised at the results. We should be able to make the executor smart enough to keep count of actual costs (or a statistical approximation) without introducing any significant overhead.

tom lane also wrote:
> There is no cache flushing. We have a shared buffer cache management
> algorithm that's straight LRU across all buffers. There's been some
> interest in smarter cache-replacement code; I believe Neil Conway is
> messing around with an LRU-2 implementation right now. If you've got
> better ideas we're all ears.

Hmm, this is the area that I think could lead to huge performance gains. Consider a simple system with a table tbl_master that gets read by each process many times but with very infrequent inserts, and that contains about 3,000 rows. The single but heavily used index for this table is contained in a btree with a depth of three, with 20 - 8K pages in the first two levels of the btree. Another table tbl_detail with 10 indices gets very frequent inserts. There are over 300,000 rows. Some queries result in index scans over the approximately 5,000 8K pages in the index. There is a 40M shared cache for this system.

Every time a query which requires the index scan runs, it will blow out the entire cache, since the scan will load more blocks than the cache holds. Only blocks that are accessed while the scan is going will survive. LRU is bad, bad, bad!

LRU-2 might be better, but it seems like it still won't give enough priority to the most frequently used blocks. I don't see how it would do better for the above case.

I once implemented a modified cache algorithm that was based on the clock algorithm for VM page caches. VM paging is similar to databases in that there is definite locality of reference and certain pages are MUCH more likely to be requested. The basic idea was to have a flag in each block that represented the access time in clock intervals. Imagine a clock hand sweeping across a clock: every access is like a tiny movement in the clock hand. Blocks that are not accessed during a sweep are candidates for removal. My modification was to use access counts to increase the durability of the more accessed blocks.
Each time a block is accessed, its flag is shifted left (up to a maximum number of shifts - ShiftN) and 1 is added to it. Every so many cache accesses (and synchronously when the cache is full), a pass is made over each block, right-shifting the flags (a clock sweep). This can also be done one block at a time on each access, so the clock is directly linked to the cache access rate. Any blocks with 0 are placed into a doubly linked list of candidates for removal. New cache blocks are allocated from the list of candidates. Accesses of blocks in the candidate list just remove them from the list.

An index root node page would likely be accessed frequently enough that all its bits would be set, so it would take ShiftN clock sweeps before it became a removal candidate.

This algorithm increased the cache hit ratio from 40% to about 90% for the cases I tested, when compared to a simple LRU mechanism. The paging ratio is greatly dependent on the ratio of the actual database size to the cache size.

The bottom line is that it is very important to keep blocks that are frequently accessed in the cache. The top levels of large btrees are accessed many hundreds (actually a power of the number of keys in each page) of times more frequently than the leaf pages. LRU can be the worst possible algorithm for something like an index or table scan of large tables, since it flushes a large number of potentially frequently accessed blocks in favor of ones that are very unlikely to be retrieved again.

tom lane also wrote:
> This is an
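The shift-and-sweep scheme described above is compact enough to sketch in full. Names and sizes here are invented for illustration; a real implementation would tie the sweep to the cache access rate and keep the doubly linked candidate list rather than rescanning.

/*
 * Minimal sketch of the access-count clock algorithm described above.
 * NBUFFERS and SHIFT_N are invented; illustration only.
 */
#include <stdio.h>

#define NBUFFERS 8
#define SHIFT_N  5                      /* cap on left shifts per buffer */

static unsigned int flags[NBUFFERS];    /* per-buffer usage flag */

/* Called on every cache hit: shift left (bounded) and set the low bit. */
static void
touch(int buf)
{
    if (flags[buf] < (1u << SHIFT_N))
        flags[buf] <<= 1;
    flags[buf] |= 1;
}

/* One clock sweep: right-shift every flag; zeroed buffers are victims
 * (a real version would append them to a candidate free list). */
static int
sweep_and_pick_victim(void)
{
    int     i, victim = -1;

    for (i = 0; i < NBUFFERS; i++)
    {
        flags[i] >>= 1;
        if (flags[i] == 0 && victim < 0)
            victim = i;
    }
    return victim;                      /* -1: every buffer is still hot */
}

int
main(void)
{
    int     i;

    /* buffer 0 is hot (say, an index root); buffer 5 was touched once */
    for (i = 0; i < 10; i++)
        touch(0);
    touch(5);

    /* after one sweep the once-touched buffer is already evictable,
     * while the hot buffer needs SHIFT_N more sweeps to decay to zero */
    printf("victim: %d\n", sweep_and_pick_victim());
    printf("flags[0]=%u flags[5]=%u\n", flags[0], flags[5]);
    return 0;
}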
[HACKERS] Potential Large Performance Gain in WAL synching
I've been looking at the TODO lists and caching issues and think there may be a way to greatly improve the performance of the WAL.

I've made the following assumptions based on my reading in the manual and the WAL archives since about November 2000:

1) WAL is currently fsync'd before commit succeeds. This is done to ensure that the D in ACID is satisfied.
2) The wait on fsync is the biggest time cost for inserts or updates.
3) fsync itself probably increases contention for file i/o on the same file, since some OS file system cache structures must be locked as part of fsync. Depending on the file system this could be a significant choke on total i/o throughput.

The issue is that there must be a definite record in durable storage for the log before one can be certain that a transaction has succeeded. I'm not familiar with the exact WAL implementation in PostgreSQL but am familiar with others, including ARIES II; however, it seems that it comes down to making sure that the write to the WAL log has been positively written to disk.

So, why don't we use files opened with O_DSYNC | O_APPEND for the WAL log and then use aio_write for all log writes? A transaction would simply do all the log writing using aio_write and block until the last log aio request has completed, using aio_waitcomplete. The call to aio_waitcomplete won't return until the log record has been written to the disk. Opening with O_DSYNC ensures that when i/o completes the write has been written to the disk, and aio_write with O_APPEND opened files ensures that writes append in the order they are received; hence when the aio_write for the last log entry for a transaction completes, the transaction can be sure that its log records are in durable storage (IDE problems aside).

It seems to me that this would:

1) Preserve the required D semantics.
2) Allow transactions to complete and do work while other threads are waiting on the completion of the log write.
3) Obviate the need for commit_delay, since there is no blocking, and the file system and the disk controller can put multiple writes to the log together as the drive is waiting for the end of the log file to come under one of the heads.

Here are the relevant TODOs:

* Delay fsync() when other backends are about to commit too [fsync]
* Determine optimal commit_delay value
* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
* Allow multiple blocks to be written to WAL with one write()

Am I missing something?

Curtis Faith
Principal
Galt Capital, LLP

--
Galt Capital                         http://www.galtcapital.com
12 Wimmelskafts Gade                 voice: 340.776.0144
Post Office Box 7549                 fax:   340.776.0244
Charlotte Amalie, St. Thomas         cell:  340.643.5368
United States Virgin Islands 00801
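As a rough sketch only, the proposed commit path might look like the following. It substitutes the portable POSIX calls aio_suspend/aio_error/aio_return for the FreeBSD-specific aio_waitcomplete mentioned above, uses a placeholder string instead of a real XLOG record, and omits nearly all error handling; on Linux it needs -lrt.

/*
 * Sketch of a WAL commit using POSIX AIO on an O_DSYNC|O_APPEND file.
 * Placeholder record, minimal error handling; not PostgreSQL code.
 */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    static const char commit_rec[] = "COMMIT record placeholder";
    struct aiocb cb;
    const struct aiocb *list[1];
    int     fd;

    /* O_DSYNC: completion implies the data is on disk.
     * O_APPEND: queued writes land at the end of the log in issue order. */
    fd = open("wal.log", O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0600);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = (void *) commit_rec;
    cb.aio_nbytes = sizeof(commit_rec);

    if (aio_write(&cb) != 0)
    {
        perror("aio_write");
        return 1;
    }

    /* ... the backend could do other useful work here while the
     * write is queued behind earlier log writes ... */

    /* Commit: block until the log record is durably written. */
    list[0] = &cb;
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    if (aio_return(&cb) != (ssize_t) sizeof(commit_rec))
    {
        fprintf(stderr, "log write failed\n");
        return 1;
    }
    printf("commit record durable\n");
    close(fd);
    return 0;
}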
Re: [HACKERS] Advice: Where could I be of help?
Curtis Faith [EMAIL PROTECTED] writes:
> Then during execution if the planner turned out to be VERY wrong about
> certain assumptions the execution system could update the stats that
> led to those wrong assumptions. That way the system would seek the
> correct values automatically.

That has been suggested before, but I'm unsure how to make it work. There are a lot of parameters involved in any planning decision and it's not obvious which ones to tweak, or in which direction, if the plan turns out to be bad. But if you can come up with some ideas, go to it!

> Every time a query which requires the index scan runs it will blow out
> the entire cache since the scan will load more blocks than the cache
> holds.

Right, that's the scenario that kills simple LRU ...

> LRU-2 might be better but it seems like it still won't give enough
> priority to the most frequently used blocks.

Blocks touched more than once per query (like the upper-level index blocks) will survive under LRU-2. Blocks touched once per query won't. Seems to me that it should be a win.

> My modification was to use access counts to increase the durability of
> the more accessed blocks.

You could do it that way too, but I'm unsure whether the extra complexity will buy anything. Ultimately, I think an LRU-anything algorithm is equivalent to a clock sweep for those pages that only get touched once per some-long-interval: the single-touch guys get recycled in order of last use, which seems just like a clock sweep around the cache. The guys with some amount of preference get excluded from the once-around sweep. To determine whether LRU-2 is better or worse than some other preference algorithm requires a finer grain of analysis than this. I'm not a fan of "more complex must be better", so I'd want to see why it's better before buying into it ...

> The kinds of things I was thinking about should be very portable. I
> found that simply writing the cache in order of the file system offset
> results in very greatly improved performance since it lets the head
> seek in smaller increments and much more smoothly, especially with
> modern disks.

Shouldn't the OS be responsible for scheduling those writes appropriately? Ye good olde elevator algorithm ought to handle this; and it's at least one layer closer to the actual disk layout than we are, thus more likely to issue the writes in a good order. It's worth experimenting with, perhaps, but I'm pretty dubious about it.

BTW, one other thing that Vadim kept saying we should do is alter the cache management strategy to retain dirty blocks in memory (ie, give some amount of preference to as-yet-unwritten dirty pages compared to clean pages). There is no reliability cost here since the WAL will let us reconstruct any dirty pages if we crash before they get written; and the periodic checkpoints will ensure that we eventually write a dirty block and thus it will become available for recycling. This seems like a promising line of thought that's orthogonal to the basic LRU-vs-whatever issue. Nobody's got round to looking at it yet though. I've got no idea how much preference should be given to a dirty block --- not infinite, probably, but some.

			regards, tom lane
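For readers unfamiliar with LRU-2, a toy victim chooser makes the point about once-per-query pages concrete: a page's priority is its second-most-recent access time, so pages touched only once rank behind everything touched twice. Names and the integer tick counter are invented for illustration.

/*
 * Toy LRU-2 victim selection: evict the buffer whose second-most-recent
 * access is oldest; 0 means "never".  Illustration only.
 */
#include <stdio.h>

#define NBUFFERS 4

typedef struct
{
    long    last;       /* most recent access tick, 0 = never */
    long    prev;       /* access before that, 0 = none */
} Buf;

static Buf  bufs[NBUFFERS];
static long tick = 0;

static void
touch(int i)
{
    bufs[i].prev = bufs[i].last;
    bufs[i].last = ++tick;
}

static int
lru2_victim(void)
{
    int     i, victim = 0;

    for (i = 1; i < NBUFFERS; i++)
        if (bufs[i].prev < bufs[victim].prev)
            victim = i;
    return victim;
}

int
main(void)
{
    /* buffer 0: an upper-level index page, touched every query */
    touch(0); touch(0);
    /* buffers 1-3: one-shot pages from a big scan, each touched once */
    touch(1); touch(2); touch(3);

    /* plain LRU would evict buffer 0 (its last access is oldest);
     * LRU-2 evicts one of the one-shot pages instead */
    printf("LRU-2 victim: buffer %d\n", lru2_victim());
    return 0;
}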
Re: [HACKERS] [GENERAL] Performance while loading data and indexing
Some of you may be interested in this seemingly exhaustive benchmark between ext2/3, ReiserFS, JFS, and XFS.

http://www.osdl.org/presentations/lwe-jgfs.pdf
Re: [HACKERS] Potential Large Performance Gain in WAL synching
Curtis Faith [EMAIL PROTECTED] writes:
> So, why don't we use files opened with O_DSYNC | O_APPEND for the WAL
> log and then use aio_write for all log writes?

We already offer an O_DSYNC option. It's not obvious to me what aio_write brings to the table (aside from loss of portability). You still have to wait for the final write to complete, no?

> 2) Allow transactions to complete and do work while other threads are
> waiting on the completion of the log write.

I'm missing something. There is no useful work that a transaction can do between writing its commit record and reporting completion, is there? It has to wait for that record to hit disk.

			regards, tom lane
Re: [HACKERS] [GENERAL] Performance while loading data and indexing
Hey, excellent. Thanks! Based on that, it appears that XFS is a pretty good FS to use. For me, the real surprise was how well ReiserFS performed.

Greg

On Thu, 2002-10-03 at 18:09, Mike Benoit wrote:
> Some of you may be interested in this seemingly exhaustive benchmark
> between ext2/3, ReiserFS, JFS, and XFS.
> http://www.osdl.org/presentations/lwe-jgfs.pdf
Re: [HACKERS] [GENERAL] Small patch for PL/Perl Misbehavior with Runtime Error Reporting
John Worsley [EMAIL PROTECTED] writes:
> I just stumbled across this peculiarity in PL/Perl today writing a
> method to invoke Perl regexes from a function: if a run-time error is
> raised in an otherwise good function, the function will never run
> correctly again until the connection to the database is reset. I
> poked around in the code and it appears that it's because when elog()
> raises the ERROR, it doesn't first take action to erase the system
> error message ($@) and consequently every subsequent run has an error
> raised, even if it runs successfully.

That seems a little weird. Does Perl really expect people to do that (ie, is it a documented part of some API)? I wonder whether there is some other action that we're supposed to take instead, but are missing...

> src/pl/plperl/plperl.c:
> 443c443,445
> <       elog(ERROR, "plperl: error from function: %s", SvPV(ERRSV, PL_na));
> ---
> >       elog(NOTICE, "plperl: error from function: %s", SvPV(ERRSV, PL_na));
> >       sv_setpv(perl_get_sv("@",FALSE),"");
> >       elog(ERROR, "plperl: error was fatal.");

If this is what we'd have to do, I think a better way would be

	perlerrmsg = pstrdup(SvPV(ERRSV, PL_na));
	sv_setpv(perl_get_sv("@",FALSE),"");
	elog(ERROR, "plperl: error from function: %s", perlerrmsg);

Splitting the ERROR into a NOTICE with the useful info and an ERROR without any isn't real good, because the NOTICE could get dropped on the floor (either because of min_message_level or a client that just plain loses notices).

			regards, tom lane
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
On Thu, Oct 03, 2002 at 07:09:33PM -0400, Tom Lane wrote:
> statement-arrival time. (I like the idea of a parameterized version
> of now() to provide a consistent interface to all three
> functionalities.)

I like this, too. I think it'd probably be useful. But. . .

> pride of place to statement-arrival time. In the end, I think that
> transaction-start time is the most commonly useful and safest variant,

. . . I also think this is true. If I'm doing a bunch of database operations in one transaction, there is a remarkably good argument that they happened at the same time. After all, the marked passage of time is probably just an unfortunate side effect of my database's inability to process things instantaneously.

A

--
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                             Toronto, Ontario Canada
[EMAIL PROTECTED]                       M2P 2A8
                                        +1 416 646 3304 x110
Re: [HACKERS] Potential Large Performance Gain in WAL synching
tom lane replies:
> Curtis Faith [EMAIL PROTECTED] writes:
> > So, why don't we use files opened with O_DSYNC | O_APPEND for the
> > WAL log and then use aio_write for all log writes?
>
> We already offer an O_DSYNC option. It's not obvious to me what
> aio_write brings to the table (aside from loss of portability). You
> still have to wait for the final write to complete, no?

Well, for starters, by the time the write which includes the commit log entry is written, much of the log for the transaction will already be on disk, or in a controller on its way.

I don't see any O_NONBLOCK or O_NDELAY references in the sources, so it looks like the log writes are blocking. If I read correctly, XLogInsert calls XLogWrite, which calls write, which blocks. If these assumptions are correct, there should be some significant gain here, but I won't know how much until I try to change it. This issue only affects the speed of a given back-end's transaction processing capability.

The REAL issue, and the one that will greatly affect total system throughput, is that of contention on the file locks. Since fsync needs to obtain a write lock on the file descriptor, as do the write calls which originate from XLogWrite as the writes are written to the disk, other back-ends will block while another transaction is committing if the log cache fills to the point where their XLogInsert results in an XLogWrite call to flush the log cache. I'd guess this means that one won't gain much by adding other back-end processes past three or four if there are a lot of inserts or updates.

The method I propose does not result in any blocking because of writes other than the final commit's write, and it has the very significant advantage of allowing other transactions (from other back-ends) to continue until they enter commit (and block waiting for their final commit write to complete).

> > 2) Allow transactions to complete and do work while other threads
> > are waiting on the completion of the log write.
>
> I'm missing something. There is no useful work that a transaction can
> do between writing its commit record and reporting completion, is
> there? It has to wait for that record to hit disk.

The key here is that a back-end that has not committed, and therefore is not blocking, can do work while other back-ends are waiting on their commit writes.

- Curtis

P.S. If I am right in my assumptions about the way the current system works, I'll bet the change would speed up inserts in Shridhar's huge database test by at least a factor of two or three, perhaps even an order of magnitude. :-)
Re: [HACKERS] Potential Large Performance Gain in WAL synching
Curtis Faith wrote:
> The method I propose does not result in any blocking because of
> writes other than the final commit's write, and it has the very
> significant advantage of allowing other transactions (from other
> back-ends) to continue until they enter commit (and block waiting for
> their final commit write to complete).
> ...
> The key here is that a back-end that has not committed, and therefore
> is not blocking, can do work while other back-ends are waiting on
> their commit writes.

I may be missing something here, but other backends don't block while one writes to WAL. Remember, we are process based, not thread based, so the write() call blocks only the one session. If you had threads, and you did a write() call that blocked other threads, I can see where your idea would be good, and where async i/o becomes an advantage.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  + If your life is a hard drive,      |  13 Roberts Road
  + Christ can be your backup.         |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] Potential Large Performance Gain in WAL synching
Curtis Faith [EMAIL PROTECTED] writes:
> The REAL issue, and the one that will greatly affect total system
> throughput, is that of contention on the file locks. Since fsync
> needs to obtain a write lock on the file descriptor, as do the write
> calls which originate from XLogWrite as the writes are written to the
> disk, other back-ends will block while another transaction is
> committing if the log cache fills to the point where their XLogInsert
> results in an XLogWrite call to flush the log cache.

But that's exactly *why* we have a log cache: to ensure we can buffer a reasonable amount of log data between XLogFlush calls. If the above scenario is really causing a problem, doesn't that just mean you need to increase wal_buffers?

			regards, tom lane
Re: [HACKERS] Advice: Where could I be of help?
I wrote:
> My modification was to use access counts to increase the durability
> of the more accessed blocks.

tom lane replies:
> You could do it that way too, but I'm unsure whether the extra
> complexity will buy anything. Ultimately, I think an LRU-anything
> algorithm is equivalent to a clock sweep for those pages that only
> get touched once per some-long-interval: the single-touch guys get
> recycled in order of last use, which seems just like a clock sweep
> around the cache. The guys with some amount of preference get
> excluded from the once-around sweep. To determine whether LRU-2 is
> better or worse than some other preference algorithm requires a finer
> grain of analysis than this. I'm not a fan of "more complex must be
> better", so I'd want to see why it's better before buying into it ...

I'm definitely not a fan of "more complex must be better" either. In fact, it's surprising how often the real performance problems are easy to fix and simple, while many person-years are spent solving the issue everyone knows must be causing the performance problems, only to find little gain. The key here is empirical testing. If the cache hit ratio for LRU-2 is much better, then there may be no need here. OTOH, it took less than 30 lines or so of code to do what I described, so I don't consider it too, too complex :=}

We should run a test which includes running indexes (or is indices the PostgreSQL convention?) that are three or more times the size of the cache to see how well LRU-2 works. Is there any cache performance reporting built into pgsql?

tom lane wrote:
> Shouldn't the OS be responsible for scheduling those writes
> appropriately? Ye good olde elevator algorithm ought to handle this;
> and it's at least one layer closer to the actual disk layout than we
> are, thus more likely to issue the writes in a good order. It's worth
> experimenting with, perhaps, but I'm pretty dubious about it.

I wasn't proposing anything other than changing the order of the writes, not actually ensuring that they get written that way at the level you describe above. This will help a lot on brain-dead file systems that can't do this ordering, and probably also in cases where the number of blocks in the cache is very large (see the sketch after this message).

On a related note, while looking at the code, it seems to me that we are writing out the buffer cache synchronously, so there won't be any possibility of the file system reordering anyway. This appears to be a huge performance problem. I've read claims in the archives that the buffers are written asynchronously, but my read of the code says otherwise. Can someone point out my error? I only see calls that ultimately call FileWrite or write(2), which will block without an O_NOBLOCK open. I thought one of the main reasons for having a WAL is so that you can write out the buffers asynchronously. What am I missing?

I wrote:
> Then during execution if the planner turned out to be VERY wrong
> about certain assumptions the execution system could update the stats
> that led to those wrong assumptions. That way the system would seek
> the correct values automatically.

tom lane replied:
> That has been suggested before, but I'm unsure how to make it work.
> There are a lot of parameters involved in any planning decision and
> it's not obvious which ones to tweak, or in which direction, if the
> plan turns out to be bad. But if you can come up with some ideas, go
> to it!

I'll have to look at the current planner before I can suggest anything concrete.
- Curtis
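The write-ordering idea above amounts to sorting the dirty list by file and block before flushing, so writes are issued in ascending disk order. A minimal sketch, with invented names:

/*
 * Sketch: order dirty-buffer writes by (file, block) before flushing.
 * DirtyBuf and the sample data are invented for illustration.
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    int     file;       /* relation file identifier */
    long    block;      /* block number within the file */
} DirtyBuf;

static int
cmp_dirty(const void *a, const void *b)
{
    const DirtyBuf *x = a,
                   *y = b;

    if (x->file != y->file)
        return (x->file > y->file) - (x->file < y->file);
    return (x->block > y->block) - (x->block < y->block);
}

int
main(void)
{
    DirtyBuf    dirty[] = {{2, 900}, {1, 14}, {2, 13}, {1, 600}, {1, 15}};
    size_t      n = sizeof(dirty) / sizeof(dirty[0]);
    size_t      i;

    qsort(dirty, n, sizeof(DirtyBuf), cmp_dirty);

    for (i = 0; i < n; i++)     /* issue writes in ascending disk order */
        printf("write file %d block %ld\n", dirty[i].file, dirty[i].block);
    return 0;
}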
Re: [HACKERS] Potential Large Performance Gain in WAL synching
I wrote:
> The REAL issue, and the one that will greatly affect total system
> throughput, is that of contention on the file locks. Since fsync
> needs to obtain a write lock on the file descriptor, as do the write
> calls which originate from XLogWrite as the writes are written to the
> disk, other back-ends will block while another transaction is
> committing if the log cache fills to the point where their XLogInsert
> results in an XLogWrite call to flush the log cache.

tom lane wrote:
> But that's exactly *why* we have a log cache: to ensure we can buffer
> a reasonable amount of log data between XLogFlush calls. If the above
> scenario is really causing a problem, doesn't that just mean you need
> to increase wal_buffers?

Well, in cases where there are a lot of small transactions, the contention will not be on the XLogWrite calls from caches getting full but from XLogWrite calls from transaction commits, which will happen very frequently. I think this will have a detrimental effect on very high update frequency performance.

So while larger WAL caches will help in the case of cache flushing because the cache is full, I don't think they will make any difference for the potentially more common case of transaction commits.

- Curtis
Re: [HACKERS] Return of INSTEAD rules
Tom Lane wrote:
> Bruce Momjian [EMAIL PROTECTED] writes:
> > We have talked about possible return values for RULES, particularly
> > INSTEAD rules. Manfred has a nice example here, so I propose we
> > handle INSTEAD rules this way: that we return the oid and tuple
> > count of the last INSTEAD rule query with a tag matching the main
> > query.
>
> Hmm ... that's subtly different from what I'd seen discussed before.
> I thought the idea was
>
> 1. If no INSTEAD rule: return tag, count, and OID of original query,
> regardless of what is added by non-INSTEAD rules. (I think this part
> is not controversial.)
>
> 2. If any INSTEAD rule: return tag, count, and OID of the last
> executed query that has the same tag as the original query. If no
> substituted query matches the original query's tag, return original
> query's tag with zero count and OID. (This is where the going gets
> tough.)
>
> I think you just modified the second part of that to restrict it to
> queries that were added by INSTEAD rules. This is doable but it's not
> a trivial change --- in particular, I think it implies adding another
> field to the Query data structure so we can mark INSTEAD-added vs
> non-INSTEAD-added queries. Which means an initdb because it breaks
> stored rules.

I am confused how yours differs from mine. I don't see how the last matching tagged query would not be from an INSTEAD rule. Are you thinking of multiple queries in the query string?

> Offhand I think this might be worth doing, because I like that subtle
> change in behavior. But we should understand exactly what we're doing
> here...

Seems we are adding up reasons for initdb. :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  + If your life is a hard drive,      |  13 Roberts Road
  + Christ can be your backup.         |  Newtown Square, Pennsylvania 19073
Re: [HACKERS] Return of INSTEAD rules
Bruce Momjian [EMAIL PROTECTED] writes: I am confused how yours differs from mine. I don't see how the last matching tagged query would not be from an INSTEAD rule. You could have both INSTEAD and non-INSTEAD rules firing for the same original query. If the alphabetically-last rule is a non-INSTEAD rule, then there's a difference. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
On Fri, 2002-10-04 at 01:41, Bruce Momjian wrote: Well, let's see what others say. If no one is excited about the change, we can just document its current behavior. Oh, I see it is already documented in func.sgml: It is quite important to realize that <function>CURRENT_TIMESTAMP</function> and related functions all return the time as of the start of the current transaction; their values do not increment while a transaction is running. But <function>timeofday()</function> returns the actual current time. Seems that isn't helping enough to reduce the number of people who are surprised by our behavior. I don't think anyone would be surprised by statement time. What do others think? I would prefer that CURRENT_TIME[STAMP] always produce the same time within a transaction. If it is changed, it will certainly break one of my applications, which explicitly depends on the current behaviour. If you change it, please provide an alternative way of doing the same thing. I can see that the current behaviour might give surprising results in a long-running transaction. Surprise could be reduced by giving the time of first use within the transaction rather than the start of the transaction. -- Oliver Elphick[EMAIL PROTECTED] Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C For the word of God is quick, and powerful, and sharper than any twoedged sword, piercing even to the dividing asunder of soul and spirit, and of the joints and marrow, and is a discerner of the thoughts and intents of the heart.Hebrews 4:12 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
Oliver Elphick [EMAIL PROTECTED] writes: I can see that the current behaviour might give surprising results in a long running transaction. Surprise could be reduced by giving the time of first use within the transaction rather than the start of the transaction. [ cogitates ... ] Hmm, we could do that, and it probably would break few if any existing apps. But would it really reduce the surprise factor? The complaints we've heard so far all seemed to come from people who expected multiple current_timestamp calls to show advancing times within a transaction. Oliver's idea might be worth doing just on performance grounds: instead of a gettimeofday() call at the start of every transaction, we'd only have to reset a flag variable. When and if current_timestamp is done inside the transaction, then call the kernel to ask what time it is. We win on every transaction that does not contain a current_timestamp call, which is probably a good bet for most apps. But I don't think this does much to resolve the behavioral complaints. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
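A minimal sketch of the lazy-evaluation scheme Tom outlines, under the assumption that transaction start merely clears a flag and the first timestamp request inside the transaction is what calls the kernel (hypothetical names, not the backend's code):

    #include <stdbool.h>
    #include <sys/time.h>

    static bool           xact_time_valid = false;
    static struct timeval xact_time;

    /* Called at transaction start: no kernel call, just drop the cache. */
    void
    start_transaction(void)
    {
        xact_time_valid = false;
    }

    /* First use inside the transaction fixes the value; later uses reuse it. */
    struct timeval
    get_current_timestamp(void)
    {
        if (!xact_time_valid)
        {
            gettimeofday(&xact_time, NULL);
            xact_time_valid = true;
        }
        return xact_time;
    }

Transactions that never ask for the timestamp never pay for the gettimeofday() call, which is the performance win Tom notes; repeated calls within one transaction still see one frozen value.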
Re: [HACKERS] Potential Large Performance Gain in WAL synching
Bruce Momjian wrote: I may be missing something here, but other backends don't block while one writes to WAL. I don't think they'll block until they get to the fsync or XLogWrite call while another transaction is fsync'ing. I'm no Unix filesystem expert but I don't see how the OS can handle multiple writes and fsyncs to the same file descriptors without blocking other processes from writing at the same time. It may be that there are some clever data structures they use but I've not seen huge praise for most of the file systems. A well-written file system could minimize this contention but I'll bet it's there with most of the ones that PostgreSQL most commonly runs on. I'll have to write a test and see if there really is a problem. - Curtis -Original Message- From: Bruce Momjian [mailto:[EMAIL PROTECTED]] Sent: Friday, October 04, 2002 12:44 AM To: Curtis Faith Cc: Tom Lane; Pgsql-Hackers Subject: Re: [HACKERS] Potential Large Performance Gain in WAL synching Curtis Faith wrote: The method I propose does not result in any blocking because of writes other than the final commit's write and it has the very significant advantage of allowing other transactions (from other back-ends) to continue until they enter commit (and blocking waiting for their final commit write to complete). 2) Allow transactions to complete and do work while other threads are waiting on the completion of the log write. I'm missing something. There is no useful work that a transaction can do between writing its commit record and reporting completion, is there? It has to wait for that record to hit disk. The key here is that a thread that has not committed and therefore is not blocking can do work while other threads (should have said back-ends or processes) are waiting on their commit writes. I may be missing something here, but other backends don't block while one writes to WAL. Remember, we are process based, not thread based, so the write() call only blocks the one session. If you had threads, and you did a write() call that blocked other threads, I can see where your idea would be good, and where async i/o becomes an advantage. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
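A rough sketch of the kind of test Curtis mentions: time plain write() calls in one process while a child process fsync()s the same file. The file name and loop counts are arbitrary, and results will of course vary by OS and filesystem; this only probes whether concurrent fsyncs delay writers.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static double
    now_sec(void)
    {
        struct timeval tv;

        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int
    main(void)
    {
        char   block[8192] = {0};
        double t0;
        int    i;
        int    fd = open("walfile.test", O_WRONLY | O_CREAT, 0600);

        if (fd < 0)
        {
            perror("open");
            return 1;
        }
        if (fork() == 0)
        {
            /* child: fsync in a loop, like a stream of committing backends */
            for (i = 0; i < 100; i++)
                fsync(fd);
            _exit(0);
        }
        /* parent: measure how much the concurrent fsyncs delay plain writes */
        t0 = now_sec();
        for (i = 0; i < 1000; i++)
            write(fd, block, sizeof(block));
        printf("1000 writes took %.3f sec\n", now_sec() - t0);
        wait(NULL);
        return 0;
    }

Comparing the reported time against a run with the fsync child disabled gives a crude measure of the contention in question.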
Re: [HACKERS] [GENERAL] OT: Looking to Open Source the Flash training material
On Thu, Oct 03, 2002 at 10:26:16PM +1000, Justin Clift wrote: Have been thinking for a while now about viable ways to Open Source the Flash based training material that has been in development from last year. After discussing this with a number of people for suggestions, feedback, advise, etc, these are looking to be the general concepts that, as a whole, would likely work to greatest effect: Is there some reason not to use FSF's FDL? -- Alvaro Herrera (alvherre[a]atentus.com) When the proper man does nothing (wu-wei), his thought is felt ten thousand miles. (Lao Tse) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
On Thu, Oct 03, 2002 at 04:18:08PM -0400, Bruce Momjian wrote: So, we have a couple of decisions to make: Should CURRENT_TIMESTAMP be changed to statement arrival time? Should now() be changed the same way? If not, should now() and CURRENT_TIMESTAMP return the same type of value? One idea is to change CURRENT_TIMESTAMP to statement arrival time, and leave now() as transaction start time. A disadvantage to this, as I see it, is that users may have depended on the traditional Postgres behaviour of time freezing in transaction. You always had to select timeofday() for moving time. I can see an advantage in making what Postgres does somewhat more like what other people do (as flat-out silly as some of that seems to be). Still, it looks to me like the present CURRENT_TIMESTAMP implementation is at least as much like the spec as anyone else's implementation, and more like the spec than many of them. So I'm still not clear on what problem the change is going to fix, especially since it breaks with traditional behaviour. A -- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada [EMAIL PROTECTED] M2P 2A8 +1 416 646 3304 x110 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
Bruce Momjian [EMAIL PROTECTED] writes: So, in summary, reasons for the change: more intuitive more standard-compliant more closely matches other db's I'd give you the first and third of those. As Andrew noted, the argument that it's more standard-compliant is not very solid. Reasons not to change: PostgreSQL traditional behavior You've phrased that in a way that makes it sound like the decision is a no-brainer. How about Breaks existing Postgres applications in non-obvious ways which I think is a more realistic description of the downside. Also, it seems a lot of people who have thought about this carefully think that the start-of-transaction behavior is just plain more useful. The fact that it surprises novices is not a reason why people who know the behavior shouldn't want it to work like it does. (The behavior of nextval/currval for sequences surprises novices, too, but I haven't heard anyone claim we should change it because of that.) So I think a fairer summary is Pro: more intuitive (but still not what an unversed person would expect, namely true current time) arguably more standard-compliant more closely matches other db's (but still not very closely) Con: breaks existing Postgres applications in non-obvious ways arguably less useful than our traditional behavior I've got no problem with the idea of adding a way to get at statement-arrival time. (I like the idea of a parameterized version of now() to provide a consistent interface to all three functionalities.) But I'm less than enthused about changing the existing functions to give pride of place to statement-arrival time. In the end, I think that transaction-start time is the most commonly useful and safest variant, and so I feel it ought to have pride of place as the easiest one to get at. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
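The parameterized now() Tom mentions could look roughly like this sketch, which exposes transaction-start, statement-arrival, and true current time behind one entry point. The names, the string arguments, and the caching variables are all hypothetical, not a committed interface.

    #include <string.h>
    #include <sys/time.h>

    static struct timeval xact_start;    /* would be set at transaction start */
    static struct timeval stmt_arrival;  /* would be set at statement arrival */

    void mark_transaction_start(void) { gettimeofday(&xact_start, NULL); }
    void mark_statement_arrival(void) { gettimeofday(&stmt_arrival, NULL); }

    /* now_variant("transaction"), now_variant("statement"), now_variant("clock") */
    struct timeval
    now_variant(const char *which)
    {
        struct timeval tv;

        if (strcmp(which, "transaction") == 0)
            return xact_start;
        if (strcmp(which, "statement") == 0)
            return stmt_arrival;
        gettimeofday(&tv, NULL);         /* true current time, like timeofday() */
        return tv;
    }

Under this shape, transaction-start time can keep pride of place as the default while the other two behaviors remain one explicit argument away.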
Re: [HACKERS] pg_dump and large files - is this a problem?
At 07:15 AM 4/10/2002 +1000, Giles Lean wrote: My limited reading of off_t stuff now suggests that it would be brave to assume it is even a simple 64 bit number (or even 3 32 bit numbers). What are you reading?? If you find a platform with 64 bit file offsets that doesn't support 64 bit integral types I will not just be surprised but amazed. Yes, but there is no guarantee that off_t is implemented as such, nor would we be wise to assume so (most docs say explicitly not to do so). Unless anyone knows of a documented way to get 64 bit uint/int file offsets, I don't see we have much choice. If you're on a platform that supports large files it will either have a straightforward 64 bit off_t or else will support the large files API that is common on Unix-like operating systems. What are you trying to do, exactly? Again yes, but the problem is the same: we need a way of making the *value* of an off_t portable (not just assuming it's an int64). In general that involves knowing how to turn it into a more universal data type (eg. int64, or even a string). Does the large file API have functions for representing the off_t values that are portable across architectures? And is the API also portable? Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Performance while loading data and indexing
Greg Copeland wrote: -- Start of PGP signed section. Hey, excellent. Thanks! Based on that, it appears that XFS is a pretty good FS to use. For me, the real surprise was how well reiserfs performed. OK, hardware performance paper updated: --- File system choice is particularly difficult on Linux because there are so many file system choices, and none of them are optimal: ext2 is not entirely crash-safe; ext3, xfs, and jfs are journal-based; and Reiser is optimized for small files and does journalling. The journalling file systems can be significantly slower than ext2, but when crash recovery is required, ext2 isn't an option. If ext2 must be used, mount it with sync enabled. Some people recommend xfs or an ext3 filesystem mounted with data=writeback. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: So, in summary, reasons for the change: more intuitive more standard-compliant more closely matches other db's I'd give you the first and third of those. As Andrew noted, the argument that it's more standard-compliant is not very solid. The standard doesn't say anything about transaction in this regard. I actually think Oracle is closer to the standard than we are right now. Reasons not to change: PostgreSQL traditional behavior You've phrased that in a way that makes it sound like the decision is a no-brainer. How about Breaks existing Postgres applications in non-obvious ways which I think is a more realistic description of the downside. I had used Andrew's words: the traditional Postgres behaviour of time freezing in transaction. Yes, breaking is a clearer description. Also, it seems a lot of people who have thought about this carefully think that the start-of-transaction behavior is just plain more useful. The fact that it surprises novices is not a reason why people who know the behavior shouldn't want it to work like it does. (The behavior of nextval/currval for sequences surprises novices, too, but I haven't heard anyone claim we should change it because of that.) No one has suggested a more intuitive solution for sequences, or we would have discussed it. So I think a fairer summary is Pro: more intuitive (but still not what an unversed person would expect, namely true current time) arguably more standard-compliant What does arguably mean? That seems more like a throw-away objection. more closely matches other db's (but still not very closely) Closer! No need to qualify what I said. It is more of all these things, not exactly, of course. Con: breaks existing Postgres applications in non-obvious ways arguably less useful than our traditional behavior I've got no problem with the idea of adding a way to get at statement-arrival time. (I like the idea of a parameterized version of now() to provide a consistent interface to all three functionalities.) But I'm less than enthused about changing the existing functions to give pride of place to statement-arrival time. In the end, I think that transaction-start time is the most commonly useful and safest variant, and so I feel it ought to have pride of place as the easiest one to get at. Well, let's see what others say. If no one is excited about the change, we can just document its current behavior. Oh, I see it is already documented in func.sgml: It is quite important to realize that <function>CURRENT_TIMESTAMP</function> and related functions all return the time as of the start of the current transaction; their values do not increment while a transaction is running. But <function>timeofday()</function> returns the actual current time. Seems that isn't helping enough to reduce the number of people who are surprised by our behavior. I don't think anyone would be surprised by statement time. What do others think? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] pg_dump and large files - is this a problem?
Philip Warner writes: Yes, but there is no guarantee that off_t is implemented as such, nor would we be wise to assume so (most docs say explicitly not to do so). I suspect you're reading old documents, which is why I asked what you were referring to. In the '80s what you are saying would have been best practice, no question: 64 bit type support was not common. When talking of near-current systems with 64 bit off_t you are not going to find one without support for 64 bit integral types. Again yes, but the problem is the same: we need a way of making the *value* of an off_t portable (not just assuming it's a int64). In general that involves knowing how to turn it into a more universal data type (eg. int64, or even a string). So you need to know the size of off_t, which will be 32 bit or 64 bit, and then you need routines to convert that to a portable representation. The canonical solution is XDR, but I'm not sure that you want to bother with it or if it has been extended universally to support 64 bit types. If you limit the file sizes to 1GB (your less preferred option, I know;-) then like the rest of the PostgreSQL code you can safely assume that off_t fits into 32 bits and have a choice of functions (XDR or ntohl() etc) to deal with them and ignore 64 bit off_t issues altogether. If you intend pg_dump files to be portable avoiding the use of large files will be best. It also avoids issues on platforms such as HP-UX where large file support is available, but it has to be enabled on a per-filesystem basis. :-( Does the large file API have functions for representing the off_t values that is portable across architectures? And is the API also portable? The large files API is a way to access large files from 32 bit processes. It is reasonably portable, but is a red herring for what you are wanting to do. (I'm not convinced I am understanding what you're trying to do, but I have 'flu which is not helping. :-) Regards, Giles ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
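One possible shape for the portable representation Giles describes: widen the off_t to a fixed 64-bit type and emit bytes in a fixed order, so the dump format never depends on the platform's off_t width or endianness. This is a sketch only, with no error handling; real code would also need the range check Giles mentions when reading back on a platform with a narrower off_t.

    #include <stdio.h>
    #include <sys/types.h>

    /* Write a non-negative off_t as 8 big-endian bytes. */
    void
    write_offset(FILE *fp, off_t offset)
    {
        unsigned long long v = (unsigned long long) offset;
        int i;

        for (i = 7; i >= 0; i--)
            fputc((int) ((v >> (i * 8)) & 0xff), fp);
    }

    /* Read the 8 bytes back; overflow/EOF checks omitted for brevity. */
    off_t
    read_offset(FILE *fp)
    {
        unsigned long long v = 0;
        int i;

        for (i = 0; i < 8; i++)
            v = (v << 8) | (unsigned long long) (fgetc(fp) & 0xff);
        return (off_t) v;
    }

This sidesteps XDR entirely at the cost of hand-rolling the byte order, which is arguably appropriate for a dump format that must outlive any one platform.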
Re: [HACKERS] Improving backend startup interlock
Have people considered flock (advisory locking) on the postmaster.pid file for backend detection? It has a nonblocking option. Don't most OS's support it? I can't understand why we can't get an easier solution to postmaster detection than shared memory. --- Tom Lane wrote: Giles Lean [EMAIL PROTECTED] writes: I'm certainly no fan of NFS locking, but if someone trusts their NFS client and server implementations enough to put their data on, they might as well trust it to get a single lock file for startup right too. IMHO. Your mileage may vary. Well, my local man page for lockf() sez The advisory record-locking capabilities of lockf() are implemented throughout the network by the ``network lock daemon'' (see lockd(1M)). If the file server crashes and is rebooted, the lock daemon attempts to recover all locks associated with the crashed server. If a lock cannot be reclaimed, the process that held the lock is issued a SIGLOST signal. and the lockd man page mentions that not only lockd but statd have to be running locally *and* at the NFS server. This sure sounds like file locking on NFS introduces additional failure modes above and beyond what we have already. Since the entire point of this locking exercise is to improve PG's robustness, solutions that depend on other daemons not crashing don't sound like a step forward to me. I'm willing to trust the local kernel, but I get antsy if I have to trust more than that. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
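A sketch of the flock() approach Bruce suggests (hypothetical function, not postmaster code): take a non-blocking exclusive lock on postmaster.pid and hold the descriptor open for the life of the process. The kernel releases the lock automatically when the holder dies, even after kill -9, which is the attraction over comparing stale pids; the NFS caveats Tom raises still apply, since flock() is typically local-only.

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/file.h>
    #include <unistd.h>

    int
    acquire_data_dir_lock(const char *pidfile)
    {
        int fd = open(pidfile, O_RDWR | O_CREAT, 0600);

        if (fd < 0)
            return -1;
        if (flock(fd, LOCK_EX | LOCK_NB) < 0)
        {
            if (errno == EWOULDBLOCK)
                fprintf(stderr, "another postmaster already holds %s\n", pidfile);
            close(fd);
            return -1;
        }
        return fd;               /* keep open for the life of the process */
    }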
Re: [HACKERS] v7.3 Branched ...
Just a reminder, we are not using this tag. All 7.3 patches are going to HEAD. Once we decide to split the tree for 7.4, we will update this branch and announce it is ready to be used. --- Marc G. Fournier wrote: As was previously discussed (and now that I'm mostly back from the dead ... damn colds) I've just branched off REL7_3_STABLE ... all future beta's will be made based off of that branch, so that development may resume on the main branch ... So, for those doing commits or anoncvs, remember that the 'stable' branch requires you to use: -rREL7_3_STABLE while the development branch is 'as per normal' ... ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] anoncvs and diff
Nigel J. Andrews wrote: I've been waiting to see how a patched file differs from my version. The patch was added to the to apply list last week I think (it wasn't mine btw) and I've been doing cvs diff to view the differences so I can tell when the patch has been applied. Additional information given by this is the revision number the comparison is against of course. This has stayed at 1.61 all the time I've been doing this cvs diff operation. Looking at the web interface to cvs I see the file has a revision number of 1.64. I use the anoncvs server for my operations. Am I being daft or is there a problem with the anoncvs archive? That is strange. anoncvs and the web interface should have the same version number. What file are you looking at? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Large databases, performance
On 3 Oct 2002 at 19:33, Shridhar Daithankar wrote: On 3 Oct 2002 at 13:56, Nigel J. Andrews wrote: It's one hell of a DB you're building. I'm sure I'm not the only one interested so to satisfy those of us who are nosey: can you say what the application is? I'm sure we'll all understand if it's not possible for you to mention such information. Well, I can't tell everything, but some things I can.. 1) This is a system that does not have online capability yet. This is an attempt to provide one. 2) The goal is to avoid costs like licensing oracle. I am sure this would make a great example for OSDB advocacy, whichever database wins.. 3) The database size estimates I put earlier, i.e. 9 billion tuples/900GB data size, are in a fixed window. The data is generated from some real time systems. You can imagine the rate. Read that: fixed time window.. 4) Furthermore, there are timing restrictions attached to it. 5K inserts/sec. 4800 queries per hour with response time of 10 sec. each. It's this aspect that has forced us into partitioning.. And contrary to my earlier information, this is going to be a live system rather than a backup one.. A better win for postgresql.. I hope it makes it. And BTW, all these results were on reiserfs. We didn't find much of a difference in write performance between them. So we stick to reiserfs. And of course we got the latest hot shot Mandrake 9 with 2.4.19-16, which really made a difference over RHL 7.2.. Well, we were comparing ext3 v/s reiserfs. I don't remember the journalling mode of ext3 but we did a 10 GB write test. Besides, converting the RAID to RAID-0 from RAID-5 might have something to do with it. There was a discussion on hackers some time back as to which file system is better. I hope this might be an addition to it.. Bye Shridhar -- What terrible way to die. There are no good ways. -- Sulu and Kirk, That Which Survives, stardate unknown ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Large databases, performance
NOTE: Setting follow-up to the performance list. Funny that the status quo seems to be: if you need fast selects on data that has few inserts, pick mysql; otherwise, if you have a lot of inserts and don't need super fast selects, go with PostgreSQL; yet your data seems to cut directly against this. I'm curious, did you happen to run the select tests while also running the insert tests? IIRC the older mysql versions have to lock the table when doing the insert, so select performance goes in the dumper in that scenario; perhaps that's not an issue with 3.23.52? It also seems like the vacuum after each insert is unnecessary, unless you're also deleting/updating data behind it. Perhaps just running an ANALYZE on the table would suffice while reducing overhead. Robert Treat On Thu, 2002-10-03 at 08:36, Shridhar Daithankar wrote:

Machine: Compaq Proliant ML 530 server, Intel Xeon 2.4 GHz processor x 4, 4 GB RAM, 5 x 72.8 GB SCSI HDD, RAID 0 (striping) hardware setup, Mandrake Linux 9.0. Cost: $13,500 ($1,350 for each additional 72 GB HDD).

Performance parameter                     MySQL 3.23.52    MySQL 3.23.52    PostgreSQL 7.2.2
                                          without InnoDB   with InnoDB      (built-in
                                                           (transactional)  transactions)

Complete data inserts + building a
composite index (40 GB data,
432,000,000 tuples of about 100 bytes
each, schema on 'schema' sheet;
composite index on esn, min, datetime)    3738 secs        18720 secs       20628 secs

Load speed                                115570 tuples/s  23076 tuples/s   20942 tuples/s

Database size on disk                     48 GB            87 GB            111 GB

Average per partition: inserts +
building a composite index (300 MB
data, 3,000,000 tuples)                   28 secs          130 secs         150 secs

Select query (equality match on esn
and min; 4 concurrent queries running)    7 secs           7 secs           6 secs

Database size on disk (per partition)     341 MB           619 MB           788 MB

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] anoncvs and diff
Nigel J. Andrews wrote: cvs diff -r HEAD pltcl.c gave me differences against revision 1.64; cvs update pltcl.c said it was merging changes between 1.64 and 1.61; and a plain cvs diff now shows me differences against 1.64. I think this is probably just a shortfall in my fairly basic knowledge of how cvs works. What does 'cvs log' say about the file, especially the top stuff? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: AIX compilation problems (was Re: [HACKERS] Proposal ...)
has this patch been applied to CVS yet? On Tue, 1 Oct 2002, Zeugswetter Andreas SB SD wrote: Date: Tue, 1 Oct 2002 10:23:13 +0200 From: Zeugswetter Andreas SB SD [EMAIL PROTECTED] To: Peter Eisentraut [EMAIL PROTECTED] Cc: PostgreSQL Development [EMAIL PROTECTED] Subject: Re: AIX compilation problems (was Re: [HACKERS] Proposal ...) Attached is a patch to fix the mb linking problems on AIX. As a nice side effect it reduces the duplicate symbol warnings to linking libpq.so and libecpg.so (all shlibs that are not postmaster loadable modules). Can you explain the method behind your patch? Have you tried -bnogc? -bnogc would (probably) have been the correct switch, reading the man page, but the method was previously not good since it involved the following:

1. create a static postgres executable from the SUBSYS.o's
2. create an exports file from the above
3. recreate a shared postgres executable

This naturally had a cyclic dependency that could not properly be reflected in the Makefile (thus a second make sometimes left you with a static postgres unless you manually removed postgres.imp). Now it does:

postgres.imp: $(OBJS)
    create a temporary SUBSYS.o from all $(OBJS)
    create a postgres.imp from SUBSYS.o
    rm the temporary SUBSYS.o

postgres: postgres.imp
    link a shared postgres

A second change was to move the import and export files to the end of the link line; then the linker knows not to throw a duplicate symbol warning, and keeps all symbols that are mentioned in the exports file (== -bnogc restricted to $(OBJS) symbols). Thus now only libpq.so and libecpg.so still show the duplicate symbol warnings, since their link line should actually not include postgres.imp. I did not see how to make a difference between loadable modules (which need postgres.imp) and interface libraries (which do not need postgres.imp), but since the resulting libs are ok, I left it at that. I tested both gcc and xlc, including regression tests. Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly [EMAIL PROTECTED] (Samuel A Horwitz) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [GENERAL] Large databases, performance
On 3 Oct 2002 at 8:54, Charles H. Woloszynski wrote: Can you comment on the tools you are using to do the insertions (Perl, Java?) and the distribution of data (all random, all static), and the transaction scope (all inserts in one transaction, each insert as a single transaction, some group of inserts as a transaction). Most probably it's all inserts in one transaction, spread almost uniformly over around 15-20 tables. Of course there will be a bunch of transactions.. I'd be curious what happens when you submit more queries than you have processors (you had four concurrent queries and four CPUs), if you care to run any additional tests. Also, I'd report the query time in absolute (like you did) and also in 'Time/number of concurrent queries'. This will give you a sense of how the system is scaling as the workload increases. Personally I am more concerned about this aspect than the load time, since I am going to guess that this is where all the time is spent. I don't think so, because we plan to put enough shared buffers that they would almost contain the indexes in RAM, if not the data. Besides, the number of tuples expected per query is not large. So more concurrent queries are not going to hog anything other than CPU power at most. Our major concern remains load time, as data is generated in real time and is expected in the database within a specified time period. We need indexes for querying, and inserting into an indexed table is one hell of a job. We did attempt inserting 8GB of data into an indexed table. It took almost 20 hours at 1K tuples per second on average.. Though impressive, it's not acceptable for that load.. Was the original posting on GENERAL or HACKERS? Is this moving to PERFORMANCE for follow-up? I'd like to follow this discussion and want to know if I should join another group? Shall I subscribe to performance? What's the exact list name? Benchmarks? I don't see anything like a performance mailing list on this page.. http://developer.postgresql.org/mailsub.php?devlp P.S. Anyone want to comment on their expectation for 'commercial' databases handling this load? I know that we cannot speak about specific performance metrics on some products (licensing restrictions) but I'd be curious if folks have seen some of the databases out there handle these dataset sizes and respond reasonably. Well, if something handles such kind of data with a single machine and costs under USD 20K for the entire setup, I would be willing to recommend that to the client.. BTW we are trying the same test on HP-UX. I hope we get some better figures on 64 bit machines.. Bye Shridhar -- Clarke's Conclusion:Never let your sense of morals interfere with doing the right thing. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Large databases, performance
Shridhar Daithankar wrote: snip Was the original posting on GENERAL or HACKERS. Is this moving the PERFORMANCE for follow-up? I'd like to follow this discussion and want to know if I should join another group? Shall I subscribe to performance? What's the exat list name? Benchmarks? I don't see anything as performance mailing list on this page.. http://developer.postgresql.org/mailsub.php?devlp It's a fairly new mailing list. :) [EMAIL PROTECTED] Easiest way to subscribe is by emailing [EMAIL PROTECTED] with: subscribe pgsql-performance as the message body. :-) Regards and best wishes, Justin Clift snip Bye Shridhar -- Clarke's Conclusion:Never let your sense of morals interfere with doing the right thing. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there. - Indira Gandhi ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] anoncvs and diff
On Thu, 3 Oct 2002, Bruce Momjian wrote: Nigel J. Andrews wrote: cvs diff -r HEAD pltcl.c gave me differences against revision 1.64 and cvs update pltcl.c said it was merging changes between 1.64 and 1.61 and a plain cvs diff now shows me differences against 1.64 I think this is probably just a short fall in my fairly basic knowledge of how cvs works. What does 'cvs log' say about the file, especially the top stuff? It gave me the log all the way up to the 1.64 revision, with the REL7_3_STABLE label assigned to revision 1.64.0.2. Revision 1.64 apparently backs out my patch, which made 1.63. I had a brain wave and did the cvs log command, which was what led me to try specifying revisions. As I say, it looks like a lack of knowledge about how cvs works for these things. I always thought it worked like RCS and gave a diff against the latest checked-in version, but obviously not. BTW, I've found Neil Conway's patch for this file, email dated 25th Sept. I can forward it, or apply it and include the changes along with whatever I do for my next submission, whichever you'd prefer. I'd suggest it's easiest to let me apply and submit it, due to overlaps. -- Nigel J. Andrews ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Large databases, performance
On Thu, 2002-10-03 at 10:56, Shridhar Daithankar wrote: Well, we were comparing ext3 v/s reiserfs. I don't remember the journalling mode of ext3 but we did a 10 GB write test. Besides, converting the RAID to RAID-0 from RAID-5 might have something to do with it. There was a discussion on hackers some time back as to which file system is better. I hope this might be an addition to it.. Hmm. Reiserfs' claim to fame is its low latency with many, many small files and that it's journaled. I've never seen anyone comment about it being considered an extremely fast file system in a general computing context, nor have I seen any even hint at it as a file system for use in heavy I/O databases. This is why Reiserfs is popular with news and squid cache servers, as it's almost an ideal fit. That is, tons of small files or directories contained within a single directory. As such, I'm very surprised that reiserfs is even in the running for your comparison. Might I point you toward XFS, JFS, or ext3? As I understand it, XFS and JFS are going to be your preferred file systems for this type of application, with XFS in the lead as its tool suite is very rich and robust. I'm actually lacking JFS experience, but from what I've read, it's a notch or two back from XFS in robustness (assuming we are talking Linux here). Feel free to read and play to find out for yourself. I'd recommend that you start playing with XFS to see how the others compare. After all, XFS' specific claim to fame is high throughput w/ low latency on large and very large files. Furthermore, they even have a real time mechanism that you can further play with to see how it affects your throughput and/or latencies. Greg signature.asc Description: This is a digitally signed message part
Re: [HACKERS] [GENERAL] Large databases, performance
On 3 Oct 2002 at 11:23, Greg Copeland wrote: snip I'd recommend that you start playing with XFS to see how the others compare. After all, XFS' specific claim to fame is high throughput w/ low latency on large and very large files. Furthermore, they even have a real time mechanism that you can further play with to see how it affects your throughput and/or latencies. I would try that, once we are through with the tests at hand.. Bye Shridhar -- The combination of a number of things to make existence worthwhile. Yes, the philosophy of 'none,' meaning 'all.' -- Spock and Lincoln, The Savage Curtain, stardate 5906.4 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] anoncvs and diff
Nigel J. Andrews wrote: It gave me the log all the way up to the 1.64 revision with the REL7_3_STABLE label assigned to revision 1.64.0.2 Revision 1.64 apparently backing out my patch which made 1.63. I had a brain wave and did the cvs log command which was what lead me to try specifying revisions. As I say it looks like a lack of knowledge about how cvs works for these things. I always thought it worked like RCS and gave a diff against the latest checked in but obviously not. BTW, I've found Neil Conway's patch for this file, email dated 25th Sept., I can forward it or apply it and include the changes along with whatever I do for my next submission, which ever you'd prefer. I'd suggest it's easy to let me apply and submit it due to overlaps. Sure, sounds good. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Trigger regression test output
Lamar Owen [EMAIL PROTECTED] writes: Builds fine here for RPM usage. Got an odd diff in the triggers regression test: did we drop a NOTICE? If so, the regression output should probably have been changed too. The diff: *** ./expected/triggers.out Sat Jan 15 14:18:23 2000 --- ./results/triggers.out Thu Oct 3 00:16:09 2002 *** *** 75,91 insert into fkeys values (60, '6', 4); ERROR: check_fkeys_pkey2_exist: tuple references non-existing key in fkeys2 delete from pkeys where pkey1 = 30 and pkey2 = '3'; - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted ERROR: check_fkeys2_fkey_restrict: tuple referenced in fkeys delete from pkeys where pkey1 = 40 and pkey2 = '4'; - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 50 and pkey2 = '5'; - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted ERROR: check_fkeys2_fkey_restrict: tuple referenced in fkeys update pkeys set pkey1 = 7, pkey2 = '70' where pkey1 = 10 and pkey2 = '1'; - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys are deleted - NOTICE: check_pkeys_fkey_cascade: 1 tuple(s) of fkeys2 are deleted DROP TABLE pkeys; DROP TABLE fkeys; DROP TABLE fkeys2; --- 75,85 After looking into this I have a theory about the cause: you must have built the contrib/spi/refint.c module without -DREFINT_VERBOSE. That flag is required to pass the regression tests, because it controls output of this debug notice. The normal build procedure for the regression tests does cause this to happen, but if you'd previously built the contrib subdirectory with default switches, I think the regress tests would use the existing refint.o and get a failure. This seems a tad undesirable now that I look at it. I don't want to mess with 7.2.3, but for 7.3 I think we should try to make the regression test work correctly with a default build of the contrib module. As of CVS tip, the notice isn't appearing in the regression test output at all, because the elog was changed to DEBUG3 which is below the default message threshold. This is certainly not desirable since it reduces the specificity of the test. I am inclined to have the refint.c code emit the notice unconditionally at DEBUG1 level, and then add a SET client_min_messages = DEBUG1 in the triggers regression test to ensure the notice will appear. Any objections? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Large databases, performance
On Thu, 03 Oct 2002 21:47:03 +0530, Shridhar Daithankar [EMAIL PROTECTED] wrote: I believe that was vacuum analyze only. Well, there is VACUUM [tablename]; and there is ANALYZE [tablename]; and VACUUM ANALYZE [tablename]; is VACUUM followed by ANALYZE. Servus Manfred ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Trigger regression test output
I said: I am inclined to have the refint.c code emit the notice unconditionally at DEBUG1 level, and then add a SET client_min_messages = DEBUG1 in the triggers regression test to ensure the notice will appear. Hmm, that doesn't look that good after all: the SET causes the regression output to be cluttered with a whole *lot* of chatter, which will doubtless change constantly and break the test regularly. Plan B is to make the refint.c code emit the message at NOTICE level, but to change the contrib makefile so that REFINT_VERBOSE is defined by default (ie, you gotta edit the makefile if you don't want it). This will work nicely for the regression tests' purposes. If there is anyone out there actually using refint.c in production, they might be annoyed by the NOTICE chatter, but quite honestly I doubt anyone is --- this contrib module has long since been superseded by standard foreign-key support. Comments? regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Anyone want to assist with the translation of theAdvocacy
Hi Tino, Tino Wildenhain wrote: Hi Justin, you probably want to use language negotiation rather than a query variable :-) Um, language negotiation is good in theory, but there are real-world scenarios it doesn't take into account. :( However, the query variable is an override, and if one isn't present then the backend is supposed to use other means to determine the appropriate language, including the browser's preferred language. It's just that the code to do this bit hasn't been written yet. :-) If all else fails it falls back to a default language, English for this site. :-) Regards and best wishes, Justin Clift Regards Tino snip -- My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there. - Indira Gandhi ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] anoncvs and diff
Nigel J. Andrews [EMAIL PROTECTED] writes: I had a brain wave and did the cvs log command which was what lead me to try specifying revisions. As I say it looks like a lack of knowledge about how cvs works for these things. I always thought it worked like RCS and gave a diff against the latest checked in but obviously not. I think cvs diff foo.c without any switches gives you the diff between your local copy of foo.c and the last version of foo.c *that you checked out* --- ie, it shows you the uncommitted editing that you've done. If you hadn't done cvs update since rev 1.61 then this would explain the behavior you saw. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Trigger regression test output
Tom Lane wrote: I said: I am inclined to have the refint.c code emit the notice unconditionally at DEBUG1 level, and then add a SET client_min_messages = DEBUG1 in the triggers regression test to ensure the notice will appear. Hmm, that doesn't look that good after all: the SET causes the regression output to be cluttered with a whole *lot* of chatter, which will doubtless change constantly and break the test regularly. Plan B is to make the refint.c code emit the message at NOTICE level, but to change the contrib makefile so that REFINT_VERBOSE is defined by default (ie, you gotta edit the makefile if you don't want it). This will work nicely for the regression tests' purposes. If there is anyone out there actually using refint.c in production, they might be annoyed by the NOTICE chatter, but quite honestly I doubt anyone is --- this contrib module has long since been superseded by standard foreign-key support. Yes, but if few people are using it, should we question whether it belongs in the standard regression tests at all? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] About connectby() again
Patch applied. Thanks. --- Joe Conway wrote: Masaru Sugawara wrote: The previous patch fixed an infinite recursion bug in contrib/tablefunc/tablefunc.c:connectby. But another unmanageable error seems to occur even if a table has commonplace tree data (see below). I would think the patch, ancestor check, should be if (strstr(branch_delim || branchstr->data || branch_delim, branch_delim || current_key || branch_delim)) This is my image, not real code. However, if branchstr->data includes branch_delim, my image will not be perfect. Good point. Thank you Masaru for the suggested fix. Attached is a patch to fix the bug found by Masaru. His example now produces:

regression=# SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, branch text);
 keyid | parent_keyid | level |  branch
-------+--------------+-------+----------
    11 |              |     0 | 11
    10 |           11 |     1 | 11-10
   111 |           11 |     1 | 11-111
     1 |          111 |     2 | 11-111-1
(4 rows)

While making the patch I also realized that the no show branch form of the function was not going to work very well for recursion detection. Therefore there is now a default branch delimiter ('~') that is used internally, for that case, to enable recursion detection to work. If you need a different delimiter for your specific data, you will have to use the show branch form of the function. If there are no objections, please apply. Thanks, Joe Index: contrib/tablefunc/README.tablefunc === RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/README.tablefunc,v retrieving revision 1.3 diff -c -r1.3 README.tablefunc *** contrib/tablefunc/README.tablefunc 2 Sep 2002 05:44:04 - 1.3 --- contrib/tablefunc/README.tablefunc 26 Sep 2002 22:57:27 - *** *** 365,371 branch_delim ! if optional branch value is desired, this string is used as the delimiter Outputs --- 365,373 branch_delim ! If optional branch value is desired, this string is used as the delimiter. ! When not provided, a default value of '~' is used for internal ! recursion detection only, and no branch field is returned. Outputs *** *** 388,394 the level value output 3. If the branch field is not desired, omit both the branch_delim input ! parameter *and* the branch field in the query column definition 4. If the branch field is desired, it must be the forth column in the query column definition, and it must be type TEXT --- 390,399 the level value output 3. If the branch field is not desired, omit both the branch_delim input ! parameter *and* the branch field in the query column definition. Note ! that when branch_delim is not provided, a default value of '~' is used ! for branch_delim for internal recursion detection, even though the branch ! field is not returned. 4. If the branch field is desired, it must be the forth column in the query column definition, and it must be type TEXT Index: contrib/tablefunc/tablefunc.c === RCS file: /opt/src/cvs/pgsql-server/contrib/tablefunc/tablefunc.c,v retrieving revision 1.9 diff -c -r1.9 tablefunc.c *** contrib/tablefunc/tablefunc.c 14 Sep 2002 19:32:54 - 1.9 --- contrib/tablefunc/tablefunc.c 26 Sep 2002 23:09:27 - *** *** 652,657 --- 652,660 branch_delim = GET_STR(PG_GETARG_TEXT_P(5)); show_branch = true; } + else + /* default is no show, tilde for the delimiter */ + branch_delim = pstrdup("~"); per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; oldcontext = MemoryContextSwitchTo(per_query_ctx); *** *** 798,807 --- 801,816 char *current_branch; char **values; StringInfo branchstr = NULL; + StringInfo chk_branchstr = NULL; + StringInfo chk_current_key = NULL; /* start a new branch */ branchstr = makeStringInfo(); + /* need these to check for recursion */ + chk_branchstr = makeStringInfo(); + chk_current_key = makeStringInfo(); + if (show_branch) values = (char **) palloc(CONNECTBY_NCOLS * sizeof(char *)); else *** *** 854,875 { /* initialize branch for this pass */
Re: [HACKERS] fix for client utils compilation under win32
Patch applied. Thanks. --- Joe Conway wrote: Tom Lane wrote: It might work to measure time since the start of the whole process, or until the timeout target, rather than accumulating adjustments to the remains count each time through. In other words, something like: at start, targettime = time() + specified-timeout; each time we are about to wait, set the select timeout to targettime - time(). This bounds the error at 1 second, which is probably good enough (you might want to add 1 to targettime to ensure the error is in the conservative direction of not timing out too soon). The attached patch fixes a number of issues related to compiling the client utilities (libpq.dll and psql.exe) for win32 (missing defines, adjustments to includes, pedantic casting, non-existent functions) per: http://developer.postgresql.org/docs/postgres/install-win32.html. It compiles cleanly under Windows 2000 using Visual Studio .net. Also compiles clean and passes all regression tests (regular and contrib) under Linux. In addition to a review by the usual suspects, it would be very desirable for someone well versed in the peculiarities of win32 to take a look. If there are no objections, please commit. Thanks, Joe Index: src/backend/libpq/md5.c === RCS file: /opt/src/cvs/pgsql-server/src/backend/libpq/md5.c,v retrieving revision 1.18 diff -c -r1.18 md5.c *** src/backend/libpq/md5.c 4 Sep 2002 20:31:19 - 1.18 --- src/backend/libpq/md5.c 26 Sep 2002 17:56:11 - *** *** 26,35 * can be compiled stand-alone. */ ! #ifndef MD5_ODBC #include "postgres.h" #include "libpq/crypt.h" ! #else #include "md5.h" #endif --- 26,44 * can be compiled stand-alone. */ ! #if ! defined(MD5_ODBC) && ! defined(FRONTEND) #include "postgres.h" #include "libpq/crypt.h" ! #endif ! ! #ifdef FRONTEND ! #include "postgres_fe.h" ! #ifndef WIN32 ! #include "libpq/crypt.h" ! #endif /* WIN32 */ ! #endif /* FRONTEND */ ! ! #ifdef MD5_ODBC #include "md5.h" #endif Index: src/bin/psql/command.c === RCS file: /opt/src/cvs/pgsql-server/src/bin/psql/command.c,v retrieving revision 1.81 diff -c -r1.81 command.c *** src/bin/psql/command.c 22 Sep 2002 20:57:21 - 1.81 --- src/bin/psql/command.c 26 Sep 2002 18:18:17 - *** *** 23,28 --- 23,29 #include "win32.h" #include <io.h> #include <fcntl.h> + #include <direct.h> #endif #include "libpq-fe.h" *** *** 1163,1169 return NULL; } ! if (i < token_len - 1) return_val[i + 1] = '\0'; } --- 1164,1170 return NULL; } ! if (i < (int) token_len - 1) return_val[i + 1] = '\0'; } *** *** 1240,1246 exit(EXIT_FAILURE); } ! for (p = source; p - source < len && *p; p += PQmblen(p, pset.encoding)) { if (esc) { --- 1241,1247 exit(EXIT_FAILURE); } ! for (p = source; p - source < (int) len && *p; p += PQmblen(p, pset.encoding)) { if (esc) { *** *** 1278,1284 char *end; l = strtol(p, &end, 0); ! c = l; p = end - 1; break; } --- 1279,1285 char *end; l = strtol(p, &end, 0); ! c = (char) l; p = end - 1; break; } Index: src/bin/psql/common.c === RCS file: /opt/src/cvs/pgsql-server/src/bin/psql/common.c,v retrieving revision 1.45 diff -c -r1.45 common.c *** src/bin/psql/common.c 14 Sep 2002 19:46:01 - 1.45 --- src/bin/psql/common.c 26 Sep 2002 18:43:31 - *** *** 11,27 #include <errno.h> #include <stdarg.h> - #include <sys/time.h> #ifndef HAVE_STRDUP
Re: [HACKERS] Please, applay patch to current CVS
Patch applied. Thanks. --- Teodor Sigaev wrote: This is small README fix for contrib/intarray. Thank you. -- Teodor Sigaev [EMAIL PROTECTED] [ application/gzip is not supported, skipping... ] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Trigger regression test output
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: This will work nicely for the regression tests' purposes. If there is anyone out there actually using refint.c in production, they might be annoyed by the NOTICE chatter, but quite honestly I doubt anyone is --- this contrib module has long since been superseded by standard foreign-key support. Yes, but if few people are using it, should we question whether it belongs in the standard regression tests at all? Well, it's not there to test itself, it's there to test trigger functionality. And, not so incidentally, to test that dynamically-loaded C functions work. I don't want to take it out. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Trigger regression test output
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: This will work nicely for the regression tests' purposes. If there is anyone out there actually using refint.c in production, they might be annoyed by the NOTICE chatter, but quite honestly I doubt anyone is --- this contrib module has long since been superseded by standard foreign-key support. Yes, but if few people are using it, should we question whether it belongs in the standard regression tests at all? Well, it's not there to test itself, it's there to test trigger functionality. And, not so incidentally, to test that dynamically-loaded C functions work. I don't want to take it out. Oh, interesting. Makes sense. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] small patch for vacuumlo
Patch applied. Thanks. --- Mario Weilguni wrote: It's just a cosmetic change, fixes the help screen. Should be applied in /contrib/vacuumlo Regards, Mario Weilguni [ Attachment, skipping... ] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Diff for reindexdb
Patch applied. Thanks. --- Mario Weilguni wrote: This small patch adds a Makefile for /contrib/reindexdb/ and renames the README to README.reindexdb. Regards, Mario Weilguni [ Attachment, skipping... ] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: AIX compilation problems (was Re: [HACKERS] Proposal ...)
Samuel A Horwitz wrote: has this patch been applied to CVS yet?

No, I was waiting to see if there were any negative comments, but seeing
none, I will add it to the patch queue today.

---------------------------------------------------------------------------

On Tue, 1 Oct 2002, Zeugswetter Andreas SB SD wrote:

Date: Tue, 1 Oct 2002 10:23:13 +0200
From: Zeugswetter Andreas SB SD [EMAIL PROTECTED]
To: Peter Eisentraut [EMAIL PROTECTED]
Cc: PostgreSQL Development [EMAIL PROTECTED]
Subject: Re: AIX compilation problems (was Re: [HACKERS] Proposal ...)

Attached is a patch to fix the mb linking problems on AIX.  As a nice side
effect it reduces the duplicate symbol warnings to linking libpq.so and
libecpg.so (all shlibs that are not postmaster loadable modules).

Can you explain the method behind your patch?  Have you tried -bnogc?

-bnogc would (probably) have been the correct switch, reading the man
page, but the previous method was not good, since it involved the
following:

1. create a static postgres executable from the SUBSYS.o's
2. create an exports file from the above
3. recreate a shared postgres executable

This naturally had a cyclic dependency that could not properly be
reflected in the Makefile (thus a second make sometimes left you with a
static postgres unless you manually removed postgres.imp).  Now it does:

postgres.imp: $(OBJS)
	create a temporary SUBSYS.o from all $(OBJS)
	create a postgres.imp from SUBSYS.o
	rm temporary SUBSYS.o

postgres: postgres.imp
	link a shared postgres

A second change was to move the import and export files to the end of the
link line; then the linker knows not to throw a duplicate symbol warning,
and keeps all symbols that are mentioned in the exports file (== -bnogc
restricted to $(OBJS) symbols).  Thus now only libpq.so and libecpg.so
still show the duplicate symbol warnings, since their link line should
actually not include postgres.imp.

I did not see how to distinguish between loadable modules (which need
postgres.imp) and interface libraries (which do not), but since the
resulting libs are ok, I left it at that.

I tested both gcc and xlc, including regression tests.

Andreas

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly

[EMAIL PROTECTED] (Samuel A Horwitz)

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] 2nd cut at SSL documentation
I have added this to backend/libpq/README.SSL to be integrated into our main docs later.

---------------------------------------------------------------------------

Bear Giles wrote:

A second cut at SSL documentation:

SSL Support in PostgreSQL
=========================

Who needs it?
=============

The sites that require SSL fall into one (or more) of several broad categories.

*) They have insecure networks.

   Examples of insecure networks are anyone in a corporate hotel, any
   network with 802.11b wireless access points (WAP) (in 2002, this
   protocol has many well-known security weaknesses and even 'gold'
   connections can be broken within 8 hours), or anyone accessing their
   database over the internet.

   These sites need a Virtual Private Network (VPN), and either SSH
   tunnels or direct SSL connections can be used.

*) They are storing extremely sensitive information.

   An example of extremely sensitive information is logs from network
   intrusion detection systems.  This information *must* be fully
   encrypted between front- and back-end, since an attacker is presumably
   sniffing all traffic within the VPN, and if they learn that you know
   what they are doing they may attempt to cover their tracks with a
   quick 'rm -rf /' and 'dropdb'.

   In the extreme case, the contents of the database itself may be
   encrypted with either the crypt package (which provides symmetrical
   encryption of the records) or the PKIX package (which provides
   public-key encryption of the records).

*) They are storing information which is considered confidential by
   custom, law or regulation.

   This includes all records held by your doctor, lawyer, accountant,
   etc.  In these cases, the motivation for using encryption is not a
   conscious evaluation of risk, but the fear of liability for 'failure
   to perform due diligence' if encryption is available but unused and an
   attacker gains unauthorized access to the harm of others.

*) They have 'road warriors.'

   This includes all sites where people need to have direct access to the
   database (not through a proxy such as a secure web page) from changing
   remote addresses.  Client certificates provide a clean way to grant
   this access without opening up the database to the world.

Who does not need it?
---------------------

It's at least as important to know who does not need SSL as it is to know
who does.  Sites that do not need SSL fall into several broad categories.

*) Access is limited to the Unix socket.

*) Access is limited to a physically secure network.

   Physically secure networks are common in clusters and colocation
   sites - all database traffic is restricted to dedicated NIC cards and
   hubs, and all servers and cabling are maintained in locked cabinets.

Using SSH/OpenSSH as a Virtual Private Network (VPN)
----------------------------------------------------

SSH and OpenSSH can be used to construct a Virtual Private Network (VPN)
to provide confidentiality of PostgreSQL communications.  These tunnels
are widely available and fairly well understood, but do not provide any
application-level authentication information.

To set up an SSH/OpenSSH tunnel, a shell account for each user should be
set up on the database server.  It is acceptable for the shell program to
be bogus (e.g., /bin/false) if the tunnel is set up so as to avoid
launching a remote shell.

On each client system the $HOME/.ssh/config file should contain an
additional line similar to

   LocalForward 5432 psql.example.com:5432

(replacing psql.example.com with the name of your database server).  By
putting this line in the configuration file, instead of specifying it on
the command line, the tunnel will be created whenever a connection is
made to the remote system.
The psql(1) client (or any client) should be wrapped with a script that
establishes an SSH tunnel when the program is launched:

   #!/bin/sh
   HOST=psql.example.com
   IDENTITY=$HOME/.ssh/identity.psql
   /usr/bin/ssh -1 -i $IDENTITY -n $HOST 'sleep 60' &
   /usr/bin/psql -h $HOST -p $1

Alternately, the system could run a daemon that establishes and maintains
the tunnel.  This is preferable when multiple users need to establish
similar tunnels to the same remote site.

Unfortunately, there are many potential drawbacks to SSH tunnels:

*) the SSH implementation or protocol may be flawed.  Serious problems
   are discovered about once every 18 to 24 months.

*) the systems may be misconfigured by accident.

*) the database server must provide shell accounts for all users needing
   access.  This can be a chore to maintain, especially if all other user
   access should be denied.

*) neither the front- nor back-end can determine the level of encryption
   provided by the SSH tunnel - or even whether an SSH tunnel is in use.
   This prevents
Re: [GENERAL] [HACKERS] Anyone want to assist with the translation of the
Tino Wildenhain wrote: snip Haha cut&paste ;-) Ever heard of csv? :-)) However, I can also have a look at it, if desired. Heh Heh Heh Good point. For the moment we've whipped up that MS Excel document (created in OpenOffice of course) of all the English text strings in the site and emailed it to the volunteers. :) So far community members have volunteered for German, Turkish, French, Spanish, Brazilian Portuguese, and Polish. Cool. :) Want to co-ordinate with the other two German language volunteers? Regards and best wishes, Justin Clift Regards Tino -- My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there. - Indira Gandhi ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
Thomas Lockhart wrote: ... Seems that isn't helping enough to reduce the
number of people who are surprised by our behavior.

I don't think anyone would be surprised by statement time.

I think that there is no compelling reason for changing the current
behavior.  There is no *single* convention used by all other databases,
and *if* the standard specifies this as statement time then afaict no
database implements that exactly.  I was attempting to get closer to the
standards and to other databases, and to make it perhaps more intuitive.
Transaction time is the only relatively deterministic time, and other
times are (or could be) available using other function calls.  So what
problem are we trying to solve?  There is no evidence that a different
convention would change the number of folks who do not understand what
convention was chosen.  Arguing to change the current implementation
without offering to include the functionality to handle all of the
scenarios seems to be premature.  And arguing that a change would be
clearer to some folks is not compelling; transaction start is at least as
easily understood as any other definition we could make.

Yes, clearly, we will need to have all three time values available to
users.  With three people now suggesting we don't change, I will just add
to TODO:

	Add now(transaction|statement|clock) functionality

Is that good?

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Return of INSTEAD rules
We have talked about possible return values for RULES, particularly
INSTEAD rules.  Manfred has a nice example here, so I propose we handle
INSTEAD rules this way: we return the oid and tuple count of the last
INSTEAD rule query with a tag matching the main query.  The returned tag,
of course, would be the tag of the main query.

This works for Manfred's case, and it works for my case when there is
only one action in the INSTEAD rule.  If there is more than one matching
tag in the INSTEAD rule, the user has the option to place the query he
wants for the return at the end of the rule.  This does give the user
some control over what is returned.  Comments?

I think non-INSTEAD rules already return the tag, oid, and tuple count of
the main query, right?

---------------------------------------------------------------------------

Manfred Koizar wrote:

On Sat, 28 Sep 2002 19:20:43 -0400 (EDT), Bruce Momjian [EMAIL PROTECTED]
wrote: OK, that is a good example.  It would return the sum of the
matching tags.

You are suggesting here that it would be better to take the result of the
last matching tag command, right?

The examples were meant to support my previous suggestion of explicitly
marking the statement you want to be counted, something like:

CREATE VIEW twotables AS
    SELECT ... FROM table1 INNER JOIN table2 ON ...;

CREATE RULE twotables_insert AS    -- INSERT rule
ON INSERT TO twotables DO INSTEAD (
    COUNT INSERT INTO table1 VALUES (new.pk, new.col1);
    INSERT INTO table2 VALUES (new.pk, new.col2)
);

CREATE RULE twotables_update AS    -- UPDATE rule
ON UPDATE TO twotables DO INSTEAD (
    COUNT UPDATE table1 SET col1 = new.col1 WHERE pk = old.pk;
    UPDATE table2 SET col2 = new.col2 WHERE pk = old.pk
);

CREATE RULE twotables_delete AS    -- DELETE rule
ON DELETE TO twotables DO INSTEAD (
    COUNT DELETE FROM table1 WHERE pk = old.pk;
    DELETE FROM table2 WHERE pk = old.pk
);

CREATE VIEW visible AS
    SELECT ... FROM table3 WHERE deleted = 0;

CREATE RULE visible_delete AS    -- DELETE rule
ON DELETE TO visible DO INSTEAD
    COUNT UPDATE table3 SET deleted = 1 WHERE pk = old.pk;

One argument against "automatically don't count non-INSTEAD rules and
count the last statement in INSTEAD rules": sql-createrule.html says:

| for view updates: there must be an unconditional INSTEAD rule [...]
| If you want to handle all the useful cases in conditional rules, you
| can; just add an unconditional DO INSTEAD NOTHING rule [...]
| Then make the conditional rules non-INSTEAD

CREATE RULE v_update AS    -- UPDATE rule
ON UPDATE TO v DO INSTEAD NOTHING;

CREATE RULE v_update2 AS    -- UPDATE rule
ON UPDATE TO v WHERE condition1 DO (
    COUNT ...
);

CREATE RULE v_update3 AS    -- UPDATE rule
ON UPDATE TO v WHERE condition2 DO (
    COUNT ...
);

Servus
 Manfred

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] How to REINDEX in high volume environments?
Jim, glad you are still around.  Yes, we would love to get tablespaces in
7.4.  I think we need to think bigger and get something where we can name
tablespaces and place tables/indexes into these named spaces.  I can
reread the TODO.detail stuff and give you an outline.  How does that
sound?  Thomas Lockhart is also interested in this feature.

---------------------------------------------------------------------------

Jim Buttafuoco wrote:

Just wanted to pipe in here.  I am still very interested in tablespaces
(I have many database systems that are over 500GB and growing) and am
willing to port my tablespace patch to 7.4.  I have everything (but only
tested here) working in 7.2, but the patch was not accepted.  I didn't
see a great speed improvement, but the patch helps with storage
management.

Recap: the patch would enable the following:

- a database to have a default data tablespace and index tablespace
- a user to have a default data and index tablespace
- a table to have a specific tablespace
- an index to have a specific tablespace

I would like to also add namespace (schema) to have a default data and
index tablespaces.

Jim

Justin Clift [EMAIL PROTECTED] writes: Shridhar Daithankar wrote: Looks
like we should have a subdirectory in the database directory which stores
indexes.  That was my first thought also, but an alternative/additional
approach would be this (not sure if it's workable):

See the tablespaces TODO item.  I'm not excited about building half-baked
versions of tablespaces before we get around to doing the real thing ...

regards, tom lane

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Improving backend startup interlock
Bruce Momjian [EMAIL PROTECTED] writes: Have people considered flock
(advisory locking) on the postmaster.pid file for backend detection?

$ man flock
No manual entry for flock.
$

HPUX has generally taken the position of adopting both BSD and SysV
features, so if it doesn't exist here, it's not portable to older Unixen ...

regards, tom lane

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
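For reference, a sketch of the advisory-locking idea Bruce is floating
(names are illustrative; and, per Tom's point, flock() itself is absent
on HP-UX and older SysV systems, which would need fcntl(F_SETLK)-style
locking instead):

#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Try to become the sole postmaster for a data directory by taking an
 * exclusive, non-blocking advisory lock on the pidfile.  The kernel
 * releases the lock when the holder dies, so a stale pidfile left by a
 * crash would no longer block startup.  Returns the held fd, or -1 if a
 * live process still holds the lock (or on error).
 */
static int
acquire_pidfile_lock(const char *pidfile)
{
	int			fd = open(pidfile, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_EX | LOCK_NB) < 0)
	{
		close(fd);				/* another live process holds it */
		return -1;
	}

	/* keep fd open for the life of the process to retain the lock */
	return fd;
}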
Re: [HACKERS] pg_dump and large files - is this a problem?
Giles Lean [EMAIL PROTECTED] writes: When talking of near-current systems with 64 bit off_t you are not going to find one without support for 64 bit integral types. I tend to agree with Giles on this point. A non-integral representation of off_t is theoretically possible but I don't believe it exists in practice. Before going far out of our way to allow it, we should first require some evidence that it's needed on a supported or likely-to-be-supported platform. time_t isn't guaranteed to be an integral type either if you read the oldest docs about it ... but no one believes that in practice ... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_dump and large files - is this a problem?
Tom Lane wrote: Giles Lean [EMAIL PROTECTED] writes: When talking of near-current systems with 64 bit off_t you are not going to find one without support for 64 bit integral types. I tend to agree with Giles on this point. A non-integral representation of off_t is theoretically possible but I don't believe it exists in practice. Before going far out of our way to allow it, we should first require some evidence that it's needed on a supported or likely-to-be-supported platform. time_t isn't guaranteed to be an integral type either if you read the oldest docs about it ... but no one believes that in practice ... I think fpos_t is the non-integral one. I thought off_t almost always was integral. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_dump and large files - is this a problem?
At 11:07 PM 3/10/2002 -0400, Tom Lane wrote: A non-integral
representation of off_t is theoretically possible but I don't believe it
exists in practice.

Excellent.  So I can just read/write the bytes in an appropriate order
and expect whatever size it is to be a single intXX.  Fine with me;
unless anybody voices another opinion in the next day, I will proceed.

I just have this vague recollection of seeing a header file with a more
complex structure for off_t.  I'm probably dreaming.

Philip Warner
Albatross Consulting Pty. Ltd. (A.B.N. 75 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
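A sketch of the read/write-the-bytes plan under that assumption (off_t is
some integral type); write_offset()/read_offset() and the
least-significant-byte-first order are illustrative, not necessarily what
pg_dump will end up doing:

#include <stdio.h>
#include <sys/types.h>

/*
 * Serialize an off_t least-significant byte first, so the on-disk format
 * does not depend on the host's endianness or on sizeof(off_t).
 * Assumes only that off_t is an integral type holding a non-negative
 * offset.
 */
static void
write_offset(FILE *fp, off_t offset)
{
	unsigned int i;

	for (i = 0; i < sizeof(off_t); i++)
	{
		fputc((int) (offset & 0xFF), fp);
		offset >>= 8;
	}
}

static off_t
read_offset(FILE *fp)
{
	off_t		offset = 0;
	unsigned int i;

	/* error handling (premature EOF) omitted for brevity */
	for (i = 0; i < sizeof(off_t); i++)
		offset |= ((off_t) (fgetc(fp) & 0xFF)) << (i * 8);

	return offset;
}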
Re: [HACKERS] Return of INSTEAD rules
Bruce Momjian [EMAIL PROTECTED] writes: We have talked about possible
return values for RULES, particularly INSTEAD rules.  Manfred has a nice
example here, so I propose we handle INSTEAD rules this way: we return
the oid and tuple count of the last INSTEAD rule query with a tag
matching the main query.

Hmm ... that's subtly different from what I'd seen discussed before.  I
thought the idea was:

1. If no INSTEAD rule: return tag, count, and OID of the original query,
regardless of what is added by non-INSTEAD rules.  (I think this part is
not controversial.)

2. If any INSTEAD rule: return tag, count, and OID of the last executed
query that has the same tag as the original query.  If no substituted
query matches the original query's tag, return the original query's tag
with zero count and OID.  (This is where the going gets tough.)

I think you just modified the second part of that to restrict it to
queries that were added by INSTEAD rules.  This is doable, but it's not a
trivial change --- in particular, I think it implies adding another field
to the Query data structure so we can mark INSTEAD-added vs
non-INSTEAD-added queries.  Which means an initdb, because it breaks
stored rules.

Offhand I think this might be worth doing, because I like that subtle
change in behavior.  But we should understand exactly what we're doing
here...

regards, tom lane

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Trigger regression test output
Lamar Owen [EMAIL PROTECTED] writes: So the regression tests weren't really testing the actually built module, so to speak. Is there a good reason to leave the NOTICE's in the expected regression output? Yes: without them the test is less useful, because you're less certain that what happened was what was supposed to happen. As to the way it's built, the regression tests are built in the RPMset to allow post-install (that is, post _RPM_ install) regression testing on machines without make or compilers. Well, I'm about to commit a change that makes the default build of contrib/spi have the correct NOTICE output, as of 7.3. You could make the 7.2 RPMset do likewise if you wish. One thing that confuses me though is that the build options have been like this for a long time (at least since 7.1). Why haven't you seen this problem before? Did you recently change the way the RPMs build contrib? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Oracle beats up on Open Source Database(s) ... and
Adrian 'Dagurashibanipal' von Bidder wrote: -- Start of PGP signed
section.  On Wed, 2002-10-02 at 16:14, Marc G. Fournier wrote: Just in
case anyone enjoys these sorts of things :) It deals with the whole .org
TLD assignment ... http://forum.icann.org/org-eval/gartner-report

I like this one:

| Unlike many of the conventional commercial databases, PostgreSQL has
| offered advanced Object Relational capabilities for years, including
| inheritance.  Ms. Gelhausen is quite correct that these are important
| capabilities, finally available with the release of Oracle9i.  We
| applaud Oracle's continued efforts to close the gap and stay
| competitive with this, and other open source database features.

Yes, I found the thread amusing.  Here are the choice parts from the
Oracle posting:

   PostgreSQL, like many other open source database products, has been
   in the market for many years with very little adoption.  Unlike the

Oh, someone should tell our huge user base.

   open-source operating system market, the open-source database market has

We support Linux, so I can't bad mouth open-source OS's.

   been unsuccessful due to the complexity of customer requirements and
   sophistication of the technology needed.  PostgreSQL is used primarily

Fear-uncertainty-doubt.  Express it as fact and people will believe it.

   in the embedded system market because it lacks the transactional
   features, high availability, security and manageability of any
   commercial enterprise database.

He is confusing us with MySQL.  Oh, they are all the same; doesn't matter.

[ Quotes of lots of stuff PostgreSQL has had for years that Oracle just
added recently.  Is he trying to make Oracle look good? ]

   While there is a place in the industry for open source software.  It
   will be many years, if ever, that an open source database matches
   Oracle's database technology for the availability, standards support,
   performance, manageability, security, application support, and
   stability that most real-world business applications require.

Fear-uncertainty-doubt (FUD).  Thank you.

   Jenny Gelhausen
   Oracle Marketing

Oh, that explains it all.

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: AIX compilation problems (was Re: [HACKERS] Proposal ...)
Your patch has been added to the PostgreSQL unapplied patches list at: http://candle.pha.pa.us/cgi-bin/pgpatches I will try to apply it within the next 48 hours. --- Zeugswetter Andreas SB SD wrote: and mb conversions (pg_ascii2mic and pg_mic2ascii not found in the postmaster and not included from elsewhere) shared libs on AIX need to be able to resolve all symbols at linkage time. Those two symbols are in backend/utils/SUBSYS.o but not in the postgres executable. They are defined in backend/utils/mb/conv.c and declared in include/mb/pg_wchar.h. They're also linked into the postmaster. I don't see anything unusual. Attached is a patch to fix the mb linking problems on AIX. As a nice side effect it reduces the duplicate symbol warnings to linking libpq.so and libecpg.so (all shlibs that are not postmaster loadable modules). Please apply to current (only affects AIX). The _LARGE_FILES problem is unfortunately still open, unless Peter has fixed it per his recent idea. Thanx Andreas Content-Description: mb_link_patch4.gz [ Attachment, skipping... ] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Optimizer generates bad plans.
Tom Lane wrote: Neil Conway [EMAIL PROTECTED] writes: Interesting.  The
inconsistency you're seeing is a result of GEQO.  I would have hoped that
it would have produced a better quality plan more often, but apparently
not.  On my system, the regular query optimizer handily beats GEQO for
this query: it produces more efficient query plans 100% of the time and
takes less time to do so.  For *this* query at least, raising
geqo_threshold would be a good idea, but that may not be true universally.

The current GEQO threshold was set some time ago; since then, the regular
optimizer has been improved while the GEQO code hasn't been touched.  It
might well be time to ratchet up the threshold.  Anyone care to do some
additional experiments?

Added to TODO:

	* Check GUC geqo_threshold to see if it is still accurate

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Trigger regression test output
On Thursday 03 October 2002 02:31 pm, Tom Lane wrote: Lamar Owen [EMAIL PROTECTED] writes: One thing that confuses me though is that the build options have been like this for a long time (at least since 7.1). Why haven't you seen this problem before? Did you recently change the way the RPMs build contrib? Yes, I recently changed that to use the default make instead of the horribly cobbled thing I was using. But it broke regression, which I didn't check when I built the 7.2.2-1PGDG set (I had done a regression test with 7.2.2-0.1PGDG, which had the old kludge for contrib). -- Lamar Owen WGCR Internet Radio 1 Peter 4:11 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] DROP COLUMN misbehaviour with multiple inheritance
Where are we with this patch?

---------------------------------------------------------------------------

Alvaro Herrera wrote:

On 29 Sep 2002, Hannu Krosing wrote: On Sun, 2002-09-29 at 19:57, Tom
Lane wrote: Hannu Krosing [EMAIL PROTECTED] writes: I'd propose that ADD
ONLY would pull topmost attislocal up (reset it from the (grand)child)
whereas plain ADD would leave attislocal alone.

ADD ONLY?  There is no such animal as ADD ONLY, and cannot be, because it
implies making a parent inconsistent with its children.

I meant ADD ONLY to be the exact opposite of DROP ONLY - it adds the
parent column and removes attislocal from children.  Simple ADD would
_not_ remove attislocal from children with a matching column.

Consistency requires that it be exactly the opposite.  When you ADD ONLY,
you want the column only in the local table, so children still have a
local definition; OTOH, when you ADD (recursively) you want all children
to get non-local status.  Suppose:

CREATE TABLE p (f1 int);
CREATE TABLE c (f2 int) INHERITS (p);
-- c.f2.attislocal = true

Now,

ALTER TABLE ONLY p ADD COLUMN f2 int;

should leave c.f2.attislocal alone, while

ALTER TABLE p ADD COLUMN f2 int;

should reset it.  This is the opposite of your proposal, and I don't
think it exists in Tom's proposal.

I think this is also consistent with the fact that ONLY requires the
column to exist in all children, while non-ONLY creates it where it
doesn't exist, and merges (resetting attislocal if set -- it could be
inherited from some other parent) where it exists.

-- Alvaro Herrera (alvherre[@]dcc.uchile.cl)
"One never ardently desires what one desires only out of reason." (F. Alexandre)

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] [HACKERS] Anyone want to assist with the translation of the Advocacy
Tino Wildenhain [EMAIL PROTECTED] wrote: Hi Justin, Good point. For the moment we've whipped up that MS Excel document (created in OpenOffice of course) of all the English text strings in the site and emailed it to the volunteers. :) Btw. did you ever unzip the native OpenOffice (aka StarOffice) file? So far community members have volunteered for German, Turkish, French, Spanish, Brazilian Portuguese, and Polish. Cool. :) Want to co-ordinate with the other two German language volunteers? Sure. So I'm here :-) Regards Tino You should have already got at least two mails, haven't you? Michael ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] psqlODBC *nix Makefile (new 7.3 open item?)
-Original Message- From: Peter Eisentraut [mailto:[EMAIL PROTECTED]]
Sent: 01 October 2002 21:05 To: Dave Page Cc: [EMAIL PROTECTED];
[EMAIL PROTECTED] Subject: RE: [HACKERS] psqlODBC *nix Makefile (new 7.3
open item?)

Dave Page writes: majority of you!) knock up a makefile so the driver
will build standalone on *nix systems please?  There should be no
dependencies on any of the rest of the code - certainly there isn't for
the Win32 build.

I'm working something out.  I'll send it to you tomorrow.

Hah.  I tried to put something together based on Automake and Libtool,
but I must conclude that Libtool is just completely, utterly broken.  I
also considered copying over Makefile.shlib, but that would draw in too
many auxiliary files and create a different kind of mess.  So what I
would suggest right now as the course of action is to copy your local
psqlodbc subtree to its old location under interfaces/ and try to hook
things together that way.  Perhaps one of these days we should convert
Makefile.shlib into a shell script that we can deploy more easily to
different projects.

I have added README.unix to the psqlODBC CVS containing the following
text.  I assume the odbc options haven't been removed from the autoconf
stuff?

Regards, Dave.

psqlODBC README for *nix Systems

Since psqlODBC has been moved from the main PostgreSQL source tree, we
have yet to create a new build system for the driver.  Currently, in
order to build under *nix systems, it is recommended that you copy the
files in this directory to src/interfaces/odbc in your PostgreSQL source
tree, then re-run configure with the required options from the top of the
tree.  The driver can then be built using make as per normal.

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] (Fwd) Re: Any Oracle 9 users? A test please...
Tom Lane wrote: Has anyone done the corresponding experiments on the
other DBMSes to identify exactly when they allow CURRENT_TIMESTAMP to
advance?

I have DB2 on hand and examined CURRENT TIMESTAMP in an SQL procedure.
(IBM have implemented it without the _.)  The short of it is that CURRENT
TIMESTAMP is not frozen at transaction start, but reflects time movement
within the transaction.  Note that db2 +c is equivalent to issuing BEGIN
in Pg, and the command line tool (db2) keeps (the same) connection open
until TERMINATE is issued:

$ cat stamp.sql
create procedure stamp()
language sql
begin
    insert into test values(1,current timestamp);
    insert into test values(2,current timestamp);
    insert into test values(3,current timestamp);
    insert into test values(4,current timestamp);
    insert into test values(5,current timestamp);
    insert into test values(6,current timestamp);
    insert into test values(7,current timestamp);
    insert into test values(8,current timestamp);
    insert into test values(9,current timestamp);
end
@

$ db2 connect to dss

   Database Connection Information

 Database server       = DB2/LINUX 7.2.3
 SQL authorization ID  = DB2
 Local database alias  = DSS

$ db2 -td@ -f stamp.sql
DB20000I  The SQL command completed successfully.
$ db2 +c
db2 => call stamp();
STAMP RETURN_STATUS: 0
db2 => commit;
DB20000I  The SQL command completed successfully.
db2 => select * from test;

ID          VAL
----------- --------------------------
          1 2002-10-03-19.35.16.286019
          2 2002-10-03-19.35.16.286903
          3 2002-10-03-19.35.16.287549
          4 2002-10-03-19.35.16.288235
          5 2002-10-03-19.35.16.288925
          6 2002-10-03-19.35.16.289571
          7 2002-10-03-19.35.16.290209
          8 2002-10-03-19.35.16.290884
          9 2002-10-03-19.35.16.291522

  9 record(s) selected.

db2 => terminate;

regards
Mark

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] DROP COLUMN misbehaviour with multiple inheritance
Bruce Momjian [EMAIL PROTECTED] writes: Where are we with this patch? It's done as far as I'm concerned ;-). Not sure if Hannu still wants to argue that the behavior is wrong ... it seems fine to me though ... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [SQL] [GENERAL] CURRENT_TIMESTAMP
[ Thread moved to hackers. ]

OK, I have enough information from the various other databases to make a
proposal.  It seems the other databases, particularly Oracle, record
CURRENT_TIMESTAMP as the time of statement start.  However, it isn't the
time of statement start from the user's perspective, but rather from the
database's perspective, i.e. if you call a function that has two
statements in it, each statement could have a different
CURRENT_TIMESTAMP.

I don't think that is standards-compliant, and I don't think any of our
users want that.  What they probably want is to have a fixed
CURRENT_TIMESTAMP from the time the query is submitted until it is
completed.  We can call that the "statement arrival time" version of
CURRENT_TIMESTAMP.  I don't know if any of the other databases support
this concept, but it seems the most useful, and is closer to the
standards and to other databases than we are now.

So, we have a couple of decisions to make:

	Should CURRENT_TIMESTAMP be changed to statement arrival time?
	Should now() be changed the same way?
	If not, should now() and CURRENT_TIMESTAMP return the same type of value?

One idea is to change CURRENT_TIMESTAMP to statement arrival time, and
leave now() as transaction start time.  Also, should we add now(val),
where val can be "transaction", "statement", or "clock"?

-- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] DROP COLUMN misbehaviour with multiple inheritance
On Fri, 2002-10-04 at 01:00, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Where are we with this patch? It's done as far as I'm concerned ;-). Not sure if Hannu still wants to argue that the behavior is wrong ... it seems fine to me though ... I stop arguing for now, ONLY can mean too many things ;) I can't promise that I won't bring some of it up again when we start discussing a more general overhaul of our inheritance and OO. --- Hannu ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]