Re: [HACKERS] Regression failure with PostgreSQL 8beta1 and Intel
Dear Hans, Robert, Are you planning to use Intel's C compiler in production? We tried that some time ago and corrupted our database cluster almost instantly (for some reason we have not investigated any further). I highly recommend to do some stress testing to see if everything works nicely. I'd be pleased to get some feedback. We're using the Intel C compiler for all our code on the Altix. We have encountered a few bugs with optimization, but on the whole, it works pretty well, and it generates code that runs about 30% faster than gcc (depending on the application of course). With regard to PostgreSQL, we are not having massive problems with data corruption. We have had problems with 7.4.3, and I'm currently testing 8.0.0beta1. I don't believe the problem with 7.4.3 is a compiler issue -- it only happens when we have multiple backends running at the same time. Other people like Hubert Froehlich (http://archives.postgresql.org/pgsql-general/2004-07/msg00670.php) have run into the same problem using a completely different environment. I'll report on 8.0.0beta1 when our testing is complete. --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | President, Congenair LLC| URL: http://www.congen.com/~bruc | | P.O. Box 314| Phone: 609 818 7251| | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Regression failure with PostgreSQL 8beta1 and Intel Itanium 2 C compiler
Dear All, I built PG 8.0 beta1 on an Itanium 2 platform using the Intel compilers version 8, and got one real difference in the regression tests that affected int2, int4, union, and numerology. Here's the key difference: horta postgres 177 diff -c int4.out ../expected/ *** int4.outTue Aug 10 18:41:48 2004 --- ../expected/int4.outWed Mar 10 21:11:13 2004 *** *** 22,27 --- 22,28 INSERT INTO INT4_TBL(f1) VALUES (' asdf '); ERROR: invalid input syntax for integer:asdf INSERT INTO INT4_TBL(f1) VALUES ('- 1234'); + ERROR: invalid input syntax for integer: - 1234 INSERT INTO INT4_TBL(f1) VALUES ('123 5'); ERROR: invalid input syntax for integer: 123 5 INSERT INTO INT4_TBL(f1) VALUES (''); PG 8.0 beta1 is accepting - 1234 as a valid integer. Further investigation reveals that this is a peculiarity of the Intel compilers. The following program, #include stdio.h #include stdlib.h main() { char st[] = - 1234; int l; char *endp; l = strtol(st, endp, 10); printf(l = %d st = %lx endp = %lx\n, l, st, endp); } using the Intel compiler provided libraries prints l = -1234 st = 6fffb720 endp = 6fffb726 whereas gcc and glibc yields l = 0 st = 6fffb710 endp = 6fffb710 Boo hiss... +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | President, Congenair LLC| URL: http://www.congen.com/~bruc | | P.O. Box 314| Phone: 609 818 7251| | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] Successful proting report for 7.4beta4 on Irix and Altix
Dear All, I have compiled and tested Postgresql-7.4beta4 on SGI Irix 6.5.20 using the MIPS Pro 7.4 compilers and on SGI Altix using the gcc 2.96 compilers and both ports passed all tests within the limits of numerical accuracy. What is the URL for reporting successful ports? --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | President, Congenomics Inc. | URL: http://www.congen.com/~bruc | | P.O. Box 314| Phone: 609 818 7251| | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Compilation of PostgreSQL on Irix
Dear Devrim, You can build the latest release of bison from www.gnu.org without any trouble under Irix. PostgreSQL 7.4 builds cleanly on Irix, and so far, it's much faster than 7.3 for the one database I've tested. --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | President, Congenomics Inc. | URL: http://www.congen.com/~bruc | | P.O. Box 314| Phone: 609 818 7251| | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Compilation of PostgreSQL on Irix
Dear Devrim, I have been using Postgres on Irix for over 8 years, and I have only used the SGI provided compilers. GCC doesn't work well on Irix. In addition, you can build a 64 bit version of PostgreSQL. Here's the script we used for PostgreSQL 7.3.2: #!/bin/ksh -x ./copy_local_files_for_build export SGI_ABI=-64 cat config.cache EOF ac_cv_lib_nsl_main=${ac_cv_lib_nsl_main='no'} ac_cv_prog_CPP=${ac_cv_prog_CPP='cc -E'} ac_cv_prog_gcc=${ac_cv_prog_gcc=no} ac_cv_prog_perl=${ac_cv_prog_perl=/stf/sys64/bin/perl} EOF gmake clean CC=cc -64 \ AWK=awk \ INSTALL=/pg/postgresql-7.3.2/config/install-sh \ LDFLAGS=-rpath $POSTGRES_HOME/local/lib \ ./configure --prefix=/pg/postgresql-7.3.2 \ --enable-hba \ --with-pgport=6543 \ --disable-locale \ --enable-cassert \ --with-template=irix5 \ --with-includes=$POSTGRES_HOME/local/include $POSTGRES_HOME/local/include/readline \ --with-libs=$POSTGRES_HOME/local/lib \ --without-CXX \ --with-maxbackends=128 \ --enable-debug \ --without-java \ --enable-odbc gmake LD_LIBRARY64_PATH=/pg/postgresql-7.3.2/src/interfaces/libpq:$LD_LIBRARY64_PATH gmake check gmake install export PATH=/pg/postgresql-7.3.2/bin:$PATH initdb -D /pg/postgresql-7.3.2/data src/Makefile.custom is set to: CUSTOM_CC = cc -64 LD += -64 MK_NO_LORDER = 1 The script copy_local_files_for_build is as follows: #!/bin/sh source=/stf/sys64 if [ $POSTGRES_HOMEx = x ] then echo No POSTGRES_HOME variable set. exit 1 fi if [ ! -d $POSTGRES_HOME/local/lib ] then mkdir -p $POSTGRES_HOME/local/lib fi /usr/local/bin/tarcp $source/include/readline $POSTGRES_HOME/local/include/readline cp ${source}/lib/libz* $POSTGRES_HOME/local/lib cp ${source}/lib/libreadline* $POSTGRES_HOME/local/lib cp ${source}/include/z* $POSTGRES_HOME/local/include --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | President, Congenomics Inc. | URL: http://www.congen.com/~bruc | | P.O. Box 314| Phone: 609 818 7251| | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Problem with 7.3 on Irix with dates before 1970
Dear Tom, I have removed the NO_MKTIME_BEFORE_1970 symbol from irix5.h, rebuilt 7.3b2, and reran the regression. The three time tests (tinterval, horology, abstime) now match the Solaris expected files. I checked the timezone files, and the system does not appear to have savings time defined for 1947, but it does report it as such in the PostgreSQL regression tests. WRT your old message about the mktime workaround, I must have missed your message. However, in the future, if you want something tested on Irix, please let me know. I use PostgreSQL a fair amount at Bristol-Myers Squibb, and I will spend some time to keep it working. Sincerely, Bob Robert E. Bruccoleri [EMAIL PROTECTED] writes: There are differences in the regression tests for PostgreSQL 7.3b2 with handling of times before 1970. I recently sent out a set of diffs for the changes. I've looked through the datetime.c code, but it's not obvious to me what the cause of the change is. PostgreSQL 7.2 works fine on Irix for these cases. Waitasec ... are you using src/include/port/irix5.h as the port-specific config file? (Check the symlink src/include/pg_config_os.h to find out.) If so, try removing #define NO_MKTIME_BEFORE_1970 from irix5.h and see if things get better. I asked awhile ago if that symbol was still needed given the mktime workaround, but no one got back to me on it. My current theory is that once you remove that symbol, you will get results matching the Solaris expected files --- ie, I suspect that your system believes there was DST in 1947. Do you have a way of verifying that theory by looking at the system timezone database? regards, tom lane +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] Problem with 7.3 on Irix with dates before 1970
Dear All, There are differences in the regression tests for PostgreSQL 7.3b2 with handling of times before 1970. I recently sent out a set of diffs for the changes. I've looked through the datetime.c code, but it's not obvious to me what the cause of the change is. PostgreSQL 7.2 works fine on Irix for these cases. Can anyone familiar with the code please take a look at the diffs and identify where the trouble may be? I'd be happy to rerun the regression tests to help diagnose the problem. --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Problem with 7.3 on Irix with dates before 1970
Dear Tom, Here's the result of your queries: template1=# set TimeZone TO 'PST8PDT'; SET template1=# select abstime 'May 10, 1947 23:59:12'; abstime 1947-05-10 22:59:12-08 (1 row) template1=# select (abstime 'May 10, 1947 23:59:12')::int4; int4 -714589248 (1 row) --Bob Robert E. Bruccoleri [EMAIL PROTECTED] writes: There are differences in the regression tests for PostgreSQL 7.3b2 with handling of times before 1970. I recently sent out a set of diffs for the changes. I've looked through the datetime.c code, but it's not obvious to me what the cause of the change is. PostgreSQL 7.2 works fine on Irix for these cases. I have a feeling that it's got something to do with the workaround for broken mktime() that I recently put in. Could you try this experiment and see what you get? (This should at least tell us whether the variance from common behavior is on the input side or the output side.) regression=# set TimeZone TO 'PST8PDT'; SET regression=# select abstime 'May 10, 1947 23:59:12'; abstime 1947-05-10 23:59:12-08 (1 row) regression=# select (abstime 'May 10, 1947 23:59:12')::int4; int4 -714585648 (1 row) regards, tom lane +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Test of PG7.3.2b2 on SGI Irix
Dear Tom, All the regression tests pass except for tests involving Savings Time which are off by one hour. --Bob Details? If you ran it today then the DST-boundary problems shouldn't be there anymore. Here are the diffs: It looks like your files match the solaris-1947 variants; would you confirm that? If so, please send a patch with a resultmap addition that matches your platform. There is already an entry horology/.*-irix6=horology-no-DST-before-1970 but it looks like that's not triggering on your system. It's not the same as the solaris-1947 variants. I compared all three horology files in the expected directory -- none match. There is no such irix6 file in the distribution. Please let me know what else you'd like me to do WRT to this diff. --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Test of PG7.3.2b2 on SGI Irix
I've built and run the regression tests on PostgreSQL7.3 beta 2 on SGI Irix and found the following suspicious compiler error message: gmake[4]: Entering directory `/pg/postgresql-7.3b2/src/backend/utils/hash' cc -64 -g -woff 1164,1171,1185,1195,1552 -I../../../../src/include -I/stf/sys64/include -I/stf/sys64/include/readline -U_NO_XOPEN4 -c dynahash.c -o dynahash.o cc-1184 cc: WARNING File = dynahash.c, Line = 543 = is used where where == may have been intended. Assert(currBucket !(saveState.currBucket = NULL)); ^ Looks like a bug to me. All the regression tests pass except for tests involving Savings Time which are off by one hour. --Bob +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Test of PG7.3.2b2 on SGI Irix
Dear Tom, Robert E. Bruccoleri [EMAIL PROTECTED] writes: gmake[4]: Entering directory `/pg/postgresql-7.3b2/src/backend/utils/hash' cc -64 -g -woff 1164,1171,1185,1195,1552 -I../../../../src/include -I/stf/sys64/include -I/stf/sys64/include/readline -U_NO_XOPEN4 -c dynahash.c -o dynahash.o cc-1184 cc: WARNING File = dynahash.c, Line = 543 = is used where where == may have been intended. Assert(currBucket !(saveState.currBucket = NULL)); ^ Looks like a bug to me. No, the code is correct, although no doubt too clever by half :-( How can it be correct? If the assertion checking is turned off, then saveState.currBucket will not be changed, but if assertion checking is on, it will be set to NULL. The only way that it would make no difference would be if the saveState.currBucket were NULL to begin with, but then, why make the assignment? All the regression tests pass except for tests involving Savings Time which are off by one hour. --Bob Details? If you ran it today then the DST-boundary problems shouldn't be there anymore. Here are the diffs: *** abstime.out Thu Oct 31 13:18:04 2002 --- ../expected/abstime.out Wed Nov 21 13:27:25 2001 *** *** 44,50 | Wed Dec 31 16:00:00 1969 PST | infinity | -infinity !| Sat May 10 22:59:12 1947 PST | invalid (7 rows) --- 44,50 | Wed Dec 31 16:00:00 1969 PST | infinity | -infinity !| Sat May 10 23:59:12 1947 PST | invalid (7 rows) *** *** 56,62 | Mon May 01 00:30:30 1995 PDT | Wed Dec 31 16:00:00 1969 PST | -infinity ! | Sat May 10 22:59:12 1947 PST (5 rows) SELECT '' AS six, ABSTIME_TBL.* --- 56,62 | Mon May 01 00:30:30 1995 PDT | Wed Dec 31 16:00:00 1969 PST | -infinity ! | Sat May 10 23:59:12 1947 PST (5 rows) SELECT '' AS six, ABSTIME_TBL.* *** *** 67,73 | Mon May 01 00:30:30 1995 PDT | Wed Dec 31 16:00:00 1969 PST | infinity ! | Sat May 10 22:59:12 1947 PST | invalid (6 rows) --- 67,73 | Mon May 01 00:30:30 1995 PDT | Wed Dec 31 16:00:00 1969 PST | infinity ! | Sat May 10 23:59:12 1947 PST | invalid (6 rows) *** *** 89,95 ---+-- | Wed Dec 31 16:00:00 1969 PST | -infinity !| Sat May 10 22:59:12 1947 PST (3 rows) SELECT '' AS four, ABSTIME_TBL.* --- 89,95 ---+-- | Wed Dec 31 16:00:00 1969 PST | -infinity !| Sat May 10 23:59:12 1947 PST (3 rows) SELECT '' AS four, ABSTIME_TBL.* *** *** 99,105 | Sun Jan 14 03:14:21 1973 PST | Wed Dec 31 16:00:00 1969 PST | -infinity ! | Sat May 10 22:59:12 1947 PST (4 rows) SELECT '' AS four, ABSTIME_TBL.* --- 99,105 | Sun Jan 14 03:14:21 1973 PST | Wed Dec 31 16:00:00 1969 PST | -infinity ! | Sat May 10 23:59:12 1947 PST (4 rows) SELECT '' AS four, ABSTIME_TBL.* *** *** 121,127 ORDER BY abstime; four | abstime| year | month | day | hour | minute | second --+--+--+---+-+--++ ! | Sat May 10 22:59:12 1947 PST | 1947 | 5 | 10 | 22 | 59 | 12 | Wed Dec 31 16:00:00 1969 PST | 1969 |12 | 31 | 16 | 0 | 0 | Sun Jan 14 03:14:21 1973 PST | 1973 | 1 | 14 |3 | 14 | 21 | Mon May 01 00:30:30 1995 PDT | 1995 | 5 | 1 |0 | 30 | 30 --- 121,127 ORDER BY abstime; four | abstime| year | month | day | hour | minute | second --+--+--+---+-+--++ ! | Sat May 10 23:59:12 1947 PST | 1947 | 5 | 10 | 23 | 59 | 12 | Wed Dec 31 16:00:00 1969 PST | 1969 |12 | 31 | 16 | 0 | 0 | Sun Jan 14 03:14:21 1973 PST | 1973 | 1 | 14 |3 | 14 | 21 | Mon May 01 00:30:30 1995 PDT | 1995 | 5 | 1 |0 | 30 | 30 *** horology.outThu Oct 31 13:18:19 2002 --- ../expected/horology.outWed Sep 18 17:35:25 2002 *** *** 1741,1750 | Wed Mar 15 13:14:02 2000 PST | @ 34 years| Tue Mar 15 13:14:02 1966 PST | Sun Dec 31 17:32:01 2000 PST | @ 34 years| Sat Dec 31 17:32:01 1966 PST | Mon Jan 01 17:32:01 2001 PST | @ 34 years| Sun Jan 01 17:32:01 1967 PST ! | Sat Sep 22 18:19:20 2001 PDT | @ 34 years| Fri Sep 22 17:19:20 1967 PST ! | Thu Jan 01 00:00:00 1970 PST | @ 5 mons 12 hours | Thu Jul 31 11:00:00 1969 PST ! | Thu Jan 01 00
Re: [HACKERS] Question about LWLockAcquire's use of semaphores instead of spinlocks
Dear Tom, Thank you for the explanation. I did not understand what was going on in lwlock.c. My systems are all SGI Origins having between 8 and 32 processors, and I've been running PostgreSQL on them for about 5 years. These machines do provide a number of good mechanisms for high performance shared memory parallelism that I don't think are found elsewhere. I wish that I had the time to understand and tune PostgreSQL to run really well on them. I have a question for you and other developers with regard to my SGI needs. If I made a functional Origin 2000 system available to you with hardware support, would the group be willing to tailor the SGI port for better performance? Sincerely, Bob Robert E. Bruccoleri [EMAIL PROTECTED] writes: Tom Lane writes: If you're saying that we don't have an implementation of TAS for SGI hardware, then feel free to contribute one. If you are wanting to replace LWLocks with spinlocks, then you are sadly mistaken, IMHO. This touches on my question. Why am I mistaken? I don't understand. Because we just got done replacing spinlocks with LWLocks ;-). I don't believe that reverting that change will improve matters. It will certainly hurt on SMP machines, and I believe that it would at best be a breakeven proposition on uniprocessors. See the discussions last fall that led up to development of the LWLock mechanism. The problem with TAS spinlocks is that they are suitable only for implementing locks that will be held for *very short* periods, ie, actual contention is rare. Over time we had allowed that mechanism to be abused for locking fairly large and complex shared-memory data structures (eg, the lock manager, the buffer manager). The next step up, a lock-manager lock, is very expensive and certainly can't be used by the lock manager itself anyway. LWLocks are an intermediate mechanism that is marginally more expensive than a spinlock but behaves much more gracefully in the presence of contention. LWLocks also allow us to distinguish shared and exclusive lock modes, thus further reducing contention in some cases. BTW, now that I reread the title of your message, I wonder if you haven't just misunderstood what's happening in lwlock.c. There is no IPC semaphore call in the fast (no-contention) path of control. A semaphore call occurs only when we are forced to wait, ie, yield the processor. Substituting a spinlock for that cannot improve matters; it would essentially result in wasting the remainder of our timeslice in a busy-loop, rather than yielding the CPU at once to some other process that can get some useful work done. regards, tom lane +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Question about LWLockAcquire's use of semaphores instead of spinlocks
Dear Luis, I would be very interested. Replacing the IPC shared memory with an arena make a lot of sense. --Bob Hi Bob: We're have been working with an sproc version of postgres and it has improve performance over a NUMA3 origin 3000 due to IRIX implements round_robin by default on memory placement instead of first touch as it did on fork. We're been wondering about replacing IPC shmem with a shared arena to help performance improve on IRIX. I dont´know if people here in postgres are interested on specifical ports but it could help you improve your performance. Regards +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Question about LWLockAcquire's use of semaphores instead of spinlocks
On SGI multiprocessor machines, I suspect that a spinlock implementation of LWLockAcquire would give better performance than using IPC semaphores. Is there any specific reason that a spinlock could not be used in this context? +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: Fw: Fw: [HACKERS] bad performance on irix
Dear Igor, Igor Kovalenko writes: Makes me wonder... perhaps now someone will be convinced to take a look at the POSIX IPC patch. On some platforms (not on Linux I am afraid) POSIX mutexes might be quite a bit faster than SYSV semaphores. Yes, but on the SGI platform, the MIPS test_and_set instructions are really fast and should be used. +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: Fw: Fw: [HACKERS] bad performance on irix
Dear Igor, I am confused to hell. I always thought MIPS does NOT have TAS instruction ;) On the SGI platform, there are very high speed implementations of test and set which allow large number of processes to safely and quickly access shared memory. SGI has a hardware team that specifies MIPS processor variants that are used in their servers so the machines can scale. I've tried to get SGI interested in putting some internal engineering effort to improve PostgreSQL performance on operations which could benefit from its shared memory parallel architecture (like index creation and sorting), but without success. +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: Fw: Fw: [HACKERS] bad performance on irix
Dear Tom, The evidence is from the Process Activity Recorder, an Irix utility similar to strace the reports syscall usage. A number of semop's are performed in the operation of backend. Luis can send you specifics. --Bob Luis Alberto Amigo Navarro writes: - Original Message - From: Tom Lane [EMAIL PROTECTED] To: Luis Alberto Amigo Navarro [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; Robert E. Bruccoleri [EMAIL PROTECTED] Sent: Monday, March 18, 2002 5:36 PM Subject: Re: Fw: [HACKERS] bad performance on irix Luis Alberto Amigo Navarro [EMAIL PROTECTED] forwards: It's using the spinlocks for some locks, but semaphores for others. That doesn't make any sense to me. For one thing, if HAS_TEST_AND_SET is defined in the config header, the executable will just plain fail to build if there's no tas implementation, because lmgr/spin.c won't be compiled. And I sure don't see how some of the locks might be implemented one way and some the other. Which ones do you think are being implemented as semaphores, and what's your evidence? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html +-++ | Robert E. Bruccoleri, Ph.D. | email: [EMAIL PROTECTED]| | P.O. Box 314| URL: http://www.congen.com/~bruc | | Pennington, NJ 08534|| +-++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Re: Various silliness in heap_getnext and related routines
Dear Tom, [EMAIL PROTECTED] (Robert E. Bruccoleri) writes: For my immediate problem, would removing the spinlock acquisition be OK? It'd be interesting to remove the marked lines: bufHdr = BufferDescriptors[buffer - 1]; - SpinAcquire(BufMgrLock); if (bufHdr-tag.blockNum == blockNumber RelFileNodeEquals(bufHdr-tag.rnode, relation-rd_node)) { - SpinRelease(BufMgrLock); return buffer; } - return ReadBufferWithBufferLock(relation, blockNumber, true); and see how that affects your performance issue, if at all. I have made those changes, ran the regression tests, and then issued eight simultaneous retrieval jobs against the database. Performance is now greatly improved with all the jobs making quick progress through the sequential query and completing in very reasonable times. Thanks very much for your help. BTW, given the high level of support that you provide to the PostgreSQL community, it's very accurate to state that support for PostgreSQL is far superior to that of Oracle, especially for SGI systems. Sincerely, Bob +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Re: Various silliness in heap_getnext and related routines
Dear Tom, [EMAIL PROTECTED] (Robert E. Bruccoleri) writes: BTW, given the high level of support that you provide to the PostgreSQL community, it's very accurate to state that support for PostgreSQL is far superior to that of Oracle, especially for SGI systems. It's all about having the source code available, I think. After all, it was you who identified the location of the problem... Yes, but it's not just having the source code. Although I could see a potential problem, it was your knowledge of the source code to recommend a patch that worked immediately, and your willingness to help that together makes PostgreSQL support so good. The same knowledge and helpful attitude applies to all the PostgreSQL developers. Thanks. --Bob +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Re: Various silliness in heap_getnext and related routines
Dear Tom, Robert Bruccoleri ([EMAIL PROTECTED]) wrote: It's not clear to me why the spinlock needs be grabbed at the beginning of RelationGetBufferWithBuffer, I believe you are right: the spinlock doesn't need to be grabbed, because if a valid buffer is passed in, it must already be pinned (since the returned buffer is expected to be pinned). Hence the check for same-buffer could be done without first grabbing the spinlock. For my immediate problem, would removing the spinlock acquisition be OK? Thanks for looking into this problem. Sincerely, Bob +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Re: XFS File systems and PostgreSQL
I have run a simple PostgreSQL benchmark on my SGI system which uses XFS for its file system on all disks to compare the effect of fsync. The benchmark was the loading of a database from 157 MB of pg_dump data including the construction of 11 Btree indexes covering nearly all of the data. The second column was just for the data load, and the third column is for the index creation. The system is an SGI Indigo2 R1 running Irix 6.5.7 with 384 MB RAM writing to Seagate 18GB 7200RPM narrow SCSI disks. Fsync enabled Elapsed load time Elapsed indexing time Yes 15:53 9:16 No10:33 8:40 The CPU is not fully utilized for loading, and thus the system is I/O bound and the use of fsync has an impact. By contrast, the indexing process is CPU bound, and fsync is less important. The performance penalty for using fsync is modest, and therefore, I do not believe that we should discourage people from using XFS because it is a journaling file system. The note advising against installing Postgres on XFS should be removed from the installation guide. Instead, we need to explore how to use XFS's features to improve PostgreSQL's performance. For example, the XFS filesystem journal can be placed on a drive different from the data drive. This would substantially improve write performance. +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] XFS File systems and PostgreSQL
I have been using PostgreSQL and XFS file systems on SGI's for many years, and PostgreSQL is fast. Dumping and loading 100GB of table files takes less than one day elapsed (provided there is no other activity on that database -- large amounts of transactional activity will slow things down). I always turn off fsync. Most of my experience has been with 6.5.3, although I've been using 7.1 and I don't see much of a difference in performance. I don't know if the Linux version of XFS is substantially different than the SGI version, but XFS is a wonderful filesystem to use and administer (at least on SGI's). +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] XFS File systems and PostgreSQL
Dear Bruce, Yes, the irony is that a journaling file system is being used to have fast, reliable restore after crash bootup, but with no fsync, the db is probably hosed. There is no irony in these cases. In my systems, which are used for bioinformatics, the updating process is generally restartable. I normally have lots of data to load or many records to change, and the quantities are much more than any reasonable sized transaction. Some jobs run for days. If I lose some data because of a crash, I just restart the jobs, and they'll delete some of the last data to be loaded, and then resume. Furthermore, the SGI's that I run on are highly reliable, and they rarely crash. So, I might have to clean up a big mess rarely (I've had one really big one in two years), but performance otherwise is really good. I should also point out that most of my work has been with PostgreSQL 6.5.3 which doesn't have the WAL. If I have some time, I will try the experiment of loading a database of mine into PG 7.1 using -F or not and I'll report the timing. +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Irix binaries of 7.1 RC1 are available
Binaries for PostgreSQL 7.1 RC1 are now available in ftp://ftp.postgresql.org/pub/binary/v7.1/IRIX_6.5.7/ All regression tests pass except that geometry is different by very small amounts ( 10^14) and join gives the same rows in a different order. +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] Spinlocks on SGI's
I contributed the first working s_lock.c code for the SGI's over three years ago (using the test_and_set library calls). It's been working for me ever since in a heavy multi-user environment. Please don't change it unless there's an overwhelming reason. Also, the GNU compiler systems works poorly on SGI's. In this environment, you need to stick to the SGI provided compilers. Once you get used to them, they work fairly well. Sincerely, Bob +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Spinlocks on SGI's
Dear Mathijs, Wouldn't it be best to build IRIX 6.2 binaries? Those should run properly on IRIX 6.2 and anything newer. Of course, having both IRIX 6.2 and 6.5 binaries wouldn't really hurt... It's not feasible to do this, nor is it worth it. Irix 6.5 is much more reliable than Irix 6.2. --Bob +--++ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Re: Performance degradation in PostgreSQL 7.1beta3 vs
Dear Hannu, "Robert E. Bruccoleri" wrote: explain select count(*) from comparisons_4 where code = 80003; NOTICE: QUERY PLAN: Aggregate (cost=15659.29..15659.29 rows=1 width=0) - Seq Scan on comparisons_4 (cost=0.00..15640.81 rows=7391 width=0) EXPLAIN What is the type of field "code" ? int4 Do you think that should make a difference? +--+--------+ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++
Re: [HACKERS] Re: Performance degradation in PostgreSQL 7.1beta3 vs
Dear Hannu, "Robert E. Bruccoleri" wrote: Dear Hannu, "Robert E. Bruccoleri" wrote: explain select count(*) from comparisons_4 where code = 80003; NOTICE: QUERY PLAN: Aggregate (cost=15659.29..15659.29 rows=1 width=0) - Seq Scan on comparisons_4 (cost=0.00..15640.81 rows=7391 width=0) EXPLAIN What is the type of field "code" ? int4 Do you think that should make a difference? Probably not here. Sometimes it has made difference if the system does not recognize the other side of comparison (80003) as being of the same type as the index. what are the cost estimates when you run explain with seqscan disabled ? do = SET ENABLE_SEQSCAN TO OFF; see: (http://www.postgresql.org/devel-corner/docs/admin/runtime-config.htm#RUNTIME-CONFIG-OPTIMIZER) Here's the result from EXPLAIN: Aggregate (cost=19966.21..19966.21 rows=1 width=0) - Index Scan using comparisons_4_code on comparisons_4 (cost=0.00..19947.73 rows=7391 width=0) The estimates are too high. --Bob +--+------------+ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++
[HACKERS] Performance degradation in PostgreSQL 7.1beta3 vs 6.5.3
Dear Tom, I am writing to you because you are the maintainer of the query optimizer and planner. I have found a very significant performance degradation between PostgreSQL 6.5.3 and 7.1beta3, which will severely impact two large applications that I have developed and maintain for several clients. The performance difference is seen with the use of indices in SELECT statements, whereby the current release does not make effective use of the indices and 6.5.3 does. All of these tests were run on a SGI R1 Indigo2 system running Irix 6.5. All the regression tests passed as expected for both versions. I have followed the discussion in pgsql-hackers over the previous months and others have noted some performance problems, and the response has typically been to VACUUM the tables. Unfortunately, this is not a practical option for my applications. They are very large -- I have one table that is 17GB in length, and the applications are used frequently. More importantly, PostgreSQL 6.5.3 works very, very well without VACUUM'ing. In order to assist you to diagnosing and correcting this problem, I have prepared a test database that shows the problems. I will attach three files; the test script, the log from running it on version 6.5.3, and the log from running it on version 7.1beta3. In addition, I have setup an anonymous FTP directory on ftp.congen.com:/pub/pg_perf which contains all of these files as well as the compressed table dumps used to build the test database. (When you have finished copying the files, please let me know.) The test script creates the database including the necessary indexing, and then runs EXPLAIN on each of the queries followed by actually executing the queries with "timex" commands to report elapsed times. The striking difference in the query plans is that 7.1 uses only sequential searches for the SELECT's whereas 6.5.3 uses index scans. As a result, 7.1 is almost two orders of magnitude slower than 6.5.3 with exactly the same data, schema, and queries. I plead with you to revisit this question of performance and fix PostgreSQL 7.1 to work as well as PostgreSQL 6.5.3 does. I depend upon PostgreSQL for much of my work, and I do not want to abandon it because of this performance problem which arose only recently. Thank you. +--+----+ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++ #!/bin/csh createdb perf_test gunzip proteins.dmp.gz | timex psql -e perf_test gunzip comparisons_4.dmp.gz | timex psql -e perf_test gunzip concordance_91.dmp.gz | timex psql -e perf_test psql -e perf_test EOF explain select * from comparisons_4 where name1 = 'HI0001'; explain select count(*) from comparisons_4 where code = 80003; explain select p.name, p.purpose from comparisons_4 c, proteins p where c.name1 = 'HI0003' and c.name2 = p.name; explain select c.target_name, c.matched_name, c.score, p.purpose from concordance_91 c, proteins p where c.matched_name = p.name; EOF timex psql -e -c "select * from comparisons_4 where name1 = 'HI0001'" perf_test timex psql -e -c "select count(*) from comparisons_4 where code = 80003" perf_test timex psql -e -c "select p.name, p.purpose from comparisons_4 c, proteins p where c.name1 = 'HI0003' and c.name2 = p.name" perf_test timex psql -e -c "select c.target_name, c.matched_name, c.score, p.purpose from concordance_91 c, proteins p where c.matched_name = p.name" perf_test CREATE TABLE "proteins" ( "name" character varying(16), "organism" text, "start_position" int4, "last_position" int4, "seq" text, "purpose" text, "alternate_key" character varying(16), "comment" text, "compared" bool, "complement" bool, "chromosome" character varying(4), "essentiality" float8); QUERY: CREATE TABLE "proteins" ( "name" character varying(16), "organism" text, "start_position" int4, "last_position" int4, "seq" text, "purpose" text, "alternate_key" character varying(16), "comment" text, "compared" bool, "complement" bool, "chromosome" character varying(4), "essentiality" floa
[HACKERS] Re: Performance degradation in PostgreSQL 7.1beta3 vs 6.5.3
Dear Tom, You can't afford to run a VACUUM ANALYZE even once in the lifetime of the table? Not very often at best, and certainly not routinely. Some of my tables exceed 10GB and have multiple indices. However, to test your suggestion, I modified my performance test script to "VACUUM ANALYZE" all the tables prior to invoking EXPLAIN, and it improves all of the searches except this one (EXPLAIN output also included): explain select count(*) from comparisons_4 where code = 80003; NOTICE: QUERY PLAN: Aggregate (cost=15659.29..15659.29 rows=1 width=0) - Seq Scan on comparisons_4 (cost=0.00..15640.81 rows=7391 width=0) EXPLAIN The choice of sequential scan here takes 30x longer. More importantly, PostgreSQL 6.5.3 works very, very well without VACUUM'ing. 6.5 effectively assumes that "foo = constant" will select exactly one row, if it has no statistics to prove otherwise. I don't regard that as a well-chosen default, even if it does happen to work OK for your application. Selecting an indexscan when a seqscan is needed is just as evil as doing the reverse; what's much worse is that 6.5 will pick incredibly bad join plans (ie, nested loops) because it thinks that very little data is coming out of the scans. I've tuned my applications to work well with these defaults (and demonstrate to my peers that PostgreSQL is comparable to Oracle in performance for these types of queries). I am willing to make changes to my applications to make them work as well with 7.1, but the performance of the query above worries me. I think the current planner will make the wrong decision more often than the right one. To test this further on this table, I went through the comparisons_4 table and found that code 13 appears the most (73912) out of 591825 rows. In this case, 6.5.3 takes 8.56 seconds to return its answer, whereas 7.1 takes 12.11 seconds. Even in the worst case for this table, the indexed scan is faster, but the optimizer decides on the sequential scan. It appears that the decision point for the switch to sequential scans isn't set properly. To me, this is a bug. If you want to revert to the 6.5 behavior without doing a VACUUM, you could probably get pretty close with update pg_attribute set attdispersion = -1.0; Does VACUUM ANALYZE set this column to its calculated value? What kinds of queries would not give 6.5 behavior if I set this column as you suggest? Alternatively, how hard would it be to add another SET variable like USE_6_5_PLANNING_RULES? Personally, that would be most helpful from an application development viewpoint because I could switch to PostgreSQL 7.1 without destroying the performance of my applications, and then test new versions with the 7.1 planner with less potential for service disruption. Stats-gathering and planning certainly does need a great deal of additional work, but I'm afraid that none of that will happen before 7.1. As I said above, I've put a lot of effort into making my applications work quickly with Postgres, and I'm looking forward to using the new features that are available with version 7.1. However, I'm very concerned that I will not be able to achieve the same performance without detailed knowledge of the internals. Shouldn't I be assured that I will improve the performance of a query by creating a index on the fields used for selecting the row? That is not the case for the query above. Finally, I apologize for being a little strident here. I've been advocating for and using Postgres for four years, and it's frustrating when a new version results in a serious and noticeable performance degradation. Sincerely, Bob +--+--------+ | Robert E. Bruccoleri, Ph.D. | Phone: 609 737 6383| | President, Congenomics, Inc. | Fax: 609 737 7528| | 114 W Franklin Ave, Suite K1,4,5 | email: [EMAIL PROTECTED]| | P.O. Box 314 | URL: http://www.congen.com/~bruc | | Pennington, NJ 08534 || +--++