Re: [HACKERS] New CRC algorithm: Slicing by 8
Tom Lane wrote:
> Mark Kirkwood [EMAIL PROTECTED] writes:
>> Interesting that the slice-8 algorithm seems to work noticeably better on FreeBSD than Linux -
> Are you running similar gcc versions on both? I realize I forgot to document what I was using:
>   HPPA:        gcc version 2.95.3 20010315 (release)
>   PPC:         gcc version 4.0.1 (Apple Computer, Inc. build 5363)
>   both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)
> Interesting that it's only the oldest-line tech that's showing a win for me ...

Ah - good point, FreeBSD is using an older compiler:

  FreeBSD: gcc (GCC) 3.4.6 [FreeBSD] 20060305
  Linux:   gcc (GCC) 4.1.1 (Gentoo 4.1.1)

Hmm - there is a FreeBSD port for gcc 4.1.2; I might set that off to build itself and see whether compiling with it changes the results any.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Sun, 2006-10-22 at 17:18 +0200, Martijn van Oosterhout wrote:
> On Sun, Oct 22, 2006 at 08:10:56PM +0530, Gurjeet Singh wrote:
>> Hi all, Michael Kounavis has given a green signal (please refer to the forwarded message).
> I don't think that helps. The publishing date of this article was March 2006. If this is really the first time this algorithm was published, that means that anyone else (or even the author) has the option of patenting this algorithm sometime before March 2007 and still claiming they invented it first.

Slice-By-8 was first mentioned here:

  M. E. Kounavis and F. L. Berry, "A Systematic Approach to Building High Performance Software-based CRC Generators", Proceedings, Tenth IEEE International Symposium on Computers and Communications (ISCC 2005), La Manga Del Mar Menor, Spain, June 2005.

so presumably the opportunity to patent has already passed, given what you say.

> And realistically we would wait at least a year or three after that, because you don't get to see patents as they're applied for. Maybe March 2010 we can look into it...

I think that's a tad overcooked. We have to balance caution with boldness. Simply copying stuff from research can be dangerous, but I think this is about as safe as it gets.

-- Simon Riggs, EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 23 Oct 2006 07:35:08 +0200, Tom Lane [EMAIL PROTECTED] wrote:
> I realize I forgot to document what I was using:
>   HPPA:        gcc version 2.95.3 20010315 (release)
>   PPC:         gcc version 4.0.1 (Apple Computer, Inc. build 5363)
>   both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)
> Interesting that it's only the oldest-line tech that's showing a win for me ...

Does anyone have an Intel compiler available? It would be interesting to see results on that compared to gcc (given that it is an Intel algorithm ;).

Anders
Re: [HACKERS] New CRC algorithm: Slicing by 8
Tom Lane wrote:
> Are you running similar gcc versions on both? I realize I forgot to document what I was using:
>   HPPA:        gcc version 2.95.3 20010315 (release)
>   PPC:         gcc version 4.0.1 (Apple Computer, Inc. build 5363)
>   both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)
> Interesting that it's only the oldest-line tech that's showing a win for me ...

It would be interesting to see if the Intel compiler works better for this algorithm. Anyone have it installed?

-- Heikki Linnakangas, EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 23 Oct 2006, Mark Kirkwood wrote:
> Tom Lane wrote:
>> Yah, I checked. Several times... but if anyone else wants to repeat the experiment, please do. Or look for bugs in either my test case or Gurjeet's.

Just for fun, I tried it out with both GCC and with Intel's C compiler with some aggressive platform-specific flags on my 2.8GHz Xeon running Gentoo.

                         Std crc      Slice-8 crc
  Intel P4 Xeon 2.8GHz (Gentoo, gcc-3.4.5, -O2)
    8192 bytes           4.697572     9.806341
    1024 bytes           0.597429     1.181828
    64 bytes             0.046636     0.086984
  Intel P4 Xeon 2.8GHz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)
    8192 bytes           0.04         0.001085
    1024 bytes           0.04         0.001292
    64 bytes             0.03         0.001078

So at this point I realize that Intel's compiler is optimizing the loop away, at least for the std crc and probably for both. So I make mycrc an array of 2, and subscript mycrc[j&1] in the loop.

                         Std crc      Slice-8 crc
  Intel P4 Xeon 2.8GHz (Gentoo, gcc-3.4.5, -O2)
    8192 bytes           51.397146    9.523182
    1024 bytes           6.430986     1.229043
    64 bytes             0.400062     0.128579
  Intel P4 Xeon 2.8GHz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)
    8192 bytes           29.881708    0.001432
    1024 bytes           3.750313     0.001432
    64 bytes             0.238583     0.001431

So it looks like something fishy is still going on with slice-8 under the Intel compiler. I have attached my changed testcrc.c file.

> FWIW - FreeBSD and Linux results using Tom's test program on almost identical hardware [1]:
>
>                          Std crc      Slice-8 crc
>   Intel P-III 1.26GHz (FreeBSD 6.2)
>     8192 bytes           12.975314    14.503810
>     1024 bytes           1.633557     1.852322
>     64 bytes             0.111580     0.206975
>   Intel P-III 1.26GHz (Gentoo 2006.1)
>     8192 bytes           12.967997    28.363876
>     1024 bytes           1.632317     3.626230
>     64 bytes             0.111513     0.326557
>
> Interesting that the slice-8 algorithm seems to work noticeably better on FreeBSD than Linux - but still not as well as the standard one (for these tests anyway)...
>
> Cheers, Mark
>
> [1] Both boxes have identical mobos, memory and CPUs (same sspec nos).
-- You can tune a piano, but you can't tuna fish.

#include "postgres.h"
#include <time.h>
#include <sys/time.h>
#include "pg_crc.h"

int
main()
{
    char        buffer[TESTSIZE];
    pg_crc32    mycrc[2];
    int         j;
    struct timeval tstart;
    struct timeval tstop;

    srand(time(NULL));
    for (j = 0; j < TESTSIZE; ++j)
        buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

    gettimeofday(&tstart, NULL);

    for (j = 0; j < NTESTS; j++)
    {
        INIT_CRC32(mycrc[j & 1]);
        COMP_CRC32(mycrc[j & 1], buffer, TESTSIZE);
        FIN_CRC32(mycrc[j & 1]);
    }

    gettimeofday(&tstop, NULL);

    if (tstop.tv_usec < tstart.tv_usec)
    {
        tstop.tv_sec--;
        tstop.tv_usec += 1000000;
    }

    printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
           TESTSIZE, NTESTS,
           (long) (tstop.tv_sec - tstart.tv_sec),
           (long) (tstop.tv_usec - tstart.tv_usec));

    return 0;
}
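For context, the "Std crc" these benchmarks compare against is the classic one-byte-at-a-time table-driven CRC. Here is a minimal standalone sketch of that approach, using the reflected CRC-32 polynomial 0xEDB88320 (the same polynomial PostgreSQL uses, though these are not the actual pg_crc.h macros):

```c
#include <stdint.h>
#include <stddef.h>

/* 256-entry lookup table for the reflected CRC-32 polynomial. */
static uint32_t crc_table[256];

static void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[i] = c;
    }
}

/* Classic byte-at-a-time CRC: one table lookup per input byte. */
static uint32_t crc32_bytewise(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = crc_table[(crc ^ *buf++) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

Slicing-by-8 speeds up the same computation by consuming eight input bytes per iteration, looking each byte up in one of eight 256-entry tables, so the total table footprint stays at 8KB rather than growing exponentially.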
Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support
> So far I've seen no evidence that async I/O would help us, only a lot of wishful thinking.

Is this thread moot? While researching this thread I came across this article: http://kerneltrap.org/node/6642 describing claims of a 30% performance boost when using posix_fadvise to ask the OS to prefetch data. ISTM that this kind of improvement is in line with what AIO can provide, and posix_fadvise is cleaner, not requiring threads and such.

Hmm, my man page says:

  POSIX_FADV_WILLNEED and POSIX_FADV_NOREUSE both initiate a non-blocking read of the specified region into the page cache. The amount of data read may be decreased by the kernel depending on VM load. (A few megabytes will usually be fully satisfied, and more is rarely useful.)

This appears to be exactly what we want, no? It would be nice to get some idea of what systems support this.

POSIX_FADV_WILLNEED definitely sounds very interesting, but I think this interface was intended to hint larger areas (megabytes). The wishful thinking was not to hint seq scans, but to advise single 8k pages. The OS is responsible for sequential readahead, but it cannot anticipate random access that results from btree access (unless of course we are talking about very small tables). But I doubt that with this interface many OSes will actually forward multiple I/Os to the disk subsystem in parallel, which would be what is needed. Also the comment Bruce quoted does not sound encouraging :-(

Andreas
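The single-page hint being discussed can be sketched in a few lines. This is an illustrative fragment, not PostgreSQL code: `prefetch_page` is a hypothetical helper, and BLCKSZ just stands in for PostgreSQL's default 8k page size. posix_fadvise is purely advisory, so a non-zero return or a kernel that ignores the hint is not an error condition for the caller:

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <sys/types.h>

#define BLCKSZ 8192             /* PostgreSQL's default page size */

/* Ask the kernel to start a non-blocking read of one 8k page of an
 * open file into the page cache. Advisory only: the kernel may read
 * less (or nothing) depending on VM load. Returns 0 on success, an
 * errno value otherwise. */
int prefetch_page(int fd, unsigned blockno)
{
    return posix_fadvise(fd, (off_t) blockno * BLCKSZ, BLCKSZ,
                         POSIX_FADV_WILLNEED);
}
```

The open question in the thread is whether kernels of the era would actually queue many such 8k hints to the disk subsystem in parallel, which is what random btree-driven access would need.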
Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support
> So far I've seen no evidence that async I/O would help us, only a lot of wishful thinking.
>
> Is this thread moot? While researching this thread I came across this article: http://kerneltrap.org/node/6642 describing claims of a 30% performance boost when using posix_fadvise to ask the OS to prefetch data. ISTM that this kind of improvement is in line with what AIO can provide, and posix_fadvise is cleaner, not requiring threads and such.

This again is about better OS readahead for sequential access, where standard Linux obviously behaves differently. It is not about random access.

Btw, I do understand the opinion of the Linux developers that pg should actually read larger blocks for seq scans. Under high disk load, OSes tend not to do all the needed readahead, which has pros and cons, but mainly cons for pg.

Andreas
Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support
> Yup, that would be the scenario where it helps (provided that you have a smart disk or a disk array and an intelligent OS aio implementation). It would be used to fetch the data pages pointed at from an index leaf, or the next level index pages. We measured the IO bandwidth difference on Windows with EMC as being nearly proportional to parallel outstanding requests up to at least
>
> Measured it using what? I was under the impression only one proof-of-implementation existed, and that the scenarios and configuration of the person who wrote it did not show significant improvement.

IIRC the configuration of that test was not suitable to show any benefit. Minimum requirements to show improvement are:

  - very few active sessions (typically fewer than the number of disks)
  - a table that spans multiple disks (typically on a stripe set, or one intelligent scsi disk)
  - only random disk access plans

> You have PostgreSQL on Windows with EMC with async I/O support to test with?

No, sorry. That was a MaxDB issue.

Andreas
Re: [HACKERS] bug or feature, || -operator and NULLs
Mario Weilguni wrote:
> This has been discussed before, but Oracle behaves differently, and IMHO in a more correct way. The following query returns NULL in PG:
>
>   SELECT NULL || 'fisk';
>
> But in Oracle, it returns 'fisk':
>
>   SELECT NULL || 'fisk' FROM DUAL;
>
> The latter seems more logical...
>
> I worked a lot with Oracle a few years ago and I agree: the feature is handy and sometimes makes life easier, but it's simply wrong. I heard a while ago that newer Oracle versions changed this to the SQL standard; is this true?

Unfortunately not; in Oracle's current version (10.2.0.2.0) it is still that way. I think that this Oracle 'feature' is almost as terrible as the fact that they treat '' as NULL, which is (as has been pointed out) most likely the reason for treating NULL as '' in ||.

Yours, Laurenz Albe
Re: [HACKERS] New CRC algorithm: Slicing by 8
Simon Riggs [EMAIL PROTECTED] writes:
> Slice-By-8 was first mentioned here:

Are you sure? US patent 7,047,479, filed in 2002, sounds like it may be relevant:

  http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1Sect2=HITOFFd=PALLp=1u=%2Fnetahtml%2FPTO%2Fsrchnum.htmr=1f=Gl=50s1=7047479.PN.

-- Gregory Stark, EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] New CRC algorithm: Slicing by 8
Mark Kirkwood wrote:
> Tom Lane wrote:
>> Are you running similar gcc versions on both? I realize I forgot to document what I was using:
>
> Ah - good point, FreeBSD is using an older compiler:
>   FreeBSD: gcc (GCC) 3.4.6 [FreeBSD] 20060305
>   Linux:   gcc (GCC) 4.1.1 (Gentoo 4.1.1)
> Hmm - there is a FreeBSD port for 4.1.2, I might set that off to build itself and see if compiling with it changes the results any

Here are the results after building gcc 4.1.2 (repeating the results for gcc 3.4.6 for comparison). I suspect that performance is probably impacted because gcc 4.1.2 (and also the rest of the tool-chain) is built with gcc 3.4.6 - but it certainly suggests that the newer gcc versions don't like the slice-8 algorithm for some reason.

                         Std crc      Slice-8 crc
  Intel P-III 1.26GHz (FreeBSD 6.2, gcc 3.4.6)
    8192 bytes           12.975314    14.503810
    1024 bytes           1.633557     1.852322
    64 bytes             0.111580     0.206975
  Intel P-III 1.26GHz (FreeBSD 6.2, gcc 4.1.2)
    8192 bytes           19.516974    29.457739
    1024 bytes           2.448570     3.742106
    64 bytes             0.112644     0.335292
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 2006-10-23 at 05:22 -0400, Gregory Stark wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
>> Slice-By-8 was first mentioned here:
> Are you sure? US patent 7,047,479 filed in 2002 sounds like it may be relevant:
>   http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1Sect2=HITOFFd=PALLp=1u=%2Fnetahtml%2FPTO%2Fsrchnum.htmr=1f=Gl=50s1=7047479.PN.

Yes, I'm sure. That patent is titled "Parallel CRC formulation" and the abstract describes the use of two threads to calculate different parts of the CRC. It's designed for parallel circuit designs in hardware. That doesn't resemble SB8 at all - there's still only one thread. The cleverness of SB8 is to calculate more bytes simultaneously *without* requiring a corresponding increase in the size of the lookup table.

We have the original author's word that no patent has been filed. Even if we disbelieve that, which I have no reason to do, that alone would be sufficient to make a counter-claim against any damages claimed by any hypothetical patent holder in the future. What is the difference between this case and other patches?

-- Simon Riggs, EnterpriseDB  http://www.enterprisedb.com
[HACKERS] MVCC question
Folks,

Has PostgreSQL supported MVCC from the beginning, or was it added to PG later?

-- Regards, Gevik Babakhani  www.postgresql.nl  www.truesoftware.nl
Re: [HACKERS] New CRC algorithm: Slicing by 8
MK == Mark Kirkwood [EMAIL PROTECTED] writes:

MK> Here are the results after building gcc 4.1.2 (repeating results
MK> for gcc 3.4.6 for comparison). I suspect that performance is
MK> probably impacted because gcc 4.1.2 (and also the rest of the
MK> tool-chain) is built with gcc 3.4.6 - but it certainly suggests
MK> that the newer gcc versions don't like the slice-8 algorithm for
MK> some reason.

They don't seem to like the old CRC algorithm either. It is quite strange; such dramatic performance regressions from 3.x to 4.x are rare.

/Benny
Re: [HACKERS] MVCC question
Release 6.5 brought MVCC to PostgreSQL. Check out the following doc for details:

  http://www.postgresql.org/docs/7.1/static/release-6-5.html

--Imad, EnterpriseDB (http://www.enterprisedb.com)

On 10/23/06, Gevik Babakhani [EMAIL PROTECTED] wrote:
> Folks, has PostgreSQL supported MVCC from the beginning, or was it added to PG later?
> -- Regards, Gevik Babakhani  www.postgresql.nl  www.truesoftware.nl
Re: [HACKERS] MVCC question
On Monday, 23 October 2006 13:50, Gevik Babakhani wrote:
> How long has PostgreSQL supported MVCC?

Since 6.5.

-- Peter Eisentraut  http://developer.postgresql.org/~petere/
Re: [HACKERS] New CRC algorithm: Slicing by 8
Jeremy Drake [EMAIL PROTECTED] writes:
> So at this point I realize that intel's compiler is optimizing the loop away, at least for the std crc and probably for both. So I make mycrc an array of 2, and subscript mycrc[j&1] in the loop.

That's not a good workaround, because making mycrc expensive to access means your inner loop timing isn't credible at all. Instead try making the buffer array nonlocal --- malloc it, perhaps.

regards, tom lane
Re: [HACKERS] PgSQL users quota
On Fri, Oct 20, 2006 at 03:30:40AM +0300, Tux P wrote:
> Hi. Is there any chance of seeing the quota implementation described in this post in any upcoming release?
>   http://archives.postgresql.org/pgsql-hackers/2004-07/msg00392.php

Since Jonah hasn't done anything with it, he's presumably lost interest, so you'd need to find someone else looking for an itch to scratch. And it appears the original patch was against 7.4, so it'd probably need a decent amount of work to make it work with our current code.

On the bright side, there is the following TODO item:

  o Allow per-tablespace quotas

-- Jim Nasby [EMAIL PROTECTED]  EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [HACKERS] PgSQL users quota
On 10/23/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
> Since Jonah hasn't done anything with it he's presumably lost interest, so you'd need to find someone else looking for an itch to scratch. And it appears the original patch was against 7.4, so it'd probably need a decent amount of work to make it work with our current code.

Well, it was more a case of community wants vs. what I needed at the time. I'm not quite sure if I have the patch lying around anywhere, but it was fairly trivial to implement.

> On the bright side, there is the following TODO item:
>   o Allow per-tablespace quotas

Yes, this is what came out of the discussion. A per-tablespace patch should be fairly easy to implement once you get a consensus on what constitutes actual space usage... as VACUUMable space may or may not apply depending on your use-case.

-- Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 2nd Floor | [EMAIL PROTECTED]
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Sun, 2006-10-22 at 18:06 -0400, Tom Lane wrote:
> These numbers are, um, not impressive. Considering that a large fraction of our WAL records are pretty short, the fact that slice8 consistently loses at short buffer lengths is especially discouraging. Much of that ground could perhaps be made up with tenser coding of the initialization and finalization code, but it'd still not be worth taking any legal risk for AFAICS.

It doesn't look good for SB8, does it? Nor for gcc 4.1 either. Presumably Intel themselves will have some come-back, but I'm not sure what they'll say to so many conclusive tests.

Instead, I'd like to include a parameter to turn off CRC altogether, for heavily CPU-bound operations and the WAL drive on trustworthy hardware:

  wal_checksum = off

-- Simon Riggs, EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] New CRC algorithm: Slicing by 8
Simon Riggs [EMAIL PROTECTED] writes:
> Instead, I'd like to include a parameter to turn off CRC altogether, for heavily CPU bound operations and the WAL drive on trustworthy hardware.

No can do --- we rely on the checksums to be able to tell when we've hit the end of WAL during replay. You may as well propose not writing WAL at all (and no, I don't think it'd pass).

regards, tom lane
Re: [HACKERS] New CRC algorithm: Slicing by 8
On 10/23/06, Tom Lane [EMAIL PROTECTED] wrote:
> It's not so much that I don't trust Intel as that a CRC algorithm is exactly the sort of nice little self-contained thing that people love to try to patent these days. What I am really afraid of is that someone else has already invented this same method (or something close enough to it) and filed for a patent that Intel doesn't know about either. I'd be wondering about that no matter where the code had come from. Given the numbers I posted earlier today, the proposal is dead in the water anyway, quite aside from any legal considerations.

The horror, the horror. I wonder whether changing to Slicing by 8 would be self-contained enough that people in the software-patent-free world could simply patch their distribution if they wished.

Regards, Dawid
[HACKERS] Problems starting Template1...
Hello, I'm from Venezuela, and I've been making some modifications to PostgreSQL's catalog, but there seems to be a problem creating the template1 database. When the creation of the database starts, this is what happens:

  [EMAIL PROTECTED]:~ /home/luis/pgsql/bin/initdb -D /home/luis/pgsql/data/
  The files belonging to this database system will be owned by user luis.
  This user must also own the server process.
  The database cluster will be initialized with locale es_ES.UTF-8.
  The default database encoding has accordingly been set to UTF8.
  fixing permissions on existing directory /home/luis/pgsql/data ... ok
  creating directory /home/luis/pgsql/data/global ... ok
  creating directory /home/luis/pgsql/data/pg_xlog ... ok
  creating directory /home/luis/pgsql/data/pg_xlog/archive_status ... ok
  creating directory /home/luis/pgsql/data/pg_clog ... ok
  creating directory /home/luis/pgsql/data/pg_subtrans ... ok
  creating directory /home/luis/pgsql/data/pg_twophase ... ok
  creating directory /home/luis/pgsql/data/pg_multixact/members ... ok
  creating directory /home/luis/pgsql/data/pg_multixact/offsets ... ok
  creating directory /home/luis/pgsql/data/base ... ok
  creating directory /home/luis/pgsql/data/base/1 ... ok
  creating directory /home/luis/pgsql/data/pg_tblspc ... ok
  selecting default max_connections ... 100
  selecting default shared_buffers ... 1000
  creating configuration files ... ok
  creating template1 database in /home/luis/pgsql/data/base/1 ... FATAL: incorrect number of columns in row (expected 32, got 29)
  child process exited with exit code 1
  initdb: removing contents of data directory /home/luis/pgsql/data

I just added three more fields to the pg_class structure (pg_class.h), like this:

  ...
  bool     relhasfrequency;   /* has time frequency */
  NameData relfrequencytype;  /* table time frequency type (REAL_TIME, HISTORIC) */
  NameData relfrequency;      /* class time frequency value */

  #define Natts_pg_class_fixed        31
  #define Natts_pg_class              32
  #define Anum_pg_class_relname       1
  #define Anum_pg_class_relnamespace  2
  ...
  DATA(insert OID = 1259 ( pg_class PGNSP 83 PGUID 0 1259 0 0 0 0 0 f f r 32 0 0 0 0 0 t f f f f null 0 f _null_ ));
  ...

pg_class originally has 25 fields, and one of my friends added four more with no problems, but there seems to be a problem with the changes I've made, and we don't know which it is. I also made the following changes to pg_attribute.h:

  ...
  { 1259, {"relhasfrequency"},  16, -1, 1, 29, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \
  { 1259, {"relfrequencytype"}, 19, -1, NAMEDATALEN, 30, 0, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \
  { 1259, {"relfrequency"},     19, -1, NAMEDATALEN, 31, 0, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \
  { 1259, {"relacl"},           1034, -1, -1, 32, 1, -1, -1, false, 'x', 'i', false, false, false, true, 0 }
  ...
  DATA(insert ( 1259 relhasfrequency  16 -1 1 29 0 -1 -1 t p c t f f t 0));
  DATA(insert ( 1259 relfrequencytype 19 -1 NAMEDATALEN 30 0 -1 -1 f p i t f f t 0));
  DATA(insert ( 1259 relfrequency     19 -1 NAMEDATALEN 31 0 -1 -1 f p i t f f t 0));
  DATA(insert ( 1259 relacl 1034 -1 -1 32 1 -1 -1 f x i f f f t 0));
  ...

If anyone could help me with this I would be very grateful, because this is the final project of my degree... THANKS

-- Luis D. García M.  Telf: 0414-3482018  - FACYT - UC - Computación -
Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
If the decision to vacuum based on autovacuum criteria is good enough for you, then I think you should just focus on getting autovac to do what you want it to do. Perhaps you just need to decrease the sleep time to a few seconds, so that autovac will quickly detect when something needs to be vacuumed. The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time.

On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote:
> Ok, but my point is that autovacuum may conflict with a VACUUM ANALYZE command on another session. My intention with smartvacuum() is based on this. Any solution for this?
>
> Regards, Hitoshi Harada
>
> -----Original Message-----
> From: Tom Lane, Monday, October 23, 2006 11:10 AM
> To: Hitoshi Harada; Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
>
> Hitoshi Harada [EMAIL PROTECTED] writes:
>>> How is this different from what autovacuum does?
>> My application needs to do vacuum by itself, while autovacuum does it as a daemon. The database is updated so frequently that a normal vacuum costs too much, and the tables being updated are far fewer than the whole database being vacuumed. I want to use autovacuum minus the daemon aspect: I want to control when to vacuum and which table to vacuum. So nothing is different between autovacuum and smartvacuum() except that the former is a daemon and the latter is a user function.
> This seems completely unconvincing. What are you going to do that couldn't be done by autovacuum?

-- Jim Nasby [EMAIL PROTECTED]  EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [HACKERS] New CRC algorithm: Slicing by 8
Bruce, Tom, All:

> Given the numbers I posted earlier today, the proposal is dead in the water anyway, quite aside from any legal considerations.

Agreed. I just wanted to point out we have other sharks in the water. *If* Slice-by-8 turned out to be a winner, I could get the legal issues looked into and probably resolved. But we should only do that if it turns out we really want to use it.

-- Josh Berkus, PostgreSQL @ Sun, San Francisco
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 2006-10-23 at 13:52 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
>> Instead, I'd like to include a parameter to turn off CRC altogether, for heavily CPU bound operations and the WAL drive on trustworthy hardware.
> No can do --- we rely on the checksums to be able to tell when we've hit the end of WAL during replay.

No we don't: zero-length records are the trigger for EOF. Sure, a CRC failure will cause replay to end, but only if you calculate it and compare it to what's on disk - which is the bit I would turn off. We don't rely on that, however, so it is avoidable. In most cases it would be foolish to avoid; but there are cases where the data is CRC-checked by the hardware/system already, so I'd like to make an option to turn this off, defaulting to on, for safety.

> You may as well propose not writing WAL at all (and no, I don't think it'd pass).

That would undo all of my efforts, so no, I wouldn't consider that.

-- Simon Riggs, EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] New CRC algorithm: Slicing by 8
> In most cases, it would be foolish to avoid: but there are cases where the data is CRC checked by the hardware/system already, so I'd like to make an option to turn this off, defaulting to on, for safety.

How would we know? What are those cases? Sounds like a foot-gun to me.

Sincerely, Joshua D. Drake

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
Jim C. Nasby [EMAIL PROTECTED] writes:
> The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time.

It's conceivable that it'd make sense to allow multiple autovac processes running in parallel. (The infrastructure part of this is easy enough; the hard part is keeping them from all deciding to vacuum the same table.) One reason we have not done that already is the thought that multiple vacuum processes would suck too much I/O to be reasonable. Now you could dial back their resource demands with the cost-delay settings, but it's not clear that ten autovacs running at one-tenth speed are better than one autovac using all the cycles you can spare. Usually I think it's best if a vacuum transaction finishes as fast as it can. In any case, these exact same concerns would apply to manual vacuums or a combination of manual and auto vacuum.

regards, tom lane
[HACKERS] Tsearch2 index size
I am running version 8.1.2 and I installed 8.2B last week. I dumped the data from the old version into the new version. The db consists of several million records. Total disk usage is approximately 114GB. My two observations are as follows... (Keep in mind these are truly just observations; I didn't use any benchmarking tools. If someone can point me to a link of performance tools, I'd be happy to try them and report back!)

1. The release notes indicate more efficient vacuuming. However, both vacuums seem to take about the same amount of time, i.e. approx. 9 hours. Does "more efficient" simply mean less I/O/CPU busyness? This one doesn't really bother me; the next one does... Here are my vacuum parms; I used the same ones for both versions, of course:

  maintenance_work_mem = 40      # Unnecessarily high, I know.
                                 # I left it for comparison's sake.
  vacuum_cost_delay = 50
  vacuum_cost_page_hit = 1
  vacuum_cost_page_miss = 10
  vacuum_cost_page_dirty = 20
  vacuum_cost_limit = 2000

2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 910MB in 8.2! These numbers were taken right after a REINDEX. Normally I wouldn't care about physical index size, but this particular index is sitting on a ramdisk, so size really does matter. I see that the tsearch2 type was diddled with in 8.2. Is this an intentional change to improve tsearch2 performance?

Thank you for any advice or abuse you give. No. Wait. No abuse please.

Richard Whidden
Re: [HACKERS] New CRC algorithm: Slicing by 8
Simon Riggs [EMAIL PROTECTED] writes: On Mon, 2006-10-23 at 13:52 -0400, Tom Lane wrote: No can do --- we rely on the checksums to be able to tell when we've hit the end of WAL during replay. No we don't: Zero length records are the trigger for EOF. Only if the file happens to be all-zero already, which is not the normal operating state (see WAL-file recycling). Otherwise you have to be able to detect an invalid record. There are actually three checks used to detect end of WAL: zero record length, invalid checksum, and incorrect back-pointer. Zero length is the first and cleanest-looking test, but AFAICS we have to have both of the others to avoid obvious failure modes. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
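The three checks Tom lists can be sketched in C. This is an illustration only, not PostgreSQL's actual xlog code: WalRecordHdr, toy_checksum, and expected_prev are hypothetical stand-ins for the real XLogRecord header, the CRC, and the back-pointer bookkeeping.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a WAL record header. */
typedef struct
{
    uint32_t xl_len;   /* length of record payload; zero marks end of WAL */
    uint64_t xl_prev;  /* back-pointer to the previous record's position */
    uint32_t xl_crc;   /* checksum over the payload */
} WalRecordHdr;

/* Toy checksum so the sketch is self-contained; the real code uses CRC-32. */
static uint32_t
toy_checksum(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++)
        sum = sum * 31 + data[i];
    return sum;
}

/*
 * Apply the three end-of-WAL checks from the discussion: zero record
 * length, incorrect back-pointer, invalid checksum.  Returns 1 only if
 * the record passes all three.
 */
static int
record_is_valid(const WalRecordHdr *hdr, const uint8_t *payload,
                uint64_t expected_prev)
{
    if (hdr->xl_len == 0)
        return 0;                /* zero length: treat as end of WAL */
    if (hdr->xl_prev != expected_prev)
        return 0;                /* back-pointer does not chain correctly */
    if (hdr->xl_crc != toy_checksum(payload, hdr->xl_len))
        return 0;                /* checksum failure */
    return 1;
}
```

Zero length alone is not enough precisely because recycled WAL segments still contain old, superficially plausible records, which is why the checksum and back-pointer tests are also needed.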
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 23 Oct 2006, Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: So at this point I realize that intel's compiler is optimizing the loop away, at least for the std crc and probably for both. So I make mycrc an array of 2, and subscript mycrc[j & 1] in the loop. That's not a good workaround, because making mycrc expensive to access means your inner loop timing isn't credible at all. Instead try making the buffer array nonlocal --- malloc it, perhaps. regards, tom lane

That did not make any difference. The way I see it, the only way to convince the compiler it really needs to do this loop more than once is to make it think it is not overwriting the same variable every time. The subscript was the cheapest way I could think of to do that. Any other suggestions on how to do this are welcome.

-- I like being single. I'm always there when I need me. -- Art Leo

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New CRC algorithm: Slicing by 8
Jeremy Drake [EMAIL PROTECTED] writes: On Mon, 23 Oct 2006, Tom Lane wrote: That's not a good workaround, because making mycrc expensive to access means your inner loop timing isn't credible at all. Instead try making the buffer array nonlocal --- malloc it, perhaps. That did not make any difference. The way I see it, the only way to convince the compiler it really needs to do this loop more than once is to make it think it is not overwriting the same variable every time. The subscript was the cheapest way I could think of to do that. Any other suggestions on how to do this are welcome.

Hmm. Maybe store the CRCs into a global array somewhere?

    uint32    results[NTESTS];

    for ...
    {
        INIT/COMP/FIN_CRC32...
        results[j] = mycrc;
    }

This still adds a bit of overhead to the outer loop, but not much. Another possibility is to put the INIT/COMP/FIN_CRC32 into an external subroutine, thereby adding a call/return to the outer loop overhead.

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
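Fleshing out Tom's suggestion as a compilable sketch, with a trivial stand-in checksum() in place of the INIT/COMP/FIN_CRC32 macros (the names here are hypothetical): because results[] is a global whose contents the compiler must assume are observable, the benchmark loop cannot be eliminated as dead code.

```c
#include <stdint.h>

#define NTESTS   1000
#define TESTSIZE 64

/* Stand-in for the CRC computation being timed. */
static uint32_t
checksum(const volatile char *buf, int len)
{
    uint32_t c = 0;
    for (int i = 0; i < len; i++)
        c += (uint8_t) buf[i];
    return c;
}

/* Storing into a global array keeps every iteration "live":
 * the compiler cannot prove the stores are unobserved dead code. */
uint32_t results[NTESTS];

uint32_t
run_benchmark(const volatile char *buf)
{
    for (int j = 0; j < NTESTS; j++)
        results[j] = checksum(buf, TESTSIZE);
    return results[NTESTS - 1];
}
```

In a real harness the gettimeofday() calls would bracket run_benchmark(), as in the attached testcrc.c.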
Re: [HACKERS] Tsearch2 index size
[EMAIL PROTECTED] writes: 1. The release notes indicate more efficient vacuuming. However, both vacuums seems to take about the same amount of time, ie. approx: 9 hours. I think the improvements were only in btree index vacuuming, which it sounds like isn't your big problem. 2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 910MB in 8.2! FILLFACTOR? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] New CRC algorithm: Slicing by 8
On Mon, 23 Oct 2006, Tom Lane wrote: Hmm. Maybe store the CRCs into a global array somewhere?

    uint32    results[NTESTS];

    for ...
    {
        INIT/COMP/FIN_CRC32...
        results[j] = mycrc;
    }

This still adds a bit of overhead to the outer loop, but not much.

That seems to have worked.

                            Std crc        Slice-8 crc
Intel P4 Xeon 2.8Ghz (Gentoo, gcc-3.4.5, -O2)
8192 bytes                  26.765317      10.511143
1024 bytes                   3.357843       1.280890
64 bytes                     0.223213       0.103767

Intel P4 Xeon 2.8Ghz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)
8192 bytes                  29.495836       0.007107
1024 bytes                   3.708665       0.012183
64 bytes                     0.242579       0.008700

So the gcc times are reasonable, but the icc times for the slice-by-8 are still too fast to be believed. I will have to take a look at the generated assembly later and see what gives. My changed testcrc.c is attached, again.

-- I'd love to go out with you, but I did my own thing and now I've got to undo it.

#include "postgres.h"

#include <time.h>
#include <sys/time.h>

#include "pg_crc.h"

int
main()
{
    volatile char buffer[TESTSIZE];
    pg_crc32    results[NTESTS];
    pg_crc32    mycrc;
    int         j;
    struct timeval tstart;
    struct timeval tstop;

    srand(time(NULL));
    for (j = 0; j < TESTSIZE; ++j)
        buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

    gettimeofday(&tstart, NULL);

    for (j = 0; j < NTESTS; j++)
    {
        INIT_CRC32(mycrc);
        COMP_CRC32(mycrc, buffer, TESTSIZE);
        FIN_CRC32(mycrc);
        results[j] = mycrc;
    }

    gettimeofday(&tstop, NULL);

    if (tstop.tv_usec < tstart.tv_usec)
    {
        tstop.tv_sec--;
        tstop.tv_usec += 1000000;
    }

    printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
           TESTSIZE, NTESTS,
           (long) (tstop.tv_sec - tstart.tv_sec),
           (long) (tstop.tv_usec - tstart.tv_usec));

    return 0;
}

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
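For cross-checking a sliced implementation, it helps to have a known-good oracle. The sketch below is a bitwise, byte-at-a-time reflected CRC-32 using the 0xEDB88320 polynomial (to my understanding the same one pg_crc.h's table is built from; treat that as an assumption). Any slicing-by-8 variant should produce identical output for the same input.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32, polynomial 0xEDB88320: slow but obviously
 * correct, useful as an oracle for table-driven or sliced versions. */
static uint32_t
crc32_reference(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *) data;
    uint32_t crc = 0xFFFFFFFFu;               /* INIT */

    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)           /* one bit per step */
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;                 /* FIN: final inversion */
}
```

The standard check value for this CRC is crc32_reference("123456789", 9) == 0xCBF43926, which makes a quick sanity assert against any faster implementation easy.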
[HACKERS] COPY does not work with regproc and aclitem
I tried to use the COPY command to export and import tables from the catalog, but COPY has a problem with the data type regproc. See example:

create table test (like pg_aggregate);
copy pg_aggregate to '/tmp/pg_agg.out';
copy test from '/tmp/pg_agg.out';
ERROR: more than one function named "pg_catalog.avg"
CONTEXT: COPY test, line 1, column aggfnoid: "pg_catalog.avg"

The problem is that the pg_proc table has the following unique indexes:

pg_proc_oid_index UNIQUE, btree (oid)
pg_proc_proname_args_nsp_index UNIQUE, btree (proname, proargtypes, pronamespace)

and regprocin in backend/utils/adt/regproc.c cannot find a unique OID for the proname. A workaround is to use binary mode, but on the other hand aclitem is not supported in binary mode:

postgres=# copy pg_class to '/tmp/pg_class.out' binary;
ERROR: no binary output function available for type aclitem

The solution is for the COPY command to use the OID instead of the procname when exporting regproc.

Zdenek

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala wrote: I tried to use COPY command to export and import tables from catalog Is it just me or does this seem like a strange thing to want to do? I am trying to think of a good use case, so far without much success. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala wrote: I tried to use COPY command to export and import tables from catalog, but COPY command has problem with data type regproc. See example:

create table test (like pg_aggregate);
copy pg_aggregate to '/tmp/pg_agg.out';
copy test from '/tmp/pg_agg.out';
ERROR: more than one function named "pg_catalog.avg"
CONTEXT: COPY test, line 1, column aggfnoid: "pg_catalog.avg"

Hmm, maybe it should be using regprocedure instead? That one emits type-qualified function names, IIRC.

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] COPY does not work with regproc and aclitem
Andrew Dunstan wrote: Zdenek Kotala wrote: I tried to use COPY command to export and import tables from catalog Is it just me or does this seem like a strange thing to want to do? I am trying to think of a good use case, so far without much success. I'm playing with catalog upgrade. The very basic idea of my experiment is export data from catalog and import it back to the new initialized/fresh catalog. Zdenek ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] COPY does not work with regproc and aclitem
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, maybe it should be using regprocedure instead? Not unless you want to break initdb. The only reason regproc still exists, really, is to accommodate loading of pg_type during initdb. Guess what: we can't do type lookup at that point. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] COPY does not work with regproc and aclitem
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, maybe it should be using regprocedure instead? Not unless you want to break initdb. The only reason regproc still exists, really, is to accommodate loading of pg_type during initdb. Guess what: we can't do type lookup at that point.

I was thinking of the copied-out table, which doesn't necessarily have to be pg_aggregate. I just tried, and it works to do this:

alvherre=# create table pg_aggregate2 (aggfnoid regprocedure, aggtransfn
alvherre(# regprocedure, aggfinalfn regprocedure, aggsortop oid, aggtranstype oid,
alvherre(# agginitval text);
CREATE TABLE
alvherre=# insert into pg_aggregate2 select * from pg_aggregate;
INSERT 0 114
alvherre=# create table test (like pg_aggregate2);
CREATE TABLE
alvherre=# copy pg_aggregate2 to '/tmp/pg_agg.out';
COPY 114
alvherre=# copy test from '/tmp/pg_agg.out';
COPY 114
alvherre=#

-- Alvaro Herrera  http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala wrote: Andrew Dunstan wrote: Zdenek Kotala wrote: I tried to use COPY command to export and import tables from catalog Is it just me or does this seem like a strange thing to want to do? I am trying to think of a good use case, so far without much success. I'm playing with catalog upgrade. The very basic idea of my experiment is export data from catalog and import it back to the new initialized/fresh catalog. Fair enough, but I am somewhat doubtful that COPY is the best way to do this. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] COPY does not work with regproc and aclitem
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, maybe it should be using regprocedure instead? Not unless you want to break initdb. The only reason regproc still exists, really, is to accommodate loading of pg_type during initdb. Guess what: we can't do type lookup at that point.

Do you mean something like this?

Datum
regprocout(PG_FUNCTION_ARGS)
{
    ...
    if (donot_resolve_procname == TRUE)
    {
        result = (char *) palloc(NAMEDATALEN);
        snprintf(result, NAMEDATALEN, "%u", proid);
    }
    ...
    PG_RETURN_CSTRING(result);
}

donot_resolve_procname will be set when COPY is performed.

Zdenek

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala [EMAIL PROTECTED] writes: I'm playing with catalog upgrade. The very basic idea of my experiment is export data from catalog and import it back to the new initialized/fresh catalog. That is never going to work, at least not for any interesting catalogs. A system with a fresh (I assume you mean empty) pg_proc, for instance, is non functional. A much bigger problem, if you're thinking of this as a component step of pg_upgrade, is that you can't use anything at the COPY level of detail because it will fail if the new version wants a different catalog layout --- for instance, if someone's added a column to the catalog. The right way to implement pg_upgrade is to transfer the catalog data at the SQL-command level of abstraction, ie, pg_dump -s and reload. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala [EMAIL PROTECTED] writes:

    if (donot_resolve_procname == TRUE)
    {
        result = (char *) palloc(NAMEDATALEN);
        snprintf(result, NAMEDATALEN, "%u", proid);
    }

What for? If you want numeric OIDs you can have that today by casting the column to OID. More to the point, the issue is hardly restricted to COPY --- you'd get the same thing if you tried to insert data with INSERT.

regards, tom lane

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] COPY does not work with regproc and aclitem
Tom Lane wrote: Zdenek Kotala [EMAIL PROTECTED] writes: I'm playing with catalog upgrade. The very basic idea of my experiment is export data from catalog and import it back to the new initialized/fresh catalog. That is never going to work, at least not for any interesting catalogs. A system with a fresh (I assume you mean empty) pg_proc, for instance, is non functional.

Not empty; freshly initialized by initdb. I want to copy only the user data which is not created during bootstrap.

A much bigger problem, if you're thinking of this as a component step of pg_upgrade, is that you can't use anything at the COPY level of detail because it will fail if the new version wants a different catalog layout --- for instance, if someone's added a column to the catalog.

Yes, I know about it. It is not a problem: I want to prepare a shadow catalog with the new structure on the old database, in a separate schema, and adjust the data in these tables. After that I want to do the final COPY - the data will be copied with the correct structure.

The right way to implement pg_upgrade is to transfer the catalog data at the SQL-command level of abstraction, ie, pg_dump -s and reload.

I'm not sure if it is important, but I think that preserving OIDs is important, and the SQL level does not allow setting the OID.

Zdenek

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] COPY does not work with regproc and aclitem
Zdenek Kotala wrote: Tom Lane wrote: The right way to implement pg_upgrade is to transfer the catalog data at the SQL-command level of abstraction, ie, pg_dump -s and reload. I'm not sure if it is important, but I think that preserve OID is important and SQL level does not allow set OID. Does it matter in any case other than where it refers to an on-disk object? And does that need anything other than a fixup to pg_class::relfilenode? cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Tsearch2 index size
2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 910MB in 8.2! FILLFACTOR?

Tom,

Of course! I had it in my head that fillfactor had to be explicitly set. But then, after RTFM, it looks like there are defaults! Thank you!

One more inane question, though. The default fillfactors are currently #define'd. Is there or will there be a way to set the default fillfactor, using a system table perhaps? I realize this is just a convenience, not integral to fillfactor functionality... I sniffed around the docs and mailing lists but I could not find an answer. My apologies if this question has been answered already.

Richard Whidden

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] COPY does not work with regproc and aclitem
Andrew Dunstan [EMAIL PROTECTED] writes: Zdenek Kotala wrote: I'm not sure if it is important, but I think that preserve OID is important and SQL level does not allow set OID. Does it matter in any case other than where it refers to an on-disk object? And does that need anything other than a fixup to pg_class::relfilenode? The only things pg_upgrade should be trying to preserve OIDs for are large objects. I don't even see a need to worry about relfilenode: you've got to link the physical files into the new directory tree anyway, you can perfectly well link them in under whatever new relfilenode identity happens to be assigned during the dump-reload step. This was all worked out years ago. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New CRC algorithm: Slicing by 8
Benny Amorsen wrote: MK == Mark Kirkwood [EMAIL PROTECTED] writes: MK Here are the results after building gcc 4.1.2 (repeating results MK for gcc 3.4.6 for comparison). I suspect that performance is MK probably impacted because gcc 4.1.2 (and also the rest of the MK tool-chain) is built with gcc 3.4.6 - but it certainly suggests MK that the newer gcc versions don't like the slice-8 algorithm for MK some reason. They don't seem to like the old CRC algorithms either. It is quite strange, such dramatic performance regressions from 3.x to 4.x are rare. Right - I think the regression is caused by libc and kernel being built with gcc 3.4.6 and the test program being built with gcc 4.1.2. Rebuilding *everything* with 4.1.2 (which I'm not sure is possible for FreeBSD at the moment) would probably get us back to numbers that looked more like my Gentoo ones [1]. Cheers Mark [1] Note that the upgrade process for switching Gentoo from gcc 3.4 to 4.1 involves precisely this - build 4.1, then rebuild everything using 4.1 (including 4.1 itself!) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New CRC algorithm: Slicing by 8
Mark Kirkwood [EMAIL PROTECTED] writes: Right - I think the regression is caused by libc and kernel being built with gcc 3.4.6 and the test program being built with gcc 4.1.2. Why do you think that? The performance of the CRC loop shouldn't depend at all on either libc or the kernel, because they're not invoked inside the loop. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time. It's conceivable that it'd make sense to allow multiple autovac processes running in parallel. (The infrastructure part of this is easy enough, the hard part is keeping them from all deciding to vacuum the same table.) One reason we have not done that already is the thought that multiple vacuum processes would suck too much I/O to be reasonable. Now you could dial back their resource demands with the cost-delay settings, but it's not clear that ten autovacs running at one-tenth speed are better than one autovac using all the cycles you can spare. Usually I think it's best if a vacuum transaction finishes as fast as it can.

I think this is one of the reasons table-specific delay settings were designed in from the beginning. I think the main use cases for multiple vacuums at once are:

1) Vacuum per tablespace, assuming each tablespace is on a different drive with its own I/O.

2) The frequently updated table that can't wait to be vacuumed while a large table is being vacuumed.

In this case, if you set a system default delay setting and a more aggressive table-specific delay setting for your hot-spot tables, then multiple vacuums become a clear win. This is an important case that I hope we handle soon. At this point it's one of the main failings of the current autovacuum system.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New CRC algorithm: Slicing by 8
Tom Lane wrote: Mark Kirkwood [EMAIL PROTECTED] writes: Right - I think the regression is caused by libc and kernel being built with gcc 3.4.6 and the test program being built with gcc 4.1.2. Why do you think that? The performance of the CRC loop shouldn't depend at all on either libc or the kernel, because they're not invoked inside the loop.

Just came to the same conclusion myself... I've got gcc 3.4.6 on Gentoo, so I tried using that, and observed no regression at all there for the std case - which pretty much rules out any issues connected with what I built the kernel + libc with vs what compiler I am using now:

Intel P-III 1.26Ghz (Gentoo 2006.1)
                Std crc        Slice-8 crc
8192 bytes      12.967997      28.363876    (gcc 4.1.1)
8192 bytes      12.956765      13.978495    (gcc 3.4.6)

So Gentoo using gcc 3.4.6 looks like FreeBSD using 3.4.6, so the std vs slice-8 performance seems to be all about compiler version. Now the regression on FreeBSD for the std case might be (as Kenneth pointed out) due to 4.1.1 being built by 3.4.6, but I guess it is just a nit at this point.

Cheers Mark

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time. It's conceivable that it'd make sense to allow multiple autovac processes running in parallel. (The infrastructure part of this is easy enough, the hard part is keeping them from all deciding to vacuum the same table.)

It might be worth creating a generic framework that prevents multiple vacuums from hitting a table at once, autovac or not.

One reason we have not done that already is the thought that multiple vacuum processes would suck too much I/O to be reasonable. Now you could dial back their resource demands with the cost-delay settings, but it's not clear that ten autovacs running at one-tenth speed are better than one autovac using all the cycles you can spare. Usually I think it's best if a vacuum transaction finishes as fast as it can.

There are other things that would benefit from having some idea of what IO resources are available. For example, having a separate bgwriter (or reader) for each set of physical volumes. So a means of grouping tablespaces wouldn't hurt.

In any case, these exact same concerns would apply to manual vacuums or a combination of manual and auto vacuum.

Well, the advantage of manual vacuums is that you can tune things to utilize multiple arrays...

-- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
Jim C. Nasby wrote: On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: The only case I can think of where autovac might not work as well as smartvacuum would be if you had a lot of databases in the cluster, since autovacuum will only vacuum one database at a time. It's conceivable that it'd make sense to allow multiple autovac processes running in parallel. (The infrastructure part of this is easy enough, the hard part is keeping them from all deciding to vacuum the same table.) It might be worth creating a generic framework that prevents multiple vacuums from hitting a table at once, autovac or not.

That one is easy, because vacuum gets a lock on the affected table that conflicts with itself. The problem is that the second vacuum would actually wait for the first to finish. A naive idea is to use ConditionalLockAcquire, and if it fails just skip the table.

One reason we have not done that already is the thought that multiple vacuum processes would suck too much I/O to be reasonable. Now you could dial back their resource demands with the cost-delay settings, but it's not clear that ten autovacs running at one-tenth speed are better than one autovac using all the cycles you can spare. Usually I think it's best if a vacuum transaction finishes as fast as it can.

In the scenario where one table is huge and another is very small, it can certainly be useful to vacuum the small table several times while the huge one has only been vacuumed once. For that you definitively need the ability to run parallel vacuums.

-- Alvaro Herrera  http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
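The ConditionalLockAcquire idea has the same contract as a mutex trylock: attempt the lock, and on failure skip the table instead of queueing behind the other vacuum. A small pthread illustration (hypothetical names; PostgreSQL's lock manager is not built on pthread mutexes):

```c
#include <pthread.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if we acquired the lock and "vacuumed" the table,
 * 0 if another worker held it and we skipped rather than waited. */
static int
maybe_vacuum_table(void)
{
    if (pthread_mutex_trylock(&table_lock) != 0)
        return 0;               /* busy: skip this table for now */

    /* ... vacuum work would happen here ... */

    pthread_mutex_unlock(&table_lock);
    return 1;
}
```

A worker would loop over its candidate tables calling this, revisiting any skipped tables on its next pass rather than blocking mid-cycle.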
[HACKERS] Bug related to out of memory condition
This behavior exists in 8.1.4 and CVS HEAD. I list below my preexisting schema, a set of commands that behave as I expect (and result in an ERROR), and a similar set of commands that do not behave as I expect (and result in a PANIC). Note the position of BEGIN in each. This is quite finicky behavior. Apparently the columns in the crashme table have a significant effect on the results. If you have trouble reproducing this, I can give more system details. However, it behaves this way on a fresh install of CVS HEAD, and it's 100% reproducible (for me, anyway).

Regards, Jeff Davis

--- EXISTING SCHEMA ---
CREATE TABLE r1( i INT PRIMARY KEY );
INSERT INTO r1 VALUES(1);
CREATE TABLE r2( i INT PRIMARY KEY );
INSERT INTO r2 VALUES(1);
CREATE TABLE r3( i INT PRIMARY KEY );
INSERT INTO r3 VALUES(1);
CREATE TABLE r4( i INT PRIMARY KEY );
INSERT INTO r4 VALUES(1);

First, here is the correct behavior:

--- CORRECT BEHAVIOR ---
crashme=> CREATE TABLE crashme (
crashme(>     attr1    SERIAL8 PRIMARY KEY,
crashme(>     attr2    TIMESTAMPTZ DEFAULT NOW(),
crashme(>     attr3    TIMESTAMPTZ,
crashme(>     attr4    INT REFERENCES r1(i),
crashme(>     attr5    INT REFERENCES r2(i),
crashme(>     attr6    INT REFERENCES r3(i),
crashme(>     attr7    INT REFERENCES r4(i),
crashme(>     attr8    TEXT
crashme(> );
NOTICE: CREATE TABLE will create implicit sequence "crashme_attr1_seq" for serial column "crashme.attr1"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "crashme_pkey" for table "crashme"
CREATE TABLE
crashme=> BEGIN;
BEGIN
crashme=> INSERT INTO crashme (attr1,attr2,attr3,attr4,attr5,attr6,attr7,attr8) SELECT NEXTVAL('crashme_attr1_seq'),NOW(),NOW(),1,1,1,1,'t1' FROM generate_series(1,500);
ERROR: out of memory
DETAIL: Failed on request of size 32.
crashme=>

--- UNEXPECTED BEHAVIOR ---
crashme=> BEGIN;
BEGIN
crashme=> CREATE TABLE crashme (
crashme(>     attr1    SERIAL8 PRIMARY KEY,
crashme(>     attr2    TIMESTAMPTZ DEFAULT NOW(),
crashme(>     attr3    TIMESTAMPTZ,
crashme(>     attr4    INT REFERENCES r1(i),
crashme(>     attr5    INT REFERENCES r2(i),
crashme(>     attr6    INT REFERENCES r3(i),
crashme(>     attr7    INT REFERENCES r4(i),
crashme(>     attr8    TEXT
crashme(> );
NOTICE: CREATE TABLE will create implicit sequence "crashme_attr1_seq" for serial column "crashme.attr1"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "crashme_pkey" for table "crashme"
CREATE TABLE
crashme=> INSERT INTO crashme (attr1,attr2,attr3,attr4,attr5,attr6,attr7,attr8) SELECT NEXTVAL('crashme_attr1_seq'),NOW(),NOW(),1,1,1,1,'t1' FROM generate_series(1,500);
WARNING: AbortTransaction while in ABORT state
WARNING: AbortTransaction while in ABORT state
WARNING: AbortTransaction while in ABORT state
ERROR: out of memory
DETAIL: Failed on request of size 32.
ERROR: out of memory
DETAIL: Failed on request of size 27.
ERROR: out of memory
DETAIL: Failed on request of size 27.
ERROR: out of memory
DETAIL: Failed on request of size 27.
ERROR: out of memory
DETAIL: Failed on request of size 24.
PANIC: ERRORDATA_STACK_SIZE exceeded
server closed the connection unexpectedly
        This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Replication documentation addition
Here is my first draft of a new replication section for our documentation. I am looking for any comments.

---

Replication
===========

Database replication allows multiple computers to work together, making them appear as a single computer to user applications. This might involve allowing a backup server to take over if the primary server fails, or it might involve allowing several computers to work together at the same time.

It would be ideal if database servers could be combined seamlessly. Web servers serving static web pages can be combined quite easily by merely load-balancing web requests to multiple machines. In fact, most read-only servers can be combined relatively easily. Unfortunately, most database servers have a read/write mix of requests, and read/write servers are much harder to combine. This is because though read-only data needs to be placed on each server only once, a write to any server has to be seen by all other servers so that future read requests to those servers return consistent results. This sync problem is the fundamental difficulty of doing database replication.

Because there is no single solution that limits the impact of the sync problem for all workloads, there are multiple replication solutions. Each solution addresses the sync problem in a different way, and minimizes its impact for a specific workload. This section first outlines two important replication capabilities, and then outlines various replication solutions.

Synchronous vs. Asynchronous Replication
----------------------------------------

The term synchronous replication means that a query is not considered committed unless all servers have access to the committed records. In that case, a failover to a backup server will lose no data records. Asynchronous replication has a small delay between the time of commit and its propagation to backup servers, opening the possibility that some transactions might be lost in a switch to a backup server. Asynchronous replication is used when synchronous replication would be too slow.

Full vs. Partial Replication
----------------------------

The term full replication means only a full database cluster can be replicated, while partial replication means more fine-grained control over replicated objects is possible.

Shared Disk Failover
--------------------

This replication solution avoids the sync problem by having only one copy of the database. This is possible because a single disk array is shared by multiple servers. If the main database server fails, the backup server is able to mount and start the database as though it were restarting after a database crash. This shared hardware functionality is common in network storage devices. This allows synchronous, full replication.

Warm Standby Using Point-In-Time Recovery
-----------------------------------------

A warm standby server (add doc xref) can be kept current by reading a stream of WAL records. If the main server fails, the warm standby contains almost all of the data of the main server, and can be used as the new database server. This allows asynchronous, full replication.

Point-In-Time Recovery [Asynchronous, Full]
-------------------------------------------

Point-In-Time Recovery is the same as a warm standby server except that the standby server must go through a full restore and archive recovery operation, delaying how quickly it can be used as the main database server. This allows asynchronous, full replication.

Continuously Running Failover Server
------------------------------------

A continuously running failover server allows the backup server to answer read-only queries while the master server is running. It receives a continuous stream of write activity from the master server. Because the failover server can be used for read-only database requests, it is ideal for data warehouse queries. Slony offers this as asynchronous, partial replication.

Data Partitioning
-----------------

Data partitioning splits the database into data sets. To achieve replication, each data set can be modified by only one server. For example, data can be partitioned by main office, e.g. London and Paris. While the London and Paris servers have all data records, only London can modify London records, and Paris can modify only Paris records. Such partitioning is usually accomplished in application code, though rules and triggers can help enforce such partitioning and keep the read-only data sets current. Slony can also be used in such a setup. While Slony replicates only entire tables, London and Paris can be placed in separate tables, and inheritance can be used to pull from both tables at the same time.

Query Broadcast Replication
---------------------------

This involves sending write queries to multiple servers. Read-only queries can be sent to a single server because there is no need for all servers to process them. This can be complex to set up
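The query broadcast idea in the draft can be illustrated with a small sketch. This is not how any real middleware (e.g. Pgpool) is implemented; it is a hypothetical routing layer, assuming each server object exposes an `execute()` method, with write detection done by a naive prefix check:

```python
import random

class BroadcastRouter:
    """Hypothetical sketch of query broadcast replication: write queries
    are sent to every server, read-only queries to a single server."""

    # Naive write detection for illustration only; a real router would
    # need to parse SQL properly.
    WRITE_PREFIXES = ("insert", "update", "delete", "create", "drop", "alter")

    def __init__(self, servers):
        self.servers = servers

    def is_write(self, sql):
        return sql.strip().lower().startswith(self.WRITE_PREFIXES)

    def execute(self, sql):
        if self.is_write(sql):
            # A write must be seen by all servers so that later reads,
            # wherever they land, return consistent results.
            return [server.execute(sql) for server in self.servers]
        # A read-only query can be answered by any single server.
        return random.choice(self.servers).execute(sql)

class FakeServer:
    """Stand-in for a database connection; records INSERTs in a list."""
    def __init__(self):
        self.rows = []

    def execute(self, sql):
        if sql.strip().lower().startswith("insert"):
            self.rows.append(sql)
            return "INSERT 0 1"
        return list(self.rows)

servers = [FakeServer(), FakeServer(), FakeServer()]
router = BroadcastRouter(servers)
router.execute("INSERT INTO t VALUES (1)")  # broadcast to all three servers
result = router.execute("SELECT * FROM t")  # served by one server
```

One complication this sketch glosses over, and part of why the draft calls the approach complex to set up, is that statements containing non-deterministic expressions (e.g. `random()` or `CURRENT_TIMESTAMP`) can produce different results on each server when broadcast as text.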
Re: [HACKERS] [DOCS] Replication documentation addition
Please disregard. I am redoing it and will post a URL with the most recent version.

---

Bruce Momjian wrote:
> Here is my first draft of a new replication section for our documentation. I am looking for any comments.
> [...]
[HACKERS] Replication documentation addition
Here is a new replication documentation section I want to add for 8.2:

ftp://momjian.us/pub/postgresql/mypatches/replication

Comments welcomed.

--
  Bruce Momjian  [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +