Re: [PERFORM] New to PostgreSQL, performance considerations
On Dec 11, 2006, at 04:35 , Tatsuo Ishii wrote: That's not the whole story. UTF-8 and other variable-width encodings don't provide a 1:1 mapping of logical characters to single bytes; in particular, combining characters open the possibility of multiple different byte sequences mapping to the same code point; therefore, string comparison in such encodings generally cannot be done at the byte level (unless, of course, you first ascertain that the strings involved are all normalized to an unambiguous subset of your encoding). Can you tell me which such encodings are supported by PostgreSQL other than UTF-8? http://www.postgresql.org/docs/8.1/interactive/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED Alexander. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
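The combining-character problem described above can be illustrated outside PostgreSQL. A minimal Python sketch (Python used purely for illustration; PostgreSQL implements this in C): two strings that render as the same logical character "é" have different UTF-8 byte sequences until they are normalized.

```python
import unicodedata

# "é" as a single precomposed code point vs. "e" + U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e9"
combining = "e\u0301"

# Byte-level comparison fails: the UTF-8 encodings differ...
assert precomposed.encode("utf-8") != combining.encode("utf-8")
# ...even though both render as the same logical character.

# After normalizing both to NFC, byte-level comparison works again.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", combining)
assert nfc_a.encode("utf-8") == nfc_b.encode("utf-8")
```

This is exactly the "normalized to an unambiguous subset" precondition mentioned above: only after normalization can the comparison fall back to plain byte equality.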
Re: [PERFORM] Postgresql - Threshold value.
On 12/11/06, Ravindran G - TLS, Chennai. [EMAIL PROTECTED] wrote: Hello, How to get the PostgreSQL threshold value? Any commands available? What is meant by threshold value?
Re: [PERFORM] Postgresql - Threshold value.
Hi, try using: tmp=# show all; and tmp=# show geqo_threshold; Regards, Kaloyan Iliev Ravindran G - TLS, Chennai. wrote: Hello, How to get the PostgreSQL threshold value? Any commands available? Regards, Ravi DISCLAIMER The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or HCL or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any mail and attachments please check them for viruses and defects.
Re: [PERFORM] SQL_CALC_FOUND_ROWS in POSTGRESQL / Some one can
On Mon, 2006-12-11 at 17:01 +1100, Chris wrote: Mark Kirkwood wrote: Chris wrote: It's the same as doing a select count(*) type query using the same clauses, but all in one query instead of two. It doesn't return any extra rows on top of the limit query, so it's better than using pg_numrows, which runs the whole query and returns it to php (in this example). Their docs explain it: http://dev.mysql.com/doc/refman/4.1/en/information-functions.html See FOUND_ROWS() Note that from the same page: If you are using SELECT SQL_CALC_FOUND_ROWS, MySQL must calculate how many rows are in the full result set. However, this is faster than running the query again without LIMIT, because the result set need not be sent to the client. Yes, but not any faster than a select count(*) from (full query without LIMIT), so the only advantage to the SQL_CALC_FOUND_ROWS thingie is that instead of doing select count(*) from full-query; select * from query-with-LIMIT, which will do the query twice, but possibly with different optimisations, you would do a non-standard select SQL_CALC_FOUND_ROWS query-with-LIMIT; select FOUND_ROWS(), which will do one full query, without any LIMIT optimisation, but with the same number of round-trips and the same amount of data over the line. The only case where the second way may be more effective is if no LIMIT optimisation can be made, and the dataset is larger than file buffer space, so that there is no effect from caching. gnari
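The portable two-query pattern described above (one LIMITed page query plus one count(*) with the same WHERE clause) can be sketched with any SQL engine. A minimal illustration using SQLite through Python's standard library; the table, data, and WHERE clause are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t (val) VALUES (?)",
                 [(f"row{i}",) for i in range(25)])  # ids 1..25

# Query 1: fetch one page of results with LIMIT.
page = conn.execute(
    "SELECT id, val FROM t WHERE id > 5 ORDER BY id LIMIT 10").fetchall()

# Query 2: total row count for the same WHERE clause, without the LIMIT.
total = conn.execute(
    "SELECT count(*) FROM t WHERE id > 5").fetchone()[0]

assert len(page) == 10   # one page
assert total == 20       # ids 6..25 match the predicate
```

As the post notes, the planner may optimise the two queries differently; MySQL's SQL_CALC_FOUND_ROWS trades that for a single full scan.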
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Alexander Staubo [EMAIL PROTECTED] wrote: On Dec 11, 2006, at 02:47 , Daniel van Ham Colchete wrote: I never understood what the matter is between the ASCII/ISO-8859-1/UTF-8 charsets to a database. They're all simple C strings that don't have the zero byte in the middle (like UTF-16 would) and that don't require any different processing unless you are doing a case-insensitive search (then you would have a problem). That's not the whole story. UTF-8 and other variable-width encodings don't provide a 1:1 mapping of logical characters to single bytes; in particular, combining characters open the possibility of multiple different byte sequences mapping to the same code point; therefore, string comparison in such encodings generally cannot be done at the byte level (unless, of course, you first ascertain that the strings involved are all normalized to an unambiguous subset of your encoding). PostgreSQL's use of strings is not limited to string comparison. Substring extraction, concatenation, regular expression matching, up/downcasing, tokenization and so on are all part of PostgreSQL's small library of text manipulation functions, and all deal with logical characters, meaning they must be Unicode-aware. Alexander. You're right. I was thinking only about my cases, which take Unicode normalization for granted and don't use regexps/tokenization/... Thanks Best Daniel
Re: [PERFORM] New to PostgreSQL, performance considerations
Hi Dave, On 12/11/06, Dave Cramer [EMAIL PROTECTED] wrote: Hi Daniel On 10-Dec-06, at 8:02 PM, Daniel van Ham Colchete wrote: Hi Gene, in my postgresql.conf, the only non-commented lines are: fsync = off This can, and will, result in lost data. I know... If there is a power failure things can happen. I know, but the performance difference is really, really big. I just have to decide if I'm willing to take that chance or not. lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' How much memory does this machine have and what version of postgresql are you using? It's only a test server with 512MB RAM; I only used it to see how well PostgreSQL would do in an ugly case. Daniel
Re: [PERFORM] New to PostgreSQL, performance considerations
On 11-Dec-06, at 5:36 AM, Daniel van Ham Colchete wrote: Hi Dave, On 12/11/06, Dave Cramer [EMAIL PROTECTED] wrote: Hi Daniel On 10-Dec-06, at 8:02 PM, Daniel van Ham Colchete wrote: Hi Gene, in my postgresql.conf, the only non-commented lines are: fsync = off This can, and will, result in lost data. I know... If there is a power failure things can happen. I know, but the performance difference is really, really big. I just have to decide if I'm willing to take that chance or not. lc_messages = 'C' lc_monetary = 'C' lc_numeric = 'C' lc_time = 'C' How much memory does this machine have and what version of postgresql are you using? It's only a test server with 512MB RAM; I only used it to see how well PostgreSQL would do in an ugly case. Given that optimal performance for postgresql can require up to 50% of available memory, you are going to leave the OS with 256MB of memory? Dave Daniel
Re: [PERFORM] New to PostgreSQL, performance considerations
Mike, unfortunately I don't have any benchmarks right now. Doing something like this with the same machine would be a 2-day job (at least). Installing Gentoo and getting it to run well is not a quick task (although it is easy). But, trust me on this one. It's worth it. Think of this: PostgreSQL and GNU libc use a lot of complex algorithms: btrees, hashes, checksums, string functions, etc... And you have a lot of ways to compile them into binary code. Now you have Pentium 4's vectorization that allows you to run plenty of instructions in parallel, but AMD doesn't have this. Intel also has SSE2, which makes double-precision floating-point operations a lot faster; AMD also doesn't have this (at least on 32 bits). Now imagine that you're Red Hat and that you have to deliver one CD to AMD and Intel servers. That means you can't use any AMD-specific or Intel-specific technology at the binary level. Things can get worse. If you take a look at GCC's code (at gcc/config/i386/i386.c), you'll see that GCC knows what runs faster on each processor. Let me give an example with the FDIV and FSQRT instructions (costs relative to an ADD):
ARCH: FDIV / FSQRT
i386: 88 / 122
i486: 73 / 83
pentium: 39 / 70
k6: 56 / 56
Athlon: 24 / 35
K8: 19 / 35
Pentium4: 43 / 43
Nocona: 40 / 44
Imagine that you are GCC and that you have two options in front of you: you can use FSQRT, or FDIV plus 20 ADD/SUBs. On a Pentium you should choose the second option, but on a Pentium 4 or an Athlon you should choose the first. This was only an example; nowadays you would have to choose between: 387, 3DNow!, 3DNow!+, SSE, SSE2, ... With this info, GCC knows how to choose the best way of doing things for each processor. When you are compiling for a generic i586 (like Fedora and Red Hat do), you are using the Pentium's timings. An example that I know of: it's impossible to run my software at a highly demanding customer's site without compiling it for their processor (I make 5 builds on every release). 
Using Intel's compiler for Intel's processors makes things even faster, but it is not free, and how much faster really depends on how your application is coded. With 64-bit processors, AMD and Intel restarted the process and everyone has SSE2 (but not SSE3). Even so, the timings are also very different. Best regards Daniel On 12/11/06, Michael Stone [EMAIL PROTECTED] wrote: On Sun, Dec 10, 2006 at 11:02:44PM -0200, Daniel van Ham Colchete wrote: I'm using Gentoo Linux, so all my libraries (including glibc, which is very important to PostgreSQL) and all my software are compiled with good CFLAGS options for my processor (-O2 -march=athlon-xp (...)). My Linux is not an Intel-AMD binary-compatible turtle like Fedora/RedHat/SUSE/... It's really important to have your glibc compiled for your processor. It is essential for performance. Please, point to the benchmarks that demonstrate this for a postgres application. Mike Stone
Re: [PERFORM] New to PostgreSQL, performance considerations
How much memory does this machine have and what version of postgresql are you using? It's only a test server with 512MB RAM; I only used it to see how well PostgreSQL would do in an ugly case. Given that optimal performance for postgresql can require up to 50% of available memory, you are going to leave the OS with 256MB of memory? If that were the case I wouldn't have any problem letting the OS use only 256MB, but this is not my production server. My production server isn't built yet. It'll have at least 2GB of memory. But it's good to know anyway.
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 09:05:56AM -0200, Daniel van Ham Colchete wrote: But, trust me on this one. It's worth it. You know what? I don't. Think of this: PostgreSQL and GNU libc use a lot of complex algorithms: btrees, hashes, checksums, string functions, etc... And you have a lot of ways to compile them into binary code. Now you have Pentium 4's vectorization that allows you to run plenty of instructions in parallel, but AMD doesn't have this. Intel also has SSE2, which makes double-precision floating-point operations a lot faster; AMD also doesn't have this (at least on 32 bits). Athlon 64 has SSE2, also in 32-bit mode. Of course, it doesn't really matter, since the instant you hit the disk even once, it's going to take a million cycles, and any advantage you got from saving single cycles is irrelevant. Imagine that you are GCC and that you have two options in front of you: you can use FSQRT, or FDIV plus 20 ADD/SUBs. Could you please describe a reasonable case where GCC would have such an option? I cannot imagine any. An example that I know of: it's impossible to run my software at a highly demanding customer's site without compiling it for their processor (I make 5 builds on every release). What's your software? How can you make such assertions without backing them up? How can you know that the same holds for PostgreSQL? As Mike said, point to the benchmarks showing this essential difference between -O2 and -O2 -mcpu=pentium4 (or whatever). The only single worthwhile difference I can think of is that glibc can use the SYSENTER instruction if it knows you have a 686 or higher (which includes AMD), and with recent kernels, I'm not even sure if that is needed anymore. /* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 09:05:56AM -0200, Daniel van Ham Colchete wrote: unfortunately I don't have any benchmarks right now. That's fairly normal for gentoo users pushing their compile options. Mike Stone
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 09:05:56AM -0200, Daniel van Ham Colchete wrote: But, trust me on this one. It's worth it. You know what? I don't. So test it yourself. Think of this: PostgreSQL and GNU libc use a lot of complex algorithms: btrees, hashes, checksums, string functions, etc... And you have a lot of ways to compile them into binary code. Now you have Pentium 4's vectorization that allows you to run plenty of instructions in parallel, but AMD doesn't have this. Intel also has SSE2, which makes double-precision floating-point operations a lot faster; AMD also doesn't have this (at least on 32 bits). Athlon 64 has SSE2, also in 32-bit mode. It's true. But I'm not saying that Postfix is faster on AMD or Intel systems. I'm saying that it's a lot faster once you compile Postfix and your glibc for your processor. AMD also has features that Intel systems don't: 3DNow! for example. The fact is that if your distro is compatible with a plain Athlon, you can use neither SSE nor SSE2. Of course, it doesn't really matter, since the instant you hit the disk even once, it's going to take a million cycles and any advantage you got from saving single cycles is irrelevant. Really??? We're talking about high-performance systems and every case is different. I once saw a DDR2-RAM-based storage (like 1TB). Before you say it: I don't understand how it works, but you won't lose your data on a reboot or power failure. It was very expensive but it really solved the IO bottleneck. Even when your bottleneck is the IO, it still makes no sense to waste CPU resources unnecessarily. Imagine that you are GCC and that you have two options in front of you: you can use FSQRT, or FDIV plus 20 ADD/SUBs. Could you please describe a reasonable case where GCC would have such an option? I cannot imagine any. As I said, it is an example. Take floating-point divisions. You have plenty of ways of doing them: 387, MMX, SSE, 3DNow!, etc... 
Here GCC has to make a choice. And this is only one case. Usually, compiler optimizations are really complex and the processor's timings count a lot. At every optimization the compiler needs to measure the quickest path, so it uses information on how the processor will run the code. If you take a look at AMD's docs you will see that their SSE2 implementation is internally different from Intel's. So, sometimes the quickest path uses SSE2 and sometimes it doesn't. You also have to count the cost of moving between SSE registers and common ones. If you still can't imagine any case, you can read Intel's assembler reference. You'll see that there are a lot of ways of doing a lot of things. An example that I know of: it's impossible to run my software at a highly demanding customer's site without compiling it for their processor (I make 5 builds on every release). What's your software? How can you make such assertions without backing them up? How can you know that the same holds for PostgreSQL? As Mike said, point to the benchmarks showing this essential difference between -O2 and -O2 -mcpu=pentium4 (or whatever). The only single worthwhile difference I can think of is that glibc can use the SYSENTER instruction if it knows you have a 686 or higher (which includes AMD), and with recent kernels, I'm not even sure if that is needed anymore. Steinar, you should really test it. I won't read the PostgreSQL source to point you to where it could use SSE or SSE2 or whatever. And I won't read glibc's code. You don't need to believe what I'm saying. You can read GCC's docs, Intel's assembler reference, and AMD's docs about their processor and about how different that arch is. Best regards, Daniel Colchete
Re: [PERFORM] New to PostgreSQL, performance considerations
This is very very very true :-)! I just remembered one case with MySQL. When I changed the distro from Conectiva 10 (an rpm-based, discontinued Brazilian distro) to Gentoo, a MySQL operation that usually took 2 minutes to run ended in 47 seconds. This is absolutely vague. I have no way to prove it to you. The old situation doesn't even exist anymore because I used the same hardware for the upgrade. And I can't measure how much each factor helped: compiling glibc and MySQL with good CFLAGS, rebuilding my database in an ordered way, newer kernel, etc.. All I know is that this process still runs in less than 1 minute (my database is larger now). I used the very same hardware: P4 3.0GHz, SATA disk, without RAID. And I only upgraded because Conectiva's support for their version 10 ended and I need to keep my system up to date with security patches. Best, Daniel On 12/11/06, Michael Stone [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 09:05:56AM -0200, Daniel van Ham Colchete wrote: unfortunately I don't have any benchmarks right now. That's fairly normal for gentoo users pushing their compile options. Mike Stone
Re: [PERFORM] New to PostgreSQL, performance considerations
Oops! [EMAIL PROTECTED] (Daniel van Ham Colchete) was seen spray-painting on a wall: But, trust me on this one. It's worth it. No, the point of performance analysis is that you *can't* trust the people that say trust me on this one. If you haven't got a benchmark where you can demonstrate a material and repeatable difference, then you're just some Gentoo speed racer making things up. -- select 'cbbrowne' || '@' || 'acm.org'; http://linuxdatabases.info/info/wp.html One last point about metaphor, poetry, etc. As an example to illustrate these capabilities in Sastric Sanskrit, consider the bahuvrihi construct (literally man with a lot of rice) which is used currently in linguistics to describe references outside of compounds.
Re: [PERFORM] New to PostgreSQL, performance considerations
You are right, Christopher. Okay. Let's settle this matter. What PostgreSQL benchmark software should I use? I'll test PostgreSQL 8.1 on Fedora Core 6 and on Gentoo. I'll get the same version FC6 uses and install it on my Gentoo. I'll use the same hardware (different partitions for each). I'm not saying that Gentoo is faster than FC6. I just want to prove that if you compile your software to make better use of your processor, it will run faster. It might take a few days because I'm pretty busy right now at my job. Best regards, Daniel On 12/11/06, Christopher Browne [EMAIL PROTECTED] wrote: Oops! [EMAIL PROTECTED] (Daniel van Ham Colchete) was seen spray-painting on a wall: But, trust me on this one. It's worth it. No, the point of performance analysis is that you *can't* trust the people that say trust me on this one. If you haven't got a benchmark where you can demonstrate a material and repeatable difference, then you're just some Gentoo speed racer making things up.
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 11:09:13AM -0200, Daniel van Ham Colchete wrote: You know what? I don't. So test it yourself. You're making the claims; you're supposed to be proving them... As I said, it is an example. Take floating-point divisions. You have plenty of ways of doing them: 387, MMX, SSE, 3DNow!, etc... Here GCC has to make a choice. No, you don't. MMX, SSE and 3DNow! will all give you the wrong result (reduced precision). SSE2, on the other hand, has double-precision floats, so you might have a choice there -- except that PostgreSQL doesn't really do a lot of floating-point anyhow. And this is only one case. Usually, compiler optimizations are really complex and the processor's timings count a lot. You keep asserting this, with no good backing. If you still can't imagine any case, you can read Intel's assembler reference. You'll see that there are a lot of ways of doing a lot of things. I've been programming x86 assembler for ten years or so... Steinar, you should really test it. I won't read the PostgreSQL source to point you to where it could use SSE or SSE2 or whatever. And I won't read glibc's code. Then you should stop making these sorts of wild claims. You don't need to believe what I'm saying. You can read GCC's docs, Intel's assembler reference, AMD's docs about their processor and about how different that arch is. I have. /* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 11:09:13AM -0200, Daniel van Ham Colchete wrote: You know what? I don't. So test it yourself. You're making the claims; you're supposed to be proving them... As I said, it is an example. Take floating-point divisions. You have plenty of ways of doing them: 387, MMX, SSE, 3DNow!, etc... Here GCC has to make a choice. No, you don't. MMX, SSE and 3DNow! will all give you the wrong result (reduced precision). SSE2, on the other hand, has double-precision floats, so you might have a choice there -- except that PostgreSQL doesn't really do a lot of floating-point anyhow. And this is only one case. Usually, compiler optimizations are really complex and the processor's timings count a lot. You keep asserting this, with no good backing. If you still can't imagine any case, you can read Intel's assembler reference. You'll see that there are a lot of ways of doing a lot of things. I've been programming x86 assembler for ten years or so... So, I'm a newbie to you. I learned x86 assembler last year. Steinar, you should really test it. I won't read the PostgreSQL source to point you to where it could use SSE or SSE2 or whatever. And I won't read glibc's code. Then you should stop making these sorts of wild claims. You don't need to believe what I'm saying. You can read GCC's docs, Intel's assembler reference, AMD's docs about their processor and about how different that arch is. I have. /* Steinar */ Okay, I'll do the benchmarks. I just sent an e-mail about this to the list. If you have any suggestions on how to make the benchmark, please let me know. I like it when I prove myself wrong. Although it's much better when I'm right :-)... Best regards, Daniel Colchete
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 11:17:06AM -0200, Daniel van Ham Colchete wrote: I just remembered one case with MySQL. When I changed the distro from Conectiva 10 (an rpm-based, discontinued Brazilian distro) to Gentoo, a MySQL operation that usually took 2 minutes to run ended in 47 seconds. How do you know that this improvement had _anything_ to do with the use of different optimization flags? Were even the MySQL versions or configuration the same? This is absolutely vague. Indeed it is. I have no way to prove it to you. No, but you should stop making this sort of absolutely essential claim if you can't. And I can't measure how much each factor helped: compiling glibc and MySQL with good CFLAGS, rebuilding my database in an ordered way, newer kernel, etc.. Exactly. So why are you attributing it to the first factor only? And why do you think this would carry over to PostgreSQL? Remember, anecdotal evidence isn't. /* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 11:31:48AM -0200, Daniel van Ham Colchete wrote: What PostgreSQL benchmark software should I use??? Look up the list archives; search for TPC. I'll test PostgreSQL 8.1 on Fedora Core 6 and on Gentoo. I'll get the same version FC6 uses and install it on my Gentoo. I'll use the same hardware (different partitions for each). Why do you want to compare FC6 and Gentoo? Wasn't your point that the -march= was supposed to be the relevant factor here? In that case, you want to keep all other things equal; so use the same distribution, only with -O2 -march=i686 vs. -march=athlon-xp (or whatever). /* Steinar */ -- Homepage: http://www.sesse.net/
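A controlled run along the lines suggested here might look like the following sketch: build the same PostgreSQL source twice, changing only the -march flag, and benchmark each build the same way. The tarball version, install prefixes, and pgbench parameters below are illustrative assumptions, not from the thread.

```sh
# Build A: generic i686 baseline (paths and version are placeholders).
tar xzf postgresql-8.1.5.tar.gz && cd postgresql-8.1.5
CFLAGS="-O2 -march=i686" ./configure --prefix=/usr/local/pg-generic
make && make install

# Build B: identical in every way except the target CPU.
make distclean
CFLAGS="-O2 -march=athlon-xp" ./configure --prefix=/usr/local/pg-native
make && make install

# Benchmark each build identically; repeat several runs and alternate
# the order to average out cache and background-load effects.
/usr/local/pg-generic/bin/pgbench -i -s 10 bench
/usr/local/pg-generic/bin/pgbench -c 10 -t 1000 bench
```

Keeping the distribution, kernel, and configuration constant isolates the compiler-flag variable, which is exactly the objection raised to the FC6-vs-Gentoo comparison.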
Re: [PERFORM] Postgresql - Threshold value.
Thank you very much for your reply. This is not working in PostgreSQL 8.1.4. It's throwing some error. BTW, how to fetch the total disk space that is allotted to the DB in PostgreSQL? Any commands available? Regards, Ravi -Original Message- From: Kaloyan Iliev [mailto:[EMAIL PROTECTED] Sent: Monday, December 11, 2006 3:04 PM To: Ravindran G - TLS, Chennai.; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Postgresql - Threshold value. Hi, try using: tmp=# show all; and tmp=# show geqo_threshold; Regards, Kaloyan Iliev Ravindran G - TLS, Chennai. wrote: Hello, How to get the PostgreSQL threshold value? Any commands available? Regards, Ravi
Re: [PERFORM] Postgresql - Threshold value.
On Mon, 11.12.2006 at 19:41:29 +0530, Ravindran G - TLS, Chennai. wrote: Thank you very much for your reply. This is not working in Postgresql 8.1.4. Its throwing some error. Which errors?
test=# show geqo_threshold;
 geqo_threshold
----------------
 12
(1 row)
test=*# select version();
 version
---------------------------------------------------------------------------------------
 PostgreSQL 8.1.4 on i386-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.5 (Debian 1:3.3.5-13)
(1 row)
BTW, how to fetch the total disk space that is allotted to the DB in PostgreSQL? Any commands available? Yes, RTFM. http://www.postgresql.org/docs/8.1/interactive/functions-admin.html - pg_database_size(name) Regards, Ravi -Original Message- Please, no silly fullquote below your text. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: - Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Re: [PERFORM] Postgresql - Threshold value.
Thanks. I am using Postgres 8.1.4 on Windows 2000 and I don't get the proper response for the threshold. - pg_database_size(name) - gives the current space used by the DB. But I want the total space utilized by the DB and the free space. Regards, Ravi -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of A. Kretschmer Sent: Monday, December 11, 2006 7:55 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Postgresql - Threshold value. On Mon, 11.12.2006 at 19:41:29 +0530, Ravindran G - TLS, Chennai. wrote: Thank you very much for your reply. This is not working in Postgresql 8.1.4. Its throwing some error. Which errors? test=# show geqo_threshold; geqo_threshold 12 (1 row) test=*# select version(); version PostgreSQL 8.1.4 on i386-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.5 (Debian 1:3.3.5-13) (1 row) BTW, how to fetch the total disk space that is allotted to the DB in PostgreSQL? Any commands available? Yes, RTFM. http://www.postgresql.org/docs/8.1/interactive/functions-admin.html - pg_database_size(name) Regards, Ravi -Original Message- Please, no silly fullquote below your text. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: - Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
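On the disk-space question raised above: the 8.1 admin functions report the space a database or relation currently uses; free space on the volume is an OS-level question (e.g. df on Unix, drive properties on Windows), not something PostgreSQL tracks. A sketch, where 'mydb' and 'mytable' are placeholder names:

```sql
-- Bytes currently used by one database, raw and human-readable:
SELECT pg_database_size('mydb');
SELECT pg_size_pretty(pg_database_size('mydb'));

-- Size of one table including its indexes and TOAST data:
SELECT pg_size_pretty(pg_total_relation_size('mytable'));
```

There is no "total allotted" size to query: a database grows on demand until the filesystem runs out of room.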
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Daniel van Ham Colchete [EMAIL PROTECTED] wrote: But, trust me on this one. It's worth it. Think of this: PostgreSQL and GNU LibC use a lot of complex algorithms: btree, hashes, checksums, string functions, etc... And you have a lot of ways to compile it into binary code. Now you have Pentium4's vectorization that allows you to run plenty of instructions in parallel, but AMD doesn't have this. Intel also has SSE2, which makes double-precision floating-point operations a lot faster; AMD also doesn't have this (at least on 32 bits). Now imagine that you're RedHat and that you have to deliver one CD to AMD and Intel servers. That means you can't use any AMD-specific or Intel-specific technology at the binary level. AMD processors since the K6-2, and I think Intel ones since the P-Pro, are essentially RISC processors with a hardware microcode compiler that translates and reorganizes instructions on the fly. Instruction choice and ordering was extremely important in older 32-bit architectures (like the 486) but is much less important these days. I think you will find that while an optimized glibc might be faster in specific contrived cases, the whole is unfortunately less than the sum of its parts. While SSE2 might be able to optimize things like video decoding and the like, for most programs it's of little benefit and IMO a waste of time. Also, as others pointed out, things like cache hits/misses and I/O considerations are actually much more important than instruction execution speed. We ran Gentoo here for months and did not find it fast enough to merit the bleeding-edge quirks it has for production environments. If you dig assembly, there was an interesting tackle of the spinlocks code on the hackers list last year IIRC. merlin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Low throughput of binary inserts from windows to linux
On 12/9/06, Tom Lane [EMAIL PROTECTED] wrote: Axel Waggershauser [EMAIL PROTECTED] writes: ... This works quite well for the following setups (client -> server): linux -> linux, linux -> windows, windows -> windows, but pretty badly (meaning about 10 times slower) for this setup: windows -> linux. This has to be a network-level problem. IIRC, there are some threads in our archives discussing possibly-related performance issues seen with Windows' TCP stack. Don't recall details, but I think in some cases the problem was traced to third-party add-ons to the Windows stack. You might want to check just what you're running there. I searched the archives but found nothing really enlightening regarding my problem. One large thread regarding win32 was related to a psql problem with multiple open handles; other mails referred to a QoS patch, but I could not find more specific information. I thought about firewall or virus-scanning software myself, but I can't really see why such software should distinguish between a Windows remote host and a Linux remote host. Furthermore, downloading is fast on all setups; it's just uploading from Windows to Linux which is slow. I repeated my test with a vanilla Windows 2000 machine (incl. tons of Microsoft hot-fixes) and it exposes the same problem. I'm out of ideas here; maybe someone could try to reproduce this behavior or could point me to the thread containing relevant information (sorry, maybe I'm just too dumb :-/) Thanks, Axel ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Low throughput of binary inserts from windows to linux
I'm out of ideas here, maybe someone could try to reproduce this behavior or could point me to the thread containing relevant information (sorry, maybe I'm just too dumb :-/) please specify how you're transferring the data from windows -> linux. are you using odbc? if yes, what driver? are you using FQDN server names or a plain ip address? etc etc. - thomas ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Low throughput of binary inserts from windows to linux
On 12/11/06, Thomas H. [EMAIL PROTECTED] wrote: I'm out of ideas here, maybe someone could try to reproduce this behavior or could point me to the thread containing relevant information (sorry, maybe I'm just too dumb :-/) please specify how you're transferring the data from windows -> linux. are you using odbc? if yes, what driver? are you using FQDN server names or a plain ip address? etc etc. You may take a look at my first mail (starting this thread); there you find my_lo.c attached, containing the complete code. I use libpq. The connection is established like this: conn = PQsetdbLogin(argv[1], "5432", NULL, NULL, argv[2], argv[3], argv[4]); I called the test program with the plain IP address of the server machine. Axel ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Postgresql - Threshold value.
On 12/11/06, Ravindran G - TLS, Chennai. [EMAIL PROTECTED] wrote: Thanks. I am using Postgres 8.1.4 in windows 2000 and i don't get the proper response for threshold. what is the response you get ? please be specific about the issues. also the footer that comes with your emails are not appreciated by many people. if possible pls avoid it. Regds mallah. - ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Low throughput of binary inserts from windows to linux
Axel Waggershauser [EMAIL PROTECTED] writes: On 12/9/06, Tom Lane [EMAIL PROTECTED] wrote: This has to be a network-level problem. IIRC, there are some threads in our archives discussing possibly-related performance issues seen with Windows' TCP stack. I searched the archives but found nothing really enlightening regarding my problem. One large thread regarding win32 was related to a psql problem related to multiple open handles, other mails referred to a QoS patch but I could not find more specific information. Yeah, that's what I couldn't think of the other day. The principal report was here: http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php By default, Windows XP installs the QoS Packet Scheduler service. It is not installed by default on Windows 2000. After I installed QoS Packet Scheduler on the Windows 2000 machine, the latency problem vanished. Now he was talking about a local connection not remote, but it's still something worth trying. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] New to PostgreSQL, performance considerations
[EMAIL PROTECTED] (Daniel van Ham Colchete) writes: You are right Christopher. Okay. Let's solve this matter. What PostgreSQL benchmark software should I use??? pgbench is one option. There's a TPC-W at pgFoundry (http://pgfoundry.org/projects/tpc-w-php/). There's the Open Source Database Benchmark (http://osdb.sourceforge.net/). Those are three reasonable options. I'll test PostgreSQL 8.1 on a Fedora Core 6 and on a Gentoo. I'll get the same version FC6 uses and install it at my Gentoo. I'll use the same hardware (different partitions for each). Wrong approach. You'll be comparing apples to oranges, because Gentoo and Fedora pluck sources from different points in the source code stream. In order to prove what you want to prove, you need to run the benchmarks all on Gentoo, where you run with 4 categorizations: 1. Where you run PostgreSQL and GLIBC without any processor-specific optimizations 2. Where you run PostgreSQL and GLIBC with all relevant processor-specific optimizations 3. Where you run PostgreSQL with, and GLIBC without, processor-specific optimizations 4. Where you run PostgreSQL without, and GLIBC with, processor-specific optimizations That would allow one to clearly distinguish which optimizations are particularly relevant. I'm not saying that Gentoo is faster than FC6. I just want to prove that if you compile your software to make better use of your processor, it will run faster. It might take a few days because I'm pretty busy right now at my job. I expect that you'll discover, if you actually do these tests, that this belief is fairly much nonsense. - Modern CPUs do a huge amount of on-CPU self-tuning. - CPU features that could have a material effect tend to be unusable when compiling general purpose libraries and applications. GCC doesn't generate MMX-like instructions. - Database application performance tends to be I/O driven. 
- When database application performance *isn't* I/O driven, it is likely to be driven by cache management, which compiler options won't affect. -- output = reverse(ofni.secnanifxunil @ enworbbc) http://linuxdatabases.info/info/sgml.html very few people approach me in real life and insist on proving they are drooling idiots. -- Erik Naggum, comp.lang.lisp ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Postgresql - Threshold value.
Rajesh Kumar Mallah [EMAIL PROTECTED] writes: On 12/11/06, Ravindran G - TLS, Chennai. [EMAIL PROTECTED] wrote: I am using Postgres 8.1.4 in windows 2000 and i don't get the proper response for threshold. what is the response you get ? please be specific about the issues. Even more to the point, what sort of response do you think you should get? What kind of threshold are you talking about, and why do you think it even exists in Postgres? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Low throughput of binary inserts from windows to
On Mon, Dec 11, 2006 at 8:58 AM, in message [EMAIL PROTECTED], Axel Waggershauser [EMAIL PROTECTED] wrote: I'm out of ideas here, maybe someone could try to reproduce this behavior or could point me to the thread containing relevant information No guarantees that this is the problem, but I have seen similar issues in other software because of delays introduced in the TCP stack by the Nagle algorithm. Turning on TCP_NODELAY has solved such problems. I don't know if PostgreSQL is vulnerable to this, or how it would be fixed in a PostgreSQL environment, but it might give you another avenue to search. -Kevin ---(end of broadcast)--- TIP 6: explain analyze is your friend
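Kevin's TCP_NODELAY suggestion can be tried from application code without touching PostgreSQL itself. As a minimal sketch (in Java, the language of the client applications discussed in this thread; the loopback listener merely stands in for a real server, so this shows the API, not a fix for Axel's setup), the JDK exposes Nagle's algorithm directly via Socket#setTcpNoDelay:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NoDelayDemo {
    // Connect to the given endpoint with Nagle's algorithm disabled,
    // so small writes are sent immediately instead of being coalesced.
    public static Socket connectNoDelay(InetAddress host, int port) throws IOException {
        Socket s = new Socket(host, port);
        s.setTcpNoDelay(true); // sets TCP_NODELAY on the underlying socket
        return s;
    }

    public static void main(String[] args) throws IOException {
        // Demonstrate against a loopback listener instead of a real server.
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = connectNoDelay(listener.getInetAddress(), listener.getLocalPort())) {
            System.out.println("TCP_NODELAY = " + s.getTcpNoDelay());
        }
    }
}
```

Whether this helps depends on where the delay actually occurs; if the stall is inside the Windows TCP stack (as Tom's QoS-scheduler reference suggests), a client-side socket option may not reach it.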
[PERFORM] Looking for hw suggestions for high concurrency OLTP app
Hi all, I'd like to get suggestions from all of you out there for a new PostgreSQL server that will replace an existing one. My performance analysis shows very *low* iowaits, and very high loads at times of peak system activity. The average number of concurrent processes is 3-4, with peaks of 10-15. *Sustained* system load varies from 1.5 to 4, while peak load reaches 20 and above, always with low iowait%. I see this as a clear sign of a need for more processor power. I'm aware of the context-switching storm problem, but here the cs stays well under 50,000, so I think it's not the problem here. Current machine is an Acer Altos R700 (2 Xeon 2.8 GHz, 4 Gb RAM), similar to this one (R710): http://www.acer.co.uk/acereuro/page9.do?sp=page4dau34.oid=7036UserCtxParam=0GroupCtxParam=0dctx1=17CountryISOCtxParam=UKLanguageISOCtxParam=enctx3=-1ctx4=United+Kingdomcrc=334044639 So, I'm looking for advice on a mid-range OLTP server with the following (flexible) requirements: - 4 physical processors expandable to 8 (dual core preferred), either Intel or AMD - 8 Gb RAM exp. to at least 32 Gb - Compatibility with Acer S300 External storage enclosure. The old server uses that, and it is boxed with 15krpm hdds in RAID-10, which do their job perfectly well. The enclosure is connected via 2 x LSI Logic PCI U320 controllers - On-site, same-day hardware support contract - Rack mount Machines we're evaluating currently include: - Acer Altos R910 - Sun Fire V40Z - HP Integrity RX4640 - IBM eServer 460 Experiences with these machines? Other suggestions? Thanks. -- Cosimo ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] really quick multiple inserts can use COPY?
Hello! In our JAVA application we do multiple inserts to a table with data from a HashMap. Due to a poor database access implementation - done by another company (we got the job to enhance the software) - we cannot use prepared statements. (We are not allowed to change code at database access!) First, we tried to fire one INSERT statement per record to insert. This costs 3 ms per row, which is too slow because normally we insert 10,000 records, which results in 30,000 ms just for inserts. for(){ sql = "INSERT INTO tblfoo(foo,bar) VALUES(" + it.next() + "," + CONST.BAR + ");"; } I was searching for a quicker way - MSSQL offers Array Inserts - in PostgreSQL. The only solution seemed to be INSERT INTO foo SELECT ..., and this is really dirty. I improved the inserts using the subselect with union. sql = "INSERT INTO tblfoo(foo,bar) "; for(){ sql += "SELECT " + it.next() + "," + CONST.BAR + " UNION ..."; } This results in a really long INSERT INTO ... SELECT ... UNION statement that works correctly and quickly but looks dirty. When I heard about COPY I thought this would be the right way. But it does not work using JDBC. Testing via psql does it perfectly, but sending the same SQL statements via JDBC throws an error. - BEGIN sql = "COPY tblfoo(foo,bar) FROM STDIN;\n1 'foobar'\n2 'foobar'\n\\."; - COMMIT ERROR: syntax error at or near "1" at character 34 So, my questions: Is it possible to use COPY FROM STDIN with JDBC? Will it bring performance improvement compared to the SELECT UNION solution? many thanks in advance, Jens Schipkowski -- ** APUS Software GmbH ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] really quick multiple inserts can use COPY?
Jens Schipkowski [EMAIL PROTECTED] schrieb: Hello! In our JAVA application we do multiple inserts to a table with data from a HashMap. Due to a poor database access implementation - done by another company (we got the job to enhance the software) - we cannot use prepared statements. (We are not allowed to change code at database access!) First, we tried to fire one INSERT statement per record to insert. This costs 3 ms per row, which is too slow because normally we insert 10,000 records, which results in 30,000 ms just for inserts. Can you change this from INSERT statements to COPY? COPY is *much* faster than INSERT. If not, are you doing all INSERTs in one transaction? I believe a 'BEGIN' and a 'COMMIT' around all INSERTs may increase the speed. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) If I was god, I would recompile penguin with --enable-fly. (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889° ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] really quick multiple inserts can use COPY?
So, my questions: Is it possible to use COPY FROM STDIN with JDBC? Should be. It's at least possible using DBI and DBD::Pg (perl): my $copy_sth = $dbh->prepare("COPY general.datamining_mailing_lists (query_id,email_key) FROM STDIN;"); $copy_sth->execute(); while (my ($email_key) = $fetch_sth->fetchrow_array()) { $dbh->func("$query_id\t$email_key\n", 'putline'); } $fetch_sth->finish(); $dbh->func("\\.\n", 'putline'); $dbh->func('endcopy'); $copy_sth->finish(); Some JDBC expert could tell better how it's done with JDBC. Will it bring performance improvement compared to the SELECT UNION solution? COPY is quite fast. Regds mallah. many thanks in advance, Jens Schipkowski -- ** APUS Software GmbH ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] really quick multiple inserts can use COPY?
Jens Schipkowski [EMAIL PROTECTED] writes: Is it possible to use COPY FROM STDIN with JDBC? You should be asking the pgsql-jdbc list, not here. (I know I've seen mention of a JDBC patch to support COPY, but I dunno if it's made it into any official version.) Will it bring performance improvement compared to SELECT UNION solution? Please, at least be smart enough to use UNION ALL not UNION. If you're using 8.2 you could also consider using INSERT with multiple VALUES-lists. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
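Tom's 8.2 multi-row VALUES suggestion fits the poster's constraint of plain string-built SQL. A hedged sketch, reusing the tblfoo example from the thread - note that real code must escape or validate interpolated values, which bare concatenation does not do:

```java
import java.util.List;

public class MultiValuesInsert {
    // Build one INSERT ... VALUES (...),(...),... statement
    // (multi-row VALUES lists require PostgreSQL 8.2 or later).
    public static String buildInsert(List<Integer> foos, int bar) {
        StringBuilder sql = new StringBuilder("INSERT INTO tblfoo(foo,bar) VALUES ");
        for (int i = 0; i < foos.size(); i++) {
            if (i > 0) sql.append(',');
            sql.append('(').append(foos.get(i)).append(',').append(bar).append(')');
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // One round trip and one parse instead of N single-row INSERTs.
        System.out.println(buildInsert(List.of(1, 2, 3), 42));
    }
}
```

The resulting statement is sent through a plain java.sql.Statement, so the "no prepared statements" restriction is respected; on 8.1 and earlier the UNION ALL form Tom mentions is the fallback.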
Re: [PERFORM] really quick multiple inserts can use COPY?
Jens Schipkowski jens.schipkowski 'at' apus.co.at writes: Hello! In our JAVA application we do multiple inserts to a table with data from a HashMap. Due to a poor database access implementation - done by another company (we got the job to enhance the software) - we cannot use prepared statements. (We are not allowed to change code at database access!) First, we tried to fire one INSERT statement per record to insert. This costs 3 ms per row, which is too slow because normally we insert 10,000 records, which results in 30,000 ms just for inserts. for(){ sql = "INSERT INTO tblfoo(foo,bar) VALUES(" + it.next() + "," + CONST.BAR + ");"; } You should try to wrap that into a single transaction. PostgreSQL waits for I/O write completion for each INSERT, as each is implicitly in its own transaction. Maybe the added performance would be satisfactory for you. -- Guillaume Cottenceau Create your personal SMS or WAP Service - visit http://mobilefriends.ch/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
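Guillaume's advice, sketched in Java under the thread's constraints (plain Statement, no prepared statements): wrap the whole loop in one explicit transaction so the server syncs once at COMMIT instead of once per row. The insertSql helper mirrors the original loop; a real run obviously needs a live Connection, which this sketch only takes as a parameter:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class BatchedInserts {
    // Build one single-row INSERT, exactly as in the original loop.
    public static String insertSql(int foo, int bar) {
        return "INSERT INTO tblfoo(foo,bar) VALUES(" + foo + "," + bar + ");";
    }

    // Run all INSERTs inside one explicit transaction.
    public static void insertAll(Connection conn, List<Integer> foos, int bar) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // implicit BEGIN: no per-row commit/fsync
        try (Statement st = conn.createStatement()) {
            for (int foo : foos) {
                st.executeUpdate(insertSql(foo, bar));
            }
            conn.commit(); // one sync for the whole batch
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }

    public static void main(String[] args) {
        System.out.println(insertSql(1, 42));
    }
}
```

This is the JDBC spelling of Andreas's earlier 'BEGIN'/'COMMIT' suggestion; it changes only how the statements are grouped, not the statements themselves.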
Re: [PERFORM] New to PostgreSQL, performance considerations
This definitely is the correct approach. Actually, Daniel van Ham Colchete may not be as all wet as some around here think. We've had previous data that shows that pg can become CPU bound (see previous posts by Josh Berkus and others regarding CPU overhead in what should be IO bound tasks). In addition, we know that x86 compatible 64b implementations differ enough between AMD and Intel products that it sometimes shows on benches. Evidence outside the DBMS arena supports the hypothesis that recent CPUs are needing more hand-holding and product specific compiling, not less, compared to their previous versions. Side Note: I wonder what if anything pg could gain from using SWAR instructions (SSE*, MMX, etc)? I'd say the fairest attitude is to do everything we can to support having the proper experiments done w/o presuming the results. Ron Peacetree At 10:47 AM 12/11/2006, Chris Browne wrote: In order to prove what you want to prove, you need to run the benchmarks all on Gentoo, where you run with 4 categorizations: 1. Where you run PostgreSQL and GLIBC without any processor-specific optimizations 2. Where you run PostgreSQL and GLIBC with all relevant processor-specific optimizations 3. Where you run PostgreSQL with, and GLIBC without processor-specific optimizations 4. Where you run PostgreSQL without, and GLIBC with processor-specific optimizations That would allow one to clearly distinguish which optimizations are particularly relevant. I'm not saying that Gentoo is faster than FC6. I just want to prove that if you compile your software to make better use of your processor, it will run faster. It might take a few days because I'm pretty busy right now at my job. I expect that you'll discover, if you actually do these tests, that this belief is fairly much nonsense. - Modern CPUs do a huge amount of on-CPU self-tuning. - CPU features that could have a material effect tend to be unusable when compiling general purpose libraries and applications. 
GCC doesn't generate MMX-like instructions. - Database application performance tends to be I/O driven. - When database application performance *isn't* I/O driven, it is likely to be driven by cache management, which compiler options won't affect. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 12:15:51PM -0500, Ron wrote: I'd say the fairest attitude is to do everything we can to support having the proper experiments done w/o presuming the results. Who's presuming results?[1] It is fair to say that making extraordinary claims without any evidence should be discouraged. It's also fair to say that if there are specific things that need cpu-specific tuning they'll be fairly limited critical areas (e.g., locks) which would probably be better implemented with a hand-tuned code and runtime cpu detection than by magical mystical compiler invocations. Mike Stone [1] I will say that I have never seen a realistic benchmark of general code where the compiler flags made a statistically significant difference in the runtime. There are some particularly cpu-intensive codes, like some science simulations or encoding routines where they matter, but that's not the norm--and many of those algorithms already have hand-tuned versions which will outperform autogenerated code. You'd think that with all the talk that the users of certain OS's generate about CFLAG settings, there'd be some well-published numbers backing up the hype. At any rate if there were numbers to back the claim then I think they could certainly be considered without prejudice. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] File Systems Compared
On Wed, Dec 06, 2006 at 08:55:14 -0800, Mark Lewis [EMAIL PROTECTED] wrote: Anyone run their RAIDs with disk caches enabled, or is this akin to having fsync off? Disk write caches are basically always akin to having fsync off. The only time a write-cache is (more or less) safe to enable is when it is backed by a battery or in some other way made non-volatile. So a RAID controller with a battery-backed write cache can enable its own write cache, but can't safely enable the write-caches on the disk drives it manages. This appears to be changing under Linux. Recent kernels have write barriers implemented using cache flush commands (which some drives ignore, so you need to be careful). In very recent kernels, software raid using raid 1 will also handle write barriers. To get this feature, you are supposed to mount ext3 file systems with the barrier=1 option. For other file systems, the parameter may need to be different. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
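For reference, the barrier option Bruno mentions is passed at mount time; a hypothetical /etc/fstab entry for an ext3 data volume might look like this (the device and mount point are placeholders, not taken from the thread):

```
/dev/md0   /var/lib/postgresql   ext3   defaults,barrier=1   0 2
```

Other filesystems may spell the option differently, and some drives silently ignore the underlying cache-flush commands, so this should be verified against the documentation for the specific kernel and hardware in use.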
Re: [PERFORM] New to PostgreSQL, performance considerations
Statements like these can not be reasonably interpreted in any manner _except_ that of presuming the results: I expect that you'll discover, if you actually do these tests, that this belief (that using arch specific compiler options lead to better performing SW) is fairly much nonsense. ...IMO a waste of time... etc The correct objective response to claims w/o evidence is to request evidence, and to do everything we can to support it being properly gathered. Not to try to discourage the claimant from even trying by ganging up on them with multiple instances of Argument From Authority or variations of Ad Hominem attacks. (The validity of the claim has nothing to do with the skills or experience of the claimant or anyone else in the discussion. Only on the evidence.) It is a tad unfair and prejudicial to call claims that CPU optimizations matter to the performance of DB product extraordinary. Evidence outside the DBMS field exists; and previous posts here show that pg can indeed become CPU-bound during what should be IO bound tasks. At the moment, Daniel's claims are not well supported. That is far different from being extraordinary given the current circumstantial evidence. Let's also bear in mind that as a community project, we can use all the help we can get. Driving potential resources away is in opposition to that goal. [1] The evidence that arch specific flags matter to performance can be found as easily as recompiling your kernel or your compiler. While it certainly could be argued how general purpose such SW is, the same could be said for just about any SW at some level of abstraction. Ron Peacetree At 12:31 PM 12/11/2006, Michael Stone wrote: On Mon, Dec 11, 2006 at 12:15:51PM -0500, Ron wrote: I'd say the fairest attitude is to do everything we can to support having the proper experiments done w/o presuming the results. Who's presuming results?[1] It is fair to say that making extraordinary claims without any evidence should be discouraged. 
It's also fair to say that if there are specific things that need cpu-specific tuning they'll be fairly limited critical areas (e.g., locks) which would probably be better implemented with a hand-tuned code and runtime cpu detection than by magical mystical compiler invocations. Mike Stone [1] I will say that I have never seen a realistic benchmark of general code where the compiler flags made a statistically significant difference in the runtime. There are some particularly cpu-intensive codes, like some science simulations or encoding routines where they matter, but that's not the norm--and many of those algorithms already have hand-tuned versions which will outperform autogenerated code. You'd think that with all the talk that the users of certain OS's generate about CFLAG settings, there'd be some well-published numbers backing up the hype. At any rate if there were numbers to back the claim then I think they could certainly be considered without prejudice. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] New to PostgreSQL, performance considerations
Michael, On 12/11/06 9:31 AM, Michael Stone [EMAIL PROTECTED] wrote: [1] I will say that I have never seen a realistic benchmark of general code where the compiler flags made a statistically significant difference in the runtime. Here's one - I wrote a general-purpose Computational Fluid Dynamics analysis method used by hundreds of people to perform aircraft and propulsion systems analysis. Compiler flag tuning would speed it up by factors of 2-3 or even more on some architectures. The reason it was so effective is that the structure of the code was designed to be general, but also to expose the critical performance sections in a way that the compilers could use - deep pipelining/vectorization, unrolling, etc., were carefully made easy for the compilers to exploit in critical sections. Yes, this made the code in those sections harder to read, but it was a common practice because it might take weeks of runtime to get an answer and performance mattered. The problem I see with general-purpose DBMS code the way it's structured in pgsql (and others) is that many of the critical performance sections are embedded in abstract interfaces that obscure them from optimization. An example is that doing a simple "is equal to" operation has many layers surrounding it to ensure that UDFs can be declared and that special comparison semantics can be accommodated. But if you're simply performing a large number of INT vs. INT comparisons, it will be thousands of times slower than a CPU-native operation because of the function call overhead, etc. I've seen presentations that show IPC of Postgres at about 0.5, versus the 2-4 possible from the CPU. Column databases like C-Store remove these abstractions at planner time to expose native operations in large chunks to the compiler, and the IPC reflects that - typically 1+ and as high as 2.5. If we were to redesign the executor and planner to emulate that same structure we could achieve similar speedups and the compiler would matter more. 
- Luke ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
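Luke's point about abstraction overhead can be illustrated with a toy example (Java here, since that is the client language elsewhere in this digest; PostgreSQL's actual operator dispatch is C function pointers, which this only loosely mimics). Whether the indirect version is measurably slower depends on how well the JIT can inline the call, so this shows the shape of the cost, not its magnitude:

```java
import java.util.Comparator;

public class IndirectionDemo {
    // "Native" comparison: the compiler sees a plain int compare.
    public static int countEqualDirect(int[] a, int[] b) {
        int n = 0;
        for (int i = 0; i < a.length; i++) if (a[i] == b[i]) n++;
        return n;
    }

    // Abstracted comparison: every element goes through an interface call
    // plus boxing - the shape of a UDF-capable "is equal to" operator.
    public static int countEqualAbstract(int[] a, int[] b, Comparator<Integer> cmp) {
        int n = 0;
        for (int i = 0; i < a.length; i++) if (cmp.compare(a[i], b[i]) == 0) n++;
        return n;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4}, b = {1, 0, 3, 0};
        // Both paths compute the same answer; only the dispatch differs.
        System.out.println(countEqualDirect(a, b));
        System.out.println(countEqualAbstract(a, b, Integer::compare));
    }
}
```

The extra layers in the abstract version (virtual dispatch, boxing) are exactly the kind of per-comparison cost that disappears when a column-oriented executor hands the compiler a large typed chunk to loop over.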
Re: [PERFORM] Looking for hw suggestions for high concurrency
The Sun X4600 is very good for this, the V40z is actually EOL so I'd stay away from it. You can currently do 8 dual core CPUs with the X4600 and 128GB of RAM and soon you should be able to do 8 quad core CPUs and 256GB of RAM. - Luke On 12/11/06 8:26 AM, Cosimo Streppone [EMAIL PROTECTED] wrote: Hi all, I'd like to get suggestions from all you out there for a new Postgresql server that will replace an existing one. My performance analysis shows very *low* iowaits, and very high loads at times of peak system activity. The average concurrent processes number is 3/4, with peaks of 10/15. *Sustained* system load varies from 1.5 to 4, while peak load reaches 20 and above, always with low iowait%. I see this as a clear sign of more processors power need. I'm aware of the context-switching storm problem, but here the cs stays well under 50,000, so I think it's not the problem here. Current machine is an Acer Altos R700 (2 Xeon 2.8 Ghz, 4 Gb RAM), similar to this one (R710): http://www.acer.co.uk/acereuro/page9.do?sp=page4dau34.oid=7036UserCtxParam=0 GroupCtxParam=0dctx1=17CountryISOCtxParam=UKLanguageISOCtxParam=enctx3=-1 ctx4=United+Kingdomcrc=334044639 So, I'm looking for advice on a mid-range OLTP server with the following (flexible) requirements: - 4 physical processors expandable to 8 (dual core preferred), either Intel or AMD - 8 Gb RAM exp. to at least 32 Gb - Compatibility with Acer S300 External storage enclosure. The old server uses that, and it is boxed with 15krpm hdds in RAID-10, which do perfectly well their job. The enclosure is connected via 2 x LSI Logic PCI U320 controllers - On-site, same-day hardware support contract - Rack mount Machines we're evaluating currently include: - Acer Altos R910 - Sun Fire V40Z, - HP Integrity RX4640 - IBM eServer 460 Experiences on these machines? Other suggestions? Thanks. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 01:20:50PM -0500, Ron wrote: (The validity of the claim has nothing to do with the skills or experience of the claimant or anyone else in the discussion. Only on the evidence.) Please go back and reread the original post. I don't think the response was unwarranted. It is a tad unfair and prejudicial to call claims that CPU optimizations matter to the performance of a DB product extraordinary. Evidence outside the DBMS field exists; and previous posts here show that pg can indeed become CPU-bound during what should be IO bound tasks. At the moment, Daniel's claims are not well supported. That is far different from being extraordinary given the current circumstantial evidence. No, they're extraordinary regardless of whether postgres is CPU bound. The question is whether cpu-specific compiler flags will have a significant impact--which is, historically, fairly unlikely. Far more likely is that performance can be improved with either a non-cpu-specific optimization (e.g., loop unrolling vs not) or with an algorithmic enhancement. More importantly, you're arguing *your own* point, not the original claim. I'll refresh your memory: My Linux is not an Intel-AMD binary compatible turtle like Fedora/RedHat/SUSE/... It's really important to have your GLIBC compiled for your processor. It is essencial for performance. You wanna draw the line between that (IMO, extraordinary) claim and the rational argument that you're trying to substitute in its place? [1] The evidence that arch specific flags matter to performance can be found as easily as recompiling your kernel or your compiler. Then, please, point to the body of evidence. IME, the results of such efforts aren't statistically all that significant on most workloads. I'm sure there are edge cases, but it's certainly not going to be on my top ten things to look at when tuning a database system. 
(If your kernel's cpu utilization is the bottleneck in your database, you've probably got bigger problems than compiler flags can solve.) Where you get the real big benefits in a (linux) kernel recompile is when you select code that's specifically tuned for a particular processor--not from arch-specific gcc flags--and those sorts of things are increasingly moving toward boot-time autotuning rather than compile-time manual tuning for obvious reasons. Mike Stone ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 10:30:55AM -0800, Luke Lonergan wrote: Here's one - I wrote a general purpose Computational Fluid Dynamics analysis method used by hundreds of people to perform aircraft and propulsion systems analysis. That's kinda the opposite of what I meant by general code. I was trying (perhaps poorly) to distinguish between scientific codes and other stuff (especially I/O or human interface code). Compiler flag tuning would speed it up by factors of 2-3 or even more on some architectures. The reason it was so effective is that the structure of the code was designed to be general, but also to expose the critical performance sections in a way that the compilers could use - deep pipelining/vectorization, unrolling, etc, were carefully made easy for the compilers to exploit in critical sections. It also sounds like code specifically written to take advantage of compiler techniques, rather than random code thrown at a pile of cflags. I don't disagree that it is possible to get performance improvements if code is written to be performant code; I do (and did) disagree with the idea that you'll get huge performance improvements by taking regular old C application code and playing with compiler flags. Yes, this made the code in those sections harder to read, but it was a common practice because it might take weeks of runtime to get an answer and performance mattered. IMO that's appropriate for some science codes (although I think even that sector is beginning to find that they've gone too far in a lot of ways), but for a database I'd rather have people debugging clean, readable code than risking my data to something incomprehensible that runs in optimal time. Column databases like C-Store remove these abstractions at planner time to expose native operations in large chunks to the compiler and the IPC reflects that - typically 1+ and as high as 2.5. 
If we were to redesign the executor and planner to emulate that same structure we could achieve similar speedups and the compiler would matter more. gcc --make-it-really-fast-by-rewriting-it-from-the-ground-up? Mike Stone
Re: [PERFORM] really quick multiple inserts can use COPY?
On 12/11/06, Andreas Kretschmer [EMAIL PROTECTED] wrote: Jens Schipkowski [EMAIL PROTECTED] schrieb: Hello! In our JAVA application we do multiple inserts to a table with data from a HashMap. Due to a poor database access implementation - done by another company (we got the job to enhance the software) - we cannot use prepared statements. (We are not allowed to change code at database access!) First, we tried to fire one INSERT statement per record to insert. This costs 3 ms per row, which is too slow because normally we insert 10,000 records, which results in 30,000 ms just for inserts. Can you change this from INSERT statements to COPY? COPY is *much* faster than INSERT. If not, are you doing all the INSERTs in one transaction? I believe a 'BEGIN' and a 'COMMIT' around all INSERTs may increase the speed. A performance gain can also be had by disabling constraints within the transaction. These disabled constraints are invoked at the end of the transaction according to the SQL standard, so no worries about data consistency. Hmmm... PG currently supports disabling foreign key constraints only. But that can still be significant. --Imad www.EnterpriseDB.com
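Jens can't touch the database-access layer, but for readers who can, the COPY suggestion above is easy to sketch. The following is a hypothetical illustration (table and column values invented) of turning in-memory records into the tab-separated text that `COPY mytable FROM STDIN` expects; only the data formatting is shown, since that needs no database connection. Note the real COPY text format also requires escaping embedded tabs, newlines and backslashes, which this sketch omits.

```python
import csv
import io

def to_copy_text(rows):
    """Render rows (sequences of values) as tab-separated lines,
    the shape expected by: COPY mytable FROM STDIN.
    Caveat: assumes values contain no tabs, newlines or backslashes."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

# Two hypothetical records, as they might come out of the HashMap
rows = [(1, "alice"), (2, "bob")]
print(to_copy_text(rows), end="")
```

The resulting text would then be streamed to the server in a single COPY operation (e.g. via JDBC's CopyManager), replacing thousands of round trips with one.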
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Ron [EMAIL PROTECTED] wrote: Statements like these cannot be reasonably interpreted in any manner _except_ that of presuming the results: I expect that you'll discover, if you actually do these tests, that this belief (that using arch-specific compiler options leads to better-performing SW) is pretty much nonsense. ...IMO a waste of time... etc. The correct objective response to claims w/o evidence is to request evidence, and to do everything we can to support it being properly gathered. Not to try to discourage the claimant from even trying by ganging up on them with multiple instances of Argument From Authority or variations of Ad Hominem attacks. (The validity of the claim has nothing to do with the skills or experience of the claimant or anyone else in the discussion. Only with the evidence.) /shrugs, this is not debate class. I just happen to have barked up this particular tree before, and decided to share my insights from it. A lot of the misunderstanding here stems from legacy perceptions about how cpus work, not to mention the entire architecture. If somebody produces hard facts to the contrary, great, and I encourage them to do so. Also, some people posting here, not necessarily me, are authority figures. :-) merlin
Re: [PERFORM] New to PostgreSQL, performance considerations
At 01:47 PM 12/11/2006, Michael Stone wrote: On Mon, Dec 11, 2006 at 01:20:50PM -0500, Ron wrote: (The validity of the claim has nothing to do with the skills or experience of the claimant or anyone else in the discussion. Only on the evidence.) Please go back and reread the original post. I don't think the response was unwarranted. So he's evidently young and perhaps a trifle over-enthusiastic. We were once, too. ;-) We are not going to get valuable contributions, nor help people become more valuable to the community, by flaming them into submission. ...and who knows, a =properly= done experiment may provide both surprises and unexpected insights/benefits. I agree completely with telling him he needs to get better evidence, and even with helping him understand how he should go about getting it. It should be noted that his opposition has not yet done these experiments either. (Else they could simply point to the results that refute Daniel's hypothesis.) The reality is that a new CPU architecture and multiple new memory technologies are part of this discussion. I certainly do not expect them to change the fundamental thinking regarding how to get the best performance from a DBMS. OTOH, there are multiple valid reasons to give such new stuff a thorough and rigorous experimental shake-down. ATM, =both= sides of this debate are lacking evidence for their POV. Let's support getting definitive evidence. No matter who brings it to the table ;-) Ron Peacetree
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 11:17:06AM -0200, Daniel van Ham Colchete wrote: I just remembered one case with MySQL. When I changed the distro from Conectiva 10 (an rpm-based, discontinued Brazilian distro) to Gentoo, a MySQL operation that usually took 2 minutes to run finished in 47 seconds. How do you know that this improvement had _anything_ to do with the use of different optimization flags? Were even the MySQL versions or configuration the same? This is absolutely vague. Indeed it is. Finally we agreed on something. I don't have a way to prove it to you. No, but you should stop making this sort of claim if you can't. And I can't measure how much each factor helped: compiling glibc and MySQL with good CFLAGS, rebuilding my database in an ordered way, a newer kernel, etc. Exactly. So why are you attributing it to the first factor only? And why do you think this would carry over to PostgreSQL? Remember, anecdotal evidence isn't. But that's exactly what I said. I'm not attributing this case to the optimization factor. As I said, there are a lot of factors involved. The MySQL version change was a minor upgrade (from 4.1.15 to 4.1.21). Steinar, I say I'll do the benchmark and it will be the end of the story. /* Steinar */
Re: [PERFORM] New to PostgreSQL, performance considerations
At 02:28 PM 12/11/2006, Merlin Moncure wrote: also, some people posting here, not necessarily me, are authority figures. :-) merlin Noam Chomsky was one of the most influential thinkers in Linguistics ever to have lived. He has been proven wrong a number of times, even within Linguistics. There are plenty of other historical examples. As others have said, opinion without evidence and logic is just that: opinion. And even Expert Opinion has been known to be wrong, sometimes very much so. Part of what makes an expert an expert is that they can back up their statements with evidence and logic that are compelling even to the non-expert when asked to do so. All I'm saying is let's all remember how assume is spelled and support the getting of some hard data. Ron Peacetree
Re: [PERFORM] New to PostgreSQL, performance considerations
Michael, On 12/11/06 10:57 AM, Michael Stone [EMAIL PROTECTED] wrote: That's kinda the opposite of what I meant by general code. I was trying (perhaps poorly) to distinguish between scientific codes and other stuff (especially I/O or human interface code). Yes - choice of language has often been a differentiator in these markets - LISP versus FORTRAN, C++ versus SQL/DBMS. This isn't just about science, it's also in Business Intelligence - e.g. special-purpose datamining code versus algorithms expressed inside a data management engine. It also sounds like code specifically written to take advantage of compiler techniques, rather than random code thrown at a pile of cflags. I don't disagree that it is possible to get performance improvements if code is written to be performant code; I do (and did) disagree with the idea that you'll get huge performance improvements by taking regular old C application code and playing with compiler flags. Agreed - that's my point exactly. IMO that's appropriate for some science codes (although I think even that sector is beginning to find that they've gone too far in a lot of ways), but for a database I'd rather have people debugging clean, readable code than risking my data to something incomprehensible that runs in optimal time. Certainly something of a compromise is needed. Column databases like C-Store remove these abstractions at planner time to gcc --make-it-really-fast-by-rewriting-it-from-the-ground-up? Maybe not from the ground up, but rather from about 10,000 ft - 25,000 ft? There are some who have done a lot of work studying the impact of a more efficient DBMS; see here: http://homepages.cwi.nl/~boncz/x100.html - Luke
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 11:31:48AM -0200, Daniel van Ham Colchete wrote: What PostgreSQL benchmark software should I use??? Look up the list archives; search for TPC. I'll test PostgreSQL 8.1 on a Fedora Core 6 and on a Gentoo. I'll get the same version FC6 uses and install it on my Gentoo. I'll use the same hardware (different partitions for each). Why do you want to compare FC6 and Gentoo? Wasn't your point that the -march= was supposed to be the relevant factor here? In that case, you want to keep all other things equal; so use the same distribution, only with -O2 -march=i686 vs. -march=athlon-xp (or whatever). /* Steinar */ Using Gentoo is just an easy way to apply CFLAGS optimizations to all the other libs as well: glibc, ... I can also measure performance on Gentoo with a CFLAGS-optimized PostgreSQL and a plain PostgreSQL as well.
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 02:51:09PM -0500, Ron wrote: Let's support getting definitive evidence. Since nobody opposed the concept of contrary evidence, I don't suppose you're fighting an uphill battle on that particular point. It's fine to get preachy about supporting intellectual curiosity, but do remember that it's a waste of everyone's (limited) time to give equal time to all theories. If someone comes to you with an idea for a perpetual motion machine, your effort is probably better spent on something other than helping him build it, regardless of whether that somehow seems unfair. (Now if he brings a working model, that's a different story...) Heck, even building a bunch of non-working perpetual motion machines as a demonstration is a waste of time, because it's always easy to say "well, if you had just..." That's precisely why the person bringing the extraordinary claim is also expected to bring the proof, rather than expecting that everyone else prove the status quo. Mike Stone
Re: [PERFORM] New to PostgreSQL, performance considerations
On 12/11/06, Chris Browne [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] (Daniel van Ham Colchete) writes: You are right Christopher. Okay. Let's solve this matter. What PostgreSQL benchmark software should I use??? pgbench is one option. There's a TPC-W at pgFoundry (http://pgfoundry.org/projects/tpc-w-php/). There's the Open Source Database Benchmark (http://osdb.sourceforge.net/). Those are three reasonable options. Thanks Chris, I'm going to take a look at those options. I'll test PostgreSQL 8.1 on a Fedora Core 6 and on a Gentoo. I'll get the same version FC6 uses and install it on my Gentoo. I'll use the same hardware (different partitions for each). Wrong approach. You'll be comparing apples to oranges, because Gentoo and Fedora pluck sources from different points in the source code stream. In order to prove what you want to prove, you need to run the benchmarks all on Gentoo, with 4 categorizations: 1. PostgreSQL and GLIBC both built without any processor-specific optimizations 2. PostgreSQL and GLIBC both built with all relevant processor-specific optimizations 3. PostgreSQL with, and GLIBC without, processor-specific optimizations 4. PostgreSQL without, and GLIBC with, processor-specific optimizations That would allow one to clearly distinguish which optimizations are particularly relevant. Good idea, and it is much easier to do as well.
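Chris's four categorizations amount to a 2x2 matrix over which components receive the processor-specific flags. A trivial sketch of enumerating that matrix (the -march value is just an example; substitute your own CPU):

```python
from itertools import product

BASE = "-O2"
ARCH = "-march=athlon-xp"  # example target CPU, purely illustrative

# 2x2 matrix: glibc and PostgreSQL each built with/without arch flags,
# everything else (distro, kernel, gcc version) held constant.
matrix = [
    {
        "glibc_cflags": f"{BASE} {ARCH}" if glibc_opt else BASE,
        "pg_cflags": f"{BASE} {ARCH}" if pg_opt else BASE,
    }
    for glibc_opt, pg_opt in product((False, True), repeat=2)
]

for i, cfg in enumerate(matrix, 1):
    print(i, cfg)
```

Running the same pgbench workload under each of the four builds isolates whether glibc, the server binary, or the combination is responsible for any difference.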
Re: [PERFORM] Looking for hw suggestions for high concurrency
On 12/11/06, Luke Lonergan [EMAIL PROTECTED] wrote: The Sun X4600 is very good for this; the V40z is actually EOL so I'd stay away from it. Also, it has no PCI Express slots - make sure to get PCIe slots :) You can currently do 8 dual-core CPUs with the X4600 and 128GB of RAM, and soon you should be able to do 8 quad-core CPUs and 256GB of RAM. ...and this has 6 of them (wow!). The v40z was top of its class. Will K8L run on this server? merlin
Re: [PERFORM] Looking for hw suggestions for high concurrency
Merlin, On 12/11/06 12:19 PM, Merlin Moncure [EMAIL PROTECTED] wrote: ...and this 6 of them (wow!). the v40z was top of its class. Will K8L run on this server? No official word yet. The X4600 slipped in there quietly under the X4500 (Thumper) announcement, but it's a pretty awesome server. It's been in production for supercomputing in Japan for a long while now, so I'd trust it. - Luke
Re: [PERFORM] New to PostgreSQL, performance considerations
Ron wrote: We are not going to get valuable contributions nor help people become more valuable to the community by flaming them into submission. Let's support getting definitive evidence. No matter who brings it to the table ;-) Thanks, Ron, for a voice of respect and reason. Since I first started using Usenet back in 1984, inexplicable rudeness has been a plague on otherwise civilized people. We're a community; we're all in this to help one another. Sometimes we give good advice, and sometimes even those wearing the mantle of authority can make boneheaded comments. I know I do, and when it happens, I always appreciate it when I'm taken to task with good humor and tolerance. When someone comes to this forum with an idea you disagree with, no matter how brash or absurd their claims, it's easy to challenge them with grace and good humor rather than chastising them with harsh words and driving them from our community. If you're right, you will have taught a valuable lesson to someone. And if on occasion a newcomer shows us something new, then we've all learned. Either way, we have a new friend and contributor to the community. Craig
Re: [PERFORM] New to PostgreSQL, performance considerations
Daniel van Ham Colchete wrote: On 12/11/06, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Mon, Dec 11, 2006 at 11:31:48AM -0200, Daniel van Ham Colchete wrote: What PostgreSQL benchmark software should I use??? Look up the list archives; search for TPC. I'll test PostgreSQL 8.1 on a Fedora Core 6 and on a Gentoo. I'll get the same version FC6 uses and install it on my Gentoo. I'll use the same hardware (different partitions for each). Why do you want to compare FC6 and Gentoo? Wasn't your point that the -march= was supposed to be the relevant factor here? In that case, you want to keep all other things equal; so use the same distribution, only with -O2 -march=i686 vs. -march=athlon-xp (or whatever). /* Steinar */ Using Gentoo is just an easy way to apply CFLAGS optimizations to all the other libs as well: glibc, ... I can also measure performance on Gentoo with a CFLAGS-optimized PostgreSQL and a plain PostgreSQL as well. I can certainly recall that when I switched from Fedora Core (2 or 3, can't recall now) to Gentoo, the machine was faster for many activities. Of course I can't recall precisely what now :-(. To actually track down *why* and *what* made it faster is another story, and custom CFLAGS is only one of the possible factors; others could be:
- different kernel versions (Gentoo's would possibly have been later)
- different kernel patches (both RedHat and Gentoo patch 'em)
- different versions of glibc (Gentoo possibly later again)
- different config options for glibc (not sure if they in fact differ, but it's possible...)
- kernel and glibc built with different versions of gcc (again, I suspect Gentoo may have used a later version)
So there are a lot of variables to consider if you want to settle this debate once and for all :-)! Best wishes Mark
Re: [PERFORM] New to PostgreSQL, performance considerations
Hi y'all, I made some preliminary tests. Before the results, I would like to make some acknowledgments: 1 - I didn't show any proof of any of the things I said until now. 2 - It really is a waste of everyone's time to say one thing when I can't prove it. But everything I said is the knowledge I have been accumulating over the past few years working on a project where optimization is important. After algorithmic optimizations, compiler options are second on my list, and with my software they show measurable improvement. With the other software I use, they seem to run faster, but I didn't measure it.

TEST PROCEDURE

I ran this test on a Gentoo test machine I have here. It's a Pentium 4 3.0GHz (I don't know which P4) with 1 GB of RAM. It only uses SATA drives. I didn't change my glibc (or any other lib) during the test. I used GCC 3.4.6. I ran each test three times, so we can get an idea of average values and standard deviation. Each time I ran the test with the following commands:

dropdb mydb
createdb mydb
pgbench -i -s 10 mydb 2> /dev/null
psql -c 'vacuum analyze' mydb
psql -c 'checkpoint' mydb
sync
pgbench -v -n -t 600 -c 5 mydb

My postgresql.conf was the default one, except for:

fsync = (depends on the test)
shared_buffers = 1
work_mem = 10240

Every test's output should begin with the following, but I removed it because it's always the same:

transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 5
number of transactions per client: 600
number of transactions actually processed: 3000/3000

TESTS RESULTS
=============

TEST 01: CFLAGS=-O2 -march=i686 fsync=false
tps = 734.948620 (including connections establishing)
tps = 736.866642 (excluding connections establishing)
tps = 713.225047 (including connections establishing)
tps = 715.039059 (excluding connections establishing)
tps = 721.769259 (including connections establishing)
tps = 723.631065 (excluding connections establishing)

TEST 02: CFLAGS=-O2 -march=i686 fsync=true
tps = 75.466058 (including connections establishing)
tps = 75.485675 (excluding connections establishing)
tps = 75.115797 (including connections establishing)
tps = 75.135311 (excluding connections establishing)
tps = 73.883113 (including connections establishing)
tps = 73.901997 (excluding connections establishing)

TEST 03: CFLAGS=-O2 -march=pentium4 fsync=false
tps = 846.337784 (including connections establishing)
tps = 849.067017 (excluding connections establishing)
tps = 829.476269 (including connections establishing)
tps = 832.008129 (excluding connections establishing)
tps = 831.416457 (including connections establishing)
tps = 835.31 (excluding connections establishing)

TEST 04: CFLAGS=-O2 -march=pentium4 fsync=true
tps = 83.224016 (including connections establishing)
tps = 83.248157 (excluding connections establishing)
tps = 80.811892 (including connections establishing)
tps = 80.834525 (excluding connections establishing)
tps = 80.671406 (including connections establishing)
tps = 80.693975 (excluding connections establishing)

CONCLUSIONS

Everyone can draw their own conclusion. Mine is: 1 - You do get an improvement when you compile your PostgreSQL using processor-specific technologies. With fsync on, you have an improvement of 9% in the tps rate; without fsync, the improvement is 15.6%. 2 - You can still improve your indexes, SQL and everything else; this only adds another possible improvement. 3 - I can't prove this, but I *think* it is related to the fact that GCC knows how to do the same thing better on each processor. 4 - I'm still using source-based distros.

WHAT NOW

There are other things I wish to test: 1 - What effect does an optimized glibc have on PostgreSQL? 2 - How much improvement can I get playing with my postgresql.conf? 3 - What effect do optimizations have on concurrency? 4 - What if I used the Intel C++ Compiler instead of GCC? 5 - What if I use GCC 4.1.1 instead of GCC 3.4.6?
I'm thinking about writing a script to run all the tests (more than 3 times each), collect the data and plot some graphs. I don't have the time right now; maybe next week I will. I invite everyone to comment on or suggest changes to the procedure or the results. Best regards, Daniel Colchete
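Daniel's headline percentages can be sanity-checked directly from the tps figures he posted (using the "excluding connections" numbers):

```python
from statistics import mean

# "excluding connections establishing" tps figures from the post
i686_nofsync = [736.866642, 715.039059, 723.631065]   # TEST 01
p4_nofsync   = [849.067017, 832.008129, 835.31]       # TEST 03
i686_fsync   = [75.485675, 75.135311, 73.901997]      # TEST 02
p4_fsync     = [83.248157, 80.834525, 80.693975]      # TEST 04

def improvement_pct(base, opt):
    """Percent improvement of the mean tps of `opt` over `base`."""
    return (mean(opt) / mean(base) - 1) * 100

print(f"fsync=false: {improvement_pct(i686_nofsync, p4_nofsync):.1f}%")  # ~15.7%
print(f"fsync=true:  {improvement_pct(i686_fsync, p4_fsync):.1f}%")      # ~9.0%
```

The means do come out roughly where Daniel says: about 9% with fsync on and about 15-16% with fsync off, though with only three runs per configuration the error bars are wide.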
Re: [PERFORM] New to PostgreSQL, performance considerations
Daniel, Good stuff. Can you try this with just -O3 versus -O2? - Luke On 12/11/06 2:22 PM, Daniel van Ham Colchete [EMAIL PROTECTED] wrote: [benchmark message quoted in full in the previous post - snipped]
Re: [PERFORM] Low throughput of binary inserts from windows to linux
On 12/11/06, Tom Lane [EMAIL PROTECTED] wrote: Yeah, that's what I couldn't think of the other day. The principal report was here: http://archives.postgresql.org/pgsql-general/2005-01/msg01231.php By default, Windows XP installs the QoS Packet Scheduler service. It is not installed by default on Windows 2000. After I installed QoS Packet Scheduler on the Windows 2000 machine, the latency problem vanished. I found a QoS-RVPS service (not sure about the last four characters and I'm sitting at my mac at home now...) on one of the WinXP test boxes, started it and immediately lost the network connection :-(. Since I have pretty much the same skepticism about the usefulness of a QoS packet scheduler for a raw-throughput problem as Lincoln Yeoh expressed in a follow-up mail to the above (http://archives.postgresql.org/pgsql-general/2005-01/msg01243.php), I didn't investigate this further. And regarding the TCP_NODELAY hint from Kevin Grittner: if I am not wrong in interpreting fe-connect.c, libpq already deals with it (fe-connect.c:connectNoDelay). But this made me think about the 'page' size I use in my blob table... I tested different sizes on Linux some time ago and found that 64KB was optimal. But playing with different sizes again revealed that my Windows-Linux problem seems to be solved if I use _any_ other (reasonable - meaning something between 4K and 512K) power of two ?!? Does this make sense to anyone? Thanks, axel
Re: [PERFORM] Low throughput of binary inserts from windows to linux
Axel Waggershauser [EMAIL PROTECTED] writes: I tested different sizes on linux some time ago and found that 64KB was optimal. But playing with different sizes again revealed that my windows-linux problem seems to be solved if I use _any_ other (reasonable - meaning something between 4K and 512K) power of two ?!? I think this almost certainly indicates a Nagle/delayed-ACK interaction. I googled and found a nice description of the issue: http://www.stuartcheshire.org/papers/NagleDelayedAck/ Note that there are no TCP connections in which message payloads are exact powers of two (and no, I don't know why they didn't try to make it so). You are probably looking at a situation where this particular transfer size results in an odd number of messages where the other sizes do not, with the different overheads between Windows and everybody else accounting for the fact that it's only seen with a Windows sender. If you don't like that theory, another line of reasoning has to do with the fact that the maximum advertisable window size in TCP is 65535 --- there could be some edge-case behaviors in the Windows and Linux stacks that don't play nicely together for 64K transfer sizes. regards, tom lane
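For the curious: libpq's connectNoDelay() boils down to a setsockopt(TCP_NODELAY) call that disables Nagle's algorithm on the connection. The same knob looks like this from Python (illustrative only - libpq already does this for you, so setting it again at the application level buys nothing):

```python
import socket

# Disable Nagle's algorithm on a TCP socket, the same option libpq's
# connectNoDelay() sets via setsockopt() in fe-connect.c. With Nagle
# off, small writes go out immediately instead of being coalesced
# while waiting for the peer's (possibly delayed) ACK.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
assert s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
s.close()
```

Since libpq sets this already, Axel's remaining throughput mystery is consistent with Tom's delayed-ACK interaction theory rather than with a missing TCP_NODELAY.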
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, Dec 11, 2006 at 08:22:42PM -0200, Daniel van Ham Colchete wrote: TEST 01: CFLAGS=-O2 -march=i686 fsync=false tps = 734.948620 (including connections establishing) tps = 736.866642 (excluding connections establishing) [snip] TEST 03: CFLAGS=-O2 -march=pentium4 fsync=false tps = 846.337784 (including connections establishing) tps = 849.067017 (excluding connections establishing) Can anyone else reproduce these results? I'm on similar hardware (2.5GHz P4, 1.5G RAM) and my test results are more like this: (postgresql 8.2.0)

CFLAGS=(default)
tps = 527.300454 (including connections establishing)
tps = 528.898671 (excluding connections establishing)
tps = 517.874347 (including connections establishing)
tps = 519.404970 (excluding connections establishing)
tps = 534.934905 (including connections establishing)
tps = 536.562150 (excluding connections establishing)

CFLAGS=686
tps = 525.179375 (including connections establishing)
tps = 526.801278 (excluding connections establishing)
tps = 557.821136 (including connections establishing)
tps = 559.602414 (excluding connections establishing)
tps = 532.142941 (including connections establishing)
tps = 533.740209 (excluding connections establishing)

CFLAGS=pentium4
tps = 518.869825 (including connections establishing)
tps = 520.394341 (excluding connections establishing)
tps = 537.759982 (including connections establishing)
tps = 539.402547 (excluding connections establishing)
tps = 538.522198 (including connections establishing)
tps = 540.200458 (excluding connections establishing)

Mike Stone
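By contrast, the spread between Mike's three configurations is smaller than the run-to-run variation within each one. A quick check of his means (again the "excluding connections" figures):

```python
from statistics import mean

runs = {  # "excluding connections establishing" tps from Mike's post
    "default":  [528.898671, 519.404970, 536.562150],
    "i686":     [526.801278, 559.602414, 533.740209],
    "pentium4": [520.394341, 539.402547, 540.200458],
}
base = mean(runs["default"])
for name, tps in runs.items():
    delta = (mean(tps) / base - 1) * 100
    print(f"{name}: mean {mean(tps):7.1f} tps ({delta:+.1f}% vs default)")
```

The arch-specific builds land within a couple of percent of the default build, while individual runs within a single configuration differ by up to about 6% - i.e., any flag effect here is lost in the noise.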
Re: [PERFORM] New to PostgreSQL, performance considerations
After a long battle with technology, [EMAIL PROTECTED] (Michael Stone), an earthling, wrote: [1] I will say that I have never seen a realistic benchmark of general code where the compiler flags made a statistically significant difference in the runtime.

When we were initially trying out PostgreSQL on AIX, I did some (limited, admittedly) comparisons between behaviour when compiled using GCC 3.something, VisualAge C, and VisualAge C++. I did some modifications of -O values; I didn't find differences amounting to more than a percent or two between any of the combinations. If there were to be a difference anywhere, it ought to have figured pretty prominently between a pretty elderly GCC version and IBM's top-of-the-line PowerPC compiler.

-- output = reverse(gro.mca @ enworbbc) http://cbbrowne.com/info/linuxdistributions.html
I doubt this language difference would confuse anybody unless you were providing instructions on the insertion of a caffeine enema. -- On alt.coffee
Re: [PERFORM] New to PostgreSQL, performance considerations
On Mon, 11 Dec 2006, Michael Stone wrote: Can anyone else reproduce these results? I'm on similar hardware (2.5GHz P4, 1.5G RAM)...

There are two likely candidates for why Daniel's P4 3.0GHz significantly outperforms your 2.5GHz system.

1) Most 2.5GHz P4 processors use a 533MHz front-side bus (FSB); most 3.0GHz ones use an 800MHz bus.

2) A typical motherboard paired with a 2.5GHz-era processor will have a single-channel memory interface; a typical 3.0GHz-era board supports dual-channel DDR.

These changes could easily explain the magnitude of difference in results you're seeing, especially when combined with a 20% greater raw CPU clock.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] New to PostgreSQL, performance considerations
Greg Smith [EMAIL PROTECTED] writes: On Mon, 11 Dec 2006, Michael Stone wrote: Can anyone else reproduce these results? I'm on similar hardware (2.5GHz P4, 1.5G RAM)... There are two likely candidates for why Daniel's P4 3.0GHz significantly outperforms your 2.5GHz system.

Um, you entirely missed the point: the hardware speedups you mention are quite independent of any compiler options. The numbers we are looking at are the relative speeds of two different compiles on the same hardware, not whether hardware A is faster than hardware B.

regards, tom lane
Re: [PERFORM] Low throughput of binary inserts from windows to linux
* Tom Lane: If you don't like that theory, another line of reasoning has to do with the fact that the maximum advertisable window size in TCP is 65535 --- there could be some edge-case behaviors in the Windows and Linux stacks that don't play nicely together for 64K transfer sizes.

Linux enables window scaling, so the actual window size can be more than 64K. Windows should cope with it, but some PIX firewalls and other historic boxes won't.

-- Florian Weimer [EMAIL PROTECTED] BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99
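For reference, the window scaling Florian mentions (RFC 1323) works by negotiating a shift count in the SYN options: the 16-bit window field still tops out at 65535, but the effective window is that value shifted left by the scale factor. A tiny sketch (the scale factor 7 is just an illustrative value; the RFC allows up to 14):

```python
# RFC 1323 window scaling: the TCP header's window field is 16 bits,
# capping the advertised window at 65535 bytes. With a scale factor
# negotiated at connection setup, the effective window becomes
# (advertised << scale), allowing windows far beyond 64K.
advertised = 0xFFFF   # maximum value of the 16-bit window field
scale = 7             # example scale factor from the SYN options
effective = advertised << scale
```

This is why a Linux receiver can offer a window well above 64K, and why middleboxes that strip or mangle the scaling option (like the old PIX firewalls mentioned) cause such confusing stalls.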
Re: [PERFORM] really quick multiple inserts can use COPY?
Thanks a lot to all for your tips. Of course, I am doing all the INSERTs using a transaction. So the cost per INSERT dropped from 30 ms to 3 ms. The improvement factor matches the hint by Brian Hurt. Sorry, I forgot to mention we are using PostgreSQL 8.1.4.

Thanks for the code snippet posted by mallah. It looks like you are using prepared statements, which are not available to us. But I will check our database access layer to see if it's possible to do a workaround, because this looks clean and quick to me.

regards Jens Schipkowski

On Mon, 11 Dec 2006 17:53:52 +0100, Guillaume Cottenceau [EMAIL PROTECTED] wrote: Jens Schipkowski jens.schipkowski 'at' apus.co.at writes: Hello! In our JAVA application we do multiple inserts to a table with data from a Hash Map. Due to a poor database access implementation - done by another company (we got the job to enhance the software) - we cannot use prepared statements. (We are not allowed to change code at database access!) First, we tried to fire one INSERT statement per record to insert. This costs 3 ms per row, which is too slow because normally we insert 10,000 records, which results in 30,000 ms just for inserts.

for (...) {
    sql = "INSERT INTO tblfoo(foo,bar) VALUES(" + it.next() + "," + CONST.BAR + ");";
}

You should try to wrap that into a single transaction. PostgreSQL waits for I/O write completion for each INSERT, as each is implicitly in its own transaction. Maybe the added performance would be satisfactory for you.

-- ** APUS Software GmbH
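Guillaume's suggestion amounts to bracketing the generated statements with a single BEGIN/COMMIT pair, so all 10,000 inserts share one transaction commit instead of paying one I/O flush each. A minimal Python sketch of the idea, using the hypothetical tblfoo table from the thread (a real client would send this batch through its database driver):

```python
# Batch many INSERTs into one explicit transaction: without this,
# each statement runs in its own implicit transaction and waits for
# its own write flush; with it, there is a single commit at the end.
rows = [(1, "bar"), (2, "bar"), (3, "bar")]  # stand-in for the hash map data

stmts = ["BEGIN;"]
for foo, bar in rows:
    # Values are interpolated here only for illustration; a real
    # application must escape/quote them to avoid SQL injection.
    stmts.append("INSERT INTO tblfoo(foo,bar) VALUES(%d,'%s');" % (foo, bar))
stmts.append("COMMIT;")

batch = "\n".join(stmts)
```

This matches the roughly 10x improvement Jens reports (30 ms down to 3 ms per row): the per-statement commit overhead dominates when each INSERT is its own transaction.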