Re: [PERFORM] Air-traffic benchmark
On Thu, 2010-01-07 at 13:38 +0100, Lefteris wrote:
> Reported query times are (in sec): MonetDB 7.9s InfoBright 12.13s LucidDB 54.8s

It needs to be pointed out that those databases are specifically optimised for Data Warehousing, whereas the Postgres core is optimised for concurrent write workloads in production systems. If you want a best-vs-best comparison, you should be looking at a version of Postgres optimised for Data Warehousing. These results show that Postgres-related options exist that clearly beat the above numbers:
http://community.greenplum.com/showthread.php?t=111

I note also that Greenplum's Single Node Edition is now free to use, so it is a reasonable product for comparison on this list.

Also, I'm unimpressed by a Data Warehouse database that requires everything to reside in memory, e.g. MonetDB. In my experience that severely limits real-world usability, because it implies the queries you're running aren't ad-hoc.

--
Simon Riggs
www.2ndQuadrant.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Air-traffic benchmark
Hello

Lefteris <lsi...@gmail.com> escreveu:
> Hi all, following the simple but interesting air-traffic benchmark published at:
> http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/

Quite an interesting test, if you have the time to download all that raw data.

> ... of 5 minutes while the slow ones in the order of 40 minutes. I came to
> the conclusion that some parameters are not set correctly. I give all ...

I do think so too.

> The hardware characteristics are: Platform Intel(R) Core(TM)2 Quad CPU Q6600
> @ 2.40GHz with 8GB RAM and ample disk space (2x 500 GB SATA disk @ 7200 RPM
> as SW-RAID-0)

SATA disks are not the best for a benchmark like yours, but the results were so deeply different from what was expected that I'm sure you have hardware enough.

> I used postgresql-8.4.2. Source download and compiled by hand, no special
> parameters were passed to the configure phase.
> -- DETAILS on loading, analyze and index creation on postgresql-8.4.2
> -- loading time was 77m15.976s
> -- total rows: 119790558
> -- total size of pgdata/base: 46G
> ANALYZE; -- 219698.365 ms aka 3m39.698s
> CREATE INDEX year_month_idx ON ontime (Year, Month); -- 517481.632 ms aka 8m37.481s
> CREATE INDEX DepDelay_idx ON ontime (DepDelay); -- 1257051.668 ms aka 20m57.051s

Here comes my first question: did you ANALYZE your database (or at least the related tables) _after_ index creation? If not, try to do so. PostgreSQL needs statistics of the database when everything is in its place.
> airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime
>     WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;
>
>  Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
>    Sort Key: (count(*))
>    Sort Method:  quicksort  Memory: 25kB
>    ->  HashAggregate  (cost=7407754.04..7407754.08 rows=4 width=2) (actual time=371163.316..371163.320 rows=7 loops=1)
>          ->  Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079 rows=52484047 loops=1)
>                Filter: ((Year >= 2000) AND (Year <= 2008))
>  Total runtime: 371201.156 ms
> (7 rows)

You'll see here that PostgreSQL is not using the index you just created. ANALYZE VERBOSE ontime; should solve this.

> I understand that the problem here is the memory used by the sort method.
> *But*, I already changed the work_mem parameter to 6 gigs :)

If you look at your explain you'll see that you don't need that much memory. You have 8GB of total RAM; if you use that much for sorting, you'll start to swap.

> which means that of course work_mem == sort_mem; as such, shouldn't it be
> the case that the sort algorithm should have used much more memory?

sort_mem is a part of work_mem in recent versions of PostgreSQL. No, the sort algorithm doesn't need that much at all.

> I also attach the output of 'show all;' so you can advise me on any other
> configuration settings that I might need to change to perform better.

Let's take a look.

> geqo | on

If in doubt, turn this off. Geqo is capable of making bad execution plans for you. But this is not the most important thing to change.

> shared_buffers | 32MB

Here it is. The default setting for shared_buffers doesn't give space for the buffer cache. I would recommend you increase this to at least 40% of your total RAM. Don't forget to restart PostgreSQL after changing it, and it's possible that you'll need to increase some System V parameters in Fedora. Read the PostgreSQL documentation about it.
> effective_cache_size | 8MB

Increase this to at least shared_buffers + any caches that you have in your hardware (e.g. in the RAID controller).

> wal_buffers | 64kB

Increasing this can make your bulk load faster. 8MB is a good number. It will not make your SELECT queries faster.

> wal_sync_method | fdatasync

fdatasync is the recommended method for Solaris. I would use open_sync. It's not important for SELECTs either.

> work_mem | 6GB

Start here with 10MB. If you have temp files when executing your SELECTs, try increasing it just a bit. Let us know what happens in your new tests.

Best regards

Flavio Henrique A. Gurgel
tel. 55-11-2125.4765 fax. 55-11-2125.4777
www.4linux.com.br
FREE SOFTWARE SOLUTIONS
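[Collected into one place, the suggestions above might look roughly like the following postgresql.conf fragment for this 8 GB machine. The exact values are illustrative starting points to tune from, not tested results.]

```ini
# Sketch of a starting configuration for an 8 GB RAM box (values are guesses)
shared_buffers = 2GB          # up to ~40% of RAM as suggested; needs a restart and a larger shmmax
effective_cache_size = 6GB    # roughly shared_buffers plus OS/RAID-controller caches
wal_buffers = 8MB             # speeds up bulk loads, not SELECTs
work_mem = 10MB               # per-sort allocation; raise gently only if temp files appear
```

Note that shared_buffers and wal_buffers take effect only after a server restart.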
Re: [PERFORM] Air-traffic benchmark
In response to Lefteris:
> airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime
>     WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;
>
>  Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
>    Sort Key: (count(*))
>    Sort Method:  quicksort  Memory: 25kB
>    ->  HashAggregate  (cost=7407754.04..7407754.08 rows=4 width=2) (actual time=371163.316..371163.320 rows=7 loops=1)
>          ->  Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079 rows=52484047 loops=1)
>                Filter: ((Year >= 2000) AND (Year <= 2008))
>  Total runtime: 371201.156 ms
> (7 rows)
>
> I understand that the problem here is the memory used by the sort method.
> *But*, I already changed the work_mem parameter to 6 gigs :)

No. The problem here is the seq scan. It returns about 52.000.000 rows, roughly the whole table, and it needs 346 seconds. The sort needs only 25 KByte and only 0.02 ms.

Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
Re: [PERFORM] Air-traffic benchmark
Thank you all for your answers!

Andreas, I see it the other way around from what you are saying:

Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079 rows=52484047 loops=1)

I don't see the seq scan as being a problem, and it is the correct choice here because Year spans from 1999 to 2009 and the query asks for 2000 and on, so PG correctly decides to use a seq scan and not index access.

lefteris

On Thu, Jan 7, 2010 at 2:32 PM, A. Kretschmer <andreas.kretsch...@schollglas.com> wrote:
> In response to Lefteris:
>> airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime
>>     WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;
>> [plan as above]
>>
>> I understand that the problem here is the memory used by the sort method.
>> *But*, I already changed the work_mem parameter to 6 gigs :)
>
> No. The problem here is the seq scan. It returns about 52.000.000 rows,
> roughly the whole table, and it needs 346 seconds. The sort needs only
> 25 KByte and only 0.02 ms.
Re: [PERFORM] Air-traffic benchmark
On 7-1-2010 13:38 Lefteris wrote:
> I decided to run the benchmark over postgres to get some more experience and
> insights. Unfortunately, the query times I got from postgres were not the
> expected ones:

Why were they not expected? In the given scenario, column databases have a huge advantage. The given simple example is exactly the type of query a column database *should* excel at. You should, at the very least, compare the queries to MyISAM:
http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/
But unfortunately, that one also beats your postgresql results.

> The hardware characteristics are: Platform Intel(R) Core(TM)2 Quad CPU Q6600
> @ 2.40GHz with 8GB RAM and ample disk space (2x 500 GB SATA disk @ 7200 RPM
> as SW-RAID-0)

Unfortunately, the blog post fails to mention the disk subsystem. So it may well be much faster than yours, although it's not a new, big or fast server, so unless it has external storage, it shouldn't be too different for sequential scans.

> SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year BETWEEN 2000 AND 2008
> GROUP BY DayOfWeek ORDER BY c DESC;
> Reported query times are (in sec): MonetDB 7.9s InfoBright 12.13s LucidDB 54.8s
> For pg-8.4.2 I got with 3 consecutive runs on the server: 5m52.384s 5m55.885s 5m54.309s

Maybe an index of the type (year, dayofweek) will help for this query. But it'll have to scan about half the table anyway, so a seq scan isn't a bad idea. In this case, a partitioned table with partitions per year and constraint exclusion enabled would help a bit more.

Best regards,
Arjen
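[As a hedged illustration of the partition-per-year idea on 8.4, which has no declarative partitioning, so this uses inheritance plus CHECK constraints; the child-table names are made up here:]

```sql
-- One child table per year; the CHECK constraint lets the planner
-- skip partitions that cannot match the WHERE clause.
CREATE TABLE ontime_2000 (CHECK (Year = 2000)) INHERITS (ontime);
CREATE TABLE ontime_2001 (CHECK (Year = 2001)) INHERITS (ontime);
-- ... one child per year up to 2009 ...

SET constraint_exclusion = on;

-- This now scans only the 2000..2008 children:
SELECT DayOfWeek, count(*) AS c
FROM ontime
WHERE Year BETWEEN 2000 AND 2008
GROUP BY DayOfWeek
ORDER BY c DESC;
```

For this particular query it only helps a bit, since nine of the ten year-partitions would still have to be scanned.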
Re: [PERFORM] Air-traffic benchmark
Hi Arjen,

so I understand from all of you that you don't consider the use of 25k for sorting to be the cause of the slowdown? Probably I am missing something about the specific sort algorithm used by PG. My RAM does fill up, mainly with file buffers from linux, but the postgres process remains at 0.1% consumption of main memory. Is there no way to force the sort to use, say, blocks of 128MB? Wouldn't that make a difference?

lefteris

p.s. I already started the analyze verbose again as Flavio suggested and reset the parameters, although I think some of Flavio's suggestions have to do with multiple users/queries and not 1 long-running query, like shared_buffers, or not?

On Thu, Jan 7, 2010 at 2:36 PM, Arjen van der Meijden <acmmail...@tweakers.net> wrote:
> On 7-1-2010 13:38 Lefteris wrote:
>> I decided to run the benchmark over postgres to get some more experience
>> and insights. Unfortunately, the query times I got from postgres were not
>> the expected ones:
>
> Why were they not expected? In the given scenario, column databases have a
> huge advantage. The given simple example is exactly the type of query a
> column database *should* excel at. You should, at the very least, compare
> the queries to MyISAM:
> http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/
> But unfortunately, that one also beats your postgresql results.
>
>> The hardware characteristics are: Platform Intel(R) Core(TM)2 Quad CPU
>> Q6600 @ 2.40GHz with 8GB RAM and ample disk space (2x 500 GB SATA disk @
>> 7200 RPM as SW-RAID-0)
>
> Unfortunately, the blog post fails to mention the disk subsystem. So it may
> well be much faster than yours, although it's not a new, big or fast server,
> so unless it has external storage, it shouldn't be too different for
> sequential scans.
>> SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year BETWEEN 2000 AND 2008
>> GROUP BY DayOfWeek ORDER BY c DESC;
>> Reported query times are (in sec): MonetDB 7.9s InfoBright 12.13s LucidDB 54.8s
>> For pg-8.4.2 I got with 3 consecutive runs on the server: 5m52.384s 5m55.885s 5m54.309s
>
> Maybe an index of the type (year, dayofweek) will help for this query. But
> it'll have to scan about half the table anyway, so a seq scan isn't a bad
> idea. In this case, a partitioned table with partitions per year and
> constraint exclusion enabled would help a bit more.
>
> Best regards,
> Arjen
Re: [PERFORM] Air-traffic benchmark
In response to Lefteris:
> Thank you all for your answers!
>
> Andrea, I see it the other way around from what you are saying:
>
> Sort  (cost=7407754.12..7407754.13 rows=4 width=2) (actual time=371188.821..371188.823 rows=7 loops=1)
> Seq Scan on ontime  (cost=0.00..7143875.40 rows=52775727 width=2) (actual time=190938.959..346180.079 rows=52484047 loops=1)
>
> I don't see the seq scan as being a problem, and it is the correct choice
> here because Year spans from 1999 to 2009 and the query asks for 2000 and
> on, so PG correctly decides to use a seq scan and not index access.

That's right. But this is not a contradiction: this seq scan *is* the real problem, not the sort. And yes, as others said, increasing work_mem isn't the solution. It is counterproductive, because you lose buffer cache.

Andreas (note the 's' at the end ;-))
Re: [PERFORM] Air-traffic benchmark
In response to Lefteris:
> Hi Arjen,
>
> so I understand from all of you that you don't consider the use of 25k for
> sorting to be the cause of the slowdown? Probably I am missing something
> about the specific sort algorithm used by PG. My RAM does fill up, mainly
> with file buffers from linux, but the postgres process remains at 0.1%
> consumption of main memory. Is there no way to force the sort to use, say,
> blocks of 128MB? Wouldn't that make a difference?

The result table fits in that piece of memory. You have only this little table to sort:

[ 5, 7509643 ]
[ 1, 7478969 ]
[ 4, 7453687 ]
[ 3, 7412939 ]
[ 2, 7370368 ]
[ 7, 7095198 ]
[ 6, 6425690 ]

Andreas
Re: [PERFORM] Air-traffic benchmark
Yes, I am reading the plan wrong! I thought that each row of the plan reported the total time for the operation, but it actually reports the starting and ending points. So we all agree that the problem is in the scans :)

So the next question is why changing shared memory buffers would fix that? I only have one session with one connection; do I have many reader workers or something? Thank you, and sorry for the plethora of questions, but I know little about the inner parts of postgres :)

lefteris

On Thu, Jan 7, 2010 at 3:05 PM, Jochen Erwied <joc...@pgsql-performance.erwied.eu> wrote:
> Thursday, January 7, 2010, 2:47:36 PM you wrote:
>> so I understand from all of you that you don't consider the use of 25k for
>> sorting to be the cause of the slowdown? Probably I am missing
>
> Maybe you are reading the plan wrong:
> - the sort needs only 25kB of memory, and finishes in sub-second time,
>   mainly because the sort only sorts the already summarized data, and not
>   the whole table
> - the sequential scan takes 346 seconds, and thus is the major factor in
>   time to finish!
>
> So the total query time is 371 seconds, of which 346 are required to
> completely scan the table once.
>
> --
> Jochen Erwied | home: joc...@erwied.eu +49-208-38800-18, FAX: -19
> Sauerbruchstr. 17 | work: j...@mbs-software.de +49-2151-7294-24, FAX: -50
> D-45470 Muelheim | mobile: jochen.erw...@vodafone.de +49-173-5404164
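[For anyone else who read the plan the same way: each node in EXPLAIN ANALYZE output shows "actual time=startup..total" in milliseconds, and a parent's total includes its children. Annotating the plan from earlier in the thread:]

```sql
-- Sort            actual time=371188.821..371188.823  -- the sort itself takes a fraction of a ms
--   HashAggregate actual time=371163.316..371163.320  -- aggregation sits on top of the scan
--     Seq Scan    actual time=190938.959..346180.079  -- ~346 s: the dominant cost
-- Total runtime: 371201.156 ms                        -- ~371 s overall
```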
Re: [PERFORM] Air-traffic benchmark
Lefteris wrote:
> Yes, I am reading the plan wrong! I thought that each row of the plan
> reported the total time for the operation, but it actually reports the
> starting and ending points. So we all agree that the problem is in the
> scans :) So the next question is why changing shared memory buffers would
> fix that? I only have one session with one connection; do I have many
> reader workers or something?

No amount of tinkering is going to change the fact that a seqscan is the fastest way to execute these queries. Even if you got it to be all in memory, it would still be much slower than the other systems, which, I gather, are using columnar storage and thus are perfectly suited to this problem (unlike Postgres). The talk about compression ratios caught me by surprise until I realized it was columnar stuff. There's no way you can get such high ratios on a regular, row-oriented storage.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] Air-traffic benchmark
Alvaro Herrera wrote:
> No amount of tinkering is going to change the fact that a seqscan is the
> fastest way to execute these queries. Even if you got it to be all in
> memory, it would still be much slower than the other systems, which, I
> gather, are using columnar storage and thus are perfectly suited to this
> problem (unlike Postgres).

FWIW if you want a fair comparison, get InnoDB numbers.
Re: [PERFORM] Air-traffic benchmark
On 7.1.2010 15:23, Lefteris wrote:
> I think what you all said was very helpful and clear! The only part that I
> still disagree/don't understand is the shared_buffer option :))

Did you ever try increasing shared_buffers to what was suggested (around 4 GB) and see what happens (I didn't see it in your posts)?

shared_buffers can be thought of as PostgreSQL's internal cache. If the pages being scanned for a particular query are in the cache, this will help performance very much on multiple executions of the same query. OTOH, since the file system's cache didn't help you significantly, there is a low possibility shared_buffers will. It is still worth trying.

From the description of the data ("...from years 1988 to 2009...") it looks like the query for between 2000 and 2009 pulls out about half of the data. If an index could be used instead of a seqscan, it would perhaps be only 50% faster, which is still not very comparable to the others.

The table is very wide, which is probably why the tested databases can deal with it faster than PG. You could try to narrow the table down (for instance: remove the Div* fields) to make the data more relational-like. In real life, speedups in these circumstances would probably be gained by normalizing the data to make the basic table smaller and easier to use with indexing.
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 7, 2010 at 3:51 PM, Ivan Voras <ivo...@freebsd.org> wrote:
> On 7.1.2010 15:23, Lefteris wrote:
>> I think what you all said was very helpful and clear! The only part that I
>> still disagree/don't understand is the shared_buffer option :))
>
> Did you ever try increasing shared_buffers to what was suggested (around
> 4 GB) and see what happens (I didn't see it in your posts)?

No, I did not do that yet, mainly because I need the admin of the machine to change the shmmax of the kernel, and also because I have no multiple queries running. Does a seq scan use shared_buffers?

> shared_buffers can be thought of as PostgreSQL's internal cache. If the
> pages being scanned for a particular query are in the cache, this will help
> performance very much on multiple executions of the same query. OTOH, since
> the file system's cache didn't help you significantly, there is a low
> possibility shared_buffers will. It is still worth trying.
>
> From the description of the data ("...from years 1988 to 2009...") it looks
> like the query for between 2000 and 2009 pulls out about half of the data.
> If an index could be used instead of a seqscan, it would perhaps be only
> 50% faster, which is still not very comparable to the others.
>
> The table is very wide, which is probably why the tested databases can deal
> with it faster than PG. You could try to narrow the table down (for
> instance: remove the Div* fields) to make the data more relational-like. In
> real life, speedups in these circumstances would probably be gained by
> normalizing the data to make the basic table smaller and easier to use with
> indexing.
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 7, 2010 at 3:05 PM, Lefteris <lsi...@gmail.com> wrote:
> No, I did not do that yet, mainly because I need the admin of the machine to
> change the shmmax of the kernel, and also because I have no multiple queries
> running. Does a seq scan use shared_buffers?

Think of the shared buffers as a cache. They will help subsequent queries to not have to use the disk.

--
GJ
Re: [PERFORM] Air-traffic benchmark
Lefteris <lsi...@gmail.com> wrote:
>> Did you ever try increasing shared_buffers to what was suggested (around
>> 4 GB) and see what happens (I didn't see it in your posts)?
>
> No, I did not do that yet, mainly because I need the admin of the machine to
> change the shmmax of the kernel, and also because I have no multiple queries
> running. Does a seq scan use shared_buffers?

Having multiple queries running is *not* the only reason you need lots of shared_buffers. Think of shared_buffers as a page cache; data in PostgreSQL is organized in pages. If one single query execution had a step that brought a page into the buffer cache, that is enough to increase another step's speed and change the execution plan, since data access in memory is (usually) faster than disk.

> help performance very much on multiple executions of the same query.

This is also true. This kind of test should, and will, give different results in subsequent executions.

> From the description of the data ("...from years 1988 to 2009...") it looks
> like the query for between 2000 and 2009 pulls out about half of the data.
> If an index could be used instead of a seqscan, it would perhaps be only
> 50% faster, which is still not very comparable to the others.

The use of the index over a seqscan has to be tested. I don't agree with the 50% gain, since simple integers stored in a B-Tree have a huge possibility of being retrieved in the required order, and the discarded data will be discarded quickly too, so the gain has to be measured. I bet that an index scan will be a lot faster, but it's just a bet :)

> The table is very wide, which is probably why the tested databases can deal
> with it faster than PG. You could try to narrow the table down (for
> instance: remove the Div* fields) to make the data more relational-like. In
> real life, speedups in these circumstances would probably be gained by
> normalizing the data to make the basic table smaller and easier to use with
> indexing.

Ugh. I don't think so. That's why indexes were invented.
PostgreSQL is smart enough to jump over columns using byte offsets. A better option for this table is to partition it in year (or year/month) chunks. 45GB is not such a huge table compared to other ones I have seen before. I have systems where each partition is like 10 or 20GB and data is very fast to access, even with aggregation queries.

Flavio Henrique A. Gurgel
Re: [PERFORM] Air-traffic benchmark
2010/1/7 Lefteris <lsi...@gmail.com>:
> No, I did not do that yet, mainly because I need the admin of the machine to
> change the shmmax of the kernel, and also because I have no multiple queries
> running. Does a seq scan use shared_buffers?

Everything uses shared_buffers, even things that do not benefit from it. This is because shared_buffers is part of the general database IO - it's unavoidable.
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 7, 2010 at 4:57 PM, Ivan Voras <ivo...@freebsd.org> wrote:
> Everything uses shared_buffers, even things that do not benefit from it.
> This is because shared_buffers is part of the general database IO - it's
> unavoidable.

I will increase the shared_buffers once my kernel is configured, and I will report back to you. As for the index scan, I already built a b-tree on year/month, but PG (correctly) decides not to use it. The data are from year 1999 up to 2009 (my typo mistake), so it is almost 90% of the data to be accessed. When I ask for a specific year, like 2004, then the index is used and query times become faster.

lefteris
Re: [PERFORM] Air-traffic benchmark
On Thu, 7 Jan 2010, Gurgel, Flavio wrote:
> If one single query execution had a step that brought a page into the buffer
> cache, that is enough to increase another step's speed and change the
> execution plan, since data access in memory is (usually) faster than disk.

Postgres does not change a query plan according to the shared_buffers setting. It does not anticipate one step contributing to another step in this way. It does however make use of the effective_cache_size setting to estimate this effect, and that does affect the planner.

> The use of the index over a seqscan has to be tested. I don't agree with the
> 50% gain, since simple integers stored in a B-Tree have a huge possibility
> of being retrieved in the required order, and the discarded data will be
> discarded quickly too, so the gain has to be measured. I bet that an index
> scan will be a lot faster, but it's just a bet :)

In a situation like this, the opposite will be true. If you were accessing a very small part of a table, say to order by a field with a small limit, then an index can be very useful by providing the results in the correct order. However, in this case, almost the entire table has to be read. Changing the order in which it is read will mean that the disc access is no longer sequential, which will slow things down, not speed them up. The Postgres planner isn't stupid (mostly); there is probably a good reason why it isn't using an index scan.

>> The table is very wide, which is probably why the tested databases can deal
>> with it faster than PG. You could try to narrow the table down (for
>> instance: remove the Div* fields) to make the data more relational-like. In
>> real life, speedups in these circumstances would probably be gained by
>> normalizing the data to make the basic table smaller and easier to use with
>> indexing.
>
> Ugh. I don't think so. That's why indexes were invented. PostgreSQL is smart
> enough to jump over columns using byte offsets.
> A better option for this table is to partition it in year (or year/month)
> chunks.

Postgres (mostly) stores the columns for a row together in the row, so what you say is completely wrong. Postgres does not jump over columns using byte offsets in this way. The index references a row in a page on disc, and that page is fetched separately in order to retrieve the row. The expensive part is physically moving the disc head to the right part of the disc in order to fetch the correct page from the disc - jumping over columns will not help with that at all.

Reducing the width of the table will greatly improve the performance of a sequential scan, as it will reduce the size of the table on disc, and therefore the time taken to read the entire table sequentially.

Moreover, your suggestion of partitioning the table may not help much with this query. It will turn a single sequential scan into a UNION of many tables, which may be harder for the planner to plan. Also, for queries that access small parts of the table, indexes will help more than partitioning will. Partitioning will help most in the case where you want to summarise a single year's data. Not really otherwise.

Matthew

--
Q: What's the difference between ignorance and apathy?
A: I don't know, and I don't care.
Re: [PERFORM] Air-traffic benchmark
Alvaro Herrera wrote:

No amount of tinkering is going to change the fact that a seqscan is the fastest way to execute these queries. Even if you got it to be all in memory, it would still be much slower than the other systems, which, I gather, are using columnar storage and thus are perfectly suited to this problem (unlike Postgres). The talk about compression ratios caught me by surprise until I realized it was columnar stuff. There's no way you can get such high ratios on a regular, row-oriented storage.

One of the good tricks with Postgres is to convert a very wide table into a set of narrow tables, then use a view to create something that looks like the original table. It requires you to modify the write portions of your app, but the read portions can stay the same. A seq scan on one column will be *much* faster when you rearrange your database this way, since it's only scanning relevant data. You pay the price of an extra join on primary keys, though.

If you have just a few columns in a very wide table that are seq-scanned a lot, you can pull out just those columns and leave the rest in the wide table. The same trick is also useful if you have one or a few columns that are updated frequently: pull them out, and use a view to recreate the original appearance. It saves a lot on garbage collection.

Craig
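Craig's narrow-tables-plus-view trick could be sketched roughly like this; the table and column names below are illustrative, not the actual ontime schema:

```sql
-- Sketch only: split a wide table into hot and cold halves, joined by
-- primary key; all names here are hypothetical.
CREATE TABLE flight_core (
    flight_id integer PRIMARY KEY,
    depdelay  integer,
    arrdelay  integer
);

CREATE TABLE flight_detail (
    flight_id integer PRIMARY KEY REFERENCES flight_core,
    diverted  text,
    remarks   text
);

-- Reads can keep using something that looks like the original wide table:
CREATE VIEW flight AS
    SELECT c.flight_id, c.depdelay, c.arrdelay, d.diverted, d.remarks
    FROM flight_core c JOIN flight_detail d USING (flight_id);
```

A seq scan that only needs depdelay now reads the much smaller flight_core, at the cost of the extra join whenever the view is queried for everything.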
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 7, 2010 at 10:57 AM, Gurgel, Flavio fla...@4linux.com.br wrote:

- Matthew Wakeling matt...@flymine.org escreveu: On Thu, 7 Jan 2010, Gurgel, Flavio wrote: Postgres does not change a query plan according to the shared_buffers setting. It does not anticipate one step contributing to another step in this way. It does, however, make use of the effective_cache_size setting to estimate this effect, and that does affect the planner.

That was what I was trying to say :)

In a situation like this, the opposite will be true. If you were accessing a very small part of a table, say to order by a field with a small limit, then an index can be very useful by providing the results in the correct order. However, in this case, almost the entire table has to be read. Changing the order in which it is read will mean that the disc access is no longer sequential, which will slow things down, not speed them up. The Postgres planner isn't stupid (mostly); there is probably a good reason why it isn't using an index scan.

Sorry, but I disagree. This is the typical case where the test has to be made. The results are partial, let's say 50% of the table. Considering that the disk is fast enough, the cost estimates of sequential and random reads are in a proportion of 1 to 4, considering default settings in PostgreSQL.

You do know that indexes in PostgreSQL are not covering, right? I.e. after hitting the index, the db then has to hit the table to see if those rows are in fact visible. So there's no such thing in pgsql, at the moment, as an index-only scan.
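Flavio's "the test has to be made" is cheap to do in practice. A hedged sketch of forcing the comparison for a single session, using one of the benchmark queries discussed in this thread:

```sql
-- Baseline plan (presumably a seqscan):
EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime
    WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;

-- Discourage the sequential scan for this session and compare timings:
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime
    WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;

RESET enable_seqscan;
```

On a query touching roughly half the table, the expectation from Matthew's argument is that the index plan loses badly, but the EXPLAIN ANALYZE numbers settle it either way.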
Re: [PERFORM] Air-traffic benchmark
- Scott Marlowe scott.marl...@gmail.com escreveu:

You do know that indexes in postgresql are not covering right? I.e. after hitting the index, the db then has to hit the table to see if those rows are in fact visible. So there's no such thing in pgsql, at the moment, as an index only scan.

That was just an estimate of the effort to reach a tuple through seqscan vs. indexscan. In both cases the tuple has to be checked, sure.

Flavio Henrique A. Gurgel
tel. 55-11-2125.4786 cel. 55-11-8389.7635
www.4linux.com.br
FREE SOFTWARE SOLUTIONS
Re: [PERFORM] Air-traffic benchmark
Lefteris wrote:

So we all agree that the problem is on the scans:) So the next question is why changing shared memory buffers will fix that? I only have one session with one connection, do I have like many reader workers or something?

I wouldn't expect it to. Large sequential scans like this one are optimized in PostgreSQL to only use up a small portion of the shared_buffers cache. Allocating more RAM to the database won't do much about the fact that you're spending the whole time waiting for physical I/O to happen. What might help is increasing effective_cache_size a lot, though, because there you might discover the database switching to all new sorts of plans for some of these queries. But, again, that doesn't impact the situation where a sequential scan is the only approach.

I have this whole data set on my PC already and have been trying to find time to get it loaded and start my own tests here; it is a quite interesting set of information. Can you tell me what you had to do in order to get it running in PostgreSQL? If you made any customizations there, I'd like to get a copy of them. It would save me some time and help me get to where I could give out suggestions if I had, for example, a pg_dumpall --schema-only dump from your database (or however you got the schema into there) and the set of PostgreSQL-compatible queries you're using.

By the way: if anybody else wants to join in, here's a script that generates a script to download the whole data set:

#!/usr/bin/env python
for y in range(1988,2010):
    for m in range(1,13):
        print "wget --limit-rate=100k http://www.transtats.bts.gov/Download/On_Time_On_Time_Performance_%s_%s.zip" % (y, m)

It's 3.8GB of download that uncompresses into 46GB of CSV data, which is why I put the rate limiter on there; it kept it from clogging my entire Internet connection.
--
Greg Smith  2ndQuadrant  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com
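Greg's effective_cache_size suggestion is a one-line planner hint; a sketch, with an illustrative value for the 8GB machine described earlier in the thread (not a tuned recommendation):

```sql
-- effective_cache_size is only the planner's estimate of how much of the
-- database the OS cache is likely to hold; it allocates no memory itself.
-- Illustrative value, assuming most of the machine's 8GB is free for caching:
SET effective_cache_size = '6GB';
-- Then re-run EXPLAIN on the slow queries to see whether any plans change.
```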
Re: [PERFORM] Air-traffic benchmark
Hi Greg, thank you for your help. The changes I did on the dataset were just removing the last comma from the CSV files, as it was interpreted by pg as an extra column. The schema I used, the load script and queries can be found at: http://homepages.cwi.nl/~lsidir/postgres/ (I understood that if I attach these files here, my email will not reach the list, so I give you a link to download them instead).

Also, since you are interested in the benchmark, you can check http://homepages.cwi.nl/~mk/ontimeReport for a report of various experiments with MonetDB and a comparison with previously published numbers. The schema I used for pg is slightly different from that of MonetDB, since the delay fields could not be parsed by pg as integers but only as varchar(4). Hence the extra index on the DepDelay field:) Also at http://homepages.cwi.nl/~lsidir/PostgreSQL-ontimeReport you can see the detailed times I got from postgres.

I really appreciate your help! This is a great opportunity for me to get some feeling and insights on postgres, since I never had the chance to use it in a large-scale project.

lefteris

On Thu, Jan 7, 2010 at 11:21 PM, Greg Smith g...@2ndquadrant.com wrote: [...]
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 7, 2010 at 11:57 PM, Lefteris lsi...@gmail.com wrote: [...] The schema I used for pg is slightly different from that of MonetDB since the delay fields could not be parsed by pg as integers but only as varchar(4). Hence the extra index on the DepDelay field:) [...]

Sorry, I mean the ArrTime and DepTime fields were changed, because they appear in the data as HHMM, but these fields were not used in the queries. The index on DepDelay was done for q3, q4, q5 and q7.

lefteris
Re: [PERFORM] Air-traffic benchmark
On Thu, Jan 07, 2010 at 01:38:41PM +0100, Lefteris wrote:

airtraffic=# EXPLAIN ANALYZE SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek ORDER BY c DESC;

Well, this query basically has to be slow. The correct approach to this problem is to add precalculated aggregates, maintained either with triggers or with some cronjob. Afterwards, query speed depends only on how good and how detailed your aggregates are. Of course calculating them is not free, but it is done on write (or periodically), not on user request, which makes user requests *much* faster.

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
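depesz's precalculated-aggregate idea could look roughly like this; the summary-table name and the cron-driven refresh strategy are made up for illustration:

```sql
-- Hypothetical summary table, refreshed periodically (e.g. from a cronjob).
CREATE TABLE ontime_dow_counts (
    DayOfWeek integer PRIMARY KEY,
    flights   bigint NOT NULL
);

-- Periodic refresh: pay the big scan once, on a schedule, not per request.
TRUNCATE ontime_dow_counts;
INSERT INTO ontime_dow_counts
    SELECT DayOfWeek, count(*) FROM ontime
    WHERE Year BETWEEN 2000 AND 2008 GROUP BY DayOfWeek;

-- The user-facing query now reads seven rows instead of scanning ~120M:
SELECT DayOfWeek, flights FROM ontime_dow_counts ORDER BY flights DESC;
```

A trigger-maintained variant keeps the counts exact on every write, at some insert cost; the cron variant trades freshness for cheap loads.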
Re: [PERFORM] Air-traffic benchmark
Craig Ringer cr...@postnewspapers.com.au writes:

Can Pg even read partial records? I thought it all operated on a page level, where if an index indicates that a particular value is present on a page, the whole page gets read in and all records on the page are checked for the value of interest. No?

The whole page gets read, but we do know which record on the page the index entry is pointing at. (This statement is an oversimplification, because of lossy indexes and lossy bitmap scans, but most of the time it's not a matter of checking all records on the page.)

regards, tom lane
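Tom's caveat about lossy bitmap scans is visible in EXPLAIN output as the Recheck Cond line on a Bitmap Heap Scan. A hedged illustration against the ontime table (assuming DepDelay is an integer column, and noting that the bitmap only degrades to page granularity when work_mem is exhausted):

```sql
-- Shrinking work_mem makes a lossy (page-level) bitmap more likely on a
-- big result set; the Recheck Cond then has to be re-applied per row.
SET work_mem = '64kB';
EXPLAIN SELECT count(*) FROM ontime WHERE DepDelay > 60;
-- Expect a plan shaped like:
--   Bitmap Heap Scan on ontime
--     Recheck Cond: (depdelay > 60)
--     -> Bitmap Index Scan on depdelay_idx
RESET work_mem;
```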