Re: [PERFORM] Slow count(*) again...

2010-10-21 Thread Scott Carey
On Oct 21, 2010, at 11:13 AM, Jesper Krogh wrote: > On 2010-10-21 06:47, Scott Carey wrote: > > On a wimpy disk, I/O bound for > sure. But my disks go 1000MB/sec. > > > No query can go fast enough for them. The best I've gotten is > > > 800MB/sec, on a wide row (average 800 bytes). Most tab

Re: [PERFORM] Slow count(*) again...

2010-10-21 Thread Jesper Krogh
On 2010-10-21 06:47, Scott Carey wrote: On a wimpy disk, I/O bound for sure. But my disks go 1000MB/sec. No query can go fast enough for them. The best I've gotten is 800MB/sec, on a wide row (average 800 bytes). Most tables go 300MB/sec or so. And with 72GB of RAM, many scans are in-mem

Re: [PERFORM] Slow count(*) again...

2010-10-20 Thread Scott Carey
On Oct 12, 2010, at 11:58 AM, Tom Lane wrote: > Jesper Krogh writes: >> On 2010-10-12 19:07, Tom Lane wrote: >>> Anyway, if anyone is hot to make COUNT(*) faster, that's where to look. > >> Just having 32 bytes bytes of "payload" would more or less double >> you time to count if I read you test

Re: [PERFORM] Slow count(*) again...

2010-10-20 Thread Bruce Momjian
bricklen wrote: > On Sat, Oct 9, 2010 at 4:26 PM, Neil Whelchel wrote: > > Maybe an > > estimate(*) that works like count but gives an answer from the index without > > checking visibility? I am sure that this would be good enough to make a page > > list, it is really no big deal if it errors on t

Re: [PERFORM] Slow count(*) again...

2010-10-15 Thread Greg Smith
Jesper Krogh wrote: To be honest, if it is EDB, Redpill, Command Prompt, 2nd Quadrant or whoever end up doing the job is, seen from this perspective not important, just it ends in the hands of someone "capable" of doing it. ... although Heikki has done some work on this task already. Now you're

Re: [PERFORM] Slow count(*) again...

2010-10-15 Thread Mladen Gogala
Neil Whelchel wrote: That is why I suggested an estimate(*) that works like (a faster) count(*) except that it may be off a bit. I think that is what he was talking about when he wrote this. The main problem with "select count(*)" is that it gets seriously mis-used. Using "select count

Re: [PERFORM] Slow count(*) again...

2010-10-15 Thread Mladen Gogala
Jon Nelson wrote: Well, I didn't quite mean that - having no familiarity with Oracle I don't know what the alter system statement does, but I was talking specifically about the linux buffer and page cache. Those are not utilized by Oracle. This is a RAC instance, running on top of ASM, w

Re: [PERFORM] Slow count(*) again...

2010-10-15 Thread Devrim GÜNDÜZ
On Wed, 2010-10-13 at 09:02 -0400, Greg Smith wrote: > XFS support is available as an optional module starting in RHEL 5.5. > In CentOS, you just grab it, so that's what I've been doing. My > understanding is that you may have to ask your sales rep to enable > access to it under the official Re

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Jesper Krogh
On 2010-10-14 21:56, Robert Haas wrote: On Thu, Oct 14, 2010 at 12:22 AM, mark wrote: Could this be an interesting test use of https://www.fossexperts.com/ ? 'Community' driven proposal - multiple people / orgs agree to pay various portions? Maybe with multiple funders a reasonable target

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Robert Haas
On Thu, Oct 14, 2010 at 12:22 AM, mark wrote: > Could this be an interesting test use of https://www.fossexperts.com/ ? > > 'Community' driven proposal - multiple people / orgs agree to pay various > portions? Maybe with multiple funders a reasonable target fund amount could > be reached. > > Just

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Jesper Krogh
On 2010-10-14 06:22, mark wrote: Could this be an interesting test use of https://www.fossexperts.com/ ? 'Community' driven proposal - multiple people / orgs agree to pay various portions? Maybe with multiple funders a reasonable target fund amount could be reached. I might convince my boss

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread mark
--Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Robert Haas Sent: Wednesday, October 13, 2010 7:29 AM To: Neil Whelchel Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow count(*) again... On Wed, Oct 13, 20

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Robert Haas
On Wed, Oct 13, 2010 at 1:59 PM, Jesper Krogh wrote: > On 2010-10-13 15:28, Robert Haas wrote: >> >> On Wed, Oct 13, 2010 at 6:16 AM, Neil Whelchel >>  wrote: >> >>> >>> I might go as far as to rattle the cage of the developers to see if it >>> makes >>> any sense to add some column oriented stora

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Mark Kirkwood
On 13/10/10 23:16, Neil Whelchel wrote: The good ol' bruit force approach! I knew I'd see this one sooner or later. Though I was not sure if I was going to see the 16TB of RAM suggestion first. Seriously though, as the title of this thread suggests, everything is relative. Sure count(*) and e

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Mark Kirkwood
On 13/10/10 21:44, Mladen Gogala wrote: On 10/13/2010 3:19 AM, Mark Kirkwood wrote: I think that major effect you are seeing here is that the UPDATE has made the table twice as big on disk (even after VACUUM etc), and it has gone from fitting in ram to not fitting in ram - so cannot be effectiv

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Neil Whelchel
On Wednesday 13 October 2010 06:27:34 you wrote: > On Wed, Oct 13, 2010 at 4:38 AM, Neil Whelchel wrote: > > There seems to be allot of discussion about VACUUM FULL, and its > > problems. The overall buzz seems to be that VACUUM FULL is a bad idea (I > > could be wrong here). It has been some tim

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Neil Whelchel
On Wednesday 13 October 2010 05:33:28 Mladen Gogala wrote: > On 10/13/2010 8:12 AM, Greg Smith wrote: > > The work incorporating a more stable XFS into RHEL started with xfsprogs > > 3.0.1-6 going into Fedora 11, and 3.1.X would represent a current > > release. So your Ubuntu kernel is two major

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread david
On Wed, 13 Oct 2010, Tom Lane wrote: Neil Whelchel writes: That's probably got little to do with caching and everything to do with setting hint bits on the first SELECT pass. I concur with Mark's question about whether your UPDATE pushed the table size across the limit of what would fit in RA

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Alex Hunsaker
On Wed, Oct 13, 2010 at 02:38, Neil Whelchel wrote: > And the cache helps... > So, we are right back to within 10ms of where we started after INSERTing the > data, but it took a VACUUM FULL to accomplish this (by making the table fit in > RAM). > This is a big problem on a production machine as t

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Alex Hunsaker
On Wed, Oct 13, 2010 at 07:49, Tom Lane wrote: > Neil Whelchel writes: > I concur with Mark's question about whether your UPDATE pushed the table > size across the limit of what would fit in RAM. Yeah, you said you have ~2GB of ram, just counting the bytes and the number of rows (not including

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Jesper Krogh
On 2010-10-13 15:28, Robert Haas wrote: On Wed, Oct 13, 2010 at 6:16 AM, Neil Whelchel wrote: I might go as far as to rattle the cage of the developers to see if it makes any sense to add some column oriented storage capability to Postgres. That would be the hot ticket to be able to specify

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Tom Lane
Neil Whelchel writes: > Insert the data into one table: > crash:~# time psql -U test test -q < log.sql > real679m43.678s > user1m4.948s > sys 13m1.893s > crash:~# echo 3 > /proc/sys/vm/drop_caches > crash:~# time psql -U test test -c "SELECT count(*) FROM log;" > count > ---

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Kevin Grittner
Neil Whelchel wrote: > crash:~# time psql -U test test -c "UPDATE log SET > raw_data=raw_data+1" > UPDATE 10050886 > > real14m13.802s > user0m0.000s > sys 0m0.000s > > crash:~# time psql -U test test -c "SELECT count(*) FROM log;" > count > -- > 10050886 > (1 row) >

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Robert Haas
On Wed, Oct 13, 2010 at 6:16 AM, Neil Whelchel wrote: > I might go as far as to rattle the cage of the developers to see if it makes > any sense to add some column oriented storage capability to Postgres. That > would be the hot ticket to be able to specify an attribute on a column so that > the b

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Robert Haas
On Wed, Oct 13, 2010 at 4:38 AM, Neil Whelchel wrote: > There seems to be allot of discussion about VACUUM FULL, and its problems. The > overall buzz seems to be that VACUUM FULL is a bad idea (I could be wrong > here). It has been some time since I have read the changelogs, but I seem to > rememb

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Greg Smith
Mladen Gogala wrote: I am afraid that my management will not let me use anything that doesn't exist as a RPM package in the current Red Hat distribution. No Ubuntu, no Fedora, no manual linking. There will always be that ominous question: how many other companies are using XFS? From the busin

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Mladen Gogala
On 10/13/2010 8:12 AM, Greg Smith wrote: The work incorporating a more stable XFS into RHEL started with xfsprogs 3.0.1-6 going into Fedora 11, and 3.1.X would represent a current release. So your Ubuntu kernel is two major improvement releases behind, 3.0 and 3.1 were the upgrades to xfsprogs

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Greg Smith
Dan Harris wrote: I'm not sure how to figure out what version of XFS we're on.. but it's Linux kernel 2.6.24-24 x86_64 on Ubuntu Server 8.04.3. Postgres version 8.3 There's the kernel side support that matches your kernel, as well as the xfsprogs package. The latter is where a lot of distri

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Craig Ringer
On 13/10/2010 12:38 AM, Jesper Krogh wrote: If some clever postgres hacker could teach postgres to allocate blocks using posix_fallocate in quite large batches, say .. something like: fallocate(min(current_relation_size *0.1,1073741824)) There doesn't seem to be any use of posix_fallocate in t

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Vitalii Tymchyshyn
12.10.10 21:58, Tom Lane написав(ла): I'm less than convinced that that approach will result in a significant win. It's certainly not going to do anything to convert COUNT(*) into an O(1) operation, which frankly is what the complainants are expecting. There's basically no hope of solving the "

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Vitalii Tymchyshyn
12.10.10 14:44, Craig Ringer написав(ла): in the case where you are doing a count(*) where query and the where is on an indexed column, could the search just look at the index + the visibility mapping rather than doing an sequential search through the table? Nope, because the visibility map,

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Neil Whelchel
On Wednesday 13 October 2010 01:50:23 Mark Kirkwood wrote: > On 13/10/10 21:38, Neil Whelchel wrote: > > So with our conclusion pile so far we can deduce that if we were to keep > > all of our data in two column tables (one to link them together, and the > > other to store one column of data), we s

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Mark Kirkwood
On 13/10/10 21:38, Neil Whelchel wrote: So with our conclusion pile so far we can deduce that if we were to keep all of our data in two column tables (one to link them together, and the other to store one column of data), we stand a much better chance of making the entire table to be counted fit

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Mladen Gogala
On 10/13/2010 3:19 AM, Mark Kirkwood wrote: I think that major effect you are seeing here is that the UPDATE has made the table twice as big on disk (even after VACUUM etc), and it has gone from fitting in ram to not fitting in ram - so cannot be effectively cached anymore. In the real world, t

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Mladen Gogala
On 10/13/2010 2:47 AM, Neil Whelchel wrote: Even with VACUUM database table speed degrades What the heck is the "database table speed"? Tables don't do anything. -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com -- Sent via pgsql-performance mai

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Neil Whelchel
On Wednesday 13 October 2010 00:19:26 Mark Kirkwood wrote: > On 13/10/10 19:47, Neil Whelchel wrote: > > Nope... > > So, possible conclusions are: > > 1. Even with VACUUM database table speed degrades as tables are updated. > > 2. Time testing on a freshly INSERTed table gives results that are not

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Pierre C
I guess I have to comment here again and point out that while I am having this issue with text searches, I avoid using count(*) in such cases, I just use next and previous links. Unfortunately sometimes you got to do an ORDER BY on search results, and then all the rows got to be read...

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Mark Kirkwood
On 13/10/10 19:47, Neil Whelchel wrote: Nope... So, possible conclusions are: 1. Even with VACUUM database table speed degrades as tables are updated. 2. Time testing on a freshly INSERTed table gives results that are not real- world. 3. Filesystem defragmentation helps (some). 4. Cache only mak

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Neil Whelchel
On Sunday 10 October 2010 21:15:56 Neil Whelchel wrote: > Right now, I am building a test machine with two dual core Intel processors > and two 15KRPM mirrored hard drives, 1 GB ram. I am using a small amount of > ram because I will be using small test tables. I may do testing in the > future with

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Robert Haas
On Tue, Oct 12, 2010 at 1:07 PM, Tom Lane wrote: > Anyway, if anyone is hot to make COUNT(*) faster, that's where to look. > I don't think any of the previous discussion in this thread is on-point > at all, except for the parts where people suggested avoiding it. I kind of hope that index-only sc

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris
On 10/12/10 4:33 PM, Neil Whelchel wrote: On Tuesday 12 October 2010 08:39:19 Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Neil Whelchel
On Tuesday 12 October 2010 14:35:01 you wrote: > > suggest that 99% instances of the "select count(*)" idiom are probably > > > >> bad use of the SQL language. > > Well, suppose you paginate results. If the user sees that the search query > returns 500 pages, there are two options : > > - you're

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Neil Whelchel
On Tuesday 12 October 2010 08:39:19 Dan Harris wrote: > On 10/11/10 8:02 PM, Scott Carey wrote: > > would give you a 1MB read-ahead. Also, consider XFS and its built-in > > defragmentation. I have found that a longer lived postgres DB will get > > extreme file fragmentation over time and sequen

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Mladen Gogala
Pierre C wrote: Well, suppose you paginate results. If the user sees that the search query returns 500 pages, there are two options : With Google, I usually lose patience on the page 3. All that I, as an end user, need to know is whether there are more than 10 pages. The fact that ther

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Neil Whelchel
On Tuesday 12 October 2010 07:19:57 you wrote: > >> The biggest single problem with "select count(*)" is that it is > >> seriously overused. People use that idiom to establish existence, which > >> usually leads to a performance disaster in the application using it, > >> unless the table has no mor

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Pierre C
suggest that 99% instances of the "select count(*)" idiom are probably bad use of the SQL language. Well, suppose you paginate results. If the user sees that the search query returns 500 pages, there are two options : - you're google, and your sorting algorithms are so good that the answe

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Tom Lane
Jesper Krogh writes: > On 2010-10-12 19:07, Tom Lane wrote: >> Anyway, if anyone is hot to make COUNT(*) faster, that's where to look. > Just having 32 bytes bytes of "payload" would more or less double > you time to count if I read you test results correctly?. .. and in the > situation where dis

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jesper Krogh
On 2010-10-12 19:07, Tom Lane wrote: Anyway, if anyone is hot to make COUNT(*) faster, that's where to look. I don't think any of the previous discussion in this thread is on-point at all, except for the parts where people suggested avoiding it. I would have to say that allthough it is nice

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Mladen Gogala
da...@lang.hm wrote: On Tue, 12 Oct 2010, Mladen Gogala wrote: da...@lang.hm wrote: from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avoid people complaining about it not being fast. Fixing PR stuff is not the approach that I would take

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Tom Lane
So I spent a bit of quality time with oprofile this morning, and found once again that there's no substitute for having actual data before theorizing. Test case software: current Git HEAD (plus one code change explained below), compiled with --enable-debug to support oprofile, cassert off; no othe

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris
On 10/12/10 10:44 AM, Scott Carey wrote: On Oct 12, 2010, at 8:39 AM, Dan Harris wrote: On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmen

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Scott Carey
On Oct 12, 2010, at 9:46 AM, Scott Carey wrote: > > On Oct 12, 2010, at 8:54 AM, wrote: > >> On Tue, 12 Oct 2010, Craig Ringer wrote: >> >>> On 10/12/2010 04:22 PM, da...@lang.hm wrote: >>> from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avo

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Scott Carey
On Oct 12, 2010, at 8:54 AM, wrote: > On Tue, 12 Oct 2010, Craig Ringer wrote: > >> On 10/12/2010 04:22 PM, da...@lang.hm wrote: >> >>> from a PR point of view, speeding up the trivil count(*) case could be >>> worth it, just to avoid people complaining about it not being fast. >> >> At the c

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Scott Carey
On Oct 12, 2010, at 8:39 AM, Dan Harris wrote: > On 10/11/10 8:02 PM, Scott Carey wrote: >> would give you a 1MB read-ahead. Also, consider XFS and its built-in >> defragmentation. I have found that a longer lived postgres DB will get >> extreme >> file fragmentation over time and sequential

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jesper Krogh
On 2010-10-12 18:02, Scott Carey wrote: However, for large reporting queries and sequential scans, XFS will win in the long run if you use the online defragmenter. Otherwise, your sequential scans won't be all that sequential on any file system over time if your tables aren't written once, f

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Scott Carey
>> > > A count with any joins or filter criteria would still have to scan all > pages with visible tuples in them. So the visibility map helps speed up > scanning of bloated tables, but doesn't provide a magical "fast count" > except in the utterly trivial "select count(*) from tablename;" cas

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Chris Browne
sgend...@ideasculptor.com (Samuel Gendler) writes: > Geez.  I wish someone would have written something quite so bold as > 'xfs is always faster than ext3' in the standard tuning docs.  I > couldn't find anything that made a strong filesystem > recommendation.  How does xfs compare to ext4?  I woun

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Samuel Gendler
On Tue, Oct 12, 2010 at 9:02 AM, Scott Carey wrote: > > > However, for large reporting queries and sequential scans, XFS will win in > the long run if you use the online defragmenter. Otherwise, your sequential > scans won't be all that sequential on any file system over time if your > tables aren

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Scott Carey
On Oct 11, 2010, at 9:21 PM, Samuel Gendler wrote: On Mon, Oct 11, 2010 at 9:06 PM, Scott Carey mailto:sc...@richrelevance.com>> wrote: I can't speak to documentation, but it is something that helps as your I/O subsystem gets more powerful, and how much it helps depends more on your hardware

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread david
On Tue, 12 Oct 2010, Craig Ringer wrote: On 10/12/2010 04:22 PM, da...@lang.hm wrote: from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avoid people complaining about it not being fast. At the cost of a fair bit more complexity, though, and slowing ever

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread david
On Tue, 12 Oct 2010, Mladen Gogala wrote: da...@lang.hm wrote: from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avoid people complaining about it not being fast. Fixing PR stuff is not the approach that I would take. People are complaining about selec

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread david
On Tue, 12 Oct 2010, Joe Uhl wrote: The biggest single problem with "select count(*)" is that it is seriously overused. People use that idiom to establish existence, which usually leads to a performance disaster in the application using it, unless the table has no more than few hundred records.

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Dan Harris
On 10/11/10 8:02 PM, Scott Carey wrote: would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentation. I have found that a longer lived postgres DB will get extreme file fragmentation over time and sequential scans end up mostly random. On-line file defrag helps treme

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Kevin Grittner
Jon Nelson wrote: > Greg Smith wrote: >> Usually the sequence used to remove all cached data from RAM >> before a benchmark is: > > All cached data (as cached in postgresql - *not* the Linux system > caches)..., right? No. The stop and start of PostgreSQL causes empty PostgreSQL caches. Th

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread bricklen
On Sat, Oct 9, 2010 at 4:26 PM, Neil Whelchel wrote: > Maybe an > estimate(*) that works like count but gives an answer from the index without > checking visibility? I am sure that this would be good enough to make a page > list, it is really no big deal if it errors on the positive side, maybe th

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Joe Uhl
The biggest single problem with "select count(*)" is that it is seriously overused. People use that idiom to establish existence, which usually leads to a performance disaster in the application using it, unless the table has no more than few hundred records. SQL language, of which PostgreSQL offe

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Mladen Gogala
Tom Lane wrote: Mladen Gogala writes: The number of rows is significantly smaller, but the table contains rather significant "text" field which consumes quite a bit of TOAST storage and the sizes are comparable. Postgres read through 27GB in 113 seconds, less than 2 minutes and oracle took

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Kevin Grittner
Neil Whelchel wrote: > What is the best method to make a page of results and a list of > links to other pages of results? For our most heavily used web app we decided to have the renderer just read the list of cases and render the pages to disk, and then present the first one. We set a limit

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Tom Lane
Mladen Gogala writes: > The number of rows is significantly smaller, but the table contains > rather significant "text" field which consumes quite a bit of TOAST > storage and the sizes are comparable. Postgres read through 27GB in 113 > seconds, less than 2 minutes and oracle took 2 minutes 37

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jon Nelson
On Tue, Oct 12, 2010 at 8:18 AM, Greg Smith wrote: > No.  Usually the sequence used to remove all cached data from RAM before a > benchmark is: All cached data (as cached in postgresql - *not* the Linux system caches)..., right? -- Jon -- Sent via pgsql-performance mailing list (pgsql-perfor

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Luca Tettamanti
On Tue, Oct 12, 2010 at 3:07 PM, Jon Nelson wrote: > On Tue, Oct 12, 2010 at 7:27 AM, Mladen Gogala > wrote: >> >> So, the results weren't cached the first time around. The explanation is the >> fact that Oracle, as of the version 10.2.0, reads the table in the private >> process memory, not in t

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Greg Smith
Jon Nelson wrote: Is there a command to tell postgresql to drop/clear/reset it's buffer_cache? No. Usually the sequence used to remove all cached data from RAM before a benchmark is: pg_ctl stop sync echo 3 > /proc/sys/vm/drop_caches pg_ctl start -- Greg Smith, 2ndQuadrant US g...@2ndqu

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jon Nelson
On Tue, Oct 12, 2010 at 7:27 AM, Mladen Gogala wrote: > > So, the results weren't cached the first time around. The explanation is the > fact that Oracle, as of the version 10.2.0, reads the table in the private > process memory, not in the shared buffers.  This table alone is  35GB in > size,  Or

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Mladen Gogala
da...@lang.hm wrote: from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avoid people complaining about it not being fast. Fixing PR stuff is not the approach that I would take. People are complaining about select count(*) because they're using it in all

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Craig Ringer
On 10/12/2010 04:22 PM, da...@lang.hm wrote: from a PR point of view, speeding up the trivil count(*) case could be worth it, just to avoid people complaining about it not being fast. At the cost of a fair bit more complexity, though, and slowing everything else down. The proper solution he

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Vitalii Tymchyshyn
12.10.10 11:14, Craig Ringer написав(ла): On 10/12/2010 03:56 PM, Vitalii Tymchyshyn wrote: BTW: There is a lot of talk about MVCC, but is next solution possible: 1) Create a page information map that for each page in the table will tell you how may rows are within and if any write (either succ

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread david
On Tue, 12 Oct 2010, Craig Ringer wrote: BTW: There is a lot of talk about MVCC, but is next solution possible: 1) Create a page information map that for each page in the table will tell you how may rows are within and if any write (either successful or not) were done to this page. This even c

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Craig Ringer
On 10/12/2010 03:56 PM, Vitalii Tymchyshyn wrote: BTW: There is a lot of talk about MVCC, but is next solution possible: 1) Create a page information map that for each page in the table will tell you how may rows are within and if any write (either successful or not) were done to this page. This

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Vitalii Tymchyshyn
11.10.10 20:46, Craig James написав(ла): First of all, it's not true. There are plenty of applications that need an exact answer. Second, even if it is only 1%, that means it's 1% of the queries, not 1% of people. Sooner or later a large fraction of developers will run into this. It's probabl

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Greg Smith
Samuel Gendler wrote: I spent some time going through the various tuning docs on the wiki whie bringing some new hardware up and I can't remember seeing any discussion of tweaking read-ahead at all in the normal performance-tuning references. Do you have any documentation of the kinds of twea

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Greg Smith
Mladen Gogala wrote: I agree, but I am afraid that after the demise of SGI, XFS isn't being developed. It's back to being well maintained again; see http://blog.2ndquadrant.com/en/2010/04/the-return-of-xfs-on-linux.html for some history here and why it's become relevant to RedHat in particul

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread david
On Mon, 11 Oct 2010, Samuel Gendler wrote: On Mon, Oct 11, 2010 at 9:06 PM, Scott Carey wrote: I can't speak to documentation, but it is something that helps as your I/O subsystem gets more powerful, and how much it helps depends more on your hardware, which may have adaptive read ahead on its

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Samuel Gendler
On Mon, Oct 11, 2010 at 9:06 PM, Scott Carey wrote: > I can't speak to documentation, but it is something that helps as your I/O > subsystem gets more powerful, and how much it helps depends more on your > hardware, which may have adaptive read ahead on its own, and your file > system which may be

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Scott Carey
I can't speak to documentation, but it is something that helps as your I/O subsystem gets more powerful, and how much it helps depends more on your hardware, which may have adaptive read ahead on its own, and your file system which may be more or less efficient at sequential I/O. For example ex

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Samuel Gendler
On Mon, Oct 11, 2010 at 7:19 PM, Greg Smith wrote: > > > This is a problem for the operating system to solve, and such solutions out > there are already good enough that PostgreSQL has little reason to try and > innovate in this area. I routinely see seq scan throughput double on Linux > just by

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Neil Whelchel
On Monday 11 October 2010 19:23:46 Mladen Gogala wrote: > On 10/11/2010 10:02 PM, Scott Carey wrote: > > Did you tune the linux FS read-ahead first? You can get large gains by > > doing that if you are on ext3. blockdev --setra 2048 > > Actually, I have blockdev --setra 32768 > > > would give

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Neil Whelchel
On Monday 11 October 2010 16:50:36 you wrote: > On 10/11/2010 3:54 PM, Neil Whelchel wrote: > > 1. A faster count(*), or something like my proposed estimate(*). > > 2. A way to get the total rows matched when using LIMIT and OFFSET before > > LIMIT and OFFSET are applied. > > The biggest single

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Mladen Gogala
On 10/11/2010 10:02 PM, Scott Carey wrote: Did you tune the linux FS read-ahead first? You can get large gains by doing that if you are on ext3. blockdev --setra 2048 Actually, I have blockdev --setra 32768 would give you a 1MB read-ahead. Also, consider XFS and its built-in defragmentat

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Scott Carey
On Oct 11, 2010, at 7:02 PM, Scott Carey wrote: > > On Oct 10, 2010, at 10:14 AM, Mladen Gogala wrote: > >> >> SQL> show parameter db_file_multi >> >> NAME TYPEVALUE >> --- >> --

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Greg Smith
Joshua Tolley wrote: It was asserted that reading bigger chunks would help performance; a response suggested that, at least in Linux, setting readahead on a device would essentially do the same thing. Or that's what I got from the thread, anyway. I'm interested to know how similar performance mig

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Scott Carey
On Oct 10, 2010, at 10:14 AM, Mladen Gogala wrote: > > SQL> show parameter db_file_multi > > NAME TYPEVALUE > --- > -- > db_file_multiblock_read_countinteger 16 > SQL> a

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Mladen Gogala
On 10/11/2010 3:54 PM, Neil Whelchel wrote: 1. A faster count(*), or something like my proposed estimate(*). 2. A way to get the total rows matched when using LIMIT and OFFSET before LIMIT and OFFSET are applied. The biggest single problem with "select count(*)" is that it is seriously overus

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Samuel Gendler
On Mon, Oct 11, 2010 at 12:54 PM, Neil Whelchel wrote: > > 2. You need a slice of the data which requires another scan to the table to > get, and using the same WHERE clause as above. This seems like a total > waste, > because we just did that with the exception of actually fetching the data. > >

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Neil Whelchel
On Monday 11 October 2010 10:46:17 Craig James wrote: > On 10/9/10 6:47 PM, Scott Marlowe wrote: > > On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel wrote: > >> I know that there haven been many discussions on the slowness of > >> count(*) even when an index is involved because the visibility of th

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Craig James
On 10/9/10 6:47 PM, Scott Marlowe wrote: On Sat, Oct 9, 2010 at 5:26 PM, Neil Whelchel wrote: I know that there haven been many discussions on the slowness of count(*) even when an index is involved because the visibility of the rows has to be checked. In the past I have seen many suggestions a

Re: [PERFORM] Slow count(*) again...

2010-10-11 Thread Pierre C
I ran into a fine example of this when I was searching this mailing list, "Searching in 856,646 pages took 13.48202 seconds. Site search powered by PostgreSQL 8.3." Obviously at some point count(*) came into play here Well, tsearch full text search is excellent, but it has to work inside the

Re: [PERFORM] Slow count(*) again...

2010-10-10 Thread Neil Whelchel
On Sunday 10 October 2010 15:41:16 you wrote: > On 10/11/2010 01:14 AM, Mladen Gogala wrote: > > I can provide measurements, but from Oracle RDBMS. Postgres doesn't > > allow tuning of that aspect, so no measurement can be done. Would the > > numbers from Oracle RDBMS be acceptable? > > Well, they

Re: [PERFORM] Slow count(*) again...

2010-10-10 Thread Craig Ringer
On 11/10/10 11:14, Mladen Gogala wrote: > On 10/10/2010 8:27 PM, Joshua Tolley wrote: >> It was asserted that reading bigger chunks would help performance; a >> response >> suggested that, at least in Linux, setting readahead on a device would >> essentially do the same thing. Or that's what I got

Re: [PERFORM] Slow count(*) again...

2010-10-10 Thread Joshua Tolley
On Sun, Oct 10, 2010 at 11:14:43PM -0400, Mladen Gogala wrote: > The fact is, however, that the question > about slow sequential scan appears with some regularity on PostgreSQL > forums. Definitely. Whether that's because there's something pathologically wrong with sequential scans, or just be

Re: [PERFORM] Slow count(*) again...

2010-10-10 Thread Mladen Gogala
On 10/10/2010 8:27 PM, Joshua Tolley wrote: It was asserted that reading bigger chunks would help performance; a response suggested that, at least in Linux, setting readahead on a device would essentially do the same thing. Or that's what I got from the thread, anyway. I'm interested to know how

  1   2   >