Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-31 Thread Robert Haas
On Mon, Oct 31, 2011 at 9:51 AM, Anssi Kääriäinen wrote: > Stupid question, but why not keep the whole VM pinned? It might be that keeping more than one VM page pinned is a good idea, but we'd have to think carefully about it. For example, if we pin too many pages in shared_buffers, other querie

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-31 Thread Anssi Kääriäinen
On 10/31/2011 02:44 PM, Robert Haas wrote: What I think you're probably measuring here (oprofile would tell us for sure) is that once the size of the table goes beyond about half a gigabyte, it will have more than one page in the visibility map. The index-only scan code keeps the most recently u

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-31 Thread Robert Haas
On Sun, Oct 30, 2011 at 8:02 AM, Kääriäinen Anssi wrote: > Table size is around 600MB, index size is around 350MB and VM on-disk > size is 16kB with default fillfactor. With fillfactor = 10, the VM size is 104 > KB, and table size is around 6GB.  The index size is the same. What I think you're pr

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-30 Thread Kääriäinen Anssi
Sorry, I forgot to include the version used & some information about my setup: PostgreSQL version: Git HEAD as of: Date: Fri Oct 28 21:18:36 2011 -0400 Commit: 51eba98cf4595e90730dedd9305da8aa84b649ee Compiled with defaults, (only change --with-pgport = 5431). I used default settings, shared_buf

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-30 Thread Kääriäinen Anssi
Quoting Robert Haas: """ I tried this on my MacBook Pro this morning, using pgbench -i -s 500 to create a database about 7.5GB in size, and then using "SELECT sum(1) FROM pgbench_accounts" as a test query, on a build WITHOUT --enable-cassert. This machine has 4GB of memory, and I set shared_buffer

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Robert Haas
On Fri, Oct 28, 2011 at 3:53 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Oct 28, 2011 at 3:27 PM, Tom Lane wrote: >>> Yeah, I find that unlikely as well.  But leaving Asserts in place would >>> tell us. > >> OK.  Should I go do that, or are you all over it? > > Go for it. OK, done. An

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 28, 2011 at 3:27 PM, Tom Lane wrote: >> Yeah, I find that unlikely as well.  But leaving Asserts in place would >> tell us. > OK. Should I go do that, or are you all over it? Go for it. regards, tom lane -- Sent via pgsql-hackers mai

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Robert Haas
On Fri, Oct 28, 2011 at 3:27 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Oct 28, 2011 at 2:48 PM, Tom Lane wrote: >>> Hmm.  I wonder whether it wouldn't be better to get rid of the range >>> checks in BufferIsValid, or better convert them into Asserts.  It seems >>> less than intuitive

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 28, 2011 at 2:48 PM, Tom Lane wrote: >> Hmm.  I wonder whether it wouldn't be better to get rid of the range >> checks in BufferIsValid, or better convert them into Asserts.  It seems >> less than intuitive that BufferIsValid and BufferIsInvalid aren't simple >>

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Robert Haas
On Fri, Oct 28, 2011 at 2:48 PM, Tom Lane wrote: > Robert Haas writes: >> I also tried changing the BufferIsValid() tests in >> visibilitymap_test() to use BufferIsInvalid() instead, with the sense >> of the tests reversed (see attached vismap-test-invalid.patch).  Since >> BufferIsInvalid() just

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Tom Lane
Robert Haas writes: > I also tried changing the BufferIsValid() tests in > visibilitymap_test() to use BufferIsInvalid() instead, with the sense > of the tests reversed (see attached vismap-test-invalid.patch). Since > BufferIsInvalid() just checks for InvalidBuffer instead of also doing > the sa

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Robert Haas
On Mon, Oct 24, 2011 at 4:23 PM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Oct 24, 2011 at 3:35 PM, Tom Lane wrote: >>> I wonder how trustworthy the measure of the visibilitymap_test call site >>> as a consumer of cycles really is. > >> I'm not sure either.  I guess we could try short-cir

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-28 Thread Bruce Momjian
Tom Lane wrote: > I wonder how trustworthy the measure of the visibilitymap_test call site > as a consumer of cycles really is. I've frequently noticed that > oprofile blames remarkably large fractions of the runtime on individual > statements that appear to be quite trivial. I'm not sure if that

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Wolfgang Wilhelm
Von: Tom Lane An: Robert Haas Cc: Kevin Grittner ; pgsql-hackers@postgresql.org Gesendet: 21:35 Montag, 24.Oktober 2011 Betreff: Re: [HACKERS] So, is COUNT(*) fast now? Robert Haas writes: > But even though Tom's statement that most indexes are one column might >

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Tom Lane
Robert Haas writes: > On Mon, Oct 24, 2011 at 3:35 PM, Tom Lane wrote: >> I wonder how trustworthy the measure of the visibilitymap_test call site >> as a consumer of cycles really is. > I'm not sure either. I guess we could try short-circuiting > visibilitymap_test and see what that does to pe

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Kevin Grittner
Josh Berkus wrote: > On 10/24/11 12:35 PM, Tom Lane wrote: >> Your point about people trying to create wider indexes to exploit >> index-only scans is an interesting one, but I think it's >> premature to optimize on the basis of hypotheses about what >> people might do in future. > > I don't th

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Robert Haas
On Mon, Oct 24, 2011 at 3:35 PM, Tom Lane wrote: > Your point about people trying to create wider indexes to exploit > index-only scans is an interesting one, but I think it's premature to > optimize on the basis of hypotheses about what people might do in > future. Well, I don't think it's too m

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Josh Berkus
On 10/24/11 12:35 PM, Tom Lane wrote: > Your point about people trying to create wider indexes to exploit > index-only scans is an interesting one, but I think it's premature to > optimize on the basis of hypotheses about what people might do in > future. I don't think that this is hypothetical at

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Tom Lane
Robert Haas writes: > But even though Tom's statement that most indexes are one column might > be a slight exaggeration, I suspect it probably is true that the > optimizations he's talking about for large numbers of columns won't > produce any material benefit even for a 3 or 4 column index. Whic

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Robert Haas
On Mon, Oct 24, 2011 at 2:15 PM, Kevin Grittner wrote: >> It doesn't look to me like the mean is above 2 (unless you have >> many fewer toast tables than I suspect), so trying to optimize >> many-column cases isn't going to help. > > The mean is 2.4 (give or take a little depending on whether you

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Kevin Grittner
Copy/paste problems -- the first set includes the system tables except for toast. User tables would be the difference between the results below. Sorry. -Kevin "Kevin Grittner" wrote: > Tom Lane wrote: > >> Yeah, TOAST indexes are 2-column. It would be best to exclude >> those from your

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Kevin Grittner
Tom Lane wrote: > Yeah, TOAST indexes are 2-column. It would be best to exclude > those from your counts, since it seems pretty unlikely that anyone > will care how fast nodeIndexonlyscan.c is for scans on toast > tables. User indexes (excluding toast): indnatts | count --+---

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Tom Lane
"Kevin Grittner" writes: > Tom Lane wrote: >> I had wondered whether it'd be worth optimizing that along the >> lines of slot_getallattrs(). But most indexes probably have only >> one column, or anyway not enough to make for a useful savings. >> From a heavily-used production database: > cir

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Kevin Grittner
Tom Lane wrote: > I had wondered whether it'd be worth optimizing that along the > lines of slot_getallattrs(). But most indexes probably have only > one column, or anyway not enough to make for a useful savings. >From a heavily-used production database: cir=> select indnatts, count(*) from

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-24 Thread Robert Haas
On Sun, Oct 23, 2011 at 7:01 PM, Jeff Janes wrote: > On Fri, Oct 21, 2011 at 12:52 PM, Robert Haas wrote: >> >> Also, this line is kind of expensive: >> >>        if (!visibilitymap_test(scandesc->heapRelation, >>                                ItemPointerGetBlockNumber(tid), >>                  

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-23 Thread Jeff Janes
On Fri, Oct 21, 2011 at 12:52 PM, Robert Haas wrote: > > Also, this line is kind of expensive: > >        if (!visibilitymap_test(scandesc->heapRelation, >                                ItemPointerGetBlockNumber(tid), >                                &node->ioss_VMBuffer)) > > Around 2%.  But I d

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-23 Thread Tom Lane
Jeff Janes writes: > count(*) and sum(1) do different things internally, and in my hands > sum(1) is ~10% slower. > I don't know how to dump the output of ExecBuildProjectionInfo into a > human readable form, so I don't know the basis of the difference. But > I wonder if using count(*) would lowe

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-23 Thread Jeff Janes
On Fri, Oct 21, 2011 at 12:07 PM, Robert Haas wrote: > On Fri, Oct 21, 2011 at 2:33 PM, Tom Lane wrote: >>> I think HeapTupleSatisfiesMVCC is probably being skipped anyway in >>> this case, since all the heap pages should be PD_ALL_VISIBLE. >> >> Proves my point ;-) ... you're comparing a code pa

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 21, 2011 at 3:55 PM, Tom Lane wrote: >> Robert Haas writes: >>> Anyhow, here's the scoop. On my desktop machine running F14, running >>> SELECT sum(1) FROM pgbench_accounts in a tight loop, 60 s worth of >>> oprofile data: >>> 176830 13.0801 postgres

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 21, 2011 at 10:57 PM, Jeff Janes wrote: >> If count(*) could cause the index-only scan to happen in physical >> order of the index, rather than logical order, that might be a big >> win.  Both for all in memory and for not-all-in-memory. > That's an interesting

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread karavelov
- Цитат от Tom Lane (t...@sss.pgh.pa.us), на 22.10.2011 в 19:19 - > Andres Freund writes: >> On Saturday, October 22, 2011 05:20:26 PM Tom Lane wrote: >>> Huh? In the case he's complaining about, the index is all in RAM. >>> Sequentiality of access is not an issue (at least not at the pa

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Robert Haas
On Fri, Oct 21, 2011 at 10:57 PM, Jeff Janes wrote: >> Yeah, but it works out to fewer pages. > > But since those pages are already in RAM, why would it matter all that > much?  (Other than in the case of highly concurrent access, which you > don't seem to be testing?) Well, because memory access

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Tom Lane
Andres Freund writes: > On Saturday, October 22, 2011 05:20:26 PM Tom Lane wrote: >> Huh? In the case he's complaining about, the index is all in RAM. >> Sequentiality of access is not an issue (at least not at the page >> level --- within a page I suppose there could be cache-line effects). > I

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Andres Freund
On Saturday, October 22, 2011 05:20:26 PM Tom Lane wrote: > Andres Freund writes: > > On Friday, October 21, 2011 08:14:12 PM Robert Haas wrote: > >> On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: > >>> It's not "touching six times less data". It's touching the exact same > >>> number of tuple

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Tom Lane
Andres Freund writes: > On Friday, October 21, 2011 08:14:12 PM Robert Haas wrote: >> On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: >>> It's not "touching six times less data". It's touching the exact same >>> number of tuples either way, just index tuples in one case and heap >>> tuples in t

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread desmodemone
2011/10/22 Andres Freund > On Friday, October 21, 2011 08:14:12 PM Robert Haas wrote: > > On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: > > > Robert Haas writes: > > >> On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: > > >>> I don't know why you'd imagine that touching an index is free, or

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-22 Thread Andres Freund
On Friday, October 21, 2011 08:14:12 PM Robert Haas wrote: > On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: > > Robert Haas writes: > >> On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: > >>> I don't know why you'd imagine that touching an index is free, or even > >>> cheap, CPU-wise. The who

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Jeff Janes
On Fri, Oct 21, 2011 at 11:14 AM, Robert Haas wrote: > On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: >> Robert Haas writes: >>> On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: I don't know why you'd imagine that touching an index is free, or even cheap, CPU-wise.  The whole point

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 3:55 PM, Tom Lane wrote: > Robert Haas writes: >> Anyhow, here's the scoop.  On my desktop machine running F14, running >> SELECT sum(1) FROM pgbench_accounts in a tight loop, 60 s worth of >> oprofile data: > >> 176830   13.0801  postgres                 postgres        

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Tom Lane
Robert Haas writes: > Hmm, I guess there is a bit of a hotspot in StoreIndexTuple, which is > probably being folded into IndexOnlyNext in the per-function timings: > ExecClearTuple(slot); > for (i = 0; i < nindexatts; i++) > values[i] = index_getattr(itup, i + 1, itupdesc, &isnull

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Tom Lane
Robert Haas writes: > Anyhow, here's the scoop. On my desktop machine running F14, running > SELECT sum(1) FROM pgbench_accounts in a tight loop, 60 s worth of > oprofile data: > 176830 13.0801 postgres postgres > ExecProject Hm, that's weird. In both these

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 3:07 PM, Robert Haas wrote: > [ oprofile results ] *grovels through the line-by-line results* Hmm, I guess there is a bit of a hotspot in StoreIndexTuple, which is probably being folded into IndexOnlyNext in the per-function timings: ExecClearTuple(slot); for (i

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 2:33 PM, Tom Lane wrote: >> I think HeapTupleSatisfiesMVCC is probably being skipped anyway in >> this case, since all the heap pages should be PD_ALL_VISIBLE. > > Proves my point ;-) ... you're comparing a code path that's been beat on > for *years* with one that just got

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: >> What this test case proves is that btree's overhead per index >> tuple touched is significantly more than the cost of the fastest path >> through HeapTupleSatisfiesMVCC, which I don't find surprising >> considering how much

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: >>> I don't know why you'd imagine that touching an index is free, or even >>> cheap, CPU-wise.  The whole point of the index-only optimization is to >>> avoid I/O.  When yo

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: >> I don't know why you'd imagine that touching an index is free, or even >> cheap, CPU-wise.  The whole point of the index-only optimization is to >> avoid I/O.  When you try it on a case where there's no I/O to be saved, >>

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane wrote: > Robert Haas writes: >> That's a bit disappointing - it's now more than a third faster to do >> the sequential scan, even though the sequential scan has to touch six >> times as many blocks (at scale factor 20, index is 43 MB, table is 256 >> MB)

Re: [HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Tom Lane
Robert Haas writes: > That's a bit disappointing - it's now more than a third faster to do > the sequential scan, even though the sequential scan has to touch six > times as many blocks (at scale factor 20, index is 43 MB, table is 256 > MB) all of which are in cache. Of course, touching that man

[HACKERS] So, is COUNT(*) fast now?

2011-10-21 Thread Robert Haas
Laments at: http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F http://wiki.postgresql.org/wiki/Slow_Counting I tried this on my MacBook Pro this morning, using pgbench -i -s 500 to create a database about 7.5GB in size, and then using "SELECT sum(1) FR