Re: [PERFORM] count(*) using index scan in "query often, update rarely" environment
On 10/7/05, Cestmir Hybl <[EMAIL PROTECTED]> wrote:
> No, I can't speed-up evaluation of generic "count(*) where ()" queries this way.

No, you can't speed up a generic where(), *but* you can check what the most common "where"s are. Usually I do a where on a single column, like:

    select count(*) from table where some_particular_column = 'some value';

and there you can simply make the trigger aware of the fact that it should count based on the value in some_particular_column. This works well enough for me not to look for alternatives.

depesz
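[For anyone wanting to try this, here is a minimal sketch of the per-value counting trigger depesz describes. All names (items, category, items_counts) are hypothetical; it assumes the summary table has been pre-populated with one row per distinct value, e.g. from an initial count(*) ... group by, and that the column is never NULL.]

    -- Hypothetical summary table: one cached count per distinct value
    -- of the column the common WHERE clause filters on.
    CREATE TABLE items_counts (
        category text PRIMARY KEY,
        n        bigint NOT NULL DEFAULT 0
    );

    CREATE OR REPLACE FUNCTION items_counts_maint() RETURNS trigger AS $$
    BEGIN
        -- Keep the cached count in step with every write to items.
        -- Assumes category is never NULL and its row already exists
        -- in items_counts.
        IF TG_OP = 'INSERT' THEN
            UPDATE items_counts SET n = n + 1 WHERE category = NEW.category;
        ELSIF TG_OP = 'DELETE' THEN
            UPDATE items_counts SET n = n - 1 WHERE category = OLD.category;
        ELSIF TG_OP = 'UPDATE' AND NEW.category <> OLD.category THEN
            UPDATE items_counts SET n = n - 1 WHERE category = OLD.category;
            UPDATE items_counts SET n = n + 1 WHERE category = NEW.category;
        END IF;
        RETURN NULL;  -- return value is ignored for AFTER triggers
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER items_counts_trig
        AFTER INSERT OR UPDATE OR DELETE ON items
        FOR EACH ROW EXECUTE PROCEDURE items_counts_maint();

With that in place, "select n from items_counts where category = 'some value'" replaces the sequential count(*) scan, at the cost of one extra row update per write.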
Re: [HACKERS] [PERFORM] A Better External Sort?
On Fri, Oct 07, 2005 at 09:20:59PM -0700, Luke Lonergan wrote:
> On 10/7/05 5:17 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> > On Fri, Oct 07, 2005 at 04:55:28PM -0700, Luke Lonergan wrote:
> >> On 10/5/05 5:12 PM, "Steinar H. Gunderson" <[EMAIL PROTECTED]> wrote:
> >>> What? strlen is definitely not in the kernel, and thus won't count as
> >>> system time.
> >> System time on Linux includes time spent in glibc routines.
> > Do you have a reference for this? I believe this statement to be 100% false.
> How about 99%? OK, you're right, I had this confused with the profiling
> problem where glibc routines aren't included in dynamically linked profiles.

Sorry to emphasize the 100%. It wasn't meant to judge you; it was meant to indicate that I believe 100% of system time is accounted for only while a system call is actually active, which is not possible while glibc is active. I believe the way it works is that a periodic timer interrupt increments a specific counter every time it wakes up: if it finds itself within the kernel, it increments the system time for the active process; if it finds itself outside the kernel, it increments the user time for the active process.

> Back to the statements earlier - the output of time had much of the time
> for a dd spent in system, which means kernel, so where in the kernel would
> that be exactly?

Not really an expert here; I only play around. At a minimum, there is a cost to switching from user context to system context and back, and then to filling in the zero bits. There may be other inefficiencies, however. Perhaps /dev/zero always fills in a whole block (8192 bytes, usually) before allowing the standard file system code to read only one byte. I dunno. But I see this oddity too:

    $ time dd if=/dev/zero of=/dev/zero bs=1 count=1000
    1000+0 records in
    1000+0 records out
    dd if=/dev/zero of=/dev/zero bs=1 count=1000  4.05s user 11.13s system 94% cpu 16.061 total

    $ time dd if=/dev/zero of=/dev/zero bs=10 count=100
    100+0 records in
    100+0 records out
    dd if=/dev/zero of=/dev/zero bs=10 count=100  0.37s user 1.37s system 100% cpu 1.738 total

From my numbers, it looks like 1-byte reads are expensive in both the user context and the system context. It looks almost linear, even:

    $ time dd if=/dev/zero of=/dev/zero bs=100 count=10
    10+0 records in
    10+0 records out
    dd if=/dev/zero of=/dev/zero bs=100 count=10  0.04s user 0.15s system 95% cpu 0.199 total

    $ time dd if=/dev/zero of=/dev/zero bs=1000 count=1
    1+0 records in
    1+0 records out
    dd if=/dev/zero of=/dev/zero bs=1000 count=1  0.01s user 0.02s system 140% cpu 0.021 total

At least some of this gets into the very in-depth discussions as to whether kernel threads or user threads are more efficient. Depending on the application, user threads can switch many times faster than kernel threads. Other parts of this may just mean that /dev/zero isn't implemented optimally.

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
Re: [PERFORM] count(*) using index scan in "query often, update rarely" environment
On Fri, Oct 07, 2005 at 12:48:16PM +0200, Steinar H. Gunderson wrote:
> On Fri, Oct 07, 2005 at 11:24:05AM +0200, Cestmir Hybl wrote:
> > Isn't it possible (and reasonable) for these environments to keep track
> > of whether there is a transaction in progress with an update to a given
> > table and, if not, use an index scan (count(*) where) or a cached value
> > (count(*)) to perform this kind of query?
> Even if there is no running update, there might still be dead rows in the
> table. In any case, of course, a new update could always be occurring while
> your counting query was still running.

I don't see this being different from count(*) as it is today.

Updating a count column is certainly clever. If using a trigger, perhaps it would allow the equivalent of:

    select count(*) from table for update;

:-)

(Not that this is necessarily a good thing!)

Cheers,
mark

--
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
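[To make the "equivalent of ... for update" remark concrete, here is a minimal sketch of the single-counter variant, with hypothetical names (mytable, mytable_count). Because every writer updates the same counter row, concurrent inserts and deletes serialize on its row lock, which is what gives the cached value its for-update-like flavor.]

    -- Hypothetical single-row counter maintained by a trigger.
    CREATE TABLE mytable_count (n bigint NOT NULL);
    INSERT INTO mytable_count VALUES (0);

    CREATE OR REPLACE FUNCTION mytable_count_maint() RETURNS trigger AS $$
    BEGIN
        -- Every writer updates the same row, so writers serialize here.
        IF TG_OP = 'INSERT' THEN
            UPDATE mytable_count SET n = n + 1;
        ELSE  -- DELETE
            UPDATE mytable_count SET n = n - 1;
        END IF;
        RETURN NULL;  -- return value is ignored for AFTER triggers
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER mytable_count_trig
        AFTER INSERT OR DELETE ON mytable
        FOR EACH ROW EXECUTE PROCEDURE mytable_count_maint();

Reading the cached value is then just "select n from mytable_count".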
Re: [HACKERS] [PERFORM] A Better External Sort?
On Thu, Sep 29, 2005 at 03:28:27PM +0200, Zeugswetter Andreas DAZ SD wrote:
> > In my original example, a sequential scan of the 1TB of 2KB or 4KB
> > records, => 250M or 500M records of data, being sorted on a binary value
> > key will take ~1000x more time than reading in the ~1GB Btree I described
> > that used a Key+RID (plus node pointers) representation of the data.
>
> Imho you seem to ignore the final step your algorithm needs: collecting
> the data rows. After you have sorted the keys, the collect step will
> effectively access the tuples in random order (given a sufficiently large
> key range).
>
> This random access is bad. It effectively allows a competing algorithm to
> read the whole data at least 40 times sequentially, or write the set 20
> times sequentially. (Those are the random/sequential ratios of modern
> discs.)

True, but there is a compromise: don't shuffle full tuples around when sorting in memory. Do your sorting with pointers, then write the full tuples out to 'tape' if needed. Of course, the other issue here is that as correlation improves, it becomes better and better to do full pointer-based sorting.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461