Re: [PERFORM] count(*) using index scan in "query often, update rarely" environment

2005-10-08 Thread hubert depesz lubaczewski
On 10/7/05, Cestmir Hybl <[EMAIL PROTECTED]> wrote:
No, I can't speed-up evaluation of generic 
"count(*) where ()" queries this way.
no you can't speed up generic where(), *but* you can check what are the
most common "where"'s (like usually i do where on one column like:
select count(*) from table where some_particular_column = 'some value';
where you can simply make the trigger aware of the fact that it should count based on value in some_particular_column.
works good enough for me not to look for alternatives.

depesz




Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-08 Thread mark
On Fri, Oct 07, 2005 at 09:20:59PM -0700, Luke Lonergan wrote:
> On 10/7/05 5:17 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> > On Fri, Oct 07, 2005 at 04:55:28PM -0700, Luke Lonergan wrote:
> >> On 10/5/05 5:12 PM, "Steinar H. Gunderson" <[EMAIL PROTECTED]> wrote:
> >>> What? strlen is definitely not in the kernel, and thus won't count as
> >>> system time.
> >> System time on Linux includes time spent in glibc routines.
> > Do you have a reference for this?
> > I believe this statement to be 100% false.
> How about 99%? OK, you're right, I had this confused with the profiling
> problem where glibc routines aren't included in dynamic linked profiles.

Sorry to emphasize the 100%. It wasn't meant to judge you. It was meant
to indicate that I believe 100% of system time is accounted for, while
the system call is actually active, which is not possible while glibc
is active.

I believe the way it works, is that a periodic timer interrupt
increments a specific integer every time it wakes up. If it finds
itself within the kernel, it increments the system time for the active
process, if it finds itself outside the kernel, it incremenets the
user time for the active process.

> Back to the statements earlier - the output of time had much of time for a
> dd spent in system, which means kernel, so where in the kernel would that be
> exactly?

Not really an expert here. I only play around. At a minimum, their is a
cost to switching from user context to system context and back, and then
filling in the zero bits. There may be other inefficiencies, however.
Perhaps /dev/zero always fill in a whole block (8192 usually), before
allowing the standard file system code to read only one byte.

I dunno.

But, I see this oddity too:

$ time dd if=/dev/zero of=/dev/zero bs=1 count=1000
1000+0 records in
1000+0 records out
dd if=/dev/zero of=/dev/zero bs=1 count=1000  4.05s user 11.13s system 94% 
cpu 16.061 total

$ time dd if=/dev/zero of=/dev/zero bs=10 count=100
100+0 records in
100+0 records out
dd if=/dev/zero of=/dev/zero bs=10 count=100  0.37s user 1.37s system 100% 
cpu 1.738 total

>From my numbers, it looks like 1 byte reads are hard in both the user context
and the system context. It looks almost linearly, even:

$ time dd if=/dev/zero of=/dev/zero bs=100 count=10
10+0 records in
10+0 records out
dd if=/dev/zero of=/dev/zero bs=100 count=10  0.04s user 0.15s system 95% 
cpu 0.199 total

$ time dd if=/dev/zero of=/dev/zero bs=1000 count=1
1+0 records in
1+0 records out
dd if=/dev/zero of=/dev/zero bs=1000 count=1  0.01s user 0.02s system 140% 
cpu 0.021 total

At least some of this gets into the very in-depth discussions as to
whether kernel threads, or user threads, are more efficient. Depending
on the application, user threads can switch many times faster than
kernel threads. Other parts of this may just mean that /dev/zero isn't
implemented optimally.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] count(*) using index scan in "query often, update rarely" environment

2005-10-08 Thread mark
On Fri, Oct 07, 2005 at 12:48:16PM +0200, Steinar H. Gunderson wrote:
> On Fri, Oct 07, 2005 at 11:24:05AM +0200, Cestmir Hybl wrote:
> > Isn't it possible (and reasonable) for these environments to keep track of
> > whether there is a transaction in progress with update to given table and
> > if not, use an index scan (count(*) where) or cached value (count(*)) to
> > perform this kind of query?
> Even if there is no running update, there might still be dead rows in the
> table. In any case, of course, a new update could always be occurring while
> your counting query was still running.

I don't see this being different from count(*) as it is today.

Updating a count column is certainly clever. If using a trigger,
perhaps it would allow the equivalent of:

select count(*) from table for update;

:-)

Cheers,
mark

(not that this is necessarily a good thing!)

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-08 Thread Jim C. Nasby
On Thu, Sep 29, 2005 at 03:28:27PM +0200, Zeugswetter Andreas DAZ SD wrote:
> 
> > In my original example, a sequential scan of the 1TB of 2KB 
> > or 4KB records, => 250M or 500M records of data, being sorted 
> > on a binary value key will take ~1000x more time than reading 
> > in the ~1GB Btree I described that used a Key+RID (plus node 
> > pointers) representation of the data.
> 
> Imho you seem to ignore the final step your algorithm needs of
> collecting the
> data rows. After you sorted the keys the collect step will effectively
> access the 
> tuples in random order (given a sufficiently large key range).
> 
> This random access is bad. It effectively allows a competing algorithm
> to read the
> whole data at least 40 times sequentially, or write the set 20 times
> sequentially. 
> (Those are the random/sequential ratios of modern discs)

True, but there is a compromise... not shuffling full tuples around when
sorting in memory. Do your sorting with pointers, then write the full
tuples out to 'tape' if needed.

Of course the other issue here is that as correlation improves it
becomes better and better to do full pointer-based sorting.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq