Re: [HACKERS] Sort performance

2006-09-02 Thread Heikki Linnakangas

Gregory Stark kirjoitti:
[aside, that said that may be a useful feature to have: a user option 
to use
our internal heap sort instead of qsort. I'm thinking of users on 
platforms
where libc's qsort either performs poorly or is buggy. Since we have 
all the

code for heap sort there already anyways...]


Actually, we already have our own qsort implementation in 
src/port/qsort.c for those cases.


--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Sort performance

2006-09-01 Thread Gregory Stark

I'm not sure if this is good news or bad news. Either some kudos are due to
the gang that worked on the external sort performance or something's very
wrong with the qsort implementation in glibc because I'm seeing Postgres's
external sort perform better than qsort.

This is despite Postgres external sorts having to execute filesystem calls
pushing buffers back and forth between user-space and kernel-space, which
seems hard to believe. I feel like something's got to be pretty far wrong with
the qsort call here for this to be possible.

At first I chalked this up to qsort having O(n^2) behaviour occasionally but
a) This is glibc where qsort is actually mergesort which should behave pretty
similarly to Postgres's mergesort and b) the input data is randomized pretty
well so it really ought be a problem even were it qsort.

Mem RunsTime

1MB 18  8.25s
10MB3   5.6s
100MB   qsort   6.1s

The input is a table with one column, a text field. It contains
/usr/share/dict/words ordered by random() and then repeated a bunch of times.
(Sorry about the imprecision, I set this table up a while ago and don't
remember exactly what I did). a

The machine has plenty of RAM and isn't swapping or running any other
services.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Sort performance

2006-09-01 Thread Gregory Stark


 I'm not sure if this is good news or bad news. Either some kudos are due to
 the gang that worked on the external sort performance or something's very
 wrong with the qsort implementation in glibc because I'm seeing Postgres's
 external sort perform better than qsort.

And here's a really perverse case. The external sort runs in 740 milliseconds
but qsort takes over 2 seconds:

postgres=# select count(*) from (select * from  (select * from w5 limit 20) 
 as x order by w ) as x;
 count  

 20
(1 row)

Time: 740.324 ms
postgres=# set work_mem = '12MB';
SET
Time: 0.145 ms
postgres=# select count(*) from (select * from  (select * from w5 limit 20) 
 as x order by w ) as x;
 count  

 20
(1 row)

Time: 2051.317 ms


LOG:  statement: set work_mem = '11MB';

LOG:  statement: select count(*) from (select * from  (select * from w5 limit 
20)  as x order by w ) as x;
LOG:  begin tuple sort: nkeys = 1, workMem = 11264, randomAccess = f
LOG:  switching to external sort with 41 tapes: CPU 0.01s/0.04u sec elapsed 
0.05 sec
LOG:  performsort starting: CPU 0.01s/0.34u sec elapsed 0.35 sec
LOG:  finished writing run 1 to tape 0: CPU 0.01s/0.52u sec elapsed 0.54 sec
LOG:  finished writing final run 2 to tape 1: CPU 0.01s/0.60u sec elapsed 0.62 
sec
LOG:  performsort done (except 2-way final merge): CPU 0.01s/0.63u sec elapsed 
0.65 sec
LOG:  external sort ended, 593 disk blocks used: CPU 0.02s/0.71u sec elapsed 
0.73 sec

LOG:  statement: set work_mem = '12MB';

LOG:  statement: select count(*) from (select * from  (select * from w5 limit 
20)  as x order by w ) as x;
LOG:  begin tuple sort: nkeys = 1, workMem = 12288, randomAccess = f
LOG:  performsort starting: CPU 0.00s/0.06u sec elapsed 0.06 sec
LOG:  doing qsort of 20 tuples
LOG:  performsort done: CPU 0.00s/1.99u sec elapsed 2.00 sec
LOG:  internal sort ended, 11919 KB used: CPU 0.00s/2.03u sec elapsed 2.04 sec


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Sort performance

2006-09-01 Thread Luke Lonergan
What version of pgsql?

Recent changes stripped the sort set down considerably in size in external 
sort, I'm not sure the same is done if the data doesn't spill to disk.

- Luke

Sent by GoodLink (www.good.com)


 -Original Message-
From:   Gregory Stark [mailto:[EMAIL PROTECTED]
Sent:   Friday, September 01, 2006 11:03 AM Eastern Standard Time
To: pgsql-hackers
Subject:[HACKERS] Sort performance


I'm not sure if this is good news or bad news. Either some kudos are due to
the gang that worked on the external sort performance or something's very
wrong with the qsort implementation in glibc because I'm seeing Postgres's
external sort perform better than qsort.

This is despite Postgres external sorts having to execute filesystem calls
pushing buffers back and forth between user-space and kernel-space, which
seems hard to believe. I feel like something's got to be pretty far wrong with
the qsort call here for this to be possible.

At first I chalked this up to qsort having O(n^2) behaviour occasionally but
a) This is glibc where qsort is actually mergesort which should behave pretty
similarly to Postgres's mergesort and b) the input data is randomized pretty
well so it really ought be a problem even were it qsort.

Mem RunsTime

1MB 18  8.25s
10MB3   5.6s
100MB   qsort   6.1s

The input is a table with one column, a text field. It contains
/usr/share/dict/words ordered by random() and then repeated a bunch of times.
(Sorry about the imprecision, I set this table up a while ago and don't
remember exactly what I did). a

The machine has plenty of RAM and isn't swapping or running any other
services.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Sort performance

2006-09-01 Thread Gregory Stark
Luke Lonergan [EMAIL PROTECTED] writes:

 What version of pgsql?

 Recent changes stripped the sort set down considerably in size in external
 sort, I'm not sure the same is done if the data doesn't spill to disk.

This is a recent CVS checkout.

If you're referring to MinimalTuples I think that's done before tuplesort ever
sees the tuples. Besides when swapping things around in memory only the first
datum and a pointer to the rest of the object actually gets moved around. I
think.

Now that I've investigated further I'm even more confused though. The cases
where I'm seeing external sorts outperform internal sorts are when it just
barely exceeds work_mem which means it's only doing one merge pass between
initial tapes generated using inittapes. That means most of the work is
actually being done using in-memory sorts. Guess what algorithm we use to
generate initial tapes: heap sort!

  * See Knuth, volume 3, for more than you want to know about the external
  * sorting algorithm.  We divide the input into sorted runs using replacement
  * selection, in the form of a priority tree implemented as a heap
  * (essentially his Algorithm 5.2.3H), 

So basically our heap sort implementation is 3x as fast a glibc's qsort
implementation?! Is that believable?

Certainly I don't get results like that if I just change the code to do a heap
sort instead of qsort. I see it being substantially slower.

[aside, that said that may be a useful feature to have: a user option to use
our internal heap sort instead of qsort. I'm thinking of users on platforms
where libc's qsort either performs poorly or is buggy. Since we have all the
code for heap sort there already anyways...]

I feel like I'm missing some extra work tuplesort is doing (possibly
needlessly) in addition to the qsort. Now I'm getting paranoid that perhaps
this is just a bug in my hacked up copy of this code. I can't see how that
could be but I'll try reproducing it with stock CVS Postgres.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Sort performance

2006-09-01 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Mem   RunsTime
   
 1MB   18  8.25s
 10MB  3   5.6s
 100MB qsort   6.1s

I'm confused what this means exactly?  Are you saying that in the first
two cases, there were 18 and 3 sorted runs generated in the initial
pass, and in the third case we just did the sort in memory using qsort?

How many items are being sorted, exactly?  Since it's text, it probably
also makes a big difference what LC_COLLATE setting you are using.
Non-C sort locale could mean that the strcoll() calls swamp all else.

How long does it take sort(1) to do the same task?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org