Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-08 Thread mark
On Fri, Oct 07, 2005 at 09:20:59PM -0700, Luke Lonergan wrote: On 10/7/05 5:17 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Fri, Oct 07, 2005 at 04:55:28PM -0700, Luke Lonergan wrote: On 10/5/05 5:12 PM, Steinar H. Gunderson [EMAIL PROTECTED] wrote: What? strlen is definitely not in

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-07 Thread Luke Lonergan
Steinar, On 10/5/05 5:12 PM, Steinar H. Gunderson [EMAIL PROTECTED] wrote: What? strlen is definitely not in the kernel, and thus won't count as system time. System time on Linux includes time spent in glibc routines. - Luke ---(end of

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-07 Thread mark
On Fri, Oct 07, 2005 at 04:55:28PM -0700, Luke Lonergan wrote: On 10/5/05 5:12 PM, Steinar H. Gunderson [EMAIL PROTECTED] wrote: What? strlen is definitely not in the kernel, and thus won't count as system time. System time on Linux includes time spent in glibc routines. Do you have a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-07 Thread Luke Lonergan
Mark, On 10/7/05 5:17 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Fri, Oct 07, 2005 at 04:55:28PM -0700, Luke Lonergan wrote: On 10/5/05 5:12 PM, Steinar H. Gunderson [EMAIL PROTECTED] wrote: What? strlen is definitely not in the kernel, and thus won't count as system time. System

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-06 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: Indeed, one of the things on my list is to remove all the lseeks in favour of pread. Halving the number of kernel calls has got to be worth something right? Portability is an issue ofcourse... Being sure that it's not a pessimization is another

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-06 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: Are we awfully worried about people still using 2.0 kernels? And it would replace two calls with three in the worst case, we currently lseek before every read. That's utterly false. regards, tom lane

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-06 Thread Alvaro Herrera
On Thu, Oct 06, 2005 at 03:57:38PM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Indeed, one of the things on my list is to remove all the lseeks in favour of pread. Halving the number of kernel calls has got to be worth something right? Portability is an issue

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Hannu Krosing
On E, 2005-10-03 at 14:16 -0700, Josh Berkus wrote: Jeff, Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A Big-Name Proprietary Database doesn't get much more than that either. I find this claim very suspicious. I get single-threaded reads in excess of 1GB/sec

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Gregory Maxwell
On 10/3/05, Ron Peacetree [EMAIL PROTECTED] wrote: [snip] Just how bad is this CPU bound condition? How powerful a CPU is needed to attain a DB IO rate of 25MBps? If we replace said CPU with one 2x, 10x, etc faster than that, do we see any performance increase? If a modest CPU can drive a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Dann Corbit
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of PFC Sent: Thursday, September 29, 2005 9:10 AM To: [EMAIL PROTECTED] Cc: Pg Hackers; pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Dann Corbit
: Re: [HACKERS] [PERFORM] A Better External Sort? Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Dann Corbit
- [EMAIL PROTECTED] On Behalf Of Jignesh K. Shah Sent: Friday, September 30, 2005 1:38 PM To: Ron Peacetree Cc: Josh Berkus; pgsql-hackers@postgresql.org; pgsql- [EMAIL PROTECTED] Subject: Re: [HACKERS] [PERFORM] A Better External Sort? I have seen similar performance as Josh and my

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Dann Corbit
] Subject: Re: [HACKERS] [PERFORM] A Better External Sort? On 9/28/05, Ron Peacetree [EMAIL PROTECTED] wrote: 2= We use my method to sort two different tables. We now have these very efficient representations of a specific ordering on these tables. A join operation can now be done using

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Dann Corbit
I see the following routines that seem to be related to sorting. If I were to examine these routines to consider ways to improve it, what routines should I key in on? I am guessing that tuplesort.c is the hub of activity for database sorting. Directory of

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Martijn van Oosterhout
On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote: Assuming we get the abyssmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much) I'm still not convinced this is the major problem. For example, in my totally unscientific tests on an oldish

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Gregory Maxwell
On 9/28/05, Ron Peacetree [EMAIL PROTECTED] wrote: 2= We use my method to sort two different tables. We now have these very efficient representations of a specific ordering on these tables. A join operation can now be done using these Btrees rather than the original data tables that involves

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Andrew Dunstan
Ron Peacetree wrote: The good news is all this means it's easy to demonstrate that we can improve the performance of our sorting functionality. Assuming we get the abyssmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much) I for one would be

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Gregory Maxwell
On 9/30/05, Ron Peacetree [EMAIL PROTECTED] wrote: 4= I'm sure we are paying all sorts of nasty overhead for essentially emulating the pg filesystem inside another filesystem. That means ~2x as much overhead to access a particular piece of data. The simplest solution is for us to implement a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread David Fetter
On Thu, Sep 29, 2005 at 10:06:52AM -0700, Luke Lonergan wrote: Josh, On 9/29/05 9:54 AM, Josh Berkus josh@agliodbs.com wrote: Following an index creation, we see that 95% of the time required is the external sort, which averages 2mb/s. This is with seperate drives for the WAL, the

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Zeugswetter Andreas DAZ SD
In my original example, a sequential scan of the 1TB of 2KB or 4KB records, = 250M or 500M records of data, being sorted on a binary value key will take ~1000x more time than reading in the ~1GB Btree I described that used a Key+RID (plus node pointers) representation of the data. Imho

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone
On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote: COPY TO /dev/null WITH binary 13MB/s55% user 45% system (ergo, CPU bound) [snip] the most expensive. But it does point out that the whole process is probably CPU bound more than anything else. Note that 45% of that

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone
On Tue, Oct 04, 2005 at 12:43:10AM +0300, Hannu Krosing wrote: Just FYI, I run a count(*) on a 15.6GB table on a lightly loaded db and it run in 163 sec. (Dual opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in RAID10, reiserfs). A little less than 100MB sec. And none of that 15G table is in the

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Luke Lonergan
@postgresql.org; pgsql-performance@postgresql.org Subject:Re: [HACKERS] [PERFORM] A Better External Sort? On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote: COPY TO /dev/null WITH binary 13MB/s55% user 45% system (ergo, CPU bound) [snip] the most expensive

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Michael Stone
On Wed, Oct 05, 2005 at 11:24:07AM -0400, Luke Lonergan wrote: Nope - it would be disk wait. I said I/O overhead; i.e., it could be the overhead of calling the kernel for I/O's. E.g., the following process is having I/O problems: time dd if=/dev/sdc of=/dev/null bs=1 count=1000

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Nope - it would be disk wait. COPY is CPU bound on I/O subsystems faster that 50 MB/s on COPY (in) and about 15 MB/s (out). - Luke -Original Message- From: Michael Stone [mailto:[EMAIL PROTECTED] Sent: Wed Oct 05 09:58:41 2005

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Joshua D. Drake
We have to fix this. Ron The source is freely available for your perusal. Please feel free to point us in specific directions in the code where you may see some benefit. I am positive all of us that can, would put resources into fixing the issue had we a specific direction to attack.

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
is the code, but the code in isolation is often the Slow Path to understanding with systems as complex as a DBMS IO layer. Ron -Original Message- From: Joshua D. Drake [EMAIL PROTECTED] Sent: Oct 5, 2005 1:18 PM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? The source

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Jeffrey W. Baker
On Wed, 2005-10-05 at 12:14 -0400, Ron Peacetree wrote: I've now gotten verification from multiple working DBA's that DB2, Oracle, and SQL Server can achieve ~250MBps ASTR (with as much as ~500MBps ASTR in setups akin to Oracle RAC) when attached to a decent (not outrageous, but decent) HD

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Ron Peacetree
I'm putting in as much time as I can afford thinking about pg related performance issues. I'm doing it because of a sincere desire to help understand and solve them, not to annoy people. If I didn't believe in pg, I would't be posting thoughts about how to make it better. It's probably worth

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Steinar H. Gunderson
On Wed, Oct 05, 2005 at 04:55:51PM -0700, Luke Lonergan wrote: In COPY, we found lots of libc functions like strlen() being called ridiculous numbers of times, in one case it was called on every timestamp/date attribute to get the length of TZ, which is constant. That one function call was in

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Tom Lane
Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over the tape. It could be done in a single

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Simon Riggs
On Fri, 2005-09-30 at 13:41 -0700, Josh Berkus wrote: Yeah, that's what I thought too. But try sorting an 10GB table, and you'll see: disk I/O is practically idle, while CPU averages 90%+. We're CPU-bound, because sort is being really inefficient about something. I just don't know what

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Simon Riggs
On Sat, 2005-10-01 at 02:01 -0400, Tom Lane wrote: Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Ron Peacetree
do, _nothing_ is going to help us as much) Ron -Original Message- From: Tom Lane [EMAIL PROTECTED] Sent: Oct 1, 2005 2:01 AM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes: The biggest single area where I see PostgreSQL external sort sucking is on index creation on large tables. For example, for free version of TPCH, it takes only 1.5 hours to load a 60GB Lineitem table on OSDL's hardware, but over 3 hours to create

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Ron Peacetree
@svana.org Sent: Oct 1, 2005 12:19 PM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote: Assuming we get the abyssmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much) I'm still

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Josh Berkus
Ron, Hmmm. 60GB/5400secs= 11MBps. That's ssllooww. So the first problem is evidently our physical layout and/or HD IO layer sucks. Actually, it's much worse than that, because the sort is only dealing with one column. As I said, monitoring the iostat our top speed was 2.2mb/s. --Josh

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Ron Peacetree
-hackers@postgresql.org, pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Ron, Hmmm. 60GB/5400secs= 11MBps. That's ssllooww. So the first problem is evidently our physical layout and/or HD IO layer sucks. Actually, it's much worse than that, because

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Josh Berkus
Ron, That 11MBps was your =bulk load= speed. If just loading a table is this slow, then there are issues with basic physical IO, not just IO during sort operations. Oh, yeah. Well, that's separate from sort. See multiple posts on this list from the GreenPlum team, the COPY patch for 8.1,

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Jignesh K. Shah
josh@agliodbs.com Sent: Sep 30, 2005 1:23 PM To: Ron Peacetree [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org, pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Ron, Hmmm. 60GB/5400secs= 11MBps. That's ssllooww. So the first problem is evidently

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Luke Lonergan
Ron, On 9/30/05 1:20 PM, Ron Peacetree [EMAIL PROTECTED] wrote: That 11MBps was your =bulk load= speed. If just loading a table is this slow, then there are issues with basic physical IO, not just IO during sort operations. Bulk loading speed is irrelevant here - that is dominated by

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread PFC
Bulk loading speed is irrelevant here - that is dominated by parsing, which we have covered copiously (har har) previously and have sped up by 500%, which still makes Postgres 1/2 the loading speed of MySQL. Let's ask MySQL 4.0 LOAD DATA INFILE blah 0 errors, 666 warnings SHOW

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Ron Peacetree
josh@agliodbs.com Sent: Sep 30, 2005 4:41 PM To: Ron Peacetree [EMAIL PROTECTED] Cc: pgsql-hackers@postgresql.org, pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Ron, That 11MBps was your =bulk load= speed. If just loading a table is this slow

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Pailloncy Jean-Gerard
Your main example seems to focus on a large table where a key column has constrained values. This case is interesting in proportion to the number of possible values. If I have billions of rows, each having one of only two values, I can think of a trivial and very fast method of returning

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread PFC
Just to add a little anarchy in your nice debate... Who really needs all the results of a sort on your terabyte table ? I guess not many people do a SELECT from such a table and want all the results. So, this leaves : - Really wanting all the results, to fetch using

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Josh Berkus
Jeff, Ron, First off, Jeff, please take it easy. We're discussing 8.2 features at this point and there's no reason to get stressed out at Ron. You can get plenty stressed out when 8.2 is near feature freeze. ;-) Regarding use cases for better sorts: The biggest single area where I see

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Luke Lonergan
Josh, On 9/29/05 9:54 AM, Josh Berkus josh@agliodbs.com wrote: Following an index creation, we see that 95% of the time required is the external sort, which averages 2mb/s. This is with seperate drives for the WAL, the pg_tmp, the table and the index. I've confirmed that increasing

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Jeffrey W. Baker
On Thu, 2005-09-29 at 10:06 -0700, Luke Lonergan wrote: Josh, On 9/29/05 9:54 AM, Josh Berkus josh@agliodbs.com wrote: Following an index creation, we see that 95% of the time required is the external sort, which averages 2mb/s. This is with seperate drives for the WAL, the pg_tmp,

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Josh Berkus
Jeff, Josh, do you happen to know how many passes are needed in the multiphase merge on your 60GB table? No, any idea how to test that? I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Ron Peacetree
From: Pailloncy Jean-Gerard [EMAIL PROTECTED] Sent: Sep 29, 2005 7:11 AM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Jeff Baker: Your main example seems to focus on a large table where a key column has constrained values. This case is interesting in proportion to the number

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Ron Peacetree
From: Zeugswetter Andreas DAZ SD [EMAIL PROTECTED] Sent: Sep 29, 2005 9:28 AM Subject: RE: [HACKERS] [PERFORM] A Better External Sort? In my original example, a sequential scan of the 1TB of 2KB or 4KB records, = 250M or 500M records of data, being sorted on a binary value key will take ~1000x

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Luke Lonergan
Jeff, On 9/29/05 10:44 AM, Jeffrey W. Baker [EMAIL PROTECTED] wrote: On Thu, 2005-09-29 at 10:06 -0700, Luke Lonergan wrote: Looking through tuplesort.c, I have a couple of initial ideas. Are we allowed to fork here? That would open up the possibility of using the CPU and the I/O in

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-29 Thread Ron Peacetree
From: Josh Berkus josh@agliodbs.com Sent: Sep 29, 2005 12:54 PM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? The biggest single area where I see PostgreSQL external sort sucking is on index creation on large tables. For example, for free version of TPCH, it takes only 1.5 hours

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-28 Thread Ron Peacetree
From: Jeffrey W. Baker [EMAIL PROTECTED] Sent: Sep 27, 2005 1:26 PM To: Ron Peacetree [EMAIL PROTECTED] Subject: Re: [HACKERS] [PERFORM] A Better External Sort? On Tue, 2005-09-27 at 13:15 -0400, Ron Peacetree wrote: That Btree can be used to generate a physical reordering of the data in one

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-28 Thread Ron Peacetree
In the interest of efficiency and not reinventing the wheel, does anyone know where I can find C or C++ source code for a Btree variant with the following properties: A= Data elements (RIDs) are only stored in the leaves, Keys (actually KeyPrefixes; see D below) and Node pointers are only stored

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-27 Thread Josh Berkus
Ron, I've somehow missed part of this thread, which is a shame since this is an area of primary concern for me. Your suggested algorithm seems to be designed to relieve I/O load by making more use of the CPU. (if I followed it correctly). However, that's not PostgreSQL's problem;

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-27 Thread Ron Peacetree
From: Josh Berkus josh@agliodbs.com ent: Sep 27, 2005 12:15 PM To: Ron Peacetree [EMAIL PROTECTED] Subject: Re: [HACKERS] [PERFORM] A Better External Sort? I've somehow missed part of this thread, which is a shame since this is an area of primary concern for me. Your suggested algorithm seems

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-27 Thread Jeffrey W. Baker
On Tue, 2005-09-27 at 13:15 -0400, Ron Peacetree wrote: That Btree can be used to generate a physical reordering of the data in one pass, but that's the weakest use for it. The more powerful uses involve allowing the Btree to persist and using it for more efficient re-searches or combining

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-26 Thread Tom Lane
Ron Peacetree [EMAIL PROTECTED] writes: Let's start by assuming that an element is = in size to a cache line and a node fits into L1 DCache. [ much else snipped ] So far, you've blithely assumed that you know the size of a cache line, the sizes of L1 and L2 cache, and that you are working

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-26 Thread Ron Peacetree
From: Dann Corbit [EMAIL PROTECTED] Sent: Sep 26, 2005 5:13 PM To: Ron Peacetree [EMAIL PROTECTED], pgsql-hackers@postgresql.org, pgsql-performance@postgresql.org Subject: RE: [HACKERS] [PERFORM] A Better External Sort? I think that the btrees are going to be O(n*log(n)) in construction

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-26 Thread Ron Peacetree
SECOND ATTEMPT AT POST. Web mailer appears to have eaten first one. I apologize in advance if anyone gets two versions of this post. =r From: Tom Lane [EMAIL PROTECTED] Sent: Sep 26, 2005 9:42 PM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? So far, you've blithely assumed that you