> -----Original Message----- > From: Stephen Frost [mailto:[EMAIL PROTECTED] > Sent: Thursday, March 09, 2006 3:49 PM > To: Tom Lane > Cc: Luke Lonergan; Jim C. Nasby; Greg Stark; Dann Corbit; Simon Riggs; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Merge algorithms for large numbers of "tapes" > > * Tom Lane ([EMAIL PROTECTED]) wrote: > > "Luke Lonergan" <[EMAIL PROTECTED]> writes: > > > I would only suggest that we replace the existing algorithm with one > that > > > will work regardless of (reasonable) memory requirements. Perhaps we > can > > > agree that at least 1MB of RAM for external sorting will always be > available > > > and proceed from there? > > > > If you can sort indefinitely large amounts of data with 1MB work_mem, > > go for it. > > It seems you two are talking past each other and I'm at least slightly > confused. So, I'd like to ask for a bit of clarification and perhaps > that will help everyone. > > #1: I'm as much a fan of eliminating unnecessary code as anyone > #2: There have been claims of two-pass improving things 400% > #3: Supposedly two-pass requires on the order of sqrt(total) memory
Two pass does not require sqrt(total) memory. This figure is clearly wrong. Two pass will create the count of subfiles proportional to: Subfile_count = original_stream_size/sort_memory_buffer_size The merge pass requires (sizeof record * subfile_count) memory. Example: You have a 7 gigabyte table to sort and you have 100 MB sort buffer. The number of subfiles will be: 7000000000 / 100000000 = 70 files Suppose that a record is 2K wide. The merge pass requires 70*2k = 143,360 bytes of RAM. Suppose that a record is 65535 bytes wide. The merge pass requires 70*65535 = 4,587,450 bytes of RAM. > #4: We have planner statistics to estimate size of total > #5: We have a work_mem limitation for a reason > > So, if we get a huge performance increase, what's wrong with: > if [ sqrt(est(total)) <= work_mem ]; then > two-pass-sort(); > else > tape-sort(); > fi > > ? > > If the performance isn't much different and tape-sort can do it with > less memory then I don't really see any point in removing it. > > If the intent is to remove it and then ask for the default work_mem to > be increased- I doubt going about it this way would work very well. :) > > Thanks, > > Stephen ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly