Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Tom Lane
Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over the tape. It could be done in a single

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Simon Riggs
On Fri, 2005-09-30 at 13:41 -0700, Josh Berkus wrote: Yeah, that's what I thought too. But try sorting an 10GB table, and you'll see: disk I/O is practically idle, while CPU averages 90%+. We're CPU-bound, because sort is being really inefficient about something. I just don't know what

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Simon Riggs
On Sat, 2005-10-01 at 02:01 -0400, Tom Lane wrote: Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Ron Peacetree
*blink* Tapes?! I thought that was a typo... If our sort is code based on sorting tapes, we've made a mistake. HDs are not tapes, and Polyphase Merge Sort and it's brethren are not the best choices for HD based sorts. Useful references to this point: Knuth, Vol 3 section 5.4.9, (starts p356 of

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes: The biggest single area where I see PostgreSQL external sort sucking is on index creation on large tables. For example, for free version of TPCH, it takes only 1.5 hours to load a 60GB Lineitem table on OSDL's hardware, but over 3 hours to create

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Jeffrey W. Baker [EMAIL PROTECTED] writes: I think the largest speedup will be to dump the multiphase merge and merge all tapes in one pass, no matter how large M. Currently M is capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Ron Peacetree
You have not said anything about what HW, OS version, and pg version used here, but even at that can't you see that something Smells Wrong? The most common CPUs currently shipping have clock rates of ~2-3GHz and have 8B-16B internal pathways. SPARCs and other like CPUs are clocked slower but

Re: [PERFORM] [HACKERS] Query in SQL statement

2005-10-01 Thread Roger Hand
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jim C. Nasby Sent: Friday, September 30, 2005 4:49 PM Subject: Re: [PERFORM] [HACKERS] Query in SQL statement I suggest ditching the CamelCase and going with underline_seperators. I'd also not use the

Re: [PERFORM] [HACKERS] Query in SQL statement

2005-10-01 Thread Tom Lane
Roger Hand [EMAIL PROTECTED] writes: I suggest ditching the CamelCase and going with underline_seperators. I'd also not use the bareword id, instead using bad_user_id. And I'd name the table bad_user. But that's just me. :) I converted a db from MS SQL, where tables and fields were CamelCase,