Re: [HACKERS] Sorting Improvements for 8.4

Michał Zaborowski Wed, 19 Dec 2007 02:35:23 -0800

2007/12/19, Mark Mielke <[EMAIL PROTECTED]>:
> Jeff Davis wrote:
> > My first thought would be that we would need a new executor node (e.g.
> > "ParallelSort") that would only be chosen when the cost of the sort is
> > large enough to outweigh other factors (such as process creation time,
> > dividing available work_mem, and any necessary IPC).
> >
> > It seems to me the simplest way to do it would be to allow each sub
> > process to allocate work_mem/P where P is the degree of parallelization.
> > However, that somewhat works against our schemes for dynamic run
> > handling and forecasting, and may lead to more random reading from disk.
> > Any other scheme I can think of would involve more IPC, which might make
> > the idea just too complex.
> >
> I am curious - what algorithms exist to efficiently do a parallel sort?
> Do you mean if sorting 1 million items, it is possible to separate this
> into  2 sets of 500 thousand each, execute them in separate threads
> (with task administration and synchronization overhead) , combine the
> results, and complete the task in significantly less time than doing it
> in one thread? I am skeptical that this is possible, and suspect that
> the overall efficiency of the system would go down even if the
> throughput of a single execution increases.
>
Ok - we want to sort table with quick sort and we want to do it on - N threads.
Every thread - gets begin and end of indices of the table. First step starts
at 0 and lasts with count -1. Single step:  find medium value and move
lover before it and bigger after. In normal case - we use recursive call - so
the same procedure is being called for that two parts. In thread we can put
indices at side list - and use queue of threads to pick up data from the list.
We can use common table, access to side list with indices has to be serialized.


> Or do you mean being able to perform parts of the query plan fully in
> parallel? If this, then one would need a lot more than ParallelSort...
>
Nice to have, but rather for data warehouses. In other cases... IE - backend
for Internet - there are many requests and every processor / core works nice.

-- 
Regards,
  Michał Zaborowski (TeXXaS)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [HACKERS] Sorting Improvements for 8.4

Reply via email to