krisp wrote:
> I am looking for an efficient algorithm to find the k-th smallest
> element (the "k-median") across disjoint vectors.
>
> For example, I have 10 arrays of floats of different sizes (varying
> from 20k to 100k elements), from which I would like to get the 1000th
> smallest element of the combined set (500k elements in total).
>
> Currently I am extracting the smallest 1000 elements from each array
> and merge-sorting them. Since this operation is performed at least
> 100k times (i.e. over a set of 100k * 500k elements), the constant
> factor of shipping elements between machines is very high. I am
> wondering if there is a better approach that limits the amount of
> data passed around while keeping the efficiency.

This looks like an interesting problem, but you don't provide enough
information for a useful answer.  You imply that the individual
vectors are sorted (otherwise, how would you peel off the smallest k
elements from each?).  True?  You also imply the vectors are separated
by relatively slow communication links.  True?  If so, how slow?
I.e. what is the throughput for a large message, and what is the
turn-around time for a tiny message?  The reason for these questions
is that if it's cheaper to send O(log N) separate short messages than
to send M messages of k floats each (where M is the number of vectors),
as you are currently doing, then a speedup is possible.  Otherwise it
probably isn't.


You received this message because you are subscribed to the Google Groups
"Algorithm Geeks" group.
To post to this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/algogeeks
