krisp wrote: > I am looking for an efficient algorithm to search for k-median across > disjointed vectors. > > For example, I have 10 different size arrays of floats (size varies > from 20k to 100k elements) from which I would like to get 1000th > element of the combined set (total set is 500k elements). > > Currently I am extracting 1000 elements from each array and doing > merge-sort. Since this operation is performed atleast 100k times (i.e. > a set of 100k * 500k elements), the constant factor of shipping > elements across a different machine is very high. I am wondering if > there is a better approach to limit the amount of data passed around > while keeping the efficiency.
This looks like an interesting problem, but you don't provide enough information for a useful answer. You imply that the independent vectors are sorted (otherwise how do you peel off the top k elements). True? You imply vectors are separated by relatively slo communications. True? If so, how slow? I.e. what is the throughput for a large message? What is the turn-around time for a tiny message? The reason for these questions is that if it's cheaper to send O(log N) separate short messages than to send M messages with k floats (M is the number of vectors) as you are currently doing, then a speedup is possible. Otherwise it probably isn't. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/algogeeks -~----------~----~----~----~------~----~------~--~---
