[Boston.pm] Binary search requirements + estimates (not empty)

Ricker, William Thu, 09 Jul 2009 07:02:18 -0700

Are the prefixes of uniform size? Or can a small number of expected matchable
prefixes be parsed out of the 250k strings?


Are you doing this once, or with a new set of 7500 prefixes every minute?

Are both lists already sorted? If the big one isn't, sorting it dominates. If
the big one is sorted, but not according to primitive lt(3P), all algorithms
will be slowed by the custom compare (unless it is done in Inline::C or XS, and
even then there is still some overhead).

A master file update, aka a merge of sorted files, would find all hits in time
O(N+M) and space O(1). That sounds bigger than O(ln N * M) for your numbers,
but it leverages the O(N) load of the big list, which is required anyway before
the first O(ln N) search.
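
A minimal sketch of that merge-style scan, written in Python for brevity
rather than Perl. The function name merge_prefix_hits and the sample data are
made up for illustration. It assumes both lists are already sorted with the
plain built-in string comparison, and that a "hit" means a string that starts
with one of the prefixes.

```python
def merge_prefix_hits(sorted_prefixes, sorted_strings):
    """Walk both sorted lists once and return (string, prefix) pairs.

    Assumption: both inputs are sorted by plain string comparison.
    Only the first matching prefix is reported for each string.
    """
    hits = []
    i = 0  # index of the current candidate prefix
    for s in sorted_strings:
        # Skip prefixes that sort before s and do not match it: every
        # later string is >= s, so those prefixes can never match again.
        while (i < len(sorted_prefixes)
               and sorted_prefixes[i] < s
               and not s.startswith(sorted_prefixes[i])):
            i += 1
        if i < len(sorted_prefixes) and s.startswith(sorted_prefixes[i]):
            hits.append((s, sorted_prefixes[i]))
    return hits


if __name__ == "__main__":
    prefixes = sorted(["ca", "ab"])
    strings = sorted(["cab", "aa", "abd", "b", "abc", "cb"])
    for s, p in merge_prefix_hits(prefixes, strings):
        print(s, "matches prefix", p)
```

Each list index only ever moves forward, so the scan is O(N+M) comparisons,
and apart from the list of hits it needs only the one index. If the big list
is streamed from a sorted file, it never has to be held in memory at all.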

Map-Reduce is the modern way if you have a CPU farm, but it optimizes response
time, not global warming.

Bill
Not speaking for $dayjob

Bill, typing with thumbs

_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
