How about this?

http://en.wikipedia.org/wiki/Reservoir_sampling
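
For what it's worth, here is a rough sketch of the idea in Python. It keeps a fixed-size uniform sample of the stream (the basic "Algorithm R" described on that page) and then reads approximate quartiles off the sample. The function names, the sample size k, and the rng parameter are my own choices for illustration, not anything from Pig or Hadoop. It covers a single stream only; combining reservoirs from several mappers in a reducer would need a weighted merge that is not shown here.

```python
import random
import statistics


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of at most k items from a stream.

    Memory use is O(k) regardless of how long the stream is.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def approx_quartiles(stream, k=1000, rng=None):
    """Estimate the three quartile cut points from a reservoir sample."""
    sample = reservoir_sample(stream, k, rng)
    return statistics.quantiles(sample, n=4)
```

The estimate is only as good as the sample: a larger k gives tighter quartiles at the cost of more memory, so k is the knob to tune.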

On Fri, Apr 20, 2012 at 10:44 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Hello,
>
> There should be some way to compile quartiles in a map/reduce fashion
> (i.e. with an API similar to Pig's Arithmetic custom function) without
> keeping an enormous count hash?
> There's this countsketch thing that I implemented before on map
> reduce, but it is sort of like a Bloom filter: if it gives a wrong
> result, the error is fairly huge (in the case of a Bloom filter, 100%), and
> to get good results it still requires quite a bit of memory.
>



-- 
Yee Yang Li Hector <https://plus.google.com/106746796711269457249>
Professional Profile <http://www.linkedin.com/in/yeehector>
http://hectorgon.blogspot.com/ (tech + travel)
http://hectorgon.com (book reviews)
