Thank you, Ted.
On Fri, Apr 20, 2012 at 2:30 PM, Ted Dunning <[email protected]> wrote:
> Look at our OnlineSummarizer. This should be roughly parallelizable.
>
> On Fri, Apr 20, 2012 at 2:12 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> Thank you, sir. Let me consider this.
>>
>> On Fri, Apr 20, 2012 at 11:50 AM, Hector Yee <[email protected]> wrote:
>> > how about this
>> >
>> > http://en.wikipedia.org/wiki/Reservoir_sampling
>> >
>> > On Fri, Apr 20, 2012 at 10:44 AM, Dmitriy Lyubimov <[email protected]> wrote:
>> >
>> >> Hello,
>> >>
>> >> There should be some way to compile quartiles in a map/reduce fashion
>> >> (i.e. with api similar to Pig's Arithmetic custom function) without
>> >> keeping enormous count hash?
>> >> There's this countsketch thing that i implemented before on map
>> >> reduce, but it is sort of like bloom filter: if it gives a wrong
>> >> result, the error is fairly huge (in case of bloom filter, 100%) and
>> >> to get good results it still requires quite a bit of memory
>> >>
>> >
>> > --
>> > Yee Yang Li Hector <https://plus.google.com/106746796711269457249>
>> > Professional Profile <http://www.linkedin.com/in/yeehector>
>> > http://hectorgon.blogspot.com/ (tech + travel)
>> > http://hectorgon.com (book reviews)
>>
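
For reference, here is a rough sketch of how the reservoir sampling suggestion from the thread could be used for approximate quartiles in a map/reduce style. It is only an illustration under stated assumptions, not Mahout's OnlineSummarizer and not anything from Pig: the class name `Reservoir` and its methods `add`, `merge`, and `quartiles` are hypothetical. Each mapper would keep a fixed-size uniform sample, and a combiner or reducer would merge samples weighted by how many items each one has seen, so memory stays bounded by `k`.

```python
import random


class Reservoir:
    """Fixed-size uniform sample of a stream (hypothetical helper, not a library API)."""

    def __init__(self, k, rng=None):
        self.k = k                      # maximum number of items kept
        self.n = 0                      # number of items seen so far
        self.sample = []
        self.rng = rng or random.Random()

    def add(self, x):
        """Classic reservoir step: keep the new item with probability k / n."""
        self.n += 1
        if len(self.sample) < self.k:
            self.sample.append(x)
        else:
            j = self.rng.randrange(self.n)
            if j < self.k:
                self.sample[j] = x

    def merge(self, other):
        """Combine two reservoirs of the same capacity into one uniform sample.

        Each draw comes from one side with probability proportional to the
        number of stream items that side still represents.
        """
        assert self.k == other.k, "sketch assumes equal capacities"
        merged = Reservoir(self.k, self.rng)
        a, b = list(self.sample), list(other.sample)
        na, nb = self.n, other.n
        merged.n = na + nb
        while len(merged.sample) < self.k and (a or b):
            if self.rng.random() * (na + nb) < na:
                merged.sample.append(a.pop(self.rng.randrange(len(a))))
                na -= 1
            else:
                merged.sample.append(b.pop(self.rng.randrange(len(b))))
                nb -= 1
        return merged

    def quartiles(self):
        """Approximate 25th, 50th and 75th percentiles from the sample."""
        s = sorted(self.sample)
        return [s[int(q * (len(s) - 1))] for q in (0.25, 0.5, 0.75)]


# "Map" side: two partitions sampled independently.
rng = random.Random(42)
left, right = Reservoir(1000, rng), Reservoir(1000, rng)
for v in range(0, 10000):
    left.add(v)
for v in range(10000, 20000):
    right.add(v)

# "Reduce" side: merge the partial samples and read off the estimates.
combined = left.merge(right)
q1, q2, q3 = combined.quartiles()
```

The estimates are only as good as the sample size: the error in quantile rank shrinks roughly with the square root of `k`, so this trades accuracy for memory rather than giving exact quartiles.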
