Hi All,

Please let me know where we are adding all these data structures. If they go under a new sub-package of Malhar/library, please suggest the package name.
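To make the packaging question concrete, here is a rough sketch of the kind of standalone component discussed in the thread below: a plain serializable class with no operator dependencies, so any operator could hold one as a field. The class name `SimpleBloomFilter`, its constructor parameters, and the FNV-1a double-hashing scheme are assumptions made for illustration only. This is not the existing BloomFilter implementation mentioned in the thread and not a Malhar API.

```java
import java.io.Serializable;
import java.util.BitSet;

// Hypothetical illustration only; not part of Malhar.
class SimpleBloomFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int expectedInsertions, double falsePositiveRate) {
        if (expectedInsertions <= 0 || falsePositiveRate <= 0.0 || falsePositiveRate >= 1.0) {
            throw new IllegalArgumentException("need expectedInsertions > 0 and 0 < rate < 1");
        }
        double ln2 = Math.log(2);
        // Standard sizing: m = -n * ln(p) / (ln 2)^2 bits
        this.numBits = Math.max(1,
            (int) Math.ceil(-expectedInsertions * Math.log(falsePositiveRate) / (ln2 * ln2)));
        // Standard sizing: k = (m / n) * ln 2 hash functions
        this.numHashes = Math.max(1,
            (int) Math.round((double) numBits / expectedInsertions * ln2));
        this.bits = new BitSet(numBits);
    }

    // Records the key by setting k bit positions.
    void put(byte[] key) {
        int h1 = fnv1a(key, 0x811C9DC5);
        int h2 = fnv1a(key, 0x01000193) | 1; // forced odd so the step is never zero
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    // Returns false if the key was definitely never added;
    // true means "possibly added" (false positives are allowed).
    boolean mightContain(byte[] key) {
        int h1 = fnv1a(key, 0x811C9DC5);
        int h2 = fnv1a(key, 0x01000193) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th position is (h1 + i * h2) mod m.
    private int index(int h1, int h2, int i) {
        return (int) Math.floorMod((long) h1 + (long) i * h2, (long) numBits);
    }

    // 32-bit FNV-1a over the key bytes, starting from the given seed.
    private static int fnv1a(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

An operator would keep an instance as a field, call `put` for each key it has seen, and call `mightContain` before doing an expensive lookup. Whether something like this belongs in a new sub-package or an existing one is exactly the question above.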
On Fri, Dec 11, 2015 at 7:21 AM, Chandni Singh <[email protected]> wrote:

> Hi Sandeep,
>
> Thanks for picking up Hyperloglog.
> We need implementations of probabilistic data structures which can be
> easily integrated with any operator.
>
> Having these data structures will make development of operators and their
> optimization easier.
>
> The ticket for Hyperloglog actually talks about creating an operator, but
> what I had in mind was a component that can be easily integrated with any
> operator.
>
> Thanks,
> Chandni
>
> On Thu, Dec 10, 2015 at 5:32 PM, Sandesh Hegde <[email protected]> wrote:
>
>> As many of these are common algorithms, it will be really good if the
>> external libraries are also evaluated as part of this effort. If the
>> external libraries are great then there is no point in implementing them.
>>
>> On Thu, Dec 10, 2015 at 5:24 PM Narayanaswami, Sandeep <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I'd like to work on implementing Hyperloglog (MLHR-1822) if no one else
>>> is working on it already (I have asked to be assigned the JIRA).
>>>
>>> Beyond hyperloglog, I'm interested in developing other sampling
>>> algorithms (e.g., computing quantiles), Count-Min Sketch, and sketching
>>> for dimensionality reduction. I'd love to hear the community's thoughts
>>> on how useful/interesting this effort would be.
>>>
>>> Cheers,
>>> Sandeep
>>>
>>> On 12/10/15, 4:19 PM, "Chandni Singh" <[email protected]> wrote:
>>>
>>>> Here is the Jira:
>>>> https://malhar.atlassian.net/browse/MLHR-1937
>>>>
>>>> On Thu, Dec 10, 2015 at 3:30 PM, Isha Arkatkar <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I would like to take it up.
>>>>>
>>>>> Thanks,
>>>>> Isha
>>>>>
>>>>> On Thu, Dec 10, 2015 at 3:17 PM, Chandni Singh <[email protected]> wrote:
>>>>>
>>>>>> Any takers for MinHash?
>>>>>>
>>>>>> On Wed, Dec 9, 2015 at 3:13 AM, Chaitanya Chebolu <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Chandni,
>>>>>>>
>>>>>>> Yes. I have the implementation of BloomFilter and this can be
>>>>>>> added to Malhar.
>>>>>>> I need to update the branch and then will open a PR.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Chaitanya
>>>>>>>
>>>>>>> On Wed, Dec 9, 2015 at 1:42 PM, Chandni Singh <[email protected]> wrote:
>>>>>>>
>>>>>>>> Chaitanya,
>>>>>>>>
>>>>>>>> I believe you have an implementation of BloomFilter in your fork.
>>>>>>>> Do you think that can be added to Malhar?
>>>>>>>>
>>>>>>>> Chandni
>>>>>>>>
>>>>>>>> On Tue, Dec 8, 2015 at 9:02 PM, David Yan <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Bloom Filter, MinHash, and HyperLogLog are some of the commonly
>>>>>>>>> used algorithms in Big Data. I think having them in the Malhar
>>>>>>>>> library would be a good idea.
>>>>>>>>>
>>>>>>>>> There's a ticket for HyperLogLog created a long time ago:
>>>>>>>>> https://malhar.atlassian.net/browse/MLHR-1822
>>>>>>>>>
>>>>>>>>> On Tue, Dec 8, 2015 at 5:42 PM, Chandni Singh <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> We need to add a BloomFilter implementation in Malhar.
>>>>>>>>>> ManagedState has a use for it and I am pretty sure we will come
>>>>>>>>>> up with more and more use cases that will need it. Tim's
>>>>>>>>>> suggestion on Spill-able/Spooled data structures may use it too.
>>>>>>>>>>
>>>>>>>>>> Chandni
