Hey,

indeed, this is mindblowing.


Best regards,

Sebastian



On Tue, 2021-02-23 at 09:58 -0800, leerho wrote:
> You might be interested in a current Java PR #349
> <https://github.com/apache/datasketches-java/pull/349> that is adding
> Jaccard similarity to the Tuple sketches and is capable of doing a
> jaccard(Tuple, Theta) as well.
> This doesn't immediately solve the problem for Hive, but when it
> appears in a release version, we would want to leverage it in the
> Hive adaptor.
> 
> Lee.
> 
> 
> On Mon, Feb 22, 2021 at 1:28 PM Sebastian Klemke
> <[email protected]> wrote:
> 
> > Hey,
> > 
> > great, will do exactly that :-)
> > 
> > Best regards,
> > 
> > Sebastian
> > 
> > 
> > On Mon, 2021-02-22 at 08:46 -0800, Alexander Saydakov wrote:
> > > Sebastian,
> > > Yes, a pull request is the way to go.
> > > 
> > > On Sun, Feb 21, 2021 at 9:28 PM leerho <[email protected]> wrote:
> > > 
> > > > > Thanks for offering a contribution.  The person best able to
> > > > > handle this has been out.  He will be back this coming week.
> > > > Cheers.
> > > > Lee.
> > > > 
> > > > On Sat, Feb 20, 2021 at 10:07 AM Sebastian Klemke
> > > > <[email protected]> wrote:
> > > > 
> > > > > Hi!
> > > > > 
> > > > > > Thanks for providing the datasketches library; it's a really
> > > > > > powerful tool that I use in several projects. Lately, I have
> > > > > > been using the Jaccard similarity estimator and found it would
> > > > > > be easier to use if it were available as a Hive UDF. I created
> > > > > > such a Hive UDF here:
> > > > > 
> > > > > 
> > > > > 
> > https://github.com/packet23/datasketches-hive/commit/9c0d72537ed5cede45d6b5282789af01a158af35
> > > > > <
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_packet23_datasketches-2Dhive_commit_9c0d72537ed5cede45d6b5282789af01a158af35&d=DwMFaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=0TpvE_u2hS1ubQhK3gLhy94YgZm2k_r8JHJnqgjOXx4&m=kjRPMYeMlrDsfXXVHUp0sUkLCHQUplHh-j_DQlE3W3g&s=iBN3nPWQzJ_MjDpIgDqfwX6lGW0Cz46nksXoqqDYmbY&e=
> > > > > > 
> > > > > 
> > > > > > but I'm unclear how to proceed with the contribution: should I
> > > > > > just make a pull request on GitHub, or do you prefer other
> > > > > > means?
> > > > > 
> > > > > 
> > > > > Best regards,
> > > > > 
> > > > > Sebastian
> > > > > 
> > > > > 
> > > > > --
> > > > > Sebastian Klemke
> > > > > [email protected]
> > > > >            147EEC173170C3F1A19F200244741CA8D4106FE9 @
> > > > > keys.openpgp.org
> > > > > <
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__keys.openpgp.org&d=DwMFaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=0TpvE_u2hS1ubQhK3gLhy94YgZm2k_r8JHJnqgjOXx4&m=kjRPMYeMlrDsfXXVHUp0sUkLCHQUplHh-j_DQlE3W3g&s=aczAzZ77PXQGZ-3FCrBg6H9bAm76yoVBO--t6U0UqsM&e=
> > > > > > 
> > > > > 
> > > > 
> > 
> > --
> > Sebastian Klemke                                   
> > [email protected]
> >            147EEC173170C3F1A19F200244741CA8D4106FE9 @
> > keys.openpgp.org
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> > 
> > 

-- 
Sebastian Klemke                                    [email protected]
           147EEC173170C3F1A19F200244741CA8D4106FE9 @ keys.openpgp.org

