On Monday, July 3, 2017 at 6:33:05 PM UTC+2, Tong Sun wrote:
>
> FYI, I've just enhanced the Go simhash package from mfonda/simhash 
> <https://github.com/mfonda/simhash> to go-dedup 
> <https://github.com/go-dedup>/simhash 
> <https://github.com/go-dedup/simhash>. 
>
> The reasons & enhancements are summarized here:
> https://github.com/go-dedup/simhash#versions
>
> Detailed documentation of the changes and the reasons behind them, as well 
> as how to use the original (v1) API design, can be found here 
> <https://github.com/go-dedup/simhash/wiki/Version-2>.
>
> All patches welcome! Thx. 
>
>

   - dgryski/go-simstore <https://github.com/dgryski/go-simstore> One of 
   the earliest but "*not very promising*" 
   <https://groups.google.com/forum/#!msg/golang-nuts/E9UVskCnSJc/gm7KF27LnI0J>
    :(


Note that the package that was listed as "not very promising" was the 
initial "trifles" implementation, and not the complete service that's in 
go-simstore.

go-simstore is hard-coded for Hamming distance 3 or 6, and I've done a lot 
of work on reducing its memory usage.  The service is used in production at 
Booking.com on data sets of ~60 million records.
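
For anyone unfamiliar with the term: "distance" here is presumably the 
Hamming distance between two 64-bit simhash fingerprints, i.e. the number 
of bit positions in which they differ. A minimal sketch of that check 
follows. The function names and fingerprint values are made up for 
illustration; this is not go-simstore's code or API, which uses an index 
rather than pairwise comparison.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance returns the number of bit positions in which two
// 64-bit fingerprints differ. (Hypothetical helper, not part of
// go-simstore.)
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// isNearDuplicate reports whether two fingerprints are within
// maxDistance bits of each other. (Hypothetical helper.)
func isNearDuplicate(a, b uint64, maxDistance int) bool {
	return hammingDistance(a, b) <= maxDistance
}

func main() {
	// Made-up fingerprints; real ones would come from a simhash function.
	a := uint64(0xFFFF0000FFFF0000)
	b := a ^ 0x7  // flip the three lowest bits
	c := b ^ 0xF0 // flip four more bits

	fmt.Println(hammingDistance(a, b))    // prints 3
	fmt.Println(isNearDuplicate(a, b, 3)) // prints true
	fmt.Println(isNearDuplicate(a, c, 6)) // prints false (distance is 7)
}
```

Comparing every pair like this does not scale to tens of millions of 
records, which is why a fixed maximum distance matters: it lets the 
fingerprints be indexed for lookup instead of scanned.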

Damian
 
