Not Elasticsearch, but I'm doing a similar thing using Redis. I started out
with the recommendify Ruby gem and then wrote my own, commendo.
http://rubygems.org/gems/commendo

We're using it in production at Meducation for pairwise comparison of about
30,000 resources, for both visit-based and content-based similarity. It
currently implements Jaccard similarity and could be extended to other measures.
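
For illustration only, here is a minimal Python sketch of Jaccard-based
pairwise similarity with a top-N lookup, an attribute filter and a simple
facet count. It is not how commendo or Elasticsearch do it; the function
names (jaccard, top_similar, facet_counts), the field names and the sample
data are all hypothetical, and it assumes everything fits in memory.

```python
from collections import Counter
from heapq import nlargest


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union.

    Returns 0.0 when both sets are empty.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_similar(doc_id, features, attrs, n, **required):
    """Return the n documents most similar to doc_id as (score, id) pairs.

    Only candidates whose attributes match every key in `required` are kept.
    """
    target = features[doc_id]
    scored = (
        (jaccard(target, feats), other)
        for other, feats in features.items()
        if other != doc_id
        and all(attrs[other].get(k) == v for k, v in required.items())
    )
    return nlargest(n, scored)


def facet_counts(hits, attrs, field):
    """Count the values of `field` across the hits, like a simple terms facet."""
    return Counter(attrs[doc][field] for _, doc in hits)


# Hypothetical data: the set of visitor ids per resource, plus plain attributes.
features = {
    1: {"u1", "u2", "u3"},
    2: {"u2", "u3", "u4"},
    3: {"u1", "u2", "u3", "u5"},
    4: {"u9"},
}
attrs = {
    1: {"fieldA": 1, "kind": "video"},
    2: {"fieldA": 1, "kind": "video"},
    3: {"fieldA": 1, "kind": "slides"},
    4: {"fieldA": 2, "kind": "video"},
}

# Top 2 documents most similar to document 1, restricted to fieldA == 1.
hits = top_similar(1, features, attrs, n=2, fieldA=1)
print(hits)
print(facet_counts(hits, attrs, "kind"))
```

Scoring every candidate on each query is only reasonable for small
collections; at larger sizes you would more likely precompute the scores and
store the best neighbours per document.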

hth,

rob


On 25 April 2014 23:09, NM <[email protected]> wrote:

>
> I have N documents containing attributes.
>
> I needed to precompute a special similarity measure for each pair
> of documents.
>
> Now I would like to understand how to index and search using ES to answer a
> query like
>
> "Retrieve the top N documents that are most similar to document
> ID 1 and have fieldA = 1"
> and facet the results according to a given field.
>
> --
>
> I was thinking of creating an index of documents with all the associated
> pairwise similarities as attributes, like:
>
> Doc
> id: 1
> field1: 7
> field2: 10
> sim_doc_id2: 10
> sim_doc_id3: 8
> sim_doc_id4: 12
> ...
> sim_doc_idN: 12
>
> Doc
> id: 2
> field1: 5
> field2: 2
> sim_doc_id1: 10
> sim_doc_id3: 3
> sim_doc_id4: 2
> ...
> sim_doc_idN: 10
> ..
>
> The issue with this design:
> The number of generated fields per document is very large in my case (10K),
> and I am not sure how to search efficiently. I tried a script score
> (something like return doc['sim_doc_id1'] + field1), but it was quite slow,
> especially compared to a naive loop in Java. However, I would like to use the
> aggregation framework of ES to create facets of the results.
>
> Do you have any recommendations or guidelines for handling this problem?
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/4e5f22d6-4f0a-4739-92c8-8b2e85885a6f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

