Not Elasticsearch, but I'm doing a similar thing using Redis. I started out with the recommendify Ruby gem and then wrote my own, commendo: http://rubygems.org/gems/commendo
We're using it in production for pairwise comparison of about 30,000 resources at Meducation, for both visit-based and content-based similarity. It implements Jaccard similarity now and could be extended.

hth,
rob

On 25 April 2014 23:09, NM <[email protected]> wrote:
>
> I have N documents containing attributes.
>
> I needed to precompute a special similarity measure between each pair
> of documents.
>
> Now I would like to understand how to index and search using ES to answer a
> query like
>
> "Retrieve me the Top N documents that are the most similar to document
> ID 1 and having as fieldA = 1"
> and facet the results according to a given field
>
> --
>
> I was thinking of creating an index of documents with all the associated
> pairwise similarities as attributes, like:
>
> Doc
> id: 1
> field1: 7
> field2: 10
> sim_doc_id2: 10
> sim_doc_id3: 8
> sim_doc_id4: 12
> ...
> sim_doc_idN: 12
>
> Doc
> id: 2
> field1: 5
> field2: 2
> sim_doc_id1: 10
> sim_doc_id3: 3
> sim_doc_id4: 2
> ...
> sim_doc_idN: 10
> ..
>
> Issue with such a design:
> The number of generated fields per document is very large for me (10K)
> and I am not sure how to search efficiently (I tried a script score
> like return doc['sim_doc_id1'] + field1) but it was quite slow,
> especially compared to a stupid loop in Java. However, I would like to use the
> aggregation framework of ES to create facets of the results.
>
> Do you have any recommendation / guideline to handle this problem?
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/4e5f22d6-4f0a-4739-92c8-8b2e85885a6f%40googlegroups.com
> For more options, visit https://groups.google.com/d/optout.
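
To make the Jaccard idea concrete, here is a minimal Python sketch of pairwise Jaccard similarity over sets, plus a "top N most similar, with a filter" lookup like the one asked about in the quoted question. This is not commendo's code or API, and it uses plain in-memory sets rather than Redis. The names `jaccard`, `pairwise_similarities`, `top_n`, `visits` and `field_a`, and all the sample data, are made up for illustration.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_similarities(members: dict) -> dict:
    """Return {id: {other_id: score}} for every pair with a non-zero score."""
    sims = {rid: {} for rid in members}
    for x, y in combinations(members, 2):
        score = jaccard(members[x], members[y])
        if score > 0:
            # Jaccard is symmetric, so store the score in both directions.
            sims[x][y] = score
            sims[y][x] = score
    return sims


def top_n(sims: dict, resource_id, n: int, keep=lambda rid: True) -> list:
    """Top n (id, score) pairs most similar to resource_id that pass `keep`."""
    candidates = [(rid, s) for rid, s in sims[resource_id].items() if keep(rid)]
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))[:n]


# Hypothetical data: resource id -> set of visitor ids (visit-based similarity).
visits = {
    1: {"u1", "u2", "u3"},
    2: {"u2", "u3", "u4"},
    3: {"u3"},
    4: {"u9"},
}
# Hypothetical attribute to filter on, standing in for fieldA in the question.
field_a = {1: 1, 2: 0, 3: 1, 4: 1}

sims = pairwise_similarities(visits)
print(top_n(sims, 1, 2))
print(top_n(sims, 1, 2, keep=lambda rid: field_a[rid] == 1))
```

The all-pairs loop is quadratic in the number of resources, so at larger scale you would precompute the scores offline and only store the non-zero ones, which is what the sketch does by skipping pairs with a score of 0.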
