I have N documents, each with a set of attributes. I have precomputed a custom similarity measure between every pair of documents.
Now I would like to understand how to index and search with ES to answer a query like "Retrieve the top N documents that are most similar to document ID 1 and that have fieldA = 1", and to facet the results on a given field.

I was thinking of creating an index of documents in which each document stores all of its pairwise similarities as attributes, like:

Doc id: 1
  field1: 7
  field2: 10
  sim_doc_id2: 10
  sim_doc_id3: 8
  sim_doc_id4: 12
  ...
  sim_doc_idN: 12

Doc id: 2
  field1: 5
  field2: 2
  sim_doc_id1: 10
  sim_doc_id3: 3
  sim_doc_id4: 2
  ...
  sim_doc_idN: 10

...

Issue with this design: the number of generated fields per document is very large in my case (10K), and I am not sure how to search efficiently. I tried a script score like return doc['sim_doc_id1'] + field1, but it was quite slow, especially compared to a naive loop in Java. However, I would still like to use the aggregation framework of ES to create facets of the results.

Do you have any recommendations or guidelines for handling this problem?

Thanks
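For reference, here is a minimal in-memory sketch (plain Python, not Elasticsearch) of the result I am after, so the expected semantics are unambiguous. Docs 1 and 2 and their similarities reuse the values from the example above; docs 3 and 4, their field values, and the names `top_similar`, `docs` and `sim` are made up for illustration. The facet counts are taken over every document that passes the filter, not only the top N, which is an assumption about how the facets should behave.

```python
from collections import Counter

# Toy data. Docs 1 and 2 come from the example in the post;
# docs 3 and 4 (and their field values) are invented for illustration.
docs = {
    1: {"field1": 7, "field2": 10},
    2: {"field1": 5, "field2": 2},
    3: {"field1": 5, "field2": 10},
    4: {"field1": 5, "field2": 2},
}

# Precomputed pairwise similarities, stored once per unordered pair.
sim = {
    frozenset((1, 2)): 10,
    frozenset((1, 3)): 8,
    frozenset((1, 4)): 12,
    frozenset((2, 3)): 3,
    frozenset((2, 4)): 2,
}


def top_similar(doc_id, n, filter_field, filter_value, facet_field):
    """Return (top-n hits, facet counts) for documents similar to doc_id."""
    # Keep every other document that passes the attribute filter
    # and has a stored similarity to doc_id.
    candidates = [
        (other, sim[frozenset((doc_id, other))])
        for other, attrs in docs.items()
        if other != doc_id
        and attrs[filter_field] == filter_value
        and frozenset((doc_id, other)) in sim
    ]
    # Highest similarity first, then cut to the top n.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    hits = candidates[:n]
    # Facet over all filtered candidates (assumption: not only the top n).
    facets = Counter(docs[other][facet_field] for other, _ in candidates)
    return hits, facets


hits, facets = top_similar(
    1, n=2, filter_field="field1", filter_value=5, facet_field="field2"
)
print(hits)    # list of (doc_id, similarity) pairs
print(facets)  # counts per field2 value
```

This is essentially the naive loop mentioned above; the question is how to get the same answer from ES without one field per document pair.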
