Hi, We are currently using Lucene and are exploring Elasticsearch for scaling. We need to restrict queries to a set of doc ids, and that set can be very large. For example, out of a corpus of 10 million documents, a user can select 5 million and run a query against only that subset. So we need to pass in a set of 5 million doc ids so that the query runs only on those documents rather than on the full index.
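
To give a sense of the sizes involved, here is a rough sketch in plain Java (no Elasticsearch APIs). It assumes, purely for illustration, that doc ids are dense integers in [0, 10 million) and that the user picked every other document; the class and helper names are hypothetical. It compares the characters needed to send the ids as decimal strings with the bytes needed to send the same subset as a java.util.BitSet:

```java
import java.util.BitSet;

public class DocIdBitSetSketch {

    // Hypothetical helper: mark every other doc id in [0, corpusSize) as selected.
    // Assumes doc ids are dense, non-negative integers.
    static BitSet everyOtherDoc(int corpusSize) {
        BitSet selected = new BitSet(corpusSize);
        for (int docId = 0; docId < corpusSize; docId += 2) {
            selected.set(docId);
        }
        return selected;
    }

    // Total characters needed if every selected id were sent as a decimal string
    // (ignores separators, quoting, and per-object overhead).
    static long stringPayloadChars(BitSet selected) {
        long chars = 0;
        for (int docId = selected.nextSetBit(0); docId >= 0;
                docId = selected.nextSetBit(docId + 1)) {
            chars += Integer.toString(docId).length();
        }
        return chars;
    }

    public static void main(String[] args) {
        BitSet selected = everyOtherDoc(10_000_000);

        // One bit per document in the corpus, packed into bytes.
        byte[] payload = selected.toByteArray();

        // The receiving side could rebuild the same set from the bytes.
        BitSet restored = BitSet.valueOf(payload);

        System.out.println("selected docs:        " + selected.cardinality());
        System.out.println("string id characters: " + stringPayloadChars(selected));
        System.out.println("bit set bytes:        " + payload.length);
        System.out.println("round trip equal:     " + restored.equals(selected));
    }
}
```

The bit set costs roughly one bit per document in the corpus regardless of how many are selected, which is why it looks attractive for subsets this large. It only works this directly if ids can be mapped to dense integers.
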
I am planning to use a mapped _id field that will be set in the index mapping, and then build a filtered query with IdsFilterBuilder. The issue is that the API takes a list of strings and hence will not scale to millions of ids; ideally we would like to pass in a bit set containing all the doc ids. We will be using the Java API. What is the best way to approach this issue? I understand that we would need to write a custom API that accepts a bit set. If we write a plugin, can we access the internal APIs of Elasticsearch and so avoid going through SearchRequestBuilder? Is a plugin the right approach? Any pointers on where to start?

Thanks,
Shantanu Sen
