Hi,

We are currently using Lucene and are exploring Elasticsearch for scaling. 
We have a requirement to restrict queries to a set of doc ids, and that set 
can be quite large. For example, out of a corpus of 10 million documents, a 
user can choose a subset of 5 million and run a query targeting only that 
subset. Hence we need to pass in a set of 5 million doc ids so that the 
query runs only on those documents rather than on the full index.

I am planning to use a mapped _id field that will be set during index 
mapping and then build a filtered query with IdsFilterBuilder. The issue is 
that the API takes a list of strings and hence will not scale to millions 
of ids. Ideally we would like to pass in a bit set containing all the doc 
ids.
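
To make the size argument concrete, here is a rough sketch of what I mean by 
a bit set of doc ids. It uses only java.util.BitSet and no Elasticsearch 
APIs. It assumes our doc ids are dense integers from 0 to N-1 (an assumption 
about our own id scheme, not something Elasticsearch provides), and the 
class and method names are made up for illustration.

```java
import java.util.BitSet;

// Hypothetical illustration only: not an Elasticsearch API.
class DocIdBitSetSketch {

    // One bit per document in the corpus; a set bit means "include this doc".
    // Selecting every other doc stands in for a user-chosen subset.
    static BitSet selectEveryOther(int corpusSize) {
        BitSet selected = new BitSet(corpusSize);
        for (int docId = 0; docId < corpusSize; docId += 2) {
            selected.set(docId);
        }
        return selected;
    }

    public static void main(String[] args) {
        int corpusSize = 10_000_000;
        BitSet selected = selectEveryOther(corpusSize);

        // Compact byte form that could be sent to a server-side component.
        byte[] wire = selected.toByteArray();
        BitSet decoded = BitSet.valueOf(wire);

        System.out.println("selected docs:    " + selected.cardinality());
        System.out.println("serialized bytes: " + wire.length);
        System.out.println("round trip ok:    " + decoded.equals(selected));
    }
}
```

For a 10 million document corpus this comes to roughly 1.25 MB regardless of 
how many docs are selected, whereas a list of 5 million id strings would be 
many times larger.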

We will be using the Java API. What is the best way to approach this issue? 
I understand that we would need to write a custom API that accepts a bit 
set. If we write a plugin, can we access the internal APIs of Elasticsearch 
and hence avoid going through the SearchRequestBuilder? 

Is a plugin the right approach? Any pointers as to where to start?

Thanks,
Shantanu Sen
