As I look through the code, I am thinking of the following approach:

1. Write a plugin that accepts an encoded string containing the doc ids instead of an array of ids.
2. Add a custom IdsFilterParser that decodes this string into a bit set and passes it downstream.
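
To make steps 1 and 2 concrete, here is a minimal sketch of the encode/decode part only, using nothing beyond the JDK. It assumes the doc ids are small non-negative integers, so that each id can be a bit position; arbitrary string ids would need a mapping first. The class name DocIdCodec and the choice of Base64 are my own illustration, not anything Elasticsearch provides.

```java
import java.util.Base64;
import java.util.BitSet;

// Hypothetical helper, not part of Elasticsearch or Lucene.
// Assumes doc ids are non-negative ints usable as bit positions.
final class DocIdCodec {

    private DocIdCodec() {
    }

    // Client side: pack the selected ids into a Base64 string.
    static String encode(BitSet docIds) {
        // toByteArray() is little-endian and stops at the highest set bit.
        return Base64.getEncoder().encodeToString(docIds.toByteArray());
    }

    // Server side (e.g. inside a custom filter parser): restore the bit set.
    static BitSet decode(String encoded) {
        return BitSet.valueOf(Base64.getDecoder().decode(encoded));
    }
}
```

For a corpus of 10 million ids the raw bit set is at most about 1.25 MB, or roughly 1.7 MB once Base64 encoded, regardless of how many ids are selected. Note that this only covers transport; turning the decoded ids into a per-segment DocIdSet is a separate problem.
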
But it seems that TermsFilter also needs to be customized (or a custom TermsFilter added), since TermsFilter.getDocIdSet is the method that would need to be overridden or modified to generate the DocIdSet from a set of doc ids rather than from a list of TermsAndFields, as it does now.

Is this the right approach? Any pointers?

Thanks,
Shantanu Sen

On Wednesday, June 11, 2014 9:26:27 PM UTC-7, Shantanu Sen wrote:
>
> Hi,
>
> We are currently using Lucene and are exploring Elasticsearch for scaling.
> We have a requirement to filter queries by doc id, and the set of docs to
> filter on can be quite large. For example, out of a corpus of 10 million
> documents, a user can choose a set of 5 million and run a query targeting
> that subset. Hence we need to pass in a set of 5 million doc ids so that
> the query runs only on those rather than on the full index.
>
> I am planning to use a mapped _id field that will be set during index
> mapping, and then use IdsFilterBuilder to build a filtered query. The
> issue is that the API takes a list of strings and hence will not scale.
> Ideally we would like to pass in a bit set containing all the doc ids.
>
> We will be using the Java API. What is the best way to approach this
> issue? I understand that we would need to write a custom API that accepts
> a bit set. If we write a plugin, can we access the internal APIs of
> Elasticsearch and hence not use the SearchRequestBuilder?
>
> Is a plugin the right approach? Any pointers as to where to start?
>
> Thanks,
> Shantanu Sen
