Additional: I'm on Lucene 4.10.2. If I use a BooleanFilter as per Ian's suggestion, I still get a null acceptDocs passed to my NDV filter.
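
For reference, a minimal sketch of a null-tolerant version of the acceptDocs check, assuming (as I read the 4.x Filter contract) that a null acceptDocs means "no restriction, every doc is acceptable". DocBits is a hypothetical stand-in for Lucene's Bits so the sketch compiles without Lucene on the classpath, and a plain String[] stands in for the per-document NDV values. It only avoids the NPE; it does not explain why the TermFilter's restriction is not arriving as acceptDocs.

```java
import java.util.BitSet;

class AcceptDocsSketch {

    // Hypothetical stand-in for Lucene's Bits interface (assumption, not the real class).
    interface DocBits {
        boolean get(int index);
    }

    // values[doc] stands in for the NDV value of each document (assumption).
    static BitSet matchingDocs(String[] values, String matchTag, DocBits acceptDocs) {
        BitSet matches = new BitSet(values.length);
        for (int doc = 0; doc < values.length; doc++) {
            // Treat null as "accept everything"; only skip when a restriction
            // is present and it rejects this document.
            if (acceptDocs != null && !acceptDocs.get(doc)) {
                continue;
            }
            String value = values[doc];
            if (value != null && value.equals(matchTag)) {
                matches.set(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] values = {"red", "blue", "red", "red"};
        DocBits evenOnly = doc -> doc % 2 == 0;

        System.out.println(matchingDocs(values, "red", null));     // no restriction
        System.out.println(matchingDocs(values, "red", evenOnly)); // docs 1 and 3 skipped
    }
}
```

With a null acceptDocs the loop still visits every document, so this matches what I am seeing: no NPE, but no saving either.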

Sent from my iPhone

> On 11 Mar 2015, at 17:19, Chris Bamford <[email protected]> wrote:
>
> Hi Shai
>
> I thought that might be what acceptDocs was for, but in my case it is null
> and throws an NPE if I try your suggestion.
>
> What am I doing wrong? I'd like to really understand this stuff ..
>
> Thanks
>
> Chris
>
>
>> On 11 Mar 2015, at 13:05, Shai Erera <[email protected]> wrote:
>>
>> I don't see that you use acceptDocs in your MyNDVFilter. I think it would
>> return false for all userB docs, but you should confirm that.
>>
>> Anyway, because you use an NDV field, you can't automatically skip
>> unrelated documents, but rather your code would look something like:
>>
>> for (int i = 0; i < reader.maxDoc(); i++) {
>>   if (!acceptDocs.get(i)) {
>>     continue;
>>   }
>>   // document is accepted, read values
>>   ...
>> }
>>
>> Shai
>>
>>> On Wed, Mar 11, 2015 at 1:25 PM, Ian Lea <[email protected]> wrote:
>>>
>>> Can you use a BooleanFilter (or ChainedFilter in 4.x) alongside your
>>> BooleanQuery? Seems more logical and I suspect would solve the problem.
>>> Caching filters can be good too, depending on how often your data changes.
>>> See CachingWrapperFilter.
>>>
>>> --
>>> Ian.
>>>
>>>
>>> On Tue, Mar 10, 2015 at 12:45 PM, Chris Bamford <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I have an index of 30 docs, 20 of which have an owner field of "UserA"
>>>> and 10 of "UserB".
>>>> I also have a query which consists of:
>>>>
>>>> BooleanQuery:
>>>> -- Clause 1: TermQuery
>>>> -- Clause 2: FilteredQuery
>>>> ----- Branch 1: MatchAllDocsQuery()
>>>> ----- Branch 2: MyNDVFilter
>>>>
>>>> I execute my search as follows:
>>>>
>>>> searcher.search(booleanQuery,
>>>>                 new TermFilter(new Term("owner", "UserA")),
>>>>                 50);
>>>>
>>>> The TermFilter's job is to reduce the number of searchable documents
>>>> from 30 to 20, which it does for all clauses of the BooleanQuery except
>>>> for MyNDVFilter, which iterates through the full 30 docs, 10 needlessly.
>>>> How can I restrict it so it behaves the same as the other query branches?
>>>>
>>>> MyNDVFilter source code:
>>>>
>>>> public class MyNDVFilter extends Filter {
>>>>
>>>>     private String fieldName;
>>>>     private String matchTag;
>>>>
>>>>     public MyNDVFilter(String ndvFieldName, String matchTag) {
>>>>         this.fieldName = ndvFieldName;
>>>>         this.matchTag = matchTag;
>>>>     }
>>>>
>>>>     @Override
>>>>     public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
>>>>             throws IOException {
>>>>
>>>>         AtomicReader reader = context.reader();
>>>>         int maxDoc = reader.maxDoc();
>>>>         final FixedBitSet bitSet = new FixedBitSet(maxDoc);
>>>>         BinaryDocValues ndv = reader.getBinaryDocValues(fieldName);
>>>>
>>>>         if (ndv != null) {
>>>>             for (int i = 0; i < maxDoc; i++) {
>>>>                 BytesRef br = ndv.get(i);
>>>>                 if (br.length > 0) {
>>>>                     String strval = br.utf8ToString();
>>>>                     if (strval.equals(matchTag)) {
>>>>                         bitSet.set(i);
>>>>                         System.out.println("MyNDVFilter >> " + matchTag +
>>>>                                 " matched " + i + " [" + strval + "]");
>>>>                     }
>>>>                 }
>>>>             }
>>>>         }
>>>>
>>>>         return new DVDocSetId(bitSet); // just wraps a FixedBitSet
>>>>     }
>>>> }
>>>>
>>>>
>>>> Chris Bamford
>>>> Senior Developer
>>>> w: www.mimecast.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
