Try something like TermDocs termDocs = reader.termDocs(); termDocs.seek(new Term("<relevant field name here>", "")); while (termDocs.next()) { bits.set(termDocs.doc()); }
I *think* (and I'm remembering things folks wrote, haven't done this myself) that the empty string for the Term matches all terms. If not, you might have to wrap in in an outer loop that loops through all the elements, something like bits = new BitSet(reader.maxDoc()); TermDocs termDocs = reader.termDocs(); FilteredTermEnum fEnum = new FilteredTermEnum(reader, new Term(field, "")); for (Term term = null; (term = fEnum.term()) != null; fEnum.next()) { termDocs.seek(new Term( field, term.text())); while (termDocs.next()) { bits.set(termDocs.doc()); } } That said, it may be best for you to loop through each document and add that doc to the relevant filters if it had the fields you're interested in. You'd only be fetching each document once, so it'd only be one loop. I don't know enough about relative efficiencies to make a call here, probably depends upon how many docs you're dealing with. I'd stop at the first solution that works with acceptable performance unless you expect your corpus to grow significantly.... And since this is done in off hours, there's not a pressing reason to go with the very most efficient solution unless it takes a too long or you expect to have orders of magnitued more documents in your index eventually. Best Erick