> I'd have to create a similarly sized array of Boolean.Occur too, isn't it?
What for?

> 1. Is that how SIREn implements it?

No idea.

> 2. Is that optimal solution if I'm going to have an index of a billion docs
> with varying number of fields?

Probably not. I always use a catchall field. I don't claim that is optimal, certainly not for your circumstances which I know nothing about, but it works for me. Sure, it makes the index bigger but not dramatically so, for my data anyway. There is no need to store it, just index it. And unless you need to search on the individual fields that contribute to the catchall you don't need them indexed. Disk space is cheap and I assume you have suitable hardware to support your billion doc index/sharded indexes.

--
Ian.
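
P.S. To make the catchall idea concrete, here is a minimal sketch in plain Java with no Lucene calls in it. `CatchAll` and `build` are hypothetical names, not Lucene API. The assumption is that the joined string would be added to each document as one extra field that is indexed but not stored, so a query only has to name that single field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene: builds the text for a catchall field.
final class CatchAll {

    // Joins every non-empty field value with a single space,
    // in the iteration order of the supplied map.
    static String build(Map<String, String> fields) {
        StringJoiner joined = new StringJoiner(" ");
        for (String value : fields.values()) {
            if (value != null && !value.trim().isEmpty()) {
                joined.add(value.trim());
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Example document with a varying set of fields; names are made up.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Lucene in Action");
        doc.put("author", "Hatcher");
        doc.put("notes", ""); // empty values are skipped
        System.out.println(build(doc));
    }
}
```

With something like this, the number of fields per document can vary freely, since the search side only ever sees the one catchall field. Whether that is acceptable at a billion docs depends on your data, as said above.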