I'm looking for advice on a problem I've run into while using Lucene.
I'm integrating Lucene as an alternative to the indexing and search methods that already exist in our product, so the integration should ideally live up to the existing requirements.
The content indexed as Lucene documents is structured as a tree (just like files in a filesystem), and the feature I am working on restricts a search to a certain part of this tree.
To implement this I used a PrefixQuery with the path of the folder to search below. Since the PrefixQuery creates a boolean query with one clause for each matching term, this becomes a problem when there are more than 1024 subfolders below the selected folder.
One way of getting around this would be if maxClauseCount could be set for a PrefixQuery, but there are problems with this.
Picking a number for this would be hard. In order to support very large installations a value of a million or so would have to be used. This would probably not perform very well.
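To make the scale of the problem concrete, here is a minimal sketch in plain Java (no Lucene calls) that counts how many indexed path terms a given prefix would match, which is the number of clauses the expanded query would need. The class name, the method names and the sample paths are hypothetical; the 1024 limit is the one mentioned above.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration: this simulates prefix expansion over a sorted
// set of path terms. It does not use or reproduce any Lucene API.
class PrefixExpansion {
    // Clause limit taken from the message above.
    static final int MAX_CLAUSES = 1024;

    // Counts the terms that start with the prefix. In a sorted set these
    // terms are contiguous, beginning at the first term >= prefix.
    static int countMatchingTerms(SortedSet<String> terms, String prefix) {
        int count = 0;
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the contiguous run of matching terms
            }
            count++;
        }
        return count;
    }

    // True if one clause per matching term would exceed the limit.
    static boolean exceedsLimit(SortedSet<String> terms, String prefix) {
        return countMatchingTerms(terms, prefix) > MAX_CLAUSES;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 1500; i++) {
            terms.add("/root/sub" + i); // made-up subfolder paths
        }
        terms.add("/other/x");
        System.out.println(countMatchingTerms(terms, "/root/"));
        System.out.println(exceedsLimit(terms, "/root/"));
    }
}
```

The count grows with the number of distinct paths below the folder, so any fixed limit is eventually exceeded by a large enough installation.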
The only alternative I can think of would be to store a whitespace-separated list of all ancestors along with each document:
/foo /foo/bar /foo/bar/baz
But this has two drawbacks: Index storage space used, and the cost of indexing (finding all ancestors).
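As a rough sketch of the ancestor-list idea, the field value can be derived from the document's own path with simple string operations, assuming each document already knows its absolute path, that '/' is the separator, and that paths carry no trailing slash. The class and method names are hypothetical, and the step of adding the value to a Lucene document is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the whitespace-separated ancestor list
// for an absolute, '/'-separated path without a trailing slash.
class AncestorPaths {
    // Returns each ancestor of the path followed by the path itself,
    // e.g. "/foo/bar/baz" gives [/foo, /foo/bar, /foo/bar/baz].
    static List<String> ancestors(String path) {
        List<String> result = new ArrayList<>();
        // Start at index 1 to skip the leading '/'.
        int idx = path.indexOf('/', 1);
        while (idx != -1) {
            result.add(path.substring(0, idx));
            idx = path.indexOf('/', idx + 1);
        }
        if (path.length() > 1) {
            result.add(path); // the full path; skipped for "" and "/"
        }
        return result;
    }

    // Joins the list with single spaces, ready to store as one field value.
    static String ancestorField(String path) {
        return String.join(" ", ancestors(path));
    }

    public static void main(String[] args) {
        System.out.println(ancestorField("/foo/bar/baz"));
    }
}
```

With such a field, and assuming it is tokenized on whitespace only, restricting a search to a folder would presumably need just one exact term match on the folder path instead of a prefix expansion. The cost is one extra term per level of depth for every document.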
So my question boils down to: Are there any alternatives to solve this scenario in an efficient way?
Thanks in advance,
Dennis Thrysøe
