But please do give these changes a try! They should make a lot of phrase and conjunctive queries faster, especially with big indexes. Tell me if you have any problems.
Cheers,
Doug
[EMAIL PROTECTED] wrote:
+
+ 1. Changed the format of the .tis file, so that:
+
+    - it has a format version number, which makes it easier to
+      back-compatibly change file formats in the future.
+
+    - the term count is now stored as a long.  This was the one aspect
+      of Lucene's file formats which limited index size.
+
+    - a few internal index parameters are now stored in the index, so
+      that they can (in theory) now be changed from index to index,
+      although there is not yet an API to do so.
+
+    These changes are back compatible.  The new code can read old
+    indexes.  But old code will not be able to read new indexes. (cutting)
+
+ 2. Added an optimized implementation of TermDocs.skipTo().  A skip
+    table is now stored for each term in the .frq file.  This only
+    adds a percent or two to overall index size, but can substantially
+    speed up many searches.  (cutting)
+
+ 3. Restructured the Scorer API and all Scorer implementations to take
+    advantage of an optimized TermDocs.skipTo() implementation.  In
+    particular, PhraseQuerys and conjunctive BooleanQuerys are
+    faster when one clause has substantially fewer matches than the
+    others.  (A conjunctive BooleanQuery is a BooleanQuery where all
+    clauses are required.)  (cutting)
+
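To illustrate why items 2 and 3 help when one clause is much rarer than the others, here is a minimal sketch of a skip table driving a two-clause conjunction. It is not Lucene's code: the class SkippingPostings, its in-memory arrays, and the skip interval of 16 are all assumptions made for the example. Only the skipTo() contract (advance to the first entry beyond the current one whose document number is greater than or equal to the target) is meant to mirror TermDocs.skipTo().

```java
// Hypothetical in-memory model of one term's postings; not Lucene's actual classes.
class SkippingPostings {
    private final int[] docs;        // sorted, distinct document IDs
    private final int skipInterval;  // one skip entry per skipInterval postings (assumed value)
    private final int[] skipDocs;    // skipDocs[i] = last doc ID of the i-th full block
    private int pos = -1;            // index of the current posting, -1 before the first

    SkippingPostings(int[] sortedDocs, int skipInterval) {
        this.docs = sortedDocs;
        this.skipInterval = skipInterval;
        this.skipDocs = new int[sortedDocs.length / skipInterval];
        for (int i = 0; i < skipDocs.length; i++) {
            skipDocs[i] = sortedDocs[(i + 1) * skipInterval - 1];
        }
    }

    int doc() { return docs[pos]; }

    boolean next() {
        pos++;
        return pos < docs.length;
    }

    // Advance to the first posting beyond the current one whose doc ID is >= target.
    boolean skipTo(int target) {
        // Jump over whole blocks whose last doc ID is still below the target.
        int block = (pos + 1) / skipInterval;
        while (block < skipDocs.length && skipDocs[block] < target) {
            block++;
        }
        pos = Math.max(pos, block * skipInterval - 1);
        // Finish with a short linear scan inside the chosen block.
        do {
            if (!next()) return false;
        } while (docs[pos] < target);
        return true;
    }

    // Conjunction of two clauses: whichever side is behind skips to the other's doc.
    static int[] intersect(SkippingPostings a, SkippingPostings b) {
        if (!a.next() || !b.next()) return new int[0];
        int[] out = new int[Math.min(a.docs.length, b.docs.length)];
        int n = 0;
        while (true) {
            if (a.doc() == b.doc()) {
                out[n++] = a.doc();
                if (!a.next() || !b.next()) break;
            } else if (a.doc() < b.doc()) {
                if (!a.skipTo(b.doc())) break;
            } else {
                if (!b.skipTo(a.doc())) break;
            }
        }
        int[] result = new int[n];
        System.arraycopy(out, 0, result, 0, n);
        return result;
    }

    public static void main(String[] args) {
        int[] rare = {7, 300, 999};        // clause with few matches
        int[] common = new int[1000];      // clause matching every document
        for (int i = 0; i < common.length; i++) common[i] = i;
        int[] hits = intersect(new SkippingPostings(rare, 16),
                               new SkippingPostings(common, 16));
        System.out.println(java.util.Arrays.toString(hits)); // prints [7, 300, 999]
    }
}
```

In this sketch the common clause is never scanned entry by entry between matches: each skipTo() call hops over whole blocks using the skip table and then scans at most one block, which is the effect the rare clause has on a conjunctive query.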
