Great improvement! I ran a test on our data set. The doc count is about 2M+, and the index size after optimization is about 13.3 GB (including fdt). It seems Lucene 4's index format is better than Lucene 2.9.3's, and PFor gives good results.

Besides BlockEncoder for frq and pos, are there any other modifications in Lucene 4?

decoder \ avg time         single word (ms)   AND query (ms)   OR query (ms)
VINT in lucene 2.9         11.2               36.5             38.6
VINT in lucene 4 branch    10.6               26.5             35.4
PFor in lucene 4 branch    8.1                22.5             30.7

2010/12/21 Li Li <fancye...@gmail.com>:
>> OK we should have a look at that one still. We need to converge on a
>> good default codec for 4.0. Fortunately it's trivial to take any int
>> block encoder (fixed or variable block) and make a Lucene codec out of
>> it!
>
> I suggests you not to use this one, I fixed dozens of bugs but it
> still failed when with random tests. it's codes is hand coded rather
> than generated by program. But we may learn something from it.