Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?

2011-12-11 Thread Sean Tong
Hi, We plan to upgrade the Lucene library in our application from 2.4.1 to 3.5.0. I have been running the benchmark tests that come with Lucene. To my surprise, I found that indexing in 3.5.0 is significantly slower than in 2.4.1 for the Wikipedia data. Attached is the algorithm for the tests.

RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?

2011-12-12 Thread Sean Tong
ystemErase
    { "Populate"
        CreateIndex
        { "MAddDocs" AddDoc > : 20
        CloseIndex
    }
    NewRound
} : 3
RepSumByName
RepSumByPrefRound MAddDocs
#End of wikipedia-default.alg file

Thanks, Sean

From: Sean Tong [mailto:st...@jamasoftware.com] Sent: Sund

RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?

2011-12-12 Thread Sean Tong
s a different implementation than in 2.9 and 2.4 or rerun the 2.4 benchmarks with a WhitespaceAnalyzer just for the comparison. simon On Mon, Dec 12, 2011 at 7:08 PM, Sean Tong wrote: > Looks like the attachment for the algorithm is missing from last email.  I > have pasted the text here.

RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?

2011-12-13 Thread Sean Tong
ck? I also wonder if it may now use update instead of add, i.e. it buffers and applies deletes, etc. simon On Mon, Dec 12, 2011 at 10:03 PM, Sean Tong wrote: > Thanks Simon for your response. > > I just re-ran the 3.5 benchmark with the ClassicAnalyzer. Here are the > resul

RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?

2011-12-13 Thread Sean Tong
.77% -----Original Message----- From: Sean Tong Sent: Tuesday, December 13, 2011 10:47 AM To: 'java-user@lucene.apache.org' Subject: RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data? Simon, I checked the indexes with Luke and you were right about the benchmark