Hi,
We plan to upgrade the Lucene library in our application from 2.4.1 to 3.5.0. I
have been running the benchmark tests that come with Lucene. To my surprise, I
found that indexing in 3.5.0 is significantly slower than in 2.4.1 for the
Wikipedia data.
Attached is the algorithm for the tests.
ystemErase
{ "Populate"
    CreateIndex
    { "MAddDocs" AddDoc > : 20
    CloseIndex
}
NewRound
} : 3
RepSumByName
RepSumByPrefRound MAddDocs
#End of wikipedia-default.alg file
Thanks,
Sean
From: Sean Tong [mailto:st...@jamasoftware.com]
Sent: Sund
s a different implementation than in 2.9
and 2.4 or rerun the 2.4 benchmarks with a WhitespaceAnalyzer just for the
comparison.
simon
On Mon, Dec 12, 2011 at 7:08 PM, Sean Tong wrote:
> Looks like the attachment for the algorithm is missing from last email. I
> have pasted the text here.
ck? I also wonder if it maybe now uses
update instead of add, i.e. buffers and applies deletes, etc.
simon
On Mon, Dec 12, 2011 at 10:03 PM, Sean Tong wrote:
> Thanks Simon for your response.
>
> I just re-ran the 3.5 benchmark with the ClassicAnalyzer. Here are the
> resul
.77%
-----Original Message-----
From: Sean Tong
Sent: Tuesday, December 13, 2011 10:47 AM
To: 'java-user@lucene.apache.org'
Subject: RE: Is indexing much slower in 3.5.0 than in 2.4.1 for Wikipedia data?
Simon,
I checked the indexes with Luke and you were right about the benchmark