[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2181:
--------------------------------

    Attachment: LUCENE-2181.patch

OK, I think we might be close to something committable now:
* wrote tests for NewLocaleTask and NewCollationAnalyzerTask
* set doc.stored=false, doc.tokenized=false, doc.body.tokenized=true in the collation.alg file
* moved the two scripts into a 'scripts' directory; I thought this made more sense?
* renamed the bm2jira.pl script to collation.bm2jira.pl

Here is the output from 'ant collation' from the benchmark package:

||Language||java.text||ICU4J||KeywordAnalyzer||ICU4J Improvement||
|English|10.78s|7.32s|1.58s|60%|
|French|11.48s|7.52s|1.59s|67%|
|German|11.19s|7.52s|1.61s|62%|
|Ukrainian|13.03s|8.68s|1.66s|62%|

I think it's more accurate relative to KeywordAnalyzer now that we aren't storing the body text in a stored field and things like that, but of course you can change the .alg file and turn these back on to see whether the differences matter in the context of overall indexing.

> benchmark for collation
> -----------------------
>
>                 Key: LUCENE-2181
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2181
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/benchmark
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-2181.patch, LUCENE-2181.patch, LUCENE-2181.patch,
> top.100k.words.de.en.fr.uk.wikipedia.2009-11.tar.bz2
>
>
> Steven Rowe attached a contrib/benchmark-based benchmark for collation (both
> jdk and icu) under LUCENE-2084, along with some instructions to run it...
> I think it would be nice if we could turn this into a committable patch and
> add it to benchmark.
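
For anyone reading the "ICU4J Improvement" column: the message does not say how it is computed, but the reported numbers are consistent with subtracting the KeywordAnalyzer time from both collation timings (treating it as the cost of indexing without collation) and then taking the ratio of what is left. The sketch below is only that guess written out. The class and method names are made up for illustration and are not taken from the patch or from collation.bm2jira.pl.

```java
// Hypothetical helper, not part of LUCENE-2181.patch. It reproduces the
// "ICU4J Improvement" column under the ASSUMPTION that the KeywordAnalyzer
// time is subtracted from both collation timings as a baseline.
class CollationImprovement {

    /**
     * Returns how much faster ICU4J collation is than java.text collation,
     * in percent, after removing the baseline (non-collation) indexing time.
     */
    static long improvementPercent(double jdkSeconds, double icuSeconds, double baselineSeconds) {
        double jdkOverhead = jdkSeconds - baselineSeconds; // time attributable to java.text collation
        double icuOverhead = icuSeconds - baselineSeconds; // time attributable to ICU4J collation
        return Math.round((jdkOverhead / icuOverhead - 1.0) * 100.0);
    }

    public static void main(String[] args) {
        // Timings copied from the table in this message:
        // language, java.text, ICU4J, KeywordAnalyzer (all in seconds).
        String[] languages = {"English", "French", "German", "Ukrainian"};
        double[][] seconds = {
            {10.78, 7.32, 1.58},
            {11.48, 7.52, 1.59},
            {11.19, 7.52, 1.61},
            {13.03, 8.68, 1.66},
        };
        for (int i = 0; i < languages.length; i++) {
            long pct = improvementPercent(seconds[i][0], seconds[i][1], seconds[i][2]);
            System.out.println(languages[i] + ": " + pct + "%");
        }
    }
}
```

Under that assumption the four rows come out at the same percentages as the table. Comparing the raw timings without the baseline would give a noticeably smaller figure, which is why the KeywordAnalyzer column matters for reading the result.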