[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798508#action_12798508 ]

Steven Rowe commented on LUCENE-2181:
-------------------------------------

Looks good.  I like the way you've integrated it into the benchmark suite, and 
as you say the NewLocaleTask should prove useful elsewhere.

bq. I put the files in my apache directory, but modified your patch somewhat

One major thing you changed but didn't mention above is that rather than 
applying the collation key transform only to the LineDoc body field, it's now 
also applied to the title and date fields.  Given the nature of the top 100k 
words files -- the title is an integer representing term frequency, and the 
date is essentially meaningless (the date on which I created the file) -- I 
don't think this makes sense (and that's why I made analyzers that only applied 
collation to the body field).
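
To make the distinction concrete, here is a minimal sketch of body-only 
collation using the JDK's java.text.Collator.  It is not code from the patch: 
the class name, the map-based document, the field keys "title", "date" and 
"body", and the hex encoding of the key bytes are all assumptions made for 
illustration.

```java
import java.text.Collator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not taken from the LUCENE-2181 patch.
final class BodyOnlyCollation {

    // Returns a copy of the document in which only the "body" value is
    // replaced by a hex rendering of its collation key. All other fields
    // (e.g. "title", "date") pass through unchanged.
    static Map<String, String> transform(Map<String, String> doc, Collator collator) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            if ("body".equals(field.getKey())) {
                byte[] key = collator.getCollationKey(field.getValue()).toByteArray();
                out.put(field.getKey(), toHex(key));
            } else {
                out.put(field.getKey(), field.getValue());
            }
        }
        return out;
    }

    // Hex is an assumed stand-in encoding so the key bytes can be shown as text.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "12345");      // term frequency, as in the word files
        doc.put("date", "2009-11-30");  // made-up placeholder date
        doc.put("body", "Zürich");
        System.out.println(transform(doc, Collator.getInstance(Locale.GERMAN)));
    }
}
```

With this shape, the title and date values come back exactly as they went in, 
so only the body field pays the cost of the collation key transform.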

> benchmark for collation
> -----------------------
>
>                 Key: LUCENE-2181
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2181
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/benchmark
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-2181.patch, LUCENE-2181.patch, 
> top.100k.words.de.en.fr.uk.wikipedia.2009-11.tar.bz2
>
>
> Steven Rowe attached a contrib/benchmark-based benchmark for collation (both 
> jdk and icu) under LUCENE-2084, along with some instructions to run it... 
> I think it would be nice if we could turn this into a committable patch and 
> add it to benchmark.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
