[ 
https://issues.apache.org/jira/browse/LUCENE-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802034#action_12802034
 ] 

Robert Muir commented on LUCENE-2226:
-------------------------------------

Hi DM, I wanted to give my opinion on some of your comments:

bq. Robert, I'm suggesting that you move it, but that in CHANGES.txt you 
make it clear that part of the user's responsibility in upgrading is to delete 
the snowball jar. I've been bitten too many times by having both the old jar and a 
new jar in the classpath. I know better but ...

OK, I'll improve the wording in CHANGES.
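
As an aside on the stale-jar problem described above, here is a minimal sketch 
of one way to check for it: ask the class loader for every location that 
provides a given class, and warn if there is more than one. The class name 
DuplicateClassCheck and its locations() helper are made up for illustration, 
and org.apache.lucene.analysis.snowball.SnowballAnalyzer is only an example of 
a class that would show up twice if both the old snowball jar and the merged 
analyzers jar were on the classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: reports where a class is loaded from.
class DuplicateClassCheck {

    /** Returns every classpath location that provides the given class. */
    static List<URL> locations(String className) throws IOException {
        // A class is stored as a resource named like org/example/Foo.class
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        // getResources returns all matches, not only the first one that wins
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Example class name only; pass a different one as the first argument
        String name = args.length > 0
                ? args[0]
                : "org.apache.lucene.analysis.snowball.SnowballAnalyzer";
        List<URL> found = locations(name);
        if (found.size() > 1) {
            System.out.println("WARNING: " + name + " is provided by more than one location:");
        } else {
            System.out.println(name + " is provided by " + found.size() + " location(s):");
        }
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

Run with the application's real classpath; more than one line of output for 
the same class suggests an old jar was left behind after the upgrade.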

bq. I'd more or less agree with you that one could stay with old jars if it 
weren't for bug fixes. Some bug fixes, such as in LUCENE-2055, will partially 
invalidate an index, requiring it to be rebuilt to work as expected. I think 
these need to be done and should not require bw compat maintenance of the bug.

OK, we can discuss this on that issue when the time comes. Personally I am for 
"fixing the bug", which in this case means using Snowball instead. I'd like to 
remove the old cruft. Maybe we decide we should keep it around with Version for 
a while, but not too long: I think 2 major releases is too long in this case, 
since Lucene major releases don't seem to happen that often.

bq. Part of what Robert is saying (or at least what I am hearing): Snowball 
should be trusted, but contrib/common stemmers should not.

No, I am not really trying to imply this. Snowball has its own problems (Karl 
Wettin and I both reported problems to their mailing list recently), but for 
these languages, it's better to simply use Snowball itself rather than some code 
that tries to implement one of its algorithms, but doesn't quite do it right.

bq. I think that ultimately it will be important for bw compat to cover 
non-English well.

I agree, and we can continue to do this. But "ultimately" is a long time away: 
contrib/analyzers still needs some work, and even then, for non-English some 
stuff is simply outside of our control, e.g. the Myanmar Unicode model changing 
between 5.1 and 5.2, things like that.

But for this change, we simply need to move things from one contrib to another; 
the packages are left unchanged.

If I knew of an easy way to 'move things from one jar to another and preserve 
drop-in jar back compat', I would do it. But I think this concept doesn't even 
make sense for what we are doing here?


> move contrib/snowball to contrib/analyzers
> ------------------------------------------
>
>                 Key: LUCENE-2226
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2226
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.1
>
>         Attachments: LUCENE-2226.patch
>
>
> to fix bugs in some duplicate, handcoded impls of these stemmers (nl, fr, ru, 
> etc) we should simply merge snowball and analyzers, and replace the buggy 
> impls with the proper snowball stemfilters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
