[ 
https://issues.apache.org/jira/browse/LUCENE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725972#action_12725972
 ] 

Rob ten Hove commented on LUCENE-1373:
--------------------------------------

Is it possible that, due to the same bug, a property whose value ends in 
"Type", such as "InputFileType", is not indexed when the OS language is Dutch? 
I have two installations of Alfresco 3 Labs, each with the auto-installed 
Lucene 2.1.0 and exactly the same installation options (English as the 
language for Alfresco). Apart from the hardware, the main difference is the OS 
language: both run XP with SP2, but one is English and the other Dutch. On the 
Dutch OS installation, three properties with values ending in "Type" could not 
be found, whereas they are present on the English one.

> Most of the contributed Analyzers suffer from invalid recognition of acronyms.
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-1373
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1373
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis, contrib/analyzers
>    Affects Versions: 2.3.2
>            Reporter: Mark Lassau
>            Priority: Minor
>         Attachments: LUCENE-1373.patch
>
>
> LUCENE-1068 describes a bug in StandardTokenizer whereby a string like 
> "www.apache.org." would be incorrectly tokenized as an acronym (note the dot 
> at the end).
> Unfortunately, keeping the "backward compatibility" of a bug turns out to 
> harm us.
> StandardTokenizer has a couple of ways to indicate "fix this bug", but 
> unfortunately the default behaviour is still to be buggy.
> Most of the non-English analyzers provided in lucene-analyzers utilize the 
> StandardTokenizer, and in v2.3.2 not one of these provides a way to get the 
> non-buggy behaviour :(
> I refer to:
> * BrazilianAnalyzer
> * CzechAnalyzer
> * DutchAnalyzer
> * FrenchAnalyzer
> * GermanAnalyzer
> * GreekAnalyzer
> * ThaiAnalyzer
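
For readers unfamiliar with the trailing-dot problem quoted above, here is a 
minimal, self-contained sketch of the idea. It is not Lucene code and does not 
use Lucene's grammar: the two regular expressions, the class name 
AcronymSketch, the classify method and its applyFix parameter are all 
assumptions made up for illustration. It only shows how a loose "runs followed 
by dots" rule labels "www.apache.org." as an acronym, and how an opt-in fix 
could relabel it as a host name.

```java
import java.util.regex.Pattern;

// Illustrative only: a toy classifier, NOT the StandardTokenizer grammar.
public class AcronymSketch {
    // Assumed loose rule: alphanumeric runs, each followed by a dot.
    // "www.apache.org." satisfies it because of the trailing dot.
    private static final Pattern LOOSE_DOTTED =
            Pattern.compile("[A-Za-z0-9]+\\.(?:[A-Za-z0-9]+\\.)+");

    // Assumed strict rule: single letters, each followed by a dot ("U.S.A.").
    private static final Pattern STRICT_ACRONYM =
            Pattern.compile("[A-Za-z]\\.(?:[A-Za-z]\\.)+");

    // applyFix is a hypothetical switch standing in for "fix this bug".
    static String classify(String text, boolean applyFix) {
        if (STRICT_ACRONYM.matcher(text).matches()) {
            return "ACRONYM";
        }
        if (LOOSE_DOTTED.matcher(text).matches()) {
            // Default (buggy) behaviour keeps the acronym label.
            return applyFix ? "HOST" : "ACRONYM";
        }
        return "OTHER";
    }

    public static void main(String[] args) {
        System.out.println(classify("www.apache.org.", false)); // ACRONYM
        System.out.println(classify("www.apache.org.", true));  // HOST
        System.out.println(classify("U.S.A.", true));           // ACRONYM
    }
}
```

In this sketch the buggy label is the default and the fix is opt-in, which 
mirrors the complaint in the description: callers that cannot pass the switch 
are stuck with the wrong label.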

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
