[ 
https://issues.apache.org/jira/browse/LUCENE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5517:
--------------------------------

    Attachment: LUCENE-5517.patch

> stricter parsing for hunspell parseFlag()
> -----------------------------------------
>
>                 Key: LUCENE-5517
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5517
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Robert Muir
>         Attachments: LUCENE-5517.patch
>
>
> I was trying to debug why a hunspell dictionary used so much RAM (an updated 
> version of the dictionary fixes the bug!). The reason is that the dictionary 
> was buggy and didn't have FLAG NUM, so each digit was treated as its own 
> flag, leading to chaos.
> In many situations in the hunspell file (e.g. an affix rule), the flag should 
> be a single one. But today we don't detect when more than one is present; we 
> just take the first one.
> We should throw an exception here: in most cases hunspell itself is doing 
> this for the impacted dictionaries. In these cases the dictionary is buggy, 
> and in some cases you do in fact get an error from the hunspell command 
> line. We should throw an exception instead of emitting chaos...
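
To illustrate the proposal above, here is a minimal sketch of a strict single-flag parser. It is not the code in the attached patch: the class `StrictFlagSketch`, its method signatures, and the `numeric` switch (standing in for a dictionary that declares numeric flags) are all hypothetical names chosen for this example. It only shows the idea of rejecting a field that parses to anything other than exactly one flag, rather than silently taking the first.

```java
public class StrictFlagSketch {

    /**
     * Splits a raw flag field into individual flags (hypothetical helper).
     * In numeric mode the field is a comma-separated list of decimal numbers;
     * otherwise every character is treated as its own flag.
     */
    public static char[] parseFlags(String raw, boolean numeric) {
        if (!numeric) {
            return raw.toCharArray();
        }
        String[] parts = raw.split(",");
        char[] flags = new char[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Integer.parseInt throws NumberFormatException on malformed input
            int value = Integer.parseInt(parts[i].trim());
            if (value < 0 || value > Character.MAX_VALUE) {
                throw new IllegalArgumentException("flag out of range: " + value);
            }
            flags[i] = (char) value;
        }
        return flags;
    }

    /**
     * Strict variant for places where exactly one flag is expected,
     * such as an affix rule header. Throws instead of taking the first flag.
     */
    public static char parseFlag(String raw, boolean numeric) {
        char[] flags = parseFlags(raw, numeric);
        if (flags.length != 1) {
            throw new IllegalArgumentException(
                "expected exactly one flag, got " + flags.length + ": " + raw);
        }
        return flags[0];
    }

    public static void main(String[] args) {
        // With numeric flags declared, "123" is a single flag.
        System.out.println((int) parseFlag("123", true));
        // Without the declaration, "123" is three flags and is rejected.
        try {
            parseFlag("123", false);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch, the buggy dictionary described above (numeric flags without the numeric declaration) would fail fast at load time on the first multi-digit flag in a single-flag position, instead of being loaded with each digit as a separate flag.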



--
This message was sent by Atlassian JIRA
(v6.2#6252)
