Robert Muir created LUCENE-5505:
-----------------------------------
Summary: hunspell SET/FLAG whitespace/BOM handling
Key: LUCENE-5505
URL: https://issues.apache.org/jira/browse/LUCENE-5505
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Several dictionaries cannot be loaded today (Armenian, Papiamento, Macedonian,
Russian, Urdu) because they contain lines such as SET<tab>UTF-8 or
FLAG<space><space>UTF-8, or have a BOM at the start of the first line (or a
combination of these).
Also, because SET need not be the first line in the file, we should ignore a
BOM on the first line in general (e.g. the first line might be something else,
such as FLAG).
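
A minimal sketch of the handling described above, not the actual Lucene
change. It assumes the line has already been decoded so that a BOM shows up
as the single character U+FEFF; the class and method names (AffixLineSketch,
stripBom, splitDirective) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene's hunspell code.
public class AffixLineSketch {

    // Drop a leading BOM (U+FEFF) if present, whatever directive follows it.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == '\uFEFF') {
            return line.substring(1);
        }
        return line;
    }

    // Split a directive line on any run of spaces or tabs,
    // so "SET<tab>UTF-8" and "FLAG<space><space>UTF-8" both yield two tokens.
    static String[] splitDirective(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // First line of a file: BOM, then SET separated by a tab.
        String first = "\uFEFFSET\tUTF-8";
        System.out.println(Arrays.toString(splitDirective(stripBom(first))));

        // FLAG separated by two spaces, no BOM.
        String flag = "FLAG  UTF-8";
        System.out.println(Arrays.toString(splitDirective(stripBom(flag))));
    }
}
```

Applying stripBom only to the first line read, before looking at which
directive it holds, covers the case where the file starts with FLAG rather
than SET.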