On Sunday, 10 June 2012, Dominique Pellé wrote:
fromy=1 fromx=5 toy=2 tox=10
The value tox=10 is wrong. It should be 2.
Of course we should fix this, but shouldn't we also just specify the
position as from/to positions, ignoring lines/columns? That's more robust.
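To illustrate the from/to idea, here is a minimal sketch of mapping a
line/column pair to a single character offset. The class and method names
are hypothetical and not taken from the LanguageTool code base; it assumes
0-based lines and columns and '\n' as the only line separator.

```java
final class PositionSketch {

    // Hypothetical helper: converts a 0-based (line, column) pair into a
    // 0-based character offset. Assumes lines are separated by a single '\n'.
    static int toOffset(String text, int line, int column) {
        int offset = 0;
        for (int i = 0; i < line; i++) {
            int newline = text.indexOf('\n', offset);
            if (newline < 0) {
                throw new IllegalArgumentException("text has no line " + line);
            }
            offset = newline + 1; // start of the next line
        }
        return offset + column;
    }

    public static void main(String[] args) {
        String text = "First line\nSecond line";
        int from = toOffset(text, 0, 6);
        int to = toOffset(text, 1, 6);
        // With plain offsets the matched span is a simple substring,
        // even when it crosses a line break.
        System.out.println(text.substring(from, to));
    }
}
```

With a single from/to pair there is only one coordinate system to keep
consistent, so a mismatch such as the wrong tox value above cannot arise
from mixing up lines and columns.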
Regards
Daniel
--
There is the possibility that some words that are included in the tagger
dictionary (or are tagged in the disambiguation file) are marked as errors
by Hunspell, because they are missing in the Hunspell dictionary. In order
to avoid it we could add a condition in the Hunspell Java rule: mark as an
Daniel Naber list2...@danielnaber.de wrote:
On Sunday, 10 June 2012, Dominique Pellé wrote:
fromy=1 fromx=5 toy=2 tox=10
The value tox=10 is wrong. It should be 2.
Of course we should fix this, but shouldn't we
also just specify the position as from/to positions,
ignoring
Thanks Marcin, that will be very useful for debugging
disambiguation rules.
There is something which I do not understand though.
Take this example with the French sentence Les avions
(= The planes). Both words have 2 POS tags in the
French dictionary:
$ egrep '^(les|avions)\s'
On Sunday, 10 June 2012, Dominique Pellé wrote:
I think that the following patch fixes it.
But the code is a bit hairy so I'm not 100% sure
it's OK. I have not checked it in:
I don't have time to look at it now, but if you can write a test case that
only works with your change (and if no
Marcin,
I talked to the developers of the Catalan Hunspell dictionary. The
WORDCHARS line will be added to the .aff file.
LibreOffice ignores this line. By default it treats the middle dot and
the apostrophe as word characters, but not the hyphen. So this is a
limitation of LibreOffice.
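
To make the effect concrete, here is a rough sketch of word splitting
under the default described above: letters, middle dot and apostrophe are
word characters, the hyphen is not. This is only an illustration with
hypothetical names; it is not LibreOffice's or Hunspell's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordCharsSketch {

    // Assumption for illustration: a word is a run of Unicode letters,
    // middle dots (U+00B7) and apostrophes. The hyphen is deliberately
    // left out, so it acts as a word separator.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\u00B7']+");

    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // "col·lega" and "l'home" stay whole; "vint-i-dos" is split at
        // each hyphen into three separate words.
        System.out.println(words("col\u00B7lega l'home vint-i-dos"));
    }
}
```

Under these assumptions a hyphenated word never reaches the spell checker
as a single unit, whatever the .aff file says.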