On 2014-09-16 at 11:21, R.J. Baars wrote:
> Marcin,
>
> We don't agree. There is a spellchecker, but also a single word ignore
> list for it.

Yes, but for multi-word entries we'd have to use the disambiguator code
internally anyway. You are asking for yet another notation for the same
thing.

Notice also that no spell checker will propose "Tel Aviv" for "Aviv".
You need an XML rule for that. A simple one, to be sure, but still an
XML rule. I think it's pretty trivial to go through a list of such
words and create two parallel lists: ignore-spelling rules for the
disambiguator, and grammar rules that flag the missing part.
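
To illustrate how little work the parallel lists would be, here is a
minimal Python sketch that turns a list of two-word names into both
kinds of rules. The ignore-spelling rule copies the TEL_AVIV rule
quoted further down in this thread. The grammar rule (a negated first
token, a marker around the second, a message with a suggestion) is my
assumption about a reasonable layout and should be checked against the
rules schema. The function names and the _MISSING_PART id suffix are
made up for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def ignore_spelling_rule(phrase):
    """Disambiguator rule in the format quoted in this thread."""
    words = phrase.split()
    rule_id = "_".join(w.upper() for w in words)
    tokens = "\n".join(f"    <token>{escape(w)}</token>" for w in words)
    return (
        f"<rule id={quoteattr(rule_id)} name={quoteattr(phrase)}>\n"
        "  <pattern>\n"
        f"{tokens}\n"
        "  </pattern>\n"
        '  <disambig action="ignore_spelling"/>\n'
        "</rule>"
    )


def missing_part_rule(phrase):
    """Grammar rule flagging the second word when the first is absent.

    Handles two-word phrases only. The element layout is an assumption,
    not a verified copy of an existing rule.
    """
    first, second = phrase.split()
    rule_id = f"{first}_{second}_MISSING_PART".upper()
    name = f"{second} without {first}"
    return (
        f"<rule id={quoteattr(rule_id)} name={quoteattr(name)}>\n"
        "  <pattern>\n"
        f'    <token negate="yes">{escape(first)}</token>\n'
        "    <marker>\n"
        f"      <token>{escape(second)}</token>\n"
        "    </marker>\n"
        "  </pattern>\n"
        "  <message>Did you mean "
        f"<suggestion>{escape(phrase)}</suggestion>?</message>\n"
        "</rule>"
    )


if __name__ == "__main__":
    # Hypothetical input list; in practice it would be read from a file.
    for phrase in ["Tel Aviv"]:
        print(ignore_spelling_rule(phrase))
        print(missing_part_rule(phrase))
```

The first output would go into the disambiguation file and the second
into the grammar file, so both stay in sync with one word list.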

Regards,
Marcin

> There are XML rules, but also a Simplereplace rule, a compounding rule.
>
> So apart from the hammer and the screwdriver, there are more tools.
>
> But anyway, adding the most frequent ones to the disambiguator works.
>
> Getting rid of wrong postags and 10% reported possible spelling errors on
> the entire corpus is a higher priority.
> And fixing false positives. Having almost doubled the number of rules is
> enough for this month.
>
> Ruud
>
>
>
>> On 2014-09-16 at 09:03, R.J. Baars wrote:
>>> A word like 'Aviv' is not correct unless 'Tel' is before it.
>>> So it is best to leave Tel and Aviv out of the spell checker.
>>> That results in spell checking reporting errors for Aviv.
>>>
>>> In the disambiguator, there is the option to block that, by making an
>>> immunizing rule:
>>>
>>>     <!-- Tel Aviv-->
>>>     <rule id="TEL_AVIV" name="Tel Aviv">
>>>       <pattern>
>>>         <token>Tel</token>
>>>         <token>Aviv</token>
>>>       </pattern>
>>>       <disambig action="ignore_spelling"/>
>>>     </rule>
>>>
>>> That works perfectly. But then, there are a lot of these word
>>> combinations. Wouldn't it be better to have a multi-word ignore list for
>>> the spell checker?
>>>
>>> (Or even a multi-word spell checker, not just knowing 'correct' and 'not
>>> in list', but 'correct', 'incorrect' and 'not in list')
>>
>> It would not be an enhancement, as this would not give new functionality
>> but cripple the existing one. Also, the ability to use all XML syntax is
>> extremely important to me (I use POS tags and regular expressions), so I
>> wouldn't make use of the multi-word spell checker anyway. So we'd have
>> to introduce a crippled syntax that would look a little bit different
>> for a human being but with no meaningful functional change. I don't
>> think it's worth our time.
>>
>> The spell checker is best for checking individual words. Just like a
>> hammer, it's good for nails, and not for screws. For screws, we have a
>> screwdriver. For multi-word entities, we have more refined tools, like
>> tagging and disambiguation and special attributes.
>>
>> Best,
>> Marcin
>>
>> ------------------------------------------------------------------------------
>> Want excitement?
>> Manually upgrade your production database.
>> When you want reliability, choose Perforce.
>> Perforce version control. Predictably reliable.
>> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Languagetool-devel mailing list
>> Languagetool-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>>

