Hi,
One significant source of false alarms in English is that LT doesn't
handle noun phrases properly. For example:
"There are several cargo and passenger ferries."
triggers a false alarm because only "several cargo" is considered, and LT
requires "several" to be followed by a plural noun. Instead, "cargo and
passenger ferries" should be treated as one plural noun phrase.
Does anybody have an idea how practical it would be to find these phrases
with disambiguation rules? One could do this (just an example, it doesn't
fully cover the sentence above):
<rule id="NNPS_PHRASE1" name="plural noun phrase">
  <pattern>
    <marker>
      <token postag="NN"></token>
    </marker>
    <token postag="NNS"></token>
  </pattern>
  <disambig action="add"><wd pos="NNPS_PHRASE_START"/></disambig>
</rule>
Then the rules that now look for plural nouns would have to be changed to
look for NNPS_PHRASE_START.
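To cover the coordinated case in "cargo and passenger ferries", a second
rule along the following lines might work. This is an untested sketch: the
id NNPS_PHRASE2 is made up for illustration, and it assumes that "and" is
tagged CC and that both "cargo" and "passenger" carry an NN reading.

```xml
<!-- Hypothetical rule: singular noun + conjunction + singular noun + plural noun -->
<rule id="NNPS_PHRASE2" name="coordinated plural noun phrase">
  <pattern>
    <!-- Only the first noun is marked, so only it receives the new tag -->
    <marker>
      <token postag="NN"></token>
    </marker>
    <token postag="CC"></token>
    <token postag="NN"></token>
    <token postag="NNS"></token>
  </pattern>
  <disambig action="add"><wd pos="NNPS_PHRASE_START"/></disambig>
</rule>
```

Every additional phrase shape would need its own rule like this, and the
shorter NN + NNS pattern still matches "passenger ferries" inside the
longer phrase.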
Is there a way to get "longest match" with disambiguation rules? It seems
to me it's at least difficult to remove shorter phrases inside longer
phrases.
Any ideas or actual rules for this are very welcome. I think this is one of
the remaining major problems for English (and actually not only English).
Regards
Daniel
--
http://www.danielnaber.de