We are facing the same issue in Italian: without understanding the context, it is hard to disambiguate by means of general rules. You need to get down to the level of specific words.
I came to the conclusion that this problem should be addressed at the tagger level by providing context-based tagging (at least in the first instance). The tagger should use a large corpus of correct sentences and their corresponding tags to build up a knowledge base. Moreover, the tool itself should be able to feed additional correct sentences into the corpus and learn from them when needed. I understand that the current design is built on a tagger that uses simple word lookup, but I don't think a context-based implementation would be incompatible with it.

Ciao.
Paolo

On 01 Mar 2013, at 14:44, "Mike Unwalla" <[email protected]> wrote:

> Daniel wrote: Has anybody an idea how practical it would be to find these
> [noun] phrases with disambiguation rules?
>
> Probably, you can do it, but a simple rule is unlikely to be sufficient. I
> had a related problem when I wanted to disambiguate nouns and verbs.
>
> The groups of examples that follow show some problems that I had with the
> identification of noun phrases. The target noun phrases are in CAPITAL
> LETTERS:
>
> SOME THIN OIL FILTERS are not satisfactory.
> SOME THIN OIL filters through the sand.
>
> THE TEMPERATURE INCREASES and decreases are small.
> THE TEMPERATURE increases and the gas expands.
>
> USED PLASTIC COVERS are not satisfactory.
> The technician used PLASTIC COVERS, not metal covers.
>
> The next 3 examples show a semantic problem. Without giving LT information
> about real-world meaning, LT cannot correctly disambiguate the text.
>
> The technician made THE OIL FILTER from a piece of old rag.
> The technician made THE OIL filter into a clean container.
> The technician made THE OIL FILTER into a toy rocket for his 7-year-old son.
>
> To see my rules, look at the rulegroup id="POS_DISAMBIGUATION_IDENTIFY_NOUN"
> in
> www.simplified-english.co.uk/disambiguation-en-asdste100-issue3-2013-02-01.zip.
> (The rules use new POS, not the default POS in LT.)
>
> Regards,
>
> Mike Unwalla
> Contact: www.techscribe.co.uk/techw/contact.htm
>
>
> -----Original Message-----
> From: Daniel Naber [mailto:[email protected]]
> Sent: 01 March 2013 10:28
> To: development discussion for LanguageTool
> Subject: finding English phrases
>
> Hi,
>
> one of the significant sources of false alarms in English is the fact that
> LT doesn't properly handle phrases. For example:
>
> "There are several cargo and passenger ferries."
>
> leads to an error because only "several cargo" is considered and LT
> requires "several" to be followed by a plural noun. Instead, "cargo and
> passenger ferries" should be considered one plural noun phrase.
>
> Has anybody an idea how practical it would be to find these phrases with
> disambiguation rules? One could do this (just an example, it doesn't fully
> cover the example above):
>
> <rule id="NNPS_PHRASE1" name="plural noun phrase">
>   <pattern>
>     <marker>
>       <token postag="NN"></token>
>     </marker>
>     <token postag="NNS"></token>
>   </pattern>
>   <disambig action="add"><wd pos="NNPS_PHRASE_START"/></disambig>
> </rule>
>
> Then the rules that now look for plural nouns would have to be changed to
> look for NNPS_PHRASE_START.
>
> Is there a way to get "longest match" with disambiguation rules? It seems
> to me it's at least difficult to remove shorter phrases inside longer
> phrases.
>
> Any ideas or actual rules for this are very welcome. I think this is one of
> the remaining major problems for English (and actually not only English).
>
> Regards
>  Daniel
>
>
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_feb
> _______________________________________________
> Languagetool-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
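
On Daniel's "longest match" question, here is a rough sketch of the idea outside of LT. It is plain Python, not LanguageTool code or its rule syntax, and the function name and tag sets are my own assumptions for illustration. It also assumes the tokens already carry one correct POS tag each, which is exactly the part Mike's examples show to be hard. The scan extends a noun run as far as it can over nouns and conjunctions, cuts it back to the last plural noun, and then jumps past the span, so shorter phrases inside a longer one are never reported.

```python
# Illustrative only: tag names follow the Penn-style tags used in the thread.
NOUN_TAGS = {"NN", "NNS"}
LINK_TAGS = {"CC"}  # conjunctions such as "and"


def longest_plural_phrases(tagged):
    """Return (start, end) index pairs, end exclusive, of maximal noun runs
    that end in a plural noun. `tagged` is a list of (word, tag) pairs."""
    spans = []
    i = 0
    n = len(tagged)
    while i < n:
        if tagged[i][1] not in NOUN_TAGS:
            i += 1
            continue
        # Extend greedily over nouns and conjunctions, remembering where
        # the last plural noun ended.
        j = i
        last_plural_end = None
        while j < n and tagged[j][1] in NOUN_TAGS | LINK_TAGS:
            if tagged[j][1] == "NNS":
                last_plural_end = j + 1
            j += 1
        if last_plural_end is not None:
            spans.append((i, last_plural_end))
            i = last_plural_end  # skip the span: no nested shorter matches
        else:
            i = j  # run contained no plural noun, so it is not reported
    return spans


sentence = [
    ("There", "EX"), ("are", "VBP"), ("several", "JJ"),
    ("cargo", "NN"), ("and", "CC"), ("passenger", "NN"),
    ("ferries", "NNS"), (".", "."),
]
for start, end in longest_plural_phrases(sentence):
    print(" ".join(word for word, _ in sentence[start:end]))
# prints "cargo and passenger ferries"
```

Note that a single plural noun on its own is also returned as a one-word span. Whether something like this can be expressed with disambiguation rules is a separate question that the sketch does not answer.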

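
To make the context-based tagging proposal at the top of this message a bit more concrete, here is a minimal sketch, assuming a corpus of correctly tagged sentences is available. It is not how the LT tagger works, and the class name, the method names and the tags in the toy corpus are hypothetical. The only context it uses is the tag of the previous word, which is far too weak for Mike's "temperature increases" examples, but it shows how new correct sentences could be fed in and learned from, with plain word lookup as the fallback.

```python
from collections import Counter, defaultdict


class ContextTagger:
    """Toy tagger: picks the tag seen most often for (previous tag, word)."""

    def __init__(self):
        # context_counts[(previous_tag, word)][tag] -> frequency in the corpus
        self.context_counts = defaultdict(Counter)
        # word_counts[word][tag] -> frequency regardless of context (fallback)
        self.word_counts = defaultdict(Counter)

    def learn(self, tagged_sentence):
        """Add one correct sentence, given as a list of (word, tag) pairs."""
        prev = "<s>"  # marker for the start of a sentence
        for word, tag in tagged_sentence:
            key = word.lower()
            self.context_counts[(prev, key)][tag] += 1
            self.word_counts[key][tag] += 1
            prev = tag

    def tag(self, words):
        """Tag a list of words, preferring context over plain word lookup."""
        prev = "<s>"
        result = []
        for word in words:
            key = word.lower()
            counts = self.context_counts.get((prev, key)) or self.word_counts.get(key)
            tag = counts.most_common(1)[0][0] if counts else "UNKNOWN"
            result.append((word, tag))
            prev = tag
        return result


tagger = ContextTagger()
tagger.learn([("Used", "JJ"), ("plastic", "NN"), ("covers", "NNS"),
              ("are", "VBP"), ("not", "RB"), ("satisfactory", "JJ")])
tagger.learn([("The", "DT"), ("technician", "NN"), ("used", "VBD"),
              ("plastic", "NN"), ("covers", "NNS")])

print(tagger.tag(["The", "technician", "used", "metal", "covers"]))
print(tagger.tag(["Used", "covers"]))
```

With this toy corpus, "used" after a noun comes out as VBD and sentence-initial "Used" as JJ, while the unseen word "metal" is left as UNKNOWN. A real implementation would need a much larger corpus and a richer notion of context.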