Daniel wrote: Has anybody an idea how practical it would be to find these
[noun] phrases with disambiguation rules?

Probably, you can do it, but a simple rule is unlikely to be sufficient. I
had a related problem when I wanted to disambiguate nouns and verbs.

The groups of examples that follow show some problems that I had with the
identification of noun phrases. The target noun phrases are in CAPITAL
LETTERS:

SOME THIN OIL FILTERS are not satisfactory.
SOME THIN OIL filters through the sand.

THE TEMPERATURE INCREASES and decreases are small.
THE TEMPERATURE increases and the gas expands.

USED PLASTIC COVERS are not satisfactory.
The technician used PLASTIC COVERS, not metal covers.
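
The ambiguity in the first group can be made concrete with a small sketch. It is only an illustration, not how LT works internally: the mini-lexicon, the Penn-style tags, and the helper name leading_np_lengths are all assumptions made for this example. The function tries every tag reading of the words and records how long the sentence-initial noun phrase would be under each reading.

```python
from itertools import product
from typing import Dict, List, Set

# Hypothetical mini-lexicon (an assumption for this sketch): each word maps
# to every POS tag it may carry. "filters" is ambiguous between a plural
# noun (NNS) and a third-person verb (VBZ).
LEXICON: Dict[str, Set[str]] = {
    "some": {"DT"},
    "thin": {"JJ"},
    "oil": {"NN"},
    "filters": {"NNS", "VBZ"},
}

# Tags that are allowed to continue a noun phrase in this simplified model.
NP_TAGS = {"DT", "JJ", "NN", "NNS"}


def leading_np_lengths(words: List[str]) -> Set[int]:
    """Return the set of possible lengths of the sentence-initial noun
    phrase, one length per tag reading of the input words."""
    options = [sorted(LEXICON[w.lower()]) for w in words]
    lengths = set()
    for reading in product(*options):
        n = 0
        # Count tokens from the start while they can belong to a noun phrase.
        while n < len(reading) and reading[n] in NP_TAGS:
            n += 1
        lengths.add(n)
    return lengths


print(leading_np_lengths(["Some", "thin", "oil", "filters"]))
```

The call returns two different lengths (3 and 4) for the same four words, which is why the rule must look at the words that follow ("are" versus "through") before it can decide where the noun phrase ends.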

The next 3 examples show a semantic problem. Without giving LT information
about real-world meaning, LT cannot correctly disambiguate the text.

The technician made THE OIL FILTER from a piece of old rag.
The technician made THE OIL filter into a clean container.
The technician made THE OIL FILTER into a toy rocket for his 7-year-old son.

To see my rules, look at the rulegroup id="POS_DISAMBIGUATION_IDENTIFY_NOUN"
in
www.simplified-english.co.uk/disambiguation-en-asdste100-issue3-2013-02-01.zip.
(The rules use new POS, not the default POS in LT.)

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 


-----Original Message-----
From: Daniel Naber [mailto:[email protected]] 
Sent: 01 March 2013 10:28
To: development discussion for LanguageTool
Subject: finding English phrases

Hi,

one of the significant sources of false alarms in English is the fact that 
LT doesn't properly handle phrases. For example:

"There are several cargo and passenger ferries."

leads to an error because only "several cargo" is considered and LT 
requires "several" to be followed by a plural noun. Instead, "cargo and 
passenger ferries" should be considered one plural noun phrase.
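
As a rough illustration of the check described above, here is a minimal Python sketch. It is not LT code: the hand-assigned Penn-style tags and the function name plural_head_follows are assumptions for this example. The idea is to skip over the nouns and conjunctions after "several" and to test the last noun (the head of the phrase) for plural, instead of testing only the next token.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag), tags assigned by hand here


def plural_head_follows(tokens: List[Token], i: int) -> bool:
    """Return True if the run of nouns and conjunctions that follows
    position i ends in a plural noun (NNS)."""
    head_tag = None
    j = i + 1
    while j < len(tokens) and tokens[j][1] in ("NN", "NNS", "CC"):
        if tokens[j][1] != "CC":
            head_tag = tokens[j][1]  # remember the most recent noun
        j += 1
    return head_tag == "NNS"


sentence = [
    ("There", "EX"), ("are", "VBP"), ("several", "JJ"),
    ("cargo", "NN"), ("and", "CC"), ("passenger", "NN"),
    ("ferries", "NNS"), (".", "."),
]
print(plural_head_follows(sentence, 2))
```

With this check, "several cargo and passenger ferries" is accepted because the head "ferries" is plural, while a bare "several cargo" would still be reported.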

Has anybody an idea how practical it would be to find these phrases with 
disambiguation rules? One could do this (just an example, it doesn't fully 
cover the example above):

    <rule id="NNPS_PHRASE1" name="plural noun phrase">
        <pattern>
            <marker>
                <token postag="NN"></token>
            </marker>
            <token postag="NNS"></token>
        </pattern>
        <disambig action="add"><wd pos="NNPS_PHRASE_START"/></disambig>
    </rule>

Then the rules that now look for plural nouns would have to be changed to 
look for NNPS_PHRASE_START.

Is there a way to get "longest match" with disambiguation rules? It seems 
to me it's at least difficult to remove shorter phrases inside longer 
phrases.
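
To make the "longest match" idea concrete, here is a minimal sketch in Python, outside of LT's rule format. It does not show what disambiguation rules can or cannot do; the function name longest_plural_noun_phrases and the hand-assigned tags are assumptions for this example. It scans each maximal run of noun tokens once, so a shorter phrase inside a longer phrase is never reported separately.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag), tags assigned by hand here


def longest_plural_noun_phrases(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of the longest noun runs
    that contain at least two tokens and end in a plural noun (NNS)."""
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i][1] not in ("NN", "NNS"):
            i += 1
            continue
        # Extend the run as far as the noun tags continue.
        j = i
        while j < len(tokens) and tokens[j][1] in ("NN", "NNS"):
            j += 1
        # Trim trailing singular nouns so that the span ends on an NNS.
        end = j
        while end > i and tokens[end - 1][1] != "NNS":
            end -= 1
        if end - i >= 2:
            spans.append((i, end))
        i = j  # continue after the run, so nested matches are skipped
    return spans


tagged = [("the", "DT"), ("oil", "NN"), ("filter", "NN"),
          ("covers", "NNS"), ("are", "VBP"), ("new", "JJ")]
print(longest_plural_noun_phrases(tagged))
```

For "oil filter covers" the sketch reports the single span (1, 4) and not the shorter "filter covers" inside it. It assumes that each token already has exactly one tag, which is the part that the disambiguation rules would have to supply.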

Any ideas or actual rules for this are very welcome. I think this is one of 
the remaining major problems for English (and actually not only English).

Regards
 Daniel


