On 2014-04-08 08:43, Marcin Miłkowski wrote:

>> Internally, we now have information like this: postag=VBD, pos=verb,
>> tense=past (etc.). But the disambiguation only works on the old tag? I
>> guess I will need to resolve VBD here so the action works on both the
>> old and the new representation?
> 
> I think the only thing needed is to parse the tags again, if they are
> different.

Although I'm not sure if I understood what you meant, I have now added a 
branch ("readable-pos-tags") for this, simply because the changes are 
getting so complex. It's still incomplete and buggy.

Here's the basic idea of my changes in that branch: class TokenPoS is 
the new structured representation of POS tags. EnglishTagger returns one 
or more TokenPoS for a given traditional POS tag (like NNS). More than 
one will be returned in cases that are ambiguous in the new 
representation, e.g. "walk/VBP" can be person=1|2 number=singular or 
person=1|2|3 number=plural. Each AnalyzedToken has one TokenPoS.
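
To make the mapping concrete, here is a minimal sketch of the idea. It is not the code in the branch: the class name TokenPoSSketch, the helpers features() and expand(), and the use of a plain Map in place of TokenPoS are all made up for illustration, and only three tags are covered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, not the API of the branch.
public class TokenPoSSketch {

    // Builds one structured reading from alternating key/value arguments.
    static Map<String, String> features(String... keyValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // Expands a traditional POS tag into one or more structured readings.
    // A tag that is ambiguous in the new representation yields several.
    static List<Map<String, String>> expand(String posTag) {
        List<Map<String, String>> readings = new ArrayList<>();
        switch (posTag) {
            case "VBP":
                // Non-3rd-person-singular present: two readings are needed.
                readings.add(features("postag", "VBP", "pos", "verb",
                        "tense", "present", "person", "1|2", "number", "singular"));
                readings.add(features("postag", "VBP", "pos", "verb",
                        "tense", "present", "person", "1|2|3", "number", "plural"));
                break;
            case "VBD":
                readings.add(features("postag", "VBD", "pos", "verb", "tense", "past"));
                break;
            case "NNS":
                readings.add(features("postag", "NNS", "pos", "noun", "number", "plural"));
                break;
            default:
                // Unknown tags keep only the old tag.
                readings.add(features("postag", posTag));
        }
        return readings;
    }

    public static void main(String[] args) {
        for (Map<String, String> reading : expand("VBP")) {
            System.out.println(reading);
        }
    }
}
```

Note that one old tag can become two readings here, so the number of readings per token no longer has to match what existing rules were written against.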

Currently the problem is this (when running the tests):
Caused by: org.xml.sax.SAXException: English rule error. The number of 
interpretations specified with wd: 5 must be equal to the number of 
matched tokens (1)
  Line: 1525, column: 12.

I roughly understand what the problem is but not yet the solution... any 
help is welcome, also any hints that what I'm doing in that branch might 
be wrong.

Regards
  Daniel


_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
