On 2014-04-08 08:43, Marcin Miłkowski wrote:

>> Internally, we now have information like this: postag=VBD, pos=verb,
>> tense=past (etc.). But the disambiguation only works on the old tag? I
>> guess I will need to resolve VBD here so the action works on both the
>> old and the new representation?
>
> I think the only thing needed is to parse the tags again, if they are
> different.

Although I'm not sure I understood what you meant, I have now added a branch ("readable-pos-tags") for this, simply because the changes are getting so complex. It's still incomplete and buggy.

Here's the basic idea of my changes in that branch:

- class TokenPoS is the new structured representation of POS tags.
- EnglishTagger returns one or more TokenPoS for a given traditional POS tag (like NNS). More than one will be returned in cases that are ambiguous in the new representation, e.g. "walk/VBP" can be person=1|2 number=singular and person=1|2|3 number=plural.
- Each AnalyzedToken has one TokenPoS.

Currently the problem is this (when running the tests):

Caused by: org.xml.sax.SAXException: English rule error. The number of interpretations specified with wd: 5 must be equal to the number of matched tokens (1)
 Line: 1525, column: 12.

I roughly understand what the problem is but not yet the solution... Any help is welcome, as are any hints that what I'm doing in that branch might be wrong.

Regards
 Daniel

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
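
To make the tag expansion described above more concrete, here is a minimal sketch of how one traditional tag might map to one or more structured readings. This is not the code in the readable-pos-tags branch: the class TokenPosSketch and the methods reading() and expand() are hypothetical names, and each reading is modelled as a plain map of features instead of a real TokenPoS object. Only the VBD and VBP values mentioned in this thread are filled in.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not LanguageTool's TokenPoS or EnglishTagger.
final class TokenPosSketch {

  // Builds one reading from alternating key/value strings, keeping insertion order.
  static Map<String, String> reading(String... keyValues) {
    Map<String, String> features = new LinkedHashMap<>();
    for (int i = 0; i + 1 < keyValues.length; i += 2) {
      features.put(keyValues[i], keyValues[i + 1]);
    }
    return features;
  }

  // Expands a traditional POS tag into one or more structured readings.
  static List<Map<String, String>> expand(String posTag) {
    List<Map<String, String>> readings = new ArrayList<>();
    switch (posTag) {
      case "VBD":
        // Unambiguous in the new representation: a single reading.
        readings.add(reading("pos", "verb", "tense", "past"));
        break;
      case "VBP":
        // Ambiguous in the new representation: two readings for one old tag.
        readings.add(reading("pos", "verb", "person", "1|2", "number", "singular"));
        readings.add(reading("pos", "verb", "person", "1|2|3", "number", "plural"));
        break;
      default:
        // Tags not covered by this sketch keep only the original tag.
        readings.add(reading("postag", posTag));
    }
    return readings;
  }

  public static void main(String[] args) {
    for (Map<String, String> r : expand("VBP")) {
      System.out.println("walk/VBP -> " + r);
    }
  }
}
```

The point of the sketch is only that a single traditional tag can yield more than one reading, so any code that assumes exactly one reading per tag would have to be revisited.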