I do not store the sequences of possible sets of outcomes. I generate them at runtime from a dictionary and disambiguation rules (applied to the given input sentence). So for my case, I needed a method like tagger.tag(String[] sentence, String[][] possible_outcomes_for_each_word), where possible_outcomes_for_each_word is generated by code unrelated to OpenNLP and sentence.length == possible_outcomes_for_each_word.length.
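A minimal sketch of the per-word constraint described above, in plain Java with no OpenNLP dependency. The class name PerTokenOutcomeFilter and its isValid method are hypothetical and not part of the OpenNLP API; treating an empty outcome set as "no restriction" is also an assumption. A tagger's search could consult such a check before accepting a candidate tag at a given position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not an OpenNLP class): holds, for each word of one
// sentence, the set of tags that the dictionary and rules allow for it.
final class PerTokenOutcomeFilter {

    private final List<Set<String>> allowed = new ArrayList<>();

    PerTokenOutcomeFilter(String[] sentence, String[][] possibleOutcomesForEachWord) {
        // One outcome set per word, as required in the message above.
        if (sentence.length != possibleOutcomesForEachWord.length) {
            throw new IllegalArgumentException(
                "expected one outcome set per word: " + sentence.length
                    + " words, " + possibleOutcomesForEachWord.length + " sets");
        }
        for (String[] outcomes : possibleOutcomesForEachWord) {
            allowed.add(new HashSet<>(Arrays.asList(outcomes)));
        }
    }

    // True if the candidate tag is permitted for the word at this index.
    // Assumption: an empty set means the word is unconstrained.
    boolean isValid(int index, String outcome) {
        Set<String> outcomes = allowed.get(index);
        return outcomes.isEmpty() || outcomes.contains(outcome);
    }

    public static void main(String[] args) {
        String[] sentence = {"the", "can", "rusts"};
        // Generated at runtime by external code in the real use case.
        String[][] possible = {{"DT"}, {"NN", "MD"}, {}};

        PerTokenOutcomeFilter filter = new PerTokenOutcomeFilter(sentence, possible);
        for (String tag : new String[] {"DT", "NN", "MD", "VBZ"}) {
            System.out.println("can/" + tag + " allowed: " + filter.isValid(1, tag));
        }
    }
}
```

The filter is built per sentence, so nothing has to be stored ahead of time; the outcome sets can come from any code that runs before tagging.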
Radu

________________________________
From: Jörn Kottmann <[email protected]>
To: [email protected]
Sent: Tue, March 15, 2011 4:09:39 PM
Subject: Re: Hybrid POS tagger

On 3/15/11 2:46 PM, Radu Simionescu wrote:
> I don't know how to do that. Anyway, I think it is best to just create
> another tag method for the POSTaggerME class which would take as a
> parameter a sequence of sets of possible outcomes for the input sentence,
> and to implement such a sequence validator in the API. This might be
> better because it exposes less functionality to the end user, keeping
> things simpler.
>
> If you could do this jira issue it would be great. Otherwise, please
> instruct me.

Here is the link to the issue:
https://issues.apache.org/jira/browse/OPENNLP-144

We can either extend our current dictionary to look up possible tag
sequences for n-grams or make something new like you did.

Which format do you use to store the tag sequences for n-grams?

Jörn
