Sounds great! Let me know if you need any testing once you commit the fix; I'll be happy to help.
Thanks, James!

On 04/01/2012 06:32, "James Kosin" <james.ko...@gmail.com> wrote:
> I've narrowed the problem down to the output method that generates the
> output text from the tokenNameFinder....
> At least I think that is where the problem lies.
>
> James
>
> On 1/2/2012 1:49 PM, Angel Luis Jimenez Martinez wrote:
> > Hi Olivier,
> >
> > Right now it is a small training set, but the curious thing is that with
> > a small corpus (4 lines) it detects phrases like "call to ann" but not
> > "call ann". So I suspect there is something wrong when training with a
> > phrase that has two consecutive markers.
> >
> > I have tried with a bigger corpus like:
> >
> > <START:action> call <END> <START:person> mary <END> a tope
> > <START:action> call <END> <START:person> james <END> a tope
> > <START:action> call <END> <START:person> mary <END> a tope
> > <START:action> call <END> <START:person> joe smith <END> a tope
> > ...
> > ...
> >
> > With about 20 lines but no luck.
> >
> > As for the regex, it was my first option for this problem, and I even
> > have a working solution... but I quickly found that I wanted something
> > less rigid that I could train with several different phrases, hence I'm
> > playing with OpenNLP.
> >
> > I'm looking for something that allows me to process phrases like:
> >
> > weather in london
> > how is the weather in london
> > in london how is the weather right now
> > today how is the weather near london
> >
> > As you can guess, using regexes to implement this was not very fun ;-)
> >
> > As for the capitalization, right now the input comes all in lowercase
> > (it comes from a speech recognizer like that).
> >
> > On Mon, Jan 2, 2012 at 7:31 PM, Olivier Grisel <olivier.gri...@ensta.org> wrote:
> >
> >> How big is your training set? You don't have any uppercase letters in
> >> your phrases?
> >>
> >> You might need a larger and more diverse set of examples (including
> >> negative examples without any kind of annotations).
> >>
> >> Do your sentences always follow such simple patterns? If so, you should
> >> probably use a simple regular expression with a fixed / controlled
> >> list of action names.
> >>
> >> --
> >> Olivier
> >> http://twitter.com/ogrisel - http://github.com/ogrisel