To be honest, I couldn't remember whether I had opened a ticket for it, so I had a quick look through JIRA but couldn't find a related one. I will open the ticket this afternoon and provide a patch as well.

Jim


On 20/02/13 13:57, Jörn Kottmann wrote:
Did you open a JIRA issue for it, as suggested by Lance? Do you recall the issue number?
We should get the fix for it into the 1.5.3 release.

Jörn

On 02/20/2013 02:14 PM, Jim foo.bar wrote:
A bit of googling and I managed to locate the thread from November! Here it is:

http://mail-archives.apache.org/mod_mbox/opennlp-users/201211.mbox/%3c509bec16.7050...@gmail.com%3E

I reported it and fixed it back then, but I can't remember whether I shared my fix with you guys. I'll go through my private fork, try to spot the differences, and let you know what I find. I think it was a minor bug; there was a '-1' somewhere, if I'm not mistaken.

Jim


On 20/02/13 13:00, Jim foo.bar wrote:
OK, sorry, I rushed earlier. Now I remember what happened 8-9 months ago. It's not Span.spansToStrings() that has the problem but the RegexNameFinder! Calling the .find method of the RegexNameFinder returns spans of the form I mentioned earlier (#<Span [3..3)>). I do remember fixing this, but I'm not sure I submitted a patch. Can anyone shed some light, or should I go back and diff my sources?

Jim


On 20/02/13 12:16, Jim foo.bar wrote:
I forgot to mention that I'm referring to the 1.5.2-incubating version available on Maven. Presumably this has been fixed in trunk?

Jim

On 20/02/13 11:53, Jim foo.bar wrote:
Hi everyone,

I'm pretty sure we had this discussion last year and that it was fixed! Basically, whenever any NameFinder recognises a single-word token, the resulting span is something like this:
 (#<Span [3..3)> #<Span [6..6)>)

while I think it should have been (#<Span [3..4)> #<Span [6..7)>), since the end index of a span is exclusive.
As a result, the following exception is thrown: StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.AbstractStringBuilder.substring (AbstractStringBuilder.java:872)
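
To illustrate why an empty span could end in that exception, here is a minimal sketch. It assumes the conversion joins the tokens in [start, end) with trailing spaces and then trims the last one. The class and the join method are hypothetical stand-ins, not the actual OpenNLP source.

```java
public class EmptySpanDemo {
    // Hypothetical stand-in for the span-to-string conversion: joins the
    // tokens in [start, end) with spaces, then trims the trailing space.
    static String join(String[] tokens, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < end; i++) {
            sb.append(tokens[i]).append(' ');
        }
        // For an empty span nothing was appended, so sb.length() is 0
        // and this call becomes substring(0, -1).
        return sb.substring(0, sb.length() - 1);
    }

    public static void main(String[] args) {
        String[] tokens = {"azestapine", "treatment", "is", "10",
                           "times", "more", "effective", "."};
        // Span [3..4) covers exactly one token.
        System.out.println(join(tokens, 3, 4));
        try {
            // Span [3..3) covers no tokens at all.
            join(tokens, 3, 3);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("empty span failed: " + e.getMessage());
        }
    }
}
```

With a half-open span, [3..4) selects the single token "10", whereas [3..3) selects nothing and leaves the builder empty.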


I am 99% positive that we've fixed this in the past; at least my private OpenNLP build behaves as expected. Just in case I'm doing something wrong, here are my steps:

- create a RegexNameFinder, passing the following regexes in an array: "\d+", "\w+ive?"
- call find on it, passing the following text in an array: ["azestapine" "treatment" "is" "10" "times" "more" "effective" "."]
- I get back the aforementioned spans (#<Span [3..3)> #<Span [6..6)>)
- trying to convert them to a string array (via Span/spansToStrings) doesn't work!
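
For comparison, the spans I would expect from the steps above can be sketched without OpenNLP at all. This is only an illustration under my own assumptions: TokenRegexFinder and its find method are hypothetical, they match each token on its own, and they are not how RegexNameFinder is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class TokenRegexFinder {
    // Hypothetical helper, not OpenNLP's RegexNameFinder: returns
    // {start, end} token-index pairs, end exclusive, for every token
    // that is fully matched by at least one of the patterns.
    static List<int[]> find(Pattern[] patterns, String[] tokens) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            for (Pattern p : patterns) {
                if (p.matcher(tokens[i]).matches()) {
                    // A single-token match is the half-open span [i..i+1).
                    spans.add(new int[] {i, i + 1});
                    break; // one span per token is enough
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        Pattern[] patterns = {Pattern.compile("\\d+"), Pattern.compile("\\w+ive?")};
        String[] tokens = {"azestapine", "treatment", "is", "10",
                           "times", "more", "effective", "."};
        for (int[] s : find(patterns, tokens)) {
            System.out.println("[" + s[0] + ".." + s[1] + ") " + tokens[s[0]]);
        }
    }
}
```

On the example sentence this yields [3..4) for "10" and [6..7) for "effective", which is the shape I would expect back from the name finder.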


Any ideas? This is quite important, isn't it?

Jim





