On 9/26/11 6:30 AM, mcsek...@gmail.com wrote:
But it doesn't give the correct output mentioned at that same link; instead
it gives this output:
Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
Mr. Vinken is chairman of Elsevier N.V.,
the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC,
was named a director of this British industrial conglomerate.
It created 5 sentences instead of 3.
I tried using the Java API of SentenceDetector, and that gives incorrect
output too.
A friend of mine ran the command line tool and used the Java API in Windows,
and it worked for him.
Hence, I am guessing this could be a Linux-specific problem.
It might have to do with the whitespace between the sentences. Maybe there
are some differences between the test you ran and the one your friend ran.
You can easily check that by trying out our current release candidate,
because we fixed the whitespace handling in the sentence detector there.
It can be downloaded from here:
http://people.apache.org/~joern/releases/opennlp-1.5.2-incubating/rc1/
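
If you want to rule out whitespace differences in the input itself, something
like the following might help. It is only a rough sketch using the Java
standard library; the class name WhitespaceProbe and its two methods are made
up for illustration and are not part of OpenNLP. It prints the invisible
characters in a string (so you can compare the Linux and Windows inputs) and
collapses every run of whitespace into a single space before the text would be
handed to the sentence detector.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP: inspects and normalizes whitespace.
class WhitespaceProbe {
    // One or more whitespace characters (space, tab, CR, LF, form feed, ...).
    private static final Pattern WS_RUN = Pattern.compile("\\s+");

    /** Collapses every run of whitespace into one space and trims the ends. */
    static String normalize(String text) {
        return WS_RUN.matcher(text).replaceAll(" ").trim();
    }

    /** Replaces CR, LF and tab with visible escapes so inputs can be compared. */
    static String reveal(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '\r': sb.append("\\r"); break;
                case '\n': sb.append("\\n"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: the same text with mixed line endings.
        String input = "director\nNov. 29.\r\nMr. Vinken is chairman";
        System.out.println(reveal(input));     // shows where the line breaks are
        System.out.println(normalize(input));  // text with single spaces only
    }
}
```

If the detector splits the normalized text correctly but not the raw text, that
would point at the whitespace handling rather than at the platform.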
Does your test with the API also behave differently on Windows?
Jörn