I should have mentioned that, to use your factory, you can simply pass its
fully qualified class name to the command line tool via the "-factory" argument.
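
In case "fully qualified name" is unclear: it is the package name plus the
class name. The sketch below only illustrates the idea and assumes the tool
resolves the name by reflection and calls a public no-argument constructor.
DemoFactory, CustomDemoFactory and FactoryLoaderDemo are hypothetical names
made up for this example; they are not OpenNLP classes.

```java
// Hypothetical stand-in for a factory base type; this is NOT an OpenNLP class.
interface DemoFactory {
    char[] eosCharacters();
}

// Hypothetical custom factory with a public no-argument constructor.
// Inside a package such as org.example, its fully qualified name would be
// "org.example.CustomDemoFactory".
class CustomDemoFactory implements DemoFactory {
    public CustomDemoFactory() { }

    @Override
    public char[] eosCharacters() {
        return new char[] {'.', '!', '?', ':'};
    }
}

class FactoryLoaderDemo {
    // Resolve a class from its fully qualified name and instantiate it
    // through its no-argument constructor.
    static DemoFactory load(String qualifiedName) {
        try {
            Class<?> type = Class.forName(qualifiedName);
            return (DemoFactory) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot instantiate " + qualifiedName, e);
        }
    }

    public static void main(String[] args) {
        // Use the name given on the command line, or fall back to the demo class.
        String name = args.length > 0 ? args[0] : CustomDemoFactory.class.getName();
        DemoFactory factory = load(name);
        System.out.println(new String(factory.eosCharacters()));
    }
}
```

If the name is misspelled or the class is not on the classpath, load() fails
with an IllegalArgumentException, so a wrong "-factory" value is most likely
a naming or classpath problem.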


On Tue, Mar 26, 2013 at 10:21 AM, William Colen <[email protected]> wrote:

> Riccardo,
>
> You can tune your sentence detector using a custom context generator.
>
> At
> http://svn.apache.org/viewvc/opennlp/trunk/opennlp-tools/src/test/java/opennlp/tools/sentdetect/
> take a look at DummySentenceDetectorFactory.java
> and SentenceDetectorFactoryTest.java
>
> If you prefer a concrete example, take a look at an implementation I wrote
> for another project:
>
> https://github.com/cogroo/cogroo4/tree/master/cogroo-nlp/src/main/java/org/cogroo/tools/sentdetect
>
> William
>
>
> On Tue, Mar 26, 2013 at 9:52 AM, Riccardo Tasso
> <[email protected]> wrote:
>
>> Thank you Jörn, in fact the results improved a lot:
>> Precision: 0.5325131810193322
>> Recall: 0.4745497259201253
>> F-Measure: 0.5018633540372671
>>
>> I guess the splitter could get better results if it were able to detect
>> parenthetical structures such as:
>> some text - speech - other text
>> which in my dataset is split as:
>> some text
>> - speech -
>> other text
>> Is that possible?
>>
>> Another improvement would be to detect end-of-sentence symbols that are
>> longer than one character, for example "...".
>>
>> Can you tell me more about the following parameters?
>>
>>    - iterations
>>    - cutoff
>>
>> Is there any guideline on how to tune them?
>>
>> Cheers,
>> Riccardo
>>
>>
>>
>> 2013/3/26 Jörn Kottmann <[email protected]>
>>
>> > On 03/26/2013 08:40 AM, Riccardo Tasso wrote:
>> >
>> >> Is the Sentence Detector also able to split on characters other than the
>> >> dot? In my case other characters should also delimit the end of a
>> >> segment, such as: colon (:), dash (-), various kinds of quotation marks
>> >> (", `, ', ...).
>> >>
>> >
>> > The Sentence Detector can only split on end-of-sentence characters. By
>> > default these are . ! ? but with 1.5.3 you can set them to your custom
>> > set during training; there is a command line argument for it on the
>> > Sentence Detector Trainer, have a look at the help.
>> >
>> > If you don't want to compile it yourself, use the 1.5.3 RC2, which we are
>> > currently testing.
>> >
>> > Jörn
>> >
>> >
>> >
>>
>
>
