Yes, that would give ultimate control.

C 
On May 5, 2011, at 9:42 AM, Jörn Kottmann wrote:

> On 5/5/11 6:12 PM, Chris Collins wrote:
>> Right, I guess so:
>> 
>> #Mon Mar 28 12:17:52 PDT 2011
>> Training-Eventhash=d61e8fc9af7e230ff91060f27e0d2959
>> Manifest-Version=1.0
>> Language=de
>> useTokenEnd=true
>> Training-Cutoff=5
>> Training-Iterations=100
>> OpenNLP-Version=1.5.0
>> Timestamp=1301339872213
>> Component-Name=SentenceDetectorME
>> 
>> though I also meant a major/minor version, supplied by the person doing the 
>> build, for the version of the data rather than the OpenNLP software (don't 
>> forget the data location, e.g. 
>> /Users/chris/model_training/en/me_playing_around_dont_use_in_production :-})
> 
> Maybe we should give the user the freedom to write custom properties into the 
> earlier proposed training file and extend the above manifest with 
> automatically generated properties as far as it makes sense.
> 
> I guess that would suit your needs?
> 
> The training data location might not always be available. I, for example, 
> retrieve my training data from a database which contains my corpus. The data 
> is then streamed directly into OpenNLP without ever hitting the disk.
> 
> Jörn
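
For illustration, the merge proposed above could be sketched roughly as follows. This is only a sketch under stated assumptions, not how OpenNLP builds its manifest: the class ManifestSketch, its methods, and the keys Data-Version and Training-Data-Location are hypothetical, and letting generated entries win on a key clash is an assumed policy. The data location is optional, since the corpus may be streamed from a database.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class ManifestSketch {

    // Parse key=value lines in java.util.Properties format.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Combine user-written properties with automatically generated ones.
    // Generated entries are added last, so they win on a key clash
    // (an assumed policy). dataLocation may be null, e.g. when the
    // training data never touches the disk.
    public static Properties build(Properties custom, Properties generated,
                                   String dataLocation) {
        Properties manifest = new Properties();
        manifest.putAll(custom);
        if (dataLocation != null) {
            // Hypothetical key, not part of the manifest shown above.
            manifest.setProperty("Training-Data-Location", dataLocation);
        }
        manifest.putAll(generated);
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical custom property supplied by the person doing the build.
        Properties custom = parse("Data-Version=2.1\n");
        // A few of the automatically generated entries from the manifest above.
        Properties generated =
            parse("Language=de\nTraining-Cutoff=5\nTraining-Iterations=100\n");
        Properties manifest = build(custom, generated, null);
        // Writes the merged entries in Properties format.
        manifest.store(System.out, null);
    }
}
```

If custom values should be able to override generated ones, swapping the two putAll calls would do it; which side wins is a design choice the thread leaves open.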
