Thanks Jeff, it worked. I don't think this is mentioned in the docs.

On Mon, Apr 17, 2017 at 1:20 AM, Jeff Zemerick <jzemer...@apache.org> wrote:
> Saurabh,
>
> Are there document boundaries (new lines) in your training data?
>
> Jeff
>
> On Tue, Apr 11, 2017 at 6:07 AM, Saurabh Jain <saurabh4768j...@gmail.com>
> wrote:
>
> > Hi All
> >
> > I am cross validating NameFinder training data using
> > TokenNameFinderCrossValidator. Training parameters are as follows:
> >
> > Train algorithm name: MAXENT
> > Trainer Type name: EventModel
> > Iteration value: 100
> > Cut off value: 5
> > Beam size: 5
> > No of folds: 3
> > Total training instances: 22351
> >
> > Code snippet:
> >
> > try {
> >     evaluate = new TokenNameFinderCrossValidator("en", entity,
> >         trainingParameters, TokenNameFinderFactory.create(null,
> >             entityExtractionProcessor.getFeatureGenMap().get(entity),
> >             Collections.emptyMap(), new BioCodec()));
> > } catch (InvalidFormatException e) {
> >     e.printStackTrace();
> > }
> >
> > evaluate.evaluate(sampleStream, 3);
> >
> > evaluate method is giving InsufficientTrainingDataException. Can anyone
> > suggest me why it is happening as I have passed 22351 training instances
> > and if it is 3 folds, then each fold will get around 7000 instances.
> >
> > --
> > *Thanks & Regards*
> >
> > *Saurabh Jain*
> > *AI Developer*
> >
> > *Active Intelligence*
> >
> > *"To do a thing yesterday was the best time. Second best time is today."*

--
*Thanks & Regards*

*Saurabh Jain*
*AI Developer*

*Active Intelligence*

*"To do a thing yesterday was the best time. Second best time is today."*
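
For anyone who finds this thread later: in the OpenNLP name finder training format, an empty line between sentences marks a document boundary. As far as I understand it (this is an assumption, not something confirmed in the thread), the cross validator builds its folds from documents rather than from individual sentences, so a file with no empty lines looks like a single document and cannot be split into 3 folds, no matter how many sentences it holds. Below is a minimal sketch of a sanity check to run on the training file before cross validating. The class name `DocumentBoundaryCheck` and the method `countDocuments` are hypothetical helpers written for illustration; they are not part of OpenNLP, and the sample sentences are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not an OpenNLP class): counts how many
// blank-line-separated documents a training file contains.
final class DocumentBoundaryCheck {

    // A document is a run of consecutive non-blank lines.
    // Any blank (or whitespace-only) line closes the current document.
    static int countDocuments(List<String> lines) {
        int documents = 0;
        boolean insideDocument = false;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                insideDocument = false;   // blank line: document boundary
            } else if (!insideDocument) {
                insideDocument = true;    // first sentence of a new document
                documents++;
            }
        }
        return documents;
    }

    public static void main(String[] args) {
        // Made-up sample in the name finder training format.
        List<String> training = Arrays.asList(
            "<START:person> Pierre Vinken <END> , 61 years old , will join the board .",
            "Mr . <START:person> Vinken <END> is chairman of the company .",
            "",
            "<START:person> Rudolph Agnew <END> was named a director .");
        System.out.println("documents: " + countDocuments(training));
    }
}
```

If the count comes back lower than the number of folds, adding empty lines at the real document boundaries in the training file is the first thing to try. In practice the lines would come from the training file itself, for example via `java.nio.file.Files.readAllLines`.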