Thanks Jeff, it worked. I don't think this is mentioned in the docs.
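
For anyone who finds this thread later: as far as I understand it, the cross validator splits the data by document rather than by sentence, and an empty line in the training file marks a document boundary. Without any empty lines the whole file looks like a single document, so there is nothing to split into 3 folds. Below is a rough sketch in plain Java (no OpenNLP calls) of how one might count the documents in a training file before cross validating. The class and method names (DocumentBoundaryCheck, countDocuments, addBoundaries) are made up for illustration, and inserting a boundary every N sentences is only a fallback assumption for data that has no real document structure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenNLP.
class DocumentBoundaryCheck {

    // Counts groups of non-empty lines separated by one or more empty lines.
    // Each group is treated as one "document".
    static int countDocuments(List<String> lines) {
        int documents = 0;
        boolean inDocument = false;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                inDocument = false;          // an empty line closes the current document
            } else if (!inDocument) {
                inDocument = true;           // first sentence of a new document
                documents++;
            }
        }
        return documents;
    }

    // Assumption: when the data has no natural boundaries, start a new
    // document after every `sentencesPerDocument` sentences.
    // Existing empty lines in the input are dropped and regenerated.
    static List<String> addBoundaries(List<String> lines, int sentencesPerDocument) {
        if (sentencesPerDocument < 1) {
            throw new IllegalArgumentException("sentencesPerDocument must be >= 1");
        }
        List<String> out = new ArrayList<>();
        int count = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            if (count > 0 && count % sentencesPerDocument == 0) {
                out.add("");                 // empty line = document boundary
            }
            out.add(line);
            count++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<START:person> Pierre Vinken <END> , 61 years old , will join the board .",
                "Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. .",
                "<START:person> Rudolph Agnew <END> was named a director .");
        System.out.println("documents before: " + countDocuments(lines));
        System.out.println("documents after:  " + countDocuments(addBoundaries(lines, 1)));
    }
}
```

If countDocuments returns a number smaller than the number of folds, the exception is probably to be expected. Real document boundaries are preferable to artificial ones where they exist.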

On Mon, Apr 17, 2017 at 1:20 AM, Jeff Zemerick <jzemer...@apache.org> wrote:

> Saurabh,
>
> Are there document boundaries (empty lines) in your training data?
>
> Jeff
>
>
>
> On Tue, Apr 11, 2017 at 6:07 AM, Saurabh Jain <saurabh4768j...@gmail.com>
> wrote:
>
> > Hi All
> >
> > I am cross validating NameFinder training data using
> > TokenNameFinderCrossValidator. Training parameters are as follows:
> >
> > Train algorithm name: MAXENT
> > Trainer Type name: EventModel
> > Iteration value: 100
> > Cut off value: 5
> > Beam size: 5
> > No of folds: 3
> > Total training instances: 22351
> >
> > Code snippet:
> >
> >         try {
> >             evaluate = new TokenNameFinderCrossValidator("en", entity,
> >                     trainingParameters,
> >                     TokenNameFinderFactory.create(null,
> >                             entityExtractionProcessor.getFeatureGenMap().get(entity),
> >                             Collections.emptyMap(), new BioCodec()));
> >         } catch (InvalidFormatException e) {
> >             e.printStackTrace();
> >         }
> >
> >         evaluate.evaluate(sampleStream, 3);
> >
> >
> > The evaluate method throws InsufficientTrainingDataException. Can anyone
> > suggest why this is happening? I passed 22351 training instances, so with
> > 3 folds each fold should get around 7000 instances.
> >
> >
> > --
> > *Thanks & Regards*
> >
> >
> > *Saurabh Jain *
> > *AI Developer*
> >
> > *Active Intelligence  *
> >
> > *"To do a thing yesterday was the best time. Second best time is today."*
> >
>


