Re: How does good training data look like?

Vyacheslav Zholudev Wed, 05 Oct 2011 14:51:25 -0700

Hi Em,

Could you please share the outcome once you have some results? I would be
interested to hear them.


Thanks,
Vyacheslav 

On Oct 5, 2011, at 11:08 PM, Em wrote:

> Thanks Jörn!
> 
> I'll experiment with this.
> 
> Regards,
> Em
> 
> On 05.10.2011 19:47, Jörn Kottmann wrote:
>> On 10/3/11 10:30 AM, Em wrote:
>>> What about document length?
>>> Just as an example: the production data will contain documents that are
>>> several pages long as well as very short texts containing only a
>>> few sentences.
>>> 
>>> I am thinking about chunking the long documents into smaller ones (i.e. each
>>> page of a longer document is split into an individual doc). Does this
>>> make sense?
>> 
>> I would first try to process a long document in one pass. If you encounter
>> any issues, you could just call clearAdaptiveData before the end of the
>> document.
>> But as Olivier said, you might just want to include a couple of these long
>> documents in your training data.
>> 
>> Jörn
>> 
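
To make the two options in the quoted exchange concrete, here is a minimal
sketch of processing a long document chunk by chunk and clearing the adaptive
data after each chunk. It is an illustration under assumptions, not OpenNLP
code: AdaptiveFinder is a hypothetical stand-in interface (only the method name
clearAdaptiveData comes from the thread), and ChunkedProcessing, chunk,
processDocument and the chunk size are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stateful finder; NOT the OpenNLP API itself. */
interface AdaptiveFinder {
    /** Processes one tokenized sentence (assumed signature, for illustration). */
    void process(String[] sentenceTokens);

    /** Resets whatever the finder has remembered from earlier sentences. */
    void clearAdaptiveData();
}

final class ChunkedProcessing {
    private ChunkedProcessing() {}

    /** Splits sentences into consecutive chunks of at most chunkSize sentences. */
    static List<List<String[]>> chunk(List<String[]> sentences, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<String[]>> chunks = new ArrayList<>();
        for (int start = 0; start < sentences.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, sentences.size());
            chunks.add(new ArrayList<>(sentences.subList(start, end)));
        }
        return chunks;
    }

    /**
     * Feeds every sentence to the finder and clears its adaptive data after
     * each chunk. Returns how many times the adaptive data was cleared.
     */
    static int processDocument(AdaptiveFinder finder,
                               List<String[]> sentences,
                               int chunkSize) {
        int clears = 0;
        for (List<String[]> chunkOfSentences : chunk(sentences, chunkSize)) {
            for (String[] tokens : chunkOfSentences) {
                finder.process(tokens);
            }
            finder.clearAdaptiveData();
            clears++;
        }
        return clears;
    }
}
```

Passing a chunk size at least as large as the document gives the "process it
in one pass" behaviour, with a single clear at the end; a smaller value
approximates the page-by-page chunking Em describes. Which setting works
better is something only the experiment can show.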

Best,
Vyacheslav
