On 01/29/2014 07:33 PM, Richard Eckart de Castilho wrote:
> If I understand the SequenceClassificationModel interface correctly,
> the input data to be classified is passed as an array T[].
>
> What about data that is very large? I think it would be nice if
> the new interface would support sequence classification on streams,
> e.g. by passing an Iterator<T> or an actual stream to the classifier.

Exactly, the sequence to be classified is passed in as an array.

The current interface already supports passing in quite long sequences (limited only by memory),
easily a few tens of thousands of elements.

Do you have a use case where this would not be good enough? In the OpenNLP components the sequences are usually a sentence, but even a really long document should work.

Supporting streams would add quite some complexity (e.g. looking back or forward in the
sequence during feature generation), and I wonder if it is really necessary.
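To illustrate the point about feature generation: with an array, a feature generator can peek at the previous and next elements by index, which is exactly what a streaming interface makes awkward. Below is a toy, hypothetical sketch (the names and rule are made up, not the actual OpenNLP SequenceClassificationModel API) showing why random access is convenient here:

```java
import java.util.Arrays;

// Hypothetical sketch, not the real OpenNLP API: a toy "classifier" that
// labels each token using its left and right neighbors as context features.
public class SequenceDemo {

    // Array-based input: feature generation can read tokens[i - 1] and
    // tokens[i + 1] directly. With an Iterator<T> the classifier would
    // have to buffer elements itself to get the same look-back/look-ahead.
    static String[] classify(String[] tokens) {
        String[] outcomes = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String prev = i > 0 ? tokens[i - 1] : "BOS";
            String next = i < tokens.length - 1 ? tokens[i + 1] : "EOS";
            // Toy rule standing in for a real model's prediction.
            outcomes[i] = prev + "|" + tokens[i] + "|" + next;
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                classify(new String[] {"a", "b", "c"})));
    }
}
```

A streaming variant would need an internal sliding-window buffer sized to the maximum context the feature generators use, which is the extra complexity mentioned above.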

Thanks,
Jörn
