Hello all,

I would like to propose that we implement counters / statistics for the ObjectStreams of samples we use throughout the project, e.g. for training, evaluation, the converters, and maybe the tools which process data.
The stats can be printed at the end of a training run to give an overview of the data. I have already implemented one [1] for the NameFinder, but I believe it is useful enough to be worth the effort of implementing it for all the components we have.

Any opinions?

Jörn

[1] https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/cmdline/namefind/NameSampleCountersStream.java
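
To make the idea concrete, here is a minimal sketch of a generic counting wrapper. It is not the NameSampleCountersStream code from [1]. To keep it self-contained it uses a hypothetical stand-in interface, SampleStream, which only mirrors the read() contract of ObjectStream (return null when the stream is exhausted). The names CountingSampleStream, ListSampleStream and CounterDemo are likewise made up for illustration.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for ObjectStream, reduced to read().
// Assumption: read() returns null once the stream is exhausted.
interface SampleStream<T> {
    T read() throws IOException;
}

// Hypothetical wrapper: passes samples through unchanged and counts them.
class CountingSampleStream<T> implements SampleStream<T> {
    private final SampleStream<T> samples;
    private long sampleCount;

    CountingSampleStream(SampleStream<T> samples) {
        this.samples = samples;
    }

    @Override
    public T read() throws IOException {
        T sample = samples.read();
        if (sample != null) {
            sampleCount++; // count only real samples, not the end-of-stream null
        }
        return sample;
    }

    long getSampleCount() {
        return sampleCount;
    }
}

// Hypothetical in-memory stream, only here to drive the demo.
class ListSampleStream<T> implements SampleStream<T> {
    private final Iterator<T> items;

    ListSampleStream(List<T> items) {
        this.items = items.iterator();
    }

    @Override
    public T read() {
        return items.hasNext() ? items.next() : null;
    }
}

class CounterDemo {
    public static void main(String[] args) throws IOException {
        CountingSampleStream<String> counted = new CountingSampleStream<>(
                new ListSampleStream<>(List.of("sample one", "sample two", "sample three")));

        // A trainer would consume the stream here; the demo just drains it.
        while (counted.read() != null) {
            // no-op
        }

        // Report the statistics at the end of the run.
        System.out.println("Samples read: " + counted.getSampleCount());
    }
}
```

A component-specific version would add its own counters on top of this (for the NameFinder, for example, per-type name counts), and the trainer would print them once the stream has been fully consumed.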
