+1. It would be nice, especially because it lets us understand the data we are using for training.
2017-01-14 10:18 GMT-02:00 Joern Kottmann <[email protected]>:

> Hello all,
>
> I would like to propose that we implement counters / statistics for the
> various ObjectStreams of Samples we typically have in various parts,
> e.g. for training, eval, converters and maybe the tools which process
> data.
>
> The stats can be printed at the end of a training run to give an
> overview of the data. I already implemented one [1] for the NameFinder
> but I believe it is useful enough to undertake the effort to implement
> it for all components we have.
>
> Any opinions?
>
> Jörn
>
> [1] https://github.com/apache/opennlp/blob/master/opennlp-tools/src/main/java/opennlp/tools/cmdline/namefind/NameSampleCountersStream.java
