OK, I definitely agree that I need more data, but even a zero annotation count shouldn't have to result in an NPE. This could be tested against some data set (à la UCI, http://archive.ics.uci.edu/ml/) to see whether there is some threshold above which the cross validator stops throwing NPEs, and that information could then be included in the documentation :-D. I think we can all see where I'm going with this... haha.

On Tue, Nov 26, 2013 at 12:34 AM, Jörn Kottmann <[email protected]> wrote:

> On 11/26/2013 12:57 AM, Walrus theCat wrote:
>
>> I'd like to get more data, but this is what I've got right now... I have
>> 5,000 sentences, and 10-20 name annotations per label. Would that really
>> cause the null pointer, though? And only when cross-validating?
>>
>
> You need more name annotations, with a few tens the name finder will
> probably detect almost nothing.
>
> The cross validator splits the data into n parts, afterwards it
> iteratively trains on n-1 parts and tests on the left out part.
> The n-1 parts it uses for testing might not contain any annotations at all.
>
> Jörn
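
For what it's worth, the n-way split described in the quoted message can be sketched in a few lines. This is only an illustration, not OpenNLP code: it assumes contiguous, equally sized parts, which may not match how the real cross validator partitions its input, and the function names and the toy corpus (5,000 sentences with 15 annotations clustered at the start) are made up for the example.

```python
def fold_annotation_counts(annotations_per_sentence, n_folds):
    """Annotation count in each held-out part, assuming contiguous parts."""
    n = len(annotations_per_sentence)
    counts = []
    for fold in range(n_folds):
        # Integer arithmetic keeps the parts as even as possible.
        start = fold * n // n_folds
        end = (fold + 1) * n // n_folds
        counts.append(sum(annotations_per_sentence[start:end]))
    return counts


def training_annotation_counts(annotations_per_sentence, n_folds):
    """Annotation count in the remaining n-1 parts for each held-out part."""
    held_out = fold_annotation_counts(annotations_per_sentence, n_folds)
    total = sum(held_out)
    return [total - count for count in held_out]


# Hypothetical corpus: one entry per sentence, value = annotations in it.
corpus = [1] * 15 + [0] * 4985

test_counts = fold_annotation_counts(corpus, 10)
train_counts = training_annotation_counts(corpus, 10)

print("annotations in each held-out part:", test_counts)
print("annotations in the other n-1 parts:", train_counts)
```

With this made-up layout, one round ends up with no annotations at all in its n-1 parts and the other nine rounds have none in the held-out part, which is the kind of situation that seems worth handling without an NPE.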
