2011/6/22 Jörn Kottmann <[email protected]>:
>
>> Do we have one CAS per sentence or one CAS per document? If the former
>> is the case, then we will need some more metadata around the CAS
>> documents to be able to show the context of a given sentence (if that is
>> needed at all). If the latter is the case, then this will lead to many
>> different Sofas, which only differ in a few characters, right?
>>
>
> I was thinking about a system where we have one CAS per document,
> but our tooling should still collect annotation on a sentence level.
> So a user needs to annotate at least one sentence to add something
> useful to the CAS. The training code should then take care of training
> on a document which only contains a few annotated sentences.
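
To make the quoted proposal concrete, here is a minimal sketch of one document holding many sentences, where training only sees the sentences a user has annotated. It is plain Python and does not use the UIMA CAS API; the names `Sentence`, `Document`, `annotated` and `training_sentences` are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    begin: int  # character offset into the document text
    end: int
    # Entity spans as (begin, end, type), offsets relative to the document text.
    entities: list = field(default_factory=list)
    # True once a user has reviewed this sentence (hypothetical flag).
    annotated: bool = False


@dataclass
class Document:
    text: str
    sentences: list


def training_sentences(doc):
    """Yield (sentence_text, entities) only for sentences a user annotated."""
    for s in doc.sentences:
        if s.annotated:
            # Shift entity offsets so they are relative to the sentence.
            ents = [(b - s.begin, e - s.begin, t) for b, e, t in s.entities]
            yield doc.text[s.begin:s.end], ents


text = "Olivier lives in Paris. The weather is nice."
doc = Document(text, [
    Sentence(0, 23, [(0, 7, "person"), (17, 22, "location")], annotated=True),
    Sentence(24, 44),  # not reviewed yet, so it is skipped for training
])
samples = list(training_sentences(doc))
```

Keeping the `annotated` flag separate from the entity list means a reviewed sentence with no entities can still be used as a negative example, while unreviewed sentences are left out.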

I agree.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
