2011/6/22 Hannes Korte <[email protected]>:
> On 06/22/2011 06:50 PM, Olivier Grisel wrote:
>> I am ok with switching to UIMA CAS. We might need additional metadata
>> outside of the CAS annotations though. For instance, if an annotator
>> fixes a typo in the Sofa itself, we might need to be able to tell
>> that Sofa1 is subject to being replaced by Sofa2 according to
>> annotator A1.
>
> Do we have one CAS per sentence or one CAS per document? If the former
> is the case, then we will need some more metadata around the CAS
> documents to be able to show the context of a given sentence (if that is
> needed at all). If the latter is the case, then this will lead to many
> different Sofas, which only differ in a few characters, right?
>
> If we want to add disambiguation and coref information into the
> annotator UI at a later stage, then one CAS per document would be much
> more useful.

I am +1 for one CAS per document, with fast intra-CAS keyboard
navigation and sentence filtering done at the UI level only. However,
pignlproc currently outputs OpenNLP-formatted sentences without document
information. This can change (it's just not implemented yet).

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
