Maybe I'm getting ahead of myself, but I'm not sure we need to worry
about this. The versions of OmniFind that will support Apache will most
likely not use the CPM, so chunking support would need to be rethought
anyway. We could remove or just not document the feature. I very much
hope that none of our other users are utilizing this feature.
--Thilo
Adam Lally wrote:
On 12/7/06, Marshall Schor <[EMAIL PROTECTED]> wrote:
To support "chunking" as used in OmniFind, the reference chapter for the
CPE says things like:
<term>throttleID</term>
<listitem><para>[String] special attribute currently used by
OmniFind.</para></listitem>
It seems a bit strange to have OmniFind-specific information in the
Apache UIMA documentation.
Does anyone have a suggestion on how to better handle this? -Marshall
Well, can we just document how to use the feature (which I think
involves using a particular type system for document metadata and
populating it in the CollectionReader)? Others might have similar
requirements and want to use this.
I guess one problem is that the type name might not be what we want
(e.g. does it start with com.ibm?). We might want to consider making
this a UIMA built-in type (or perhaps not truly built-in, but a type
system descriptor provided with the SDK). Maybe it should be combined
with the org.apache.uima.examples.SourceDocumentInformation type,
which already defines a Document URI and an isLastSegment flag - we'd
need to add the SegmentNumber.
-Adam
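
For illustration only, the combined type Adam describes might be declared in a type system descriptor roughly as follows. This is a sketch under assumptions, not an agreed design: the type name example.SegmentedDocumentInformation is hypothetical, the feature names (uri, lastSegment, segmentNumber) are guesses based on the wording above, and the supertype is an assumption. None of it is confirmed as what OmniFind or the SDK actually uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: document metadata type that carries chunking information. -->
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <name>SegmentedDocumentInformationTypeSystem</name>
  <description>Illustrative only; type and feature names are assumptions.</description>
  <types>
    <typeDescription>
      <!-- Hypothetical type name, chosen to avoid a com.ibm prefix. -->
      <name>example.SegmentedDocumentInformation</name>
      <description>Source document metadata plus segment (chunk) information.</description>
      <!-- Assumed supertype. -->
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>uri</name>
          <description>URI of the source document this segment came from.</description>
          <rangeTypeName>uima.cas.String</rangeTypeName>
        </featureDescription>
        <featureDescription>
          <name>lastSegment</name>
          <description>True if this CAS holds the final segment of the document.</description>
          <rangeTypeName>uima.cas.Boolean</rangeTypeName>
        </featureDescription>
        <featureDescription>
          <!-- The feature that would need to be added, per the discussion above. -->
          <name>segmentNumber</name>
          <description>Position of this segment within the source document.</description>
          <rangeTypeName>uima.cas.Integer</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
```

Under this sketch, a CollectionReader would create one instance of the type per CAS and set all three features, so that downstream components can tell which chunks belong to the same document and when the last one has arrived.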