I suggest using the current types.

I think if we add a new one, we would still want to handle multiple 
classifications, and would still have the downside of having to iterate through 
the classifications to find the one of interest.  So I'm not sure how much we 
gain by adding a new type.
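
To make the iteration cost concrete, here is a minimal sketch of the lookup. It does not use the real cTAKES or UIMA classes: the Pair class below is a hypothetical stand-in with just an attribute and a value, and the "severity" attribute name is made up for illustration. It only shows that finding one classification among several is a short linear scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class PairLookupSketch {

    // Hypothetical stand-in for a Pair annotation (NOT the cTAKES class):
    // one attribute name and one value.
    static final class Pair {
        final String attribute;
        final String value;

        Pair(String attribute, String value) {
            this.attribute = attribute;
            this.value = value;
        }
    }

    // Linear scan: returns the value of the first Pair whose attribute
    // matches, or an empty Optional if no Pair has that attribute.
    static Optional<String> findValue(List<Pair> pairs, String attribute) {
        for (Pair pair : pairs) {
            if (pair.attribute.equals(attribute)) {
                return Optional.of(pair.value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Two classification tasks labelling the same document.
        List<Pair> pairs = new ArrayList<>();
        pairs.add(new Pair("document_label", "positive"));
        pairs.add(new Pair("severity", "moderate")); // made-up attribute name

        System.out.println(findValue(pairs, "severity").orElse("<none>"));
    }
}
```

A dedicated label type would need the same kind of scan as soon as a document carries more than one label, which is why the gain from a new type looks small.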

But you are closer to this than I am, so I would go with whatever you or
others doing classification recommend.

-- James


> -----Original Message-----
> From: [email protected]
> [mailto:ctakes-dev-return-874-
> [email protected]] On Behalf Of Dmitriy Dligach
> Sent: Thursday, November 15, 2012 3:49 PM
> To: [email protected]
> Subject: Re: new type: document label?
> 
> James, thanks. This makes perfect sense.
> 
> So what's the conclusion? Can we do with the current types, or do we
> still need to create a new one?
> 
> Dima
> 
> On 11/15/2012 03:43 PM, Masanz, James J. wrote:
> > Yes, you can put multiple Pair annotations in the CAS.
> > There is a Pairs (plural) annotation type which is a list (FSArray) of
> > Pair annotations.
> >
> > You could have two Pair annotations with
> > attribute="at_risk_for_early_brca"
> > value="T"
> >
> > attribute="alcohol_use"
> > value="heavy_drinker"
> >
> > The downside:
> > You have to iterate through the Pair annotations to find the one
> > with the attribute name you want.
> > The upside: we don't have to create new Annotation types for
> > everything that might be imagined.
> >
> > As Stephen points out, not everything in Pairs needs to be a document
> > class or related to the text within the document. It can be used, for
> > example, to keep version information about a pipeline or anything any
> > annotator wants. A totally made-up example could be
> > attribute="dictionary_lookup_version"
> > value="3.2.1"
> >
> > -- James
> >
> >
> >> -----Original Message-----
> >> From:
> >> [email protected]
> >> [mailto:ctakes-dev-return-869-
> >> [email protected]] On Behalf Of Dmitriy
> >> Dligach
> >> Sent: Thursday, November 15, 2012 1:03 PM
> >> To: [email protected]
> >> Subject: Re: new type: document label?
> >>
> >> Chen brings up a good point. But can't we solve this problem by
> >> creating multiple Pair annotations in the CAS?
> >>
> >> Dima
> >>
> >> On 11/15/2012 01:52 PM, Lin, Chen wrote:
> >>> I am curious to know if Pair allows multiple document-level labels
> >>> for a single doc. It is possible that a single set of documents be used
> >>> in multiple classification tasks.
> >>> For example, in one task a document may be labeled as "positive" or
> >>> "negative", in another task this same doc may be labeled as "high",
> >>> "moderate" or "low".  Many thanks!
> >>> Best,
> >>> Chen
> >>>
> >>> -----Original Message-----
> >>> From: Dmitriy Dligach [mailto:[email protected]]
> >>> Sent: Thursday, November 15, 2012 1:46 PM
> >>> To: [email protected]
> >>> Subject: Re: new type: document label?
> >>>
> >>> Thank you, James.
> >>>
> >>> So, in general did you envision this type of use for Pair:
> >>>
> >>> Pair.attribute -> "document_label"
> >>> Pair.value -> "positive"
> >>>
> >>> I think this may work.
> >>>
> >>> Dima
> >>>
> >>> On 11/15/2012 10:22 AM, Masanz, James J. wrote:
> >>>> Pair (org.apache.ctakes.typesystem.type.util.Pair) is intended for
> >>>> such document-level properties.
> >>>> Would that suit your need?
> >>>>
> >>>> -- James
> >>>>
> >>>>> -----Original Message-----
> >>>>> From:
> >>>>> [email protected]
> >>>>> [mailto:ctakes-dev-return-854-
> >>>>> [email protected]] On Behalf Of Dmitriy
> >>>>> Dligach
> >>>>> Sent: Thursday, November 15, 2012 9:16 AM
> >>>>> To: cTAKES Dev list @ ASF
> >>>>> Subject: new type: document label?
> >>>>>
> >>>>> We've recently been using cTAKES more and more for document-level
> >>>>> classification (e.g. phenotyping). Would it make sense to add a
> >>>>> new type (that would derive from TOP) to store the label for a
> >>>>> document?
> >>>>> I know we currently have a doc id for each document, but having
> >>>>> the label type would simplify a lot of things (e.g. debugging).
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Dima
