James, thanks. This makes perfect sense.
So what's the conclusion? Can we make do with the current types, or do we
still need to create a new one?
Dima
On 11/15/2012 03:43 PM, Masanz, James J. wrote:
Yes, you can put multiple Pair annotations in the CAS.
There is a Pairs (plural) annotation type which is a list (FSArray) of Pair
annotations.
You could have two Pair annotations with
attribute="at_risk_for_early_brca"
value="T"
attribute="alcohol_use"
value="heavy_drinker"
The downside:
You have to iterate through the Pair annotations to find the one with the
attribute name you want.
The upside: we don't have to create new Annotation types for everything that
might be imagined.
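To make the trade-off concrete, here is a minimal sketch of the lookup in
plain Java. It assumes nothing about the real API: SimplePair, findValue and
examplePairs are hypothetical stand-ins, not the cTAKES Pair class or any UIMA
call, and in a real pipeline the pairs would be read from the CAS rather than
built by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class PairLookupSketch {

    // Hypothetical stand-in for a Pair: an attribute name plus a value.
    // This is NOT org.apache.ctakes.typesystem.type.util.Pair.
    static final class SimplePair {
        final String attribute;
        final String value;

        SimplePair(String attribute, String value) {
            this.attribute = attribute;
            this.value = value;
        }
    }

    // The "downside" in practice: a linear scan over all pairs that returns
    // the value of the first pair whose attribute name matches.
    static Optional<String> findValue(List<SimplePair> pairs, String attribute) {
        for (SimplePair pair : pairs) {
            if (pair.attribute.equals(attribute)) {
                return Optional.of(pair.value);
            }
        }
        return Optional.empty();
    }

    // Several unrelated document-level properties stored side by side,
    // without defining a new type for each one (the "upside").
    static List<SimplePair> examplePairs() {
        List<SimplePair> pairs = new ArrayList<>();
        pairs.add(new SimplePair("at_risk_for_early_brca", "T"));
        pairs.add(new SimplePair("document_label", "positive"));
        return pairs;
    }

    public static void main(String[] args) {
        List<SimplePair> pairs = examplePairs();
        // Attribute that is present.
        System.out.println(findValue(pairs, "document_label").orElse("<none>"));
        // Attribute that is absent.
        System.out.println(findValue(pairs, "severity").orElse("<none>"));
    }
}
```

Returning an Optional makes a missing attribute explicit, so the caller has to
decide what to do when no annotator has set it.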
As Stephen points out, not everything in Pairs needs to be a document class or
related to the text within the document. It can be used, for example, to keep
version information about a pipeline, or anything else an annotator wants to
record. A totally made-up example could be
attribute="dictionary_lookup_version"
value="3.2.1"
-- James
-----Original Message-----
From: [email protected] [mailto:ctakes-dev-return-869-[email protected]] On Behalf Of Dmitriy Dligach
Sent: Thursday, November 15, 2012 1:03 PM
To: [email protected]
Subject: Re: new type: document label?
Chen brings up a good point. But can't we solve this problem by creating
multiple Pair annotations in the CAS?
Dima
On 11/15/2012 01:52 PM, Lin, Chen wrote:
I am curious to know whether Pair allows multiple document-level labels for
a single doc. A single set of documents may be used in multiple
classification tasks.
For example, in one task a document may be labeled as "positive" or
"negative", while in another task the same doc may be labeled as "high",
"moderate" or "low". Many thanks!
Best,
Chen
-----Original Message-----
From: Dmitriy Dligach [mailto:[email protected]]
Sent: Thursday, November 15, 2012 1:46 PM
To: [email protected]
Subject: Re: new type: document label?
Thank you, James.
So, in general, did you envision this kind of use for Pair?
Pair.attribute -> "document_label"
Pair.value -> "positive"
I think this may work.
Dima
On 11/15/2012 10:22 AM, Masanz, James J. wrote:
Pair (org.apache.ctakes.typesystem.type.util.Pair) is intended for
such document-level properties.
Would that suit your need?
-- James
-----Original Message-----
From: [email protected] [mailto:ctakes-dev-return-854-[email protected]] On Behalf Of Dmitriy Dligach
Sent: Thursday, November 15, 2012 9:16 AM
To: cTAKES Dev list @ ASF
Subject: new type: document label?
We've recently been using cTAKES more and more for document-level
classification (e.g. phenotyping). Would it make sense to add a new
type (that would derive from TOP) to store the label for a document?
I know we currently have a doc id for each document, but having the
label type would simplify a lot of things (e.g. debugging).
Thanks,
Dima