Re: sentence number in WordToken

samir chabou Mon, 30 Sep 2013 09:27:34 -0700

thanks for the feed back it's a good point,
I did it also with selectCovering but as Richard mention I'll changed to 
indexCovering since it's faster.
Samir





________________________________
 From: "Chen, Pei" <pei.c...@childrens.harvard.edu>
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>; samir chabou 
<samir...@yahoo.com> 
Sent: Monday, September 30, 2013 12:10:45 PM
Subject: RE: sentence number in  WordToken
 

Samir,
I think Richard has a good point here.   What is the use to require adding 
sentenceNumber() to BaseToken in the TypeSystem?
If it's only temporary, It may be a good idea to do it programmatically with 
local variable rather than modifying the type system and having it stored in 
the CAS...?

Maybe something like:
boolean a = JCasUtil.isCovered(JCas, BaseToken1, Sentence.class);
Boolean b = JCasUtil.isCovered(JCas, BaseToken2, Sentence.class);
--Pei


> -----Original Message-----
> From: Richard Eckart de Castilho [mailto:r...@apache.org]
> Sent: Monday, September 30, 2013 11:59 AM
> To: dev@ctakes.apache.org; samir chabou
> Subject: Re: sentence number in WordToken
> 
> Hi,
> 
> if you do many selectCovering calls, you may be faster using indexCovering
> once and then using the lookup index it produces.
> 
> IMHO type systems should not contain information that can easily be
> calculated at runtime (e.g. sentence number, token number, etc.).
> 
> Mind, I have no say here ;) Just my personal opinion.
> 
> -- Richard
> 
> On 30.09.2013, at 16:17, samir chabou <samir...@yahoo.com> wrote:
> 
> > Hi Pei,
> >
> > I though
> > this may be have some use ...
> >
> > Because I
> > need to know if two or more words tokens belong to the same sentence;
> > and since WordToken does not define the feature sentence number. I
> > added it to the TypeSystem. These are the steps:
> >
> > 1)      I added the sentence number
> > features for the type BaseToken in TypeSystem.xml file (I choose the
> > supper class in order that the feature be propagated to all subclasses
> > (wordToken,SymboleToken,NumToken ...)
> >
> > 2)      In ctakes-core I in TokenizerAnnotatorPTB.java (methode
> annotateRange) I set the new feature
> > (BaseToken.sentenceNumber = sentence.getSentenceNumber()) as
> shown below :
> >
> > bta.setSentenceNumber(sentence.getSentenceNumber());
> >       bta.addToIndexes();
> >
> > 3)      Generate the JCASGen in the tab de TypeSystem of the
> > aggregate
> >
> > 4)      Add the feature in the source
> > tab of the aggregate
> >
> > Probably I
> > could have used as alternative:
> > List<Sentence> list = JCasUtil.selectCovering(aJcas, Sentence.class,
> > entity1.getBegin(), entity1.getEnd()); the issue with this is : if I
> > have many entities to be checked at the same time or if the entity1 is
> > found in many places, I have to add some if conditions to get sentence
> > number
> >
> >
> > Thanks
> > Samir

Re: sentence number in WordToken

Reply via email to