There is a lot of config handling; maybe PostLemmas is being set to true, or
configInit() is not setting up the NLM wrapper correctly.
ctakes-lvg *README*
Note: as distributed, PostLemmas is set to false. This is done to reduce
the size of the CAS.
Set PostLemmas to true to have org.apache.ctak
The normalizedForm field is filled in. It is used by dictionary lookup.
So, for example, if the dictionary would contain "lymph node" but not "lymph
nodes", a document with text of "lymph nodes" would match the dictionary entry
"lymph node" because "node", being the normalized form of "nodes", w
Before the switch to OpenNLP (which was done before the first opensource
release of cTAKES), I believe the Lemma annotations were used by the POS tagger
and/or phrasal parser. As far as I know, that was the original intention of
the Lemmas. I believe they were turned off by default for some re
Quick follow-up since I was interested. The current dependency parser
does have the option to use ctakes lemmas or do its own lemmatizing, but
that doesn't use the lemma field, it uses the normalizedForm field. I'm
not sure if that field is actually ever filled in -- on my example data
it is always
Thanks James. Does it ring a bell to you that the original intention was
something like query expansion for a dictionary lookup?
Tim
On 04/17/2014 01:57 PM, Masanz, James J. wrote:
> Offhand I recall at least one of the dependency parsers used the Lemma
> annotations at one point.
> Not sure if
Offhand I recall at least one of the dependency parsers used the Lemma
annotations at one point.
Not sure if still does.
There is an option for turning off the posting of the lemmas to the cas.
Hope that helps
-----Original Message-----
From: Miller, Timothy [mailto:timothy.mil...@childrens.ha
Those variants are not used by the dictionary lookup. I did look at them to
see if it was worthwhile for the new dictionary, but they are all over the
place so I passed.
From: Miller, Timothy [timothy.mil...@childrens.harvard.edu]
Sent: Thursday, April
Pei and I had a similar discussion in person -- mapping from lexical
variants to a stem might be useful. Pei also mentioned that one intended
use might have been searching the dictionary with lexical variants, but
I don't think that is done. Looking at the precision of the variants, I
think it's hig
I don’t know of any applications within cTAKES that make use of this… The
reverse (mapping from these “variants” to the normal form) may be useful though.
Dima
On Apr 17, 2014, at 11:50, Miller, Timothy
wrote:
> Sure, just as an example, I gave it a note with about 1000 words. It
> generat
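The reverse mapping mentioned above (from a lexical variant back to its normal form) can be sketched as an inverted index. This is only an illustrative sketch: the variant strings are taken from the "symptomatic" example quoted elsewhere in this thread, while the table layout and the names `VARIANTS`, `build_reverse_index` and `normal_forms` are hypothetical and are not part of LVG or cTAKES.

```python
from collections import defaultdict

# Base form -> generated lexical variants. The strings come from the
# "symptomatic" example in this thread; the table itself is a
# hypothetical stand-in, not real LVG output.
VARIANTS = {
    "symptomatic": ["symptomaticer", "symptomaticed", "symptomaticcing"],
}


def build_reverse_index(variants):
    """Invert a base -> variants table into variant -> set of base forms."""
    reverse = defaultdict(set)
    for base, forms in variants.items():
        reverse[base.lower()].add(base)  # a base form maps to itself
        for form in forms:
            reverse[form.lower()].add(base)
    return reverse


def normal_forms(token, index):
    """Return the base forms recorded for a token (empty set if unknown)."""
    return set(index.get(token.lower(), set()))


index = build_reverse_index(VARIANTS)
print(normal_forms("Symptomaticer", index))
```

A set is used for the values because one surface form could, in principle, be a variant of more than one base form.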
Sure, just as an example, I gave it a note with about 1000 words. It
generates 11500 NonEmptyFSList elements (each is basically one lexical
variant).
For the word "symptomatic", these are the first 10 of 20 lexical variants:
Symptomaticer/JJ
Symptomaticer/RB
Symptomaticed/VB
Symptomaticcing/VB
Sym
Tim, this is a very interesting observation. Could you please send a few
examples of what LVG generates? Both sensical and non :)
Dima
On Apr 17, 2014, at 11:28, Miller, Timothy
wrote:
> The LVG annotator creates an enormous number of "lemmas" for every
> WordToken in the CAS, and I'm wond
The LVG annotator creates an enormous number of "lemmas" for every
WordToken in the CAS, and I'm wondering what the original purpose was? I
think this is probably a minor bottleneck for speed but mostly a pretty
big space hog (at least 50% of the space of xmi files in my tests).
As of right now I'
+1!
I'm mainly using cTAKES as middleware, which is totally in line with this.
What is NCBO?
JG
On Wed, Apr 16, 2014 at 6:53 PM, andy mcmurry
wrote:
> Lowering the barrier to entry = worth the effort
> Notice the NCBO users mailing list has solid routine c