Hi,
here are the results of my investigations:
- The text of the document is not set directly. You should add something
like cas.setDocumentText(sentence.getDocumentText()); before populating
the CAS in your method. Otherwise there will be a DocumentAnnotation of
length 0. Ruta does not handle these well, and that is the source of the
problem. If you add the line, or otherwise avoid zero-length annotations,
the rules should work just fine.
- I'd rather use tcas.addFsToIndexes(sentenceAnn); instead of
tcas.getIndexRepository().addFS(sentenceAnn); (but that shouldn't change
anything)
- You access the problem type "cogroo.ruta.Base.PROBLEM", but the rules
seem to use the type "Main.PROBLEM"
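One way to catch the zero-length case before calling ae.process(cas) is a small pre-flight check on the offsets that are about to be written into the CAS. The sketch below is plain Java and does not use the UIMA API. SpanPreflight, Span, and check are hypothetical names made up for illustration, and it assumes the converter can expose its token offsets as begin/end pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of UIMA or Ruta.
final class SpanPreflight {

    // Minimal stand-in for an annotation: a type name plus character offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Returns a list of problems; an empty list means the spans look sane.
    static List<String> check(String documentText, List<Span> spans) {
        List<String> problems = new ArrayList<>();
        int length = (documentText == null) ? 0 : documentText.length();
        if (length == 0) {
            // With no document text, the document annotation would span 0 characters.
            problems.add("document text is empty: DocumentAnnotation would have length 0");
        }
        for (Span s : spans) {
            if (s.begin == s.end) {
                problems.add(s.type + " [" + s.begin + "," + s.end + ") has length 0");
            } else if (s.begin < 0 || s.begin > s.end || s.end > length) {
                problems.add(s.type + " [" + s.begin + "," + s.end + ") lies outside the document text");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Span> tokens = Arrays.asList(new Span("cgToken", 0, 6), new Span("cgToken", 10, 11));
        // Empty text: the document itself and both tokens are reported.
        System.out.println(check("", tokens));
        // Text set first: the same offsets are accepted.
        System.out.println(check("Refiro-me \u00e0 trabalho remunerado.", tokens));
    }
}
```

Running such a check right before ae.process(cas) would flag the missing document text immediately, instead of the rules silently producing no matches.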
Best,
Peter
On 03.06.2015 at 19:14, Diego Buoro wrote:
Hi Peter, the example we used is the short sentence in the string at
the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
Based on the Main.ruta we sent you, we expected the output to contain
7 "PROBLEM" annotations. This part is working.
The problem appears when we change the last line of Main.ruta from
"cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
expected 6 "PROBLEM" annotations: the same ones we had in the first
example, except for the first one. That is what happens when you run
the script in a simple Ruta project, but when we run it in the Java
application we get 0 "PROBLEM" annotations.
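The expected counts above follow from simple arithmetic, sketched here under the assumption that all tokens are adjacent and that the rule puts its action only on its last element. ExpectedMatches is a hypothetical helper for illustration. It models the expectation described in this message, not Ruta's actual matching algorithm.

```java
// Hypothetical helper; a simplified model of the expectation, not Ruta's matcher.
final class ExpectedMatches {

    // Number of annotations expected when a rule made of `ruleLength`
    // consecutive token elements, with the action on the last element,
    // is applied to `tokenCount` adjacent tokens.
    static int expected(int tokenCount, int ruleLength) {
        if (ruleLength < 1) {
            throw new IllegalArgumentException("ruleLength must be at least 1");
        }
        // Each match needs ruleLength tokens, so the last possible start
        // position is tokenCount - ruleLength (counting from 0).
        return Math.max(0, tokenCount - ruleLength + 1);
    }

    public static void main(String[] args) {
        int tokens = 7; // one PROBLEM per token was reported for the example sentence
        System.out.println(expected(tokens, 1)); // cgToken{->PROBLEM};
        System.out.println(expected(tokens, 2)); // cgToken cgToken{->PROBLEM};
    }
}
```

With 7 tokens this gives 7 for the one-element rule and 6 for the two-element rule, so a result of 0 suggests that something differs in the input CAS rather than in the rule itself.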
We think this difference arises because in the Ruta project we
don't use plain text as input. Instead, we feed it a preprocessed
xmi file. In the Java application, on the other hand, we do the
processing ourselves via the processCas method. It's possible that the
processCas method is creating tokens in a way that prevents the Ruta
script from detecting when one is next to the other.
We are sending you the xmi file to use as an example in a simple Ruta
project. If there are any other examples you'd like us to send you,
just say the word :D
Best,
Diego
2015-06-01 11:15 GMT-03:00 Diego Buoro:
Sorry, please disregard my last answer. The idea wasn't to use the
xmi; we are still thinking of a minimal example to provide to you.
We will send it to you in the next few days.
2015-06-01 10:37 GMT-03:00 Diego Buoro:
Hi Peter, how are you doing?
We were trying to run it using files such as Crase01.xmi and
rule_xml_001.xmi.
Our goal is to run those two simpler ones first, and
then run it with Crase.xmi.
About the package declaration, I still need to check which Ruta
version it is.
I will check this soon.
All Best,
Diego
2015-05-30 0:45 GMT-03:00 Diego Buoro:
Hi Peter!
No problem, I appreciate your support.
All Best,
Diego
2015-05-27 14:22 GMT-03:00 Diego Buoro:
Hi Peter!
We call the script with the following lines:
URL url = Resources.getResource("Main.ruta");
String text = Resources.toString(url, Charsets.UTF_8);
AnalysisEngineDescription aeDes =
    Ruta.createAnalysisEngineDescription(text, tsd);
this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
CAS cas = ae.newCAS();
converter.populateCas(sentence.getTextSentence(), cas);
ae.process(cas);
The populateCas method is responsible for translating
our annotations into RUTA annotations, but it doesn't
set any type priority explicitly.
We don't know much about type priorities; the RUTA
references we found say very little about them. Are
they necessary for what we need to do?
The file that contains the above lines is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
The processCAS method is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
The script we are calling is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
PS: Yes, we remembered the semicolons.
Thanks for the help :)
2015-05-26 15:30 GMT-03:00 Diego Buoro:
I think I wasn't clear enough, so I should be
more specific.
I have a type system in which all words have been
annotated as Tokens. I am calling a RUTA script
from a Java class, and that script has only one rule:
Token Token {-> Problem}
However, with this script, no Problems are
created. When I try
Token {-> Problem}
I get one problem for each Token, which is what I
expected. Why can't I create annotations using
rules with more than one word?
Thanks
2015-05-26 14:49 GMT-03:00 Diego Buoro:
Hello guys, how are you doing?
Once I have called RUTA from a Java project, how can I mark
consecutive tokens as a "Problem" (the name of
my annotation, in this case)?
Thanks in advance!