Re: Marking consecutive tokens with RUTA

Diego Buoro Fri, 12 Jun 2015 05:20:27 -0700

Hi Peter, Armin

Thanks for your observations. I hope we can finally get this working.
We will try the changes in the next few days and then give you feedback.


All Best,

Diego



2015-06-03 14:14 GMT-03:00 Diego Buoro <[email protected]>:

> Hi Peter, the example we used is the short sentence in the string at the
> end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
> Based on the Main.ruta we sent you, we expected the output to contain 7
> "PROBLEM" annotations. This part is working.
> The problem appears when we change the last line of Main.ruta from
> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
> expected 6 "PROBLEM" annotations: the same ones we had in the first
> example, except for the first one. That is what happens when you run the
> script in a simple Ruta project, but when we run it in the Java
> application we get 0 "PROBLEM" annotations.
> We think this difference arises because in the Ruta project we don't
> use plain text as input. Instead, we feed it a preprocessed xmi file. In
> the Java application, on the other hand, we do the processing ourselves
> via the processCas method. It's possible that the processCas method is
> creating tokens in a way that prevents the Ruta script from detecting when
> one is next to the other.
> We are sending you the xmi file to use as an example for a simple Ruta
> project. If there are any other examples you'd like us to send you, just
> say the word :D
>
> Best,
>
> Diego
>
> 2015-06-01 11:15 GMT-03:00 Diego Buoro <[email protected]>:
>
>> Sorry, please disregard my last answer. The idea wasn't to use the xmi; we
>> are still working on a minimal example to provide to you.
>> We will send it to you in the next few days.
>>
>> 2015-06-01 10:37 GMT-03:00 Diego Buoro <[email protected]>:
>>
>>> Hi Peter, how are you doing?
>>>
>>> We were trying to run using files such as Crase01.xmi and
>>> rule_xml_001.xmi.
>>> Our goal is to run those two simpler ones first, and then run with
>>> Crase.xmi.
>>>
>>> About the package declaration, I still need to check which Ruta version
>>> we are using.
>>> I will check this soon.
>>>
>>> All Best,
>>>
>>> Diego
>>>
>>>
>>>
>>>
>>>
>>> 2015-05-30 0:45 GMT-03:00 Diego Buoro <[email protected]>:
>>>
>>>> Hi Peter!
>>>> No problem, I appreciate your support.
>>>>
>>>> All Best,
>>>>
>>>> Diego
>>>>
>>>> 2015-05-27 14:22 GMT-03:00 Diego Buoro <[email protected]>:
>>>>
>>>>> Hi Peter!
>>>>> We call the script with the following lines:
>>>>>
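>>>>> // Load the Ruta script from the classpath.
>>>>> URL url = Resources.getResource("Main.ruta");
>>>>> String text = Resources.toString(url, Charsets.UTF_8);
>>>>> // Build the analysis engine from the script text and our type system.
>>>>> AnalysisEngineDescription aeDes =
>>>>>     Ruta.createAnalysisEngineDescription(text, tsd);
>>>>> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>>>
>>>>> // Fill a fresh CAS with our annotations, then run the script on it.
>>>>> CAS cas = ae.newCAS();
>>>>> converter.populateCas(sentence.getTextSentence(), cas);
>>>>> ae.process(cas);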
>>>>>
>>>>> The populateCas method is responsible for translating our annotations
>>>>> into RUTA annotations, but it doesn't set any type priority explicitly.
>>>>> We don't know much about type priorities; the RUTA references we found
>>>>> say very little about them. Are they necessary for what we need?
>>>>>
>>>>> The file that contains the above lines is available here:
>>>>>
>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>>>>> The processCAS method is available here:
>>>>>
>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>>>>> The script we are calling is available here:
>>>>>
>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>>>>>
>>>>> PS: Yes, we remembered the semicolons.
>>>>>
>>>>> Thanks for the help :)
>>>>>
>>>>>
>>>>>
>>>>> 2015-05-26 15:30 GMT-03:00 Diego Buoro <[email protected]>:
>>>>>
>>>>>> I think I wasn't clear enough, so let me be more specific.
>>>>>>
>>>>>> I have a type system in which all words have been annotated as
>>>>>> Tokens. I am calling a RUTA script from a Java class, and that script
>>>>>> has only one rule:
>>>>>> Token Token {-> Problem}
>>>>>>
>>>>>> However, with this script, no Problems are created. When I try
>>>>>> Token {-> Problem}
>>>>>>
>>>>>> I get one Problem for each Token, which is what I expected. Why can't
>>>>>> I create annotations using rules that match more than one Token?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-05-26 14:49 GMT-03:00 Diego Buoro <[email protected]>:
>>>>>>
>>>>>>> Hello guys, how are you doing?
>>>>>>>
>>>>>>> Once I have called RUTA from a Java project, how can I mark
>>>>>>> consecutive tokens as a "Problem" (the name of my annotation,
>>>>>>> in this case)?
>>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
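
For reference, the expectation described in the 2015-06-03 message (7 annotations for "cgToken", 6 for "cgToken cgToken") can be checked with a small plain-Java model. This is only a sketch under stated assumptions: it is not Ruta's matching algorithm and uses no UIMA classes. The class name AdjacencySketch, its tokenizer, and the "only whitespace between two tokens" adjacency rule are all hypothetical. The overlapping-offsets case at the end is just one imaginable way that token offsets written into a CAS could defeat a two-token rule; the thread does not establish that this is the actual cause.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of "Token Token{-> Problem}". NOT Ruta's real matcher.
final class AdjacencySketch {

    // Split text into [begin, end) spans: a run of letters/digits is one token,
    // any other non-whitespace character is a token of its own.
    static List<int[]> tokenize(String text) {
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                    i++;
                }
                spans.add(new int[] {start, i});
            } else {
                spans.add(new int[] {i, i + 1});
                i++;
            }
        }
        return spans;
    }

    // Count tokens that start at or after the end of the previous token
    // with nothing but whitespace in between.
    static int countFollowers(String text, List<int[]> spans) {
        int count = 0;
        for (int k = 1; k < spans.size(); k++) {
            int prevEnd = spans.get(k - 1)[1];
            int begin = spans.get(k)[0];
            boolean ordered = prevEnd >= 0 && begin >= prevEnd && begin <= text.length();
            if (ordered && text.substring(prevEnd, begin).trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical bad input: every token covers the whole sentence, so no
    // token starts after the previous one ends.
    static List<int[]> overlappingSpans(String text, int n) {
        List<int[]> spans = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            spans.add(new int[] {0, text.length()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Refiro-me \u00e0 trabalho remunerado.";
        List<int[]> spans = tokenize(text);
        System.out.println(spans.size());                                   // 7
        System.out.println(countFollowers(text, spans));                    // 6
        System.out.println(countFollowers(text, overlappingSpans(text, 7))); // 0
    }
}
```

Comparing the begin/end offsets of the tokens in the preprocessed xmi file with those produced inside the Java application would be one way to see whether the two inputs really differ in this respect.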
