Hi,

here are the results of my investigations:

- The text of the document is not set directly. You should add something like cas.setDocumentText(sentence.getDocumentText()); before populating the CAS in your method. Otherwise there will be a DocumentAnnotation of length 0. Ruta does not like these, and that is the source of the problem. If you add the line, or otherwise avoid zero-length annotations, the rules should work just fine.

- I'd rather use tcas.addFsToIndexes(sentenceAnn); instead of tcas.getIndexRepository().addFS(sentenceAnn); (but that shouldn't change anything)

- You access the problem type "cogroo.ruta.Base.PROBLEM", but the rules seem to use the type "Main.PROBLEM"

Best,

Peter


On 03.06.2015 at 19:14, Diego Buoro wrote:
Hi Peter,

The example we used is the short sentence in a string at the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.". Based on the Main.ruta we sent you, we expected the output to contain 7 "PROBLEM" annotations. This part is working.

The problem appears when we change the last line of Main.ruta from "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we expected 6 "PROBLEM" annotations: the same ones we had in the first example, except for the first one. That is what happens when you run the script in a simple Ruta project, but when we run it in the Java application we get 0 "PROBLEM" annotations.

We think this difference arises because in the Ruta project we don't use plain text as input. Instead, we feed it a preprocessed xmi file. In the Java application, on the other hand, we do the processing ourselves via the processCas method. It's possible that the processCas method is creating tokens in a way that prevents the Ruta script from detecting when one token is next to another.

We are sending you the xmi file to use as an example for a simple Ruta project. If there are any other examples you'd like us to send you, just say the word :D
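
To make the expected counts concrete, here is a minimal plain-Java sketch of what the two rules should produce for this sentence. It does not use UIMA or Ruta and is not how Ruta matches internally; it only models "one PROBLEM per token" versus "PROBLEM on the second token of each adjacent pair". The class name, the helper methods, and the 7-token segmentation with its offsets are assumptions made for illustration, and whitespace between tokens is ignored on the assumption that Ruta's default filtering hides it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models the expected rule results, not Ruta itself.
class ConsecutiveTokenSketch {

    /** A token span over the sentence text, covering [begin, end). */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // The example sentence; \u00e0 is the letter "à", escaped to avoid encoding issues.
    static final String TEXT = "Refiro-me \u00e0 trabalho remunerado.";

    // Assumed segmentation into 7 tokens (not taken from the real tokenizer output).
    static List<Span> tokens() {
        return Arrays.asList(
            new Span(0, 6),    // Refiro
            new Span(6, 7),    // -
            new Span(7, 9),    // me
            new Span(10, 11),  // à
            new Span(12, 20),  // trabalho
            new Span(21, 31),  // remunerado
            new Span(31, 32)); // .
    }

    // Models "cgToken{->PROBLEM};": every token yields one PROBLEM.
    static List<Span> markEach(List<Span> tokens) {
        return new ArrayList<Span>(tokens);
    }

    // Models "cgToken cgToken{->PROBLEM};": the action sits on the second
    // element, so each adjacent pair yields a PROBLEM on its second token.
    static List<Span> markSecondOfPairs(String text, List<Span> tokens) {
        List<Span> problems = new ArrayList<Span>();
        for (int i = 1; i < tokens.size(); i++) {
            Span prev = tokens.get(i - 1);
            Span cur = tokens.get(i);
            // Treat two tokens as adjacent when only whitespace separates them.
            boolean adjacent = prev.end <= cur.begin
                && text.substring(prev.end, cur.begin).trim().isEmpty();
            if (adjacent) {
                problems.add(cur);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Span> tokens = tokens();
        System.out.println("single-token rule: " + markEach(tokens).size());
        System.out.println("two-token rule:    " + markSecondOfPairs(TEXT, tokens).size());
    }
}
```

Under these assumptions the single-token rule gives 7 annotations and the two-token rule gives 6, with the first token left out, which is the behaviour seen in the plain Ruta project.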

Best,

Diego

2015-06-01 11:15 GMT-03:00 Diego Buoro:

    Sorry, please disregard my last answer. The idea wasn't to use the
    xmi; we are still working out a minimal example to provide to you.
    We will send it to you in the next few days.

    2015-06-01 10:37 GMT-03:00 Diego Buoro:

        Hi Peter, how are you doing?

        We were trying to run with files such as Crase01.xmi and
        rule_xml_001.xmi. Our goal is to get those two simpler ones
        running first, and then run with Crase.xmi.

        About the package declaration, I still need to check which
        Ruta version we are using.
        I will check this soon.

        All Best,

        Diego





        2015-05-30 0:45 GMT-03:00 Diego Buoro:

            Hi Peter!
            No problem, I appreciate your support.

            All Best,

            Diego

            2015-05-27 14:22 GMT-03:00 Diego Buoro:

                Hi Peter!
                We call the script with the following lines:

                URL url = Resources.getResource("Main.ruta");
                String text = Resources.toString(url, Charsets.UTF_8);
                AnalysisEngineDescription aeDes =
                    Ruta.createAnalysisEngineDescription(text, tsd);
                this.ae = UIMAFramework.produceAnalysisEngine(aeDes);

                CAS cas = ae.newCAS();
                converter.populateCas(sentence.getTextSentence(), cas);
                ae.process(cas);

                The populateCas method is responsible for translating
                our annotations into RUTA annotations, but it doesn't
                set any type priority explicitly.
                We don't know much about type priorities; the RUTA
                references we found say very little about them. Are
                they necessary for what we need?

                The file that contains the above lines is available here:
                https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
                The processCAS method is available here:
                https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
                The script we are calling is available here:
                https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta

                PS: Yes, we remembered the semicolons.

                Thanks for the help :)



                2015-05-26 15:30 GMT-03:00 Diego Buoro:

                    I think I wasn't clear enough, and I should be
                    more specific.

                    I have a type system in which all words have been
                    annotated as Tokens. I am calling a RUTA script
                    from a Java class, and that script has only one rule:
                    Token Token {-> Problem}

                    However, with this script, no Problems are
                    created. When I try
                    Token {-> Problem}

                    I get one problem for each Token, which is what I
                    expected. Why can't I create annotations using
                    rules with more than one word?

                    Thanks




                    2015-05-26 14:49 GMT-03:00 Diego Buoro:

                        Hello guys, how are you doing?

                        I would like to know, once I have called RUTA
                        from a Java project, how I can mark
                        consecutive tokens as a "Problem" (the name of
                        my annotation, in this case).

                        Thanks in advance!







