Re: Marking consecutive tokens with RUTA

Peter Klügl Thu, 18 Jun 2015 11:11:48 -0700

Hi,

the Ruta descriptors are built twice in your project: once in the normal phase and once during testing (don't know why). The classpaths of the two builds are different. In the second one, the imported type system can be found in the classpath, but not in the first one. In a normal mvn clean install, the descriptors are not yet built when they should be added to the jar. That's the reason why it works in Eclipse (no clean?), but not for a mvn clean install.

The simplest solution is to extend the descriptor paths so that the imported type system is found without the classpath:


<descriptorPaths>
  <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
  <descriptorPath>${basedir}/descriptor</descriptorPath>
</descriptorPaths>
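
For orientation, here is a minimal sketch of where this block might sit in the pom.xml of the project that contains the Ruta scripts. It assumes the ruta-maven-plugin 2.3.0 and its "generate" goal as discussed in this thread; the explicit phase and the placement of the configuration element are assumptions that should be checked against the plugin documentation, and all other plugin parameters are left at their defaults.

```xml
<plugin>
  <groupId>org.apache.uima</groupId>
  <artifactId>ruta-maven-plugin</artifactId>
  <version>2.3.0</version>
  <configuration>
    <!-- Extra locations in which imported descriptors are looked up,
         so that the build does not depend on the classpath. -->
    <descriptorPaths>
      <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
      <descriptorPath>${basedir}/descriptor</descriptorPath>
    </descriptorPaths>
  </configuration>
  <executions>
    <execution>
      <!-- Assumption: bind the goal explicitly; adjust or drop the phase
           if the plugin default already fits the build. -->
      <phase>process-classes</phase>
      <goals>
        <goal>generate</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```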

Btw, there are several duplicated files in the project, which potentially hide problems in the build process.


Best,

Peter

On 18.06.2015 at 17:30, Diego Buoro wrote:

Hi Peter, thanks for the support.

We are now using Java 7, but we are still facing problems. In Eclipse, we've manually set the path where the descriptor files are located (target/generated-sources/ruta/descriptor), and therefore it's working. However, when we run mvn clean install, our descriptor files are generated in cogroo-ruta/target/generated-sources/ruta/descriptor/cogroo/ruta but they aren't being copied to the .jar file. The errors are in this log file; do you have any idea why they are happening?

Here is the link to the repository: https://github.com/Fichberg/cogroo4/tree/labXP215_Will


All Best,

Diego

2015-06-17 16:20 GMT-03:00 Peter Klügl <[email protected]>:


    Hi,

    UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus,
    the maven build process has to use the correct Java version. Just
    wanted to mention it because I had this problem right away.

    The descriptors are not built because the plugin does not find any
    ruta files. The maven plugin is specified in one project while the
    ruta files are located in a different project. The problem is that
    the ruta maven plugin only collects ruta files within the basedir
    of the project -> no files built...

    In the next release, the maven plugin will get another parameter
    for specifying the input files.

    With UIMA Ruta 2.3.0, there are two options: Either you put the
    ruta files in the project with the ruta maven plugin, or you add
    the ruta maven plugin to the project pom with the ruta files.

    Best,

    Peter


    On 17.06.2015 at 18:30, Diego Buoro wrote:

        Hi, Peter! We are attempting to create the descriptors based on
        Ruta 2.3, but we're out of luck. We've added the lines from the
        link you gave us to the pom.xml file and corrected the directory
        paths to suit our project. However, when we try to run Maven
        with Ruta's "generate" goal, no files are generated in the
        folders we set. Is the goal supposed to generate the files and
        leave them in the folder, or does it do something else?

        Here is the link to our altered pom.xml. The plugin section is
        at the end of the file:
        https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml

        Thanks for the help so far. :D

        2015-06-14 9:40 GMT-03:00 Peter Klügl <[email protected]>:

            Hi,

            the descriptors are always created at compile time.

            In Ruta 2.2.1, yes, you need to create the descriptors in the
            UIMA Ruta Workbench and then copy them or make them available
            in some other way. This is especially necessary if you declare
            additional types (type system descriptor changes) or add some
            subscript (analysis engine descriptor changes).

            In Ruta 2.3.0, which was just released, there is a maven plugin
            for building the descriptors. Take a look at:
            http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
            This means that you do not need the UIMA Ruta Workbench
            projects anymore, but you can use its development support and
            descriptor building in normal maven projects.

            Best,

            Peter


            On 12.06.2015 at 21:38, Diego Buoro wrote:

                Hello Peter

                We tried your suggestions and they worked like a charm,
                thanks :D
                However, we are facing another problem: it seems that our
                application isn't creating the mainTypesystem and mainEngine
                files when we launch it. We don't know whether or not that
                is the default behavior, but for now we have to create
                these files in a separate project and then copy them to the
                application whenever we change the script, which is a bad
                solution. Do you have any suggestions?

                All Best,

                Diego

                2015-06-12 9:19 GMT-03:00 Diego Buoro <[email protected]>:

                    Hi Peter, Armin

                    Thanks for the observations you made; I hope we can
                    finally get things working here. We will try the
                    changes in the next few days and then give you
                    feedback.

                    All Best,

                    Diego



                    2015-06-03 14:14 GMT-03:00 Diego Buoro <[email protected]>:

                        Hi Peter, the example we used is the short sentence
                        inside a string at the end of UIMAChecker.java:
                        "Refiro-me à trabalho remunerado.".
                        Based on the Main.ruta we sent you, we expected the
                        output to contain 7 "PROBLEM" annotations. This part
                        is working.
                        The problem arises when we change the last line of
                        Main.ruta from "cgToken{->PROBLEM};" to
                        "cgToken cgToken{->PROBLEM};". In this case we
                        expected 6 "PROBLEM" annotations: the same ones we
                        had in the first example, except for the first one.
                        That's what happens when you run the script in a
                        simple Ruta project, but when we run it in the Java
                        application we get 0 "PROBLEM" annotations.
                        We think this difference arises because in the Ruta
                        project we don't use plain text as input. Instead,
                        we feed it a preprocessed xmi file. In the Java
                        application, on the other hand, we do the
                        processing ourselves via the processCas method. It's
                        possible that the processCas method is creating
                        tokens in a way that prevents us from detecting
                        when one is next to the other in the Ruta script.
                        We are sending you the xmi file to use as an example
                        for a simple Ruta project. If there are any other
                        examples you'd like us to send you, just say the
                        word :D

                        Best,

                        Diego

                        2015-06-01 11:15 GMT-03:00 Diego Buoro <[email protected]>:

                            Sorry, please disregard my last answer. The idea
                            wasn't to use the xmi; we are still thinking of
                            a minimal example to provide to you.
                            We will send it to you in the next few days.

                            2015-06-01 10:37 GMT-03:00 Diego Buoro <[email protected]>:

                                Hi Peter, how are you doing?

                                We were trying to run using files such as
                                Crase01.xmi and rule_xml_001.xmi.
                                Our goal is to run those two simpler ones
                                first, and then run with Crase.xmi.

                                About the package declaration, I still need
                                to check what the Ruta version is.
                                I will be checking this soon.

                                All Best,

                                Diego





                                2015-05-30 0:45 GMT-03:00 Diego Buoro <[email protected]>:

                                  Hi Peter!

                                    No problem, I appreciate your support.

                                    All Best,

                                    Diego

                                    2015-05-27 14:22 GMT-03:00 Diego Buoro <[email protected]>:

                                      Hi Peter!

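                                        We call the script with the
                                        following lines:

                                            // Load the Ruta script from the classpath (Resources and Charsets are Guava classes)
                                            URL url = Resources.getResource("Main.ruta");
                                            String text = Resources.toString(url, Charsets.UTF_8);
                                            // Build an analysis engine from the script text and our type system description (tsd)
                                            AnalysisEngineDescription aeDes =
                                                Ruta.createAnalysisEngineDescription(text, tsd);
                                            this.ae = UIMAFramework.produceAnalysisEngine(aeDes);

                                            // Fill a fresh CAS with our annotations, then run the script on it
                                            CAS cas = ae.newCAS();
                                            converter.populateCas(sentence.getTextSentence(), cas);
                                            ae.process(cas);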

                                        The populateCas method is responsible
                                        for translating our annotations into
                                        RUTA annotations, but it doesn't set
                                        any type priority explicitly.
                                        We don't know much about type
                                        priorities; the RUTA references we
                                        found say very little about them. Are
                                        they necessary for doing what we need?

                                        The file that contains the above lines
                                        is available here:
                                        https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
                                        The processCAS method is available
                                        here:
                                        https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
                                        The script we are calling is available
                                        here:
                                        https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta

                                        PS: Yes, we remembered the
                                        semicolons.

                                        Thanks for the help :)



                                        2015-05-26 15:30 GMT-03:00 Diego Buoro <[email protected]>:

                                            I think I wasn't clear enough, so
                                            I should be more specific.

                                            I have a type system in which all
                                            words have been annotated as
                                            Tokens. I am calling a RUTA script
                                            from a Java class, and that script
                                            has only one rule:
                                            Token Token {-> Problem}

                                            However, with this script, no
                                            Problems are created. When I try
                                            Token {-> Problem}

                                            I get one Problem for each Token,
                                            which is what I expected. Why
                                            can't I create annotations using
                                            rules with more than one word?

                                            Thanks




                                            2015-05-26 14:49 GMT-03:00 Diego Buoro <[email protected]>:

                                                Hello guys, how are you doing?

                                                I would like to know: once I
                                                have called RUTA from a Java
                                                project, how can I mark
                                                consecutive tokens as a
                                                "Problem" (the name of my
                                                annotation, in this case)?

                                                Thanks in advance!
