Hi Peter,

Thanks for your answer. Now I see where I went wrong in applying the script. I was already wondering why I need to specify an output location both in the script and in the run configuration. I had set both to "converted". Now that I have changed that, i.e. I use "converted" as the output for the HtmlViewWriter and set the output folder of the whole script to "output", I can indeed see the difference you mention. Everything looks as expected now.

I hope I will make further progress now.

By the way, I would love to hear about doing the HTML conversion in Java. Maybe this approach would also give me more flexibility over which HTML tags get converted to annotations?
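
To make the question concrete, here is a minimal sketch of what per-tag control could look like in plain Java. It is not the HtmlConverter implementation and does not use any UIMA API: the class HtmlTagSketch, its Span and Result types, and the keepTags parameter are hypothetical names invented for illustration. It assumes simple, well-nested markup, scans tags with a regular expression, and does not decode entities such as &amp;.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of UIMA Ruta. */
final class HtmlTagSketch {

    /** One kept element: tag name plus offsets into the plain text. */
    static final class Span {
        final String tag;
        final int begin;
        final int end;

        Span(String tag, int begin, int end) {
            this.tag = tag;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Plain text with all tags removed, plus the spans of the kept tags. */
    static final class Result {
        final String text;
        final List<Span> spans;

        Result(String text, List<Span> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    // Matches opening, closing and self-closing tags; group 1 is "/" for a closing tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    /** Strips every tag, but records spans only for tag names listed in keepTags. */
    static Result convert(String html, Set<String> keepTags) {
        StringBuilder plain = new StringBuilder();
        List<Span> spans = new ArrayList<>();
        Deque<Span> open = new ArrayDeque<>(); // kept tags that are still open
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous tag and this one.
            plain.append(html, last, m.start());
            last = m.end();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = m.group().endsWith("/>");
            if (selfClosing || !keepTags.contains(name)) {
                continue; // tag is dropped from the text and not recorded
            }
            if (!closing) {
                open.push(new Span(name, plain.length(), -1));
            } else if (!open.isEmpty() && open.peek().tag.equals(name)) {
                Span start = open.pop();
                spans.add(new Span(name, start.begin, plain.length()));
            }
        }
        plain.append(html, last, html.length());
        return new Result(plain.toString(), spans);
    }
}
```

Calling HtmlTagSketch.convert(html, keepTags) with a set containing only "b" would strip all markup but record offsets for the bold elements alone. In a real pipeline those offsets would still have to be turned into annotations on the target view, which this sketch leaves out.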

Cheers,

Mandy


On 06.02.19 at 16:42, Peter Klügl wrote:
Hi,


Does the plain vs. _InitialView problem occur in the CASes in the output
folder or in the converted folder?


"output" should contain the result of the script processing. The
_InitialView is set by the launcher, it's static and cannot be changed.

"converted" should contain additional CASes where the plain view is
copied to the _InitialView, which hasn't been set yet.


(Although I think that I have written those rules as an example some
time ago, I personally prefer to perform the HTML conversion in Java)


Best,


Peter


On 06.02.2019 at 16:18, Mandy Neumann wrote:
Hi,

after some additional digging I found this setting in the workbench
preferences where SourceDocumentInformation is used for the output
parameter. This seems to have fixed the permission issue, I get no
more exceptions.

Unfortunately, the problem with plain vs. _InitialView still persists,
which is kind of annoying. Any ideas on that? (I'd like to also make
sure that this is not causing any further problems in my planned
workflow.)

Best,

Mandy

On 06.02.19 at 15:40, Marshall Schor wrote:
hi,

I'm not an expert, but I'm guessing that there still is a permissions
issue, perhaps on a different file or directory than the one you checked.

Try having someone else take a look at your stack trace / error message,
and your file system permissions. A second pair of eyes often is helpful
(I speak from personal experience).

Cheers. -Marshall

On 2/6/2019 5:44 AM, Mandy Neumann wrote:
Hi all,

I'm just starting to get familiar with UIMA Ruta and the workbench, and
I'm having some strange issues.

I got a project from a co-worker who already prepared some scripts for me
to extend. The project has .html files in the input folder, and he already
provided a Ruta script to convert HTML markup into annotations. The script
is adapted from the Ruta manual:

ENGINE utils.HtmlAnnotator;
ENGINE utils.HtmlConverter;
ENGINE HtmlViewWriter;
TYPESYSTEM utils.HtmlTypeSystem;
TYPESYSTEM utils.SourceDocumentInformation;

Document{ -> CONFIGURE(HtmlAnnotator, "onlyContent" = true),
      EXEC(HtmlAnnotator, {TAG})};

Document{ -> CONFIGURE(HtmlConverter, "inputView" = "_InitialView",
      "outputView" = "plain", "expandOffsets" = false,
      "replaceLinebreaks" = true, "skipWhitespacs" = true,
      "linebreakReplacement" = " ", "processAll" = true),
      EXEC(HtmlConverter)};

Document{ -> CONFIGURE(HtmlViewWriter, "inputView" = "plain",
      "outputView" = "_InitialView", "output" = "../converted"),
      EXEC(HtmlViewWriter)};

On my machine and with my settings, when I run this script, my console
gets spammed with org.apache.uima.analysis_engine.AnalysisEngineProcessExceptions
caused by java.io.FileNotFoundException with the message
"../converted (Permission denied)". I checked the file permissions on this
directory, which were 775. I even chmodded to 777, but still the same issue.

In spite of all these exceptions, the output still gets generated, though.
I would be fine with it if there weren't another issue - although the
script should write the annotations into _InitialView, I need to change
the view to "plain" in the editor to get plain text with HTML annotations.
The _InitialView still shows the html markup.

I think both issues are related. Any ideas?

Cheers,

Mandy


System Info: Eclipse Oxygen.3a Release (4.7.3a), UIMA Ruta workbench 2.6.1,
OS Kubuntu 18.04

