Hi,
I am trying to use the Tika Annotator within RUTA scripts.
Here is my current script:
PACKAGE test;
ENGINE tika.MarkupAnnotator;
TYPESYSTEM tika.MarkupAnnotationTypeSystem;
DECLARE Link;
Document{-> EXEC(MarkupAnnotator, {MarkupAnnotation})};
MarkupAnnotation { FEATURE("name", "a") -> MARK(Link) };
And here is the file I am trying to annotate:
<html>
<body>
<ul>
<li><a href="#">Link 1</a></li>
<li><a href="#">Link 2</a></li>
</ul>
<p><a href="#">Link 3</a></li></p>
</body>
</html>
After executing the script, I noticed two problems:
- The annotation browser view does not display the "MarkupAnnotation"
tags
- The "MarkupAnnotation" rule does not match, so no "Link"
tag is present in the output
(When I open the output file in a text editor, the "MarkupAnnotation" tags
are present.)
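The text-editor check above could also be scripted. The following is only a minimal sketch, assuming the output file is XML (for example XMI). The sample document, its namespace URI, its attributes, and the helper name count_elements are made up for illustration and are not taken from the actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the output file; the namespace URI and the
# attributes are assumptions, not copied from a real Tika/Ruta run.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmlns:tika="http:///example/tika.ecore">
  <tika:MarkupAnnotation xmi:id="1" begin="0" end="6" name="a"/>
  <tika:MarkupAnnotation xmi:id="2" begin="7" end="13" name="a"/>
  <tika:Other xmi:id="3"/>
</xmi:XMI>"""


def count_elements(xml_text: str, local_name: str) -> int:
    """Count XML elements whose tag, ignoring the namespace, equals local_name."""
    root = ET.fromstring(xml_text)
    count = 0
    for elem in root.iter():
        tag = elem.tag
        # ElementTree represents namespaced tags as "{uri}local".
        if isinstance(tag, str) and tag.rsplit("}", 1)[-1] == local_name:
            count += 1
    return count


if __name__ == "__main__":
    print(count_elements(SAMPLE, "MarkupAnnotation"))
```

For a real file, the text would be read from disk and passed to count_elements in place of SAMPLE.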
Therefore, I have two questions (they may have the same answer):
- How can I visualize the annotations created by the external engine?
- How can I get the "MarkupAnnotation" rule to match within my script?
Thank you for your help.
Best regards,
Fouad
FYI: I put the MarkupAnnotation.xml and MarkupAnnotationTypeSystem.xml files in
my RUTA project and referenced the Tika libraries in my run configuration. I
am using the HEAD version of the Tika Annotator.