Hi Tong,
I added a simple (trivial) example xmiCAS with a type system to the CEV
file package on SourceForge. The text is in German, but I think you can
at least test the CEV functionality. The content is fake anyway.
Peter
Peter Klügl wrote:
Hi Tong,
When processing input files that contain HTML tags, most annotators will
"clean up" the HTML tags before doing any further processing. As a result,
the xmiCAS doesn't contain the original HTML text anymore.
Ah ok. Visual and layout information is quite important for my
extraction tasks. My rule language has the capability to dynamically
filter all kinds and combinations of markup and annotation types.
Therefore the original HTML text stays the main artifact in the xmiCAS
even if the tags contain no valuable information. I also plan to integrate
"external" annotators, with restrictions, in that manner.
I think the most useful feature of your plug-in is its capability to allow
users to edit the xmiCAS in the browser window, similar to editing the HTML
page with an HTML editor (please correct me if I am wrong).
I am not sure if I understand you. The structure or text of the HTML
cannot be modified by the CEV plugin (the rule language does such
things). I think the only real advantage over the CAS Viewer and the CAS
Editor is that the CEV can display annotations of an HTML artifact in
some kind of browser, and the user can create new annotations in this
browser. It is really painful to review or edit annotations in the
HTML source. There is probably no reason (except maybe the extension
point) to use the CEV plugin instead of the CAS Viewer if you are just
processing plain text.
Having some xmiCAS samples will help us to understand the plug-in's
capability.
Yes, I will provide a simple example next week.
Have a nice weekend!
Peter
--
Peter Klügl
University of Würzburg