I believe it should be extended, since I think a RUTA user would expect the 
MARKUP annotation to capture at least XML and HTML markup properly. The 
examples are from a PubMed Central XML file that follows the NISO JATS 
specification, so I assume it is properly formatted XML, although I do not 
know all the details of the spec.
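
For illustration, an extended pattern could look roughly like the sketch 
below. It is not the actual markupPattern from DefaultSeeder.java: the class 
name, the method name, and the pattern itself are hypothetical. It assumes 
that attribute names may contain dashes and that typographic quotes should be 
tolerated as value delimiters, as in the examples from the XML file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual markupPattern from DefaultSeeder.java.
public class MarkupPatternSketch {

    // Tag and attribute names: a letter or underscore, then word characters,
    // colons, dots, or dashes (so "ref-type" and "sec-type" are accepted).
    private static final String NAME = "[A-Za-z_][\\w:.\\-]*";

    // Double-quoted value. Straight and typographic double quotes are both
    // accepted as delimiters (an assumption made to cover the examples).
    private static final String DQ_VALUE =
            "[\"\\u201C\\u201D][^\"\\u201C\\u201D<>]*[\"\\u201C\\u201D]";

    // Single-quoted value, with the same tolerance for typographic quotes.
    private static final String SQ_VALUE =
            "['\\u2018\\u2019][^'\\u2018\\u2019<>]*['\\u2018\\u2019]";

    // One attribute: whitespace, name, '=', quoted value.
    private static final String ATTRIBUTE =
            "\\s+" + NAME + "\\s*=\\s*(?:" + DQ_VALUE + "|" + SQ_VALUE + ")";

    // Opening, closing, or self-closing tag with any number of attributes.
    public static final Pattern MARKUP =
            Pattern.compile("</?" + NAME + "(?:" + ATTRIBUTE + ")*\\s*/?>");

    // Returns every substring of the text that the sketch pattern matches.
    public static List<String> findMarkup(String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = MARKUP.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "<sec sec-type=\"methods\u201D>See "
                + "<xref ref-type=\"bibr\" rid=\"b35-ehp0113-000220\u201D>35</xref>";
        for (String hit : findMarkup(sample)) {
            System.out.println(hit);
        }
    }
}
```

The sketch does not handle comments, CDATA sections, or processing 
instructions, so it is only meant to show the two cases discussed here.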

We have managed to implement a crude workaround for now, but please let us 
know when an improved version becomes available.

Cheers
Mario

> On 20 Oct 2015, at 17:56 , Peter Klügl <[email protected]> wrote:
> 
> Hi Mario,
> 
> yes, and the different quotes also cause problems (are these valid?).
> 
> The MARKUP annotation is not created by JFlex like the other annotations,
> but by a postprocessing step using a regular expression. This expression
> does not cover these cases (markupPattern in DefaultSeeder.java).
> 
> Should we extend it?
> 
> Best,
> 
> Peter
> 
> Am 20.10.2015 um 17:26 schrieb Mario Gazzo:
>> Hi Peter,
>> 
>> RUTA doesn’t seem to capture some XML markup with attributes. Here are some 
>> examples:
>> 
>> <xref ref-type="bibr" rid="b35-ehp0113-000220”>
>> <sec sec-type="methods”>
>> 
>> The above markup examples are completely missing from the TokenSeed 
>> annotations. I wonder whether this is related to the dash in the attribute 
>> names, since other markup without dashes appears to be captured.
>> 
>> Can you confirm that the dash could cause the problem?
>> 
>> Cheers
>> Mario
> 
