<anytag/> is well-formed in schema-less XML (as long as the tag name 
complies with http://www.w3.org/TR/REC-xml/#NT-Name)

 

IMHO Moses input (with the -xml-input option) should stay schema-less, or we 
should define a schema. Right now I can't see a pressing reason to define a 
schema.

 

In any case it would be good to parse the input (with the -xml-input option) 
with a proper XML parser, e.g.

http://www.boost.org/doc/libs/1_54_0/doc/html/boost_propertytree/parsers.html#boost_propertytree.parsers.xml_parser
 

There are probably better XML parsers, but Moses already requires Boost. Using 
an XML parser could also solve some of the character escaping uncertainty.
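
To illustrate the point about escaping, here is a minimal sketch of parser-based handling of one input line. It uses Python's standard xml.etree.ElementTree as a stand-in for the Boost parser linked above, only to keep the example short. The helper name parse_markup and the synthetic <s> wrapper element are assumptions made for this sketch, not Moses code.

```python
import xml.etree.ElementTree as ET


def parse_markup(line):
    """Split one input line into (tag, attributes, text) spans.

    Plain text outside any markup is returned with tag None.
    Only one level of markup is handled in this sketch.
    """
    # An input line is an XML fragment, not a document, so wrap it
    # in a synthetic root element (hypothetical name "s") before parsing.
    root = ET.fromstring("<s>" + line + "</s>")
    spans = []
    if root.text and root.text.strip():
        spans.append((None, {}, root.text.strip()))
    for child in root:
        # Any well-formed tag name is accepted: no schema is involved.
        spans.append((child.tag, dict(child.attrib), (child.text or "").strip()))
        if child.tail and child.tail.strip():
            spans.append((None, {}, child.tail.strip()))
    return spans


# The parser decodes &quot; and &amp; itself, so no hand-written unescaping is needed.
line = 'das ist <np translation="a &quot;small&quot; house">ein kleines haus</np> &amp; mehr'
spans = parse_markup(line)
for span in spans:
    print(span)
```

A line containing a bare "&" or an unclosed tag makes ET.fromstring raise ET.ParseError, so malformed input is rejected explicitly rather than being interpreted in some ad hoc way.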

 

Achim 

 

From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Tuesday, October 15, 2013 10:25 PM
To: [email protected]
Subject: Re: [Moses-support] Placeholders

 

A change from <anytag/> will no doubt disrupt existing pipelines. Communicating 
the change with the new release will be a great help.

 

On 2013-10-15 01:35, Hieu Hoang wrote:

They're good ideas. I'll have a think about them if I get round to doing it. 

I would also want to minimise the work I have to do, and minimise the disruption 
to people's existing pipelines.

 

On 15 October 2013 01:33, Tom Hoar <[email protected]> wrote:

I agree that <anytag/> could cause problems, especially with the growing
list of reserved tag names (ne, wall, zone). I wholeheartedly support a
fixed tag, but I'm not sure "option" is it. What about <np/> (already in
the manual) or <xml-markup/> or <xml-input/> or <moses/>?

Here's another idea. The -xml-input flag supports the values "exclusive",
"inclusive", "ignore" and "pass-through". What about changing it to a
boolean flag and using the old values as the XML tags: <exclusive/>,
<inclusive/> and <ignore/>? That way a single invocation of Moses would
support all modes on a per-sentence basis. Just a thought. I think this
would also be easier if you dropped the "pass-through" option, since there
would be no need for backwards compatibility.
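
A rough sketch of how per-sentence mode tags might be read, assuming the proposal above were adopted. Nothing here reflects current Moses behaviour: the tag set MODES, the helper modes_in_line and the synthetic <s> wrapper are all hypothetical, and Python's standard XML parser is used only to keep the example short.

```python
import xml.etree.ElementTree as ET

# Proposed tag names taken from the existing -xml-input values
# ("pass-through" dropped, as suggested). Hypothetical, not implemented in Moses.
MODES = {"exclusive", "inclusive", "ignore"}


def modes_in_line(line):
    """Return (mode, translation, source_text) for each marked-up span."""
    # Wrap the fragment in a synthetic root so it parses as one document.
    root = ET.fromstring("<s>" + line + "</s>")
    found = []
    for child in root:
        if child.tag not in MODES:
            # With a fixed tag set, unknown tags can be rejected outright.
            raise ValueError("unknown markup tag: " + child.tag)
        found.append((child.tag, child.get("translation"), child.text or ""))
    return found


example = ('<exclusive translation="house">haus</exclusive> und '
           '<ignore translation="cat">katze</ignore>')
result = modes_in_line(example)
print(result)
```

With a fixed set of tags like this, two spans in the same sentence can carry different modes, and a misspelled tag is reported instead of being silently treated as markup.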

Another idea, on a slightly different subject: Moses'
-monotone-at-punctuation flag would be more useful if we could
define/override the punctuation & symbols that we want it to use. I'm not
sure how best to accomplish this.

Tom




On 10/15/2013 04:07 AM, Hieu Hoang wrote:
> In fact, we're thinking of changing <anytag/> to something fixed, like
> <option/>
>
> The <anytag/> behaviour isn't good XML and will cause problems in the
> future
>
> Any opinions on this gratefully received
>

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support




-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

 

