Hi,
I came across some bad space handling in apertium-postchunk, notably: if
there were two spaces in a row, they would be treated as separate
blanks, so that if you had
^chunk{^word<tag>$  ^word<tag>$^word<tag>$}
and you tried outputting
chunk pos="1"
b pos="1"
chunk pos="2"
b pos="2"
chunk pos="3"
it would become
^chunk{^word<tag>$ ^word<tag>$ ^word<tag>$}
Also, escaped characters and non-alphabetics (stuff like \^ or " that
occur between words) were not output.
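To make the intended behaviour concrete, here is a minimal Python sketch (not the actual apertium-postchunk code, and not the patch itself; the names split_chunk and render are made up for illustration). It assumes that everything between two words inside the chunk counts as one blank, so that a run of spaces stays together and escaped or non-alphabetic characters between words are kept verbatim.

```python
def split_chunk(content):
    """Split the text inside ^chunk{...} into words and blanks.

    Returns (words, blanks), where blanks[i] is everything found between
    words[i] and words[i + 1], kept verbatim. Text before the first word
    or after the last word is ignored in this sketch.
    """
    words, blanks = [], []
    current = []      # pieces of the word or blank currently being read
    in_word = False
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and i + 1 < len(content):
            # Keep an escaped character together with its backslash.
            current.append(content[i:i + 2])
            i += 2
            continue
        if ch == "^" and not in_word:
            # A word starts: whatever was collected so far is ONE blank,
            # however many spaces or other characters it contains.
            if words:
                blanks.append("".join(current))
            current = []
            in_word = True
        elif ch == "$" and in_word:
            words.append("".join(current))
            current = []
            in_word = False
        else:
            current.append(ch)
        i += 1
    return words, blanks


def render(words, blanks):
    """Output word 1, blank 1, word 2, blank 2, ... inside a chunk."""
    out = []
    for i, word in enumerate(words):
        out.append("^" + word + "$")
        if i < len(blanks):
            out.append(blanks[i])
    return "^chunk{" + "".join(out) + "}"
```

With this split, a chunk holding two spaces between the first and second word and nothing between the second and third gives blank 1 as the two spaces and blank 2 as the empty string, so outputting the words and blanks in order reproduces the input spacing instead of spreading one space into each gap.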
I added a patch to
http://bugs.apertium.org/cgi-bin/bugzilla/show_bug.cgi?id=89, where part
of the problem had already been reported. I'd be happy if someone could
test whether it works and whether it can be committed.
On a related note, Gabriel Gregori Manzano's vm-for-transfer-cpp already
handles double spaces correctly, but doesn't handle escaped chars yet
(https://github.com/ggm/vm-for-transfer-cpp/issues/9). Although there
are still some issues with it, I'd recommend everyone who's working on
transfer to try apertium-transfervm-compiler; it can provide a lot of
helpful feedback (like if you've declared the wrong number of parameters
to a macro …).
best regards,
Kevin Brubeck Unhammer
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff