Kevin Brubeck Unhammer <[email protected]> writes:

> There is one caveat: <g/> (group element) is not handled yet. The man
> page notes how to work around that. I have an idea for handling the
> group element[1], but I'm not sure when I'll get to try it out.

With the latest version on
https://github.com/unhammer/lttoolbox/tree/df-intersection, <g> elements
(shown as # in the stream) are now handled.

Apart from the hand-written tests, I've compared the full bokmål
transducer with one trimmed with nb-nn.autobil.bin, and the only
differences are actual testvoc bugs. So it seems to be working very well
in my testing, but I hope more people can check it out and see if they
find anything that isn't handled correctly. 
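
A minimal sketch of that kind of comparison, assuming both transducers
have already been expanded to one surface:analysis entry per line. The
function name and the sample entries are made up for illustration and
are not taken from the actual run:

```python
def missing_after_trim(full_entries, trimmed_entries):
    """Return entries found in the full expansion but not in the trimmed one.

    Both arguments are iterables of lines; blank lines are ignored and
    duplicates are reported once, in the order they first appear.
    """
    kept = {line.strip() for line in trimmed_entries if line.strip()}
    seen = set()
    missing = []
    for line in full_entries:
        entry = line.strip()
        if entry and entry not in kept and entry not in seen:
            seen.add(entry)
            missing.append(entry)
    return missing


# Hypothetical sample data, only to show the call.
full = ["foo:foo<n><sg>", "bar:bar<n><sg>"]
trimmed = ["foo:foo<n><sg>"]
for entry in missing_after_trim(full, trimmed):
    print(entry)
```

Every entry it reports is either a genuine testvoc gap in the bilingual
dictionary or something the trimming dropped by mistake, so each one
still has to be checked by hand.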

Regarding multiwords with both <j/> and <g> at once, I'm not even sure
what those should look like. But when I tested with ca-en they seemed to
be included, except for these inf+es entries, which keep getting trimmed
out:

canviar-se d':canviar<vblex><inf>+es<prn><enc><ref><p3><mf><sp># de   <
canviar-se dʼ:canviar<vblex><inf>+es<prn><enc><ref><p3><mf><sp># de   <
canviar-se d’:canviar<vblex><inf>+es<prn><enc><ref><p3><mf><sp># de   <

I don't know if those are real testvoc bugs or bugs in lt-trim, anyone?


-- 
Kevin Brubeck Unhammer

GPG: 0x766AC60C


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
