On 2014-06-11 16:01, Kevin Brubeck Unhammer wrote:
> Francis Tyers <[email protected]> writes:
> 
>> On Tue, 25 Mar 2014 at 12:17 +0000, Jim O'Regan wrote:
> 
> [...]
> 
>>> Also, I have a tiny feature that allows the user to specify a set of
>>> characters to be ignored at runtime (motivated primarily by soft
>>> hyphens, but I've left it general[1]). I sent the patch to Sergio to
>>> review, but I'd really rather get it in now than wait n years until
>>> the next release :)
>>> 
>>> For the curious, I've attached the patch.
>>> 
>>> Current behaviour is:
>>> $ echo test­ing |lttoolbox/lt-proc  
>>> ~/Apertium/apertium-en-es/en-es.automorf.bin
>>> ^test/test<n><sg>/test<vblex><inf>/test<vblex><pres>$­^ing/*ing
>>> 
>>> Using this as soft-hyphen.icx:
>>> 
>>> <?xml version="1.0"?>
>>> <ignored-chars>
>>>   <char value="&#173;"/>
>>> </ignored-chars>
>>> 
>>> echo test­ing |lttoolbox/lt-proc -i soft-hyphen.icx
>>> ~/Apertium/apertium-en-es/en-es.automorf.bin
>>> ^testing/test<vblex><ger>/test<vblex><pprs>/test<vblex><subs>/testing<n><sg>$
>> 
>> Could this just be included as the default? I mean, are there any cases in
>> which we would not want to skip a soft hyphen?
> 
> So having an icx on the command line is nice for developers, and people
> who use lt-proc for non-Apertium things. But it would require changing
> modes files for any pairs that want to take advantage of it … I think
> maybe a hardcoded ignore-list in lttoolbox would be more helpful to more
> users. Are there other use-cases than soft-hyphens? Or cases where we
> want to _not_ ignore the soft-hyphen?
> 
> (Tino Didriksen noted some other possibly skippable stuff:
> http://www.fileformat.info/info/unicode/category/Cf/list.htm )

I would say the first version should just skip the soft hyphen. We can put
out minor releases if something else comes up.
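
To make the idea concrete, here is a rough sketch of a hardcoded ignore
list, written in Python purely for illustration. It is not Jim's patch and
not how lttoolbox (which is C++) does it: the names strip_ignored and
is_format_char are hypothetical, and it strips the characters before
analysis rather than skipping them inside the analyser. The second helper
only shows how the Cf ("format") category from Tino's link could be
checked if the list ever needs to grow.

```python
import unicodedata

# Hardcoded default: only the soft hyphen (U+00AD), as suggested above.
SOFT_HYPHEN = "\u00ad"
DEFAULT_IGNORED = frozenset({SOFT_HYPHEN})


def strip_ignored(text, ignored=DEFAULT_IGNORED):
    """Return text with every character in `ignored` removed."""
    return "".join(ch for ch in text if ch not in ignored)


def is_format_char(ch):
    """True if ch is in Unicode general category Cf (format characters)."""
    return unicodedata.category(ch) == "Cf"
```

With this, strip_ignored("test\u00ading") gives back "testing", which is
the form the analyser would then see.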

F.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff