On Fri, Feb 28, 2014 at 6:28 PM, Jimmy O'Regan <[email protected]> wrote:
> On 28 February 2014 18:21, Alex Aruj <[email protected]> wrote:
>> Hi group,
>>
...
>> Is the priority to make the charlifter case-sensitive and for it to respect
>> superblanks exactly as in the example in the box laid out here
>> http://wiki.apertium.org/wiki/Superblanks?
>>
>
> Respecting superblanks is a must: diacritic restoration must not be
> applied to them.
>
> Case should definitely be _respected_: the output needs to match the
> input in terms of case.
>
> As for case sensitivity, Kevin Scannell is the person to ask for a
> definitive answer.  My feeling is that case sensitivity can
> potentially be more accurate, but in the absence of sufficient data,
> case insensitive (trained on lowercase) should be the default.
>

This is spot on.  You'll usually do better with case-sensitive
models (e.g. for Jimmy: Irish "Éire" vs. "eire") unless there is very
limited training data.

For individual cases, you can always try both and see which performs better.
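
To make the two requirements above concrete (leave superblanks alone, and
make the output case match the input), here is a minimal sketch. It is not
charlifter's actual code. It assumes superblanks are the square-bracketed
spans of the Apertium stream format with "\" as the escape character, and
it assumes a model trained on lowercase. The names restore_text, match_case
and toy_restore and the one-word lookup table are hypothetical stand-ins.

```python
import re

# Assumption: a superblank is an unescaped "[...]" span, with "\" escaping
# characters inside it. The capturing group makes re.split keep the spans.
SUPERBLANK = re.compile(r"((?<!\\)\[(?:\\.|[^\]\\])*\])")
WORD = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def match_case(source, restored):
    """Copy the upper/lower pattern of `source` onto `restored`."""
    if len(source) != len(restored):
        return restored  # cannot align character by character; leave as is
    out = []
    for s, r in zip(source, restored):
        if s.isupper():
            out.append(r.upper())
        elif s.islower():
            out.append(r.lower())
        else:
            out.append(r)
    return "".join(out)


def restore_text(text, restore_word):
    """Apply `restore_word` to words outside superblanks only.

    `restore_word` is assumed to take and return a lowercase word
    (a case-insensitive model); the input's case is reapplied afterwards.
    """
    parts = SUPERBLANK.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(part)  # odd indices are superblanks: copy verbatim
        else:
            out.append(WORD.sub(
                lambda m: match_case(m.group(0), restore_word(m.group(0).lower())),
                part))
    return "".join(out)


# Toy stand-in for a trained model (hypothetical, one entry only).
TOY = {"cafe": "café"}


def toy_restore(word):
    return TOY.get(word, word)


print(restore_text("[<i>cafe</i>] Cafe CAFE cafe", toy_restore))
```

A case-sensitive model would skip the lowercasing step and be handed the
word as typed; the superblank handling stays the same either way.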

Kevin

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff