Bug#809623: RFS: telegram-purple/1.2.3-1

Jakub Wilk Sun, 03 Jan 2016 15:04:08 -0800

* Ben Wiederhake <benwiederh...@gmx.de>, 2016-01-02, 23:13:

The Russian PO file reads:
Plural-Forms: nplurals=4; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2&& n%10<=4 && (n%100<12 || n%100>14) ? 1 : n%10==0 || (n%10>=5 &&n%10<=9) || (n%100>=11 && n%100<=14)? 2 : 3);

[...]

Even though I don't speak Russian, I can tell that this Plural-Forms can't possibly be correct. Here 4 plural forms are declared, but the expression never evaluates to 3.
Since it's just modular arithmetic, one can just parse the formula to fill out a 10x10 table as a "proof". I did that, and came to the same result as you do, without even looking at your program. (Originally I assumed a precedence error / parsing issue / whatever, so I didn't want to start reading C code ... sorry.)
For the record, here's my interpretation, with parentheses added:

((n%10==1 && n%100!=11)
? 0
: ((n%10>=2 && n%10<=4 && (n%100<12 || n%100>14))
   ? 1
   : ((n%10==0 || (n%10>=5 && n%10<=9) || (n%100>=11 && n%100<=14))
      ? 2
      : 3)))
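
The 10x10 table check can be reproduced mechanically. The following is a minimal sketch, not the program discussed in this thread; it assumes Python and relies on `gettext.c2py`, an undocumented helper in CPython's standard library that compiles a C plural expression into a Python function. The names `RUSSIAN_PLURAL`, `table`, `image` and `unreachable` are made up for illustration.

```python
import gettext

# The expression from the Russian PO file quoted above, without the trailing ';'.
RUSSIAN_PLURAL = (
    "(n%10==1 && n%100!=11 ? 0 : "
    "n%10>=2&& n%10<=4 && (n%100<12 || n%100>14) ? 1 : "
    "n%10==0 || (n%10>=5 &&n%10<=9) || (n%100>=11 && n%100<=14)? 2 : 3)"
)
NPLURALS = 4  # as declared in the header

# gettext.c2py turns the C expression into a Python function of n.
plural = gettext.c2py(RUSSIAN_PLURAL)

# n only appears as n%10 and n%100, so 0..99 covers every case.
# Row = tens digit, column = ones digit.
table = [[plural(10 * tens + ones) for ones in range(10)] for tens in range(10)]
image = {form for row in table for form in row}
unreachable = set(range(NPLURALS)) - image

for tens, row in enumerate(table):
    print(tens, " ".join(str(form) for form in row))
print("forms never selected:", sorted(unreachable))
```

Every cell of the table is 0, 1 or 2, so form 3 is the only entry left in `unreachable`.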
This rule can be written in regex as follows (note that there is an implicit "and not any of the above", although it doesn't make a difference):
- "[023456789]1" => "Transifex one"
- "[023456789][234]" => "Transifex few"
- "1.|.[056789]" => "Transifex many"
- else => "Transifex other"
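
One way to double-check the regex rendering against the formula is to compare them on every two-digit residue. This is only a sketch under stated assumptions: the patterns are applied with `re.fullmatch` to `n % 100` written as two digits, the rules are tried in the listed order, and `plural`, `RULES` and `plural_by_regex` are names invented for the example.

```python
import re

def plural(n):
    """Hand translation of the parenthesised expression above."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 12 or n % 100 > 14):
        return 1
    if n % 10 == 0 or 5 <= n % 10 <= 9 or 11 <= n % 100 <= 14:
        return 2
    return 3

# (pattern, form index) in the order of the list: one, few, many.
RULES = [
    (r"[023456789]1", 0),
    (r"[023456789][234]", 1),
    (r"1.|.[056789]", 2),
]

def plural_by_regex(n):
    """Pick the first rule matching the last two digits of n; else 'other'."""
    digits = f"{n % 100:02d}"
    for pattern, form in RULES:
        if re.fullmatch(pattern, digits):
            return form
    return 3

# Numbers where the two formulations disagree, and residues reaching 'other'.
mismatches = [n for n in range(200) if plural(n) != plural_by_regex(n)]
fallthrough = [n for n in range(100) if plural_by_regex(n) == 3]
print("mismatches:", mismatches)
print("residues reaching 'other':", fallthrough)
```

Both lists come out empty: the regexes agree with the formula, and the "else" branch is never taken.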

Hmm, these "one"-"few"-"many"-"other" reminded me of CLDR. And indeed, if you look at CLDR's plurals table[0], there's a 4th form applicable to floating-point numbers. My hypothesis is that this Plural-Forms is a result of a botched automatic conversion from CLDR data.

[0] http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html#ru

Now, it would be cool if i18nspector explained better what is wrong here. [snip] I hope to implement this in the future.
Sounds awesome! However, I was still able to understand that *something* about the expression was fishy, but didn't understand that i18nspector is able to detect issues like this.

i18nspector has a small database of "correct" Plural-Forms for the most "popular" languages (including Russian). So all it knew was that your Plural-Forms was different from what the rest of the world uses.

(It does have other Plural-Forms correctness checks that don't require any linguistic data, but they didn't trigger in this case.)

(Doesn't that essentially require a SAT-solver?)

Theoretically, yes, checking that a plural expression never evaluates to a certain value is NP-hard.

But in practice almost all real-world Plural-Forms are structured similarly to what we saw in this thread, making them easy to analyse. So I intend to implement something like this:


1) Try to prove that f(i + 100) == f(i) for all i > 100.

2) If we're able to prove it, then we know that the image of f is equal to {f(0), f(1), ..., f(199), f(200)}.

3) Otherwise, assume that the Plural-Forms is okay. (Alternatively: assume that the Plural-Forms is so unusual that it's almost certainly broken.)
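
The three steps above could be sketched roughly as follows. This is not i18nspector's code, only an illustration under loud assumptions: the "proof" in step 1 is replaced by a deliberately conservative syntactic test (every `n` must appear as `n % K` with K dividing 100, and must not follow an operator that binds tighter than `%`), and evaluation again uses the undocumented `gettext.c2py` helper from CPython. The names `has_period_100` and `unreachable_forms` are hypothetical.

```python
import gettext
import re

TOKEN = re.compile(r"\d+|n|&&|\|\||[<>=!]=|[-+*/%?:()<>!]")
# Tokens after which "n % K" is certain to be evaluated as a unit.
SAFE_BEFORE_N = {None, "(", "&&", "||", "?", ":", "==", "!=", "<", ">", "<=", ">="}

def has_period_100(expr):
    """Step 1: sufficient (not necessary) test that f(i + 100) == f(i)."""
    compact = re.sub(r"\s+", "", expr)
    tokens = TOKEN.findall(compact)
    if "".join(tokens) != compact:
        return False  # contains something the tokenizer does not know
    for i, tok in enumerate(tokens):
        if tok != "n":
            continue
        prev = tokens[i - 1] if i > 0 else None
        following = tokens[i + 1:i + 3]
        if prev not in SAFE_BEFORE_N or len(following) < 2:
            return False
        op, modulus = following
        if op != "%" or not modulus.isdigit() or int(modulus) == 0:
            return False
        if 100 % int(modulus) != 0:
            return False
    return True

def unreachable_forms(expr, nplurals):
    """Return the declared forms that are never selected, or None if unknown."""
    if not has_period_100(expr):
        return None  # step 3: no verdict
    f = gettext.c2py(expr)
    image = {f(i) for i in range(201)}  # step 2: f(0) .. f(200)
    return set(range(nplurals)) - image

broken = ("(n%10==1 && n%100!=11 ? 0 : "
          "n%10>=2&& n%10<=4 && (n%100<12 || n%100>14) ? 1 : "
          "n%10==0 || (n%10>=5 &&n%10<=9) || (n%100>=11 && n%100<=14)? 2 : 3)")
print(unreachable_forms(broken, 4))
print(unreachable_forms("n != 1", 2))
```

For the expression from this thread the sketch reports form 3 as unreachable. For something like `n != 1` it returns None, because the syntactic test cannot establish periodicity there; a real implementation would need a smarter step 1.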


--
Jakub Wilk
