I wrote some code for Ukrainian that ignores the decimal separator "," within numbers when tokenizing. I haven't addressed the number group separator "." yet (it looks like that will require a change to the srx file), but "." is not widely used as a group separator, so I didn't consider it important. Still, it would be nice if this were handled at a common level, taking the locale of the language into account.
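To make the idea concrete, here is a minimal sketch of locale-aware number tokenization. It is not the actual LanguageTool code (which is Java); the names `make_number_pattern` and `tokenize` are hypothetical, and the separators are passed in by hand rather than looked up from a real locale database. It only illustrates keeping a number together when the decimal or group separator appears between digits.

```python
import re


def make_number_pattern(decimal_sep: str, group_sep: str) -> str:
    """Build a regex fragment matching one number for the given separators.

    Hypothetical helper: the separators are supplied by the caller, not
    read from any locale data.
    """
    d = re.escape(decimal_sep)
    g = re.escape(group_sep)
    # First alternative: digits grouped in threes (e.g. 1.234 or 1,000),
    # with an optional decimal part.
    # Second alternative: plain digits with an optional decimal part.
    return rf"\d{{1,3}}(?:{g}\d{{3}})+(?:{d}\d+)?|\d+(?:{d}\d+)?"


def tokenize(text: str, decimal_sep: str = ",", group_sep: str = ".") -> list:
    """Split text into tokens, keeping each number as a single token."""
    number = make_number_pattern(decimal_sep, group_sep)
    # Try the number pattern first, then words, then single punctuation marks.
    # Only non-capturing groups are used, so findall returns whole matches.
    token_re = re.compile(rf"{number}|\w+|[^\w\s]")
    return token_re.findall(text)


# Ukrainian-style separators: "," is decimal, "." is the group separator.
print(tokenize("Ціна 1.234,56 грн.", decimal_sep=",", group_sep="."))
# English-style separators: "." is decimal, "," is the group separator.
print(tokenize("It costs 1,000.00 now", decimal_sep=".", group_sep=","))
```

A separator is only absorbed into the number when digits follow it, so a comma or full stop that ends a clause ("in 2014, we ...") still comes out as its own token.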
Andriy

2014-09-24 8:03 GMT-04:00 R.J. Baars <r.j.ba...@xs4all.nl>:
> Numbers like 1.234 or 1,000.00 are tokenized into several tokens, while it
> is one number.
>
> What do you think about changing the tokenizer to treat them as one
> number? This would maybe affect all languages having rules concerning
> numbers, so this is not the right time, but maybe after releasing 2.7?
>
> Ruud

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel