Thanks Jaume,

I looked at your approach and it was pretty straightforward (if a bit hackish), so I used it for Ukrainian. We're probably too late to develop anything common for 2.2 anyway.
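For anyone else hitting this, the idea can be sketched as a post-processing pass that re-joins a digit token, a comma, and another digit token into one token. This is only a minimal illustration under assumptions: the class and method names (`DecimalCommaJoiner`, `joinDecimals`) are hypothetical, it is not the actual Catalan or Ukrainian tokenizer code, and it assumes the tokenizer returns whitespace as separate tokens, so that a list like "1, 2" is left alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of LanguageTool's API.
class DecimalCommaJoiner {

    // Merges the token triple <digits> "," <digits> into a single token such as "2,2".
    // Assumes whitespace arrives as its own token, so "1", ",", " ", "2" is not merged.
    static List<String> joinDecimals(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (i + 2 < tokens.size()
                    && isDigits(tokens.get(i))
                    && tokens.get(i + 1).equals(",")
                    && isDigits(tokens.get(i + 2))) {
                out.add(tokens.get(i) + "," + tokens.get(i + 2));
                i += 3; // skip the three tokens that were just merged
            } else {
                out.add(tokens.get(i));
                i++;
            }
        }
        return out;
    }

    // True if the string is non-empty and consists only of digit characters.
    private static boolean isDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int k = 0; k < s.length(); k++) {
            if (!Character.isDigit(s.charAt(k))) {
                return false;
            }
        }
        return true;
    }
}
```

Note that a comma-separated run written without spaces, such as "1,2,3", would still be merged greedily into "1,2", ",", "3", so a real implementation would need an extra rule for that case.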
Andriy

2013/6/13 Jaume Ortolà i Font <[email protected]>:
>
> 2013/6/12 Andriy Rysin <[email protected]>
>>
>> I noticed that numbers with fractions like 2,2 are split into '2',
>> ',', '2' by the word tokenizer. In Ukrainian I need to require a different
>> case for the following noun based on whether it's a whole number or
>> fractional, so I was planning to adjust the Ukrainian word tokenizer. But I
>> think most European languages use a comma for fractional numbers, so I
>> was wondering if somebody already has a solution or if this had better be
>> done in common code.
>>
>
> Hi Andriy,
>
> This and other similar things are done in the Catalan word tokenizer. It is
> a bit hackish. To make the code more elegant and more general, we could
> perhaps do something like the SRX segmentation at the word level... Just
> an idea. I'm not sure if it is reasonable.
>
> Regards,
> Jaume
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Languagetool-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
