Thanks, Jaume.

I looked at your approach and it was pretty straightforward (if a bit
hackish) so I used it for Ukrainian. We're probably too late to
develop anything common for 2.2 anyway.
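
For reference, a minimal sketch of this kind of re-joining step. It is
illustrative only and is not the actual Catalan or Ukrainian tokenizer
code: the class name DecimalCommaMerger and the method mergeDecimals are
hypothetical, and it assumes the tokenizer has already produced a flat
list of string tokens in which '2', ',', '2' appear as three adjacent
entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LanguageTool.
public class DecimalCommaMerger {

    // Re-joins adjacent tokens "2", ",", "2" into the single token "2,2".
    // A comma followed by whitespace ("2, 2") is left alone, because the
    // whitespace arrives as its own token between the comma and the digits.
    public static List<String> mergeDecimals(List<String> tokens) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (i + 2 < tokens.size()
                    && isDigits(tokens.get(i))
                    && ",".equals(tokens.get(i + 1))
                    && isDigits(tokens.get(i + 2))) {
                result.add(tokens.get(i) + "," + tokens.get(i + 2));
                i += 3; // skip the three tokens that were merged
            } else {
                result.add(tokens.get(i));
                i++;
            }
        }
        return result;
    }

    // True if the token is non-empty and consists only of digits.
    private static boolean isDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }
}
```

One known limitation of this sketch: an unspaced list such as 1,2,3 is
greedily read as the fraction 1,2 followed by ',' and '3', so a real
tokenizer would need extra context to tell the two cases apart.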

Andriy

2013/6/13 Jaume Ortolà i Font <[email protected]>:
>
> 2013/6/12 Andriy Rysin <[email protected]>
>>
>> I noticed that numbers with fractions like 2,2 are split into '2',
>> ',', '2' by the word tokenizer. In Ukrainian I need to require a
>> different case for the following noun depending on whether it's a whole
>> number or a fraction, so I was planning to adjust the Ukrainian word
>> tokenizer. But I think most European languages use a comma for
>> fractional numbers, so I was wondering if somebody already has a
>> solution or if this would better be done in common code.
>>
>
> Hi Andriy,
>
> This and other similar things are done in the Catalan word tokenizer. It is
> a bit hackish. To make the code more elegant and more general, we could
> perhaps do something like the SRX segmentation at the word level... Just
> an idea. I'm not sure if it is reasonable.
>
> Regards,
> Jaume
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Languagetool-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>
