That's correct behaviour. There is no such word as เหลือง in Thai. It's a 
particle that always exists as an adjunct to something else. Although สี is a 
word on its own, เหลือง is not. Even when Thai speakers say something like 
รถเหลือง, this is colloquial speech. Technically, it's รถสีเหลือง. 

On 24 Sep 2014, at 12:02, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:

> 
> On 24 Sep 2014, at 11:46, Roland King <r...@rols.org> wrote:
> 
>> 
>>> On 24 Sep 2014, at 12:31 pm, Gerriet M. Denkmann <gerr...@mdenkmann.de> 
>>> wrote:
>>> 
>>> I have a problem with NSLinguisticTagger / CFStringTokenizer on iOS 8.0
>>> 
>>> OS X 10.9.5 (and iOS 7 and earlier) parses "สีเหลือง" quite rightly as two 
>>> words: "สี" = colour and "เหลือง" = yellow.
>>> 
>>> No dictionary will ever contain "yellow colour". Every dictionary will 
>>> contain "yellow" and "colour".
>>> There are hundreds, if not thousands of these expressions, which are 
>>> wrongly classified as one word.
>>> Might have something to do with the new predictive keyboard.
>>> 
>>> But I am not writing this to complain, but to ask for a favour: could 
>>> anybody on 10.10 just click anywhere in: "สีเหลือง" and tell me whether all 
>>> gets highlighted, or just a part (as in 10.9.5)?
>> 
>> 
>> If I double click anywhere on the right of that I get the second part (all 
>> bar the first character) highlighted. Clicking on the first character I get 
>> just that character. So 10.10 (beta 8) splits that sequence into two 
>> ‘words’. 
> This is a big relief. Thanks a lot.
> 
>> 
>> Why do you suspect the predictive keyboard? Certainly wouldn’t be the first 
>> thing I thought of seeing that issue. I would probably instead assume I’d 
>> written myself a bug.
> 
> Well, here is the code; maybe you can find a bug:
> 
> let text = "สีเหลือง"
> let opts: Int = 0
> let schemes = [ NSLinguisticTagSchemeTokenType, 
> NSLinguisticTagSchemeNameTypeOrLexicalClass ]
> let tagger = NSLinguisticTagger(tagSchemes: schemes, options: opts )
> 
> let nsText = text as NSString
> let length = nsText.length
> tagger.string = nsText
> let range = NSMakeRange(0,length)
> let theScheme = NSLinguisticTagSchemeTokenType
> let ops = NSLinguisticTaggerOptions(0)
> tagger.enumerateTagsInRange ( 
>       range, 
>       scheme:         theScheme, 
>       options:        ops,
>       usingBlock: 
>       {       (       tag:                    String!, 
>                       tokenRange:     NSRange, 
>                       sentenceRange:  NSRange, 
>                       stop:                   UnsafeMutablePointer<ObjCBool>
>               ) -> Void in
>               
>               let word = nsText.substringWithRange(tokenRange) 
>               println("\(tag) = \(word) " )
>       }
> )
> 
> Gerriet.
> 
> 
> _______________________________________________
> 
> Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
> 
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
> 
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/cocoa-dev/sqwarqdev%40icloud.com
> 
> This email sent to sqwarq...@icloud.com

