That's correct behaviour. There is no such word as เหลือง in Thai. It's a particle that always exists as an adjunct to something else. Although สี is a word on its own, เหลือง is not. Even when Thai speakers say something like รถเหลือง, that is colloquial speech. Technically, it's รถสีเหลือง.
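For anyone who wants to compare how different OS versions segment the string, here is a minimal sketch using the current Swift spelling of the NSLinguisticTagger API. The helper name `wordTokens(in:)` is made up for illustration and is not from the code quoted below. It assumes a platform where `enumerateTags(in:unit:scheme:options:using:)` is available (macOS 10.13 / iOS 11 or later). The result for the Thai string depends on the OS version, so no expected output is shown.

```swift
import Foundation

// Hypothetical helper (not from the thread): returns the word tokens that
// NSLinguisticTagger reports for `text` on the OS this runs on.
func wordTokens(in text: String) -> [String] {
    let nsText = text as NSString
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = text

    var tokens: [String] = []
    let fullRange = NSRange(location: 0, length: nsText.length)

    // Enumerate word-level tokens, skipping whitespace and punctuation.
    tagger.enumerateTags(in: fullRange,
                         unit: .word,
                         scheme: .tokenType,
                         options: [.omitWhitespace, .omitPunctuation]) { _, tokenRange, _ in
        tokens.append(nsText.substring(with: tokenRange))
    }
    return tokens
}

// One element means the tagger treats the string as a single word;
// two elements means it split it into สี and เหลือง.
print(wordTokens(in: "สีเหลือง"))
```

Running this on two OS versions and comparing the printed arrays should show directly whether the tokenizer's behaviour changed between them.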
On 24 Sep 2014, at 12:02, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
>
> On 24 Sep 2014, at 11:46, Roland King <r...@rols.org> wrote:
>
>>
>>> On 24 Sep 2014, at 12:31 pm, Gerriet M. Denkmann <gerr...@mdenkmann.de>
>>> wrote:
>>>
>>> I have a problem with NSLinguisticTagger / CFStringTokenizer on iOS 8.0
>>>
>>> OS X 10.9.5 (and iOS 7 and earlier) parses "สีเหลือง" quite rightly as two
>>> words: "สี" = colour and "เหลือง" = yellow.
>>>
>>> No dictionary will ever contain "yellow colour". Every dictionary will
>>> contain "yellow" and "colour".
>>> There are hundreds, if not thousands of these expressions, which are
>>> wrongly classified as one word.
>>> Might have something to do with the new predictive keyboard.
>>>
>>> But I am not writing this to complain, but to ask for a favour: could
>>> anybody on 10.10 just click anywhere in: "สีเหลือง" and tell me whether all
>>> gets highlighted, or just a part (as in 10.9.5)?
>>
>>
>> If I double click anywhere on the right of that I get the second part (all
>> bar the first character) highlighted. Clicking on the first character I get
>> just that character. So 10.10 (beta 8) splits that sequence into two
>> ‘words’.
> This is a big relief. Thanks a lot.
>
>>
>> Why do you suspect the predictive keyboard? Certainly wouldn’t be the first
>> thing I thought of seeing that issue. I would probably instead assume I’d
>> written myself a bug.
>
> Well, here is the code; maybe you can find a bug:
>
> let text = "สีเหลือง"
> let opts: Int = 0
> let schemes = [ NSLinguisticTagSchemeTokenType,
>                 NSLinguisticTagSchemeNameTypeOrLexicalClass ]
> let tagger = NSLinguisticTagger(tagSchemes: schemes, options: opts )
>
> let nsText = text as NSString
> let length = nsText.length
> tagger.string = nsText
> let range = NSMakeRange(0,length)
> let theScheme = NSLinguisticTagSchemeTokenType
> let ops = NSLinguisticTaggerOptions(0)
> tagger.enumerateTagsInRange (
>     range,
>     scheme: theScheme,
>     options: ops,
>     usingBlock:
>     { ( tag: String!,
>         tokenRange: NSRange,
>         sentenceRange: NSRange,
>         stop: UnsafeMutablePointer<ObjCBool>
>       ) -> Void in
>
>         let word = nsText.substringWithRange(tokenRange)
>         println("\(tag) = \(word) " )
>     }
> )
>
> Gerriet.
>
>
> _______________________________________________
>
> Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/cocoa-dev/sqwarqdev%40icloud.com
>
> This email sent to sqwarq...@icloud.com