Re: [Pharo-users] Testing a Unicode Character's Category

Richard Sargent Tue, 26 Sep 2017 10:10:12 -0700

Andrew P. Black wrote
> Hi Richard,
> 
> Normally I agree with you, and prefer boolean methods 
> 
>       inCategoryCc: aChar
>       isCategorySm: aChar
> to 
>       categoryOf: aChar == #Cc
>       categoryOf: aChar == #Sm
> 
> In this particular case, though, the category codes are part of the
> Unicode Standard, so perhaps exposing them isn’t so bad.  Moreover, there
> is meaning encoded into the symbols — for example, all the letter
> categories start with L.  So one could write an isLetter test like this


I understand this argument, but I cannot agree with it. If I were to ask you
what code points were defined in character category Mx, you would
immediately go to a web site that contained the Unicode categories to find
the answer. In other words, you would "ask Unicode".

In general, I would expect to see:
Character>>#isCapitalLetter
^Unicode isCapitalLetter: self "actually, /self codePoint/"

Character>>#isLetter
^Unicode isLetter: self

Character>>#isMathSymbol
^Unicode isMathSymbol: self

Unicode's methods would look at and interpret its category information for
the character ... which might be internally managed via some kind of tree
structure (who knows?).
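
As a rough illustration of that division of labour, here is a minimal Playground sketch. Everything in it is an assumption made for the example: the three blocks stand in for class-side methods on Unicode, the four-entry Dictionary stands in for whatever structure Unicode really uses, and treating only Lu as "capital letter" is a simplification. It is not how Pharo's Unicode class is implemented.

```smalltalk
"Sketch only: a tiny hand-made table, not real Unicode data."
| table categoryOf isLetter isCapitalLetter isMathSymbol |
table := Dictionary new.
table
	at: $a put: #Ll;
	at: $A put: #Lu;
	at: $+ put: #Sm;
	at: $7 put: #Nd.

"The single place that knows how categories are stored and spelled.
 Characters missing from the toy table answer #Cn (unassigned)."
categoryOf := [ :aCharacter | table at: aCharacter ifAbsent: [ #Cn ] ].

"Intention-revealing tests built on top of the private lookup."
isLetter := [ :aCharacter | (categoryOf value: aCharacter) first = $L ].
isCapitalLetter := [ :aCharacter | (categoryOf value: aCharacter) = #Lu ].
isMathSymbol := [ :aCharacter | (categoryOf value: aCharacter) = #Sm ].

"Callers ask questions and never mention #Ll, #Lu or #Sm themselves."
isLetter value: $a.           "true"
isCapitalLetter value: $a.    "false"
isMathSymbol value: $+.       "true"
```

If the table were later replaced by a tree or a packed array, only the lookup block would change; the three tests and all of their callers would stay as they are.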


And I would definitely expect to *not* see a method name like
"isCategorySm:". :-)

Magic numbers, magic codes, etc.: you always want there to be one definitive
expert (class) and you do not want other classes usurping its
responsibilities.
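
A small Playground sketch of why leaked codes are risky, under stated assumptions: the categoryOf block is a hypothetical stand-in for a public category lookup, and isLeter is a deliberately misspelt selector that is assumed not to exist in the image. A mistyped code just compares as false, while a mistyped selector fails loudly.

```smalltalk
"Sketch only: a hypothetical lookup that knows a single character."
| categoryOf silentResult loudResult |
categoryOf := [ :aCharacter |
	aCharacter = $a ifTrue: [ #Ll ] ifFalse: [ #Cn ] ].

"Typo in the code: digit one instead of lowercase L.
 Nothing complains; the comparison simply answers false."
silentResult := (categoryOf value: $a) = #L1.

"Typo in a selector: the message is not understood, so the
 mistake surfaces the first time the code runs."
loudResult := [ $a isLeter ]
	on: MessageNotUnderstood
	do: [ :error | #notUnderstood ].
```

Here silentResult is false and loudResult is #notUnderstood. Keeping the codes inside one class means a misspelling can only happen in that one place.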


> Character >> isLetter
>       ^ (Unicode categoryOf: self) first == $L
> 
> rather than 
> 
> Character >> isLetter
>       (Unicode inCategoryLl: self) ifTrue: [ ^ true ].
>       (Unicode inCategoryLm: self) ifTrue: [ ^ true ].
>       (Unicode inCategoryLo: self) ifTrue: [ ^ true ].
>       (Unicode inCategoryLt: self) ifTrue: [ ^ true ].
>       (Unicode inCategoryLu: self) ifTrue: [ ^ true ].
>       ^ false
> 
> There is still the disadvantage that a typo in the Category name (typing
> #L1, for example, when one means #Ll) is likely to go undetected.
> 
>       Andrew





--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
