On 03-Oct-12 23:56, Ali Çehreli wrote:
On 10/03/2012 11:21 AM, Dmitry Olshansky wrote:
> On 03-Oct-12 21:10, Ali Çehreli wrote:
>> On 10/03/2012 03:56 AM, Minas wrote:
[...]
>>> map['ά'] = 'Ά';
[...]
> Glad you showed up!
Why? Do I whine better? :p
Well that might be the case :)
But honestly, because you pushed for some Unicode support back in the
day. I'm currently looking around to see whether there are obviously
important things not covered in my project.
> One and by far the most useful case is case-insensitive matching.
> That being said, this doesn't and shouldn't involve toLower/toUpper (and
> on the whole string) anywhere. Not only is it multipass vs. single pass,
> it's also wrong, like a lot of other ASCII-minded carry-overs.
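To make the point above concrete, here is a minimal sketch in D. It assumes a present-day Phobos in which std.uni provides sicmp (simple case folding) and icmp (full case folding); nothing in this thread relies on them, and naiveEqual is a hypothetical name for the lower-everything-then-compare approach.

```d
import std.uni : icmp, sicmp, toLower;

// Hypothetical helper: the ASCII-minded approach. It builds lowered
// copies of both strings first and only then compares them.
bool naiveEqual(string a, string b)
{
    return toLower(a) == toLower(b);
}

void main()
{
    // For plain 1:1 case mappings both approaches agree.
    assert(naiveEqual("ΆΛΦΑ", "άλφα"));
    assert(sicmp("ΆΛΦΑ", "άλφα") == 0);

    // Full case folding treats 'ß' and "ss" as equal; lowering the
    // whole string cannot, because lower-case 'ß' stays 'ß'.
    assert(icmp("Straße", "STRASSE") == 0);
    assert(!naiveEqual("Straße", "STRASSE"));
}
```

icmp and sicmp walk both strings once and build no intermediate copies, which is the single-pass property; the 'ß' case is one example of where lowering gives a wrong answer.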
As I have written at other times, there is an experimental
alphabet-aware string library (unfortunately even the code is in Turkish
at this time).
If we are talking about the order then this is the way to go:
http://unicode.org/reports/tr10/
Looks like it's one of the things I haven't implemented :(
That library has the following struct for order-comparing alphabet-aware
strings and characters:
struct Order
{
    /**
     * Represents comparing characters at their bases.
     *
     * This value indicates that 'a' and 'b' are different. 'C' and 'c'
     * are the same according to this value. This value disregards upper
     * and lower cases.
     */
    int base;

    /**
     * Represents comparing characters by their accents.
     *
     * This value indicates that 'a' and 'â' are different. This value
     * disregards upper and lower cases.
     */
    int accent;

    /**
     * Represents comparing characters also by their upper and lower
     * cases.
     *
     * Lower case letter comes before upper case.
     */
    int cased;
}
(Of course opCmp() cannot return that type. :( )
The idea is that only the application knows what type of comparison
makes sense.
So instead the library does all of them? Ouch... I'm not sure I got the idea.
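One possible reading, sketched in D: the library fills in all three levels and the application picks the one it needs. Level and pick are hypothetical names, and the Order values are written by hand on the assumption that each field follows the opCmp convention (negative, zero, positive); how the library actually computes them is not shown in this thread.

```d
// Copy of the struct quoted above, without the doc comments.
struct Order
{
    int base;
    int accent;
    int cased;
}

// Hypothetical selector for the level the application cares about.
enum Level { base, accent, cased }

// Hypothetical helper: extracts one opCmp-style result from an Order.
int pick(Order o, Level level)
{
    final switch (level)
    {
        case Level.base:   return o.base;
        case Level.accent: return o.accent;
        case Level.cased:  return o.cased;
    }
}

void main()
{
    // Assumed result of comparing 'c' with 'C': same base, same accent,
    // lower case first.
    auto cVsUpperC = Order(0, 0, -1);
    assert(pick(cVsUpperC, Level.base) == 0);
    assert(pick(cVsUpperC, Level.accent) == 0);
    assert(pick(cVsUpperC, Level.cased) < 0);

    // Assumed result of comparing 'a' with 'â': same base, different
    // accent. The sign is a guess; the doc comments only say "different".
    auto aVsCircumflex = Order(0, -1, -1);
    assert(pick(aVsCircumflex, Level.base) == 0);
    assert(pick(aVsCircumflex, Level.accent) != 0);
}
```

With this reading, every comparison pays for all three levels even when the caller only needs one.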
--
Dmitry Olshansky