On Fri, Dec 9, 2016 at 10:19 AM, BartC <b...@freeuk.com> wrote:
> I get this (although I suspect Thunderbird will screw up the tabs); the code
> I used follows at the end:

Actually it came through just fine. Although, point of note: the case
conversions of individual characters are not the same as the case
conversions of the whole string.

> As I said some characters have ill-defined upper and lower case conversions,
> even if some aren't as esoteric as I'd thought.

Can you show me any character with ill-defined conversions?

> In English however the conversions are perfectly well defined for A-Z and
> a-z, while they are not meaningful for characters such as space, and for
> digits.

Correct; caseless characters don't change. There's no concept of
"upper-case 5", and no, it isn't "%", despite people who think that
"upper-case" means "hold shift" :)

> In English such conversions are immensely useful, and it is invaluable for
> many purposes to have upper and lower case interchangeable (for example, you
> don't have separate sections in a dictionary for letters starting with A and
> those starting with a).

That's not quite the same; that's a *collation order*. Some
dictionaries will sort them all in together, but others will put the
uppercase before the lowercase (i.e. sorting "AaBbCcDd"). It's not that
they're considered identical; it's that they're placed near each other
in the sort. In fact, it's possible to have collations without
conversions - for instance, it's common for McFoo and MacFoo to be
sorted together in a list of names.

> So it is perfectly possible to have case conversion defined for English,
> while other alphabets can do what they like.

Aaaaand there we have it. Not only do you assume that English is the
only thing that matters, you're happy to give the finger to everyone
else on the planet.

> It is a little ridiculous however to have over two thousand distinct files
> all with the lower-case normalised name of "harry_potter".

It's also ridiculous to have hundreds of files with unique two-letter
names, but I'm sure someone has done it.
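
Two of the claims above are easy to check. This is only a rough sketch, assuming Python 3 (CPython 3.3 or later) and its built-in str methods; the Greek sample word and all variable names are made up for illustration and are not from the thread. It compares whole-string lowercasing with per-character lowercasing, then counts the case variants of "harry_potter".

```python
from itertools import product

# Claim 1: lowercasing a whole string is not the same as lowercasing each
# character on its own. Greek capital sigma lowercases to the final form
# at the end of a word, which a lone character cannot know about.
word = "ΣΑΣ"                                     # illustrative sample word
whole = word.lower()                             # 'σας' (final sigma at the end)
per_char = "".join(ch.lower() for ch in word)    # 'σασ' (ordinary sigma)
print(whole == per_char)                         # False

# Claim 2: count the distinct spellings that share the lower-case
# normalised name "harry_potter". Each letter can be upper or lower;
# the underscore has no case, so it contributes a single choice.
name = "harry_potter"
choices = [(ch.lower(), ch.upper()) if ch.isalpha() else (ch,) for ch in name]
variants = {"".join(combo) for combo in product(*choices)}
print(len(variants))                             # 2048
print(all(v.lower() == name for v in variants))  # True
```

The count is 2**11, one binary choice for each of the eleven letters, which is where "over two thousand" comes from.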
The file system shouldn't stop you, any more than it should stop you
from having "swimmer" and "swirnrner" in the same directory (in some
fonts, they're indistinguishable).

> What were we talking about again? Oh yes, belittling me because I work with
> Windows!

Or because you don't understand anything outside of what you have
worked with, and assume that anything you don't understand must be
inferior. It's not a problem to not know everything, but you repeatedly
assert that other ways of doing things are "wrong", without
acknowledging that these ways have worked for forty years and are
strongly *preferred* by myriad people around the world.

I grew up on OS/2, using codepage 437 and case insensitive file
systems, and there was no way for me to adequately work with other
Latin-script languages, much less other European languages (Russian,
Greek), and certainly I had no way of working with non-alphabetic
languages (Chinese, Japanese). Nor right-to-left languages (Hebrew,
Arabic). And I didn't know anything about the Unix security model, with
different processes being allowed to do different things.

Today, the world is a single place, not lots of separate places that
sorta kinda maybe grudgingly talk to each other. We aren't divided
according to "people who use IBMs" and "people who use DECs". We
shouldn't be divided according to "people who use CP-850" and "people
who use Windows-1253". The tools we have today are capable of serving
everyone properly. Let's use them.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list