* Aristotle Pagaltzis ([EMAIL PROTECTED]) [081204 16:57]:
> * Mark Overmeer <[EMAIL PROTECTED]> [2008-12-04 16:50]:
> > * Aristotle Pagaltzis ([EMAIL PROTECTED]) [081204 14:38]:
> > > Furthermore, from the point of view of the OS, even treating file
> > > names as opaque binary blobs is actually fine! Programs don’t
> > > care after all. In fact, no problem shows up until the point
> > > where you try to show filenames to a user; that is when the
> > > headaches start, not any sooner.
> >
> > So, they start when
> > - you have users pick filenames (with Tk) for a graphical
> >   application. You have to know the right codeset to be able
> >   to display them correctly.
>
> Yes, but you can afford imperfection because presumably you know
> which displayed filename corresponds to which stored octet
> sequence, so even if the name displays incorrectly, you still
> operate on the right file if the user picks it.
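The approach Aristotle describes can be sketched in Perl: keep the raw octets from readdir() alongside a best-effort decoded display name, and always operate on the octets. This is a minimal illustration, not anyone's actual implementation; the UTF-8 guess and the sample byte strings are assumptions.

```perl
#!/usr/bin/perl
# Sketch: decode raw filename octets for display, but remember the
# original bytes so file operations always use the exact on-disk name.
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Two made-up raw names: one is valid UTF-8, one is not.
my @raw_names = ("caf\xc3\xa9.txt", "b\xffr.txt");

my %bytes_for;    # display name => raw octets as seen on disk
for my $raw (@raw_names) {
    my $copy  = $raw;    # decode() may modify its argument
    my $shown = eval { decode('UTF-8', $copy, FB_CROAK) };
    $shown = $raw unless defined $shown;    # fall back: show raw bytes
    $bytes_for{$shown} = $raw;
}
# When the user picks $shown in a Tk dialog, open $bytes_for{$shown},
# not $shown itself -- a wrong display never opens the wrong file.
```

Even when the fallback shows mojibake, the hash still maps the picked entry back to the correct octet sequence.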
With all these different encodings, it is easy to show filenames which
are not just slightly incorrect, but unrecognizably corrupted.

In the whole debate, it looks like there are only two groups of
developers involved: the programming language authors and the
end-application developers. But do not forget that there are also CPAN
library authors and maintainers (my main involvement). When you create
a good library, you have to support multiple (unpredictable) platforms
and languages. Each time you say "oh, just let the end-user figure that
out", you add complexity and distribute implementation horrors. Good,
generally available libraries are crucial for any language.

> > - you have XML files with meta-data on files which are
> >   being distributed. (I have a lot of those)
>
> Use URI encoding unless you like a world of pain.

You are looking at it from the wrong point of view: Perl is used as a
glue language, so other people determine what kind of data we have to
process. In my case as well, the content of these XML structures is
totally out of my hands: no influence on the definitions at all. I
think that is the more common situation.

> NTFS seems to say it’s all Unicode and comes back as either
> CP1252 or UTF-16 depending on which API you use, so I guess you
> could auto-decode those. But FAT is codepage-dependent, and I
> don’t know if Windows has a good way of distinguishing when you
> are getting what. So Windows seems marginally more consistent
> than Unix, but possibly only apparently. (What happens if you zip
> a file with random binary garbage for a name on Unix and then
> unzip it on Windows?)
>
> I have no idea what other systems do.

Well, the nice thing about File::Spec/Class::Path is that someone did
know how those systems work and everyone can benefit from it. So why
are you all so hesitant about making each other's life easier? There is
no 100% solution, but 0% is even worse!
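For the "use URI encoding" suggestion: the idea is that percent-encoding raw filename octets before storing them in XML metadata lets arbitrary byte sequences survive the round trip unambiguously. A minimal pure-Perl sketch (the helper names `encode_name`/`decode_name` are hypothetical, and the kept character set follows the usual URI "unreserved" characters):

```perl
#!/usr/bin/perl
# Sketch: percent-encode filename octets for safe embedding in XML
# metadata, and decode them back to the exact original bytes.
use strict;
use warnings;

sub encode_name {
    my $octets = shift;
    # Escape everything except URI "unreserved" characters.
    $octets =~ s/([^A-Za-z0-9._~-])/sprintf '%%%02X', ord $1/ge;
    return $octets;
}

sub decode_name {
    my $text = shift;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

my $raw = "caf\xe9 100%.txt";           # arbitrary octets, not valid UTF-8
my $enc = encode_name($raw);            # "caf%E9%20100%25.txt"
my $back = decode_name($enc);           # identical to $raw
```

Because `%` itself is escaped, the encoding is reversible for any octet sequence; in practice CPAN's URI::Escape offers the same operations.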
Once upon a time, Perl people were eager for good DWIMming and powerful
programming. Nowadays, I see so much fear in our community of
attempting simpler/better/other ways of programming. We get a brand new
language, with a horribly outdated documentation system and a very
traditional OS approach. As if everyone prefers to stick to Perl's
22-year-old and Unix's 39-year-old choices, while the world around us
saw huge development and change in needs.

Are "we" just getting old, grumpy and tired? Where is the new blood to
stir us up?

- MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                MARKOV Solutions
[EMAIL PROTECTED]                [EMAIL PROTECTED]
http://Mark.Overmeer.net         http://solutions.overmeer.net