Erland Sommarskog <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons ([EMAIL PROTECTED]) writes:
>> Erland Sommarskog <[EMAIL PROTECTED]> writes:
>>>I would really expect someone to have done this already, but I see no
>>>reference to such a module. Or layer-directive like "<:use-bom" to open
>>>the file. And then some way to open an output file "same mode as that
>>>handle".
>>
>> Seems you are the 1st (at least to care) - so in true OpenSource
>> spirit you would write the module and contribute it.
>
>Unfortunately my field of expertise is not in the area of C++ programming
>or Perl internals. Believe me, you would not want to see my miserable
>code entered into the Perl code base. :-)

Well you only learn by trying - but that is your choice.

>
>I guess, that if I want to write a utility which can handle Unicode
>files, that I will implement the file-opening in Perl in some private
>module.

That would be a reasonable way to prototype stuff for core anyway.
With perl5.7+'s "layers" it should be possible to do this as a module.
(Which was at least part of the motivation for inventing them.)

>
>> Many _programs_ yes. So when you write a perl _program_ you can
>> handle it. C++ language doesn't do this for you, why should Perl?
>> Now there may well be a C++ _library_ which does this, so there
>> could be a perl _library_ (module) which did it too.
>
>But Perl is not C++. C++ is a strongly typed language where you use
>different functions for 8-bit and Unicode data. Perl is also a
>higher-level language that does more work for me.

But there is a limit - or there would be just one perl program:

#!/usr/bin/perl
exit(do_what_I_mean(@ARGV));

>I'd say that it would be
>perfectly in the spirit of Perl to magically handle file as ASCII or
>Unicode without me having to bother.

Agreed - but magic doesn't create itself.

>
>> It would seem best place to do this would be to change
>> the initial layer in Win32 to a new layer (say :bomcrlf).
>> This layer would get popped on binmode() - fixing above.
>> It would look at 1st few bytes it got from OS and then if it was
>> a BOM push an encoding() layer beneath itself and mutate into
>> a :crlf layer with UTF8 flag set.
>
>Yes, that sounds like a good way that would ensure compatibility and
>still give me what I want. When is Santa coming to town? :-)

Implied timescale sounds viable ;-)

>
>However, that does not really help when the Perl script itself is in
>UTF-16 or UTF-8.

Yes it does - I _think_ one or more of

  perl -MWin32BOM UTF-16_script
or
  set PERL5OPT -MWin32BOM
or
  set PERLIO bomcrlf (with magical autoload)

could be made to work.
If it happens in core-perl it can certainly work.
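
For the pure-Perl prototype route mentioned above, something along these
lines might serve as a starting point. It is only a minimal sketch, not a
:bomcrlf layer: open_bom() is a hypothetical helper name, it only handles
reading, it only recognises the UTF-8 and UTF-16 BOMs, and it does no CRLF
translation. It assumes a seekable file and a perl with PerlIO layers and
Encode (5.8 or later).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of any existing module: open $path for
# reading, sniff a BOM in the first bytes, and push a matching
# :encoding() layer. Returns the handle and the detected encoding name
# (undef if no BOM was found, in which case the handle stays in bytes).
sub open_bom {
    my ($path) = @_;

    # Start with raw bytes so the BOM can be inspected untouched.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $got = read($fh, my $head, 3);
    die "Cannot read $path: $!" unless defined $got;

    # Note: a UTF-32LE BOM (FF FE 00 00) would be mistaken for UTF-16LE
    # here; this sketch does not try to handle UTF-32.
    my ($enc, $skip) = (undef, 0);
    if    ($head =~ /^\xEF\xBB\xBF/) { ($enc, $skip) = ('UTF-8',    3) }
    elsif ($head =~ /^\xFF\xFE/)     { ($enc, $skip) = ('UTF-16LE', 2) }
    elsif ($head =~ /^\xFE\xFF/)     { ($enc, $skip) = ('UTF-16BE', 2) }

    # Rewind to just past the BOM (or to the start if there was none).
    # This dies on pipes and other non-seekable handles.
    seek($fh, $skip, 0) or die "Cannot seek in $path: $!";

    # Only now add the decoding layer, so it never sees the BOM bytes.
    binmode($fh, ":encoding($enc)") if defined $enc;

    return ($fh, $enc);
}

# Report what was detected when run as: perl script.pl somefile
if (@ARGV) {
    my ($in, $enc) = open_bom($ARGV[0]);
    print STDERR "detected: ", (defined $enc ? $enc : 'none (bytes)'), "\n";
    close($in);
}
```

Returning the encoding name is one way to approximate the "same mode as
that handle" request: the caller could open the output with
">:encoding($enc)" and print "\x{FEFF}" first if a BOM is wanted there too.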

>
>Anyway, thanks for all the replies. This is not really a big deal for
>me at the moment. I was just puzzled by the results of my tests. Since
>I am working with a module that will support Unicode data, I'm a little
>nervous that I will get questions from users about the topic.