>> It's the isolated snippets like the zillion I last night pointed out in
>> perlfunc where I feel all the declaration detracts from the point.
>>
>> If you believe that every possible example in Perl needs to be fully
>> declared, then by all means do so.  But make sure you always start every
>> snippet with
>>
>>     #!/usr/bin/env perl -CLA
>>     use 5.010;
>>     use utf8;
>>     use strict;
>>     use autodie;
>>     use warnings qw<FATAL all>;
>>     use open qw<IO :utf8 :std>;

I forgot to add the obligatory:

    use sigtrap qw< stack-trace normal-signals error-signals >;

>>     END { close STDOUT }
>>
>> or whatever boilerplate is currently considered
>> de rigueur by all those trendy mODERN pERL people.
>>
>> Can you truly argue that that would *help* everything?

> I think we would both agree that that is way too much. And I
> automatically assume code with "use utf8" in it is subtly
> broken until proved otherwise anyway. :-)

Oh drat!  That's distressing.

I some time ago reached the conclusion that C<use encoding> was evil,
but if you are now telling me that use utf8 is just as bad, I don't
know *what* I'm going to do!

One potential problem area that I can imagine is the same one you get
with XML's inline charset declaration: it has to be a proper superset of
ASCII, and no non-ASCII characters may occur before the charset
declaration.

Is it that you're worried people will think they will get all their
strings magically _utf8_on()d that way -- when in fact, they don't and
the same rules are followed as without the pragma?

Or do you fear it's old code that thought that was the only way to get
Unicode semantics, which is almost certainly wrong in many other ways,
too?

Or are you worried about non-shortest-form UTF-8 illegalities sneaking
in unchecked?

I have a feeling that there must be something more than those, because
they're all obvious and I figure you wouldn't have mentioned it if there
weren't something more perilous and more insidious.

And *that* has got me nervous.

> In fact I suspect over a pint we would probably mostly agree
> about what is too much. :-)

Prolly.  Colorado is the state with the most microbreweries per capita.
I don't much care for the beer in Europe apart from what you get in the
British Isles and in Belgium (maybe Benelux).  The rest of it is too
easily forgotten, though now and then some beers from Germany pleasantly
surprise me; just not the rule.

> Sure, no worries. :-) Hope you have a good weekend.

That will take some doing.  I'm supposed to be working on the Camel's
regex chapter.  But I'm also supposed to be wrestling with JUnit test
cases whose results are due Monday morning on an already extended
deadline for an academic paper submission.  So far, I have already:

  * transparently rewritten all Java regexes out from under it so they
    actually work (our text is all UTF-8).

  * written Perl code to write a thousand Java JUnit test cases because
    the framework is too stupid to behave properly.

  * taken to writing my Java with cpp frontending it so I can have
    assert macros showing the real unevaluated expression as a string,
    and so I can do things like:

        #ifdef FIX_BUGGY_JAVA_REGEXES
            import static tchrist.PatternUtils.unicode_charclass;
        #   define FIX(BROKEN) unicode_charclass(BROKEN)
        #else
        #   define FIX(BROKEN) BROKEN
        #endif

    leading to code like:

        Matcher m = Pattern.compile(FIX(regex)).matcher(string);

The Java monoglots are completely appalled.  One "helpfully" gave me
almost five pages of supernasty Java code just to get around what I did
in a few lines with cpp and token-catenation to effectively pass
function pointers.  I politely declined.

Fortunately the only success metric at work is getting the job done, not
purity of soul.  I *always* beat the Java people in time-to-solution,
even when I use Java, but that's because per their viewpoint, I "cheat".

Whatever.

(Wonder whether Rob Pike's hiring for Go? :()

So it may not be a good weekend.  I'll try to take some time away from
the computer.  That should help.

--tom