drieux, et al --

...and then drieux said...
%
% On Saturday, June 8, 2002, at 08:13 , David T-G wrote:
% >drieux, et al --
% >...and then drieux said...
% >% On Saturday, June 8, 2002, at 04:47 , David T-G wrote:
...
% >
% >Tell me about the standard...  Should perl happily chomp either a UNIX or
% >a DOS (or even a MAC) line?  Or do I turn around and explain it below,
% >answering myself?
%
% the canon is:
%
%       EOL - end of line is denoted as
%
%       mac: <CR>     : chr(13)
%       dos: <CR><NL> : chr(13)chr(10)
%       nix: <NL>     : chr(10)

OK, so it *shouldn't* somehow ever be \n\r and so it is extremely unlikely
that that's why chomp was failing.

%
% note what happens:
%
% vladimir: 64:] echo line> file
% vladimir: 65:] unix2dos file file.dox
% could not open /dev/kbd to get keyboard type US keyboard assumed
% could not get keyboard type US keyboard assumed
% vladimir: 66:] od -c !$
% od -c file.dox
% 0000000   l   i   n   e  \r  \n
% 0000006
% vladimir: 67:]

OK.  I'd probably see about the same when taking a look at my Cygwin find
output.

%
% if you check the stty man pages you will find our friend onlcr
% that does the mapping of NL to CR-NL - we still have the old
% cross over problem here that what unix folks use as \n is the

Right.

% "new line" token - but which by way of stty goes out to their
% 'terminal type' as if it were CR - or "\r" - return the carriage
% head to the beginning of the line and then shift the roller up one.
%
% otherwise if you have merely the new line
%                                           you start typing here.

The famous stair-step printer problem.  Oh, how many sheets of paper have
been wasted because of that mess.

%
% If you have merely the CR - you would start writing over the line.

Tougher to demonstrate in a readable post :-)

%
% Hence to have "\n\r" would mean having implemented the standard
% for the EOL token to the file 'underappropriately' - although

Right; I get it.

% 'technically literally' and it would 'still work' in the case of
% those systems that know how to parse them correctly. Since it
% really does not matter to a teletype which order the commands
% are generated - they will read them off the wire as commands
% and execute them...

Yep, and a screen can function in the same way so the user might never
know, but code will care.

%
% { note you should send three BEL tokens for the start and stop
% of any message - but that has fallen out of habit.... and no one
% seems to worry about taking them out of the data stream, or remembering
% to put them in either... }

*grin*

%
% [..]
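
For anyone who wants to check which of the endings in the canon a given
line really carries, here is a minimal sketch in the spirit of the od run
quoted above.  The helper names classify_eol and show_bytes are made up for
this illustration, as are the sample strings; the octal escapes \015 (CR)
and \012 (LF) are used instead of \r and \n to sidestep any platform
differences in what "\n" stands for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# classify_eol() is a hypothetical helper, not code from this thread.
# It names the end-of-line convention that terminates a string.
sub classify_eol {
    my ($line) = @_;
    return 'dos'      if $line =~ /\015\012\z/;   # <CR><NL>
    return 'reversed' if $line =~ /\012\015\z/;   # <NL><CR>, the non-standard order
    return 'nix'      if $line =~ /\012\z/;       # <NL> alone
    return 'mac'      if $line =~ /\015\z/;       # <CR> alone
    return 'none';                                # no line ending at all
}

# show_bytes() (also hypothetical) lists every byte of the string in octal,
# so invisible characters become visible.
sub show_bytes {
    my ($line) = @_;
    return join ' ', map { sprintf('%03o', ord($_)) } split //, $line;
}

# Made-up sample lines, one per convention plus the reversed order.
for my $sample ("line\012", "line\015\012", "line\015", "line\012\015") {
    printf "%-8s %s\n", classify_eol($sample), show_bytes($sample);
}
```

The two-character checks come first on purpose: a DOS line also ends in
\012, so testing for a bare <NL> before <CR><NL> would misreport it as a
UNIX line.
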
% >(you know, it can be a real challenge to write a one-liner!) and found
% >that I have either RL or L for all files, and no \n\r as I had thought,
% [..]
%
% the problem here is that chomp is defined on the host you are on,
% not on the host where you once were.....

Actually, it's all on the same host, but there has been a Cygwin upgrade
in the meantime.  What I don't get is why kazin-1 and kazin-3 are not the
same as kazin-2 and kazin-4, and yet all were made around the same time,
definitely without any upgrades or code changes.  That says to me that I
ran the find in a different way or some such, and that's possible because
I could have still been doing it manually, but I still don't know what it
would take to generate different output.

%
% it's a reasonable compromise in that case...

Yeah.

%
% where you have to get your poop in a group on this point is as you
% move into 'network layer plays' - such as HTTP - unless you are
% using the appropriate modules to do this stuff for you - and you

Well, that's the plan when possible; I'd much rather use than roll :-)

% find that the RFC for http defines the separator for the head from
% the body as <CR><LF> - cf:
%       http://www.w3.org/Protocols/rfc2068/rfc2068
% section 2.2 to be specific - where they call out the decimal
% values for them in the ASCII table....

Yeah.

%
% { may I recommend that you use the CPAN modules - hand cranking this
% stuff from the IO::Socket layer - while what some of us did, is not

You betcha!

% what I would recommend now.... but yes, the original code I ripped
% had the sort of 'oh look, we have that <CR><LF> hence we are out of
% header and the rest is body....' sort of coding...}

Hmmm...  Perhaps good to use.

%
% [..]
% >this would have really screwed me as I got way down into my lists :-)
%
% yes... not that I would wish to impose some 'puritanical morality'

*grin*

% on how you relate to yourself..... but in the coding space, I would
% wish to impose a sense of
%
%       THAT WILL HURT YOU!

Oh, indeed.  I knew it was a bad way to do it, but it was the only way
that I could see.  I'm all better now :-)

%
% >So now I should be able to put
...
% >  s/($cr|$lf)+//;
...
%
% test that - but I do not think it will do what you are expecting,
% since I think the tradition is
%
%       ([$cr|$lf]+)

Actually, I ended up doing it in octal instead of wasting $cr and $lf, so
I stole your brackets, and the code now works as

  s/[\012|\015]+//;

and also works the other way around.  Now that I'm pondering it, if
anything, I'd think it would be

  ($cr|$lf)+

or perhaps

  [$cr$lf]+

and so I'll have to check some more.

%
% where the [ ] block off the sequence of characters, the "|"
% here is the expected 'or me' - the "+" denoting one or more of these.

Yeah, I imagine I need to get rid of the | and have again :-) gotten lucky
because there are no |s in my pathnames to be caught.

% { in this case you want that, helps the compiler not worry about
% looking for the cases of 'not me' and we nest all of that in the
% round braces to denote the 'yes, this pattern, do something with it!'

Hmmm...  Tell me more what you mean here; you've lost me.

%
% [..]
%
% ciao
% drieux

TIA & HAND

:-D
-- 
David T-G                      * It's easier to fight for one's principles
(play) [EMAIL PROTECTED] * than to live up to them. -- fortune cookie
(work) [EMAIL PROTECTED]
http://www.justpickone.org/davidtg/    Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
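
A small sketch of the character-class question raised above, for anyone
who wants to test it.  Inside [ ] a "|" is an ordinary character and not
an alternation, so [\012|\015] matches LF, CR, or a literal pipe.  The
helper names strip_eol and strip_with_pipe and the sample pathname are
invented for this illustration.  Anchoring the match to the end of the
string with \z is an added assumption about the intent (removing only a
trailing line ending); it is not part of the substitution quoted in the
message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_eol() is a hypothetical helper for this sketch.
# [\012\015] is a character class matching one LF or one CR; no "|" is
# needed between the members.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\012\015]+\z//;   # anchored: removes only a trailing run of CR/LF
    return $line;
}

# The substitution with a pipe inside the class, kept for comparison.
sub strip_with_pipe {
    my ($line) = @_;
    $line =~ s/[\012|\015]+//;    # unanchored: removes the FIRST run of LF, "|" or CR
    return $line;
}

my $path = "dir|name/file\015\012";    # made-up pathname containing a pipe

print strip_eol($path), "\n";          # pipe kept, CR LF removed
print strip_with_pipe($path), "\n";    # pipe removed, CR LF still attached
```

With pathnames that contain no "|" the two behave the same on a single
trailing line ending, which is consistent with the "gotten lucky" remark
above.
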