[EMAIL PROTECTED] wrote:
> Hello. I hope this is not a bug, that I'm just doing something
> wrong. Anyway, here's how this "journey" started. I had a file
> with carriage return characters (^M) in it.

DOS has a CR-NL end of line convention. UNIX has a NL end of line
convention. A classic conversion problem.

> The file was one LONG record, and I wanted newline characters where
> the ^M's were. I thought I could just set awk's RS variable to ^M,
> and that would do it. But, I needed a way to "create" the ^M
> character.

I would use other commands myself. Such as tr -d. Try this:

  tr -d "\015"

But please continue with your story.

> Somewhere on the Internet I found this:
>
> cm=`echo m | tr 'm' '\015'`
>
> but, that did not seem to work. Seemed like "cm" ended up being
> null.

Works for me!

> To test if the syntax of the command correct, I did the following:
>
> Script 1 (a file called ASCII):
>
> cat /dev/null > asc
> cat /dev/null > asc.txt

The 'cat' programs in the above are not needed.

  true > asc
or
  : > asc
or
  > asc

All do the same thing without the extra program. (Sorry, but extra
'cat' processes are a common scripting mistake and a pet peeve of
mine.) Here is a simple howto on common shell mistakes.

  http://www.greenend.org.uk/rjk/2001/04/shell.html

> for i in 000 001 002 003 004 005 006 007 \
> 010 011 012 013 014 015 016 017 \
> 020 021 022 023 024 025 026 027 \
> 030 301 032
> do
> echo "x=\`echo x | tr 'x' '\\${i}'\`" >> asc
> echo "echo \"\${x}\" >>asc.txt" >>asc
> done
>
> bash asc
>
> The execution of script 1 (ASCII) created a file called asc
> (which was executed from within the ASCII file).
>
> asc file:
>
> x=`echo x | tr 'x' '\000'
> if [[ "${x}" == "" ]]; then echo "x is null"; else echo "x is not null"; fi
> #Note: the above "if" statement was not created by the script. I edited it
> in afterwards,

You have to be careful that your editor does not change any of the
characters. In particular some editors will silently delete null (000)
characters.

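One way to check whether an edit changed the bytes is to count the NULs
before and after. A minimal sketch, assuming a POSIX shell with tr and
wc; the function name count_nul and the file name nul_demo.tmp are made
up for illustration:

```shell
# count_nul is a hypothetical helper: print how many NUL (000) bytes the
# file named by $1 contains.
count_nul() {
    # -c complements the set and -d deletes, so only NUL bytes survive;
    # wc -c counts them, and the final tr strips any padding wc adds.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Example: a three-byte file with one NUL in the middle.
printf 'a\000b' > nul_demo.tmp
count_nul nul_demo.tmp
rm -f nul_demo.tmp
```

Running count_nul on the file before and after opening it in the editor
would show whether any NULs went missing.
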
> #and re-executed the asc file manually
> echo "${x}" >>asc.txt
> x=`echo x | tr 'x' '\001'
> if [[ "${x}" == "" ]]; then echo "x is null"; else echo "x is not null"; fi
> #Same for that "if" statement, too
> echo "${x}" >>asc.txt
> x=`echo x | tr 'x' '\002'
> echo "${x}" >>asc.txt
> .
> . (several lines left out for brevity)
> .
> x=`echo x | tr 'x' '\031'
> echo "${x}" >>asc.txt
> x=`echo x | tr 'x' '\032'
> echo "${x}" >>asc.txt
>
> And, the result of that execution was a file, asc.txt
> (this is how it looked when viewed with vi):
>
> (a null character, OK, i.e., expected)
> ^A
> ^B
> ^C
> ^D
> ^E
> ^F
> ^G
> ^H
> (a tab character, OK, i.e., expected)
> (a null character, not expected)

I can't recreate that. I don't see the null.

> ^K
> ^L
> (a null character, not expected, at least I had hoped it would be a ^M)

I can't recreate that. I don't see the null. What version of tr are
you using?

  tr --version

> ^N
> .
> .(several lines left out for brevity)
> .
> ^Z
>
> Note 1: where ^I would be is a tab character (OK)
> where ^J would be is a null character
> where ^M would be is a null character
>
> Note 2: I went back and edited in the following line to the
> 2nd file (asc):
>
> if [[ "${x}" == "" ]]; then echo "x is null"; else echo "x is not null"; fi
>
> and inserted it after the "000", "001", "012", "013", "015", and "016" lines
> to test. The character created by the "012", and "015" lines from the asc
> file is null. :(

You can probably do this more easily with:

  echo x | tr x "\\015" | od -c

or even

  for i in $(seq 0 26); do echo x | tr x "\\$(printf %03o $i)" | od -c; done

> Note 3: GNU bash, version 2.05.0(8)-release (i686-pc-cygwin)

Cygwin? You are probably running afoul of the DOS end of line
conventions. Probably the program is doing its own conversions. Can
you recreate this on a UN*X like machine? I don't think anyone on this
list uses Cygwin. So if it is a Cygwin specific problem then you would
need to take this to the cygwin list.

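Here is a sketch of the od check for a few of the interesting codes,
assuming a POSIX shell with printf, tr and od. It pipes straight into
od so that no shell variable or command substitution touches the byte.
That matters for 012: command substitution strips trailing newlines, so
x=`echo x | tr 'x' '\012'` leaves x empty on any system, Cygwin or not.
It does not explain the 015 case. The function names show_byte and
captured_len are made up for illustration:

```shell
# show_byte (hypothetical helper): turn "x" into the byte with octal code $1
# and dump it with od, so nothing is stored in a shell variable on the way.
show_byte() {
    printf 'x' | tr 'x' "\\$1" | od -An -c
}

# captured_len (hypothetical helper): length of what is left after the same
# byte has been captured with command substitution, as the asc script does.
captured_len() {
    x=$(echo x | tr 'x' "\\$1")
    printf '%s\n' "${#x}"
}

for i in 011 012 015; do
    printf '%s: od shows%s, captured length %s\n' \
        "$i" "$(show_byte "$i")" "$(captured_len "$i")"
done
```

If od shows the expected byte but the captured length is 0, the byte
was lost in the shell, not in tr.
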
> Note 4: I finally just made a copy of a file that had ^M's in it,
> edited out everything but one ^M character, and then edited the
> following around the ^M:
>
> BEGIN { RS = "^M" }
> { print }
>
> and then used that to process my file with the ^M's in it:
>
> cat ctrl-Ms_file | awk -f RS_is_ctrl-M.awk > newlines_file

Try 'tr -d "\015"' as the classic way to delete CRs from files.

> Yucky thing is that I would have to keep that "RS_is_crtl-M.awk"
> file around (or create it as needed using vi) since I can't create a
> ^M character "on the fly". :(

Sure you can!

  tr -d "\015"

  printf "\r"

  tr -d "$(printf "\r")"

  CR=$(printf "\r")
  tr -d "$CR"

I think 'perl -l' might handle end of line conventions too. Not
sure. I don't have a way to test this on DOS. But it is worth a test
on Cygwin.

  perl -lne 'print'

Bob

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils