Thanks to you and Shmuel for pointing out the flaw in my logic.

Bill
On Mon, 6 Feb 2012 12:41:24 -0600, McKown, John <[email protected]> wrote:

>I think that I may have misunderstood what the OP wanted. The awk script you
>give and the perl one that I gave give different output on the first line of
>my modified file. It is the way I was envisioning what the OP wanted was:
>"Find the first instance of CD in the given string. Find all other characters
>following that until the first QR substring. Replace those characters with
>"junkt". What my perl regexp matches and your awk matches are not the same
>segment. I don't know which the OP wanted, now. My modified file places a Q in
>column 11, moving all other characters right one. It also removes the D which
>was originally in column 27.
>
>$cat test.txt
>QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>
>$awk 'sub(/CD[^Q]*QR/,"junkt")' test.txt
>QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABjunktZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>
>$perl -np -e 's/CD.*?QR/junkt/' test.txt
>QQQQABjunktXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
>
>All the lines, other than the first, in "test.txt" are identical. The first
>line has a Q inserted after the first F. And the second D removed from between
>the C and the E. My string matches from the first CD (column 7) to the first
>QR (column 16). Your awk matches the CD in column 45 to the QR in column 53.
>
>--
>John McKown
>Systems Engineer IV
>IT
>
>Administrative Services Group
>
>HealthMarkets(r)
>
>9151 Boulevard 26 * N. Richland Hills * TX 76010
>(817) 255-3225 phone *
>[email protected] * www.HealthMarkets.com
>
>Confidentiality Notice: This e-mail message may contain confidential or
>proprietary information. If you are not the intended recipient, please contact
>the sender by reply e-mail and destroy all copies of the original message.
>HealthMarkets(r) is the brand name for products underwritten and issued by the
>insurance subsidiaries of HealthMarkets, Inc. -The Chesapeake Life Insurance
>Company(r), Mid-West National Life Insurance Company of TennesseeSM and The
>MEGA Life and Health Insurance Company.SM
>
><snip>
>> >>
>> >> try this:
>> >>
>> >> awk 'sub(/CD[^Q]*QR/,"junkt")'
>> >>
>> >> or this:
>> >>
>> >> sed -e 's/CD[^Q]*QR/junkt/'
>> >>
>> >> Bill
>> >
>> >Will work on that specific example. But won't if a Q appears
>> >with some other character after it, before the first QR.
>> >
>>
>> Did you try it?
>>
>> Where a Q appears with some other character after it, before
>> the first QR? I did. It skips to the one that has the first
>> QR, as it should.
>>
>> echo "QQQQABCDEFGNOPQSXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ" |
>>   awk 'sub(/CD[^Q]*QR/,"junkt")'
>>
>> QQQQABCDEFGNOPQSXXXPPPPABjunktYYYOOOOABCDEFGNOPQRZZZ
>>
>> I see that Ken has added to the problem description since my
>> earlier reply.
>>
>> Bill

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
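
The disagreement in the quoted thread comes down to which span each pattern selects on the first line of the modified test.txt. The sketch below is not from the original posters. It replays the two patterns with Python's `re` module, on the assumption (safe for patterns this simple) that `CD[^Q]*QR` and `CD.*?QR` behave the same way there as in awk/sed and perl respectively. The helper name `first_match_columns` is made up for illustration.

```python
import re

# First line of the modified test.txt from the quoted message.
line = "QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ"

# awk/sed suggestion: no Q at all may appear between CD and the closing QR.
negated_class = re.compile(r"CD[^Q]*QR")
# perl suggestion: shortest run of any characters from the first CD to a QR.
non_greedy = re.compile(r"CD.*?QR")


def first_match_columns(pattern, text):
    """Return 1-based (first, last) columns of the first match, or None."""
    m = pattern.search(text)
    return (m.start() + 1, m.end()) if m else None


# Replace only the first match, like sub() in awk or s/// without /g.
awk_style = negated_class.sub("junkt", line, count=1)
perl_style = non_greedy.sub("junkt", line, count=1)

print(first_match_columns(negated_class, line), awk_style)
print(first_match_columns(non_greedy, line), perl_style)
```

The negated-class pattern cannot cross the stray Q in column 11, so it gives up on the first CD and matches the later one starting in column 45. The non-greedy pattern starts at the first CD in column 7 and stops at the first QR it reaches. Which of the two is "right" depends on what the OP actually wanted, as the thread notes.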

