I think I may have misunderstood what the OP wanted. The awk script you gave and the perl one I gave produce different output on the first line of my modified file. What I envisioned the OP wanting was: "Find the first instance of CD in the given string. Find all the characters following it, up to the first QR substring. Replace that segment, CD through QR, with 'junkt'." What my perl regexp matches and what your awk matches are not the same segment. I no longer know which one the OP wanted.

My modified file places a Q in column 11 of the first line, shifting all following characters right by one. It also removes the D which was originally in column 27.
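Since POSIX awk and sed have no non-greedy quantifier, the reading described above (first CD, through the first QR that follows it) cannot be written there as CD.*?QR. What follows is only a minimal sketch of one way to get that reading in awk, using index() and substr() instead of a regexp. The variable names (line, first_cd_to_first_qr, i, j, rest) are my own and are not from the thread.

```shell
# Modified first line of test.txt (Q inserted in column 11, D removed from column 27).
line='QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ'

# "First CD through the first QR after it", without a lazy quantifier.
# index() returns the 1-based position of the first occurrence, or 0 if none.
first_cd_to_first_qr=$(printf '%s\n' "$line" | awk '{
    i = index($0, "CD")                  # start of the first CD
    if (i > 0) {
        rest = substr($0, i + 2)         # everything after that CD
        j = index(rest, "QR")            # first QR following the CD
        if (j > 0)
            $0 = substr($0, 1, i - 1) "junkt" substr(rest, j + 2)
    }
    print                                # lines without a match pass through unchanged
}')

printf '%s\n' "$first_cd_to_first_qr"
```

On that line this prints QQQQABjunktXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ, the same result as the perl one-liner. Note that the sketch prints every line, whereas awk 'sub(...)' used as a bare pattern prints only the lines where a substitution was made.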
$ cat test.txt
QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABCDEFGNOPQRXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
$ awk 'sub(/CD[^Q]*QR/,"junkt")' test.txt
QQQQABCDEFQGNOPQRXXXPPPPABCEFGNOPQRYYYOOOOABjunktZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
$ perl -np -e 's/CD.*?QR/junkt/' test.txt
QQQQABjunktXXXPPPPABCEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ
QQQQABjunktXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ

All the lines in "test.txt" other than the first are identical. The first line has a Q inserted after the first F, and the second D removed from between the C and the E. My regexp matches from the first CD (column 7) through the first QR (column 16). Your awk matches from the CD in column 45 through the QR in column 53: the inserted Q stops [^Q]* at column 11 without an R after it, and with the second CD gone, the pattern first succeeds at the third.

-- 
John McKown
Systems Engineer IV
IT Administrative Services Group
HealthMarkets(r)
9151 Boulevard 26 * N. Richland Hills * TX 76010
(817) 255-3225 phone * john.mck...@healthmarkets.com * www.HealthMarkets.com

<snip>
> >>
> >> try this:
> >>
> >> awk 'sub(/CD[^Q]*QR/,"junkt")'
> >>
> >> or this:
> >>
> >> sed -e 's/CD[^Q]*QR/junkt/'
> >>
> >> Bill
> >
> >Will work on that specific example. But won't if a Q appears
> >with some other character after it, before the first QR.
> >
>
> Did you try it?
>
> Where a Q appears with some other character after it, before
> the first QR? I did. It skips to the one that has the first
> QR, as it should.
>
> echo "QQQQABCDEFGNOPQSXXXPPPPABCDEFGNOPQRYYYOOOOABCDEFGNOPQRZZZ" | awk 'sub(/CD[^Q]*QR/,"junkt")'
>
> QQQQABCDEFGNOPQSXXXPPPPABjunktYYYOOOOABCDEFGNOPQRZZZ
>
> I see that Ken has added to the problem description since my
> earlier reply.
>
> Bill

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN