Regular Expressions (OMVS)
Hi All,

I'm not sure if this is the appropriate forum; please point me to the correct one if it's not.

I'm playing around with regular expressions and I want to achieve the following. I spoke to a Unix geek but he didn't really understand what I was asking. Given the following sample data, I want to discover only the first occurrence of any string which matches my regexp.

ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ

I tried:

    awk 'sub(/CD.*QR/,"junkt")' fxdata

in an attempt to change ABCDEFGNOPQRXXX to ABjunktXXX, but instead it takes the final occurrence of QR and returns ABjunktZZZ. Notice the ZZZ on the end instead of XXX.

This is being driven from a REXX exec in ISPF. If any of the above is not clear, I will try to explain further.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@bama.ua.edu with the message: INFO IBM-MAIN
Re: Regular Expressions (OMVS)
Looks to me like it's being greedy. Look up the term greedy in relation to Regexps and you'll see the match is much wider than you anticipated - matching many more characters.

Not sure if awk can do non-greedy matching. But there are usually workarounds if not.

(And this just about exhausts my knowledge on the greedy vs non-greedy matching in Regexps.) :-(

Cheers, Martin

Martin Packer, Mainframe Performance Consultant, zChampion
Worldwide Banking Center of Excellence, IBM
+44-7802-245-584
email: martin_pac...@uk.ibm.com
Twitter / Facebook IDs: MartinPacker
Blog: https://www.ibm.com/developerworks/mydeveloperworks/blogs/MartinPacker

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
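The greedy behaviour described above can be seen in isolation with a small sketch. It runs the original pattern against one line of the sample data and uses match(), RSTART and RLENGTH (standard awk) to show how much text the pattern consumed. The quoted "junkt" replacement is an assumption about what the original command looked like before the archive dropped the quotes.

```shell
# One record of the sample data from the original post.
line='ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ'

# match() sets RSTART and RLENGTH to the leftmost-longest match of the pattern.
span=$(printf '%s\n' "$line" | awk '{ if (match($0, /CD.*QR/)) print RSTART, RLENGTH }')

# sub() replaces that same leftmost-longest match.
greedy=$(printf '%s\n' "$line" | awk '{ sub(/CD.*QR/, "junkt"); print }')

echo "RSTART RLENGTH: $span"
echo "after sub():    $greedy"
```

The match starts at the first CD and runs for 40 characters, which takes it to the last QR on the line; that is why only ZZZ is left after the replacement.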
Re: Regular Expressions (OMVS)
I don't know if this is what you want:

    /* REXX */
    EXP = ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    START = POS(QR,EXP)
    SAY START
    EXIT 0

Result is 15

--
Those who can make you believe religious absurdities, can make you commit atrocities. Voltaire
The philosopher has never killed any priests, whereas the priest has killed a great many philosophers. Denis Diderot
Men will never be free until the last king is strangled with the entrails of the last priest. Denis Diderot
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 15:02:42 +, Martin Packer wrote:
> Looks to me like it's being greedy. Look up the term greedy in relation to Regexps and you'll see the match is much wider than you anticipated - matching many more characters. Not sure if awk can do non-greedy matching. But there are usually workarounds if not.

Mostly, it can't.

> From: Ken MacKenzie
> To: IBM-MAIN@bama.ua.edu, Date: 06/02/2012 14:53
>
> I'm not sure if this is the appropriate forum, please point me to the correct one if it's not.

There's also MVS-OE.

> I'm playing around with regular expressions and I want to achieve the following. I spoke to a Unix geek but he didn't really understand what I was asking. Given the following sample data, I want discover only the first occurrence of any string which matches my regexp.
>
> ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
>
> I tried: awk 'sub(/CD.*QR/,"junkt")' fxdata in an attempt to change ABCDEFGNOPQRXXX to ABjunktXXX but instead, it takes the final occurrence of QR, and returns ABjunktZZZ. Notice the ZZZ on the end instead of XXX.

You might try:

    sub(/CD[^Q]*QR/,"junkt")    # Wildcard matches no string containing Q.

> This is being driven from a REXX exec in ISPF, if any of the above is not clear, I will try to explain further.

Eek! Are you using SYSCALL spawn in an EDIT macro for that? I thought only John M., Kirk, and I did that sort of thing.

--gil
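A sketch of the bracket-expression workaround suggested above, run on one line of the sample data. The second call is an added illustration with made-up input (`ABCDxQyQRZZ` is not from the thread) showing the limitation: `[^Q]*` cannot step over any Q, so a stray Q before the real QR terminator stops the pattern from matching at all.

```shell
line='ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ'

# [^Q]* cannot run past the first Q, so the match ends at the first QR.
first=$(printf '%s\n' "$line" | awk '{ sub(/CD[^Q]*QR/, "junkt"); print }')

# Hypothetical input with a lone Q before the terminator: no match, line unchanged.
stray=$(printf '%s\n' 'ABCDxQyQRZZ' | awk '{ sub(/CD[^Q]*QR/, "junkt"); print }')

printf '%s\n%s\n' "$first" "$stray"
```

For the sample data this gives the result the original poster wanted; the workaround is only safe when the first character of the terminator never appears inside the text being skipped.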
Re: Regular Expressions (OMVS)
Yeah, I know I can do that particular search in REXX but I'm trying to utilise regular expressions as I think that they should provide greater flexibility, if they'd only do what I want. The concept of looking for the first string just doesn't seem to exist in the Unix world. Seems crazy to me.

> From: Roberto Halais roberto.hal...@gmail.com
> To: IBM-MAIN@bama.ua.edu
> Date: 06/02/2012 15:04
> Subject: Re: Regular Expressions (OMVS)
>
> I don't know if this is what you want: [...]
Re: Regular Expressions (OMVS)
Remember that a regex, by default, is greedy. IOW, it matches as much as is possible. So with the CD.*QR pattern, you match from the first CD until the __last__ QR it can find, not the first QR it can find. You would need the non-greedy regex CD.*?QR. However, it is unfortunate that IBM, in its POSIX-inspired blindness, does not implement non-greedy regular expressions. I can't get this to work in either awk or sed.

But, if you have the Ported Tools available, then you have an older version of perl. It can do what you want.

example:

    $ cat test.txt
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    $ perl -np -e 's/CD.*?QR/junkt/' test.txt
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
    ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ

Where the leading $ is my shell prompt character.

I crossposted this reply to MVS-OE for their viewing pleasure. <grin>

--
John McKown
Systems Engineer IV
IT Administrative Services Group
HealthMarkets(r)
9151 Boulevard 26 * N. Richland Hills * TX 76010
(817) 255-3225 phone * john.mck...@healthmarkets.com * www.HealthMarkets.com

Confidentiality Notice: This e-mail message may contain confidential or proprietary information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. HealthMarkets(r) is the brand name for products underwritten and issued by the insurance subsidiaries of HealthMarkets, Inc. -The Chesapeake Life Insurance Company(r), Mid-West National Life Insurance Company of TennesseeSM and The MEGA Life and Health Insurance Company.SM
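Where Perl is not available, the effect of the non-greedy CD.*?QR can be approximated in plain awk when both delimiters are literal strings. The following is a sketch under that assumption and was not proposed in the thread: it locates the first CD with index(), then the first QR after it, and rebuilds the line around the replacement.

```shell
line='ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ'

lazy=$(printf '%s\n' "$line" | awk '
{
    s = index($0, "CD")                   # position of the first CD, 0 if absent
    if (s > 0) {
        rest = substr($0, s + 2)          # text following that CD
        e = index(rest, "QR")             # first QR after it, 0 if absent
        if (e > 0)
            $0 = substr($0, 1, s - 1) "junkt" substr(rest, e + 2)
    }
    print
}')

echo "$lazy"
```

Unlike the bracket-expression form, this does not care whether a lone Q appears between the delimiters, but it only works for fixed-string delimiters, not for delimiters that are themselves patterns.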
Re: Regular Expressions (OMVS)
Yes, I pretty much understand the concept of GREEDY. AFAICS I'm supposed to be able to use a ? to stop on first occurrence (e.g. awk 'gsub(/CD.*?QR/,"junkt")' fxdata) but when I do that I get:

    awk: /CD.*?QR/: FSUMB031 ?, *, + or - - not preceded by valid regular expression
    Context is: gsub(/CD.*?QR/

> From: Martin Packer martin_pac...@uk.ibm.com
> To: IBM-MAIN@bama.ua.edu
> Date: 06/02/2012 15:04
> Subject: Re: Regular Expressions (OMVS)
>
> Looks to me like it's being greedy. [...]
Re: Regular Expressions (OMVS)
:-(

    Xxx:/u/xxx: perl -np -e 's/CD.*?QR/junkt/' fxdata
    perl: FSUM7351 not found
    Xxx:/u/xxx:

> From: McKown, John john.mck...@healthmarkets.com
> To: IBM-MAIN@bama.ua.edu
> Date: 06/02/2012 15:44
> Subject: Re: Regular Expressions (OMVS)
>
> But, if you have the Ported Tools available, then you have a older version of perl. It can do what you want. [...]
Re: Regular Expressions (OMVS)
Do you need to escape the '?'?

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Ken MacKenzie
> Sent: Monday, February 06, 2012 10:44 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Regular Expressions (OMVS)
>
> AFAICS I'm supposed to be able to use a ? To stop on first occurrence [...]

This e-mail may contain confidential or privileged information. If you think you have received this e-mail in error, please advise the sender by reply e-mail and then delete this e-mail immediately. Thank you. Aetna
Re: Regular Expressions (OMVS)
On my z/OS 1.12 system (I'm the senior z/OS sysprog), perl is in /usr/lpp/perl/bin. This is where the CBPDO placed it. Perhaps you have this, but just not on your PATH?

--
John McKown

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Ken MacKenzie
> Sent: Monday, February 06, 2012 9:48 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Regular Expressions (OMVS)
>
> perl: FSUM7351 not found [...]
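A quick way to tell "not installed" from "not on PATH" is to ask the shell whether it can resolve the command. This is a minimal sketch using the POSIX `command -v` builtin; `find_cmd` is a hypothetical helper name, and /usr/lpp/perl/bin is simply the directory mentioned in this message, which may differ on other systems.

```shell
# find_cmd is a hypothetical helper, not a z/OS utility.
# It prints the resolved path of a command, or fails if PATH cannot resolve it.
find_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    else
        echo "$1: not found on PATH" >&2
        return 1
    fi
}

# Directory taken from the post; adjust for the local installation.
find_cmd perl || echo "perl may still be installed, e.g. under /usr/lpp/perl/bin"
```

If the command resolves, the full path is printed; otherwise the hint points at the directory to check by hand.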
Re: Regular Expressions (OMVS)
I just tried that with both sed and awk. Escaping the ? doesn't help. It makes it not match at all.

--
John McKown

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Veilleux, Jon L
> Sent: Monday, February 06, 2012 9:55 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Regular Expressions (OMVS)
>
> Do you need to escape the '?'
Re: Regular Expressions (OMVS)
It's getting worse :-(

Xxx:/usr/lpp/perl/bin: pwd
/usr/lpp/perl/bin
Xxx:/usr/lpp/perl/bin: ls -la
total 1038
drwxr-xr-x 2 ZDDFX3D $#OMGID1 704 Nov 12 2010 .
drwxr-xr-x 6 ZDDFX3D $#OMGID1 448 Nov 12 2010 ..
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 114688 Nov 12 2010 a2p
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 164 Nov 12 2010 cppstdin
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 38091 Nov 12 2010 enc2xs
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 24133 Nov 12 2010 find2perl
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 27042 Nov 12 2010 h2ph
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 59777 Nov 12 2010 h2xs
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 3589 Nov 12 2010 instmodsh
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 15742 Nov 12 2010 libnetcfg
lrwxrwxrwx 1 ZDDFX3D $#OMGID1 9 Dec 21 2010 perl -> perl5.8.7
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 24576 Nov 12 2010 perl5.8.7
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 52735 Nov 12 2010 psed
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 52735 Nov 12 2010 s2p
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 17383 Nov 12 2010 splain
-rwxr-xr-x 2 ZDDFX3D $#OMGID1 51457 Nov 12 2010 xsubpp
Xxx:/usr/lpp/perl/bin: perl -np -e 's/CD.*?QR/junkt/' /u/xxx/fxdata
CEE3501S The module libperl.so was not found.
The traceback information could not be determined.
LEAID ENTERED (LEVEL 08/30/2010 AT 12.37) LEAID LEAID112 LEAID ACKNOWLEDGES UNSUPPORTED USS ENVIRONMENT LEAID LEAID112 ABEND-AID PROCESSING SUPPRESSED LEAID PROCESSING COMPLETE. RC=4
[1] + Done(137) perl -np -e 's/CD.*?QR/junkt/' /u/xxx/fxdata
67109308 Killed ./perl
Xxx:/usr/lpp/perl/bin:

Ken MacKenzie
Pramerica Systems Ireland Limited is a private company limited by shares incorporated and registered in the Republic of Ireland with registered number 319900 and registered office at 6th Floor, South Bank House, Barrow Street, Dublin 4, Ireland.

From: McKown, John john.mck...@healthmarkets.com
To: IBM-MAIN@bama.ua.edu
Date: 06/02/2012 16:02
Subject: Re: Regular Expressions (OMVS)
Sent by: IBM Mainframe Discussion List IBM-MAIN@bama.ua.edu

On my z/OS 1.12 system (I'm the senior z/OS sysprog), perl is in /usr/lpp/perl/bin. This is where the CBPDO placed it. Perhaps you have this, but just not on your PATH?

-- John McKown
Systems Engineer IV
IT Administrative Services Group
HealthMarkets(r)
9151 Boulevard 26 * N. Richland Hills * TX 76010
(817) 255-3225 phone * john.mck...@healthmarkets.com * www.HealthMarkets.com

Confidentiality Notice: This e-mail message may contain confidential or proprietary information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. HealthMarkets(r) is the brand name for products underwritten and issued by the insurance subsidiaries of HealthMarkets, Inc. -The Chesapeake Life Insurance Company(r), Mid-West National Life Insurance Company of TennesseeSM and The MEGA Life and Health Insurance Company.SM

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Ken MacKenzie
Sent: Monday, February 06, 2012 9:48 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

:-(

Xxx:/u/xxx: perl -np -e 's/CD.*?QR/junkt/' fxdata
perl: FSUM7351 not found
Xxx:/u/xxx:

From: McKown, John john.mck...@healthmarkets.com
To: IBM-MAIN@bama.ua.edu
Date: 06/02/2012 15:44
Subject: Re: Regular Expressions (OMVS)
Sent by: IBM Mainframe Discussion List IBM-MAIN@bama.ua.edu

Remember that a regex, by default, is greedy. IOW, it matches as much as possible. So with the CD.*QR pattern, you match from the first CD until the __last__ QR it can find, not the first QR it can find. You would need the non-greedy regex CD.*?QR. However, it is unfortunate that IBM, in its POSIX-inspired blindness, does not implement non-greedy regular expressions. I can't get this to work in either awk or sed. But if you have the Ported Tools available, then you have an older version of perl. It can do what you want. Example:

$cat test.txt
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
$perl -np -e 's/CD.*?QR/junkt/' test.txt
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ

Where the leading $ is my shell prompt character. I crossposted this reply to MVS-OE.
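To see exactly which characters the greedy pattern grabs, one option is to print the matched span instead of replacing it. The sketch below assumes a POSIX awk that provides match(), RSTART and RLENGTH; the function name show_match is made up for this illustration and is not something from the thread.

```shell
# show_match: hypothetical helper. Prints the part of each input line
# that the extended regular expression in $1 matches (leftmost, longest).
show_match() {
  awk -v re="$1" 'match($0, re) { print substr($0, RSTART, RLENGTH) }'
}

printf '%s\n' 'ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ' | show_match 'CD.*QR'
# prints CDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQR
```

The printed span runs from the first CD to the last QR, which is why the substitution leaves ZZZ rather than XXX after junkt.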
Re: Regular Expressions (OMVS)
I disagree. At least you have the filesystem there. Unfortunately, I did this so long ago that I did not entirely remember what I needed to do to get it running. In addition to the PATH, you need to update the LIBPATH for the dynamically loaded shared objects. On my system, that's /usr/lpp/perl/lib/5.8.7/os390-thread-multi/CORE:

export LIBPATH=${LIBPATH}:/usr/lpp/perl/lib/5.8.7/os390-thread-multi/CORE

The only two lines in my /etc/profile which reference perl are:

PATH=${PATH}:/usr/lpp/perl/bin
LIBPATH=${LIBPATH}:/usr/lpp/perl/lib/5.8.7/os390-thread-multi/CORE

I tend to put in a series of such assignments, to keep each subdirectory on a different line for ease of maintenance. At the end, I just put:

export PATH
export LIBPATH

-- John McKown

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Ken MacKenzie
Sent: Monday, February 06, 2012 10:09 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

It's getting worse :-(
snip
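The CEE3501S message says the loader could not find libperl.so, so before rerunning perl it may help to confirm that some directory in LIBPATH actually contains it. The following is only a sketch: find_in_path_list is a hypothetical helper written for this note, and it assumes a POSIX shell that word-splits unquoted expansions on IFS.

```shell
# find_in_path_list: hypothetical helper, not part of z/OS or perl.
# $1 is a file name, $2 is a colon-separated directory list such as $LIBPATH.
# Prints the first "dir/name" that exists and returns 0, else returns 1.
find_in_path_list() {
  name=$1
  old_ifs=$IFS
  IFS=:
  for dir in $2; do
    if [ -n "$dir" ] && [ -f "$dir/$name" ]; then
      IFS=$old_ifs
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# Report whether libperl.so is reachable through the current LIBPATH.
if ! find_in_path_list libperl.so "${LIBPATH:-}"; then
  echo "libperl.so not found in LIBPATH" >&2
fi
```

If nothing is printed on standard output, appending the CORE directory named above to LIBPATH and exporting it would be the next thing to try.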
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 08:44:01 -0600, Ken MacKenzie ken.macken...@pramerica.ie wrote:

> snip
> I tried: awk 'sub(/CD.*QR/,"junkt")' fxdata
> in an attempt to change ABCDEFGNOPQRXXX to ABjunktXXX but instead, it takes the final occurrence of QR, and returns ABjunktZZZ. Notice the ZZZ on the end instead of XXX.
> snip

try this:
awk 'sub(/CD[^Q]*QR/,"junkt")'
or this:
sed -e 's/CD[^Q]*QR/junkt/'

Bill
Re: Regular Expressions (OMVS)
I keep mistyping things today. My fingers are aching horribly. Sorry if I am causing confusion.

-- John McKown
Re: Regular Expressions (OMVS)
> -Original Message-
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Bill Godfrey
> Sent: Monday, February 06, 2012 10:19 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Regular Expressions (OMVS)
>
> snip
> try this: awk 'sub(/CD[^Q]*QR/,"junkt")'
> or this: sed -e 's/CD[^Q]*QR/junkt/'
> Bill

That will work on that specific example, but it won't if a Q followed by some other character appears before the first QR.

-- John McKown
Re: Regular Expressions (OMVS)
Yeah, that worked, but in reality the "from" string will be supplied by the user and the "to" string will be computer generated, so there's no predicting what they typed.

Ken MacKenzie

From: Bill Godfrey yak36...@yahoo.com
To: IBM-MAIN@bama.ua.edu
Date: 06/02/2012 16:27
Subject: Re: Regular Expressions (OMVS)
Sent by: IBM Mainframe Discussion List IBM-MAIN@bama.ua.edu

snip
try this: awk 'sub(/CD[^Q]*QR/,"junkt")'
or this: sed -e 's/CD[^Q]*QR/junkt/'
Bill
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 10:29:07 -0600, McKown, John wrote:

> -Original Message-
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Bill Godfrey
> Sent: Monday, February 06, 2012 10:19 AM
>
> snip
> try this: awk 'sub(/CD[^Q]*QR/,"junkt")'
> or this: sed -e 's/CD[^Q]*QR/junkt/'
> Bill
>
> Will work on that specific example. But won't if a Q appears with some other character after it, before the first QR.

Yup. Bill and I made the same mistake. But I've used a similar construct to skip to the next delimiter, such as:

[^,]*, * # skips to the next comma; swallows the comma and trailing blanks.

-- gil
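The delimiter-skipping construct can be tried in isolation. This is a minimal sketch using sed with basic regular expressions; the sample input and the function name skip_field are invented for the illustration.

```shell
# skip_field: hypothetical wrapper around the construct quoted above.
# [^,]* cannot run past a comma, so the match always ends at the first
# comma, and " *" then swallows any blanks that follow it.
skip_field() {
  sed 's/[^,]*, *//'
}

printf '%s\n' 'alpha,   beta,gamma' | skip_field
# prints beta,gamma
```

The trick is reliable here because the delimiter is a single character. With a two-character terminator such as QR, excluding only the Q also excludes every Q that is not followed by R, which is where the CD[^Q]*QR workaround runs into trouble.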
Re: Regular Expressions (OMVS)
Right... This was the kind of workaround for the lack of non-greedy matching I was thinking of. I wonder what support (if any) IBM C++ has for non-greedy matching.

Martin

Martin Packer

From: Bill Godfrey yak36...@yahoo.com
To: IBM-MAIN@bama.ua.edu
Date: 06/02/2012 16:25
Subject: Re: Regular Expressions (OMVS)
Sent by: IBM Mainframe Discussion List IBM-MAIN@bama.ua.edu

snip
try this: awk 'sub(/CD[^Q]*QR/,"junkt")'
or this: sed -e 's/CD[^Q]*QR/junkt/'
Bill

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
Re: Regular Expressions (OMVS)
Yeah, not implementing non-greedy regular expressions is not nice, in my not-so-humble opinion. IMO, IBM would have done much better by all to start with Linux as a base and do whatever was needed to get it POSIX compliant and certified. But IBM tends to have NIH real bad. And z/OS development (or management) really seems to have it bad. Maybe I've been around the FOSS and GNU people too much recently.

-- John McKown

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Paul Gilmartin
Sent: Monday, February 06, 2012 10:45 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

snip
Re: Regular Expressions (OMVS)
I would almost bet that awk and sed both use the C language's regex implementation. So I doubt that C (or C++) implements non-greedy regexps. Personally, I like PCRE: Perl Compatible Regular Expressions.

-- John McKown

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Martin Packer
Sent: Monday, February 06, 2012 11:00 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

Right... This was the kind of workaround for lack of non greedy I was thinking of. Wondering what the IBM C++ support for non-greedy (if any) is.
Martin
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 11:05:01 -0600, McKown, John wrote:

> Yeah, not implementing non greedy regular expressions is not nice. In my not-so-humble opinion. IMO, IBM would have done much better by all to start with Linux as a base and do whatever was needed to get it POSIX compliant and certified. But, IBM tends to have NIH real bad. And z/OS development (or management) really seems to have it bad. Maybe I've been around the FOSS and GNU people too much recently.

No, IBM is merely revenue-oriented. Well, a little NIH, but mostly because of fear of IP and malware liability. Still, I wish OMVS had merely supplied a kernel (ASCII-based, of course) and let shell and utilities happen by osmosis from FOSS. Actually IIRC, in the early releases shell and utilities were separately priced but FOSS never happened. I blame EBCDIC.

-- gil
Re: Regular Expressions (OMVS)
On Mon, 2012-02-06 at 12:05 -0500, McKown, John wrote:

> IMO, IBM would have done much better by all to start with Linux as a base and do whatever was needed to get it POSIX compliant and certified. But, IBM tends to have NIH real bad.

I recall that Unix-branding was important to IBM, while Linus never felt the need for it.

-- David Andrews
A. Duda Sons, Inc.
david.andr...@duda.com
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 10:29:07 -0600, McKown, John john.mck...@healthmarkets.com wrote:

> -Original Message-
> From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Bill Godfrey
> Sent: Monday, February 06, 2012 10:19 AM
> To: IBM-MAIN@bama.ua.edu
> Subject: Re: Regular Expressions (OMVS)
>
> snip
> try this: awk 'sub(/CD[^Q]*QR/,"junkt")'
> or this: sed -e 's/CD[^Q]*QR/junkt/'
> Bill
>
> Will work on that specific example. But won't if a Q appears with some other character after it, before the first QR.

Did you try it? Where a Q appears with some other character after it, before the first QR? I did. It skips to the one that has the first QR, as it should.

echo ABCDEFGNOPQSXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ | awk 'sub(/CD[^Q]*QR/,"junkt")'
ABCDEFGNOPQSXXXABjunktYYYABCDEFGNOPQRZZZ

I see that Ken has added to the problem description since my earlier reply.

Bill
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 12:19:35 -0500, David Andrews wrote:

> I recall that Unix-branding was important to IBM, while Linus never felt the need for it.

Ironically, the market that impelled IBM to OpenEdition never seemed to care that Windows and Linux weren't UNIX branded.

-- gil
Re: Regular Expressions (OMVS)
In 6198832018945358.wa.ken.mackenziepramerica...@bama.ua.edu, on 02/06/2012 at 08:44 AM, Ken MacKenzie ken.macken...@pramerica.ie said:

> I'm playing around with regular expressions and I want to achieve the following. I spoke to a Unix geek but he didn't really understand what I was asking.

Ask him about lazy matches versus greedy matches. I don't believe that awk supports lazy matches.

-- Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress. (S877: The Shut up and Eat Your spam act of 2003)
Re: Regular Expressions (OMVS)
John McKown wrote:

> I would almost bet that awk and sed both use the C language's regex implementation. So I doubt that C (or C++) implements non greedy regexps. Personally, I like pcres: Perl Compatible Regular Expressions.

This is an EXCELLENT example of why I advised some developers the other day to go with whatever IBM C++ does - in order to avoid getting into (probably unhelpful to them) debates over which flavour of Regexps to support.

Martin Packer
Re: Regular Expressions (OMVS)
Oh, definitely critically important. The US Gov required POSIX compliance from anyone bidding on their contracts. That was the birth of MVS Open Edition.

-- John McKown

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of David Andrews
Sent: Monday, February 06, 2012 11:20 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

snip
I recall that Unix-branding was important to IBM, while Linus never felt the need for it.
Re: Regular Expressions (OMVS)
I will agree with you from the developer's viewpoint, if for no other reason than that it gives an explanation which is generally acceptable: "We use the IBM-supplied regexp engine. If you need changes, IBM is the proper venue for discussion." Maintaining the regexp code itself is just not cost effective. And no matter what is decided, somebody will complain. Like me - grin

-- John McKown

-Original Message-
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Martin Packer
Sent: Monday, February 06, 2012 11:46 AM
To: IBM-MAIN@bama.ua.edu
Subject: Re: Regular Expressions (OMVS)

snip
Re: Regular Expressions (OMVS)
On Mon, 6 Feb 2012 16:46:58 +, Ken MacKenzie ken.macken...@pramerica.ie wrote:

> Yeah, that worked but in reality, the from string will be supplied by the user and the to string will be computer generated so there's no predicting what they typed.

There's no predicting what they typed? What would a user type that would end up as your awk 'sub(/CD.*QR/,"junkt")' fxdata ? You haven't stated what the user types. I would think that a script would be able to figure out which letter goes after the ^ character in awk 'sub(/CD[^Q]*QR/,"junkt")' and would be happy to try to demonstrate that, but I can't say for sure unless you provide more information about the user input.

Bill
Re: Regular Expressions (OMVS)
I think that I may have misunderstood what the OP wanted. The awk script you gave and the perl one that I gave produce different output on the first line of my modified file. The way I envisioned what the OP wanted was:

Find the first instance of CD in the given string.
Find all other characters following that until the first QR substring.
Replace those characters with junkt.

What my perl regexp matches and what your awk matches are not the same segment. I don't know which the OP wanted, now. My modified file places a Q in column 11, moving all other characters right one. It also removes the D which was originally in column 27.

$cat test.txt
ABCDEFQGNOPQRXXXABCEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
$awk 'sub(/CD[^Q]*QR/,"junkt")' test.txt
ABCDEFQGNOPQRXXXABCEFGNOPQRYYYABjunktZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
$perl -np -e 's/CD.*?QR/junkt/' test.txt
ABjunktXXXABCEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ

All the lines, other than the first, in test.txt are identical. The first line has a Q inserted after the first F, and the second D removed from between the C and the E. My string matches from the first CD (column 7) to the first QR (column 16). Your awk matches the CD in column 45 to the QR in column 53.

-- John McKown
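The three steps John lists (first CD, first QR after it, replace the span) can be followed literally in awk without any non-greedy support, as long as the two delimiters are fixed strings rather than patterns. What follows is a sketch under that assumption, not something posted in the thread; the name lazy_replace and its argument order are invented here. Note that awk processes backslash escapes in -v assignments, so delimiters containing backslashes would need extra care.

```shell
# lazy_replace: hypothetical helper. Replaces the text from the first
# occurrence of $1 through the first occurrence of $2 after it with $3.
# $1 and $2 are literal strings, not regular expressions.
# Unlike awk 'sub(...)' used as a pattern, it prints every line, changed or not.
lazy_replace() {
  awk -v startstr="$1" -v endstr="$2" -v repl="$3" '
  {
    s = index($0, startstr)                      # first start string, 0 if absent
    if (s > 0) {
      rest = substr($0, s + length(startstr))    # text after the start string
      e = index(rest, endstr)                    # first end string after it
      if (e > 0)
        $0 = substr($0, 1, s - 1) repl substr(rest, e + length(endstr))
    }
    print
  }'
}

printf '%s\n' 'ABCDEFQGNOPQRXXXABCEFGNOPQRYYYABCDEFGNOPQRZZZ' | lazy_replace CD QR junkt
# prints ABjunktXXXABCEFGNOPQRYYYABCDEFGNOPQRZZZ
```

On the modified first line of test.txt this gives the same result as the perl non-greedy substitution, because index() always reports the earliest occurrence.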
Re: Regular Expressions (OMVS)
In 0078805690815046.wa.yak36790yahoo@bama.ua.edu, on 02/06/2012 at 11:25 AM, Bill Godfrey yak36...@yahoo.com said:

> Did you try it?

Trying it on a string that you know it handles isn't good enough; try it on a string where a Q appears between the CD and the QR.

> Where a Q appears with some other character after it, before the first QR?

It should match CDEFGNOPQSXXXABCDEFGNOPQR; instead it matches only CDEFGNOPQR.

> It skips to the one that has the first QR, as it should.

It also skips past the first CD, which it shouldn't.

> echo ABCDEFGNOPQSXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ | awk 'sub(/CD[^Q]*QR/,"junkt")'

echo ABCDEQFGNOPQSXXXABCDEQFGNOPQRYYYABCDEQFGNOPQRZZZ | awk 'sub(/CD[^Q]*QR/,"junkt")'

--
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
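To see which segment a pattern actually picks, it can help to print the matched text instead of substituting. The following is a small sketch, assuming a POSIX awk; show_match is a hypothetical helper name, and it relies only on the standard match() function with RSTART and RLENGTH.

```shell
# Hypothetical helper: print the text that the regex in $1 matches on each input line.
show_match() {
    awk -v re="$1" '{
        if (match($0, re))                       # sets RSTART and RLENGTH on success
            print substr($0, RSTART, RLENGTH)
        else
            print "no match"
    }'
}

echo ABCDEFGNOPQSXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ    | show_match 'CD[^Q]*QR'
echo ABCDEQFGNOPQSXXXABCDEQFGNOPQRYYYABCDEQFGNOPQRZZZ | show_match 'CD[^Q]*QR'
```

The first command prints CDEFGNOPQR, the segment that starts at the second CD. The second prints "no match", because in that string every CD is followed by a Q that is not part of a QR before any QR is reached, so the sub() one-liner would change nothing and print nothing.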
Re: Regular Expressions (OMVS)
In 0584589877710221.wa.yak36790yahoo@bama.ua.edu, on 02/06/2012 at 10:19 AM, Bill Godfrey yak36...@yahoo.com said:

> try this:
> awk 'sub(/CD[^Q]*QR/,"junkt")'
> or this:
> sed -e 's/CD[^Q]*QR/junkt/'

ABCDEFGNOPQQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ
ABCDQEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ

--
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
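Both of these strings defeat CD[^Q]*QR at the first CD, because a Q that is not immediately followed by R appears before the first QR. One possible workaround that stays within POSIX extended regular expressions is to let the middle part consume either a non-Q character or a run of Qs followed by something other than Q or R, and then require a final run of Qs followed by R. This is a sketch under that assumption, not something proposed in the thread; the function name first_span_to_junkt is hypothetical.

```shell
# Hypothetical helper: replace from a CD through the first QR that follows it,
# using a single POSIX ERE in place of a non-greedy quantifier.
first_span_to_junkt() {
    # ([^Q]|Q+[^QR])* can never consume a Q that is directly followed by R,
    # so the match has to end at the first QR after the CD.
    awk '{ sub(/CD([^Q]|Q+[^QR])*Q+R/, "junkt"); print }'
}

printf '%s\n' \
    ABCDEFGNOPQQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ \
    ABCDQEFGNOPQRXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ | first_span_to_junkt
```

Both lines come out as ABjunktXXXABCDEFGNOPQRYYYABCDEFGNOPQRZZZ. The pattern is tied to the two-character terminator QR; a different terminator needs a different expression, which is where the perl non-greedy form is easier to maintain.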
Re: Regular Expressions (OMVS)
Thanks to you and Shmuel for pointing out the flaw in my logic.

Bill

On Mon, 6 Feb 2012 12:41:24 -0600, McKown, John john.mck...@healthmarkets.com wrote:

> I think that I may have misunderstood what the OP wanted.

snip