Re: I can't understand re.sub
On 01/12/15 05:28, Jussi Piitulainen wrote:
> A real solution should be aware of the actual structure of those lines,
> assuming they follow some defined syntax.

I think that we are in violent agreement on this ;)

E.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: I can't understand re.sub
On 29/11/15 21:36, Mr Zaug wrote:
> I need to use re.sub to replace strings in a text file.

Do you? Is there any other way?

> result = re.sub(pattern, repl, string, count=0, flags=0);
>
> I think I understand that pattern is the regex I'm searching for and
> repl is the thing I want to substitute for whatever pattern finds but
> what is string?

Where do you think the function gets the string you want to transform from?

> This should be simple, right?

It is. And it could be even simpler if you don't bother with regexes at all (if your input is as fixed as you say it is):

    >>> foo = "foo bar baz spam CONTENT_PATH bar spam"
    >>> ' Substitute '.join(foo.split(' CONTENT_PATH ', 1))
    'foo bar baz spam Substitute bar spam'
    >>>

E.
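To make the question about the third argument concrete, here is a minimal sketch of how the arguments of re.sub line up, using the sample string from the message above. The variable names template_line and result are illustrative only and do not come from the thread.

```python
import re

# pattern: what to search for
# repl:    what to put in its place
# string:  the text to search in (a str, for example one line read from a file)
template_line = "foo bar baz spam CONTENT_PATH bar spam"

# re.sub returns a new string; template_line itself is left unchanged.
result = re.sub(r"CONTENT_PATH", "Substitute", template_line)
print(result)
```

The optional count argument limits how many matches are replaced; the default of 0 means "replace all of them".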
Re: I can't understand re.sub
On 30/11/15 08:51, Jussi Piitulainen wrote:
> Surely the straight thing to say is:
>
>     >>> foo.replace(' CONTENT_PATH ', ' Substitute ')
>     'foo bar baz spam Substitute bar spam'

Not quite the same thing (but yes, with a third argument of 1, it would be).

> But there was no guarantee of spaces around the target.

I know. It was just an example to show that there might be an option that's not a regex for the specific use indicated. It's up to the OP to decide whether they think the spaces (or any other, or no, delimiter) would actually be required or useful. Or whether they really prefer a regex after all.

> If you wish to, say, replace "spam" in your foo with "REDACTED" but
> leave it intact in "May the spammer be prosecuted", a regex might be
> attractive after all.

But that's not what the OP said they wanted to do. They said everything was very fixed - they did not want a general purpose human language text processing solution ... ;)

E.
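The "not quite the same thing" point can be seen with an input that contains the target twice. This is a small sketch; the sample string and variable names are made up for illustration.

```python
foo = "spam CONTENT_PATH eggs CONTENT_PATH ham"

# split with maxsplit=1 followed by join replaces only the first occurrence
via_split = " Substitute ".join(foo.split(" CONTENT_PATH ", 1))

# str.replace with a count of 1 does the same
via_replace_once = foo.replace(" CONTENT_PATH ", " Substitute ", 1)

# str.replace without a count replaces every occurrence
via_replace_all = foo.replace(" CONTENT_PATH ", " Substitute ")

print(via_split)
print(via_replace_once)
print(via_replace_all)
```

With the OP's stated input (each item appears at most once per line) all three forms give the same result.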
Re: I can't understand re.sub
Erik writes:

> On 30/11/15 08:51, Jussi Piitulainen wrote:
[- -]
>> If you wish to, say, replace "spam" in your foo with "REDACTED" but
>> leave it intact in "May the spammer be prosecuted", a regex might be
>> attractive after all.
>
> But that's not what the OP said they wanted to do. They said
> everything was very fixed - they did not want a general purpose human
> language text processing solution ... ;)

Language processing is not what I had in mind here. Merely this, that there is some sort of word boundary, be it punctuation, whitespace, or an end of the string:

    >>> re.sub(r'\bspam\b', '', 'spamalot spam')
    'spamalot '

That's not perfect either, but it's simple and might be somewhat proportional to the problem. A real solution should be aware of the actual structure of those lines, assuming they follow some defined syntax.
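For comparison, a short sketch of the word-boundary pattern next to plain str.replace, applied to the "REDACTED" case mentioned earlier in the thread. The sample text is invented for illustration.

```python
import re

text = "spam, spammer and more spam"

# \b matches at a word boundary, so "spam" inside "spammer" is left alone
with_boundary = re.sub(r"\bspam\b", "REDACTED", text)

# str.replace has no notion of word boundaries and also hits "spammer"
without_boundary = text.replace("spam", "REDACTED")

print(with_boundary)
print(without_boundary)
```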
Re: I can't understand re.sub
Erik writes:

> On 29/11/15 21:36, Mr Zaug wrote:
>> This should be simple, right?
>
> It is. And it could be even simpler if you don't bother with regexes
> at all (if your input is as fixed as you say it is):
>
>     >>> foo = "foo bar baz spam CONTENT_PATH bar spam"
>     >>> ' Substitute '.join(foo.split(' CONTENT_PATH ', 1))
>     'foo bar baz spam Substitute bar spam'

Surely the straight thing to say is:

    >>> foo.replace(' CONTENT_PATH ', ' Substitute ')
    'foo bar baz spam Substitute bar spam'

But there was no guarantee of spaces around the target. If you wish to, say, replace "spam" in your foo with "REDACTED" but leave it intact in "May the spammer be prosecuted", a regex might be attractive after all.
Re: I can't understand re.sub
On Sun, 29 Nov 2015 13:36:57 -0800, Mr Zaug wrote:

> result = re.sub(pattern, repl, string, count=0, flags=0);

re.sub works on a string, not on a file. Read the file into a string and pass that in as the string argument.

Or pre-compile the search pattern(s) and process the file line by line:

    import re

    patts = [
        (re.compile("axe"), "hammer"),
        (re.compile("cat"), "dog"),
        (re.compile("tree"), "fence"),
    ]

    with open("input.txt", "r") as inf, open("output.txt", "w") as ouf:
        # iterate over every line, not just the first one
        for line in inf:
            for patt, repl in patts:
                line = patt.sub(repl, line)
            ouf.write(line)

Not tested, but I think it should do the trick.

Or use a single patt and a replacement func:

    import re

    patt = re.compile("(axe)|(cat)|(tree)")

    def replfunc(match):
        # re.sub passes a match object; group(0) is the matched text
        word = match.group(0)
        if word == 'axe':
            return 'hammer'
        if word == 'cat':
            return 'dog'
        if word == 'tree':
            return 'fence'
        return word

    with open("input.txt", "r") as inf, open("output.txt", "w") as ouf:
        for line in inf:
            ouf.write(patt.sub(replfunc, line))

(also not tested)

-- 
Denis McMahon, denismfmcma...@gmail.com
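One thing to keep in mind with a replacement function is that re.sub calls it with a match object, not with the matched text, so the text has to be pulled out with match.group(0). The following is a sketch of a dictionary-driven variant of the single-pattern idea; the names replacements and convert are illustrative and not from the thread, and it works on an in-memory string rather than on files.

```python
import re

# Search word -> replacement, kept in one place
replacements = {"axe": "hammer", "cat": "dog", "tree": "fence"}

# Build one alternation pattern from the keys; re.escape guards against
# keys that happen to contain regex metacharacters.
patt = re.compile("|".join(re.escape(word) for word in replacements))

def replfunc(match):
    # match is a match object; group(0) is the text that actually matched
    return replacements[match.group(0)]

def convert(text):
    return patt.sub(replfunc, text)

print(convert("the cat sat in the tree with an axe"))
```

Like the patterns above, this has no word boundaries, so "cat" inside a longer word would be replaced as well.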
Re: I can't understand re.sub
Thanks. That does help quite a lot.
Re: I can't understand re.sub
On Sunday, November 29, 2015 at 3:37:34 PM UTC-6, Mr Zaug wrote:
> The items I'm searching for are few and they do not change. They are
> "CONTENT_PATH", "ENV" and "NNN". These appear on a few lines in a template
> file. They do not appear together on any line and they only appear once on
> each line. This should be simple, right?

Yes. In fact so simple that string methods and a "for loop" will suffice. Using regexps for this task would be like using a dump truck to haul a teaspoon of salt.
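A minimal sketch of the string-methods-and-a-for-loop approach might look like this. The three placeholder names come from the original post; the replacement values, the template lines and the helper name fill_template are invented for illustration, since the thread does not show the real template.

```python
# Placeholder -> value; the values here are hypothetical examples.
values = {"CONTENT_PATH": "/content/snakeoil", "ENV": "pre-prod", "NNN": "087"}

def fill_template(lines, values):
    """Return a new list of lines with every placeholder replaced."""
    filled = []
    for line in lines:
        for placeholder, value in values.items():
            line = line.replace(placeholder, value)
        filled.append(line)
    return filled

# Hypothetical template lines; a real script would read these from a file.
template = ["path = CONTENT_PATH\n", "env = ENV\n", "id = NNN\n", "fixed line\n"]

for line in fill_template(template, values):
    print(line, end="")
```

Because an open file object yields lines when iterated over, it could be passed as lines directly, and the result written out with writelines().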
Re: I can't understand re.sub
On Sunday, November 29, 2015 at 8:12:25 PM UTC-5, Rick Johnson wrote:
> On Sunday, November 29, 2015 at 3:37:34 PM UTC-6, Mr Zaug wrote:
>
> > The items I'm searching for are few and they do not change. They are
> > "CONTENT_PATH", "ENV" and "NNN". These appear on a few lines in a template
> > file. They do not appear together on any line and they only appear once on
> > each line. This should be simple, right?
>
> Yes. In fact so simple that string methods and a "for loop" will suffice.
> Using regexps for this task would be like using a dump truck to haul a
> teaspoon of salt.

I rarely get a chance to do any scripting so yeah, I stink at it. Ideally I would have a script that will spit out a config file such as 087_pre-prod_snakeoil_farm.any and not need to manually rename said output file.
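Naming the output file from the same values could be sketched as follows. The layout NNN_ENV_name_farm.any is only a guess based on the single example filename above, and output_filename and its parameters are hypothetical.

```python
def output_filename(nnn, env, name):
    # Assumed layout, inferred from one example: <NNN>_<ENV>_<name>_farm.any
    return f"{nnn}_{env}_{name}_farm.any"

filename = output_filename("087", "pre-prod", "snakeoil")
print(filename)
```

The returned name can be passed straight to open(filename, "w"), so the script writes the finished config under its final name and no manual rename is needed.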