Re: [Tutor] implementing sed - termination error
bruce wrote:

> Hi
>
> Running a test on a linux box, with python.
>
> Trying to do a search/replace over a file, for a given string, and
> replacing the string with a chunk of text that has multiple lines.
>
> From the cmdline, using sed, no prob. however, implementing sed, runs
> into issues, that result in a "termination error"
>
> The error gets thrown, due to the "\" of the newline. SO, and other
> sites have plenty to say about this, but haven't run across any soln.
>
> The test file contains 6K lines, but, the process requires doing lots
> of search/replace operations, so I'm interested in testing this method
> to see how "fast" the overall process is.
>
> The following pseudo code is what I've used to test. The key point
> being changing the "\n" portion to try to resolve the termination
> error.

Here's a self-contained example that demonstrates that the key change is
to avoid shell=True.

$ cat input.txt
foo alpha beta foo gamma epsilon foo zeta
$ sed s/foo/bar\\nbaz/g input.txt
bar
baz alpha beta bar
baz gamma epsilon bar
baz zeta
$ python3
Python 3.4.3 (default, Sep 14 2016, 12:36:27)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call(["sed", "s/foo/bar\\nbaz/g", "input.txt"])
bar
baz alpha beta bar
baz gamma epsilon bar
baz zeta
0

Both the shell and Python process backslash escapes, so if a string
passes through one after the other you have to escape the escapes. With
only one level of escaping, and a little luck, you need not make any
changes between Python and the shell.

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
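To make the point about escape levels concrete, here is a small sketch that compares the two routes without running sed at all. It uses shlex.split() from the standard library as an approximation of the shell's word splitting and backslash removal (an assumption for illustration; a real shell does more than this), and checks that the shell command line and the Python list from the example above give sed the same script.

```python
import shlex

# The command exactly as typed at the shell prompt in the example above.
shell_command = r"sed s/foo/bar\\nbaz/g input.txt"

# shlex.split() applies POSIX-style word splitting and backslash removal,
# which approximates what the shell does before it starts sed.
argv_from_shell = shlex.split(shell_command)

# The list passed to subprocess.call() in the example above.
argv_from_python = ["sed", "s/foo/bar\\nbaz/g", "input.txt"]

# Both routes hand sed the same script, containing a single backslash.
print(argv_from_shell == argv_from_python)
print(argv_from_python[1])
```

In both cases exactly one layer consumes one level of backslashes. Passing a Python string through the shell as well (shell=True) adds a second layer, which is where the extra escaping comes from.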
Re: [Tutor] implementing sed - termination error
On 01Nov2016 20:18, bruce wrote:

> Running a test on a linux box, with python. Trying to do a
> search/replace over a file, for a given string, and replacing the
> string with a chunk of text that has multiple lines.
>
> From the cmdline, using sed, no prob. however, implementing sed, runs
> into issues, that result in a "termination error"

Just terminology: you're not "implementing sed", which is a nontrivial
task that would involve writing a python program that could do
everything sed does. You're writing a small python program to call sed
to do the work.

Further discussion below.

> The error gets thrown, due to the "\" of the newline. SO, and other
> sites have plenty to say about this, but haven't run across any soln.
>
> The test file contains 6K lines, but, the process requires doing lots
> of search/replace operations, so I'm interested in testing this method
> to see how "fast" the overall process is.
>
> The following pseudo code is what I've used to test. The key point
> being changing the "\n" portion to try to resolve the termination
> error.
>
> import subprocess
>
> ll_="ffdfdfdfg"
> ll2_="12112121212121212"
> hash="a"
>
> data_=ll_+"\n"+ll2_+"\n"+qq22_
> print data_

Presumably the definition of qq22_ is just not shown.

> cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
> print cc
>
> proc=subprocess.Popen(cc, shell=True,stdout=subprocess.PIPE)
> res=proc.communicate()[0].strip()

There are two fairly large problems with this program.

The first is your need to embed newlines in the replacement pattern. You
have genuine newlines in your string, but a sed command would look like
this:

    sed 's/a/ffdfdfdfg\
    12112121212121212\
    q/g'

so you need to replace the newlines with "backslash and newline".

Fortunately strings have a .replace() method which you can use for this
purpose. Look it up:

    https://docs.python.org/3/library/stdtypes.html#str.replace

You can use it to make data_ how you want it to be for the command.

The second problem is that you're then trying to invoke sed by
constructing a shell command string and handing that to Popen. This
means that you need to embed shell syntax in that string to quote things
like the sed command. All very messy.

It is better to _bypass_ the shell and invoke sed directly by leaving
out the "shell=True" parameter. All the command line (which is the
shell) is doing is honouring the shell quoting and constructing a sed
invocation as distinct strings:

    sed
    -i
    s/this/that/g
    filename

You want to do the equivalent in python, something like this:

    sed_argv = [ 'sed', '-i', 's/'+hash+'/'+data_+'/g', dname ]
    proc=subprocess.Popen(sed_argv, stdout=subprocess.PIPE)

See how you're now unconcerned by any difficulties around shell quoting?
You're now dealing directly in strings.

There are a few other questions, such as: if you're using sed's -i
option, why is stdout a pipe? And what if hash or data_ contain slashes,
which you are using in sed to delimit them?

Hoping this will help you move forward.

Cheers,
Cameron Simpson
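The two fixes described in this reply (turning each newline into "backslash and newline" with .replace(), and passing sed an argument list with no shell) can be sketched together as below. The helper names sed_escape_replacement, build_sed_argv and run_sed_in_place are hypothetical, made up for this illustration. The sketch also escapes backslashes, "&" and the "/" delimiter in the replacement text, which touches on the closing question about slashes; it assumes the search pattern is already a valid sed regex that contains no "/".

```python
import subprocess


def sed_escape_replacement(text, delimiter="/"):
    """Escape text for use as the replacement half of a sed s command."""
    # Backslashes first, so the backslashes added below are not doubled.
    text = text.replace("\\", "\\\\")
    # "&" means "the whole match" in a sed replacement.
    text = text.replace("&", "\\&")
    # The delimiter would otherwise end the replacement early.
    text = text.replace(delimiter, "\\" + delimiter)
    # A literal newline must be preceded by a backslash.
    text = text.replace("\n", "\\\n")
    return text


def build_sed_argv(pattern, replacement, filename):
    """Build the argument list for an in-place sed substitution."""
    # Assumption: pattern is a valid sed regex without any "/" in it.
    script = "s/" + pattern + "/" + sed_escape_replacement(replacement) + "/g"
    return ["sed", "-i", script, filename]


def run_sed_in_place(pattern, replacement, filename):
    """Run sed without a shell; raises CalledProcessError if sed fails."""
    # With -i the result goes to the file, so stdout is not piped.
    subprocess.check_call(build_sed_argv(pattern, replacement, filename))


argv = build_sed_argv("a", "ffdfdfdfg\n12112121212121212\nq/x", "file.txt")
print(argv[2])
```

Only build_sed_argv is exercised here, so nothing is modified on disk; run_sed_in_place would be the call that actually edits a file.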
Re: [Tutor] implementing sed - termination error
On 02/11/16 00:18, bruce wrote:

> Trying to do a search/replace over a file, for a given string, and
> replacing the string with a chunk of text that has multiple lines.
>
> From the cmdline, using sed, no prob. however, implementing sed, runs
> into issues, that result in a "termination error"

I don't understand what you mean by that last paragraph.

"using sed, no prob" implies you know the command you want to run
because you got it to work on the command line? If that's correct, can
you share the exact command you typed at the command line that worked?

"implementing sed" implies you are trying to write the sed tool in
Python, but your code suggests you are trying to run sed from within a
Python script - very different.

> The error gets thrown, due to the "\" of the newline.

That sounds very odd. What leads you to that conclusion? For that
matter, which \ or newline? In which string - the search string, the
replacement string or the file content?

> The test file contains 6K lines, but, the process requires doing lots
> of search/replace operations, so I'm interested in testing this method
> to see how "fast" the overall process is.

I'm not sure what you are testing. Is it the sed tool itself? Or is it
the Python script that runs sed? Or something else?

> The following pseudo code is what I've used to test.

Pseudo code is fine to explain complex algorithms but in this case the
actual code is probably more useful.

> The key point
> being changing the "\n" portion to try to resolve the termination
> error.

Again, I don't really understand what you mean by that.

> import subprocess
>
> ll_="ffdfdfdfg"
> ll2_="12112121212121212"
> hash="a"
>
> data_=ll_+"\n"+ll2_+"\n"+qq22_
> print data_
>
> cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
> print cc

I assume dname is your file?

I'd also use string formatting to construct the command, simply because
sed uses regex and a lot of + signs looks like a regex so it is
confusing (to me at least). But see the comment below about Popen args.

> proc=subprocess.Popen(cc, shell=True,stdout=subprocess.PIPE)
> res=proc.communicate()[0].strip()
>
> ===
> error
> sed: -e expression #1, char 38: unterminated `s' command

My first instinct when dealing with subprocess errors is to set
shell=False to ensure the shell isn't messing about with my inputs.
What happens if you set shell=False?

I'd also tend to put the sed arguments into a list rather than pass a
single string.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
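As a rough sketch of the debugging approach suggested in this reply (no shell, arguments in a list), the helper below runs a command and returns its exit status along with anything written to stdout and stderr, so an error message from the child process can be inspected directly. The name run_and_report is hypothetical, and the demo deliberately runs a stand-in Python child process rather than sed so that it works anywhere.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run argv without a shell; return (exit status, stdout, stderr)."""
    proc = subprocess.Popen(
        argv,                      # a list of strings, so no shell quoting
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,   # decode the output to text
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# Stand-in child that fails the way a bad sed script would: it writes a
# message to stderr and exits with a nonzero status.
demo_argv = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('unterminated command\\n'); sys.exit(1)",
]
status, out, err = run_and_report(demo_argv)
print(status, repr(err))
```

For the case in this thread the list would instead be built with string formatting, for example ["sed", "-i", "s/{}/{}/g".format(hash, data_), dname], using the variable names from the original post.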
[Tutor] implementing sed - termination error
Hi

Running a test on a linux box, with python.

Trying to do a search/replace over a file, for a given string, and
replacing the string with a chunk of text that has multiple lines.

From the cmdline, using sed, no prob. however, implementing sed, runs
into issues, that result in a "termination error"

The error gets thrown, due to the "\" of the newline. SO, and other
sites have plenty to say about this, but haven't run across any soln.

The test file contains 6K lines, but, the process requires doing lots
of search/replace operations, so I'm interested in testing this method
to see how "fast" the overall process is.

The following pseudo code is what I've used to test. The key point
being changing the "\n" portion to try to resolve the termination
error.

    import subprocess

    ll_="ffdfdfdfg"
    ll2_="12112121212121212"
    hash="a"

    data_=ll_+"\n"+ll2_+"\n"+qq22_
    print data_

    cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
    print cc

    proc=subprocess.Popen(cc, shell=True,stdout=subprocess.PIPE)
    res=proc.communicate()[0].strip()

===
error
sed: -e expression #1, char 38: unterminated `s' command