Re: [Tutor] implementing sed - termination error

cs Tue, 01 Nov 2016 22:59:58 -0700

On 01Nov2016 20:18, bruce <badoug...@gmail.com> wrote:

Running a test on a linux box, with python.
Trying to do a search/replace over a file, for a given string, and
replacing the string with a chunk of text that has multiple lines.


From the cmdline, using sed, no prob. however, implementing sed, runs
into issues, that result in a "termination error"

Just terminology: you're not "implementing sed", which is a nontrivial task that would involve writing a python program that could do everything sed does. You're writing a small python program to call sed to do the work.


Further discussion below.

The error gets thrown, due to the "\" of the newline. SO, and other
sites have plenty to say about this, but haven't run across any soln.

The test file contains 6K lines, but, the process requires doing lots
of search/replace operations, so I'm interested in testing this method
to see how "fast" the overall process is.

The following psuedo code is what I've used to test. The key point
being changing the "\n" portion to try to resolved the termination
error.

import subprocess

ll_="ffdfdfdfghhhh"
ll2_="12112121212121212"
hash="aaaaa"

data_=ll_+"\n"+ll2_+"\n"+qq22_
print data_


Presumably qq22_ is defined elsewhere and just not shown.

cc='sed -i "s/'+hash+'/'+data_+'/g" '+dname
print cc
proc=subprocess.Popen(cc, shell=True,stdout=subprocess.PIPE)
res=proc.communicate()[0].strip()

There are two fairly large problems with this program. The first is your need to embed newlines in the replacement pattern. You have genuine newlines in your string, but a sed command would look like this:


 sed 's/aaaaa/ffdfdfdfghhhh\
 12112121212121212\
 qqqqq/g'

so you need to replace the newlines with "backslash and newline".

Fortunately strings have a .replace() method which you can use for this purpose. Look it up:


 https://docs.python.org/3/library/stdtypes.html#str.replace

You can use it to make data_ how you want it to be for the command.
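As a minimal sketch of that replacement (Python 3 syntax; the value of qq22_ is a stand-in, since the original was not shown):

```python
# Values from the example above; "qqqqq" is an assumed stand-in for qq22_.
ll_ = "ffdfdfdfghhhh"
ll2_ = "12112121212121212"
qq22_ = "qqqqq"

data_ = ll_ + "\n" + ll2_ + "\n" + qq22_

# sed wants every newline in the replacement text preceded by a backslash.
# "\\\n" is a two-character string: one backslash followed by one newline.
sed_data = data_.replace("\n", "\\\n")
print(sed_data)
```

Printing sed_data should show each line except the last ending in a backslash, matching the sed command above.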

The second problem is that you're then trying to invoke sed by constructing a shell command string and handing that to Popen. This means that you need to embed shell syntax in that string to quote things like the sed command. All very messy.

It is better to _bypass_ the shell and invoke sed directly by leaving out the "shell=True" parameter. All the command line (which is the shell) is doing is honouring the shell quoting and constructing a sed invocation as distinct strings:


 sed
 -i
 s/this/that/g
 filename
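If you want to see that splitting for yourself, the standard library's shlex.split applies shell-like quoting rules to a string. A small sketch (shlex only approximates the shell: it handles quoting, but does no variable expansion or globbing):

```python
import shlex

# The command as you would type it at a shell prompt, quotes included.
command_line = "sed -i 's/this/that/g' filename"

# Split it into the separate strings the shell would hand to sed.
argv = shlex.split(command_line)
print(argv)
```

The quotes are consumed by the splitting; sed itself never sees them.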

You want to do the equivalent in python, something like this:

 sed_argv = [ 'sed', '-i', 's/'+hash+'/'+data_+'/g', dname ]
 proc=subprocess.Popen(sed_argv, stdout=subprocess.PIPE)

See how you're now unconcerned by any difficulties around shell quoting? You're now dealing directly in strings.
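To convince yourself that list arguments arrive untouched, here is a small sketch which uses a child python process in place of sed (purely so that it runs anywhere; the child simply echoes its first argument). The "tricky" string is a made up example containing spaces, quotes and a dollar sign, all of which a shell would interpret:

```python
import subprocess
import sys

# A made up argument full of characters that are special to the shell.
tricky = "s/a b/$HOME 'x' \"y\"/g"

# The child prints its first argument exactly as it received it.
child_code = "import sys; print(sys.argv[1])"

# No shell=True: each list element is passed to the child as one argument.
result = subprocess.run(
    [sys.executable, "-c", child_code, tricky],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip() == tricky)
```

Nothing needed quoting and nothing was expanded, because no shell was involved.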

There are a few other questions, such as: if you're using sed's -i option, why is stdout a pipe? And what if hash or data_ contain slashes, which you are using in sed to delimit them?
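One possible way to handle the slash question on the replacement side is to escape the special characters before building the sed command. This is only a sketch: escape_sed_replacement is a hypothetical helper name, the sample values are invented, and it does not deal with regular expression metacharacters in the search pattern, which is a separate concern.

```python
def escape_sed_replacement(text, delim="/"):
    """Escape text for use as the replacement part of sed's s command."""
    text = text.replace("\\", "\\\\")         # backslashes first
    text = text.replace(delim, "\\" + delim)  # the s/// delimiter
    text = text.replace("&", "\\&")           # & means "the matched text"
    text = text.replace("\n", "\\\n")         # newline -> backslash + newline
    return text

# Invented sample values, for illustration only.
hash_ = "aaaaa"
data_ = "path/to/file\nA&B"
dname = "example.txt"

sed_argv = ["sed", "-i", "s/" + hash_ + "/" + escape_sed_replacement(data_) + "/g", dname]
print(sed_argv[2])
```

Since -i makes sed rewrite the file in place there is normally nothing to read from its stdout, so a list like sed_argv can be run without stdout=subprocess.PIPE.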


Hoping this will help you move forward.

Cheers,
Cameron Simpson <c...@zip.com.au>
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
