Re: escaping characters in filenames
Nobody writes:

> On Wed, 29 Jul 2009 09:29:55 -0400, J Kenneth King wrote:
>
>> I wrote a script to process some files using another program. One thing
>> I noticed was that both os.listdir() and os.path.walk() will return
>> unescaped file names (ie: "My File With Spaces & Stuff" instead of "My\
>> File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
>> module or recipe that escapes file names and was wondering if anyone
>> could point me in the right direction.
>>
>> As an aside, the script is using subprocess.call() with the "shell=True"
>> parameter. There isn't really a reason for doing it this way (was just
>> the fastest way to write it and get a prototype working). I was
>> wondering if Popen objects were sensitive to unescaped names like the
>> shell. I intend to refactor the function to use Popen objects at some
>> point and thought perhaps escaping file names may not be entirely
>> necessary.
>
> Note that subprocess.call() is nothing more than:
>
>     def call(*popenargs, **kwargs):
>         return Popen(*popenargs, **kwargs).wait()
>
> plus a docstring. It accepts exactly the same arguments as Popen(), with
> the same semantics.
>
> If you want to run a command given a program and arguments, you
> should pass the command and arguments as a list, rather than trying to
> construct a string.
>
> On Windows the value of shell= is unrelated to whether the command is
> a list or a string; a list is always converted to a string using the
> list2cmdline() function. Using shell=True simply prepends "cmd.exe /c " to
> the string (this allows you to omit the .exe/.bat/etc extension for
> extensions which are in %PATHEXT%).
>
> On Unix, a string is first converted to a single-element list, so if you
> use a string with shell=False, it will be treated as the name of an
> executable to be run without arguments, even if it contains spaces, shell
> metacharacters etc.
>
> The most portable approach seems to be to always pass the command as a
> list, and to set shell=True on Windows and shell=False on Unix.
>
> The only reason to pass a command as a string is if you're getting a
> string from the user and you want it to be interpreted using the
> platform's standard shell (i.e. cmd.exe or /bin/sh). If you want it to be
> interpreted the same way regardless of platform, parse it into a
> list using shlex.split().

I understand; I think I was headed towards subprocess.Popen() either
way. It seems to handle the problem I posted about. And I got to learn
a little something on the way. Thanks!

Only now there's a new problem in that the output of the program is
different if I run it from Popen than if I run it from the command
line. The program in question is 'pdftotext'. More investigation to
ensue.

Thanks again for the helpful post.

--
http://mail.python.org/mailman/listinfo/python-list
Re: escaping characters in filenames
On Wed, 29 Jul 2009 09:29:55 -0400, J Kenneth King wrote:

> I wrote a script to process some files using another program. One thing
> I noticed was that both os.listdir() and os.path.walk() will return
> unescaped file names (ie: "My File With Spaces & Stuff" instead of "My\
> File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
> module or recipe that escapes file names and was wondering if anyone
> could point me in the right direction.
>
> As an aside, the script is using subprocess.call() with the "shell=True"
> parameter. There isn't really a reason for doing it this way (was just
> the fastest way to write it and get a prototype working). I was
> wondering if Popen objects were sensitive to unescaped names like the
> shell. I intend to refactor the function to use Popen objects at some
> point and thought perhaps escaping file names may not be entirely
> necessary.

Note that subprocess.call() is nothing more than:

    def call(*popenargs, **kwargs):
        return Popen(*popenargs, **kwargs).wait()

plus a docstring. It accepts exactly the same arguments as Popen(), with
the same semantics.

If you want to run a command given a program and arguments, you
should pass the command and arguments as a list, rather than trying to
construct a string.

On Windows the value of shell= is unrelated to whether the command is
a list or a string; a list is always converted to a string using the
list2cmdline() function. Using shell=True simply prepends "cmd.exe /c " to
the string (this allows you to omit the .exe/.bat/etc extension for
extensions which are in %PATHEXT%).

On Unix, a string is first converted to a single-element list, so if you
use a string with shell=False, it will be treated as the name of an
executable to be run without arguments, even if it contains spaces, shell
metacharacters etc.

The most portable approach seems to be to always pass the command as a
list, and to set shell=True on Windows and shell=False on Unix.
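
The list form recommended above can be sketched roughly as follows. This is only an illustration under some assumptions: it targets Python 3.7+ (for capture_output), it uses the running interpreter as a stand-in for the external program so that it runs anywhere, it leaves shell at its default of False on every platform, and show_first_arg is a hypothetical helper name that does not come from the thread.

```python
import subprocess
import sys


def show_first_arg(filename):
    """Run a child process with `filename` as its only argument and
    return the value the child received in sys.argv[1]."""
    # The command is a list, so no shell parses it and the file name
    # needs no escaping, whatever characters it contains.
    cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        filename,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # File name taken from the original question.
    print(show_first_arg("My File With Spaces & Stuff"))
```

The child should see the file name as one argument, spaces and ampersand included, which is the behaviour the original question was after.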

The only reason to pass a command as a string is if you're getting a
string from the user and you want it to be interpreted using the
platform's standard shell (i.e. cmd.exe or /bin/sh). If you want it to be
interpreted the same way regardless of platform, parse it into a
list using shlex.split().
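
As a small illustration of the shlex.split() suggestion, here is a sketch that assumes Python 3 and a made-up command line; the file names are invented for the example. Note that shlex.split() only tokenizes using POSIX-shell-like quoting rules. It does not run anything and does not perform variable or wildcard expansion.

```python
import shlex

# A command line as a user might type it at a POSIX shell prompt
# (hypothetical file names, backslash-escaped as in the original question).
typed = r'pdftotext My\ File\ With\ Spaces\ \&\ Stuff.pdf "out file.txt"'

# Split into an argument list suitable for Popen() with shell=False.
args = shlex.split(typed)
print(args)
```

The resulting list can be handed to Popen() as-is, so the same string is interpreted the same way on every platform.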
Re: escaping characters in filenames
J Kenneth King wrote:

> I wrote a script to process some files using another program. One thing
> I noticed was that both os.listdir() and os.path.walk() will return
> unescaped file names (ie: "My File With Spaces & Stuff" instead of "My\
> File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
> module or recipe that escapes file names and was wondering if anyone
> could point me in the right direction.
>
> As an aside, the script is using subprocess.call() with the "shell=True"
> parameter. There isn't really a reason for doing it this way (was just
> the fastest way to write it and get a prototype working). I was
> wondering if Popen objects were sensitive to unescaped names like the
> shell. I intend to refactor the function to use Popen objects at some
> point and thought perhaps escaping file names may not be entirely
> necessary.
>
> Cheers

There are dozens of meanings for escaping characters in strings. Without
some context, we're wasting our time.

For example, if the filename is to be interpreted as part of a URL, then
spaces are escaped by using %20. Exactly who is going to be using this
string you think you have to modify? I don't know of any environment
which expects spaces to be escaped with backslashes.

Be very specific. For example, if a Windows application is parsing its
own command line, you need to know what that particular application is
expecting -- Windows passes the entire command line as a single string.
But of course you may be invoking that application using
subprocess.Popen(), in which case some transformations happen to your
arguments before the single string is built. Then some more
transformations may happen in the shell. Then some more in the C runtime
library of the new process (if it happens to be in C, and if it happens
to use those libraries).

I'm probably not the one with the answer. But until you narrow down your
case, you probably won't attract the attention of whichever person has
the particular combination of experience that you're hoping for.

DaveA
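
To make one of those transformations visible, the sketch below calls subprocess.list2cmdline(), the function mentioned elsewhere in this thread as the one that turns an argument list into a single string on Windows. Treat it as an assumption that this helper stays importable: it lives in the subprocess module but is not a clearly public interface. The file name is hypothetical.

```python
import subprocess

# Argument list as it would be passed to Popen().
args = ["pdftotext", "My File With Spaces & Stuff.pdf"]

# Build the single command-line string used on Windows.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
```

The argument containing spaces is wrapped in double quotes, while the ampersand is left alone. The quoting follows the rules the Microsoft C runtime uses to split a command line, so it should not be relied on to protect characters that cmd.exe itself treats specially.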
Re: escaping characters in filenames
J Kenneth King wrote:

> I wrote a script to process some files using another program. One thing
> I noticed was that both os.listdir() and os.path.walk() will return
> unescaped file names (ie: "My File With Spaces & Stuff" instead of "My\
> File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
> module or recipe that escapes file names and was wondering if anyone
> could point me in the right direction.

That's only necessary if you're building a command line and passing it
as a string.

> As an aside, the script is using subprocess.call() with the "shell=True"
> parameter. There isn't really a reason for doing it this way (was just
> the fastest way to write it and get a prototype working). I was
> wondering if Popen objects were sensitive to unescaped names like the
> shell. I intend to refactor the function to use Popen objects at some
> point and thought perhaps escaping file names may not be entirely
> necessary.

Pass the command line to Popen as a list of strings.
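
For the case where a command line really does have to be built as a string for /bin/sh, a minimal sketch could use shlex.quote(). This assumes Python 3.3 or later, where that function exists, and a POSIX shell; it is not meant for cmd.exe. The helper name build_shell_command and the file name are hypothetical.

```python
import shlex


def build_shell_command(program, filename):
    """Return one /bin/sh command string with each part safely quoted."""
    return "{} {}".format(shlex.quote(program), shlex.quote(filename))


cmd = build_shell_command("pdftotext", "My File With Spaces & Stuff.pdf")
print(cmd)

# Round trip: splitting the quoted string recovers the original arguments.
print(shlex.split(cmd))
```

Here shlex.quote() wraps the file name in single quotes rather than adding backslashes, which a POSIX shell treats as equivalent. Passing a list to Popen avoids the need for any of this.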