(Changed subject, as I don't think a PEP is the right focus for thinking
about this problem space)

Thiago Padilha wrote:
> Python is often used as a shell scripting language due to it being more
> robust than traditional Unix shells for managing complex programs, but
> it is significantly more verbose than Bash & friends for doing common
> shell scripting tasks such as writing pipelines and redirection.

I 100% agree with this problem statement.

As others have mentioned, there are several libraries people have written
that try to do something in this direction. I’ve never found one that I
felt approached the problem from quite the right angle, so I’ve been
experimenting recently with my own solution (incorporating ideas from
friends.) I guess this is a good prompt to discuss it more widely!

All the ideas below are in the working implementation at
https://github.com/gnprice/pysh .

----------

First, one unsung strength of Bash (and friends) that I think is actually
an essential foundation is what happens when there’s no redirection, no
pipelines, no fancy substitutions — you want to just run a program, and
pass it some arguments.


The core thing you need to do here is make a list of strings, for the
command and its arguments. That’s how the underlying API works. So for
example in Bash:

    # 5 strings:
    gpg -d -o "$cleartext_path" "$cryptotext_path"

    # (4 + len(commit_ids)) strings:
    git merge "${commit_ids[@]}" -m "$message"


In Python this typically ends up looking like:

    subprocess.check_call(
        ['gpg', '-d', '-o', cleartext_path, cryptotext_path])

    subprocess.check_call(['git', 'merge'] + commit_ids + ['-m', message])


For the arguments that are literals (which often means most of them), we
keep saying quote-comma-space-quote, quote-comma-space-quote, effectively
as a delimiter. That’s a lot of visual noise, as well as extra typing. And
when you want to splice in a whole list of arguments, it gets worse.


What we want here is something almost like this:

    subprocess.check_call(
        f'gpg -d -o {cleartext_path} {cryptotext_path}'.split()) # BAD!

... except that version steps into the classic shell pitfall where if a
value happens to contain whitespace, it turns into several arguments and
totally changes the meaning.
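
To make the pitfall concrete, here is a minimal sketch. The function name `naive_words` and the sample values are made up for illustration; nothing is actually executed, the code only builds the argument list.

```python
def naive_words(tmpdir, userdoc):
    # BAD: substitution happens first, so whitespace inside a value is
    # indistinguishable from whitespace that separates arguments.
    return f'rm -rf {tmpdir}/{userdoc}'.split()

words = naive_words('/tmp', '1 .. 2')
print(words)  # ['rm', '-rf', '/tmp/1', '..', '2']
```

The single intended path has turned into three separate arguments, one of which is `..`.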


But it’s so close. Really we just want to do that but with the `split`
first — so it operates on the literal string you see in the source code —
and *then* the `format`, or the f-string substitution. In fact you can
implement a 70% solution in just a line:

    def shwords(fmt, **kwargs):
        return [word.format(**kwargs) for word in fmt.split()]

    >>> shwords('rm -rf {tmpdir}/{userdoc}',
    ...         tmpdir='/tmp', userdoc='1 .. 2')
    ['rm', '-rf', '/tmp/1 .. 2']


A bit more work gives you f-string-like behavior, if you opt into it:

    subprocess.check_call(shwords_f('rm -rf {tmpdir}/{userdoc}'))

    check_cmd_f('rm -rf {tmpdir}/{userdoc}')  # small convenience helper

and some more gives you positional arguments:

    check_cmd('gpg -d -o {} {}', cleartext_path, cryptotext_path)

and a format-minilanguage extension, `{...!@}`, to substitute in a whole
list:

    check_cmd('git merge {!@} -m {}', commit_ids, message)

Full implementation here:
  https://github.com/gnprice/pysh/blob/8faa55b06/pysh/words.py
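
For readers who want the gist without opening the repo, the following is a rough sketch of how these extensions could be built on the split-then-format idea. It is an assumption about one possible approach, not the code in pysh: the names `shwords_f_sketch` and `shwords_sketch` are hypothetical, the f-string-like variant relies on `sys._getframe` (a CPython implementation detail) and supports only what `str.format` supports rather than arbitrary expressions, and the positional variant handles only auto-numbered `{}` fields plus a standalone `{!@}` word.

```python
import string
import sys

def shwords_f_sketch(fmt):
    """Split first, then fill fields from the caller's variables."""
    caller = sys._getframe(1)
    names = dict(caller.f_globals)
    names.update(caller.f_locals)
    return [word.format(**names) for word in fmt.split()]

def shwords_sketch(fmt, *args):
    """Split first, then fill `{}` fields from positional arguments.

    A word that is exactly `{!@}` consumes one list argument and
    splices its items in as separate words.
    """
    remaining = iter(args)
    words = []
    for word in fmt.split():
        if word == '{!@}':
            words.extend(str(item) for item in next(remaining))
            continue
        # Count the replacement fields in this word, then consume
        # that many positional arguments for it.
        n_fields = sum(1 for _, field, _, _ in string.Formatter().parse(word)
                       if field is not None)
        values = [next(remaining) for _ in range(n_fields)]
        words.append(word.format(*values))
    return words

def demo():
    tmpdir = '/tmp'
    userdoc = '1 .. 2'
    return shwords_f_sketch('rm -rf {tmpdir}/{userdoc}')

print(demo())
# ['rm', '-rf', '/tmp/1 .. 2']
print(shwords_sketch('git merge {!@} -m {}', ['abc123', 'def456'], 'Merge both'))
# ['git', 'merge', 'abc123', 'def456', '-m', 'Merge both']
```

In both variants the whitespace inside a substituted value never takes part in the split, which is the whole point of splitting before formatting.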


As far as I know, this idea is a new one. I’ve found it goes a long way in
making scripts that run a lot of external commands feel Pythonically
concise and low-boilerplate.


----------

Then there are a number of features on top of that foundation that we need
to really make interacting with external Unix-style commands as convenient
as it can be. To avoid making this email longer I’ll just briefly gesture
at two of them for now — more details in the repo!


* Getting a command’s results in a convenient form, meaning not only its
stdout but also success/failure from the return code:

      if not try_cmd('git diff-index --quiet HEAD'):
          raise ClickException("worktree not clean")

      # ...

      commit_id = try_slurp_cmd_f('git rev-parse --verify --quiet {name}')
      if commit_id is None:
          commit_id = try_slurp_cmd_f(
              'git rev-parse --verify --quiet origin/{name}')
      if commit_id is None:
          raise MissingBranchException(name)


* Pipelines. :-) The `|` operator really is too good to pass up:

      def first_mention(pattern):
          return pysh.slurp(
              cmd.run('git log --oneline --reverse --abbrev=9')
              | cmd.run('grep -m1 {}', pattern)
              | cmd.run('perl -lane {}', 'print $F[0]')
          )

  I think one important direction for pipelines is making it convenient to
  stick bits of Python code in the middle of the pipeline, in amongst
  external programs. That direction isn’t fully developed in the current API.
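
For comparison, the two features above map onto the standard library roughly as follows. This is a sketch under stated assumptions, not pysh's implementation: the helper names `try_cmd_sketch`, `try_slurp_sketch`, and `pipe_slurp_sketch` are hypothetical, they take an already-built argument list rather than a format string, and the demo uses the running Python interpreter as the "external command" so it does not depend on git or grep being installed.

```python
import subprocess
import sys

def try_cmd_sketch(words):
    """Return True if the command exits with status 0, else False."""
    return subprocess.run(words).returncode == 0

def try_slurp_sketch(words):
    """Return stdout with trailing newlines stripped, or None on failure."""
    proc = subprocess.run(words, stdout=subprocess.PIPE, text=True)
    if proc.returncode != 0:
        return None
    return proc.stdout.rstrip('\n')

def pipe_slurp_sketch(first, second):
    """Run the equivalent of `first | second` and return the final stdout."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout,
                          stdout=subprocess.PIPE, text=True)
    p1.stdout.close()  # let p1 receive SIGPIPE if p2 exits early
    out, _ = p2.communicate()
    p1.wait()
    return out.rstrip('\n')

# Use the running interpreter as the external command, for portability.
PY = sys.executable
print(try_cmd_sketch([PY, '-c', 'raise SystemExit(3)']))  # False
print(try_slurp_sketch([PY, '-c', 'print("abc")']))       # abc
print(pipe_slurp_sketch(
    [PY, '-c', 'print("hello")'],
    [PY, '-c', 'import sys; print(sys.stdin.read().strip().upper())'],
))                                                        # HELLO
```

Even in this reduced form, the two-stage pipeline takes several lines of bookkeeping, which is the verbosity the `|` operator is meant to hide.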


----------

For anyone who finds this interesting, I’d encourage you to check out the
demo scripts: https://github.com/gnprice/pysh/tree/master/example

Then if you take some interesting fragment of script you have handy and try
converting it to use this library, I’d be very curious to hear how it turns
out!

That’s what the demo/example scripts are for, and I think it’s the main
ingredient this work needs at this stage: seeing how the API works out in a
range of use cases, and seeing what patterns it can be improved to better
serve.

Also naturally I will be glad to discuss it here for as long as others are
interested.

Cheers,
Greg