>
> getstatusoutput is a "legacy" function. It still exists for code that
> has already been using it, but it is not recommended for new code.
>
> https://docs.python.org/3.5/library/subprocess.html#using-the-subprocess-module
>
> Since you're using Python 3.5, let's try using the brand new `run`
> function and see if it does better:
>
>     import subprocess
>     result = subprocess.run(["tail", "-3", "/tmp/pmaster.db"],
>                             stdout=subprocess.PIPE)
>     print("return code is", result.returncode)
>     print("output is", result.stdout)
>
>
> It should do better than getstatusoutput, since it returns plain bytes
> without assuming they are ASCII. You can then decode them yourself:
>
>     # try this and see if it is sensible
>     print("output is", result.stdout.decode('latin1'))
>
>     # otherwise this
>     print("output is", result.stdout.decode('utf-8', errors='replace'))
>
>
>
>> >>> subprocess.getstatusoutput("tail -3 /tmp/pmaster.db",)
>> Traceback (most recent call last):
> [...]
>>   File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
>>     return codecs.ascii_decode(input, self.errors)[0]
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 189:
>> ordinal not in range(128)
>
> Let's look at the error message. getstatusoutput apparently expects only
> pure ASCII output, because it is choking on a non-ASCII byte, namely
> 0xe0. Obviously 0xe0 (or in decimal, 224) is not an ASCII value, since
> ASCII goes from 0 to 127 only.
>
> If there's one non-ASCII byte in the file, there are probably more.
>
> So what is that mystery 0xe0 byte? It is hard to be sure, because it
> depends on the source. If pmaster.db is a binary file, it could mean
> anything or nothing. If it is a text file, it depends on the encoding
> that the file uses.
>
> If it comes from a Mac, it might be:
>
>     py> b'\xe0'.decode('macroman')
>     '‡'
>
> If it comes from Windows in Western Europe, it might be:
>
>     py> b'\xe0'.decode('latin1')
>     'à'
>
> If it comes from Windows in Greece, it might be:
>
>     py> b'\xe0'.decode('iso 8859-7')
>     'ΰ'
>
> and so forth. There's no absolutely reliable way to tell. This is the
> sort of nightmare that Unicode was invented to fix, but unfortunately
> there still exist millions of files, data formats and applications which
> insist on using rubbish "extended ASCII" encodings instead.
>
>
>
>> That file's content is kryptonite for python apparently. Other shell
>> operations work.
>>
>> >>> subprocess.getstatusoutput("file /tmp/pmaster.db",)
>> (0, '/tmp/pmaster.db: Non-ISO extended-ASCII text, with very long lines,
>> with LF, NEL line terminators')
>
> The `file` command agrees with me: it is not ASCII.
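
To compare the guesses above side by side, here is a rough sketch that tries a handful of codecs on the same byte. The candidate list and the helper name `try_encodings` are my own choices for illustration, not anything from the subprocess docs, and a codec that decodes without an error is not necessarily the right one.

```python
# Candidate encodings are guesses; extend the list for your own data.
CANDIDATES = ["ascii", "utf-8", "latin1", "cp1252", "mac_roman", "iso8859_7"]


def try_encodings(data, encodings=CANDIDATES):
    """Return {encoding: decoded text, or None if decoding failed}."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # This codec cannot interpret the bytes at all.
            results[name] = None
    return results


# The mystery byte from the traceback.
guesses = try_encodings(b"\xe0")
for name, text in guesses.items():
    print(name, "->", "cannot decode" if text is None else repr(text))
```

ASCII and UTF-8 both reject the lone byte, while the single-byte codecs each accept it and produce a different character, which is why the byte alone cannot tell you the encoding.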
Thank you Steve! subprocess.run handles it better.

>>> subprocess.getstatusoutput("tail -400 /tmp/pmaster.txt",)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/subprocess.py", line 805, in getstatusoutput
    data = check_output(cmd, shell=True, universal_newlines=True, stderr=STDOUT)
  File "/usr/lib/python3.5/subprocess.py", line 626, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.5/subprocess.py", line 695, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/usr/lib/python3.5/subprocess.py", line 1059, in communicate
    stdout = self.stdout.read()
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 60942: invalid continuation byte

as opposed to:

>>> result = subprocess.run(["tail", "-400", "/tmp/pmaster.txt"],
...                         stdout=subprocess.PIPE)
>>> result.returncode
0
>>> subprocess.getstatusoutput("file /tmp/pmaster.txt",)
(0, '/tmp/pmaster.txt: Non-ISO extended-ASCII text, with very long lines, with LF, NEL line terminators')
>>>

That was awesome! :)
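
For the archives, a minimal sketch of the decoding step that follows once subprocess.run has returned the raw bytes. So that it runs anywhere, it uses a small Python child process as a stand-in for the tail command; the sample bytes and the variable names are made up for illustration, and the only thing carried over from the real file is the 0xe0 byte.

```python
import subprocess
import sys

# Stand-in for `tail -400 /tmp/pmaster.txt`: a child process that writes
# one non-ASCII byte (0xe0) to stdout. The sample bytes are made up.
child_code = r"import sys; sys.stdout.buffer.write(b'caf\xe0 ok\n')"
result = subprocess.run([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE)

# Without universal_newlines=True, stdout is plain bytes: nothing has been
# decoded yet, so nothing can raise UnicodeDecodeError at this point.
raw = result.stdout

# latin1 maps every possible byte to a character, so this never fails,
# but the text is only sensible if the data really is Latin-1.
as_latin1 = raw.decode("latin1")

# UTF-8 with errors='replace' swaps each undecodable byte for U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

print("return code is", result.returncode)
print("latin1:", ascii(as_latin1))
print("utf-8 :", ascii(as_utf8))
```

Which of the two decodes is the right one depends on what actually wrote the file; the sketch only shows that both succeed where getstatusoutput raised.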