On Wed, Nov 16, 2016 at 7:09 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> I'm trying to download a file using urllib.request and pipe it straight to an
> external process. On Linux systems, the following is a test file that
> demonstrates the problem:
>
>
> --- cut ---
>
> #!/usr/bin/python3.5
>
> import urllib.request
> import subprocess
>
> TEST_URL = 'https://www.irs.gov/pub/irs-prior/f1040--1864.pdf'
>
> with urllib.request.urlopen(TEST_URL) as f:
>     data = subprocess.check_output(['file', '-'], stdin=f)
>     print(data)

Interesting.

rosuav@sikorsky:~$ python3
Python 3.7.0a0 (default:72e64fc8746b+, Oct 28 2016, 12:35:28)
[GCC 6.2.0 20161010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> import subprocess
>>> TEST_URL = 'https://www.irs.gov/pub/irs-prior/f1040--1864.pdf'
>>> with urllib.request.urlopen(TEST_URL) as f:
...     data = subprocess.check_output(['tee', 'tmp/asdfasdf'], stdin=f)
...

rosuav@sikorsky:~/tmp$ hd asdfasdf |head
00000000  17 03 03 40 18 e9 b0 79  7c 03 c8 5d 21 40 2f 11  |...@...y|..]!@/.|
00000010  4a a3 f1 4d e0 19 04 fc  42 84 d9 cf 59 0b f8 56  |J..M....B...Y..V|
00000020  7d 35 08 88 17 50 24 8c  26 fe d8 13 2b fd 14 55  |}5...P$.&...+..U|
00000030  16 81 c3 1e 13 ae 00 1d  d4 8e 9f 0f a4 19 bb 44  |...............D|
00000040  46 d5 bf 25 28 d0 b0 23  44 6f 1c ef 84 d9 82 9b  |F..%(..#Do......|
00000050  17 15 3a 11 e1 ec de 59  65 d7 ea 41 dc 53 07 70  |..:....Ye..A.S.p|
00000060  99 d5 11 75 b7 90 7e cd  46 b5 67 ee 9a 62 18 63  |...u..~.F.g..b.c|
00000070  36 7f 7b df a1 fb 6d b8  66 8b 2f 82 e6 05 7e aa  |6.{...m.f./...~.|
00000080  d7 9f 9e 05 cf 06 68 6b  c8 4c df 5e 24 9d 92 f6  |......hk.L.^$...|
00000090  3d 53 76 11 c1 70 05 14  94 e5 5b ec b0 cf 64 70  |=Sv..p....[...dp|

So that's what file(1) is seeing. My guess is that a urlopen object
isn't "file-like" enough for subprocess. Maybe it's showing a more
"raw" version?

ChrisA
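
A possible way around this, offered as a sketch rather than a confirmed fix: the dump begins with 17 03 03, which looks like a TLS application-data record header. That would fit subprocess taking the response's fileno() and handing the child the underlying socket, so the child reads the encrypted stream instead of what f.read() returns. If that is what is happening, reading the body in Python and passing it with input= should sidestep it. The helper name check_output_from_stream is made up for illustration, and the demo uses an in-memory stream and a small Python echo child in place of the real URL and file(1) so that it runs offline.

```python
import io
import subprocess
import sys


def check_output_from_stream(argv, stream):
    """Run argv, feeding it the bytes that stream.read() returns.

    Hypothetical helper, not part of subprocess or urllib. It buffers the
    whole body in memory, which is fine for small downloads only.
    """
    # Going through read() means an HTTPS response hands over the decoded
    # body rather than whatever is on the socket's file descriptor.
    payload = stream.read()
    return subprocess.check_output(argv, input=payload)


if __name__ == "__main__":
    # Stand-in for the urlopen() response (assumption: any object with read()).
    fake_response = io.BytesIO(b"%PDF-1.4 pretend document\n")
    # Echo child standing in for file(1) or tee(1).
    echo = [sys.executable, "-c",
            "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]
    print(check_output_from_stream(echo, fake_response))
```

With the original test file, the call inside the with block would presumably become check_output_from_stream(['file', '-'], f). For large downloads, buffering everything in memory may not be acceptable, and a Popen with stdin=subprocess.PIPE fed in chunks would be the thing to try instead.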