Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Daniel Shahaf
Marek Marczykowski-Górecki wrote on Tue, 18 Sep 2018 21:09 +0200: > On Tue, Sep 18, 2018 at 09:00:11PM +0200, Marek Marczykowski-Górecki wrote: > > On Tue, Sep 18, 2018 at 06:39:28PM +, Daniel Shahaf wrote: > > > Slurping the file to a string object is an antipattern. Instead of > > > using f.

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Chris Lamb
Hi Marek, > magic text in commit message to link it with this bug? Sure: "Blah blah blah. (Closes: #909122)" Thanks! Best wishes, -- Chris Lamb, la...@debian.org / chris-lamb.co.uk

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Marek Marczykowski-Górecki
On Tue, Sep 18, 2018 at 09:00:11PM +0200, Marek Marczykowski-Górecki wrote: > On Tue, Sep 18, 2018 at 06:39:28PM +, Daniel Shahaf wrote: > > Slurping the file to a string object is an antipattern. Instead of > > using f.read() to create a 4.5GB string, it would be better to use > > json.load(f

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Marek Marczykowski-Górecki
On Tue, Sep 18, 2018 at 06:39:28PM +, Daniel Shahaf wrote: > Slurping the file to a string object is an antipattern. Instead of > using f.read() to create a 4.5GB string, it would be better to use > json.load(f) to read the file incrementally. That should raise an > exception rather quickly.
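
The idea of failing quickly instead of building one 4.5GB string can be sketched with chunked reads. This is only an illustration, not diffoscope's code: the function name `is_probably_utf8_text` and the parameters `chunk_size` and `max_bytes` are hypothetical. It does not rely on json.load(f), because CPython's json.load calls fp.read() internally; instead it decodes the file piece by piece with a strict incremental UTF-8 decoder and stops at the first invalid byte.

```python
import codecs


def is_probably_utf8_text(path, chunk_size=64 * 1024, max_bytes=1024 * 1024):
    """Return False as soon as a chunk fails strict UTF-8 decoding.

    Hypothetical helper: at most max_bytes are inspected, so memory use
    stays bounded by chunk_size regardless of the file size.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    read_so_far = 0
    with open(path, "rb") as f:
        while read_so_far < max_bytes:
            chunk = f.read(min(chunk_size, max_bytes - read_so_far))
            if not chunk:
                # End of file: flush the decoder to catch a truncated
                # multi-byte sequence at the very end.
                try:
                    decoder.decode(b"", final=True)
                except UnicodeDecodeError:
                    return False
                return True
            read_so_far += len(chunk)
            try:
                decoder.decode(chunk)
            except UnicodeDecodeError:
                return False
    # Inspected max_bytes without finding an invalid sequence.
    return True
```

A caller could run such a check before attempting any JSON parsing. Whether it rejects a given ISO image early depends on where the first non-UTF-8 byte occurs in the file.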

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Daniel Shahaf
Marek Marczykowski-Górecki wrote on Tue, Sep 18, 2018 at 20:17:03 +0200: > File "/usr/lib/python3/dist-packages/diffoscope/comparators/json.py", > line 52, in recognizes > f.read().decode('utf-8', errors='ignore'), > MemoryError > > The JSONFile.recognizes function, for context: >

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Chris Lamb
Hi Marek, > The whole thing could be avoided if earlier check (if initial 10 chars > contains '[' or '{') would be executed not only on "text" files. Indeed. The origins of this appear to be: https://salsa.debian.org/reproducible-builds/diffoscope/commit/2a758d3d0205e934ed6dffebb5d6462b00fe59
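
The early check mentioned above (do the first 10 characters contain '[' or '{') could look roughly like the sketch below when applied to any file, not only "text" ones. The name `may_be_json` and the `probe_size` parameter are hypothetical and not taken from diffoscope; the point is only that the probe reads a few bytes rather than the whole file.

```python
def may_be_json(path, probe_size=10):
    """Cheap pre-filter: look for '{' or '[' in the first few bytes.

    Hypothetical helper; a True result only means the file is worth
    handing to a real JSON parser, not that it is valid JSON.
    """
    with open(path, "rb") as f:
        head = f.read(probe_size)  # reads at most probe_size bytes
    return b"{" in head or b"[" in head
```

With a filter like this in front, a multi-gigabyte binary file whose first bytes contain neither character would be rejected without ever being loaded into memory.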

Bug#909122: diffoscope: MemoryError when comparing big ISO images

2018-09-18 Thread Marek Marczykowski-Górecki
Package: diffoscope Version: 101 Severity: normal Dear Maintainer, When comparing two 4.5GB ISO images, diffoscope tries to load them into memory, which fails with MemoryError in json comparator: Traceback (most recent call last): File "/usr/lib/python3/dist-packages/diffoscope/main.py