Esa Peuha added the comment:

This code

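import _lzma  # C extension behind lzma.py; used directly to bypass lzma.py's buffering

with open('22h_ticks_bad.bi5', 'rb') as f:
    infile = f.read()

# Feed the stream to a fresh decompressor in two chunks, split at offset i.
for i in range(8191, 8195):
    decompressor = _lzma.LZMADecompressor()
    first_out = decompressor.decompress(infile[:i])
    first_len = len(first_out)
    last_out = decompressor.decompress(infile[i:])
    last_len = len(last_out)
    # Columns: split offset, bytes from first chunk, total bytes, end of stream reached
    print(i, first_len, first_len + last_len, decompressor.eof)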

prints this

8191 36243 45480 True
8192 36251 45473 False
8193 36253 45475 False
8194 36260 45480 True

It seems to me that this is a subtle bug in liblzma: if the input stream to the 
incremental decompressor is split at the wrong place, the internal state of 
the decompressor is corrupted. For this particular file, that happens when the 
split falls after 8192 or 8193 bytes (the total output comes up a few bytes 
short of 45480 and eof is never set), and lzma.py happens to use a buffer of 
8192 bytes. There is nothing wrong with the compressed file itself, since 
lzma.py decompresses it correctly if the buffer size is set to almost any other 
value.
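
To check other inputs for the same symptom, the experiment above can be generalised into a sweep over every split offset. The following is only a rough sketch under a few assumptions: the helper name find_bad_split_points and the synthetic payload are made up for illustration, it goes through the public lzma.LZMADecompressor rather than _lzma, and it compresses its own test data instead of reading 22h_ticks_bad.bi5, so it will not necessarily reproduce the failure seen with that file.

```python
import lzma


def find_bad_split_points(compressed, expected, start=1, stop=None):
    """Return the split offsets at which two-chunk decompression goes wrong.

    Hypothetical helper: for each offset i in [start, stop), a fresh
    decompressor is fed compressed[:i] and then compressed[i:]. The offset
    is reported if the combined output differs from `expected` or the
    decompressor never reaches end of stream.
    """
    if stop is None:
        stop = len(compressed)
    bad = []
    for i in range(start, stop):
        decompressor = lzma.LZMADecompressor()
        out = decompressor.decompress(compressed[:i])
        # Only feed the second chunk if the stream has not already ended.
        if not decompressor.eof:
            out += decompressor.decompress(compressed[i:])
        if out != expected or not decompressor.eof:
            bad.append(i)
    return bad


if __name__ == "__main__":
    # Synthetic payload (an assumption, not the file from this report).
    payload = bytes(range(256)) * 64
    compressed = lzma.compress(payload)
    print(find_bad_split_points(compressed, payload))
```

For a real file, `expected` could be taken from a one-shot lzma.decompress() call on the whole input, and `start`/`stop` narrowed to the offsets around the buffer size of interest, since sweeping every offset of a large file is quadratic in the input size.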

----------
nosy: +Esa.Peuha

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21872>
_______________________________________