[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-23 Thread Xiang Zhang

Xiang Zhang added the comment:

Thanks to Martin too. Nobody knows better than I do how much work you have put 
into this. :) I could have done it better from the start. :( Sorry, everyone, 
for the noise.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-23 Thread Klamann

Klamann added the comment:

Thanks Xiang and Martin for solving this, you guys are awesome :)

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-23 Thread Martin Panter

Martin Panter added the comment:

Thanks Xiang for your work on this, and Klamann for the report.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-22 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 2192edcfea02 by Martin Panter in branch '2.7':
Issue #27130: Fix handling of buffers exceeding (U)INT_MAX in “zlib” module
https://hg.python.org/cpython/rev/2192edcfea02

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-22 Thread Roundup Robot

Roundup Robot added the comment:

New changeset bd61bcd9bee8 by Martin Panter in branch '3.5':
Issue #27130: Fix handling of buffers exceeding UINT_MAX in “zlib” module
https://hg.python.org/cpython/rev/bd61bcd9bee8

New changeset bd556f748cf8 by Martin Panter in branch 'default':
Issue #27130: Merge zlib 64-bit fixes from 3.5
https://hg.python.org/cpython/rev/bd556f748cf8

--
nosy: +python-dev




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-22 Thread Xiang Zhang

Xiang Zhang added the comment:

All tests pass now. :) I think it's OK. I have also uploaded the v10 version, 
which restores the statement to avoid the crash you mentioned in your comments.

[1/1] test_zlib
test_abcdefghijklmnop (test.test_zlib.ChecksumTestCase)
test issue1202 compliance: signed crc32, adler32 in 2.x ... ok
test_adler32empty (test.test_zlib.ChecksumTestCase) ... ok
test_adler32start (test.test_zlib.ChecksumTestCase) ... ok
test_crc32empty (test.test_zlib.ChecksumTestCase) ... ok
test_crc32start (test.test_zlib.ChecksumTestCase) ... ok
test_negative_crc_iv_input (test.test_zlib.ChecksumTestCase) ... ok
test_penguins (test.test_zlib.ChecksumTestCase) ... ok
test_same_as_binascii_crc32 (test.test_zlib.ChecksumTestCase) ... ok
test_big_buffer (test.test_zlib.ChecksumBigBufferTestCase) ... ok
test_badcompressobj (test.test_zlib.ExceptionTestCase) ... ok
test_baddecompressobj (test.test_zlib.ExceptionTestCase) ... ok
test_badlevel (test.test_zlib.ExceptionTestCase) ... ok
test_decompressobj_badflush (test.test_zlib.ExceptionTestCase) ... ok
test_overflow (test.test_zlib.ExceptionTestCase) ... ok
test_64bit_compress (test.test_zlib.CompressTestCase) ... ok
test_big_compress_buffer (test.test_zlib.CompressTestCase) ... ok
test_big_decompress_buffer (test.test_zlib.CompressTestCase) ... ok
test_custom_bufsize (test.test_zlib.CompressTestCase) ... ok
test_incomplete_stream (test.test_zlib.CompressTestCase) ... ok
test_large_bufsize (test.test_zlib.CompressTestCase) ... ok
test_speech (test.test_zlib.CompressTestCase) ... ok
test_speech128 (test.test_zlib.CompressTestCase) ... ok
test_64bit_compress (test.test_zlib.CompressObjectTestCase) ... ok
test_badcompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_baddecompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_big_compress_buffer (test.test_zlib.CompressObjectTestCase) ... ok
test_big_decompress_buffer (test.test_zlib.CompressObjectTestCase) ... ok
test_clear_unconsumed_tail (test.test_zlib.CompressObjectTestCase) ... ok
test_compresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_compressincremental (test.test_zlib.CompressObjectTestCase) ... ok
test_compressoptions (test.test_zlib.CompressObjectTestCase) ... ok
test_compresspickle (test.test_zlib.CompressObjectTestCase) ... ok
test_decompimax (test.test_zlib.CompressObjectTestCase) ... ok
test_decompinc (test.test_zlib.CompressObjectTestCase) ... ok
test_decompincflush (test.test_zlib.CompressObjectTestCase) ... ok
test_decompress_incomplete_stream (test.test_zlib.CompressObjectTestCase) ... ok
test_decompress_unused_data (test.test_zlib.CompressObjectTestCase) ... ok
test_decompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_decompressmaxlen (test.test_zlib.CompressObjectTestCase) ... ok
test_decompressmaxlenflush (test.test_zlib.CompressObjectTestCase) ... ok
test_decompresspickle (test.test_zlib.CompressObjectTestCase) ... ok
test_empty_flush (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_custom_length (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_large_length (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_with_freed_input (test.test_zlib.CompressObjectTestCase) ... ok
test_flushes (test.test_zlib.CompressObjectTestCase) ... ok
test_large_unconsumed_tail (test.test_zlib.CompressObjectTestCase) ... ok
test_large_unused_data (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlen_custom (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlen_large (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlenmisc (test.test_zlib.CompressObjectTestCase) ... ok
test_odd_flush (test.test_zlib.CompressObjectTestCase) ... ok
test_pair (test.test_zlib.CompressObjectTestCase) ... ok
test_wbits (test.test_zlib.CompressObjectTestCase) ... ok

--
Ran 54 tests in 324.004s

OK
1 test OK.

--
Added file: http://bugs.python.org/file43824/64bit_support_for_zlib_v10.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-21 Thread Martin Panter

Martin Panter added the comment:

Here is a new patch for 2.7 that should avoid the problem with 
ChecksumBigBufferTestCase. I tested a hacked version of the test, so I am more 
confident in it now :)

--
Added file: 
http://bugs.python.org/file43823/64bit_support_for_zlib_v11-2.7.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-20 Thread Xiang Zhang

Xiang Zhang added the comment:

Hi, Martin. I replied to your last comment, and your patch looks good to me 
except that one test fails:

[1/1] test_zlib
test_abcdefghijklmnop (test.test_zlib.ChecksumTestCase)
test issue1202 compliance: signed crc32, adler32 in 2.x ... ok
test_adler32empty (test.test_zlib.ChecksumTestCase) ... ok
test_adler32start (test.test_zlib.ChecksumTestCase) ... ok
test_crc32empty (test.test_zlib.ChecksumTestCase) ... ok
test_crc32start (test.test_zlib.ChecksumTestCase) ... ok
test_negative_crc_iv_input (test.test_zlib.ChecksumTestCase) ... ok
test_penguins (test.test_zlib.ChecksumTestCase) ... ok
test_same_as_binascii_crc32 (test.test_zlib.ChecksumTestCase) ... ok
test_big_buffer (test.test_zlib.ChecksumBigBufferTestCase) ... ERROR
test_badcompressobj (test.test_zlib.ExceptionTestCase) ... ok
test_baddecompressobj (test.test_zlib.ExceptionTestCase) ... ok
test_badlevel (test.test_zlib.ExceptionTestCase) ... ok
test_decompressobj_badflush (test.test_zlib.ExceptionTestCase) ... ok
test_overflow (test.test_zlib.ExceptionTestCase) ... ok
test_64bit_compress (test.test_zlib.CompressTestCase) ... ok
test_big_compress_buffer (test.test_zlib.CompressTestCase) ... ok
test_big_decompress_buffer (test.test_zlib.CompressTestCase) ... ok
test_custom_bufsize (test.test_zlib.CompressTestCase) ... ok
test_incomplete_stream (test.test_zlib.CompressTestCase) ... ok
test_large_bufsize (test.test_zlib.CompressTestCase) ... ok
test_speech (test.test_zlib.CompressTestCase) ... ok
test_speech128 (test.test_zlib.CompressTestCase) ... ok
test_64bit_compress (test.test_zlib.CompressObjectTestCase) ... ok
test_badcompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_baddecompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_big_compress_buffer (test.test_zlib.CompressObjectTestCase) ... ok
test_big_decompress_buffer (test.test_zlib.CompressObjectTestCase) ... ok
test_clear_unconsumed_tail (test.test_zlib.CompressObjectTestCase) ... ok
test_compresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_compressincremental (test.test_zlib.CompressObjectTestCase) ... ok
test_compressoptions (test.test_zlib.CompressObjectTestCase) ... ok
test_compresspickle (test.test_zlib.CompressObjectTestCase) ... ok
test_decompimax (test.test_zlib.CompressObjectTestCase) ... ok
test_decompinc (test.test_zlib.CompressObjectTestCase) ... ok
test_decompincflush (test.test_zlib.CompressObjectTestCase) ... ok
test_decompress_incomplete_stream (test.test_zlib.CompressObjectTestCase) ... ok
test_decompress_unused_data (test.test_zlib.CompressObjectTestCase) ... ok
test_decompresscopy (test.test_zlib.CompressObjectTestCase) ... ok
test_decompressmaxlen (test.test_zlib.CompressObjectTestCase) ... ok
test_decompressmaxlenflush (test.test_zlib.CompressObjectTestCase) ... ok
test_decompresspickle (test.test_zlib.CompressObjectTestCase) ... ok
test_empty_flush (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_custom_length (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_large_length (test.test_zlib.CompressObjectTestCase) ... ok
test_flush_with_freed_input (test.test_zlib.CompressObjectTestCase) ... ok
test_flushes (test.test_zlib.CompressObjectTestCase) ... ok
test_large_unconsumed_tail (test.test_zlib.CompressObjectTestCase) ... ok
test_large_unused_data (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlen_custom (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlen_large (test.test_zlib.CompressObjectTestCase) ... ok
test_maxlenmisc (test.test_zlib.CompressObjectTestCase) ... ok
test_odd_flush (test.test_zlib.CompressObjectTestCase) ... ok
test_pair (test.test_zlib.CompressObjectTestCase) ... ok
test_wbits (test.test_zlib.CompressObjectTestCase) ... ok

==
ERROR: test_big_buffer (test.test_zlib.ChecksumBigBufferTestCase)
--
Traceback (most recent call last):
  File "/usr/home/zhangxiang3/Python-2.7.12/Lib/test/test_support.py", line 1348, in wrapper
    return f(self, maxsize)
  File "/usr/home/zhangxiang3/Python-2.7.12/Lib/test/test_zlib.py", line 90, in test_big_buffer
    ChecksumTestCase.assertEqual32(self, zlib.crc32(data), 1044521549)
TypeError: unbound method assertEqual32() must be called with ChecksumTestCase instance as first argument (got ChecksumBigBufferTestCase instance instead)

--
Ran 54 tests in 317.008s

FAILED (errors=1)
test test_zlib failed -- Traceback (most recent call last):
  File "/usr/home/zhangxiang3/Python-2.7.12/Lib/test/test_support.py", line 1348, in wrapper
    return f(self, maxsize)
  File "/usr/home/zhangxiang3/Python-2.7.12/Lib/test/test_zlib.py", line 90, in test_big_buffer
    ChecksumTestCase.assertEqual32(self, zlib.crc32(data), 1044521549)
TypeError: unbound method assertEqual32() must be called with ChecksumTestCase instance as first argument 

[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-20 Thread Martin Panter

Martin Panter added the comment:

Here is a possible patch for 2.7. To fix everything on 2.7 I changed the module 
to accept input buffers over 2 GiB by enabling PY_SSIZE_T_CLEAN. As a 
consequence, the patch also includes ports of Nadeem Vawda’s adler32(), crc32() 
changes from Issue 10276, and my tests from Issue 25626.

I have only tested this on computers with less than 4 GiB of memory. I can test 
compression and checksums with more input by using a sparse memory map, but not 
decompression.

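from mmap import mmap
from zlib import compress

# Create a 5 GiB sparse file so the input does not have to fit in RAM
fm = open("5GiB.sparse", "w+b")
fm.truncate(5 * 2**30)
m = mmap(fm.fileno(), 0)
z = compress(m)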

--
type:  -> crash
Added file: 
http://bugs.python.org/file43805/64bit_support_for_zlib_v10-2.7.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-18 Thread Xiang Zhang

Xiang Zhang added the comment:

It may be hard to test this change without enough memory. I have uploaded the 
test results for the latest change.

--
Added file: http://bugs.python.org/file43775/zlib_test_result




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-16 Thread Martin Panter

Changes by Martin Panter :


--
Removed message: http://bugs.python.org/msg270561




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-16 Thread Martin Panter

Martin Panter added the comment:

I added one comment, but I think this might almost be ready.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-16 Thread Xiang Zhang

Xiang Zhang added the comment:

Uploaded the v9 version. It applies your last comment and catches up with the 
hg tip.

--
Added file: http://bugs.python.org/file43750/64bit_support_for_zlib_v9.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-14 Thread Martin Panter

Martin Panter added the comment:

This is on my list of things to look at, just that I have been away and am a 
bit backlogged atm.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-07-14 Thread Xiang Zhang

Xiang Zhang added the comment:

ping

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file43564/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file43565/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Xiang Zhang added the comment:

Make v8 patch consistent with the latest changeset.

--
Added file: http://bugs.python.org/file43564/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file43563/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file43561/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file43563/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-27 Thread Xiang Zhang

Xiang Zhang added the comment:

This is the v8 patch. It does two things:

[1] Applies Martin's comment about decompressobj.decompress, so that when the 
user passes in PY_SSIZE_T_MAX and there is enough memory, no error is raised.

[2] Makes decompressobj.flush work even if decompressobj.unconsumed_tail is 
larger than 4 GiB. This needs two changes. First, we no longer always use 
Z_FINISH. Second, save_unconsumed_input has to support 64-bit sizes, which we 
had not realized before. Corresponding tests are added.
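
As a rough illustration of point [2] (not taken from the patch itself), the 
sketch below shows decompressobj.flush() returning the output that a 
length-limited decompress() call left behind. The payload and the max_length 
value of 10 are made up for the example and are far smaller than the 4 GiB 
case the patch targets.

```python
import zlib

payload = b"a" * 100000            # made-up sample data
compressed = zlib.compress(payload)

dco = zlib.decompressobj()
# Limit the output of this call to at most 10 bytes; input that could
# not be processed yet is kept in dco.unconsumed_tail.
head = dco.decompress(compressed, 10)
# flush() processes the pending input and returns the remaining output.
tail = dco.flush()
```

Joining head and tail gives back the original payload.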

--
Added file: http://bugs.python.org/file43561/64bit_support_for_zlib_v8.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-25 Thread Xiang Zhang

Xiang Zhang added the comment:

Add the newest version applying Martin's comments.

--
Added file: http://bugs.python.org/file43538/64bit_support_for_zlib_v7.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-19 Thread Xiang Zhang

Xiang Zhang added the comment:

Uploaded the near-final patch. This one adds large buffer tests on 64-bit 
platforms. I have tested them on a server with enough memory. 

I did not add @support.requires_resource('cpu'), since large memory tests are 
bound to be slow anyway. In my experiments they were actually not that slow, at 
least not obviously slower than the other large memory tests in test_zlib, so I 
left it out.

--
Added file: http://bugs.python.org/file43469/64bit_support_for_zlib_v6.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file43460/64bit_support_for_zlib_v5.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file43461/64bit_support_for_zlib_v5.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Xiang Zhang added the comment:

Attach patch to restore argument parsing support for __int__.

--
Added file: http://bugs.python.org/file43460/64bit_support_for_zlib_v5.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Changes by Xiang Zhang :


Removed file: http://bugs.python.org/file43450/64bit_support_for_zlib_v4.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Changes by Xiang Zhang :


Added file: http://bugs.python.org/file43459/64bit_support_for_zlib_v4.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-18 Thread Xiang Zhang

Xiang Zhang added the comment:

Upload the new patch to fix bugs and make improvements. I'll add tests later.

--
Added file: http://bugs.python.org/file43450/64bit_support_for_zlib_v4.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-16 Thread Xiang Zhang

Xiang Zhang added the comment:

Sorry, I just realized that I misunderstood one of your comments. I have 
changed the patch accordingly. Please use the new version.

--
Added file: http://bugs.python.org/file43408/64bit_support_for_zlib_v3.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-16 Thread Xiang Zhang

Xiang Zhang added the comment:

I'm willing to, and thanks for your work too :) I have replied to your comments 
and adjusted my patch accordingly. But there are still several that I am 
confused about or would like to discuss further. 

I have now uploaded the adjusted patch.

--
Added file: http://bugs.python.org/file43407/64bit_support_for_zlib_v2.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-15 Thread Martin Panter

Martin Panter added the comment:

Thanks for working on this. I did a pass over your patch and left a bunch of 
comments.

--
stage: needs patch -> patch review




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-14 Thread Xiang Zhang

Xiang Zhang added the comment:

Hello Martin. I've finished a patch to add 64-bit support to zlib. I think it 
solves the 9 problems you mentioned and lets the input and output both be 
larger than UINT_MAX. I hope you are willing to review it so we can move this 
forward. :)

--
keywords: +patch
Added file: http://bugs.python.org/file43389/64bit_support_for_zlib.patch




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-05 Thread Jack McCracken

Jack McCracken added the comment:

Don't know how useful this will be, but here's a crash report from Mac OS X 
10.11 with Klamann's example (Python 3.5).

--
nosy: +Jack.McCracken
Added file: http://bugs.python.org/file43251/coredump_macosx10.11.5.crash




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-02 Thread Xiang Zhang

Xiang Zhang added the comment:

I'd like to help, but it will take some time, and I'd like to start after 
issue27164 is solved. zdict now also checks for the 4 GB limit.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-02 Thread Martin Panter

Martin Panter added the comment:

Klamann, thanks for the crash report. I think your decompress crash is explained by 
the bug expanding past UINT_MAX I identified above. The key is that length = 0 
in zlib_Decompress_decompress_impl(), as if wrapped around, and the return 
value will have been resized to zero. My suggested fix step 7 would address 
this.
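
To make the wrap-around concrete, here is a small sketch of how a buffer 
length that is repeatedly doubled in a 32-bit unsigned int ends up as zero. It 
assumes that "unsigned int" is 32 bits wide and that the output buffer grows by 
doubling; the 16 KiB starting size and the function name are hypothetical and 
only serve the illustration.

```python
UINT_MAX = 2**32 - 1  # assumes a 32-bit "unsigned int"


def doublings_until_wrap(start_length):
    """Count how many doublings it takes until the length wraps to 0."""
    length = start_length
    steps = 0
    while length:
        # Emulate "length << 1" stored back into a 32-bit unsigned int.
        length = (length << 1) & UINT_MAX
        steps += 1
    return steps


# Hypothetical 16 KiB starting buffer: 2**14 doubled 18 times is 2**32,
# which does not fit in 32 bits and becomes 0.
steps = doublings_until_wrap(16 * 1024)
```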

The workaround here would either be to pass compressed data in smaller chunks 
(4 MB or less), so that no chunk can expand to 4 GiB, or to make use of the 
max_length parameter. Either way, it will make any code more complicated though.
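
A minimal sketch of the max_length variant of this workaround is shown below. 
The helper name and the 64 KiB default are assumptions made for the example; 
only decompressobj(), decompress(), unconsumed_tail and flush() come from the 
zlib module.

```python
import zlib


def decompress_bounded(compressed, max_chunk=64 * 1024):
    """Decompress while never asking for more than max_chunk bytes per call."""
    dco = zlib.decompressobj()
    pieces = []
    pending = compressed
    while pending:
        # Each call returns at most max_chunk bytes of output.
        pieces.append(dco.decompress(pending, max_chunk))
        # Input that was not consumed because of the limit is fed back in.
        pending = dco.unconsumed_tail
    # Collect any output that is still pending.
    pieces.append(dco.flush())
    return b"".join(pieces)
```

Joining the pieces at the end is only for brevity. Real code that cares about 
memory would more likely write each piece out as it is produced.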

If anyone wants to write a patch (or do testing) to solve any or all of the 
problems, I am happy to help. But it is not a high priority for me to do all 
the work, because I am not set up to test it easily.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-02 Thread Klamann

Klamann added the comment:

> You should be able to use a compression (or decompression) object as a 
> workaround.

OK, let's see

>>> import zlib
>>> zc = zlib.compressobj()
>>> c1 = zc.compress(b'a' * 2**31)
>>> c2 = zc.compress(b'a' * 2**31)
>>> c3 = zc.flush()
>>> c = c1 + c2 + c3
>>> zd = zlib.decompressobj()
>>> d1 = zd.decompress(c)
Segmentation fault (core dumped)

Seriously? What is wrong with this library? I've tested this using Python 3.5.0 
on linux and Python 3.5.1 on Windows.
At least with Python 2.7.6 it seems to work as expected...

So, splitting the input into chunks of less than 2^32 bytes (less than 2^31 for 
Python 2.x) seems to work (except for this segfault in Python 3), but it's 
still annoying that you have to split and concatenate data each time and 
remember to call flush() or you lose data...

imho, it would be best to fix the underlying issue. There is no reason why we 
should keep the 32 bit limitation.

> Alternatively (or in the mean time), I guess we could document the limitation.

+1

--
Added file: http://bugs.python.org/file43099/_usr_bin_python3.5.1000.crash




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-01 Thread Xiang Zhang

Xiang Zhang added the comment:

Yes. I meant the compression object, not compress().

I found out more. The explicit overflow check was introduced to solve the 
problem in issue8650. It seems it was added to keep compatibility with py2 
(py2 raises the OverflowError in pyargparse).

I support loosening the limitation, but for now that can only go into the next 
version of py3.

--
nosy: +nadeem.vawda




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-06-01 Thread Martin Panter

Martin Panter added the comment:

You should be able to use a compression (or decompression) object as a 
workaround. But calling zlib.compress() multiple times would generate multiple 
separate deflated streams, which is different.
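
The difference can be seen in a small sketch. The sample data is made up and 
tiny; the point is only the shape of the output, not the sizes involved in this 
issue.

```python
import zlib

parts = [b"a" * 1000, b"b" * 1000]   # made-up sample data

# One compression object: a single deflate stream covering all parts.
co = zlib.compressobj()
one_stream = b"".join([co.compress(p) for p in parts] + [co.flush()])

# Separate zlib.compress() calls: two independent streams back to back.
two_streams = b"".join(zlib.compress(p) for p in parts)

# A decompression object stops at the end of the first stream and
# leaves the bytes of the second stream in unused_data.
dco = zlib.decompressobj()
first = dco.decompress(two_streams)
```

Here one_stream decompresses to the full data in one go, while first holds only 
the first part and the second stream still has to be handled separately.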

I think it is reasonable for Python to handle larger data sizes for zlib. (In 
theory, the 4 GiB UINT_MAX limit is platform-dependent.) IMO it is a matter of 
writing the patch(es), and perhaps depending on the complexity, deciding 
whether to apply them to 2.7 etc or just the next version of Python 3 (risk vs 
reward).

Alternatively (or in the mean time), I guess we could document the limitation.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Xiang Zhang

Xiang Zhang added the comment:

Quick and careless scanning at night led me to a wrong result. Sorry.

> You would need to compress just under 4 GiB of data that requires 5 MB or 
> more when compressed (i.e. not all the same bytes, or maybe try level=0).

With enough memory, compressing with level 0 does raise an error, while the 
default level does not. 

Apart from the overflow fix, does zlib have to support large data in one 
operation? For example, it would be OK for zlib.compress not to support data 
beyond 4 GB, since we can split the data in the application, call 
zlib.compress on each part, and finally concatenate the results.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Martin Panter

Martin Panter added the comment:

Sorry, Issue 10276 regarding crc32() and adler32() was only fixed for Python 3. 
Issue 23306 is open about crc32() in Python 2.

--




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Martin Panter

Martin Panter added the comment:

This is similar, but different to the other bug. The other bug was only about 
output limits for incrementally decompressed data. Klamann’s bug is about the 
actual size of input (and possibly also output) buffers.

The gzip.compress() implementation uses zlib.compressobj.compress(), which does 
not accept 2 or 4 GiB input either.

The underlying zlib library uses “unsigned int” for the size of input and 
output chunks. It has to be called multiple times to handle 4 GiB. In both 
Python 2 and 3, the one-shot compress() function only does a single call to 
zlib. This explains why Python 3 cannot take 4 GiB.
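
The "called multiple times" idea can be sketched at the Python level, as an 
illustration rather than the eventual C fix. The function name and the 1 GiB 
default chunk size are assumptions for the example; any chunk size below the 
"unsigned int" limit would serve.

```python
import zlib


def compress_large(data, chunk_size=2**30, level=-1):
    """Compress a bytes-like object by feeding it to zlib in bounded chunks."""
    # A memoryview lets us slice the input without copying it.
    view = memoryview(data).cast("B")
    co = zlib.compressobj(level)
    out = []
    for start in range(0, len(view), chunk_size):
        # Each call hands zlib at most chunk_size bytes of input.
        out.append(co.compress(view[start:start + chunk_size]))
    # flush() finishes the stream; without it data is lost.
    out.append(co.flush())
    return b"".join(out)
```

The result is a single zlib stream, so the ordinary zlib.decompress() can read 
it back as long as the sizes fit its own limits.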

Python 2 uses an “int” for the input buffer size, hence the 2 GiB limit.

I tend to think of these cases as bugs, which could be fixed in 3.5 and 2.7. 
Sometimes others also treat adding 64-bit support as a bug fix, e.g. 
file.read() on Python 2 (Issue 21932). But other times it is handled as a new 
feature for the next Python version, e.g. os.read() was fixed in 3.5, but not 
2.7 (Issue 21932), random.getrandbits() proposed for 3.6 only (Issue 27072).

This kind of bug is apparently already fixed for crc32() and adler32() in 
Python 2 and 3; see Issue 10276.

This line from zlib.compress() also worries me:

zst.avail_out = length + length/1000 + 12 + 1; /* unsigned ints */

I suspect it may overflow, but I don’t have enough memory to verify. You would 
need to compress just under 4 GiB of data that requires 5 MB or more when 
compressed (i.e. not all the same bytes, or maybe try level=0).
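
The suspected wraparound can be checked on paper without allocating 4 GiB. 
The sketch below emulates the expression in Python, assuming a 32-bit 
“unsigned int”; the function name avail_out_as_uint is made up for this 
illustration.

```python
UINT_BITS = 32  # assumption: unsigned int is 32 bits wide


def avail_out_as_uint(length):
    """Return (exact value, value after wrapping to an unsigned int).

    Mirrors `length + length/1000 + 12 + 1`; C unsigned division truncates,
    hence the floor division here.
    """
    exact = length + length // 1000 + 12 + 1
    return exact, exact % (1 << UINT_BITS)
```

For length = 2**32 - 1 the exact value is 4299262275, which wraps to 4294979, 
i.e. an output buffer of only about 4.3 MB.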

Also, the logic for expanding the output buffer in each of zlib.decompress(), 
compressobj.compress(), decompressobj.decompress(), compressobj.flush(), and 
decompressobj.flush() looks faulty when it hits UINT_MAX. I suspect it may 
overwrite unallocated memory or do other funny stuff, but again I don’t have 
enough memory to verify. What happens when you decompress more than 4 GiB when 
the compressed input is less than 4 GiB?

Code fixes that I think could be made:

1. Avoid the output buffer size overflow in the zlib.compress() function

2. Rewrite zlib.compress() to call deflate() in a loop, one iteration for each 
4 GiB input or output chunk

3. Allow the zlib.decompress() function to expand the output buffer beyond 4 GiB

4. Rewrite zlib.decompress() to pass 4 GiB input chunks to inflate()

5. Allow the compressobj.compress() method to expand the output buffer beyond 4 
GiB

6. Rewrite compressobj.compress() to pass 4 GiB input chunks to deflate()

7. Allow the decompressobj.decompress() method to expand the output buffer 
beyond 4 GiB

8. Rewrite decompressobj.decompress() to pass 4 GiB input chunks to inflate(), 
and to save 4 GiB in decompressobj.unconsumed_tail and unused_data

9. Change the two flush() methods to abort if they allocate UINT_MAX bytes, 
rather than pointing into unallocated memory (I don’t think this could happen 
in real usage, but the code shares the same problem as above.)
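
Until fixes like these land, an application can keep every output buffer 
small by using the existing max_length argument and unconsumed_tail attribute 
of decompressobj. This is only a sketch of that pattern, not the proposed C 
change; the name decompress_in_pieces and the 1 MiB default are assumptions.

```python
import zlib


def decompress_in_pieces(data, max_piece=1 << 20):
    """Yield decompressed output in bounded pieces (hypothetical helper)."""
    decompressor = zlib.decompressobj()
    pending = data
    while pending:
        # At most max_piece bytes are returned; input that was not needed
        # yet is kept in unconsumed_tail for the next call.
        piece = decompressor.decompress(pending, max_piece)
        if piece:
            yield piece
        pending = decompressor.unconsumed_tail
    # flush() returns any output still held back; it is not bounded by
    # max_piece in this sketch.
    tail = decompressor.flush()
    if tail:
        yield tail
```

A caller can write each piece to a file as it arrives instead of joining 
them, so the full output never has to sit in one buffer.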

--
components: +Extension Modules -Library (Lib)
stage:  -> needs patch
versions: +Python 3.6 -Python 3.4




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Klamann

Klamann added the comment:

> But you can only get that feature with Python3.5+.

Well, I have Python 3.5.1 installed and the problem still persists. I'm not 
sure that 25626 is the same problem - in the comments they say this was not an 
issue in Python 3.4 or 2.x, but this is clearly the case here.

Another thing I've noticed: Contrary to my previous statement, 
zlib.decompress() doesn't work on archives that are larger than 4GB (I was 
misled by the fact that my 1GB archive contained a 6GB file).

When I use gzip.compress() on more than 2^32 bytes, the same OverflowError 
occurs as with zlib.compress(). But when I use gzip.decompress(), I can extract 
archives that are larger than 4GB.
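
A possible workaround sketch for the gzip case: write the data through 
gzip.GzipFile in slices instead of calling gzip.compress() once. Whether this 
avoids the OverflowError on the affected versions has not been verified in 
this thread; the helper name gzip_compress_in_chunks and the chunk size are 
assumptions.

```python
import gzip
import io


def gzip_compress_in_chunks(data, chunk_size=1 << 30, compresslevel=9):
    """Gzip a bytes object by writing it in slices (hypothetical helper)."""
    buffer = io.BytesIO()
    view = memoryview(data)  # slices of a memoryview are not copies
    with gzip.GzipFile(fileobj=buffer, mode="wb",
                       compresslevel=compresslevel) as gz:
        for start in range(0, len(view), chunk_size):
            gz.write(view[start:start + chunk_size])
    # GzipFile does not close a fileobj it was given, so the buffer is
    # still readable after the with block.
    return buffer.getvalue()
```

For data that fits comfortably in memory, 
gzip.decompress(gzip_compress_in_chunks(data)) should return the original 
bytes.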




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Xiang Zhang

Changes by Xiang Zhang :


--
nosy: +martin.panter




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Xiang Zhang

Xiang Zhang added the comment:

This behaviour seems to have been fixed in issue25626. But you can only get 
that feature with Python3.5+.

--
nosy: +xiang.zhang




[issue27130] zlib: OverflowError while trying to compress 2^32 bytes or more

2016-05-26 Thread Klamann

New submission from Klamann:

zlib fails to compress files larger than 4 GB due to some 32-bit issues.

I've tested this in Python 3.4.3 and 3.5.1:

> python3 -c "import zlib; zlib.compress(b'a' * (2**32 - 1))"
> python3 -c "import zlib; zlib.compress(b'a' * (2**32))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OverflowError: Size does not fit in an unsigned int

For Python 2.7, the issue starts at 2^31 bytes (due to signed 32-bit integers):

> python2 -c "import zlib; zlib.compress(b'a' * (2**31 - 1))"
> python2 -c "import zlib; zlib.compress(b'a' * (2**31))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OverflowError: size does not fit in an int

Decompressing files larger than 4GB works just fine.

--
components: Library (Lib)
messages: 266436
nosy: Klamann
priority: normal
severity: normal
status: open
title: zlib: OverflowError while trying to compress 2^32 bytes or more
versions: Python 2.7, Python 3.4, Python 3.5
