ID:               33070
User updated by:  lindsay at bitleap dot com
Reported By:      lindsay at bitleap dot com
Status:           Open
Bug Type:         Performance problem
Operating System: Linux 2.6.10 kernel
PHP Version:      5.0.3
New Comment:
I noticed the new code was crashing. You can reliably reproduce the crash by
creating a file which compresses well. I created my data sample with the
following commands:

dd if=/dev/zero bs=1024 count=65536 of=large
bzip2 large

When I run the script on this, I get the following crash:

file length 256K ran in: 0 seconds
file length 512K ran in: 1 seconds
file length 1M ran in: 1 seconds
file length 2M ran in: 2 seconds
file length 3M ran in: 2 seconds
file length 4M ran in: 3 seconds
file length large ran in:
*** glibc detected *** double free or corruption (!prev): 0x0853f418 ***


Previous Comments:
------------------------------------------------------------------------

[2005-05-23 20:09:43] lindsay at bitleap dot com

I applied the patch suggested by jeff at vertexdev dot com. I also added
another test for 64M data chunks. Just to be sure, I verified that PHP's
bzdecompress with the patch produces data with the same md5sum as bunzip2.

file length 256K ran in: 0 seconds
file length 512K ran in: 0 seconds
file length 1M ran in: 2 seconds
file length 2M ran in: 1 seconds
file length 3M ran in: 3 seconds
file length 4M ran in: 3 seconds
file length 64M ran in: 38 seconds

------------------------------------------------------------------------

[2005-05-23 16:50:02] lindsay at bitleap dot com

It looks like BZ2_bzRead() requires a file to read from. Would:

BZ2_bzDecompressInit
loop BZ2_bzDecompress   (decompress in chunks)
BZ2_bzDecompressEnd

work? The source code for BZ2_bzBuffToBuffDecompress looks like it could
almost be reused if BZ2_bzDecompress were looped over 'source' in chunks.

------------------------------------------------------------------------

[2005-05-23 10:02:56] [EMAIL PROTECTED]

jeff at vertexdev dot com: Yes. And now look at the code.

------------------------------------------------------------------------

[2005-05-23 00:40:16] jeff at vertexdev dot com

The fix is really easy. In the bzdecompress function, you need to do the
following:

1. Before the 'do' loop, initialize size to some reasonable value:

   size = PHP_BZ_DECOMPRESS_SIZE;

2. Inside the loop, double the size each time (replace the existing
   'size = dest_len * iter;' statement with this):

   size *= 2;

This will temporarily use a little more memory than strictly necessary,
but it will make the function usable. Drop me an email if you need more
information.

------------------------------------------------------------------------

[2005-05-22 22:04:49] [EMAIL PROTECTED]

Yes, this is possible with the BZ2_bzRead() API call.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the
rest of the comments, please view the bug report online at
http://bugs.php.net/33070

--
Edit this bug report at http://bugs.php.net/?id=33070&edit=1
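
For reference, the streaming loop proposed in the 2005-05-23 16:50:02 comment
can be sketched against the public libbz2 API. The code below is only an
illustration of that idea, not the actual ext/bz2 patch: the function name
bz_decompress_all, the 4096-byte starting buffer, and the error handling are
assumptions. The output buffer is doubled when it fills, as jeff at vertexdev
dot com suggested for the existing code.

#include <bzlib.h>
#include <stdlib.h>
#include <string.h>

/* Decompress an in-memory bzip2 buffer in a single streaming pass.
 * Returns a malloc'd buffer (caller frees) and stores its length in
 * *out_len, or returns NULL on error. */
static char *bz_decompress_all(const char *src, size_t src_len, size_t *out_len)
{
    bz_stream strm;
    size_t cap = 4096;               /* initial output capacity: an assumption */
    size_t before;
    char *out = malloc(cap);
    int rc;

    if (out == NULL)
        return NULL;

    memset(&strm, 0, sizeof(strm));  /* NULL bzalloc/bzfree = default allocators */
    if (BZ2_bzDecompressInit(&strm, 0, 0) != BZ_OK) {
        free(out);
        return NULL;
    }

    strm.next_in  = (char *)src;
    strm.avail_in = (unsigned int)src_len;
    *out_len = 0;

    do {
        if (*out_len == cap) {       /* output buffer full: double it */
            char *grown = realloc(out, cap * 2);
            if (grown == NULL) {
                rc = BZ_MEM_ERROR;
                break;
            }
            out = grown;
            cap *= 2;
        }
        strm.next_out  = out + *out_len;
        strm.avail_out = (unsigned int)(cap - *out_len);

        before = *out_len;
        rc = BZ2_bzDecompress(&strm);
        *out_len = cap - strm.avail_out;

        /* Truncated input: no input left and no progress made, so give up. */
        if (rc == BZ_OK && strm.avail_in == 0 && *out_len == before)
            break;
    } while (rc == BZ_OK);

    BZ2_bzDecompressEnd(&strm);

    if (rc != BZ_STREAM_END) {       /* anything other than a clean end is an error */
        free(out);
        return NULL;
    }
    return out;
}

Because BZ2_bzDecompress keeps its state between calls, the input is
decompressed exactly once and each buffer growth only produces the new bytes,
whereas the BZ2_bzBuffToBuffDecompress retry loop restarts the whole
decompression every time the output buffer turns out to be too small.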