Bug#294929: rzip does not work for large files
Marc A. Lehmann on 2005-02-13 08:12:48 +0100:

> > The man page could be a little clearer. If you would like to submit a
> > diff, that would be great; otherwise, I'll take care of it as time
> > permits.
>
> I'll put it on my todo list; don't wait for me, though, I am pretty
> busy. But if a diff arrives and you haven't patched it yet, feel free.

Hi Marc,

I took a look at the man page today to see how I would alter it, and it
seems I missed the place where it does talk about memory usage during our
initial conversation:

    -0..9  Set the compression level from 0 to 9. The default is to use
           level 9, which is the slowest but gives the best compression
           rate. The compression level is also strongly related to how
           much memory rzip uses, so if you are running rzip on a machine
           with limited amounts of memory then you will probably want to
           choose a level less than 9.

Do you feel like that should be expanded on? I'm leaning towards 'no' and
beating myself over the head for not RTFM'ing closely enough :)

Alec
Bug#294929: rzip does not work for large files
root on 2005-02-12 14:25:47 +0100:

> Package: rzip
> Version: 2.0-2
> Severity: important
>
> Unlike the manpage claims, rzip does not work for large files, as it
> tries to mmap the whole file into memory:
>
>     -rw------- 1 root root 842895360 Feb 12 12:45 backup.tar
>
>     # strace rzip -9 backup.tar
>     ...
>     fstat64(3, {st_mode=S_IFREG|0600, st_size=842895360, ...}) = 0
>     mmap2(NULL, 842895360, PROT_READ, MAP_SHARED, 3, 0) = -1 ENOMEM (Cannot allocate memory)
>     write(2, "Failed to map buffer in rzip_fd\n", 32) = 32
>     Failed to map buffer in rzip_fd
>
> That is, rzip fails for most files that it is designed for.

I'm guessing that the problem is the amount of memory you have available.
rzip will copy into memory up to 900MB of the file at a time (see the man
page, section COMPRESSION ALGORITHM); how much RAM + swap do you have
available?
Bug#294929: rzip does not work for large files
On Sat, Feb 12, 2005 at 09:12:01AM -0500, Alec Berryman [EMAIL PROTECTED] wrote:

> > That is, rzip fails for most files that it is designed for.
>
> I'm guessing that the problem is the amount of memory you have
> available. rzip will copy into memory up to 900MB of the file at a time
> (see the man page, section COMPRESSION ALGORITHM); how much RAM

Hmm, nothing in the man page claims it's copying that much into memory,
or that it needs that much memory. It does refer to a 900MB history
buffer, so maybe that means it needs that much memory (in fact, strace
does not show that it tries to allocate that much memory; all it does is
mmap the file, but the same could be done by read()ing it).

If that is indeed the case, that explains the problem. It might help if
that gets documented more clearly in the manpage. I somehow doubt that it
needs 900MB of RAM, though.

> + swap do you have available?

Well, way less :)

--
Marc Lehmann [EMAIL PROTECTED] http://schmorp.de/
Bug#294929: rzip does not work for large files
Marc A. Lehmann on 2005-02-12 19:23:55 +0100:

> Hmm, nothing in the man page claims it's copying that much into memory,
> or that it needs that much memory. It does refer to a 900MB history
> buffer,

For the history buffer to be effective, it must be in memory and can't be
piecemeal loaded and discarded. For a similar reason, rzip can't be used
in a pipe. The author is not under the impression it can be easily coded
around.

> so maybe that means it needs that much memory (in fact, strace does not
> show that it tries to allocate that much memory; all it does is mmap
> the file, but the same could be done by read()ing it).

As mentioned, the history buffer needs to be in memory; I don't see the
advantage in read()ing it instead of mmap()ing it, since the end result
is that up to 900MB of the file is in memory at the same time.

> If that is indeed the case, that explains the problem. It might help if
> that gets documented more clearly in the manpage. I somehow doubt that
> it needs 900MB of RAM, though.

It doesn't *really* need 900MB of RAM + swap, as I read the code; take a
look at rzip.c:581 if you're so inclined. The minimum history buffer size
is 100MB, and each additional level of compression adds another 100MB of
memory. In your original example, you used `rzip -9`. I am under the
impression that this is how bzip2 works.

The man page could be a little clearer. If you would like to submit a
diff, that would be great; otherwise, I'll take care of it as time
permits.

Alec
Bug#294929: rzip does not work for large files
On Sat, Feb 12, 2005 at 07:10:55PM -0500, Alec Berryman [EMAIL PROTECTED] wrote:

> Marc A. Lehmann on 2005-02-12 19:23:55 +0100:
>
> > Hmm, nothing in the man page claims it's copying that much into
> > memory, or that it needs that much memory. It does refer to a 900MB
> > history buffer,
>
> For the history buffer to be effective,

BTW, the bug report should either (preferably) be closed or tagged
wishlist.

> > not show that it tries to allocate that much memory; all it does is
> > mmap the file, but the same could be done by read()ing it).
>
> As mentioned, the history buffer needs to be in memory; I don't see the
> advantage in read()ing it instead of mmap()ing it, since the end result
> is that up to 900MB of the file is in memory at the same time.

Well, the advantage would be that it would run :)

Anyway, I looked at the source, and what it does is roughly this: hash by
linearly reading through the file; if a possible match is found, look
back and compare, and if not, don't look back at all.

Anyways, it's clear that some work would be involved, so the only request
I have would be to document that (more clearly). There is a difference,
to me, between an algorithm that has an effective history buffer of 900MB
and an algorithm that needs 900MB of RAM :)

> It doesn't *really* need 900MB of RAM + swap, as I read the code; take
> a look at rzip.c:581 if you're so inclined. The minimum history buffer
> size is 100MB, and each additional level of compression adds another
> 100MB of memory. In your original example, you used `rzip -9`. I am
> under the impression that this is how bzip2 works.

Oh, I assumed that would be the bzip2 compression level. Again, this
could be mentioned in the manpage.

> The man page could be a little clearer. If you would like to submit a
> diff, that would be great; otherwise, I'll take care of it as time
> permits.

I'll put it on my todo list; don't wait for me, though, I am pretty busy.
But if a diff arrives and you haven't patched it yet, feel free.

Thanks for the explanation and insights!
--
Marc Lehmann [EMAIL PROTECTED] http://schmorp.de/