Thank you for your detailed message. Now I see the problem: a different (and
much more complicated) strategy is needed for files above a certain size,
larger than available RAM. It implies mmapping the protected file by parts,
and creating the fec files concurrently, also by parts. Testing and repairing
a file by parts will be even more difficult.
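Processing a file in fixed-size mmapped windows could look roughly like the
following minimal C sketch. WINDOW_SIZE and process_window are illustrative
names, not lziprecover internals, and the per-window "work" here is just a
byte sum:

```c
/* Sketch: process a file in fixed-size mmapped windows instead of
   mapping it whole.  Names here are illustrative, not lziprecover's. */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

enum { WINDOW_SIZE = 1 << 20 };         /* 1 MiB per mapping (page-aligned) */

/* Hypothetical per-window work: here, just a byte sum. */
static uint64_t process_window(const uint8_t *p, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}

static uint64_t sum_file_by_parts(const char *name) {
    int fd = open(name, O_RDONLY);
    if (fd < 0) { perror("open"); exit(1); }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); exit(1); }
    uint64_t total = 0;
    /* mmap offsets must be multiples of the page size; WINDOW_SIZE is. */
    for (off_t off = 0; off < st.st_size; off += WINDOW_SIZE) {
        size_t len = (size_t)((st.st_size - off < WINDOW_SIZE) ?
                              st.st_size - off : WINDOW_SIZE);
        void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (map == MAP_FAILED) { perror("mmap"); exit(1); }
        total += process_window(map, len);
        munmap(map, len);       /* release before mapping the next part */
    }
    close(fd);
    return total;
}
```

Only one window is mapped at a time, so the address-space footprint stays
bounded no matter how large the file is.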
Phako wrote:
> I noticed that when I run -Fl on the archive instead of the fec file by
> mistake, lziprecover doesn't return an error; it keeps working, reading
> the file. ZSH kills it after about 30 seconds (probably because it takes
> a lot of memory), but BASH keeps it running and I had to kill it manually.
Thanks. This is because lziprecover can use a badly damaged fec file, but it
currently reads the whole file before checking it. I plan to implement a
more complex reader that will check the first header before reading the rest
of the file, and will require --force to read a damaged (or wrong) file.
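Such an early sanity check could be as simple as the sketch below: read just
the first few bytes and refuse files that do not start with the expected
magic, instead of reading or mapping the whole file first. The 4-byte magic
used here is an invented placeholder, not the real fec header format:

```c
/* Sketch: validate a file's header before committing to read the rest.
   FEC_MAGIC is a hypothetical placeholder, not lziprecover's format. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define FEC_MAGIC "FEC\1"               /* invented for this example */

static bool header_looks_valid(const char *name) {
    unsigned char buf[4];
    FILE *f = fopen(name, "rb");
    if (!f) return false;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    /* Refuse early instead of slurping a possibly huge wrong file. */
    return n == sizeof buf && memcmp(buf, FEC_MAGIC, sizeof buf) == 0;
}
```

A `--force` path would simply skip this check and fall back to reading the
whole file as before.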
Lziprecover uses mmap because it needs access to the whole file and
"it is possible to mmap files orders of magnitude larger than both the
physical memory and swap space."[1] Maybe the problem is that the VM
you are using also limits the address space available to mmap.
> I don't know enough about how VM memory works to have anything relevant
> to add, but after a quick Google search it seems quite possible that
> mmap behaves differently, or is restricted, in a KVM VM.
Maybe a VM lacks memory protection hardware. See
https://pubs.opengroup.org/onlinepubs/9799919799/functions/mmap.html
RATIONALE, paragraph 8:
"The MAP_PRIVATE function can be implemented efficiently when memory
protection hardware is available. When such hardware is not available,
implementations can implement such "mappings" by simply making a real copy
of the relevant data into process private memory, though this tends to
behave similarly to read()."
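The MAP_PRIVATE semantics that paragraph describes can be demonstrated with
a small self-contained program: writes through a private mapping modify a
per-process copy, never the underlying file. With an MMU that copy is made
lazily on the first write; without one, an implementation may have to copy
the data up front, as the rationale notes:

```c
/* Demo: a write through a MAP_PRIVATE mapping does not reach the file. */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static bool private_write_leaves_file_intact(const char *name) {
    int fd = open(name, O_RDWR);
    if (fd < 0) return false;
    char *map = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { close(fd); return false; }
    map[0] = 'Z';                    /* triggers copy-on-write */
    munmap(map, 1);
    char c = 0;
    ssize_t n = pread(fd, &c, 1, 0); /* reread from the file itself */
    close(fd);
    return n == 1 && c != 'Z';       /* the file still holds the old byte */
}
```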
Maybe this can be worked around by mmapping the protected file by parts
(which brings other complications, but is simpler and more efficient than
processing each member of a multimember file separately).
$ time lziprecover -v -Fc2% archive.tar.lz
archive.tar.lz.fec: 300_754_928 bytes, 300_482_560 fec bytes, 655 blocks
real 450m44,092s
user 171m33,329s
sys 27m56,155s
These times mean that lziprecover spent most of the time waiting for the
disk. Fec file creation is multithreaded; on a 4-processor machine with
enough RAM the command above should have completed in about 45 minutes
(171 minutes of user time divided among 4 processors is roughly 43 minutes).
Best regards,
Antonio.