I've recently run into the issue of clamd being unable to scan files larger than a few GB, and I have seen the warnings in `man clamd.conf` that limits specified above 4GB are ignored.
Could developers or other folks familiar with the clamd codebase comment on the feasibility of scanning a large file in multiple pieces as a way of handling it? For example, given a 6GB file, does using multiple INSTREAM calls (that's how I'm interacting with clamd currently) to check the full 6GB seem like it should work reliably?

INSTREAM: bytes 0-1000MB
INSTREAM: bytes 900MB-1.9GB
INSTREAM: bytes 1.8GB-2.8GB
INSTREAM: bytes 2.7GB-3.7GB
INSTREAM: bytes 3.6GB-4.6GB
INSTREAM: bytes 4.5GB-5.5GB
INSTREAM: bytes 5.4GB-6.0GB

The chunks above overlap: the 100MB of data that starts at the 900MB position is scanned twice, once in the first call (as the last 100MB of that stream) and once in the second call (as the first 100MB of that stream). This reduces the possibility of a virus straddling a chunk boundary and therefore going unrecognized.

If ClamAV needs the first bytes of a file in order to know what kind of file it is scanning and to trigger filetype-specific heuristics, then the scheme above could be adapted so that the first N bytes of the first chunk are prepended to each subsequent chunk that is checked for that file.

Thanks for any guidance or feedback you can provide.

_______________________________________________
Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq
http://www.clamav.net/contact.html#ml
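For what it's worth, here is a minimal Python sketch of the overlapping-chunk idea described above. The `overlapping_ranges` helper and the host/port values are my own illustration; the INSTREAM framing itself (the `zINSTREAM\0` command, 4-byte network-byte-order length prefixes, and a zero-length block as terminator) follows the clamd protocol as documented in `man clamd`. Whether scanning a file piecewise like this is actually reliable for detection is exactly the question being asked, so treat this as a sketch of the mechanics, not an endorsed approach.

```python
import socket
import struct

def overlapping_ranges(total_size, chunk_size, overlap):
    """Yield (start, end) byte ranges covering total_size, where each
    chunk after the first begins `overlap` bytes before the previous
    chunk's end, so a signature shorter than `overlap` cannot be split
    cleanly across a boundary."""
    start = 0
    while True:
        end = min(start + chunk_size, total_size)
        yield (start, end)
        if end >= total_size:
            break
        start = end - overlap

def instream_scan(addr, f, start, end, block=1 << 20):
    """Scan bytes [start, end) of file object f via one INSTREAM call.

    Each block is sent as a 4-byte big-endian length prefix followed by
    the data; a zero-length block terminates the stream, after which
    clamd replies with the scan result.
    """
    f.seek(start)
    remaining = end - start
    with socket.create_connection(addr) as s:
        s.sendall(b"zINSTREAM\0")
        while remaining > 0:
            data = f.read(min(block, remaining))
            if not data:
                break
            s.sendall(struct.pack("!I", len(data)) + data)
            remaining -= len(data)
        s.sendall(struct.pack("!I", 0))  # zero-length block ends the stream
        return s.recv(4096).decode()

def scan_large_file(path, addr=("localhost", 3310),
                    chunk_size=1_000_000_000, overlap=100_000_000):
    """Scan a large file as a series of overlapping INSTREAM calls,
    matching the chunking laid out in the post (1GB chunks, 100MB
    overlap). Returns the list of per-chunk replies from clamd."""
    results = []
    with open(path, "rb") as f:
        f.seek(0, 2)
        total = f.tell()
        for start, end in overlapping_ranges(total, chunk_size, overlap):
            results.append(instream_scan(addr, f, start, end))
    return results
```

Note that each individual stream is still subject to clamd's `StreamMaxLength` setting, so `chunk_size` has to stay below that limit for the individual calls to go through.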