Hi Alex,

I don't like seeing a well-researched question go unanswered, though I don't 
have a very good answer for you.  We don't have any documentation from 
previous work that says whether there is an optimum chunk size for TCP sockets 
or unix sockets.

Intuitively, if you're using a TCP socket, particularly if sending over the 
network (hopefully using an encrypted SSH tunnel), then the TCP/IP stack will 
probably split the stream into packets for you. If you do your own chunking, 
keeping your chunk size below the MTU for the TCP/IP stack may prevent you 
from sending an itty-bitty leftover chunk in every other packet.

If you're using a local unix socket, I really don't know whether chunking buys 
you anything.  If you do end up doing some testing, I'd be interested to hear 
what you learn.
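
As a rough starting point for that kind of testing, here is a small Python 
sketch that only does the arithmetic implied by the chunk format in the clamd 
man page (a 4-byte length prefix per chunk, plus a zero-length terminator). The 
function name framing_overhead and the candidate sizes are illustrative 
assumptions, not anything from ClamAV, and it says nothing about how clamd or 
the kernel actually perform at each size.

```python
from typing import Tuple


def framing_overhead(file_size: int, chunk_size: int) -> Tuple[int, int]:
    """Return (number of data chunks, total length-prefix bytes) for one stream."""
    # Integer ceiling division: how many chunks are needed to cover the file.
    chunks = -(-file_size // chunk_size)
    # One 4-byte prefix per data chunk, plus the zero-length terminating chunk.
    header_bytes = 4 * (chunks + 1)
    return chunks, header_bytes


if __name__ == "__main__":
    ten_mib = 10 * 1024 * 1024  # hypothetical upload size
    for size in (1024, 8192, 65536, 1048576):  # arbitrary candidate chunk sizes
        chunks, hdr = framing_overhead(ten_mib, size)
        print(f"{size:>8} B chunks: {chunks:>6} chunks, {hdr:>6} header bytes")
```

The prefix bytes are a tiny fraction of the payload even at 1 KiB chunks, so 
any real difference between chunk sizes would have to come from per-write 
costs, which only measurement against a running clamd can show.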

Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.


On Oct 29, 2018, at 3:32 PM, Wreschnig, Alexander Scott 
<as...@pitt.edu> wrote:

I have what is hopefully a quick question regarding clamd. What’s a good method 
for determining ideal chunk sizes when streaming data to the daemon over a 
socket connection? Or should I ignore chunking altogether and just stream one 
big contiguous file?

The background: I’ve developed a very simple plugin for an unrelated 
application that sends user-uploaded files of varying formats to clamd over a 
socket for some basic virus scanning. At the moment, and based on some of the 
clamd documentation, it loops over each file grabbing small chunks at a time 
and streams each of those chunks to clamd. It’s working fine, so I can in 
theory leave it exactly as-is. But I used an arbitrary value for chunk size and 
as I’m looking more closely I’m having a hard time finding documentation on how 
this works or what my chunk size should be (beyond the maximum chunk size, 
which I can see is StreamMaxLength). For reference, from man clamd:

“The stream is sent to clamd in chunks, after INSTREAM, on the same socket on 
which the command was sent. This avoids the overhead of establishing new TCP 
connections and problems with NAT. The format of the chunk is: '<length><data>' 
where <length> is the size of the following data in bytes expressed as a 4 byte 
unsigned integer in network byte order and <data> is the actual chunk. 
Streaming is terminated by sending a zero-length chunk. Note: do not exceed 
StreamMaxLength as defined in clamd.conf […]”
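
The framing quoted above can be sketched in Python as follows. This is a 
minimal illustration, not code from ClamAV: the function name instream_frames 
and the 8192-byte default chunk size are assumptions chosen for the example, 
and the "z" prefix with a NUL terminator is the null-delimited command form 
described in man clamd.

```python
import struct
from typing import BinaryIO, Iterator


def instream_frames(src: BinaryIO, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the byte strings to write to clamd for one INSTREAM scan."""
    # Command first, on the same socket the chunks will follow on.
    yield b"zINSTREAM\0"
    while True:
        data = src.read(chunk_size)
        if not data:
            break
        # <length> is a 4-byte unsigned integer in network byte order,
        # immediately followed by <data>.
        yield struct.pack("!I", len(data)) + data
    # A zero-length chunk terminates the stream.
    yield struct.pack("!I", 0)
```

Each yielded frame would be passed to sendall() on an already connected 
socket, after which the scan result is read back from the same socket. The 
caller is still responsible for keeping the total size within StreamMaxLength.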

StreamMaxLength, on the other hand, is documented as

“[…] This option allows you to specify the upper limit for data size that will 
be transfered to remote daemon when scanning a single file. It should match 
your MTA's limit for a maximum attachment size.”

Looking at this combination I’m wondering if, since I’m only worrying about 
attachments (which by definition shouldn’t be larger than maximum attachment 
size), there’s another good reason to chunk things up or if I should just 
stream everything in one go.

Sorry if there’s an obvious answer staring at me and I’m not seeing it—I swear 
I looked! And thanks for any advice.

—
Alex Wreschnig

_______________________________________________
clamav-users mailing list
clamav-users@lists.clamav.net
http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
