Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-06 Thread Nick Leverton
On Wednesday 05 April 2006 18:25, Daniel T. Staal wrote:
 On Wed, April 5, 2006 1:08 pm, Rob MacGregor said:
  WTF are you doing accepting emails at 200 MB?  There are far more
  appropriate methods of file transfer than SMTP!

 But they all require more complicated and lasting setups than SMTP, for
 a specific set of senders/receivers.

 If you want to send a large file between two people who are likely to
 never send each other a file again, SMTP is a quick and easy way to do
 it. It is not the most efficient, but efficiency is not a primary
 concern in this case.  Having ClamAV not fall over would be nice.  ;)

Thanks for your comments, I'm glad someone can see my point of view here :)

If I zip up five 35 MB files into a zip archive (resulting size 110 MB), Clam 
scans it in 30 seconds and needs less than 20 MB of VmData including all 
the patterns etc.  Clamscan reports Data scanned: 108.61 MB which is the 
same size as the unzipped contents.

If I enclose those identical files into an email, which is just another 
form of encapsulation, then Clam will take over 3 minutes and requires 
enough VmData to keep that entire email in memory, plus apparently at 
least one copy of the attachment it is working on - total 300 MB required 
to scan the exact same files.  Clamscan also reports Data scanned: 662.11 
MB - six times as much as was actually in the email, over three times as 
much as the Base64 contents.

So Clam scanning an mbox format MIME archive is six times slower and over 
10 times greedier for RAM than scanning the same files in a zip archive.
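
For reference, the ratios quoted above can be checked with a few lines of Python. The figures are the ones reported in this message; "over 3 minutes" and "less than 20 MB" are bounds, so the time and RAM ratios are lower bounds rather than exact values.

```python
# Figures as reported in the message above.
zip_seconds, mbox_seconds = 30, 180          # "over 3 minutes" taken as 180 s
zip_ram_mb, mbox_ram_mb = 20, 300            # "less than 20 MB" taken as 20
zip_scanned_mb, mbox_scanned_mb = 108.61, 662.11

time_ratio = mbox_seconds / zip_seconds          # at least 6x slower
ram_ratio = mbox_ram_mb / zip_ram_mb             # at least 15x more VmData
scanned_ratio = mbox_scanned_mb / zip_scanned_mb # about 6.1x more data scanned

print(f"time x{time_ratio:.1f}, RAM x{ram_ratio:.1f}, scanned x{scanned_ratio:.2f}")
```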

I haven't finished tracking through the code yet to find why it's coded 
like this and hence where this inefficiency comes from.  I accept it was 
probably an efficient way to handle small mails.  But I can already see 
that mbox.c does a lot of loading things into RAM, when it could probably 
use the disk copy and just keep headers and pointers.
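
To illustrate the "use the disk copy" idea, here is a minimal Python sketch of decoding a Base64 attachment body in bounded chunks, so memory use does not grow with the attachment size. This is not how mbox.c works and is not ClamAV code; the function name `decode_base64_stream` and the `min_chunk` parameter are made up for the example.

```python
import base64


def decode_base64_stream(src, dst, min_chunk=64 * 1024):
    """Decode Base64 lines from binary file `src` into binary file `dst`.

    Only about `min_chunk` bytes of encoded text are buffered at a time,
    so memory use stays flat however large the attachment is.
    Returns the number of decoded bytes written.
    """
    pending = b""
    written = 0
    for line in src:
        pending += line.strip()
        if len(pending) >= min_chunk:
            # Base64 decodes in groups of 4 characters; keep any remainder.
            usable = len(pending) - len(pending) % 4
            written += dst.write(base64.b64decode(pending[:usable]))
            pending = pending[usable:]
    if pending:
        # The final group carries any '=' padding.
        written += dst.write(base64.b64decode(pending))
    return written
```

In use, `src` would be the message file on disk positioned at the start of the attachment body, and `dst` a temporary file that is then handed to the scanner.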

BTW - to the person who asked how other AV's handle this - I will get some 
figures and report back.

Nick
___
http://lurker.clamav.net/list/clamav-users.html


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-06 Thread Nick Leverton
On Thursday 06 April 2006 19:17, Karolis Dautartas wrote:
  Agreed, especially since ClamAV is a general virus-scanning tool and
  not specifically for email.

 while sending emails of that size and scanning them for viruses is
 definitely not the best idea, being unable to scan large files on your
 own HDD is not good. It is common to have 256MB RAM on a workstation.
 It is also common to download big CD and DVD ISOs.

 I wonder how other virus scanners perform in such situations.

Now tested against the two others I have access to.
Sophie (daemonised version of Sophos) - requires no extra memory.
Fsavd (F-secure daemon) - requires no extra memory.
Only Clam soaks up RAM byte-for-byte when scanning emails, and as far as I 
can tell it doesn't give any performance benefit for doing so.

Nick


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Rob MacGregor
On 4/5/06, Nick Leverton [EMAIL PROTECTED] wrote:
 Hi,

 We are using Clamav 0.88.1 under amavis on Linux to scan incoming mail.
 However we often receive mails containing a number of attachments, and I
 have found out that clamav appears to hold the entire email in memory
 whilst decoding and scanning the individual attachments.  Thus, for
 instance, if a customer sends us a 200 MB file containing four 35 MB Word
 documents, clam's memory requirements will suddenly swell from 14 MB up to
 300 MB whilst it is scanning.

WTF are you doing accepting emails at 200 MB?  There are far more
appropriate methods of file transfer than SMTP!

 Can you help us reduce the memory needed for large mails, because having to
 allow so much extra memory for these large mails is very wasteful - and if
 we don't then the machine may go down when a couple of Clam threads have
 used all the memory.

Yeah, use a more appropriate file transfer method :-)

Alternatively, configure amavis to not virus scan emails above a
certain size.  Obviously this has some risks.

--
 Please keep list traffic on the list.
Rob MacGregor
  Whoever fights monsters should see to it that in the process he
doesn't become a monster.  Friedrich Nietzsche


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Daniel T. Staal
On Wed, April 5, 2006 1:08 pm, Rob MacGregor said:

 WTF are you doing accepting emails at 200 MB?  There are far more
 appropriate methods of file transfer than SMTP!

But they all require more complicated and lasting setups than SMTP, for a
specific set of senders/receivers.

If you want to send a large file between two people who are likely to
never send each other a file again, SMTP is a quick and easy way to do it.
 It is not the most efficient, but efficiency is not a primary concern in
this case.  Having ClamAV not fall over would be nice.  ;)

Daniel T. Staal

---
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---



RE: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Matthew.van.Eerde
Daniel T. Staal wrote:
 On Wed, April 5, 2006 1:08 pm, Rob MacGregor said:
 
 WTF are you doing accepting emails at 200 MB?  There are far more
 appropriate methods of file transfer than SMTP!
 
 If you want to send a large file between two people who are likely to
 never send each other a file again, SMTP is a quick and easy way to
 do it.

You still have to draw the line somewhere.  If you get ClamAV working, and keep 
accepting larger and larger files, something will break.  Eventually the 
limiting factor will be disk size... if you're lucky, at the email client.  If 
you're unlucky, at an MTA.

That's why I reject at 50MB (sendmail.mc excerpt follows:)
dnl Reject messages bigger than 50 MB
dnl size is specified in bytes
dnl 50 MB * 1024 KB/MB * 1024 B/KB = 52428800 B
define(`confMAX_MESSAGE_SIZE', `52428800')dnl

Where exactly the line is drawn is of little importance, but it's better to 
have a known limit with known consequences (REJECT) than an unknown limit with 
unknown consequences (server crash)

-- 
Matthew.van.Eerde (at) hbinc.com   805.964.4554 x902
Hispanic Business Inc./HireDiversity.com   Software Engineer


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Noel Jones

At 12:25 PM 4/5/2006, Daniel T. Staal wrote:
If you want to send a large file between two people who are likely to
never send each other a file again, SMTP is a quick and easy way to do it.


Just because something is quick and easy doesn't make it a 
good idea.
You might refer your users with large files to 
http://www.yousendit.com or a similar file-transfer service.


--
Noel Jones 




RE: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Daniel T. Staal
On Wed, April 5, 2006 1:34 pm, [EMAIL PROTECTED] said:

 Where exactly the line is drawn is of little importance, but it's better
 to have a known limit with known consequences (REJECT) than an unknown
 limit with unknown consequences (server crash).

Of course.  All I wanted to say was don't dismiss this as irrelevant:
Having ClamAV's resource use being the limiting factor in these situations
doesn't help ClamAV.  The known limit should be able to be set with little
regard for one small component's processing problems.  Being able to
handle large files nicely would be an advantage ClamAV could advertise,
while not being able to is something that needs to get mentioned in
integration/install notes.

From the original email, it appears ClamAV requires more available, real,
RAM than the largest file it will handle.  This would make me think when
installing: how *much* more RAM will it need?  What is the largest size
email I can handle on this machine based on that?  I might want to
reconfigure my email server.  Or I might want to turn off scanning over a
certain size.  Neither sounds like something I want to do while installing
a virus scanner.

Daniel T. Staal




Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread clamav

At 10:50 AM 4/5/2006, Noel Jones wrote:


At 12:25 PM 4/5/2006, Daniel T. Staal wrote:

If you want to send a large file between two people who are likely to
never send each other a file again, SMTP is a quick and easy way to do it.


Just because something is quick and easy doesn't make it a good idea.
You might refer your users with large files to 
http://www.yousendit.com or a similar file-transfer service.


how about everyone get off the soapbox please. he didn't ask whether 
y'all thought it was right or wrong or good or bad to do what 
he's doing, and it really isn't anyone's business but his. He's 
serving a customer - i guess this is just classic, stereotypical 
'customers are idiots, and always wrong'  IT mentality coming through.


if he wants to offer to his customers the ability to send large 
messages via email, then that's between him and his customers, not 
the peanut gallery. if you have something helpful to add, like a 
possible solution to the issue, then by all means, do so.



Paul Theodoropoulos
http://www.anastrophe.com
http://www.smileglobal.com
http://www.forumgarden.com






RE: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Matthew.van.Eerde
Daniel T. Staal wrote:
 On Wed, April 5, 2006 1:34 pm, [EMAIL PROTECTED] said:
 
 Where exactly the line is drawn is of little importance, but it's
 better to have a known limit with known consequences (REJECT) than
 an unknown limit with unknown consequences (server crash).
 
...
 Having ClamAV's resource use being the limiting factor in these
 situations doesn't help ClamAV.

Agreed, especially since ClamAV is a general virus-scanning tool and not 
specifically for email.

I could see the line being drawn at 700MB for a shop that deals heavily in 
.iso's, which would entail lotsa space on the MTAs and high-powered MUAs.  But 
there should still be a line (probably measured in GB, for such a shop)

-- 
Matthew.van.Eerde (at) hbinc.com   805.964.4554 x902
Hispanic Business Inc./HireDiversity.com   Software Engineer


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Noel Jones

At 12:57 PM 4/5/2006, [EMAIL PROTECTED] wrote:

At 10:50 AM 4/5/2006, Noel Jones wrote:


At 12:25 PM 4/5/2006, Daniel T. Staal wrote:
If you want to send a large file between two people who are likely to
never send each other a file again, SMTP is a quick and easy way to do it.


Just because something is quick and easy doesn't make it 
a good idea.
You might refer your users with large files to 
http://www.yousendit.com or a similar file-transfer service.


how about everyone get off the soapbox please. he didn't 
ask whether y'all thought it was right or wrong or 
good or bad to do what he's doing, and it really isn't 
anyone's business but his. He's serving a customer - i 
guess this is just classic, stereotypical 'customers are 
idiots, and always wrong'  IT mentality coming through.


if he wants to offer to his customers the ability to send 
large messages via email, then that's between him and his 
customers, not the peanut gallery. if you have something 
helpful to add, like a possible solution to the issue, 
then by all means, do so.


Thank you for your wise and considered comments.
Apparently you missed that I offered an alternate quick and 
easy solution that doesn't create problems with the mail 
plant.  No soap box here, just pointing out that 
screwdrivers don't make good hammers.


--
Noel Jones 




Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Karolis Dautartas

From the original email, it appears ClamAV requires more available, real,
RAM than the largest file it will handle.  This would make me think when
installing: how *much* more RAM will it need?  What is the largest size
email I can handle on this machine based on that?  I might want to
reconfigure my email server.  Or I might want to turn off scanning over a
certain size.  Neither sounds like something I want to do while installing
a virus scanner.


I find it natural that an email message has to fit in server's memory 
while being scanned for viruses. Email was probably not meant to be used 
for huge file transfers anyway. Let alone scanning it for viruses...


If you want to use email in such an extraordinary way, you are going to 
have to pay for it... the price of RAM :)


Karolis


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Karolis Dautartas



Agreed, especially since ClamAV is a general virus-scanning tool and not 
specifically for email.

while sending emails of that size and scanning them for viruses is 
definitely not the best idea, being unable to scan large files on your 
own HDD is not good. It is common to have 256MB RAM on a workstation.

It is also common to download big CD and DVD ISOs.

I wonder how other virus scanners perform in such situations.

Karolis


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread clamav

At 11:09 AM 4/5/2006, Noel Jones wrote:

Thank you for your wise and considered comments.
Apparently you missed that I offered an alternate quick and easy 
solution that doesn't create problems with the mail plant.  No soap 
box here, just pointing out that screwdrivers don't make good hammers.


Yes, your suggestion was helpful - while also passing judgement, 
needlessly. your message would have been equally helpful had you 
elided the first sentence.



Paul Theodoropoulos
http://www.anastrophe.com
http://www.smileglobal.com
http://www.forumgarden.com






Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Rob MacGregor
On 4/5/06, Daniel T. Staal [EMAIL PROTECTED] wrote:
 From the original email, it appears ClamAV requires more available, real,
 RAM than the largest file it will handle.

Not at all - the original documents will have been most likely base64
encoded (maybe uuencode, but I'd be surprised), which results in a
significantly larger file.  The ratio was in the region of what I'd
expect for a program scanning the file in memory.
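
As a rough check on the size increase from Base64, here is a small Python sketch. It assumes standard MIME Base64: 4 output characters per 3 input bytes, wrapped at 76 characters per line with a 2-byte CRLF. The function name `base64_encoded_size` is made up for the example.

```python
import math


def base64_encoded_size(raw_bytes, line_length=76, eol_bytes=2):
    """Estimate the size of `raw_bytes` of data after MIME Base64 encoding."""
    chars = 4 * math.ceil(raw_bytes / 3)       # 3 bytes -> 4 characters
    lines = math.ceil(chars / line_length)     # wrapped at 76 characters
    return chars + lines * eol_bytes           # plus CRLF on every line


MB = 1024 * 1024
four_docs = 4 * base64_encoded_size(35 * MB)
print(f"{four_docs / MB:.1f} MB encoded, ratio {four_docs / (4 * 35 * MB):.3f}")
```

Under those assumptions the encoded body is about 37% larger than the raw data, so four 35 MB documents come to a little over 190 MB encoded, which is in line with the 200 MB message described in the original post.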

And of course, ClamAV knows nothing about any difference between RAM
and swap - that's down to the OS.  You'll get better performance if
it's all RAM, but it'll still work if it's swap.

--
 Please keep list traffic on the list.
Rob MacGregor
  Whoever fights monsters should see to it that in the process he
doesn't become a monster.  Friedrich Nietzsche


Re: [Clamav-users] Scanning large mails occupies very large memory

2006-04-05 Thread Dennis Peterson
 
 
  Agreed, especially since ClamAV is a general virus-scanning tool and not 
  specifically for email.
  
 while sending emails of that size and scanning them for viruses is 
 definitely not the best idea, being unable to scan large files on your 
 own HDD is not good. It is common to have 256MB RAM on a workstation.
 It is also common to download big CD and DVD ISOs.
 
 I wonder how other virus scanners perform in such situations.
 
 Karolis

My milter puts the attachment on disk and passes the path to clamd. The milter 
tidies up after the scan.
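
A minimal Python sketch of the "pass the path to clamd" step, for anyone who wants to try it. It is not the milter described above. It assumes clamd is listening on a local Unix socket and uses clamd's plain `SCAN <path>` command; the socket path is only a guess at a typical default and must match LocalSocket in your clamd.conf. The helper names are made up, and the reply parsing assumes the path itself does not contain ": ".

```python
import socket


def parse_clamd_reply(reply):
    """Split a clamd reply such as '/tmp/x: OK' into (path, status, detail)."""
    text = reply.strip().rstrip("\0")
    path, _, result = text.partition(": ")
    if result == "OK":
        return path, "OK", None
    if result.endswith(" FOUND"):
        # detail is the signature name
        return path, "FOUND", result[: -len(" FOUND")]
    return path, "ERROR", result


def scan_path(path, socket_path="/var/run/clamav/clamd.ctl"):
    """Ask a local clamd to scan a file already on disk (socket path assumed)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(socket_path)
        s.sendall(f"SCAN {path}\n".encode())
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_clamd_reply(b"".join(chunks).decode(errors="replace"))
```

The file has to be readable by the user clamd runs as, and the caller is responsible for deleting it after the scan.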

dp