Hence we come full circle back to the cause of the original post on this thread. With the default thread count of 1, one or more outbound emails to a badly behaved target mail server (e.g. outblaze...) can cause thousands of undelivered outbound emails to build up in the spool for days while that one thread is stuck trying to send a couple of emails.
One improvement is to increase the number of threads so one blocked thread won't shut the system down. Another improvement is to reduce the timeout from 10 minutes to ~2 minutes.

But frankly, I wish James would simply try one server and then put the email at the end of the line to wait for the next retry, instead of trying every possible IP before moving on. I don't know if this is possible, but I think it would be the best way to keep outbound mail from backing up.

Or better yet, if James encounters a timeout and heads down the path of trying a bunch of IP addresses, have one delivery thread dedicated as the 'slow driver lane' thread. Once a timeout occurs on any of the normal delivery threads, queue that delivery to the slow 'retry' thread and move on to the next email. This way, only the emails that are attempting retries get penalized during the retry process.

Jerry

-----Original Message-----
From: Stefano Bagnara [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 21, 2006 9:22 AM
To: James Users List
Subject: Re: HELP!! Thousands of files stuck in spool at 'transport' state

JWM wrote:
> Noel,
>
>>> Are you seeing a 10 minute figure? That would imply that the target
>>> server accepted the connection, but is not doing I/O on it.
>
> This is a log snippet from an earlier post in this thread. Precisely 10
> minute lockups....
>
> 770: 16/06/06 15:20:48 INFO James.Mailet: RemoteDelivery: Attempting
> delivery of Mail1150406666744-21082-to-mail.com to host
> mail-com.mr.outblaze.com. at 208.36.123.68 to
> [EMAIL PROTECTED]

Well, it is the 10 minutes timeout. If you telnet to port 25 of the 64.71.166.194 server you can see that the connection is established but the 220 welcome message is never sent by the remote server.

mail.com has 2 MX servers. The main MX server has 6 IPs in a multihomed configuration. The second (backup) MX server has the same 6 IPs.
This means that *currently*, at each attempt, James runs 12 connections, each one taking 10 minutes before timing out. So it takes 2 hours to mark a *TEMP* error attempt for a single mail. This happens very often (I have experienced this in the past, too), so I think we should at least provide options to avoid it.

Decreasing timeouts is one thing, but you understand that it does not make sense to lose hours testing 2 times 6 multihomed servers, and to do this 15 times by default. With our defaults that single mail would result in 6*2*15 total attempts (180 attempts), each one keeping a thread busy for 10 minutes (1800 minutes, more than a whole thread day). 1800 thread minutes for a single mail as the worst-case default is not acceptable to me.

Stefano

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
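As a quick check of the worst-case arithmetic in Stefano's message, here is a minimal Java sketch. The class and method names are made up for illustration and are not part of James. The inputs (6 IPs per MX, 2 MX hosts, 15 retries, 10-minute timeout) come from the thread, and the 2-minute figure is the reduced timeout Jerry suggested.

```java
public class WorstCaseDelivery {
    // Total connection attempts: every IP of every MX host, on every retry.
    public static int totalAttempts(int ipsPerMx, int mxHosts, int retries) {
        return ipsPerMx * mxHosts * retries;
    }

    // Minutes a delivery thread stays blocked if every attempt times out.
    public static long blockedMinutes(int attempts, int timeoutMinutes) {
        return (long) attempts * timeoutMinutes;
    }

    public static void main(String[] args) {
        int attempts = totalAttempts(6, 2, 15);
        System.out.println(attempts + " attempts");
        System.out.println(blockedMinutes(attempts, 10)
                + " thread-minutes at a 10-minute timeout");
        System.out.println(blockedMinutes(attempts, 2)
                + " thread-minutes at a 2-minute timeout");
    }
}
```

Even with the shorter timeout, the same mail could still hold a thread for 360 minutes in the worst case, which is why cutting the timeout alone does not remove the problem.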

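Jerry's 'slow driver lane' idea can be sketched with plain java.util.concurrent executors. This is only an illustration under stated assumptions, not how James's RemoteDelivery works: SlowLaneDispatcher and its Attempt hook are hypothetical names, and a timeout is assumed to surface as a SocketTimeoutException. Note that the first attempt still blocks a normal thread for one timeout; only the follow-up attempts move to the slow lane.

```java
import java.net.SocketTimeoutException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SlowLaneDispatcher {
    /** Hypothetical hook: one delivery attempt for a mail. */
    public interface Attempt {
        void deliver(String mailId) throws SocketTimeoutException;
    }

    private final ExecutorService fastLane;
    // One dedicated thread absorbs all deliveries that already timed out once.
    private final ExecutorService slowLane = Executors.newSingleThreadExecutor();
    private final Attempt firstTry;
    private final Attempt fullRetry;
    public final List<String> delivered = new CopyOnWriteArrayList<>();
    public final List<String> demoted = new CopyOnWriteArrayList<>();

    public SlowLaneDispatcher(int fastThreads, Attempt firstTry, Attempt fullRetry) {
        this.fastLane = Executors.newFixedThreadPool(fastThreads);
        this.firstTry = firstTry;
        this.fullRetry = fullRetry;
    }

    public void submit(String mailId) {
        fastLane.execute(() -> {
            try {
                firstTry.deliver(mailId);
                delivered.add(mailId);
            } catch (SocketTimeoutException e) {
                // Hand the mail to the slow lane and free this thread.
                demoted.add(mailId);
                slowLane.execute(() -> retrySlowly(mailId));
            }
        });
    }

    private void retrySlowly(String mailId) {
        try {
            fullRetry.deliver(mailId);
            delivered.add(mailId);
        } catch (SocketTimeoutException e) {
            // A real server would reschedule the mail; this sketch drops it.
        }
    }

    public void shutdownAndWait() {
        try {
            // Drain the fast lane first so every demotion has been queued.
            fastLane.shutdown();
            fastLane.awaitTermination(10, TimeUnit.SECONDS);
            slowLane.shutdown();
            slowLane.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SlowLaneDispatcher d = new SlowLaneDispatcher(2,
                id -> { if (id.startsWith("bad")) throw new SocketTimeoutException("no 220 greeting"); },
                id -> { /* pretend the slow retry eventually succeeds */ });
        d.submit("ok-1");
        d.submit("bad-1");
        d.submit("ok-2");
        d.shutdownAndWait();
        System.out.println("delivered=" + d.delivered.size() + " demoted=" + d.demoted);
    }
}
```

The design choice here is isolation: a hung target server can occupy the single slow thread for hours, but the normal threads keep draining the rest of the spool.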