I think you are not alone :) In the last five days I have found several messages in the connection log with the same problem my server stops responding but after a while it comes back and if I run a netstat when I am receiving the error I really dont have more than 20 ESTABLISED connections.
This is interesting problem because in our test server running the same version of James and java a manage to deliver more than 20.000 messages with smtp-source and the connections works just fine if I go over it complains and releases the connections after use. I will put my connection log in debug mode to see if I get something []s On Tue, 2004-02-17 at 11:35, Michael Nestler wrote: > OK, James stopped again last night at about 1:50 am. I did not have the > time yet to run postal or smtp-source. But this time I have a log > showing smtp sockets with a 10 minute resolution. Since the last stop I > increased the maximum number of connections to 120 (thread number to 150). > > At 1:50 James reported in connections*.log: > > 17/02/04 01:50:00 WARN connections: Maximum number of open connections > exceeded - refusing connection. Current number of connections is 120 > > repeatedly until 8:29 am when I restarted James. My smtpconn.log (cron > job "netstat | grep smtp") shows for the same time: > > Tue Feb 17 01:50:00 EST 2004 > tcp 0 0 email1.whoglue.net:smtp DSL217-1:fyre-messanger > SYN_RECV > tcp 0 0 192.168.168.5:smtp vm211.vm:cruise-swroute > ESTABLISHED > tcp 1 0 192.168.168.5:smtp 220.127.103.78:idmaps > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp mta.email.contine:44071 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp custd530.innerhost:4074 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp pro109.profontis.:59902 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp smtp01.1800contact:4220 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp mta.email.contine:42345 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp mta.email.contine:25694 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp stdout-mv-01.inbo:44652 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp 128.121.26.2:31574 > ESTABLISHED > tcp 1 0 192.168.168.5:smtp ool-435798e8.dy:evb-elm > CLOSE_WAIT > tcp 1 0 192.168.168.5:smtp lsanca1-ar5:notify_srvr > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp mta113.cheetahmai:52853 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp 66.42.142.98:49221 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp server611.colonize:3926 > ESTABLISHED > tcp 1 0 192.168.168.5:smtp 211-56-243-3.inter:linx > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp wchicon1.qvc.com:4611 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp vm125.vmadmin.com:8381 > ESTABLISHED > tcp 7 0 192.168.168.5:smtp 66.28.60.195:58491 > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp imo-r10.mx.aol.co:61761 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp bonaire.server1.o:52001 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp bonaire.server1.o:52272 > ESTABLISHED > tcp 1 0 192.168.168.5:smtp 202.Red-213-97-41:pipes > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp n25.grp.scd.yahoo:47649 > ESTABLISHED > tcp 1 0 192.168.168.5:smtp 218.98.97.183:4612 > CLOSE_WAIT > tcp 1 0 192.168.168.5:smtp mail.contactel.cz:43724 > CLOSE_WAIT > tcp 0 0 192.168.168.5:smtp mail.contactel.cz:43794 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp mail.contactel.cz:43801 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp 212.143.236.10:hylafax > ESTABLISHED > tcp 0 0 192.168.168.5:smtp cust54-224.netcabo:4416 > ESTABLISHED > tcp 0 0 192.168.168.5:smtp ana216.anotherday:40710 > ESTABLISHED > tcp 1 0 192.168.168.5:smtp 218.232.228.130:vsixml > CLOSE_WAIT > > > And at the next time the netstat snapshot was made: > > Tue Feb 17 02:00:00 EST 2004 > tcp 0 0 192.168.168.5:smtp mta.email.contine:13622 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp imo-m23.mx.aol.co:63699 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp modemcable:conclave-cpp > TIME_WAIT > tcp 0 0 192.168.168.5:smtp 219.144.194.18:43349 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp 219.144.194.18:43281 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp c-24-7-95-19.clien:4669 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp web60804.mail.yah:30519 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp web60804.mail.yah:30518 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp relay.quickhire.c:34089 > TIME_WAIT > tcp 0 0 192.168.168.5:smtp ACC1B832.:ns-cfg-server > TIME_WAIT > > > From there on the smtp sockets log file basically shows only TIME_WAIT > sockets (there are a few SYN_RECV) until I restarted James at 8:29 am. > Never netstat shows 120 open smtp connections (what James seems to > believe). As a strange aside the default*.log contains a database > connection exception (it contained the same exception at the last time > James stopped 6 days ago). I don't know if this means anything - could > this exception have caused the problem? spoolmanager*.log and > mailet*.log stopped recording at 1:32 am. > > To summarize the events from my point of view: > o James was running for about 6 days without any problem, delivering > about 5,000-10,000 emails a day > o some event at a few minutes before 1:50 am stopped James from > returning used smtp connection socket objects to the pool starting at > that time > o as a consequence new connections were not accepted and the service > stopped effectively at 1:32 am > o it never recovered from that situation until I relieved it by > restarting phoenix at 8:29 am > > From my point of view it seems to be likely that the mentioned > exception in default*.log caused the problem, since it happened so close > to the time James stopped: > > 17/02/04 01:33:11 INFO database-connections.maildb: ***** connection 1 > is way too old: 63408 > 60000 > 17/02/04 01:33:11 INFO database-connections.maildb: java.lang.Throwable > at > org.apache.james.util.mordred.PoolConnEntry.lock(PoolConnEntry.java:135) > at > org.apache.james.util.mordred.JdbcDataSource.getConnection(JdbcDataSource.java:178) > > at > org.apache.james.mailrepository.JDBCMailRepository.store(JDBCMailRepository.java:484) > > > at > org.apache.james.mailrepository.JDBCSpoolRepository.store(JDBCSpoolRepository.java:243) > > > at > org.apache.james.transport.mailets.RemoteDelivery.service(RemoteDelivery.java:649) > > at > org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:413) > > at > org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:436) > > at > org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:366) > > at > org.apache.avalon.excalibur.thread.impl.ExecutableRunnable.execute(ExecutableRunnable.java:47) > > > at > org.apache.avalon.excalibur.thread.impl.WorkerThread.run(WorkerThread.java:80) > > > On the other hand it could be just a side effect. I attached the > mentioned log files since midnight to this email in zipped format. > > I really hope somebody on this list can make any sense out of this and > knows what's going on here and how I can fix it. Let me know if I should > run postal or smtp-source first and I will do it. > > All help is greatly appreciated. > > Best regards, > Michael > > P.S. I was not able to attach a ZIP file to this email. mail.apache.org > bounced it saying "552 we don't accept email with executable content > (#5.3.4)" ? I try it again as TGZ. > > > > Noel J. Bergman wrote: > > >Run netstat and check to see if they are real. I have tested the connection > >handling with linux kernel 2.2.22, 2.4.20-18 and 2.4.20-24 without problem, > >amongst other versions. > > > >What version of James? > > > > --- Noel > > > > > >--------------------------------------------------------------------- > >To unsubscribe, e-mail: [EMAIL PROTECTED] > >For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > > > > ______________________________________________________________________ > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED]
