Thanks Bernd. Enjoy your holidays and please help whenever you get time... :)
Did you get a chance to take a look at my latest email? Actually, the server is not dead. It's refusing connections. I have put the relevant details in the email. Please take a look.

Thanks
Mahesh

On Wed, Apr 1, 2015 at 12:48 AM, Bernd Waibel <bwai...@intarsys.de> wrote:

> Hi Mahesh
>
> I am currently on holiday, so I could not check on a Linux system.
>
> "chkconfig --add" will add scripts for startup AND shutdown, with a
> defined order and in the defined runlevels.
> Not having this means you have to start and stop the service by hand.
>
> And the process may just be killed when rebooting. This MAY result in
> nothing being logged on shutdown.
> If you reboot the server, the log may just end and the process will die.
> It will not be started again.
>
> That sounds just like your description, doesn't it?
>
> Greetings
> Bernd
>
>
> -------- Original Message --------
> From: Mahesh Sivarama Pillai <srm...@gmail.com>
> Date: 31.03.2015 07:48 (GMT+01:00)
> To: James Users List <server-user@james.apache.org>
> Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>
> Hi Bernd,
>
> Our sysadmin has NOT performed the following steps while configuring
> James as a service.
>
> 1. Adding the lines below to phoenix.sh
>
> #chkconfig: 2345 80 05
> #description: James Mail Server
>
> 2. Running the chkconfig command
>
> chkconfig --add james
>
>
> They only created the link in /etc/init.d pointing to phoenix.sh. We can
> start and stop the service using the service command. Do you think not
> doing the above two steps will impact a running James in any manner? I am
> trying to understand the runlevels as well.
>
> Thanks
> Mahesh
>
>
>
> On Mon, Mar 30, 2015 at 5:28 PM, Mahesh Sivarama Pillai <srm...@gmail.com>
> wrote:
>
> > If there is a clean shutdown through RemoteManager, it should be shown in
> > the log, right? The thing is, I don't see any entry in the console log
> > which says STOPPED. I am investigating and will keep you posted.
> > Thanks for the help so far.
> >
> > Thanks
> > Mahesh
> >
> > On Mon, Mar 30, 2015 at 2:48 AM, Bernd Waibel <bwai...@intarsys.de>
> > wrote:
> >
> >> Hi Mahesh
> >>
> >> Finding a hserr would be a clear sign that something happened outside
> >> the VM.
> >> E.g. if you load a DLL or lib inside your Java code and the DLL produces
> >> a memory fault, then the VM may crash.
> >> If a hserr is produced, the VM has crashed without writing a log or
> >> anything else. The log just ends.
> >> Not finding a hserr means you need to look for something else.
> >> So I think it is not a crash.
> >>
> >> Another idea:
> >> In config.xml you can configure a RemoteManager port and user.
> >> I am currently on holiday, so I could not look up the syntax.
> >> You could telnet to that port and send a shutdown command.
> >> Could something simple like that have happened?
> >>
> >> And about chkconfig:
> >> We had a system with James configured to run only in the runlevel
> >> "with GUI" (I think it was 5 or 6).
> >> Then a sysadmin switched the system to run "without GUI".
> >> So the switch to another runlevel just stopped James, with a clean
> >> shutdown.
> >> After that we looked carefully at the runlevels.
> >> James needs to start after the network, and after the database if one
> >> is used.
> >> And it should also stop this way.
> >>
> >> Greetings Bernd
> >>
> >>
> >> -------- Original Message --------
> >> From: Mahesh Sivarama Pillai <srm...@gmail.com>
> >> Date: 29.03.2015 07:58 (GMT+01:00)
> >> To: James Users List <server-user@james.apache.org>
> >> Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
> >>
> >> Thanks again Bernd. I couldn't find the hserr files under the temp or
> >> James directories. Considering we faced the "Too Many Open Files" issue,
> >> could that prevent the JVM from creating this file? I am clueless on this
> >> issue. No process killed James, no one stopped James. No OOM in the logs.
> >> No core dump :) :(
> >>
> >> Regarding the file system, I will verify. As far as I know we have a NAS.
> >>
> >> On Sat, Mar 28, 2015 at 3:50 AM, Bernd Waibel <bwai...@intarsys.de>
> >> wrote:
> >>
> >> > Hi Mahesh,
> >> >
> >> > Don't misunderstand: running out of file handles COULD lead to a memory
> >> > leak, consuming memory over time. But it does not HAVE to.
> >> >
> >> > OOMs are normally shown in the log, as far as I know, but we got this
> >> > only for the heap memory.
> >> > OOMs normally happen when the heap memory reaches the limit, and yes,
> >> > we got this in the logs sometimes.
> >> > Every time I got an OOM in the log, I restarted the server, just to be
> >> > sure it keeps running.
> >> > So I do not have long-running servers with a lot of OOM errors. So: no
> >> > experience with that.
> >> >
> >> > But you could also run short of memory for the Java classes (native
> >> > area, method area), and I am not sure if this will show up in the log.
> >> > I never had this with James. I got this when running JIRA long ago, but
> >> > cannot remember the log.
> >> >
> >> > The PID (process ID) is something handled by the Linux system. It is
> >> > outside James, and I think you won't find it in the log.
> >> > But the PID is created on startup (phoenix.sh), and may be logged by
> >> > the shell script somewhere, together with a time stamp.
> >> > But not in the James logs.
> >> >
> >> > If your sysadmins use a monitoring tool (like Nagios or Icinga), they
> >> > may monitor the memory.
> >> > You could also monitor the memory inside the VM using JMX, but this is
> >> > a little hard to set up.
> >> >
> >> > But anyway: the memory may NOT be the problem. So do not spend too much
> >> > time on that.
> >> >
> >> > If you can find a hserr*.pid file, the file will tell the reason for
> >> > "crashing".
> >> >
> >> >
> >> > There is something else I remember, but with other software.
> >> > If the log file is stored on a file server (not a local directory), and
> >> > the file server reboots, you will lose the log.
> >> > We had a Java process which "died" because the file server was
> >> > rebooted at midnight, and the Java process lost all mounted directories.
> >> > After that we made sure that the log directory is always local. And the
> >> > program directory too.
> >> > You may check if your server uses mounted file systems.
> >> >
> >> >
> >> > Greetings
> >> > Bernd
> >> >
> >> > -----Original Message-----
> >> > From: Mahesh Sivarama Pillai [mailto:srm...@gmail.com]
> >> > Sent: Friday, 27 March 2015 15:17
> >> > To: James Users List
> >> > Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
> >> >
> >> > Hi Bernd,
> >> >
> >> > Thanks for the pointers. Let me ask the sysadmin about these details.
> >> > Btw, will this memory leak be shown in the logs? I couldn't find any OOM
> >> > errors in any of the logs. When the issue happened, our team restarted
> >> > the server. It will create a new PID, right? Is there a way we can see
> >> > the old PIDs from the James logs?
> >> >
> >> > Thanks
> >> > Mahesh
> >> >
> >> > On Fri, Mar 27, 2015 at 7:33 PM, Bernd Waibel <bwai...@intarsys.de>
> >> > wrote:
> >> >
> >> > > Hi Mahesh
> >> > >
> >> > > Too many open files may result in a memory leak.
> >> > > Could the sysadmin monitor the memory?
> >> > >
> >> > > It is a Java process. Is there a file called hserr*.pid? That is
> >> > > produced if the VM crashes.
> >> > >
> >> > > Ciao
> >> > > Bernd
> >> > >
> >> > >
> >> > > -------- Original Message --------
> >> > > From: Mahesh Sivarama Pillai <srm...@gmail.com>
> >> > > Date: 27.03.2015 14:18 (GMT+01:00)
> >> > > To: James Users List <server-user@james.apache.org>
> >> > > Subject: URGENT HELP: James 2.3.2 not responding after few days of run
> >> > >
> >> > > Hi,
> >> > >
> >> > > I need urgent help.
> >> > > We have rolled out James 2.3.2 to production
> >> > > for our email processing application. I see James shutting
> >> > > down (no trace in phoenix.console) after a few days of running. It
> >> > > processes around 100K emails a day and sends a good number of
> >> > > notifications through RemoteDelivery.
> >> > >
> >> > > I have checked the logs but couldn't find any reason for this
> >> > > abnormal shutdown. I have seen a couple of "Too Many Open Files" errors
> >> > > in the smtpserver log and the spoolmanager log. But I think those will
> >> > > not bring down the server.
> >> > > Will they? I am not sure if James is being killed by some other Linux
> >> > > process.
> >> > > James is running under a user account (e.g. james) with sudo access to
> >> > > run on port 25. Since I don't have root access, which areas should I
> >> > > look at to figure out what the problem is? If I want to talk to the
> >> > > sysadmin, what information should I ask him/her to gather?
> >> > >
> >> > > James is running on a 4-CPU machine with 8GB RAM. The heap size of
> >> > > James is set to 4GB.
> >> > >
> >> > > I have configured James to run as a service in Linux. I am not sure if
> >> > > our sysadmin ran the chkconfig command. Is there any impact of not
> >> > > running this command? Please provide your inputs as early as
> >> > > possible.
> >> > >
> >> > >
> >> > > Thanks
> >> > > Mahesh
> >> > >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
> >> > For additional commands, e-mail: server-user-h...@james.apache.org
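
On the "Too Many Open Files" errors and the hserr question raised in this thread, here is a minimal shell sketch of what one could ask the sysadmin to gather. It is only an illustration under some assumptions: a Linux host with /proc, and a user allowed to read the /proc entries of the James process. The function names are made up and nothing here ships with James. The hs_err_pid<pid>.log file name is the usual HotSpot default and is not taken from the thread, which refers to the file as hserr*.pid.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; none of this is part of James.

# Count the file descriptors a process currently holds.
# Assumes Linux: /proc/<pid>/fd has one entry per open descriptor.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print the soft "Max open files" limit that applies to the running process.
# In /proc/<pid>/limits the fourth whitespace-separated field of that row
# is the soft limit and the fifth is the hard limit.
fd_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# List HotSpot fatal error logs below a directory.
# Assumption: the JVM uses its default file name, hs_err_pid<pid>.log.
find_hs_err() {
    find "$1" -name 'hs_err_pid*.log' 2>/dev/null
}
```

With the script sourced, one would call `count_fds <pid>` and `fd_limit <pid>` with the PID of the running James JVM, and `find_hs_err <dir>` with the directory James was started from. If the first number keeps climbing toward the second over a few days, that would at least point to a descriptor leak as something to follow up on.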