I will start another thread with a relevant subject.

Thanks
Mahesh

On Wed, Apr 1, 2015 at 12:29 PM, Mahesh Sivarama Pillai <srm...@gmail.com> wrote:
> Thanks Bernd. Enjoy your holidays and please help whenever you get time... :)
>
> Did you get a chance to take a look at my latest email? Actually the
> server is not dead. It's refusing connections. I have put the relevant
> details in the email. Please take a look.
>
> Thanks
> Mahesh
>
> On Wed, Apr 1, 2015 at 12:48 AM, Bernd Waibel <bwai...@intarsys.de> wrote:
>
>> Hi Mahesh
>>
>> I am currently on holiday, so I could not check on a Linux machine.
>>
>> "chkconfig --add" registers the script for startup AND shutdown, with a
>> defined order and in the defined runlevels.
>> Without it, the service has to be started and stopped by hand.
>>
>> And the process may just be killed when rebooting. This MAY result in
>> nothing being logged on shutdown.
>> If you reboot the server, the log may just end and the process will die.
>> It will not be started again.
>>
>> That sounds just like your description, doesn't it?
>>
>> Greetings
>> Bernd
>>
>>
>> -------- Original message --------
>> From: Mahesh Sivarama Pillai <srm...@gmail.com>
>> Date: 31.03.2015 07:48 (GMT+01:00)
>> To: James Users List <server-user@james.apache.org>
>> Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>>
>> Hi Bernd,
>>
>> Our sysadmin has NOT performed the following steps while configuring
>> James as a service.
>>
>> 1. Adding the lines below to phoenix.sh
>>
>> #chkconfig: 2345 80 05
>> #description: James Mail Server
>>
>> 2. Running the chkconfig command
>>
>> chkconfig --add james
>>
>>
>> They only created the link in /etc/init.d pointing to phoenix.sh. We can
>> start and stop the service using the service command. Do you think that
>> skipping the above two steps will affect a running James in any way? I am
>> trying to understand the run levels as well.
>>
>> Thanks
>> Mahesh
>>
>> On Mon, Mar 30, 2015 at 5:28 PM, Mahesh Sivarama Pillai <srm...@gmail.com> wrote:
>>
>> > If there is a clean shutdown through RemoteManager, it should be shown
>> > in the log, right? The thing is, I don't see any entry in the console
>> > log that says STOPPED. I am investigating and will keep you posted.
>> > Thanks for the help so far.
>> >
>> > Thanks
>> > Mahesh
>> >
>> > On Mon, Mar 30, 2015 at 2:48 AM, Bernd Waibel <bwai...@intarsys.de> wrote:
>> >
>> >> Hi Mahesh
>> >>
>> >> Finding an hs_err file would be a clear sign that something happened
>> >> outside the VM.
>> >> E.g. if you load a DLL or native lib inside your Java code and it
>> >> produces a memory fault, then the VM may crash.
>> >> If an hs_err file is produced, the VM has crashed without writing a
>> >> log or anything else. The log just ends.
>> >> Not finding an hs_err file means you need to look for something else.
>> >> So I think it is not a crash.
>> >>
>> >> Another idea:
>> >> In config.xml you can configure a RemoteManager port and user.
>> >> I am currently on holiday, so I could not look up the syntax.
>> >> You could telnet to that port and send a shutdown command.
>> >> Could something simple like that have happened?
>> >>
>> >> And about chkconfig:
>> >> We had a system with James configured to run only in the runlevel
>> >> "with GUI" (I think it was 5).
>> >> And then a sysadmin switched the system to run "without GUI".
>> >> So the switch to another runlevel just stopped James, with a clean
>> >> shutdown.
>> >> After that we checked the runlevels carefully.
>> >> James needs to start after the network, and after the database if one
>> >> is used.
>> >> And it should stop in the corresponding order, before they go down.
>> >>
>> >> Greetings Bernd
>> >>
>> >>
>> >> -------- Original message --------
>> >> From: Mahesh Sivarama Pillai <srm...@gmail.com>
>> >> Date: 29.03.2015 07:58 (GMT+01:00)
>> >> To: James Users List <server-user@james.apache.org>
>> >> Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>> >>
>> >> Thanks again Bernd... I couldn't find any hs_err files under the temp
>> >> or James directories. Considering we hit the "Too many open files"
>> >> issue, could that prevent the JVM from creating this file? I am
>> >> clueless on this issue.
>> >> No process killed James, no one stopped James, no OOM in the logs, no
>> >> core dump :) :(
>> >>
>> >> Regarding the file system, I will verify. As far as I know we have a
>> >> NAS...
>> >>
>> >> On Sat, Mar 28, 2015 at 3:50 AM, Bernd Waibel <bwai...@intarsys.de> wrote:
>> >>
>> >> > Hi Mahesh,
>> >> >
>> >> > Don't misunderstand: running out of file handles COULD lead to a
>> >> > memory leak that consumes memory over time, but it does not HAVE to.
>> >> >
>> >> > OOMs are normally shown in the log, as far as I know, but we only
>> >> > ever saw this for heap memory.
>> >> > OOMs normally happen when the heap reaches its limit, and yes, we
>> >> > sometimes got this in the logs.
>> >> > Every time I got an OOM in the log, I restarted the server, just to
>> >> > be sure it keeps running.
>> >> > So I do not have long-running servers with a lot of OOM errors, and
>> >> > no experience with that.
>> >> >
>> >> > But you could also run short of memory for the Java classes (native
>> >> > area, method area), and I am not sure if this will show up in the
>> >> > log. I never had this with James. I got this when running JIRA long
>> >> > ago, but cannot remember the log.
>> >> >
>> >> > The PID (process ID) is handled by the Linux system, outside James,
>> >> > and I think you won't find it in the log.
>> >> > But the PID is created on startup (phoenix.sh), and the shell
>> >> > script may log it somewhere, together with a time stamp.
>> >> > But not in the James logs.
>> >> >
>> >> > If your sysadmins use a monitoring tool (like Nagios or Icinga),
>> >> > they may be monitoring the memory.
>> >> > You could also monitor the memory inside the VM using JMX, but this
>> >> > is a little bit hard to set up.
>> >> >
>> >> > But anyway: the memory may NOT be the problem, so do not spend too
>> >> > much time on that.
>> >> >
>> >> > If you can find an hs_err_pid*.log file, it will tell you the
>> >> > reason for the "crash".
>> >> >
>> >> >
>> >> > There is something else I remember, but with another piece of
>> >> > software.
>> >> > If the log file is stored on a file server (not a local directory)
>> >> > and the file server reboots, you will lose the log.
>> >> > We had a Java process which "died" because the file server was
>> >> > rebooted at midnight and the Java process lost all mounted
>> >> > directories.
>> >> > After that we made sure that the log directory is always local, and
>> >> > the program directory too.
>> >> > You may want to check whether your server uses mounted file systems.
>> >> >
>> >> >
>> >> > Greetings
>> >> > Bernd
>> >> >
>> >> > -----Original message-----
>> >> > From: Mahesh Sivarama Pillai [mailto:srm...@gmail.com]
>> >> > Sent: Friday, 27 March 2015 15:17
>> >> > To: James Users List
>> >> > Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>> >> >
>> >> > Hi Bernd,
>> >> >
>> >> > Thanks for the pointers. Let me ask the sysadmin about these
>> >> > details. Btw, will this memory leak be shown in the logs? I couldn't
>> >> > find any OOM errors in any of the logs. When the issue happened, our
>> >> > team restarted the server. It will create a new PID, right? Is there
>> >> > a way we can see the old PIDs from the James logs?
>> >> >
>> >> > Thanks
>> >> > Mahesh
>> >> >
>> >> > On Fri, Mar 27, 2015 at 7:33 PM, Bernd Waibel <bwai...@intarsys.de> wrote:
>> >> >
>> >> > > Hi Mahesh
>> >> > >
>> >> > > Too many open files may result in a memory leak.
>> >> > > Could the sysadmin monitor the memory?
>> >> > >
>> >> > > It is a Java process. Is there a file called hs_err_pid*.log?
>> >> > > That is produced if the VM crashes.
>> >> > >
>> >> > > Ciao
>> >> > > Bernd
>> >> > >
>> >> > >
>> >> > > -------- Original message --------
>> >> > > From: Mahesh Sivarama Pillai <srm...@gmail.com>
>> >> > > Date: 27.03.2015 14:18 (GMT+01:00)
>> >> > > To: James Users List <server-user@james.apache.org>
>> >> > > Subject: URGENT HELP: James 2.3.2 not responding after few days of run
>> >> > >
>> >> > > Hi,
>> >> > >
>> >> > > I need urgent help. We have rolled out James 2.3.2 to production
>> >> > > for our email processing application. I see James shutting down
>> >> > > (no trace in phoenix.console) after a few days of running. It
>> >> > > processes around 100K emails a day and sends a good number of
>> >> > > notifications through RemoteDelivery.
>> >> > >
>> >> > > I have checked the logs but couldn't find any reason for this
>> >> > > abnormal shutdown. I have seen a couple of "Too Many Open Files"
>> >> > > errors in the smtpserver log and the spoolmanager log, but I think
>> >> > > those will not bring down the server. Will they? I am not sure
>> >> > > whether James is being killed by some other Linux process.
>> >> > > James is running under a user account (e.g. james) with sudo
>> >> > > access to run on port 25. Since I don't have root access, which
>> >> > > areas should I look at to figure out what the problem is? If I
>> >> > > talk to the sysadmin, what information should I ask him/her to
>> >> > > gather?
>> >> > >
>> >> > > James is running on a 4-CPU machine with 8 GB RAM. The heap size
>> >> > > of James is set to 4 GB.
>> >> > >
>> >> > > I have configured James to run as a service in Linux. I am not
>> >> > > sure whether our sysadmin ran the chkconfig command. Is there any
>> >> > > impact of not running this command? Please provide your inputs as
>> >> > > early as possible.
>> >> > >
>> >> > >
>> >> > > Thanks
>> >> > > Mahesh
>> >> > >
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
>> >> > For additional commands, e-mail: server-user-h...@james.apache.org
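
For anyone following the runlevel question in this thread: the chkconfig header line quoted above ("2345 80 05") is normally read as three fields, namely the runlevels in which the service starts by default, the start priority, and the stop priority. The snippet below is only a rough sketch of that reading. The function name parse_chkconfig_header and the ChkconfigHeader type are made up for illustration, and the regular expression is an assumption about the header layout, not chkconfig's own parser.

```python
import re
from typing import NamedTuple


class ChkconfigHeader(NamedTuple):
    """Hypothetical container for the three fields of a chkconfig header."""
    runlevels: list       # runlevels in which the service starts by default
    start_priority: int   # lower numbers start earlier during boot
    stop_priority: int    # lower numbers stop earlier during shutdown


def parse_chkconfig_header(line: str) -> ChkconfigHeader:
    """Split a '# chkconfig: <runlevels> <start> <stop>' line into its fields.

    Illustrative only: the pattern accepts the header with or without a
    space after '#', which is an assumption made for this sketch.
    """
    match = re.match(
        r"^\s*#\s*chkconfig:\s*(-|[0-6]+)\s+(\d+)\s+(\d+)\s*$", line
    )
    if match is None:
        raise ValueError(f"not a chkconfig header: {line!r}")
    levels, start, stop = match.groups()
    # A '-' in place of the runlevel list means: no default start runlevels.
    runlevels = [] if levels == "-" else [int(ch) for ch in levels]
    return ChkconfigHeader(runlevels, int(start), int(stop))


if __name__ == "__main__":
    header = parse_chkconfig_header("#chkconfig: 2345 80 05")
    print(header)
```

Read this way, the header from the thread asks for James to be started in runlevels 2, 3, 4 and 5 with a late start priority (80) and an early stop priority (05), which fits the advice that James should come up after the network and go down before it.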