Hi Bernd,

Our sysadmin has NOT performed the following steps while configuring James as a service:
1. Adding the lines below to phoenix.sh:

   #chkconfig: 2345 80 05
   #description: James Mail Server

2. Running the chkconfig command:

   chkconfig --add james

They only created a link in /etc/init.d pointing to phoenix.sh. We can start and stop the service using the service command. Do you think skipping the two steps above will affect a running James in any way? I am trying to understand the run levels as well.

Thanks
Mahesh

On Mon, Mar 30, 2015 at 5:28 PM, Mahesh Sivarama Pillai <srm...@gmail.com> wrote:

> If there is a clean shutdown through RemoteManager, it should show up in
> the log, right? The thing is, I don't see any entry in the console log that
> says STOPPED. I am investigating and will keep you posted. Thanks for the
> help so far.
>
> Thanks
> Mahesh
>
> On Mon, Mar 30, 2015 at 2:48 AM, Bernd Waibel <bwai...@intarsys.de> wrote:
>
>> Hi Mahesh
>>
>> Finding an hs_err file would be a clear sign that something happened
>> outside the VM. For example, if you load a DLL or native library from your
>> Java code and it causes a memory fault, the VM may crash.
>> If an hs_err file is produced, the VM has crashed without writing anything
>> to the log. The log just ends.
>> Not finding an hs_err file means you need to look for something else.
>> So I think it is not a crash.
>>
>> Another idea:
>> In config.xml you can configure a RemoteManager port and user.
>> I am currently on holiday, so I cannot look up the syntax.
>> Someone could telnet to that port and send a shutdown command.
>> Could something simple like that have happened?
>>
>> And about chkconfig:
>> We had a system with James configured to run only in the runlevel "with GUI"
>> (I think it was 5 or 6).
>> Then a sysadmin switched the system to run "without GUI".
>> The switch to another runlevel simply stopped James, with a clean
>> shutdown.
>> After that we checked the runlevels carefully.
>> James needs to start after the network, and after the database if one is
>> used. Its stop order should respect the same dependencies.
>>
>> Greetings
>> Bernd
>>
>>
>> -------- Original Message --------
>> From: Mahesh Sivarama Pillai <srm...@gmail.com>
>> Date: 29.03.2015 07:58 (GMT+01:00)
>> To: James Users List <server-user@james.apache.org>
>> Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>>
>> Thanks again Bernd. I couldn't find any hs_err files under the temp or
>> James directories. Given that we hit the "Too many open files" issue, could
>> that have prevented the JVM from creating this file? I am clueless on this
>> issue. No process killed James, no one stopped James, there is no OOM in
>> the logs, and no core dump :) :(
>>
>> Regarding the file system, I will verify. As far as I know we have a NAS.
>>
>> On Sat, Mar 28, 2015 at 3:50 AM, Bernd Waibel <bwai...@intarsys.de> wrote:
>>
>> > Hi Mahesh,
>> >
>> > Don't misunderstand: running out of file handles COULD lead to a memory
>> > leak that consumes memory over time, but it does not HAVE to.
>> >
>> > OOMs are normally shown in the log, as far as I know, but we have only
>> > seen this for heap memory.
>> > OOMs normally happen when the heap reaches its limit, and yes, we have
>> > seen this in the logs sometimes.
>> > Every time I got an OOM in the log, I restarted the server, just to be
>> > sure it keeps running.
>> > So I do not have long-running servers with a lot of OOM errors, and
>> > therefore no experience with that.
>> >
>> > But you could also run short of memory for the Java classes (native
>> > area, method area), and I am not sure whether this shows up in the log.
>> > I never had this with James. I got it when running JIRA long ago, but
>> > cannot remember what the log looked like.
>> >
>> > The PID (process ID) is handled by the Linux system. It is outside
>> > James, and I think you won't find it in the log.
>> > The PID is created on startup (phoenix.sh), and the shell script may log
>> > it somewhere together with a time stamp.
>> > But it is not in the James logs.
>> >
>> > If your sysadmins use a monitoring tool (like Nagios or Icinga), they
>> > may already be monitoring the memory.
>> > You could also monitor the memory inside the VM using JMX, but this is a
>> > little harder to set up.
>> >
>> > But anyway: the memory may NOT be the problem, so do not spend too much
>> > time on that.
>> >
>> > If you can find an hs_err_pid*.log file, it will tell you the reason for
>> > the "crash".
>> >
>> >
>> > There is something else I remember, though with another piece of software.
>> > If the log file is stored on a file server (not in a local directory) and
>> > the file server reboots, you will lose the log.
>> > We had a Java process that "died" because the file server was
>> > rebooted at midnight and the Java process lost all mounted directories.
>> > After that we made sure that the log directory is always local, and the
>> > program directory too.
>> > You may want to check whether your server uses mounted file systems.
>> >
>> >
>> > Greetings
>> > Bernd
>> >
>> > -----Original Message-----
>> > From: Mahesh Sivarama Pillai [mailto:srm...@gmail.com]
>> > Sent: Friday, 27 March 2015 15:17
>> > To: James Users List
>> > Subject: Re: URGENT HELP: James 2.3.2 not responding after few days of run
>> >
>> > Hi Bernd,
>> >
>> > Thanks for the pointers. Let me ask the sysadmin about these details.
>> > By the way, will this memory leak show up in the logs? I couldn't find
>> > any OOM errors in any of the logs. When the issue happened, our team
>> > restarted the server. That creates a new PID, right? Is there a way to
>> > see the old PIDs in the James logs?
>> >
>> > Thanks
>> > Mahesh
>> >
>> > On Fri, Mar 27, 2015 at 7:33 PM, Bernd Waibel <bwai...@intarsys.de> wrote:
>> >
>> > > Hi Mahesh
>> > >
>> > > Too many open files may result in a memory leak.
>> > > Could the sysadmin monitor the memory?
>> > >
>> > > It is a Java process. Is there a file called hs_err_pid*.log? That is
>> > > produced if the VM crashes.
>> > >
>> > > Ciao
>> > > Bernd
>> > >
>> > >
>> > > -------- Original Message --------
>> > > From: Mahesh Sivarama Pillai <srm...@gmail.com>
>> > > Date: 27.03.2015 14:18 (GMT+01:00)
>> > > To: James Users List <server-user@james.apache.org>
>> > > Subject: URGENT HELP: James 2.3.2 not responding after few days of run
>> > >
>> > > Hi,
>> > >
>> > > I need urgent help. We have rolled out James 2.3.2 to production
>> > > for our email processing application. I see James shutting down
>> > > (with no trace in phoenix.console) after a few days of running. It
>> > > processes around 100K emails a day and sends a good number of
>> > > notifications through RemoteDelivery.
>> > >
>> > > I have checked the logs but couldn't find any reason for this
>> > > abnormal shutdown. I have seen a couple of "Too Many Open Files" errors
>> > > in the smtpserver log and the spoolmanager log, but I don't think those
>> > > would bring down the server. Would they? I am not sure whether James is
>> > > being killed by some other Linux process.
>> > > James is running under a user account (e.g. james) with sudo access to
>> > > run on port 25. Since I don't have root access, which areas should I
>> > > look at to figure out what the problem is? If I talk to the sysadmin,
>> > > what information should I ask him/her to gather?
>> > >
>> > > James is running on a 4-CPU machine with 8GB RAM. The heap size of
>> > > James is set to 4GB.
>> > >
>> > > I have configured James to run as a service in Linux. I am not sure
>> > > whether our sysadmin ran the chkconfig command. Is there any impact
>> > > of not running this command? Please provide your input as early as
>> > > possible.
>> > >
>> > >
>> > > Thanks
>> > > Mahesh
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: server-user-unsubscr...@james.apache.org
>> > For additional commands, e-mail: server-user-h...@james.apache.org
>> >
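
For reference, the header in step 1 at the top of this message reads as: run levels 2, 3, 4 and 5, start priority 80, stop priority 05. As far as I understand, chkconfig --add reads those comment tags to create the start/stop links under /etc/rc*.d, while the service command only needs the /etc/init.d link. Below is a minimal sketch for checking whether a script carries both tags. The helper name has_chkconfig_header is made up for illustration, and the sample file is a stand-in for the real phoenix.sh, whose path depends on the install.

```shell
#!/bin/sh
# Hypothetical helper (not part of James or chkconfig):
# returns 0 if the given script has both a "chkconfig:" tag with
# <runlevels> <start priority> <stop priority> and a "description:" tag.
has_chkconfig_header() {
    grep -Eq '^#[[:space:]]*chkconfig:[[:space:]]*[-0-9]+[[:space:]]+[0-9]+[[:space:]]+[0-9]+' "$1" &&
    grep -Eq '^#[[:space:]]*description:' "$1"
}

# Demo on a throwaway file that mimics the top of an init script.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/sh
#chkconfig: 2345 80 05
#description: James Mail Server
EOF

if has_chkconfig_header "$sample"; then
    echo "header present"
else
    echo "header missing"
fi
rm -f "$sample"
```

On the real machine the same function could be pointed at the script behind /etc/init.d/james. If the tags are there, chkconfig --list james should show whether the service has been registered for any run level.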