Hi Lilin,

The permissions on the dumps directory are wrong, and this prevents Ellis, Homer and Homestead-prov from writing core files. We've tracked this at https://github.com/Metaswitch/clearwater-infrastructure/issues/40, and a fix is in the next release.
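To confirm that a node is affected, one option is to check whether the diagnostics directories are writable by users other than root. The following is only a minimal sketch: the root path and the directory names (dumps, tmp) are taken from the error message and `ls -l` output quoted further down this thread, and the helper names `world_writable` and `report` are hypothetical, not part of Clearwater.

```python
import os
import stat

# Assumed location, taken from the error message quoted later in this thread.
DIAGS_ROOT = "/var/clearwater-diags-monitor"


def world_writable(path):
    """Return True if users other than the owner and group may write to path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def report(root=DIAGS_ROOT, names=("dumps", "tmp")):
    """Map each directory name under root to whether it is world-writable."""
    return {name: world_writable(os.path.join(root, name)) for name in names}
```

Calling `report()` on the node returns a dictionary keyed by directory name. With the `drwxr-xr-x` listing quoted further down, both entries would come back as `False`, which matches the permission-denied error seen by the non-root Ellis process.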
In the meantime, can you change the permissions of these two folders (dumps and tmp, under /var/clearwater-diags-monitor) to 777?

Thanks,
Ellie

From: Lilin [mailto:[email protected]]
Sent: 25 February 2015 21:37
To: Eleanor Merry; clearwater
Subject: Re: [Clearwater] Monit unrecognizable in Ellis, Sprout, Homer, Ralf

Hi Ellie,

Thanks for raising the problem on GitHub. I have now manually changed the hostname to 'em' (for the Ellis VM) and re-installed Ellis (by re-running a bash script I wrote to automate the installation steps, including bootstrapping, writing the clearwater config file, fetching the Debian package, generating 1000 numbers, etc.). This time I can see output from 'sudo monit status', but another problem arises:

    Process 'nginx'                     Running
    Process 'mysql'                     Running
    Program 'poll_ellis'                Status failed
    Process 'ellis'                     Does not exist
    Process 'clearwater_diags_monitor'  Running
    System 'em'                         Running

I then checked ellis-0.log. The following error appears repeatedly:

    2015-02-25 21:26:22,779 UTC ERROR utils:434 Can't dump core - is clearwater-diags-monitor installed?
    Traceback (most recent call last):
      File "/usr/share/clearwater/ellis/env/local/lib/python2.7/site-packages/metaswitchcommon-0.1-py2.7.egg/metaswitch/common/utils.py", line 429, in write_core_file
        with open(filename, "a") as stack_file:
    IOError: [Errno 13] Permission denied: '/var/clearwater-diags-monitor/tmp/core.ellis.1424899582'

I then checked the folder that the error message pointed to:

    ubuntu@em:/var/clearwater-diags-monitor$ ls -l
    total 8
    drwxr-xr-x 2 root root 4096 Feb 25 21:16 dumps
    drwxr-xr-x 2 root root 4096 Feb 25 21:16 tmp

How can I resolve this permission-denied error and launch the ellis process successfully?

Thanks,
Lilin

On 24/02/2015 2:22 PM, Eleanor Merry wrote:

Hi Lilin,

Is the hostname of your Ellis node 'Ellis'? The monit configuration that monitors the whole system (called 'ellis') is conflicting with the monit configuration that monitors the Ellis process (also called 'ellis').
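A name clash of this kind can be checked before installing. The sketch below is illustrative only: it assumes that monit names its automatic system check after the hostname (as the error message in this thread suggests), the service names are the ones shown in the 'sudo monit status' output earlier in this thread for an Ellis node, and `hostname_collides` is a hypothetical helper rather than anything shipped with Clearwater or monit.

```python
import socket

# Service names as they appear in the `sudo monit status` output in this thread.
# Other node types (Sprout, Homer, Ralf) will define different names.
ELLIS_NODE_SERVICES = frozenset(
    ["nginx", "mysql", "poll_ellis", "ellis", "clearwater_diags_monitor"]
)


def hostname_collides(hostname, service_names=ELLIS_NODE_SERVICES):
    """Return True if the hostname exactly matches a monit service name."""
    return hostname in service_names


def local_hostname_collides(service_names=ELLIS_NODE_SERVICES):
    """Run the same check against this machine's own hostname."""
    return hostname_collides(socket.gethostname(), service_names)
```

For example, `hostname_collides("ellis")` is true, while `hostname_collides("ellis-1")` is false, so a hostname of the second form avoids the clash. The comparison is an exact, case-sensitive match; whether monit treats names case-insensitively is not something this sketch assumes.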
We're looking into fixing this up (you can track this at https://github.com/Metaswitch/clearwater-infrastructure/issues/139). In the meantime, can you change your system hostname to something like ellis-1? You'll need to edit /etc/hostname and the public_hostname setting in /etc/clearwater/config. Can you then reboot (to ensure the hostname change is picked up), and try the install again?

Thanks,
Ellie

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Lilin
Sent: 20 February 2015 23:00
To: clearwater
Subject: [Clearwater] Monit unrecognizable in Ellis, Sprout, Homer, Ralf

Hello all,

I used the stable repo, "deb http://repo.cw-ngv.com/stable binary/", to manually install the Ellis, Sprout, Homer, and Ralf nodes, but I get the following error when running "sudo monit status":

"'check system' not defined in control file, failed to add automatic configuration (service name ellis is used already) -- please add 'check system <name>' manually"

At the Ellis node: after checking the monit log and confirming that monit isn't running (logs are pasted at the end), I checked the status of the following related services:

    nginx is RUNNING
    clearwater-monit starts/running
    clearwater-diags-monitor is running
    ellis is NOT running
    mysql->ellis->numbers table has 1000 numbers created
    browser displays one line of message "Welcome to nginx!" when directed to ellis_IP
    browser displays one line of message "This web page is not available" when directed to ellis_IP:2812

Please advise what went wrong here. Thanks!
Lilin

The following are the logs from monit (at the Ellis node):

[UTC Feb 20 22:05:22] info : Generated unique Monit id 4c20cc7350274d0adaef4640e7a1d68a and stored to '/var/lib/monit/id'
[UTC Feb 20 22:05:22] info : Starting monit daemon
[UTC Feb 20 22:05:22] info : 'ellis' Monit started
[UTC Feb 20 22:05:22] error : 'clearwater_diags_monitor' process is not running
[UTC Feb 20 22:05:22] info : 'clearwater_diags_monitor' trying to restart
[UTC Feb 20 22:05:22] info : 'clearwater_diags_monitor' start: /etc/init.d/clearwater-diags-monitor
[UTC Feb 20 22:05:23] info : Awakened by the SIGHUP signal
[UTC Feb 20 22:05:23] info : Reinitializing monit - Control file '/etc/monit/monitrc'
[UTC Feb 20 22:05:23] info : Starting monit HTTP server at [localhost:2812]
[UTC Feb 20 22:05:23] info : monit HTTP server started
[UTC Feb 20 22:05:23] info : 'ellis' Monit reloaded
[UTC Feb 20 22:05:26] info : Awakened by the SIGHUP signal
[UTC Feb 20 22:05:26] info : Reinitializing monit - Control file '/etc/monit/monitrc'
[UTC Feb 20 22:05:26] info : Shutting down monit HTTP server
[UTC Feb 20 22:05:27] info : monit HTTP server stopped
[UTC Feb 20 22:05:27] info : Starting monit HTTP server at [localhost:2812]
[UTC Feb 20 22:05:27] info : monit HTTP server started
[UTC Feb 20 22:05:27] info : 'ellis' Monit reloaded
[UTC Feb 20 22:05:27] error : 'nginx' uptime test failed for /var/run/nginx.pid -- current uptime is 1 seconds
[UTC Feb 20 22:05:27] info : 'nginx' exec: /bin/true
[UTC Feb 20 22:05:31] info : Awakened by the SIGHUP signal
[UTC Feb 20 22:05:31] info : Reinitializing monit - Control file '/etc/monit/monitrc'
[UTC Feb 20 22:05:31] info : Shutting down monit HTTP server
[UTC Feb 20 22:05:31] info : monit HTTP server stopped
[UTC Feb 20 22:05:31] error : 'check system' not defined in control file, failed to add automatic configuration (service name ellis is used already) -- please add 'check system <name>' manually
[UTC Feb 20 22:05:31] error : monit daemon died
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/listinfo/clearwater
