It's odd that it's not doing anything - could you run "sudo bash -x /usr/share/clearwater/infrastructure/scripts/create-ellis-nginx-config", which should show every command that script runs, and send me the output?
Thanks,
Rob

________________________________
From: Rashid Mijumbi <[email protected]>
Sent: 19 April 2016 00:17
To: Robert Day (projectclearwater.org)
Subject: Re: [Project Clearwater] Manual Installation No longer Working

Hi Rob,

Many thanks for looking at the logs.

I have run sudo /usr/share/clearwater/infrastructure/scripts/create-ellis-nginx-config and it returns successfully (without any output), but the file /etc/nginx/sites-available/ellis still does not exist. The folder /var/log/ellis is also still empty, even after service ellis stop.

I have followed the installation instructions at http://clearwater.readthedocs.org/en/stable/Manual_Install.html religiously, so I would not say that I am doing anything wrong!

Regards,
Rashid

On 18 April 2016 at 20:33, Robert Day (projectclearwater.org) <[email protected]> wrote:

Hi Rashid,

Chris sent me the Ellis diagnostics you provided, and asked me to take a look. There's no /etc/nginx/sites-available/ellis file, which is odd - can you try running "sudo /usr/share/clearwater/infrastructure/scripts/create-ellis-nginx-config"?
I'm wondering whether the issue here is that:

* nginx isn't configured properly
* so it's not exposing Ellis on port 80
* so our regular health-checks of Ellis are failing
* so we keep automatically restarting the Ellis process to try and fix it
* and also, because no requests are getting through to Ellis, it isn't doing anything, so isn't producing any logs

That would explain the following diagnostics from monit.log:

[UTC Apr 10 06:51:30] info : 'ellis_process' restart: /etc/monit/run_logged
[UTC Apr 10 06:51:41] info : 'ellis_process' process is running with pid 6122
[UTC Apr 10 06:51:41] error : 'poll_ellis' HTTP failed to http://192.168.40.30/ping
[UTC Apr 10 06:51:51] error : 'poll_ellis' HTTP failed to http://192.168.40.30/ping
[UTC Apr 10 06:51:51] info : 'poll_ellis' exec: /etc/init.d/ellis
[UTC Apr 10 06:52:01] error : 'ellis_process' process is not running
[UTC Apr 10 06:52:01] info : 'ellis_process' trying to restart

If that's the issue, "sudo /usr/share/clearwater/infrastructure/scripts/create-ellis-nginx-config" might fix things up (or fail with an informative error message).

Best,
Rob

From: Clearwater [mailto:[email protected]] On Behalf Of Chris Elford (projectclearwater.org)
Sent: 11 April 2016 16:10
To: Rashid Mijumbi <[email protected]>
Cc: [email protected]
Subject: Re: [Project Clearwater] Manual Installation No longer Working

Hi Rashid,

Your file permissions look right to me. It looks like something is going wrong with Ellis before it starts producing any output, so it may be hard to find out what is going wrong.

Can you send me a diagnostics package from your Ellis node? There are instructions at https://github.com/Metaswitch/clearwater-infrastructure/blob/master/clearwater-diags-monitor.md.
The mailing list will filter out large attachments, so I suggest sending it to me directly, but we should keep the rest of our conversation on the mailing list.

The issue with homer/homestead may be related, but I am not sure yet. I think it is best to try to solve one problem at a time.

Yours,
Chris

From: Rashid Mijumbi [mailto:[email protected]]
Sent: 08 April 2016 18:17
To: Chris Elford (projectclearwater.org) <[email protected]>
Cc: [email protected]
Subject: Re: [Project Clearwater] Manual Installation No longer Working

Hi Chris,

Many thanks for your support.

I ran both sudo monit stop -g ellis and sudo service ellis run, and the commands return immediately without any output. They also do not change anything about the functionality of ellis.

However, as you state, the problem might be file permissions. I have seen that /usr/share/clearwater/ellis/env/bin/python exists with permissions rwxr-xr-x and owner ellis. Moreover, /etc/default/ellis does NOT exist, and of course /etc/default exists and is owned by root with permissions rwxr-xr-x.

Do you think this might also be the cause of the homer/homestead installation problems?

Regards,
Rashid

On 7 April 2016 at 16:06, Chris Elford (projectclearwater.org) <[email protected]> wrote:

Hi Rashid,

It looks to me as though Ellis is finishing its installation, but failing to start properly, before it gets a chance to write any logs. We may be able to get some more information by looking at the output when you run monit manually.

* First stop monit from trying to start Ellis automatically: sudo monit stop -g ellis
* Then try to run Ellis: sudo service ellis run

If Ellis doesn't quit after 5 minutes, then you can exit using Ctrl-C. When you are done, remember to run sudo monit start -g ellis to put everything back the way it was.

There are a couple of other things to look at.
In the past, we have had some trouble with file permissions. Does /usr/share/clearwater/ellis/env/bin/python exist, and what are its file permissions and owner? Does /etc/default/ellis exist, and what are its file permissions and owner?

Yours,
Chris

From: Rashid Mijumbi [mailto:[email protected]]
Sent: 06 April 2016 17:09
To: Chris Elford (projectclearwater.org) <[email protected]>
Subject: Re: [Project Clearwater] Manual Installation No longer Working

Hi Chris,

Please see the install stderr + stdout files for the VMs, as well as the output of running monit summary and monit status on ellis. These logs are from a clean/new installation.

If I remove a single package from a node and re-install it, it seems to complete the installation - I attach this case for HS, where I remove and re-install clearwater-prov-tools.

In any case, as you can see, while ellis appears to be running, there are no logs at all in /var/log/ellis, and the ellis URL is not available.

FWIW, I install the nodes in the order: Ellis -> Bono -> Sprout -> Homer -> HS -> Ralf

Best Regards,
Rashid

On 6 April 2016 at 13:54, Chris Elford (projectclearwater.org) <[email protected]> wrote:

Hi Rashid,

Thank you for trying that.

You ran `sudo monit summary` and everything listed was running. What is the output from `sudo monit status`? There may be some missing processes that should be running.

It looks like something went wrong installing the package clearwater-prov-tools on Homestead. Can you please try installing that again, and capture all of the output (stderr and stdout)? You may have to uninstall it first. That will give us more information about where the installation may be failing.

If Ellis is not running, it may be a good idea to do the same thing for Ellis.
Yours,
Chris

From: Rashid Mijumbi [mailto:[email protected]]
Sent: 05 April 2016 14:46
To: Chris Elford (projectclearwater.org) <[email protected]>
Cc: [email protected]
Subject: Re: [Project Clearwater] Manual Installation No longer Working

Hi Chris,

Many thanks for your quick response.

I have to say that my previous shared_config files did not include scscf settings since it was mentioned in the installation guide that these were not mandatory. However, I have now included the (new) settings. I have also included the following lines to my DNS zone files (I am not sure if I need all of them, or if they are correctly created, but I already have similar lines for sprout):

scscf.sprout.$domain. IN A $sprout_ip
scscf-1 IN A $sprout_ip
scscf IN A $sprout_ip
scscf IN NAPTR 1 1 "S" "SIP+D2T" "" _sip._tcp.scscf.sprout
_sip._tcp.scscf.sprout IN SRV 0 0 5054 scscf-1

With these changes, I still run into the same problems as previously: ellis not starting, errors especially for homer, homestead.

Regards,
Rashid

On 5 April 2016 at 11:59, Chris Elford (projectclearwater.org) <[email protected]> wrote:

Hi Rashid,

It looks like there are two separate problems here:

* Homer is failing to install.
* Ellis is not producing any logs (so probably isn't running).

You say that you are using the same shared_config as previously. In release Articuno, we changed the way Sprout is configured, and this may be causing some of your problems. From the release note:

Once you've upgraded, you may also need to change your sproutlet configuration, and add a DNS record for your S-CSCF cluster. Previously most Sproutlets/Application servers (AS) either used the value of the 'scscf_uri' parameter for their configuration or had other hard-coded configuration.
Now you have the finer control over the configuration - each Sproutlet/AS has three configuration options. The options have the same format for each Sproutlet/AS, as listed here, with <sproutlet> replaced by the appropriate Sproutlet or AS name:

* <sproutlet>: The port that the Sproutlet/AS listens on. The default value is 5054 for some Sproutlets/ASs (those enabled by default) and 0 for others (those disabled by default)
* <sproutlet>_prefix: The identifier prefix for this Sproutlet/AS, used to build the uri, as described below. The default value is simply the Sproutlet/AS name: <sproutlet>
* <sproutlet>_uri: The full identifier for this Sproutlet/AS, used for routing and receiving requests between nodes. The default value is created using the prefix and the hostname of the parent Sprout node, i.e. "sip:<sproutlet_prefix>.<sprout_hostname>;transport=tcp". We recommend that you don't set this yourself anymore, and use the defaults provided.

As a concrete example, below are the S-CSCF options and the default values.

* scscf=5054
* scscf_prefix=scscf
* scscf_uri=sip:scscf.<sprout_hostname>;transport=tcp

As we've split out the S-CSCF configuration, you'll also now need to set up DNS records for the S-CSCF cluster specifically (rather than just using the sprout cluster).

A good first step would be to update your shared configuration files and see whether that fixes any of your issues. Once you have done that, we may be able to dig deeper into the other issues.

Yours,
Chris

From: Clearwater [mailto:[email protected]] On Behalf Of Rashid Mijumbi
Sent: 05 April 2016 11:01
To: [email protected]
Subject: [Project Clearwater] Manual Installation No longer Working

Dear all,

I have not been able to successfully do a manual installation the last 3 or so days. I did so (multiple times) successfully in the past.
Here are my details: VMs are running in OpenStack with Ubuntu-14.04-trusty-server-x86_64, 1vCPU, 2GB RAM, 8GB Disk.

It appears that I am getting multiple errors during installation. One of the errors occurs during installation of homer, which repeatedly gives the following error for about 5 minutes:

TCP poll failed to 127.0.0.1 9160
nc: connect to 127.0.0.1 port 9160 (tcp) failed: Connection refused
Failed to connect to '127.0.0.1:7199': Connection refused

Installation of homestead gives the same errors for an even longer period. Ultimately, the installation completes, but I cannot access the ellis URL in a browser.

I would like to believe that my local_config and shared_config files are okay, as I have used the same setup in the past and it worked quite well.

The command "sudo service ellis stop" seems to work, while "sudo service clearwater-infrastructure restart" gives the following output:

* Restarting clearwater-infrastructure clearwater-infrastructure
nginx: [warn] conflicting server name "87.44.18.128" on [::]:80, ignored
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
Configuring monit for only localhost access [ OK ]

"sudo monit status" says that everything is either running or has status ok, while "sudo clearwater-etcdctl cluster-health" shows that all cluster members are healthy.

Is anyone facing a similar problem, or is it just something horribly wrong with my machines?

I am attaching a number of installation error log files as well as files from /var/log. The folder /var/log/ellis is empty!

Regards,
Rashid
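
Editorial note: the monit.log excerpt quoted in Rob's reply of 18 April can be tallied mechanically, which may make a restart loop of the kind he describes easier to spot. The following is only a minimal Python sketch and is not part of Clearwater or monit. The function name tally_monit_events and the regular expression are assumptions, inferred solely from the seven log lines quoted in this thread; real monit logs may be laid out differently.

```python
import re
from collections import Counter

# Pattern inferred from the lines quoted in this thread (an assumption, not
# monit's documented format): "[<timestamp>] <level> : '<service>' <message>"
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>\w+)\s*:\s*"
    r"'(?P<service>[^']+)'\s+(?P<message>.*)$"
)


def tally_monit_events(lines):
    """Count log entries per (service, level); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["service"], match["level"])] += 1
    return counts


# The excerpt from monit.log exactly as quoted in the thread.
SAMPLE = """\
[UTC Apr 10 06:51:30] info : 'ellis_process' restart: /etc/monit/run_logged
[UTC Apr 10 06:51:41] info : 'ellis_process' process is running with pid 6122
[UTC Apr 10 06:51:41] error : 'poll_ellis' HTTP failed to http://192.168.40.30/ping
[UTC Apr 10 06:51:51] error : 'poll_ellis' HTTP failed to http://192.168.40.30/ping
[UTC Apr 10 06:51:51] info : 'poll_ellis' exec: /etc/init.d/ellis
[UTC Apr 10 06:52:01] error : 'ellis_process' process is not running
[UTC Apr 10 06:52:01] info : 'ellis_process' trying to restart
"""

if __name__ == "__main__":
    tally = tally_monit_events(SAMPLE.splitlines())
    for (service, level), count in sorted(tally.items()):
        print(f"{service:15} {level:6} {count}")
```

Repeated poll_ellis errors alongside ellis_process restarts would be consistent with the failing health-check that Rob suspects, although a tally like this does not by itself identify the cause.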
_______________________________________________
Clearwater mailing list
[email protected]
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org
