Hi ilya,

Funny you brought up debugging the router VM. After responding yesterday, I 
did just that and found some odd things. 
Just to be clear (I think we're on the same page), since I'm not the OP of this 
thread, the virtual router always gets deployed and it starts up just fine; 
however, CloudStack reports that it's always stuck in starting. VMs that get 
deployed ultimately fail. CloudStack reports the router version as UNKNOWN.
Before I provide what I found debugging the router VM, I'll address some of 
your points.

### FOLLOW-UP QUESTIONS ###

" Another reason would be an issue of hypervisor accessing the NFS mount used 
for secondary storage."
I don't believe this is an issue. The hypervisor (VMware) does mount the 
secondary storage via NFS just fine. If this were an issue, I would think the 
Secondary Storage and Console VMs would not deploy.

" Use console of vCenter to see what is happening on router vm. You can login 
locally with root/password and see the content of /var/log/cloud.out file, 
paste it on pastebin - if it makes no sense to you..."
It looks to me like /var/log/cloud.out is only written when $CLOUD_DEBUG 
is set to a non-empty value in the /etc/init.d/cloud script. As such, that 
file doesn't even exist. Even when I set that variable, nothing ever gets 
logged to /var/log/cloud.out. However, there is a /var/log/cloud.log. Here 
are its contents: http://pastebin.com/aaTsRKZE
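
To make it quick to see which of the two log files actually exists on a given system VM, something like the following could be run on its console. This is only a sketch: report_logs is a helper name made up for this example, and the two paths are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# Report which candidate log files exist and show the tail of each.
# report_logs is a made-up helper name, not part of CloudStack.
report_logs() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            # File exists and is non-empty: show the most recent lines.
            echo "== $f (last 20 lines) =="
            tail -n 20 "$f"
        elif [ -e "$f" ]; then
            echo "== $f exists but is empty =="
        else
            echo "== $f does not exist =="
        fi
    done
}

report_logs /var/log/cloud.out /var/log/cloud.log
```

Running the same check on the Console and Secondary Storage VMs would give a baseline to compare against.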

" you can also run /etc/init.d/cloud stop and start.. that will give you a 
fresh start on logs.."
The service is in a failed state. It's worth noting that this service is in a 
started state on the Console and Secondary Storage VMs.

" also, confirm that management server can talk to VR on POD IP
(management) on port 3922.."
It appears this is not an issue; see below:

root@r-4-VM:~# telnet 10.70.110.101 8250
Trying 10.70.110.101...
Connected to 10.70.110.101.
Escape character is '^]'.
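
The same kind of check can be scripted so it can be repeated in both directions (router VM to management server on 8250, and management server to router VM on 3922). A minimal sketch, assuming bash and the coreutils timeout command are installed; check_port is a name made up for this example, and the router address is left as a placeholder.

```shell
#!/bin/sh
# check_port is a made-up helper name. It relies on bash's /dev/tcp
# pseudo-device and on the coreutils "timeout" command being available.
check_port() {
    host=$1; port=$2
    # Try to open a TCP connection; give up after 3 seconds.
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open: $host:$port"
    else
        echo "closed or unreachable: $host:$port"
        return 1
    fi
}

# From the router VM, towards the management server:
#   check_port 10.70.110.101 8250
# From the management server, towards the router VM (placeholder address):
#   check_port <router-ip> 3922
```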

### ROUTER VM DEBUG ###

Here is what I found when the router VM gets deployed (please tell me if 
anything seems off):
2 NICs; only one NIC gets an IP address. CloudStack NIC1 shows an IP address 
coming from the defaultGuestNetwork. NIC2 is traffic type Control but has an IP 
address of 0.0.0.0.
From the CloudStack management server, I cannot SSH into the router VM on 
NIC1. I've found this is because of iptables rules on the router VM. If I 
issue a /etc/init.d/iptables-persistent flush on the router VM, I can SSH into 
the router VM using the SSH key at port 3922.
The service "cloud" is in a failed state. Looking at the cloud init script, I 
see the following:

CMDLINE=$(cat /var/cache/cloud/cmdline)

TYPE="router"
for i in $CMDLINE
  do
    # search for foo=bar pattern and cut out foo
    FIRSTPATTERN=$(echo $i | cut -d= -f1)
    case $FIRSTPATTERN in 
      type)
          TYPE=$(echo $i | cut -d= -f2)
      ;;
    esac
done

The file /var/cache/cloud/cmdline exists; here are its contents:

template=domP name=r-4-VM eth0ip=10.70.116.75 eth0mask=255.255.255.0 
gateway=10.70.116.1 domain=vit.vertitechit.com cidrsize=24 
dhcprange=10.70.116.1 eth1ip=0.0.0.0 eth1mask=0.0.0.0 mgmtcidr=10.70.110.0/24 
localgw=10.70.116.1 sshonguest=true type=dhcpsrvr disable_rp_filter=true 
extra_pubnics=2 dns1=10.70.10.21 
baremetalnotificationsecuritykey=nu1HfF_DpC-gK-G_3y1u54Snb9ruROq-qldOvhnHj4EMypguvtfQu0o18eY3gs81iPZMD2Du1QOUAG5KOfMYXQ
 
baremetalnotificationapikey=CKZoOXffpY5ihjvzly3yD_2t2qaDnFglYFDoeep37aH1qy5u67aX51ZsuZpZcphfOxJY52rkTlNOl0nkNSyXjQ
 host=10.70.110.101 port=8080 nic_macs=06:b1:2e:00:00:10|02:00:14:42:00:03

The previous code suggests that the value of TYPE starts as router but will get 
set to dhcpsrvr, as indicated by the contents of /var/cache/cloud/cmdline. Is 
this normal?
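
For reference, the override behaviour described above can be reproduced outside the init script. The following is a minimal sketch, not the script's actual code: cmdline_get is a made-up helper, and the CMDLINE value is a shortened copy of the one shown earlier. It uses shell parameter expansion instead of cut, which also keeps values that themselves contain an '=' intact.

```shell
#!/bin/sh
# Look up one key in a space-separated key=value string, falling back to a
# default when the key is absent. cmdline_get is a made-up name; it is not
# part of the CloudStack init script.
cmdline_get() {
    key=$1; default=$2; line=$3
    value=$default
    for token in $line; do              # unquoted on purpose: split on whitespace
        case $token in
            "$key"=*) value=${token#*=} ;;   # keep everything after the first '='
        esac
    done
    printf '%s\n' "$value"
}

CMDLINE='template=domP name=r-4-VM eth1ip=0.0.0.0 type=dhcpsrvr port=8080'
cmdline_get type router "$CMDLINE"      # prints dhcpsrvr
cmdline_get type router 'name=r-4-VM'   # prints router
cmdline_get eth1ip unset "$CMDLINE"     # prints 0.0.0.0
```

So with this cmdline the default of router is replaced by dhcpsrvr, which matches what the trace of the real script shows.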
Further down the script, I see:

CLOUDSTACK_HOME="/usr/local/cloud"    # <-- Exists
if [ -f  $CLOUDSTACK_HOME/systemvm/utils.sh ];    # <-- Does not exist. Seems odd!
then
  . $CLOUDSTACK_HOME/systemvm/utils.sh
else
  _failure
fi

# mkdir -p /var/log/vmops

start() {
   local pid=$(get_pids)
   if [ "$pid" != "" ]; then
       echo "CloudStack cloud sevice is already running, PID = $pid"
       return 0
   fi

   echo -n "Starting CloudStack cloud service (type=$TYPE) "
   if [ -f $CLOUDSTACK_HOME/systemvm/run.sh ];    # <-- Does not exist. Seems odd!
   then
     if [ "$pid" == "" ]
     then
       (cd $CLOUDSTACK_HOME/systemvm; nohup ./run.sh > $LOG_FILE 2>&1 & )
       pid=$(get_pids)
       echo $pid > /var/run/cloud.pid 
     fi
     _success
   else
     _failure
   fi
   echo
   echo 'start' > $CLOUDSTACK_HOME/systemvm/user_request
}

I see that it sets CLOUDSTACK_HOME to /usr/local/cloud. This folder exists; 
however, the script then looks for the file /usr/local/cloud/systemvm/utils.sh. 
This file doesn't exist. The script is also supposed to start run.sh, but that 
doesn't exist either. This seems like a problem to me.
Here is a step-by-step trace of what happens when I try to start the cloud 
service:

sh -x /etc/init.d/cloud start
+ ENABLED=0
+ [ -e /etc/default/cloud ]
+ . /etc/default/cloud
+ ENABLED=0
+ cat /var/cache/cloud/cmdline
+ CMDLINE= template=domP name=r-4-VM eth0ip=10.70.116.75 eth0mask=255.255.255.0 
gateway=10.70.116.1 domain=vit.vertitechit.com cidrsize=24 
dhcprange=10.70.116.1 eth1ip=0.0.0.0 eth1mask=0.0.0.0 mgmtcidr=10.70.110.0/24 
localgw=10.70.116.1 sshonguest=true type=dhcpsrvr disable_rp_filter=true 
extra_pubnics=2 dns1=10.70.10.21 
baremetalnotificationsecuritykey=nu1HfF_DpC-gK-G_3y1u54Snb9ruROq-qldOvhnHj4EMypguvtfQu0o18eY3gs81iPZMD2Du1QOUAG5KOfMYXQ
 
baremetalnotificationapikey=CKZoOXffpY5ihjvzly3yD_2t2qaDnFglYFDoeep37aH1qy5u67aX51ZsuZpZcphfOxJY52rkTlNOl0nkNSyXjQ
 host=10.70.110.101 port=8080 nic_macs=06:b1:2e:00:00:10|02:00:14:42:00:03
+ [ ! -z ]
+ LOG_FILE=/dev/null
+ TYPE=router
+ cut -d= -f1
+ echo template=domP
+ FIRSTPATTERN=template
+ cut -d= -f1
+ echo name=r-4-VM
+ FIRSTPATTERN=name
+ cut -d= -f1
+ echo eth0ip=10.70.116.75
+ FIRSTPATTERN=eth0ip
+ cut -d= -f1
+ echo eth0mask=255.255.255.0
+ FIRSTPATTERN=eth0mask
+ cut -d= -f1
+ echo gateway=10.70.116.1
+ FIRSTPATTERN=gateway
+ cut -d= -f1
+ echo domain=vit.vertitechit.com
+ FIRSTPATTERN=domain
+ cut -d= -f1
+ echo cidrsize=24
+ FIRSTPATTERN=cidrsize
+ cut -d= -f1
+ echo dhcprange=10.70.116.1
+ FIRSTPATTERN=dhcprange
+ cut -d= -f1
+ echo eth1ip=0.0.0.0
+ FIRSTPATTERN=eth1ip
+ cut -d= -f1
+ echo eth1mask=0.0.0.0
+ FIRSTPATTERN=eth1mask
+ cut -d= -f1
+ echo mgmtcidr=10.70.110.0/24
+ FIRSTPATTERN=mgmtcidr
+ cut -d= -f1
+ echo localgw=10.70.116.1
+ FIRSTPATTERN=localgw
+ cut -d= -f1
+ echo sshonguest=true
+ FIRSTPATTERN=sshonguest
+ cut -d= -f1
+ echo type=dhcpsrvr
+ FIRSTPATTERN=type
+ cut -d= -f2
+ echo type=dhcpsrvr
+ TYPE=dhcpsrvr
+ cut -d= -f1
+ echo disable_rp_filter=true
+ FIRSTPATTERN=disable_rp_filter
+ cut -d= -f1
+ echo extra_pubnics=2
+ FIRSTPATTERN=extra_pubnics
+ cut -d= -f1
+ echo dns1=10.70.10.21
+ FIRSTPATTERN=dns1
+ cut -d= -f1
+ echo 
baremetalnotificationsecuritykey=nu1HfF_DpC-gK-G_3y1u54Snb9ruROq-qldOvhnHj4EMypguvtfQu0o18eY3gs81iPZMD2Du1QOUAG5KOfMYXQ
+ FIRSTPATTERN=baremetalnotificationsecuritykey
+ cut -d= -f1
+ echo 
baremetalnotificationapikey=CKZoOXffpY5ihjvzly3yD_2t2qaDnFglYFDoeep37aH1qy5u67aX51ZsuZpZcphfOxJY52rkTlNOl0nkNSyXjQ
+ FIRSTPATTERN=baremetalnotificationapikey
+ cut -d= -f1
+ echo host=10.70.110.101
+ FIRSTPATTERN=host
+ cut -d= -f1
+ echo port=8080
+ FIRSTPATTERN=port
+ cut -d= -f1
+ echo nic_macs=06:b1:2e:00:00:10|02:00:14:42:00:03
+ FIRSTPATTERN=nic_macs
+ [ -f /etc/init.d/functions ]
+ [ -f ./lib/lsb/init-functions ]
+ RETVAL=0
+ CLOUDSTACK_HOME=/usr/local/cloud
+ [ -f /usr/local/cloud/systemvm/utils.sh ]
+ _failure
+ [ -f /etc/init.d/functions ]
+ echo Failed
Failed
+ [ 0 != 0 ]
+ exit 0
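
The two files the init script tests for can also be checked in one pass, which makes it easy to compare the router VM against a system VM where the service does start. A rough sketch: missing_systemvm_files is a made-up helper, and the list covers only the two paths visible in the script excerpt above, so other files may matter as well.

```shell
#!/bin/sh
# Print the path of every expected file that is missing under the given
# CloudStack home. missing_systemvm_files is a made-up helper name.
missing_systemvm_files() {
    home=$1
    for f in systemvm/utils.sh systemvm/run.sh; do
        [ -f "$home/$f" ] || echo "$home/$f"
    done
}

missing=$(missing_systemvm_files /usr/local/cloud)
if [ -n "$missing" ]; then
    echo "Missing files:"
    echo "$missing"
else
    echo "All expected files are present."
fi
```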

Thoughts?

Jacob Seeley
Sr. Infrastructure Engineer
VertitechIT
413-268-1631

www.vertitechit.com

-----Original Message-----
From: ilya [mailto:ilya.mailing.li...@gmail.com] 
Sent: Wednesday, July 27, 2016 8:43 PM
To: users@cloudstack.apache.org
Subject: Re: CS 4.8 VMware - Virtual Router stuck at starting

Hi Jacob

I gave this a second read - if your issue is Router VM in starting mode
- but not started - it means cloudstack agent on routerVM cannot talk to 
management server on 8250 over POD network.

Another reason would be an issue of hypervisor accessing the NFS mount used for 
secondary storage.

Use console of vCenter to see what is happening on router vm. You can login 
locally with root/password and see the content of /var/log/cloud.out file, 
paste it on pastebin - if it makes no sense to you...

you can also run /etc/init.d/cloud stop and start.. that will give you a fresh 
start on logs..

also, confirm that management server can talk to VR on POD IP
(management) on port 3922..

Regards
ilya

On 7/27/16 9:34 AM, Jacob Seeley wrote:
> ilya,
> 
> Here are the contents of the secondary storage:
> 
> .
> ./template
> ./template/tmpl
> ./template/tmpl/1
> ./template/tmpl/1/8
> ./template/tmpl/1/8/49a4c4ee-ef06-4474-92c3-1d8efb082266.ova
> ./template/tmpl/1/8/template.properties
> ./template/tmpl/1/8/systemvm64template-4.6.0-RC20151104T1522-4.6.0-vmware.ovf
> ./template/tmpl/1/8/systemvm64template-4.6.0-RC20151104T1522-4.6.0-vmware-disk3.vmdk
> ./template/tmpl/1/7
> ./template/tmpl/1/7/template.properties
> ./template/tmpl/1/7/0098d168-4985-3b33-9840-eb5848d2f385.ova
> ./template/tmpl/1/7/CentOS5.3-x86_64.ovf
> ./template/tmpl/1/7/CentOS5.3-x86_64-disk1.vmdk
> ./template/tmpl/1/7/CentOS5.3-x86_64.mf
> ./systemvm
> ./systemvm/systemvm-4.8.0.1.iso
> ./systemvm/.lck-bf162a0100000000
> ./snapshots
> ./volumes
> 
> I've noticed that both the Secondary Storage VM and Console Proxy VM mount 
> this ISO and as stated before, they come up just fine.
> 
> Regards,
> 
> Jacob Seeley
> Sr. Infrastructure Engineer
> VertitechIT
> 413-268-1631
> 
> www.vertitechit.com
> 
> -----Original Message-----
> From: ilya [mailto:ilya.mailing.li...@gmail.com]
> Sent: Wednesday, July 27, 2016 3:22 AM
> To: users@cloudstack.apache.org
> Subject: Re: CS 4.8 VMware - Virtual Router stuck at starting
> 
> Jacob
> 
> The upgrade usually occurs through systemvm.iso - that is generated by 
> cloudstack on the first start.
> 
> Please show the content of your secondary store specifically
> 
> /mnt/[secondary-storage]/systemvm
> 
> Regards
> ilya
> 
> On 7/25/16 11:19 AM, Jacob Seeley wrote:
>> Here is a pastebin snippet the management-server.log - 
>> http://pastebin.com/GCLm53Gz
>>
>> Hopefully the relevant data is in there.
>>
>> I made sure to start from scratch for this example. Everything from the 
>> vSphere ESXi to the vCenter to the CentOS 7 with CloudStack install is 
>> fresh. I deployed a new instance in CloudStack, a VM internally named 
>> i-2-3-VM with an IP address of 192.168.0.78. This prompted CloudStack to 
>> deploy a VR. The VR is called r-4-VM with an IP address of 192.168.0.79.
>>
>> Thank you,
>>
>> Jacob Seeley
>> Sr. Infrastructure Engineer
>> VertitechIT
>> 413-268-1631
>>
>> www.vertitechit.com
>>
>> -----Original Message-----
>> From: Suresh Sadhu [mailto:suresh.sa...@accelerite.com]
>> Sent: Monday, July 25, 2016 1:37 AM
>> To: users@cloudstack.apache.org
>> Subject: Re: CS 4.8 VMware - Virtual Router stuck at starting
>>
>> please upload the logs in the issue.
>>> On Jul 5, 2016, at 8:46 AM, Darren Tang <darrentang...@gmail.com> wrote:
>>>
>>> https://issues.apache.org/jira/browse/CLOUDSTACK-9144
>>>
>>> 2016-07-04 19:41 GMT+08:00 Glenn Wagner <glenn.wag...@shapeblue.com>:
>>>
>>>> Hi,
>>>>
>>>> What template are you using to start your first VM? - the default 
>>>> vmware template?
>>>> If you look in vcenter , what does the console show you ?
>>>>
>>>>
>>>> Glenn
>>>>
>>>>
>>>>
>>>> glenn.wag...@shapeblue.com
>>>> www.shapeblue.com
>>>> 2nd Floor, Oudehuis Centre, 122 Main Rd, Somerset West, Cape Town 
>>>> 7130South Africa @shapeblue
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Pascal R. [mailto:repa...@gmail.com]
>>>> Sent: Monday, 04 July 2016 1:26 PM
>>>> To: users@cloudstack.apache.org
>>>> Subject: CS 4.8 VMware - Virtual Router stuck at starting
>>>>
>>>> hi,
>>>>
>>>> we have a CS4.8 deployment with VMWare 5.5.
>>>>
>>>> When trying to launch the first VM, the VS is created. VS starts 
>>>> up, but in CS, it is stuck in the "starting" state.
>>>>
>>>> i can't find any useful information in the logs.
>>>>
>>>> any hint?
>>>>
