Hi Rohit,
I completed the test. The results are as follows:

Deleting ssh.publickey and ssh.privatekey from the DB caused the keys to be 
recreated, but not in the /var/lib/cloudstack/management/.ssh folder. They were 
recreated in the /var/cloudstack/management/.ssh folder instead.

Here’s the log entry:

2017-05-28 20:53:28,508 DEBUG [c.c.u.s.Script] (localhost-startStop-1:null) 
(logid:) Executing: /bin/bash -c if [ -f /var/cloudstack/management/.ssh/id_rsa 
]; then rm -f /var/cloudstack/management/.ssh/id_rsa; fi; ssh-keygen -t rsa -N 
'' -f /var/cloudstack/management/.ssh/id_rsa -q
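
To see at a glance which of the two directories actually holds a key pair, a 
small check along these lines might help. It is only a sketch: the function 
name report_key_dirs is made up for illustration, and the two paths in the 
example call are simply the ones mentioned above.

```shell
# Hypothetical helper: for each directory given, report whether an
# id_rsa / id_rsa.pub pair is present.
report_key_dirs() {
    for dir in "$@"; do
        if [ -f "$dir/id_rsa" ] && [ -f "$dir/id_rsa.pub" ]; then
            echo "$dir: key pair present"
        elif [ -f "$dir/id_rsa" ] || [ -f "$dir/id_rsa.pub" ]; then
            echo "$dir: incomplete key pair"
        else
            echo "$dir: no keys"
        fi
    done
}

# Example call with the two locations from this thread:
# report_key_dirs /var/lib/cloudstack/management/.ssh /var/cloudstack/management/.ssh
```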

I then destroyed the SSVM and, after it was recreated, could not ssh to it from 
the management server because of a publickey error. Previously I have always 
been able to ssh to the SSVM using the /var/cloudstack/management/.ssh keys.

I checked the management server logs and found an entry to the effect of 
"id_rsa files differ, injecting into systemvm.iso".
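
One way to confirm whether the key on disk is the same one stored in the 
database is to compare only the key type and base64 body, ignoring the trailing 
comment. This is a sketch under assumptions: same_pubkey is a hypothetical 
helper, and the mysql query in the comments assumes the default "cloud" 
database and that the value is stored unencrypted, which may not hold on every 
install.

```shell
# Hypothetical helper: succeed if two OpenSSH public key strings carry the
# same key type and key body (fields 1 and 2). The comment field is ignored.
same_pubkey() {
    key_a=$(printf '%s\n' "$1" | awk 'NF >= 2 { print $1, $2; exit }')
    key_b=$(printf '%s\n' "$2" | awk 'NF >= 2 { print $1, $2; exit }')
    [ -n "$key_a" ] && [ "$key_a" = "$key_b" ]
}

# Possible usage on the management server (user and database are assumptions):
# disk_key=$(cat /var/cloudstack/management/.ssh/id_rsa.pub)
# db_key=$(mysql -u cloud -p -N -B -e \
#     "select value from configuration where name = 'ssh.publickey'" cloud)
# same_pubkey "$disk_key" "$db_key" && echo "keys match" || echo "keys differ"
```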

Hopefully this is helpful.

Regards,
Jason


On 28/5/17, 5:29 pm, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:

    Hi Jason,
    
    
    In your test environment that uses the same db, can you try to do a 
workaround-experiment from [1]:
    
    
    0) chmod +r and chown cloud:cloud the relevant files and locations.
    
    
    1) Stop Management Server
    
    2) Delete the SSH keys in the MySQL database:
       delete from configuration where name = "ssh.publickey";
       delete from configuration where name = "ssh.privatekey";
    
    3) Delete the SSH key files:
       rm /var/lib/cloudstack/management/.ssh/id_rsa.pub
       rm /var/lib/cloudstack/management/.ssh/id_rsa
    
    4) Start the Management Server - SSH Keys are generated and mysql entries 
inserted
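
Steps 1 to 4 could be previewed with a dry-run helper like the one below, which 
only prints the commands for a given key directory and executes nothing. It is 
a sketch, not a tested procedure: print_key_reset_plan is a made-up name, the 
"cloudstack-management" service name, the "cloud" database and the root mysql 
user are assumptions about a typical install, and step 0 (permissions) is left 
out because the affected files are not spelled out.

```shell
# Hypothetical dry-run helper: print the workaround commands for one key
# directory. Nothing is executed; review the output before running anything.
print_key_reset_plan() {
    ssh_dir=$1
    printf '%s\n' "service cloudstack-management stop"
    printf '%s\n' "mysql -u root -p cloud -e \"delete from configuration where name = 'ssh.publickey'; delete from configuration where name = 'ssh.privatekey';\""
    printf '%s\n' "rm -f $ssh_dir/id_rsa.pub $ssh_dir/id_rsa"
    printf '%s\n' "service cloudstack-management start"
}

# Print the plan for the directory named in step 3:
# print_key_reset_plan /var/lib/cloudstack/management/.ssh
```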
    
    
    
    [1] http://markmail.org/message/zfjyd7s22itg7t7q
    
    
    Regards.
    
    ________________________________
    From: Jason Kinsella <ja...@cloudpeople.com.au>
    Sent: 27 May 2017 05:33:43
    To: users@cloudstack.apache.org
    Subject: Re: SSVM NIO SSL Handshake error
    
    Files are linked here.
    
    
https://dl.dropboxusercontent.com/u/10588206/acs492/managmenet-server-logs.tar.gz
    https://dl.dropboxusercontent.com/u/10588206/acs492/systemvm.tar.gz
    
    Today we did a couple of additional tests that proved interesting. We’ve 
got a prod and a dev server. Both were upgraded last month. The prod has the 
error, but the dev is working. Everything was the same including CentOS 6.5.
    
    We restored the dev DB into the fresh CentOS7 box and it displayed the same 
problem. This would suggest an OS issue. Therefore, the converse should work. 
We restored the prod DB into the dev server and it continues to exhibit the 
problem.
    
    This suggests that we may have missed something in the migration between 
servers. Here are the steps:
    
        Stop cloudstack-man service on broken box
        Dump DB
        Copy to new and restore
        Copy db.properties & key files and update IP entry in db.properties
        Update DB entry host to new IP
        Delete DB ssl.keystore and keystore file
        Destroy systemVMs in Vmware
        Start cloudstack-man on new box
    
        The /var/cloudstack/management/.ssh/ files are used when we ssh 
to the SSVM from the MS, so they are correct. What about ssh.publickey and 
ssh.privatekey in the cloud.configuration table?
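
After a migration like the one listed above, it may be worth reading the IP 
entry back out of db.properties to check that it really points at the new box. 
The helper below is a sketch: get_prop is a hypothetical name, and the file 
path and the cluster.node.IP key in the usage comment are assumptions about a 
standard 4.9 layout rather than something confirmed in this thread.

```shell
# Hypothetical helper: print the value of one key from a Java-style
# properties file (first match wins, surrounding whitespace is trimmed).
get_prop() {
    awk -F= -v key="$2" '
        $1 ~ /^[ \t]*#/ { next }            # skip comment lines
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim the key
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Possible usage (path and key name are assumptions):
# get_prop /etc/cloudstack/management/db.properties cluster.node.IP
```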
    
        Regards,
        Jason
    
        On 25/5/17, 7:51 pm, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
    
            Hi Jason,
    
    
            Thanks for sharing the details. Yes, with the new setup please 
share with us the mgmt server logs and ssvm logs with TRACE enabled in the 
log4j configuration.
    
    
            Regards.
    
            ________________________________
            From: Jason Kinsella <ja...@cloudpeople.com.au>
            Sent: 25 May 2017 12:49:50
            To: users@cloudstack.apache.org
            Subject: Re: SSVM NIO SSL Handshake error
    
            Hi Rohit,
            API login – fixed.
    
            Latest systemvmtemplate (shapeblue new) in place – no improvement
    
            No loadbalancer or known service on MS port 8250
    
            I am doing my testing now on a fresh install of CentOS7 using 
shapeblue noredist with DB restored.
    
            Hypervisor = vmware vsphere 6.5 with ESX 6.5
    
            Systemvm.iso is dated today
    
            All systemvms are exhibiting same behaviour.
    
            Would any other logs help?
    
            Regards,
            Jason
    
            On 25/5/17, 4:55 pm, "Rohit Yadav" <rohit.ya...@shapeblue.com> 
wrote:
    
                Hi Jason,
    
    
                The API login issue, which I believe you have already fixed, 
can be resolved by following this: 
http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.9/accounts.html#using-dynamic-roles
    
    
                If not already in-use, can you try using the latest 
systemvmtemplate (for 4.6-4.9) from 
http://packages.shapeblue.com/systemvmtemplate/4.6/new.
    
    
                Do you have a load-balancer on port 8250 on the management 
server(s), or any script/service that may be trying to perform a tcp-connect on 
mgmt server's port 8250?
    
    
                When you upgrade can you make sure that both cloudstack-common 
and cloudstack-management packages are upgraded to 4.9.2.0? Also, what 
hypervisor(s) are you using?
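
A quick way to check that the two packages are at the same version might look 
like this. It is a sketch: versions_match is a hypothetical helper, and the 
rpm queries in the comments assume an RPM-based management server such as the 
CentOS boxes discussed in this thread.

```shell
# Hypothetical helper: succeed only if every version string passed in is
# non-empty and identical to the first one.
versions_match() {
    [ "$#" -ge 1 ] && [ -n "$1" ] || return 1
    first=$1
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
    return 0
}

# Possible usage on an RPM-based management server:
# common=$(rpm -q --queryformat '%{VERSION}' cloudstack-common)
# mgmt=$(rpm -q --queryformat '%{VERSION}' cloudstack-management)
# versions_match "$common" "$mgmt" && echo "in sync: $mgmt" \
#     || echo "mismatch: $common vs $mgmt"
```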
    
    
                The following error may hint that the jars on systemvms may not 
be updated, as one of the exception classes is missing:
    
    
                    2017-05-23 11:58:22,468 INFO  
[utils.exception.CSExceptionErrorCode] (main:null) Could not find exception: 
com.cloud.utils.exception.NioConnectionException in error code list for 
exceptions
    
    
                Can you check that systemvm.iso is synced across hosts: (1) 
make sure cloudstack-common package is upgraded/updated to the same version as 
cloudstack-management (4.9.2.0), (2) if you're using vmware, delete 
systemvm.iso from the secondary storage, (3) for xenserver force reconnect on 
the host (from ui/api) or manually copy the scripts to xenserver host(s), (4) 
for kvm upgrade the cloudstack-common package.
    
    
                Destroy all other systemvms and see if you can reproduce the 
issue?
    
    
                Regards.
    
                ________________________________
                From: Jason Kinsella <ja...@cloudpeople.com.au>
                Sent: 25 May 2017 09:32:25
                To: users@cloudstack.apache.org
                Subject: Re: SSVM NIO SSL Handshake error
    
                Also, just wanted to mention that the symptoms we have with 
systemvms not connecting are described on the mailing list:
    
                CS 4.9 NIO Selector wait time PR-1601 - 
https://www.mail-archive.com/dev@cloudstack.apache.org/msg69154.html
    
                The only difference is that this thread refers to KVM hosts not 
connecting.
    
                I’ve tried most suggestions in this thread.
    
                On 25/5/17, 1:51 pm, "Jason Kinsella" 
<ja...@cloudpeople.com.au> wrote:
    
                    Java versions are as follows:
    
                    MS: 1.7.0_141
                    SSVM: 1.7.0_85
    
                    Deleted keystore files (again) and restarted MS, then 
recreated the SSVM.
    
                    Errors from SSVM:/var/log/cloud.log
    
                    2017-05-25 03:01:28,757 INFO  [utils.nio.NioClient] 
(main:null) Connecting to 192.168.12.5:8250
                    2017-05-25 03:01:29,293 WARN  [utils.nio.Link] (main:null) 
This SSL engine was forced to close inbound due to end of stream.
                    2017-05-25 03:01:29,293 ERROR [utils.nio.Link] (main:null) 
Failed to send server's CLOSE message due to socket channel's failure.
                    2017-05-25 03:01:29,294 ERROR [utils.nio.NioClient] 
(main:null) SSL Handshake failed while connecting to host: 192.168.12.5 port: 
8250
                    2017-05-25 03:01:29,294 ERROR [utils.nio.NioConnection] 
(main:null) Unable to initialize the threads.
                    java.io.IOException: SSL Handshake failed while connecting 
to host: 192.168.12.5 port: 8250
                         at 
com.cloud.utils.nio.NioClient.init(NioClient.java:67)
                         at 
com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
                         at com.cloud.agent.Agent.start(Agent.java:228)
                         at 
com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
                         at 
com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
                         at 
com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
                         at 
com.cloud.agent.AgentShell.start(AgentShell.java:456)
                         at com.cloud.agent.AgentShell.main(AgentShell.java:491)
    
                    Same SSL engine forced to close inbound due to end of stream
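
To check whether the failures all point at the same management server address, 
the handshake errors in cloud.log can be tallied per host. The function below 
is a sketch with a made-up name (count_handshake_failures). It counts only the 
ERROR lines logged by NioClient, so the repeated exception and stack trace 
lines are not double counted.

```shell
# Hypothetical helper: count NioClient handshake failures per target host
# in an agent log and print "<host> <count>" lines.
count_handshake_failures() {
    grep 'ERROR \[utils\.nio\.NioClient\].*SSL Handshake failed' "$1" |
        sed 's/.*connecting to host: \([^ ]*\) port.*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Possible usage on the SSVM:
# count_handshake_failures /var/log/cloud.log
```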
    
    
    
    
    
    
    
                    On 25/5/17, 1:51 am, "Rajani Karuturi" <raj...@apache.org> 
wrote:
    
                        Can you check java version? Set the default java to 1.7 
and delete keystore
                        files and restart MS
    
                        ~Rajani
    
                        Sent from phone.
    
                        On 24 May 2017 9:15 p.m., "Jason Kinsella" 
<ja...@cloudpeople.com.au> wrote:
    
                        > I have now moved management server to a fresh CentOS7 
server. But
                        > unfortunately I’m getting the exact same SSL 
handshake error. Back to
                        > square one.
                        >
                        > On 24/5/17, 11:40 pm, "Jason Kinsella" 
<ja...@cloudpeople.com.au> wrote:
                        >
                        >     Hi All,
                        >     Based on the feedback it seems like the issue is 
related to CentOS
                        > version, so I’ve built a new CentOS7 Management 
server using ShapeBlue
                        > noredist. I’ve restored the 4.9.2.0 DB into this 
server and
                        > management-server.logs look clean on boot. The only 
problem is that I can’t
                        > log into the webUI.
                        >
                        >     The logs show a successful login (user = kinsja), 
but the API
                        > command either is not allowed or doesn’t exist for 
the user. This means the
                        > UI doesn’t load.
                        >
                        >     Anyone seen this with a restored DB?
                        >
                        >     2017-05-24 09:26:08,239 DEBUG 
[c.c.u.AccountManagerImpl]
                        > (catalina-exec-17:ctx-ee2c5e26) (logid:a8ca5ee5) 
User: kinsja in domain 1
                        > has successfully logged in
                        >     2017-05-24 09:26:08,246 INFO  [c.c.a.ApiServer] 
(catalina-exec-17:ctx-ee2c5e26)
                        > (logid:a8ca5ee5) Current user logged in under  
timezone
                        >     2017-05-24 09:26:08,246 INFO  [c.c.a.ApiServer] 
(catalina-exec-17:ctx-ee2c5e26)
                        > (logid:a8ca5ee5) Timezone offset from UTC is: 0.0
                        >     2017-05-24 09:26:08,251 DEBUG [c.c.a.ApiServlet] 
(catalina-exec-17:ctx-ee2c5e26)
                        > (logid:a8ca5ee5) ===END===  192.168.10.38 -- POST
                        >     2017-05-24 09:26:08,320 DEBUG [c.c.a.ApiServlet] 
(catalina-exec-13:ctx-a1d38347)
                        > (logid:3404c663) ===START===  192.168.10.38 -- GET
                        > command=listCapabilities&response=json&_=1495632368256
                        >     2017-05-24 09:26:08,325 DEBUG [c.c.a.ApiServer]
                        > (catalina-exec-13:ctx-a1d38347 ctx-960796a5) 
(logid:3404c663) The user with
                        > id:31 is not allowed to request the API command or 
the API command does not
                        > exist: listCapabilities
                        >
                        >     Thanks
                        >     Jason
                        >
                        >     From: Jason Kinsella <ja...@cloudpeople.com.au>
                        >     Date: Tuesday, 23 May 2017 at 10:11 pm
                        >     To: "users@cloudstack.apache.org" 
<users@cloudstack.apache.org>
                        >     Subject: SSVM NIO SSL Handshake error
                        >
                        >     Hi,
                        >     We recently upgraded from 4.5.0 to 4.9.2.0 and 
encountered a problem
                        > with the SSVM and Console Proxy. They cannot connect 
to the management
                        > server. The SSVM cloud.log repeats this error every 
couple of seconds.
                        >
                        >     2017-05-23 11:58:22,461 INFO  
[utils.nio.NioClient] (main:null)
                        > Connecting to 192.168.12.1:8250
                        >     2017-05-23 11:58:22,465 WARN  [utils.nio.Link] 
(main:null) This SSL
                        > engine was forced to close inbound due to end of 
stream.
                        >     2017-05-23 11:58:22,465 ERROR [utils.nio.Link] 
(main:null) Failed to
                        > send server's CLOSE message due to socket channel's 
failure.
                        >     2017-05-23 11:58:22,466 ERROR 
[utils.nio.NioClient] (main:null) SSL
                        > Handshake failed while connecting to host: 
192.168.12.1 port: 8250
                        >     2017-05-23 11:58:22,466 ERROR 
[utils.nio.NioConnection] (main:null)
                        > Unable to initialize the threads.
                        >     java.io.IOException: SSL Handshake failed while 
connecting to host:
                        > 192.168.12.1 port: 8250
                        >                     at com.cloud.utils.nio.NioClient.
                        > init(NioClient.java:67)
                        >                     at 
com.cloud.utils.nio.NioConnection.start(
                        > NioConnection.java:88)
                        >                     at 
com.cloud.agent.Agent.start(Agent.java:237)
                        >                     at com.cloud.agent.AgentShell.
                        > launchAgent(AgentShell.java:399)
                        >                     at com.cloud.agent.AgentShell.
                        > launchAgentFromClassInfo(AgentShell.java:367)
                        >                     at com.cloud.agent.AgentShell.
                        > launchAgent(AgentShell.java:351)
                        >                     at com.cloud.agent.AgentShell.
                        > start(AgentShell.java:456)
                        >                     at com.cloud.agent.AgentShell.
                        > main(AgentShell.java:491)
                        >     2017-05-23 11:58:22,468 INFO  
[utils.exception.CSExceptionErrorCode]
                        > (main:null) Could not find exception: 
com.cloud.utils.exception.NioConnectionException
                        > in error code list for exceptions
                        >     2017-05-23 11:58:22,468 WARN  [cloud.agent.Agent] 
(main:null) NIO
                        > Connection Exception  
com.cloud.utils.exception.NioConnectionException:
                        > SSL Handshake failed while connecting to host: 
192.168.12.1 port: 8250
                        >
                        >     The setup is very simple. Single management 
server and ports are open.
                        >
                        >     Things checked / tried:
                        >
                        >     ·         Destroyed SSVM multiple times – still 
same problem.
                        >
                        >     ·         SSH to SSVM from MS using ssh -i 
/var/cloudstack/management/.ssh/id_rsa
                        > -p 3922 root@IPADDRESS – PASS
                        >
                        >     ·         SSVM telnet on 8250 to MS – PASS
                        >
                        >     I’ve also tested a restore of the DB into our 
working development
                        > 4.9.2.0 server. It also exhibits the handshake 
errors, so most likely DB
                        > related.
                        >
                        >     I’ve used up all my skills. Please help
                        >
                        >     Regards,
                        >     Jason
                        >
                        >
                        >
                        >
    
    
    
    
    
                rohit.ya...@shapeblue.com
                www.shapeblue.com
                53 Chandos Place, Covent Garden, London  WC2N 4HS UK
                @shapeblue
    
    
    
    
    
    
    
    
    
    
    
    
    
    
      
     
    
    
