exion created CLOUDSTACK-10355:
----------------------------------

             Summary: After upgrade to 4.11, Ceph RBD primary storage fails 
connection and renders node unusable
                 Key: CLOUDSTACK-10355
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10355
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: cloudstack-agent
    Affects Versions: 4.11.0.0
            Reporter: exion


On a perfectly working 4.10 node with the KVM hypervisor and Ceph RBD primary 
storage, after upgrading to 4.11, the CloudStack agent is unable to connect to 
the RBD pool in libvirt, and logs only a generic "Operation not supported" 
error:

 

2018-04-06 16:27:37,650 INFO  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-2:null) (logid:91b4e1df) Attempting to create storage 
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt

2018-04-06 16:27:37,652 WARN  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-2:null) (logid:91b4e1df) Storage pool 
be80af6a-7201-3410-8da4-9b3b58c4954f was not found running in libvirt. Need to 
create it.

2018-04-06 16:27:37,653 INFO  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-2:null) (logid:91b4e1df) Didn't find an existing storage 
pool be80af6a-7201-3410-8da4-9b3b58c4954f by UUID, checking for pools with 
duplicate paths

2018-04-06 16:27:37,664 ERROR [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-2:null) (logid:91b4e1df) Failed to create RBD storage 
pool: org.libvirt.LibvirtException: failed to connect to the RADOS monitor on: 
storagepool1:6789,: Operation not supported

2018-04-06 16:27:42,762 INFO  [cloud.agent.Agent] (Agent-Handler-4:null) 
(logid:) Lost connection to the server. Dealing with the remaining commands...
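
When comparing agent logs from before and after the upgrade, it can help to 
rejoin the wrapped lines and pull out the failing monitor address and error 
reason. The following is only a minimal sketch: the function names 
parse_entries and monitor_failure are hypothetical, and the pattern assumes 
the log layout shown in the excerpt above.

```python
import re

# Layout assumed from the excerpt above:
# "<date> <time>,<ms> <LEVEL> [<logger>] (<thread>) (logid:<id>) <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+\(logid:(?P<logid>[^)]*)\)\s+(?P<msg>.*)$"
)

SAMPLE_LOG = """\
2018-04-06 16:27:37,664 ERROR [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-2:null) (logid:91b4e1df) Failed to create RBD storage
pool: org.libvirt.LibvirtException: failed to connect to the RADOS monitor on:
storagepool1:6789,: Operation not supported

2018-04-06 16:27:42,762 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
(logid:) Lost connection to the server. Dealing with the remaining commands...
"""


def parse_entries(text):
    """Rejoin wrapped lines into one entry per timestamp and parse each entry."""
    entries, current = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line starting with a date begins a new entry.
        if re.match(r"^\d{4}-\d{2}-\d{2} ", line) and current:
            entries.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        entries.append(" ".join(current))
    return [m.groupdict() for m in map(LOG_LINE.match, entries) if m]


def monitor_failure(msg):
    """Return (monitor list, reason) for a RADOS connect failure, else None."""
    m = re.search(r"RADOS monitor on: (.*?): ([^:]+)$", msg)
    if not m:
        return None
    monitors = [h for h in m.group(1).split(",") if h]
    return monitors, m.group(2)


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE_LOG):
        if entry["level"] == "ERROR":
            print(entry["ts"], monitor_failure(entry["msg"]))
```

Filtering on the ERROR level keeps the comparison focused on the pool 
creation failure rather than on the agent disconnect that follows it.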

 

Exactly the same pool was working before the upgrade:

 

2018-04-06 12:53:52,847 INFO  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-3:null) (logid:14dace5e) Attempting to create storage 
pool be80af6a-7201-3410-8da4-9b3b58c4954f (RBD) in libvirt

2018-04-06 12:53:52,850 INFO  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-3:null) (logid:14dace5e) Found existing defined storage 
pool be80af6a-7201-3410-8da4-9b3b58c4954f, using it.

2018-04-06 12:53:52,850 INFO  [kvm.storage.LibvirtStorageAdaptor] 
(agentRequest-Handler-3:null) (logid:14dace5e) Trying to fetch storage pool 
be80af6a-7201-3410-8da4-9b3b58c4954f from libvirt

2018-04-06 12:53:53,171 INFO  [cloud.agent.Agent] (agentRequest-Handler-2:null) 
(logid:14dace5e) Proccess agent ready command, agent id = 46

 

To narrow down the issue and rule out system-related causes, I attached the 
pool directly to libvirt using the following XML config, and it worked as 
expected:

 

<pool type="rbd">
  <name>be80af6a-7201-3410-8da4-9b3b58c4954f</name>
  <source>
    <name>cephstor1</name>
    <host name='storagepool1' port='6789'/>
    <auth username='admin' type='ceph'>
      <secret uuid='XXXXX'/>
    </auth>
  </source>
</pool>
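
To repeat this manual test against other pools or monitors, the same 
definition could be generated by a short script. This is a sketch only: 
build_rbd_pool_xml is a hypothetical helper, it mirrors the hand-written XML 
above, and it is not the XML that the CloudStack agent itself generates.

```python
import xml.etree.ElementTree as ET


def build_rbd_pool_xml(pool_name, ceph_pool, mon_host, mon_port, user, secret_uuid):
    """Build a libvirt RBD pool definition shaped like the manual test file."""
    pool = ET.Element("pool", type="rbd")
    ET.SubElement(pool, "name").text = pool_name
    source = ET.SubElement(pool, "source")
    ET.SubElement(source, "name").text = ceph_pool
    ET.SubElement(source, "host", name=mon_host, port=str(mon_port))
    auth = ET.SubElement(source, "auth", username=user, type="ceph")
    ET.SubElement(auth, "secret", uuid=secret_uuid)
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(pool)
    return ET.tostring(pool, encoding="unicode")


if __name__ == "__main__":
    # The secret UUID is redacted in the report, so the placeholder is kept.
    print(build_rbd_pool_xml(
        "be80af6a-7201-3410-8da4-9b3b58c4954f",
        "cephstor1", "storagepool1", 6789, "admin", "XXXXX",
    ))
```

The output can be saved to a file and passed to virsh pool-create in the 
same way as test.xml.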

 

virsh pool-create test.xml 

Pool be80af6a-7201-3410-8da4-9b3b58c4954f created from test.xml

 

root@compute6:~# virsh pool-info be80af6a-7201-3410-8da4-9b3b58c4954f
Name:           be80af6a-7201-3410-8da4-9b3b58c4954f
UUID:           47afe7d4-61cb-46c5-a642-93712c758b5c
State:          running
Persistent:     no
Autostart:      no
Capacity:       10.05 TiB
Allocation:     2.22 TiB
Available:      2.71 TiB
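
The same check can be scripted, for example to confirm the pool state on 
several hosts. The sketch below parses the "Key: value" output shown above; 
parse_pool_info and pool_is_running are hypothetical names, and the virsh 
call assumes virsh is on the PATH of the host running the script.

```python
import subprocess

SAMPLE_OUTPUT = """\
Name:           be80af6a-7201-3410-8da4-9b3b58c4954f
UUID:           47afe7d4-61cb-46c5-a642-93712c758b5c
State:          running
Persistent:     no
Autostart:      no
Capacity:       10.05 TiB
Allocation:     2.22 TiB
Available:      2.71 TiB
"""


def parse_pool_info(output):
    """Turn 'Key:   value' lines from virsh pool-info into a dict."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def pool_is_running(pool_name):
    """Run virsh pool-info for the pool and report whether it is running."""
    result = subprocess.run(
        ["virsh", "pool-info", pool_name],
        capture_output=True, text=True, check=True,
    )
    return parse_pool_info(result.stdout).get("State") == "running"


if __name__ == "__main__":
    print(parse_pool_info(SAMPLE_OUTPUT)["State"])
```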

 

Since the pool can be created manually, the issue appears to be related to the 
way CloudStack interfaces with the libvirt daemon.

 

 

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
