Hello Huamin, I'm aware of the mounting constraints of block storage.
What happens is that when I remove the pod and wait for a new one,
sometimes it doesn't start because of the lock issue.
Here's my pod:
Name: hawkular-cassandra-1-rw5ii
Namespace: openshift-infra
Image(s): openshift/origin-metrics-cassandra:latest
Node: nodebr0.xnc3qg4rvmuenbiin5bq5kisfe.nx.internal.cloudapp.net/10.0.2.5
Start Time: Tue, 19 Jan 2016 21:27:23 +0000
Labels: metrics-infra=hawkular-cassandra,name=hawkular-cassandra-1,type=hawkular-cassandra
Status: Running
Reason:
Message:
IP: 10.1.2.104
Replication Controllers: hawkular-cassandra-1 (1/1 replicas created)
Containers:
hawkular-cassandra-1:
Container ID: docker://4210b08b808a5c2c684ddfc1b734c0a76cd61b23989c83dbff7d7c175e45505f
Image: openshift/origin-metrics-cassandra:latest
Image ID: docker://9f440f6ca921872a9b06d34da808a3a82cb071c16d8089676bc823e309b17724
QoS Tier:
cpu: BestEffort
memory: BestEffort
State: Running
Started: Tue, 19 Jan 2016 21:28:23 +0000
Ready: True
Restart Count: 0
Environment Variables:
CASSANDRA_MASTER: true
POD_NAMESPACE: openshift-infra (v1:metadata.namespace)
Conditions:
Type Status
Ready True
Volumes:
cassandra-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: metrics-cassandra-1
ReadOnly: false
hawkular-cassandra-secrets:
Type: Secret (a secret that should populate this volume)
SecretName: hawkular-cassandra-secrets
cassandra-token-5l0qq:
Type: Secret (a secret that should populate this volume)
SecretName: cassandra-token-5l0qq
No events.
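For reference, here's a sketch of the workaround I've been scripting to clear the stale lock before the pod restarts. The `release_kubelet_lock` helper name is mine, and it assumes the locker ID starts with `kubelet_lock_magic_`, as in the `rbd lock list` output quoted below:

```shell
# Sketch: find any kubelet-held lock on an rbd image and remove it so a
# new pod can attach. The kubelet names its locks "kubelet_lock_magic_<node>".
release_kubelet_lock() {
  image=$1
  # `rbd lock list` prints columns: Locker  ID  Address
  # `rbd lock remove` expects:      <image> <lock-id> <locker>
  rbd lock list "$image" |
    awk '/kubelet_lock_magic_/ {print $2, $1}' |
    while read -r lock_id locker; do
      rbd lock remove "$image" "$lock_id" "$locker"
    done
}

# usage: release_kubelet_lock cassandra
```

Obviously this should only be run once you're sure no other node is legitimately writing to the image.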
2016-01-19 22:44 GMT-03:00 Huamin Chen <[email protected]>:
> Diego,
>
> An rbd volume is expected to be written by just one container, since
> once used, the rbd volume is exclusively owned by that writer. What does
> your cassandra pod look like?
>
> Huamin
>
> On Tue, Jan 19, 2016 at 7:20 PM, Clayton Coleman <[email protected]>
> wrote:
>
>> Hrm - so it sounds like the pod didn't get the volume torn down
>> correctly to release the volume lock. Copying some folks who might
>> know what part of the logs to look for.
>>
>> On Tue, Jan 19, 2016 at 7:15 PM, Diego Spinola Castro
>> <[email protected]> wrote:
>> > Hi, this is origin 1.1.0.1-0.git.7334.2c6ff4b and a Ceph cluster for
>> > block storage.
>> >
>> > It happens a lot with my cassandra pod, but I'd like to check whether
>> > it is an issue or something that I'm doing wrong.
>> >
>> > When I delete a pod with a Ceph PV, it sometimes doesn't start again;
>> > looking at the pod events I found:
>> >
>> > FailedSync Error syncing pod, skipping: rbd: image cassandra is locked
>> > by other nodes
>> >
>> >
>> > I looked it up and found the lock in the rbd system; indeed it was
>> > owned by a different node, so as soon as I deleted the lock, the pod
>> > was able to start.
>> >
>> > $ rbd lock list cassandra
>> >
>> > There is 1 exclusive lock on this image.
>> > Locker ID Address
>> > client.10197 kubelet_lock_magic_nodebr0 10.0.2.5:0/1005447
>> >
>> > $ rbd lock remove cassandra kubelet_lock_magic_nodebr0 client.10197
>> >
>> >
>> > Does anybody else have this issue?
>> >
>> > _______________________________________________
>> > dev mailing list
>> > [email protected]
>> > http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>> >
>>
>
>