Re: [ceph-users] RBD mirror error requesting lock: (30) Read-only file system

2017-03-17 Thread daniel parkes
Hi Jason,

You hit the nail on the head; that was the problem. On other installations I
was using client.admin and the corresponding daemon. In this case I created a
dedicated user/daemon, but I didn't disable the admin daemon.

Thanks for the help!
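
For the archives, a minimal sketch of the fix (this assumes the daemons are
managed by the stock ceph-rbd-mirror@.service systemd units and that the
instance names "admin" and "zone2" match the setup in this thread):

# stop and disable the rbd-mirror instance running as client.admin
systemctl stop ceph-rbd-mirror@admin
systemctl disable ceph-rbd-mirror@admin
# keep only the dedicated instance (client.zone2 here)
systemctl enable ceph-rbd-mirror@zone2
systemctl start ceph-rbd-mirror@zone2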

On Fri, Mar 17, 2017 at 2:17 AM, Jason Dillaman  wrote:

> Any chance you have two or more instances of the rbd-mirror daemon running
> against the same cluster (zone2 in this instance)? The error message
> is stating that there is another process that owns the exclusive lock
> on the image and is refusing to release it. The fact that the
> status ping-pongs back and forth between OK and ERROR/WARNING also
> hints that you have two or more rbd-mirror daemons fighting each
> other. In the Jewel and Kraken releases, we unfortunately only support
> a single rbd-mirror daemon process per cluster. In the forthcoming
> Luminous release, we are hoping to add active/active support (it
> already safely supports self-promoting active/passive if more than one
> rbd-mirror daemon process is started).
>
> On Thu, Mar 16, 2017 at 5:48 PM, daniel parkes 
> wrote:
> > Hi!,
> >
> > I'm having a problem with a new ceph deployment using rbd mirroring, and
> > I'm writing in case someone can help me out or point me in the right
> > direction.
> >
> > I have a ceph jewel install with 2 clusters (zone1, zone2). rbd is working
> > fine, but the rbd mirroring between sites is not working correctly.
> >
> > I have configured pool replication in the default rbd pool, set up the
> > peers and created 2 test images:
> >
> > [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool info
> > Mode: pool
> > Peers:
> >   UUID NAME  CLIENT
> >   397b37ef-8300-4dd3-a637-2a03c3b9289c zone2 client.zone2
> > [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool info
> > Mode: pool
> > Peers:
> >   UUID NAME  CLIENT
> >   2c11f1dc-67a4-43f1-be33-b785f1f6b366 zone1 client.zone1
> >
> > Primary is ok:
> >
> > [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool status
> > --verbose
> > health: OK
> > images: 2 total
> > 2 stopped
> >
> > test-2:
> >   global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
> >   state:   up+stopped
> >   description: remote image is non-primary or local image is primary
> >   last_update: 2017-03-16 17:38:08
> >
> > And secondary is always in this state:
> >
> > [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool status
> > --verbose
> > health: WARN
> > images: 2 total
> > 1 syncing
> >
> > test-2:
> >   global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
> >   state:   up+syncing
> >   description: bootstrapping, OPEN_LOCAL_IMAGE
> >   last_update: 2017-03-16 17:41:02
> >
> > Sometimes for a couple of seconds it goes into the replay state with
> > health OK, but then it goes back to bootstrapping, OPEN_LOCAL_IMAGE.
> > What does this state mean?
> >
> > In the log files I have this error:
> >
> > 2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher:
> > 0x7ff654003190 error requesting lock: (30) Read-only file system
> > 2017-03-16 17:43:03.411327 7ff6262e7700 -1 librbd::ImageWatcher:
> > 0x7ff654003190 error requesting lock: (30) Read-only file system
> > 2017-03-16 17:43:04.420074 7ff6262e7700 -1 librbd::ImageWatcher:
> > 0x7ff654003190 error requesting lock: (30) Read-only file system
> > 2017-03-16 17:43:05.422253 7ff6262e7700 -1 librbd::ImageWatcher:
> > 0x7ff654003190 error requesting lock: (30) Read-only file system
> > 2017-03-16 17:43:06.428447 7ff6262e7700 -1 librbd::ImageWatcher:
> > 0x7ff654003190 error requesting lock: (30) Read-only file system
> >
> > I'm not sure what file it is referring to that is read-only; I tried to
> > strace it, but couldn't find it.
> >
> > I have disabled selinux just in case, but the result is the same. The OS
> > is rhel 7.2, by the way.
> >
> > If I do a demote/promote of the image, I get the same state and errors on
> > the other cluster.
> >
> > If someone could help, it would be great.
> >
> > Thanks in advance.
> >
> > Regards
> >
>
>
>
> --
> Jason
>


Re: [ceph-users] RBD mirror error requesting lock: (30) Read-only file system

2017-03-16 Thread Jason Dillaman
Any chance you have two or more instances of the rbd-mirror daemon running
against the same cluster (zone2 in this instance)? The error message
is stating that there is another process that owns the exclusive lock
on the image and is refusing to release it. The fact that the
status ping-pongs back and forth between OK and ERROR/WARNING also
hints that you have two or more rbd-mirror daemons fighting each
other. In the Jewel and Kraken releases, we unfortunately only support
a single rbd-mirror daemon process per cluster. In the forthcoming
Luminous release, we are hoping to add active/active support (it
already safely supports self-promoting active/passive if more than one
rbd-mirror daemon process is started).
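
A quick way to verify is to check each node of the zone2 cluster for more
than one running rbd-mirror process (a sketch; the systemd unit pattern is
an assumption based on the stock ceph-rbd-mirror@.service template):

# any rbd-mirror processes on this host?
ps axww | grep '[r]bd-mirror'
# if the daemons are managed by systemd, list the unit instances
systemctl list-units 'ceph-rbd-mirror@*'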

On Thu, Mar 16, 2017 at 5:48 PM, daniel parkes  wrote:
> Hi!,
>
> I'm having a problem with a new ceph deployment using rbd mirroring, and I'm
> writing in case someone can help me out or point me in the right direction.
>
> I have a ceph jewel install with 2 clusters (zone1, zone2). rbd is working
> fine, but the rbd mirroring between sites is not working correctly.
>
> I have configured pool replication in the default rbd pool, set up the
> peers and created 2 test images:
>
> [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool info
> Mode: pool
> Peers:
>   UUID NAME  CLIENT
>   397b37ef-8300-4dd3-a637-2a03c3b9289c zone2 client.zone2
> [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool info
> Mode: pool
> Peers:
>   UUID NAME  CLIENT
>   2c11f1dc-67a4-43f1-be33-b785f1f6b366 zone1 client.zone1
>
> Primary is ok:
>
> [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool status
> --verbose
> health: OK
> images: 2 total
> 2 stopped
>
> test-2:
>   global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
>   state:   up+stopped
>   description: remote image is non-primary or local image is primary
>   last_update: 2017-03-16 17:38:08
>
> And secondary is always in this state:
>
> [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool status
> --verbose
> health: WARN
> images: 2 total
> 1 syncing
>
> test-2:
>   global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
>   state:   up+syncing
>   description: bootstrapping, OPEN_LOCAL_IMAGE
>   last_update: 2017-03-16 17:41:02
>
> Sometimes for a couple of seconds it goes into the replay state with health
> OK, but then it goes back to bootstrapping, OPEN_LOCAL_IMAGE. What does this
> state mean?
>
> In the log files I have this error:
>
> 2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher:
> 0x7ff654003190 error requesting lock: (30) Read-only file system
> 2017-03-16 17:43:03.411327 7ff6262e7700 -1 librbd::ImageWatcher:
> 0x7ff654003190 error requesting lock: (30) Read-only file system
> 2017-03-16 17:43:04.420074 7ff6262e7700 -1 librbd::ImageWatcher:
> 0x7ff654003190 error requesting lock: (30) Read-only file system
> 2017-03-16 17:43:05.422253 7ff6262e7700 -1 librbd::ImageWatcher:
> 0x7ff654003190 error requesting lock: (30) Read-only file system
> 2017-03-16 17:43:06.428447 7ff6262e7700 -1 librbd::ImageWatcher:
> 0x7ff654003190 error requesting lock: (30) Read-only file system
>
> I'm not sure what file it is referring to that is read-only; I tried to
> strace it, but couldn't find it.
>
> I have disabled selinux just in case, but the result is the same. The OS is
> rhel 7.2, by the way.
>
> If I do a demote/promote of the image, I get the same state and errors on
> the other cluster.
>
> If someone could help, it would be great.
>
> Thanks in advance.
>
> Regards
>



-- 
Jason


[ceph-users] RBD mirror error requesting lock: (30) Read-only file system

2017-03-16 Thread daniel parkes
Hi!,

I'm having a problem with a new ceph deployment using rbd mirroring, and I'm
writing in case someone can help me out or point me in the right direction.

I have a ceph jewel install with 2 clusters (zone1, zone2). rbd is working
fine, but the rbd mirroring between sites is not working correctly.

I have configured pool replication in the default rbd pool, set up the peers
and created 2 test images:

[root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool info
Mode: pool
Peers:
  UUID NAME  CLIENT
  397b37ef-8300-4dd3-a637-2a03c3b9289c zone2 client.zone2
[root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool info
Mode: pool
Peers:
  UUID NAME  CLIENT
  2c11f1dc-67a4-43f1-be33-b785f1f6b366 zone1 client.zone1
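
(For reference, setting this up would presumably involve commands along these
lines; the exact invocations are an assumption, since they are not shown
above:)

# enable pool-mode mirroring on the rbd pool in both clusters
rbd --cluster zone1 mirror pool enable rbd pool
rbd --cluster zone2 mirror pool enable rbd pool
# add each cluster as a peer of the other
rbd --cluster zone1 mirror pool peer add rbd client.zone2@zone2
rbd --cluster zone2 mirror pool peer add rbd client.zone1@zone1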

Primary is ok:

[root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool status
--verbose
health: OK
images: 2 total
2 stopped

test-2:
  global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
  state:   up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-03-16 17:38:08

And secondary is always in this state:

[root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool status
--verbose
health: WARN
images: 2 total
1 syncing

test-2:
  global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
  state:   up+syncing
  description: bootstrapping, OPEN_LOCAL_IMAGE
  last_update: 2017-03-16 17:41:02

Sometimes for a couple of seconds it goes into the replay state with health
OK, but then it goes back to bootstrapping, OPEN_LOCAL_IMAGE. What does this
state mean?

In the log files I have this error:

2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher:
0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:03.411327 7ff6262e7700 -1 librbd::ImageWatcher:
0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:04.420074 7ff6262e7700 -1 librbd::ImageWatcher:
0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:05.422253 7ff6262e7700 -1 librbd::ImageWatcher:
0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:06.428447 7ff6262e7700 -1 librbd::ImageWatcher:
0x7ff654003190 error requesting lock: (30) Read-only file system

I'm not sure what file it is referring to that is read-only; I tried to
strace it, but couldn't find it.
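
For readers of the archive: the (30) Read-only file system here appears to be
just the EROFS errno that librbd reports when its request for the image's
exclusive lock is refused by the current owner, not an actual read-only
filesystem. A sketch of how to see who is watching the image and holding the
lock (assuming the test-2 image in the default rbd pool):

# show watchers of the image
rbd --user zone2 --cluster zone2 status rbd/test-2
# show the current exclusive lock holder
rbd --user zone2 --cluster zone2 lock list rbd/test-2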

I have disabled selinux just in case, but the result is the same. The OS is
rhel 7.2, by the way.

If I do a demote/promote of the image, I get the same state and errors on
the other cluster.
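
(The demote/promote above is along these lines; image and pool names are
assumed from the status output earlier:)

# demote the image on the current primary cluster
rbd --user zone1 --cluster zone1 mirror image demote rbd/test-2
# promote it on the other cluster
rbd --user zone2 --cluster zone2 mirror image promote rbd/test-2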

If someone could help, it would be great.

Thanks in advance.

Regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com