Re: [ceph-users] rados rm: device or resource busy

2017-06-11 Thread Richard Arends

All,

This is what I wrote https://github.com/Mosibi/ceph_stripe_fixer for.

With regards,
Richard.

On 06/09/2017 10:47 AM, Jan Kasprzak wrote:
> Hello,
>
> Brad Hubbard wrote:
> : I can reproduce this.
> [...]
> : That's here where you will notice it is returning EBUSY which is error
> : code 16, "Device or resource busy".
> :
> : https://github.com/badone/ceph/blob/wip-ceph_test_admin_socket_output/src/cls/lock/cls_lock.cc#L189
> :
> : In order to remove the existing parts of the file you should be able
> : to just run "rados --pool testpool ls" and remove the listed objects
> : belonging to "testfile".
> :
> : Example:
> : rados --pool testpool ls
> : testfile.0004
> : testfile.0001
> : testfile.
> : testfile.0003
> : testfile.0005
> : testfile.0002
> :
> : rados --pool testpool rm testfile.
> : rados --pool testpool rm testfile.0001
> : ...
>
> This works for me, thanks!
>
> : Please open a tracker for this so it can be investigated further.
>
> Done: http://tracker.ceph.com/issues/20233
>
> -Yenya

--
With regards,

Richard Arends.
Snow BV / http://snow.nl

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rados rm: device or resource busy

2017-06-09 Thread Jan Kasprzak
Hello,

Brad Hubbard wrote:
: I can reproduce this.
[...] 
: That's here where you will notice it is returning EBUSY which is error
: code 16, "Device or resource busy".
:
: https://github.com/badone/ceph/blob/wip-ceph_test_admin_socket_output/src/cls/lock/cls_lock.cc#L189
:
: In order to remove the existing parts of the file you should be able
: to just run "rados --pool testpool ls" and remove the listed objects
: belonging to "testfile".
: 
: Example:
: rados --pool testpool ls
: testfile.0004
: testfile.0001
: testfile.
: testfile.0003
: testfile.0005
: testfile.0002
: 
: rados --pool testpool rm testfile.
: rados --pool testpool rm testfile.0001
: ...

This works for me, thanks!

: Please open a tracker for this so it can be investigated further.

Done: http://tracker.ceph.com/issues/20233

-Yenya

-- 
| Jan "Yenya" Kasprzak  |
| http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
> That's why this kind of vulnerability is a concern: deploying stuff is  <
> often about collecting an obscene number of .jar files and pushing them <
> up to the application server.  --pboddie at LWN <


Re: [ceph-users] rados rm: device or resource busy

2017-06-08 Thread Brad Hubbard
I can reproduce this.

The key is to look at debug logging on the primary.

2017-06-09 09:30:14.776355 7f9cf26a4700 20 
/home/brad/working/src/ceph3/src/cls/lock/cls_lock.cc:247: lock_op
2017-06-09 09:30:14.776359 7f9cf26a4700 20 
/home/brad/working/src/ceph3/src/cls/lock/cls_lock.cc:162: requested
lock_type=exclusive fail_if_exists=1
2017-06-09 09:30:14.776363 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] do_osd_op 0:6d521d9c:::testfile.:head
[getxattr lock.striper.lock]
2017-06-09 09:30:14.776372 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] do_osd_op  getxattr lock.striper.lock
2017-06-09 09:30:14.776383 7f9cf26a4700 15
filestore(/home/brad/working/src/ceph3/build/dev/osd0) getattr
0.6_head/#0:6d521d9c:::testfile.:head#
'_lock.striper.lock'
2017-06-09 09:30:14.776408 7f9cf26a4700 10
filestore(/home/brad/working/src/ceph3/build/dev/osd0) getattr
0.6_head/#0:6d521d9c:::testfile.:head#
'_lock.striper.lock' = 126
2017-06-09 09:30:14.776419 7f9cf26a4700 20 
/home/brad/working/src/ceph3/src/cls/lock/cls_lock.cc:189: cannot take
lock on object, conflicting tag
2017-06-09 09:30:14.776422 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] method called response length=0
2017-06-09 09:30:14.776432 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean]  dropping ondisk_read_lock
2017-06-09 09:30:14.776445 7f9cf26a4700 20 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean]  op order client.4122 tid 1 (first)
2017-06-09 09:30:14.776453 7f9cf26a4700 20 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] execute_ctx update_log_only -- result=-16
2017-06-09 09:30:14.776468 7f9cf26a4700 20 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] record_write_error r=-16
2017-06-09 09:30:14.776478 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] submit_log_entries 10'32 (0'0) error
0:6d521d9c:::testfile.:head by client.4122.0:1
0.00 -16
2017-06-09 09:30:14.776490 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] new_repop: repgather(0x565246704a80 10'32 rep_tid=33
committed?=0 applied?=0 r=-16)
2017-06-09 09:30:14.776502 7f9cf26a4700 10 osd.0 pg_epoch: 10 pg[0.6(
v 10'31 (0'0,10'31] local-lis/les=8/9 n=2 ec=1/1 lis/c 8/8 les/c/f
9/9/0 8/8/4) [0,1,2] r=0 lpr=8 crt=10'28 lcod 10'30 mlcod 10'27
active+clean] merge_new_log_entries 10'32 (0'0) error
0:6d521d9c:::testfile.:head by client.4122.0:1
0.00 -16
2017-06-09 09:30:14.776514 7f9cf26a4700 20 update missing, append
10'32 (0'0) error0:6d521d9c:::testfile.:head by
client.4122.0:1 0.00 -16
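
At this debug level the interesting entries are easy to miss. As a rough
sketch, a filter like the one below can narrow an OSD log down to the
cls_lock messages and the -16 result. The function name cls_lock_lines is
made up for this illustration, and the patterns are simply taken from the
log excerpt above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: read an OSD log on stdin and keep
# only the lines that mention the cls_lock source file or a -16 result.
cls_lock_lines() {
    grep -E 'cls_lock\.cc:[0-9]+|result=-16'
}
```

It reads the log on standard input, for example
"cls_lock_lines < ceph-osd.0.log", where the file name stands in for
wherever the primary OSD writes its log.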

Specifically, this line:

/home/brad/working/src/ceph3/src/cls/lock/cls_lock.cc:189: cannot take
lock on object, conflicting tag

That message is logged at the line linked below, where the code returns
EBUSY, which is error code 16, "Device or resource busy".

https://github.com/badone/ceph/blob/wip-ceph_test_admin_socket_output/src/cls/lock/cls_lock.cc#L189
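
The xattr name lock.striper.lock in the log suggests that the lock itself is
called striper.lock (cls_lock keeps a lock in an xattr named "lock." plus the
lock name). Under that assumption, the advisory-lock subcommands of the rados
tool can show who still holds it. This is a minimal sketch: the helper name
show_striper_lock and the RADOS override variable are hypothetical, and the
object name must be the first stripe object exactly as "rados ls" prints it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: show the advisory locks on one
# object and the details of the lock assumed to be named "striper.lock".
# Set RADOS to substitute another command for rados (e.g. for a dry run).
show_striper_lock() {
    pool="$1"   # pool name, e.g. testpool
    obj="$2"    # first stripe object, as listed by "rados ls"
    ${RADOS:-rados} --pool "$pool" lock list "$obj"
    ${RADOS:-rados} --pool "$pool" lock info "$obj" striper.lock
}
```

For example: show_striper_lock testpool "$first_object". The rados tool also
has a "lock break" subcommand, but whether breaking the lock is enough to make
"--striper rm" succeed here has not been verified in this thread.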

In order to remove the existing parts of the file you should be able
to just run "rados --pool testpool ls" and remove the listed objects
belonging to "testfile".

Example:
rados --pool testpool ls
testfile.0004
testfile.0001
testfile.
testfile.0003
testfile.0005
testfile.0002

rados --pool testpool rm testfile.
rados --pool testpool rm testfile.0001
...
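
The manual steps above could be wrapped in a small loop. The following is
only a sketch: the helper name remove_stripe_parts and the RADOS and DRY_RUN
variables are assumptions made for this example. It matches on the prefix
"<name>.", which would also catch unrelated objects such as "testfile.bak",
so it only prints what it would remove unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: remove every object in a pool whose
# name starts with "<name>.", i.e. the pieces left behind by the striper.
# Dry run by default; set DRY_RUN=0 to really remove the objects.
remove_stripe_parts() {
    pool="$1"      # pool name, e.g. testpool
    prefix="$2."   # striped object name plus the separating dot
    ${RADOS:-rados} --pool "$pool" ls | while IFS= read -r obj; do
        case "$obj" in
            "$prefix"*)
                if [ "${DRY_RUN:-1}" = 1 ]; then
                    printf 'would remove %s\n' "$obj"
                else
                    # stdin is redirected so rm cannot consume the listing
                    ${RADOS:-rados} --pool "$pool" rm "$obj" </dev/null
                fi
                ;;
        esac
    done
}
```

Usage: run "remove_stripe_parts testpool testfile" first, check the list, and
then repeat the call with DRY_RUN=0 set.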

Please open a tracker for this so it can be investigated further.

On Fri, Jun 9, 2017 at 1:43 AM, Jan Kasprzak wrote:
> Hello,
>
> David Turner wrote:
> : How long have you waited?
>
> About a day.
>
> : I don't do much with rados objects directly.  I usually use RBDs and
> : cephfs.  If you just need to clean things up, you can delete the pool and
> : recreate it since it looks like it's testing.  

Re: [ceph-users] rados rm: device or resource busy

2017-06-08 Thread Jan Kasprzak
Hello,

David Turner wrote:
: How long have you waited?

About a day.

: I don't do much with rados objects directly.  I usually use RBDs and
: cephfs.  If you just need to clean things up, you can delete the pool and
: recreate it since it looks like it's testing.  However this is probably a
: prime time to figure out how to get past this in case it happens in the
: future in production.

Yes. This is why I am asking now.

-Yenya

: On Thu, Jun 8, 2017 at 11:04 AM Jan Kasprzak wrote:
: > I have created a RADOS striped object using
: >
: > $ dd someargs | rados --pool testpool --striper put testfile -
: >
: > and interrupted it in the middle of writing. Now I cannot remove this
: > object:
: >
: > $ rados --pool testpool --striper rm testfile
: > error removing testpool>testfile: (16) Device or resource busy
: >
: > How can I tell CEPH that the writer is no longer around and does not come
: > back,
: > so that I can remove the object "testfile"?



Re: [ceph-users] rados rm: device or resource busy

2017-06-08 Thread David Turner
How long have you waited? Watchers of objects in Ceph time out after a
while, after which you should be able to delete the object.  I'm talking
about something in the range of 30 minutes, so this likely isn't the problem
if you've been wrestling with it long enough to write in about it.

I don't do much with rados objects directly.  I usually use RBDs and
cephfs.  If you just need to clean things up, you can delete the pool and
recreate it since it looks like it's testing.  However this is probably a
prime time to figure out how to get past this in case it happens in the
future in production.

Hopefully someone who has more experience with manually creating and
removing rados objects chimes in.

On Thu, Jun 8, 2017 at 11:04 AM Jan Kasprzak wrote:

> Hello,
>
> I have created a RADOS striped object using
>
> $ dd someargs | rados --pool testpool --striper put testfile -
>
> and interrupted it in the middle of writing. Now I cannot remove this
> object:
>
> $ rados --pool testpool --striper rm testfile
> error removing testpool>testfile: (16) Device or resource busy
>
> How can I tell CEPH that the writer is no longer around and does not come
> back,
> so that I can remove the object "testfile"?
>
> Thanks,
>
> -Yenya


[ceph-users] rados rm: device or resource busy

2017-06-08 Thread Jan Kasprzak
Hello,

I have created a RADOS striped object using

$ dd someargs | rados --pool testpool --striper put testfile -

and interrupted it in the middle of writing. Now I cannot remove this object:

$ rados --pool testpool --striper rm testfile
error removing testpool>testfile: (16) Device or resource busy

How can I tell Ceph that the writer is no longer around and will not come back,
so that I can remove the object "testfile"?

Thanks,

-Yenya
