Re: [ceph-users] Missing clones

2018-02-21 Thread Karsten Becker
So - here is the feedback. After a long night...

Plain copying did not help... it then complained about the snapshots of
another VM (also one with old snapshots).

I remembered a thread I had read saying that the problem could be solved by
converting back to filestore, because you then have access to the data
in the filesystem. So I did that for the 3 affected OSDs. After that, of
course (argh), the PG got located on other OSDs - but at least one replica
was still on a filestore-converted OSD.

So I first set the primary affinity so that the PG was primary on
the filestore OSD. Then I quickly turned off all three OSDs. The PG then
went stale (all replicas were down). I flushed the journals to be on the
safe side.

Then I took a detailed look at the filesystem (with find) and found the
object rbd_data.2313975238e1f29.000XXX, which had size 0. So no data in it.

I then used
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-X 
> rbd_data.2313975238e1f29.000XXX remove

on all three OSDs and fired them up again.

Then - after waiting for the cluster to get balanced again (PG still
reported as inconsistent) - I fired up a repair on the PG (primary still
on the filestore OSD).

-> Fixed.   :-)  HEALTHY
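
For anyone retracing the steps above, the OSD-side part can be sketched as a
dry run that only prints the commands. This is a sketch under assumptions, not
the exact commands that were run: the ceph-osd@<id> systemd unit names, the
"ceph-osd --flush-journal" call and the OSD ids 11, 12 and 13 are illustrative
and do not come from this thread. The object name keeps the elided 000XXX
suffix as a placeholder, and 10.7b9 is the PG named in the repair error quoted
in this thread.

```shell
#!/usr/bin/env bash
# Dry run only: prints the commands in the order described above
# (stop all OSDs, flush journals, remove the object, start again).
# Assumptions (not from the thread): systemd unit names ceph-osd@<id>
# and "ceph-osd --flush-journal" as the journal flush command.
print_removal_plan() {
    local object=$1; shift
    local osd
    for osd in "$@"; do echo "systemctl stop ceph-osd@${osd}"; done
    for osd in "$@"; do echo "ceph-osd -i ${osd} --flush-journal"; done
    for osd in "$@"; do
        echo "ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${osd} ${object} remove"
    done
    for osd in "$@"; do echo "systemctl start ceph-osd@${osd}"; done
}

# Hypothetical OSD ids; the object name is the elided placeholder.
print_removal_plan "rbd_data.2313975238e1f29.000XXX" 11 12 13

# Once the cluster has settled, the repair is started once for the PG.
echo "ceph pg repair 10.7b9"
```

Printing the plan first makes it easy to double-check the object name and the
OSD ids before anything destructive is run.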

Tonight I will set the OSD up as BlueStore again. Hopefully it will
not happen again.

In a bug report I found the tip to set "bluefs_allocator = stupid" in
ceph.conf. I did that as well and restarted all OSDs afterwards. So maybe
this prevents the problem from happening again.

Best
Karsten


On 20.02.2018 16:03, Eugen Block wrote:
> Alright, good luck!
> The results would be interesting. :-)


Ecologic Institut gemeinnuetzige GmbH
Pfalzburger Str. 43/44, D-10717 Berlin
Geschaeftsfuehrerin / Director: Dr. Camilla Bausch
Sitz der Gesellschaft / Registered Office: Berlin (Germany)
Registergericht / Court of Registration: Amtsgericht Berlin (Charlottenburg), 
HRB 57947
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Missing clones

2018-02-20 Thread Eugen Block

Alright, good luck!
The results would be interesting. :-)



Re: [ceph-users] Missing clones

2018-02-20 Thread Karsten Becker
Hi Eugen,

yes, I also see the rbd_data number changing. That may be caused by me
deleting snapshots and trying to move VMs over to another pool which
is not affected.

Currently I'm trying to move the Finance VM, a very old VM that was
created as one of the first VMs and is still alive (the only one of
that age). Maybe it really is a problem of "old" VM formats, as
mentioned in the links somebody sent, where snapshots had wrong/old
bits that a newer Ceph could no longer understand.

We'll see... the VM is large and currently copying... if the error also
gets copied, the VM format/age is the cause. If not, ... hm...   :-D

Nevertheless thank you for your help!
Karsten





Re: [ceph-users] Missing clones

2018-02-20 Thread Eugen Block
I'm not quite sure how to interpret this, but there are different  
objects referenced. From the first log output you pasted:


2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9  
10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:head  
expected clone  
10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:64e 1 missing


From the failed PG import the logs mention two different objects:


Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head#
snapset 0=[]:{}
Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18#

And your last log output has another two different objects:

Write #10:9df3943b:::rbd_data.e57feb238e1f29.0003c2e1:head#
snapset 0=[]:{}
Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:19#


So in total we're seeing five different rbd_data objects here:

 - rbd_data.2313975238e1f29
 - rbd_data.f5b8603d1b58ba
 - rbd_data.966489238e1f29
 - rbd_data.e57feb238e1f29
 - rbd_data.4401c7238e1f29

This doesn't make much sense to me yet. Which ones belong to your
corrupted VM? Do you have a backup of the VM in case the repair fails?
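
Collecting the distinct prefixes by hand is error-prone, so here is a minimal
sketch of how they could be pulled out of saved log output. The function name
list_rbd_prefixes is made up for illustration, and the sample lines are copied
from the log excerpts quoted in this thread.

```shell
#!/usr/bin/env bash
# Print each distinct rbd_data prefix found on stdin, one per line.
list_rbd_prefixes() {
    grep -oE 'rbd_data\.[0-9a-f]+' | sort -u
}

# Sample lines copied from the log excerpts in this thread.
sample_log='repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:head expected clone
Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head#
Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18#
Write #10:9df3943b:::rbd_data.e57feb238e1f29.0003c2e1:head#
Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:19#
Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:23#'

printf '%s\n' "$sample_log" | list_rbd_prefixes
```

Each prefix that comes out corresponds to one RBD image (its
block_name_prefix), so the list shows how many images are involved.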




Re: [ceph-users] Missing clones

2018-02-20 Thread Karsten Becker
Nope:

> Write #10:9df3943b:::rbd_data.e57feb238e1f29.0003c2e1:head#
> snapset 0=[]:{}
> Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:19#
> Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:23#
> Write #10:9df399dd:::rbd_data.4401c7238e1f29.050d:head#
> snapset 612=[23,22,15]:{19=[15],23=[23,22]}
> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function 'void 
> SnapMapper::add_oid(const hobject_t&, const std::set&, 
> MapCacher::Transaction ceph::buffer::list>*)' thread 7fd45147a400 time 2018-02-20 13:56:20.672430
> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED assert(r 
> == -2)
>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous 
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> const*)+0x102) [0x7fd4478c68f2]
>  2: (SnapMapper::add_oid(hobject_t const&, std::set std::less, std::allocator > const&, 
> MapCacher::Transaction std::char_traits, std::allocator >, ceph::buffer::list>*)+0x8e9) 
> [0x556930765fe9]
>  3: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, 
> ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x5569304ca01b]
>  4: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, 
> OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x5569304caae8]
>  5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, 
> std::__cxx11::basic_string >, ObjectStore::Sequencer&)+0x1135) [0x5569304d12f5]
>  6: (main()+0x3909) [0x556930432349]
>  7: (__libc_start_main()+0xf1) [0x7fd444d252b1]
>  8: (_start()+0x2a) [0x5569304ba01a]
>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
> interpret this.
> *** Caught signal (Aborted) **
>  in thread 7fd45147a400 thread_name:ceph-objectstor
>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous 
> (stable)
>  1: (()+0x913f14) [0x556930ae1f14]
>  2: (()+0x110c0) [0x7fd44619e0c0]
>  3: (gsignal()+0xcf) [0x7fd444d37fcf]
>  4: (abort()+0x16a) [0x7fd444d393fa]
>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> const*)+0x28e) [0x7fd4478c6a7e]
>  6: (SnapMapper::add_oid(hobject_t const&, std::set std::less, std::allocator > const&, 
> MapCacher::Transaction std::char_traits, std::allocator >, ceph::buffer::list>*)+0x8e9) 
> [0x556930765fe9]
>  7: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, 
> ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x5569304ca01b]
>  8: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, 
> OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x5569304caae8]
>  9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, 
> std::__cxx11::basic_string >, ObjectStore::Sequencer&)+0x1135) [0x5569304d12f5]
>  10: (main()+0x3909) [0x556930432349]
>  11: (__libc_start_main()+0xf1) [0x7fd444d252b1]
>  12: (_start()+0x2a) [0x5569304ba01a]
> Aborted



What I also do not understand: if I take your approach to finding out
what is stored in the PG, I no longer get a match with my PG ID.

If I take the "rbd info" approach posted by Mykola Golub, I get a
match - unfortunately it is the most important VM on our system, the one
that holds our finance software.

Best
Karsten










Re: [ceph-users] Missing clones

2018-02-20 Thread Eugen Block
And does the re-import of the PG work? From the logs I assumed that  
the snapshot(s) prevented a successful import, but now that they are  
deleted it could work.




Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi Eugen,

hmmm, that should be:

> rbd -p cpVirtualMachines list | while read LINE; do osdmaptool 
> --test-map-object $LINE --pool 10 osdmap 2>&1; rbd snap ls 
> cpVirtualMachines/$LINE | grep -v SNAPID | awk '{ print $2 }' | while read 
> LINE2; do echo "$LINE"; osdmaptool --test-map-object $LINE2 --pool 10 osdmap 
> 2>&1; done; done | less

It's a Proxmox system. There were only two snapshots on the PG, which I
have now deleted. Now nothing is displayed for the PG... is that possible?
Unfortunately, a repair still fails...

Best & thank you for the hint!
Karsten



On 19.02.2018 22:42, Eugen Block wrote:
>> BTW - how can I find out, which RBDs are affected by this problem. Maybe
>> a copy/remove of the affected RBDs could help? But how to find out to
>> which RBDs this PG belongs to?
> 
> Depending on how many PGs your cluster/pool has, you could dump your
> osdmap and then run the osdmaptool [1] for every rbd object in your pool
> and grep for the affected PG. That would be quick for a few objects, I
> guess:
> 
> ceph1:~ # ceph osd getmap -o /tmp/osdmap
> 
> ceph1:~ # osdmaptool --test-map-object image1 --pool 5 /tmp/osdmap
> osdmaptool: osdmap file '/tmp/osdmap'
>  object 'image1' -> 5.2 -> [0]
> 
> ceph1:~ # osdmaptool --test-map-object image2 --pool 5 /tmp/osdmap
> osdmaptool: osdmap file '/tmp/osdmap'
>  object 'image2' -> 5.f -> [0]
> 
> 
> [1]
> https://www.hastexo.com/resources/hints-and-kinks/which-osd-stores-specific-rados-object/
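
The "grep for the affected PG" step from the quoted suggestion can be sketched
as a small filter. The function name objects_in_pg is made up for illustration,
and the output format is assumed to match the osdmaptool example quoted above.
Note that RBD image data is stored in objects named after the image's
block_name_prefix (rbd_data.<id>.<object number>), not after the image name,
so the real object names are what should be fed to osdmaptool.

```shell
#!/usr/bin/env bash
# Read "osdmaptool --test-map-object" output on stdin and print the names
# of the objects that map to the PG given as $1.
# The PG id is compared as a string so that 5.2 and 5.20 stay distinct.
objects_in_pg() {
    local pgid=$1
    awk -v pg="$pgid" '$1 == "object" && ($4 "") == (pg "") { print $2 }' | tr -d "'"
}

# Sample output copied from the example quoted above.
sample_output="osdmaptool: osdmap file '/tmp/osdmap'
 object 'image1' -> 5.2 -> [0]
osdmaptool: osdmap file '/tmp/osdmap'
 object 'image2' -> 5.f -> [0]"

printf '%s\n' "$sample_output" | objects_in_pg 5.f   # prints: image2
```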

Re: [ceph-users] Missing clones

2018-02-19 Thread Mykola Golub
On Mon, Feb 19, 2018 at 10:17:55PM +0100, Karsten Becker wrote:
> BTW - how can I find out, which RBDs are affected by this problem. Maybe
> a copy/remove of the affected RBDs could help? But how to find out to
> which RBDs this PG belongs to?

In this case rbd_data.966489238e1f29.250b looks like the
problem object. To find out which RBD image it belongs to, you can run
the `rbd info <pool>/<image>` command for every image in the pool, looking
at the block_name_prefix field, until you find 'rbd_data.966489238e1f29'.
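
A minimal sketch of that loop, for illustration only: the function names
prefix_from_rbd_info and find_image_by_prefix are made up, and the sketch
assumes that `rbd info` prints a line of the form
"block_name_prefix: rbd_data.<id>". The pool name in the usage comment is the
one mentioned elsewhere in this thread.

```shell
#!/usr/bin/env bash
# Extract the block_name_prefix value from "rbd info" output on stdin.
prefix_from_rbd_info() {
    awk '$1 == "block_name_prefix:" { print $2 }'
}

# Print every image in pool $1 whose block_name_prefix equals $2.
find_image_by_prefix() {
    local pool=$1 wanted=$2 image
    rbd -p "$pool" ls | while IFS= read -r image; do
        if [ "$(rbd info "$pool/$image" | prefix_from_rbd_info)" = "$wanted" ]; then
            echo "$image"
        fi
    done
}

# Usage (needs access to the cluster, so it is not run here):
#   find_image_by_prefix cpVirtualMachines rbd_data.966489238e1f29
```

This runs one `rbd info` call per image, which should be fine for a pool with
a modest number of images.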


-- 
Mykola Golub


Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block

> BTW - how can I find out which RBDs are affected by this problem? Maybe
> a copy/remove of the affected RBDs could help? But how do I find out
> which RBDs this PG belongs to?


Depending on how many PGs your cluster/pool has, you could dump your  
osdmap and then run the osdmaptool [1] for every rbd object in your  
pool and grep for the affected PG. That would be quick for a few  
objects, I guess:


ceph1:~ # ceph osd getmap -o /tmp/osdmap

ceph1:~ # osdmaptool --test-map-object image1 --pool 5 /tmp/osdmap
osdmaptool: osdmap file '/tmp/osdmap'
 object 'image1' -> 5.2 -> [0]

ceph1:~ # osdmaptool --test-map-object image2 --pool 5 /tmp/osdmap
osdmaptool: osdmap file '/tmp/osdmap'
 object 'image2' -> 5.f -> [0]


[1]  
https://www.hastexo.com/resources/hints-and-kinks/which-osd-stores-specific-rados-object/
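
The per-object check above can be wrapped in a small loop. This is a rough sketch and assumes the output format shown in the example (` object 'image1' -> 5.2 -> [0]`). The helper names and the default `/tmp/osdmap` path are illustrative, and the list of object names has to come from elsewhere, for example from `rados -p <pool> ls`.

```python
import re
import subprocess

# Matches e.g. " object 'image1' -> 5.2 -> [0]" as printed in the example above.
_MAPPING = re.compile(r"object '(?P<name>.+)' -> (?P<pg>\S+) -> ")


def pg_of(osdmaptool_output):
    """Return the PG id from `osdmaptool --test-map-object` output, or None."""
    for line in osdmaptool_output.splitlines():
        match = _MAPPING.search(line)
        if match:
            return match.group("pg")
    return None


def objects_in_pg(names, pool_id, pgid, osdmap="/tmp/osdmap",
                  run=subprocess.check_output):
    """Return the object names that the dumped osdmap places in `pgid`."""
    hits = []
    for name in names:
        out = run(["osdmaptool", "--test-map-object", name,
                   "--pool", str(pool_id), osdmap]).decode()
        if pg_of(out) == pgid:
            hits.append(name)
    return hits
```

Running one `osdmaptool` process per object is slow for large pools, which matches the remark above that this is only quick for a few objects.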



Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
BTW - how can I find out which RBDs are affected by this problem? Maybe
a copy/remove of the affected RBDs could help? But how do I find out
which RBDs this PG belongs to?

Best
Karsten


Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi.

Thank you for the tip. I just tried... but unfortunately the import aborts:

> Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head#
> snapset 0=[]:{}
> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18#
> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:24#
> Write #10:9de973fe:::rbd_data.966489238e1f29.250b:head#
> snapset 628=[24,21,17]:{18=[17],24=[24,21]}
> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: In function 'void SnapMapper::add_oid(const hobject_t&, const std::set<snapid_t>&, MapCacher::Transaction<std::__cxx11::basic_string<char>, ceph::buffer::list>*)' thread 7facba7de400 time 2018-02-19 19:24:18.917515
> /home/builder/source/ceph-12.2.2/src/osd/SnapMapper.cc: 246: FAILED assert(r == -2)
>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7facb0c2a8f2]
>  2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
>  3: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x55eef35f901b]
>  4: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x55eef35f9ae8]
>  5: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
>  6: (main()+0x3909) [0x55eef3561349]
>  7: (__libc_start_main()+0xf1) [0x7facae0892b1]
>  8: (_start()+0x2a) [0x55eef35e901a]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> *** Caught signal (Aborted) **
>  in thread 7facba7de400 thread_name:ceph-objectstor
>  ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)
>  1: (()+0x913f14) [0x55eef3c10f14]
>  2: (()+0x110c0) [0x7facaf5020c0]
>  3: (gsignal()+0xcf) [0x7facae09bfcf]
>  4: (abort()+0x16a) [0x7facae09d3fa]
>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x7facb0c2aa7e]
>  6: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> > const&, MapCacher::Transaction<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list>*)+0x8e9) [0x55eef3894fe9]
>  7: (get_attrs(ObjectStore*, coll_t, ghobject_t, ObjectStore::Transaction*, ceph::buffer::list&, OSDriver&, SnapMapper&)+0xafb) [0x55eef35f901b]
>  8: (ObjectStoreTool::get_object(ObjectStore*, coll_t, ceph::buffer::list&, OSDMap&, bool*, ObjectStore::Sequencer&)+0x738) [0x55eef35f9ae8]
>  9: (ObjectStoreTool::do_import(ObjectStore*, OSDSuperblock&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ObjectStore::Sequencer&)+0x1135) [0x55eef36002f5]
>  10: (main()+0x3909) [0x55eef3561349]
>  11: (__libc_start_main()+0xf1) [0x7facae0892b1]
>  12: (_start()+0x2a) [0x55eef35e901a]
> Aborted

Best
Karsten


Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block

Could [1] be of interest?
Exporting the intact PG and importing it back to the respective OSD
sounds promising.


[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-July/019673.html
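
For orientation, the export/import idea maps onto two `ceph-objectstore-tool` invocations. The sketch below only builds and prints the command lines (a dry run) and executes nothing. It assumes the default data path `/var/lib/ceph/osd/ceph-<id>`; the OSD ids, the dump file name and the helper names are placeholders, not values confirmed in this thread. The tool only works on an OSD whose daemon is stopped.

```python
import shlex


def export_cmd(osd_id, pgid, dump_file):
    """Build the argv that exports one PG from a stopped OSD into dump_file."""
    return ["ceph-objectstore-tool",
            "--data-path", f"/var/lib/ceph/osd/ceph-{osd_id}",
            "--pgid", pgid, "--op", "export", "--file", dump_file]


def import_cmd(osd_id, dump_file):
    """Build the argv that imports a previously exported PG into a stopped OSD."""
    return ["ceph-objectstore-tool",
            "--data-path", f"/var/lib/ceph/osd/ceph-{osd_id}",
            "--op", "import", "--file", dump_file]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # OSD ids 29 and 30 and the dump path are placeholders.
    for argv in (export_cmd(29, "10.7b9", "/root/pg10.7b9.export"),
                 import_cmd(30, "/root/pg10.7b9.export")):
        print(shlex.join(argv))
```

Printing the commands first leaves room to review them by hand before anything touches an OSD's object store.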


--
Eugen Block voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail  : ebl...@nde.ag

Vorsitzende des Aufsichtsrates: Angelika Mozdzen
  Sitz und Registergericht: Hamburg, HRB 90934
  Vorstand: Jens-U. Mozdzen
   USt-IdNr. DE 814 013 983



Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block
When we ran our test cluster with size 2 I experienced a similar
issue, but that was in Hammer. There I could find the corresponding PG
data in the filesystem and copy it to the damaged PG. But now that we
also run BlueStore on Luminous, I don't know yet how to fix this kind
of issue. Maybe someone else can share some thoughts on this.





Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi.

We have size=3 min_size=2.

But this "upgrade" was only done over the weekend. We had size=2
min_size=1 before.

Best
Karsten





Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block

Hi,

just to rule out the obvious, which size does the pool have? You
aren't running it with size = 2, are you?



Quoting Karsten Becker:


Hi,

I have one damaged PG in my cluster. All OSDs are BlueStore. How do I
fix this?

2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:head expected clone 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:64e 1 missing
2018-02-19 11:00:23.183707 osd.29 [INF] repair 10.7b9 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:head 1 missing clone(s)
2018-02-19 11:01:18.074666 mon.0 [ERR] Health check update: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
2018-02-19 11:01:11.856529 osd.29 [ERR] 10.7b9 repair 1 errors, 0 fixed
2018-02-19 11:01:24.333533 mon.0 [ERR] overall HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent


"ceph pg repair 10.7b9" fails and is not able to fix it. A manually
started scrub "ceph pg scrub 10.7b9" fails as well.

Best from Berlin/Germany
Karsten

