On Wed, 9 Aug 2017, Brad Hubbard wrote:
> Wee
> 
> On Wed, Aug 9, 2017 at 12:41 AM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
> >
> >
> >
> > The --debug indeed comes up with something
> >  bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, device location [0x15a0170000~1000], logical extent 0x0~1000,
> >  bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, device location [0x2daea0000~1000], logical extent 0x0~1000,

What about the 3rd OSD?

It would be interesting to capture the fsck output for one of these.  
Stop the OSD, and then run

 ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --log-file out \
        --debug-bluestore 30 --no-log-to-stderr

That'll generate a pretty huge log, but should include dumps of onode 
metadata and will hopefully include something else with the checksum of 
0x100ac314 so we can get some clue as to where the bad data came from.
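Once it finishes, grepping the log for the bad checksum should pull out the
interesting bits; a rough sketch (assuming you kept the log file name "out"
from the command above):

 grep -n -B 20 -A 5 '0x100ac314' out

Any onode or blob dump near those hits is the part we care about.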

Thanks!
sage


> >
> > I don't know how to interpret this, but am I correct to understand that
> > data has been written across the cluster to these 3 OSDs and all 3 have
> > somehow received something different?
> 
> Did you run this command on OSD 0? What was the output in that case?
> 
> Possibly. All we currently know for sure is that the crc32c checksums for the
> object on OSDs 12 and 9 do not match the checksum the code expects when we
> attempt to read the object
> #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#. There seems to be
> some history behind this based on your previous emails regarding these OSDs
> (12, 9, 0, and possibly 13). Could you give us as much detail as possible
> about how this issue came about and what you have done in the interim to try
> to resolve it?
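> 
> In the meantime it might also be worth capturing what the cluster itself
> reports for that pg. Assuming the results of the last deep-scrub are still
> available, something along these lines should list the per-shard errors:
> 
> # rados list-inconsistent-obj 17.36 --format=json-pretty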
> 
> When was the first indication there was a problem with pg 17.36? Did this
> correspond with any significant event?
> 
> Are these OSDs all on separate hosts?
> 
> It's possible ceph-bluestore-tool may help here but I would hold off on that
> option until we understand the issue better.
> 
> 
> >
> >
> > size=4194304 object_info:
> > 17:6ca10b29:::rbd_data.1fff61238e1f29.0000000000009923:head(5387'35157 client.2096993.0:78941 dirty|data_digest|omap_digest s 4194304 uv 35356 dd f53dff2e od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.00000000000011b4:head#
> > size=4194304 object_info:
> > 17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.00000000000011b4:head(5163'7136 client.2074638.1:483264 dirty|data_digest|omap_digest s 4194304 uv 7418 dd 43d61c5d od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca13bed:::rbd_data.1f114174b0dc51.00000000000002c6:head#
> > size=4194304 object_info:
> > 17:6ca13bed:::rbd_data.1f114174b0dc51.00000000000002c6:head(5236'7640 client.2074638.1:704364 dirty|data_digest|omap_digest s 4194304 uv 7922 dd 3bcff64d od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca1a791:::rbd_data.1fff61238e1f29.000000000000f101:head#
> > size=4194304 object_info:
> > 17:6ca1a791:::rbd_data.1fff61238e1f29.000000000000f101:head(5387'35553 client.2096993.0:123721 dirty|data_digest|omap_digest s 4194304 uv 35752 dd f9bc0fbd od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#
> > size=4194304 object_info:
> > 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4(5390'56613 client.2096907.1:3222443 dirty|omap_digest s 4194304 uv 55477 od ffffffff alloc_hint [0 0 0])
> > 2017-08-08 15:57:45.078348 7fad08fa4100 -1 bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, device location [0x15a0170000~1000], logical extent 0x0~1000, object #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#
> > export_files error -5
> > 2017-08-08 15:57:45.081279 7fad08fa4100  1 bluestore(/var/lib/ceph/osd/ceph-12) umount
> > 2017-08-08 15:57:45.150210 7fad08fa4100  1 freelist shutdown
> > 2017-08-08 15:57:45.150307 7fad08fa4100  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
> > 2017-08-08 15:57:45.152099 7fad08fa4100  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:343] Shutdown complete
> > 2017-08-08 15:57:45.184742 7fad08fa4100  1 bluefs umount
> > 2017-08-08 15:57:45.203674 7fad08fa4100  1 bdev(0x7fad0b260e00 /var/lib/ceph/osd/ceph-12/block) close
> > 2017-08-08 15:57:45.442499 7fad08fa4100  1 bdev(0x7fad0b0a5a00 /var/lib/ceph/osd/ceph-12/block) close
> >
> > grep -i export_files strace.out -C 10
> >
> > 814  16:08:19.261144 futex(0x7fffea9378c0, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000010>
> > 6814  16:08:19.261242 futex(0x7f4832bb60bc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f4832bb60b8, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 <0.000012>
> > 6814  16:08:19.261281 madvise(0x7f4843bf0000, 524288, MADV_DONTNEED <unfinished ...>
> > 6815  16:08:19.261382 <... futex resumed> ) = 0 <14.990766>
> > 6814  16:08:19.261412 <... madvise resumed> ) = 0 <0.000123>
> > 6814  16:08:19.261446 madvise(0x7f4843b70000, 1048576, MADV_DONTNEED <unfinished ...>
> > 6815  16:08:19.261474 futex(0x7f4832bb6038, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> > 6814  16:08:19.261535 <... madvise resumed> ) = 0 <0.000067>
> > 6815  16:08:19.261557 <... futex resumed> ) = 0 <0.000069>
> > 6815  16:08:19.261647 futex(0x7f4832bb60bc, FUTEX_WAIT_PRIVATE, 45, NULL <unfinished ...>
> > 6814  16:08:19.261700 write(2</dev/pts/0>, "export_files error ", 19) = 19 <0.000024>
> > 6814  16:08:19.261774 write(2</dev/pts/0>, "-5", 2) = 2 <0.000018>
> > 6814  16:08:19.261841 write(2</dev/pts/0>, "\n", 1) = 1 <0.000016>
> > 6814  16:08:19.262191 madvise(0x7f4839106000, 16384, MADV_DONTNEED) = 0 <0.000015>
> > 6814  16:08:19.262229 madvise(0x7f483914e000, 16384, MADV_DONTNEED) = 0 <0.000012>
> > 6814  16:08:19.262295 madvise(0x7f48389e6000, 49152, MADV_DONTNEED) = 0 <0.000013>
> > 6814  16:08:19.262498 madvise(0x7f48390ea000, 16384, MADV_DONTNEED) = 0 <0.000013>
> > 6814  16:08:19.262538 madvise(0x7f48390ce000, 16384, MADV_DONTNEED) = 0 <0.000012>
> > 6814  16:08:19.262580 madvise(0x7f483c228000, 24576, MADV_DONTNEED) = 0 <0.000012>
> > 6814  16:08:19.263047 madvise(0x7f48393d8000, 16384, MADV_DONTNEED) = 0 <0.000013>
> > 6814  16:08:19.263081 madvise(0x7f48393d8000, 32768, MADV_DONTNEED) = 0 <0.000016>
> >
> >
> > I was curious how this would compare to osd.9:
> >
> > object_info:
> > 17:6ca13bed:::rbd_data.1f114174b0dc51.00000000000002c6:head(5236'7640 client.2074638.1:704364 dirty|data_digest|omap_digest s 4194304 uv 7922 dd 3bcff64d od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca1a791:::rbd_data.1fff61238e1f29.000000000000f101:head#
> > size=4194304 object_info:
> > 17:6ca1a791:::rbd_data.1fff61238e1f29.000000000000f101:head(5387'35553 client.2096993.0:123721 dirty|data_digest|omap_digest s 4194304 uv 35752 dd f9bc0fbd od ffffffff alloc_hint [4194304 4194304 0])
> > data section offset=0 len=1048576
> > data section offset=1048576 len=1048576
> > data section offset=2097152 len=1048576
> > data section offset=3145728 len=1048576
> > attrs size 2
> > omap map size 0
> > Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#
> > size=4194304 object_info:
> > 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4(5390'56613 client.2096907.1:3222443 dirty|omap_digest s 4194304 uv 55477 od ffffffff alloc_hint [0 0 0])
> > 2017-08-08 16:22:00.893216 7f94e10f5100 -1 bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, device location [0x2daea0000~1000], logical extent 0x0~1000, object #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#
> > export_files error -5
> > 2017-08-08 16:22:00.895439 7f94e10f5100  1 bluestore(/var/lib/ceph/osd/ceph-9) umount
> > 2017-08-08 16:22:00.963774 7f94e10f5100  1 freelist shutdown
> > 2017-08-08 16:22:00.963861 7f94e10f5100  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
> > 2017-08-08 16:22:00.968438 7f94e10f5100  4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:343] Shutdown complete
> > 2017-08-08 16:22:00.984583 7f94e10f5100  1 bluefs umount
> > 2017-08-08 16:22:01.026784 7f94e10f5100  1 bdev(0x7f94e3670e00 /var/lib/ceph/osd/ceph-9/block) close
> > 2017-08-08 16:22:01.243361 7f94e10f5100  1 bdev(0x7f94e34b5a00 /var/lib/ceph/osd/ceph-9/block) close
> >
> >
> > 23555 16:26:31.336061 io_getevents(139955679129600, 1, 16,  <unfinished ...>
> > 23552 16:26:31.336081 futex(0x7ffe7e4c9210, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000155>
> > 23552 16:26:31.336452 futex(0x7f49fb4d20bc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f49fb4d20b8, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 <0.000129>
> > 23553 16:26:31.336637 <... futex resumed> ) = 0 <16.434259>
> > 23553 16:26:31.336758 futex(0x7f49fb4d2038, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> > 23552 16:26:31.336801 madvise(0x7f4a0cafa000, 2555904, MADV_DONTNEED <unfinished ...>
> > 23553 16:26:31.336915 <... futex resumed> ) = 0 <0.000113>
> > 23552 16:26:31.336959 <... madvise resumed> ) = 0 <0.000148>
> > 23553 16:26:31.337040 futex(0x7f49fb4d20bc, FUTEX_WAIT_PRIVATE, 55, NULL <unfinished ...>
> > 23552 16:26:31.337070 madvise(0x7f4a0ca7a000, 3080192, MADV_DONTNEED) = 0 <0.000180>
> > 23552 16:26:31.337424 write(2</dev/pts/1>, "export_files error ", 19) = 19 <0.000104>
> > 23552 16:26:31.337615 write(2</dev/pts/1>, "-5", 2) = 2 <0.000017>
> > 23552 16:26:31.337674 write(2</dev/pts/1>, "\n", 1) = 1 <0.000037>
> > 23552 16:26:31.338270 madvise(0x7f4a01ae4000, 16384, MADV_DONTNEED) = 0 <0.000020>
> > 23552 16:26:31.338320 madvise(0x7f4a018cc000, 49152, MADV_DONTNEED) = 0 <0.000014>
> > 23552 16:26:31.338561 madvise(0x7f4a0770a000, 24576, MADV_DONTNEED) = 0 <0.000015>
> > 23552 16:26:31.339161 madvise(0x7f4a02102000, 16384, MADV_DONTNEED) = 0 <0.000015>
> > 23552 16:26:31.339201 madvise(0x7f4a02132000, 16384, MADV_DONTNEED) = 0 <0.000013>
> > 23552 16:26:31.339235 madvise(0x7f4a02102000, 32768, MADV_DONTNEED) = 0 <0.000014>
> > 23552 16:26:31.339331 madvise(0x7f4a01df8000, 16384, MADV_DONTNEED) = 0 <0.000019>
> > 23552 16:26:31.339372 madvise(0x7f4a01df8000, 32768, MADV_DONTNEED) = 0 <0.000013>
> >
> >
> > -----Original Message-----
> > From: Brad Hubbard [mailto:bhubb...@redhat.com]
> > Sent: 07 August 2017 02:34
> > To: Marc Roos
> > Cc: ceph-users
> > Subject: Re: [ceph-users] Pg inconsistent / export_files error -5
> >
> >
> >
> > On Sat, Aug 5, 2017 at 1:21 AM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
> >>
> >> I have a placement group inconsistency, and saw a manual describing how
> >> you can export the pg and import it on another OSD. But I am getting an
> >> export error on every OSD.
> >>
> >> What does this export_files error -5 actually mean? I thought 3 copies
> >
> > #define EIO              5      /* I/O error */
> >
> >> should be enough to secure your data.
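> >
> > (As a side note, if you want to double-check an errno number yourself,
> > a one-liner along these lines should do it:
> >
> > # python -c 'import errno, os; print(errno.errorcode[5] + ": " + os.strerror(5))'
> >
> > which should print "EIO: Input/output error".)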
> >>
> >>
> >>> PG_DAMAGED Possible data damage: 1 pg inconsistent pg 17.36 is
> >>> active+clean+inconsistent, acting [9,0,12]
> >>
> >>
> >>> 2017-08-04 05:39:51.534489 7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 soid 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4: failed to pick suitable object info
> >>> 2017-08-04 05:41:12.715393 7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 deep-scrub 3 errors
> >>> 2017-08-04 15:21:12.445799 7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 soid 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4: failed to pick suitable object info
> >>> 2017-08-04 15:22:35.646635 7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 repair 3 errors, 0 fixed
> >>
> >> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 17.36 \
> >>        --op export --file /tmp/recover.17.36
> >
> > Can you run this command under strace like so?
> >
> > # strace -fvttyyTo /tmp/strace.out -s 1024 ceph-objectstore-tool --data-path
> > /var/lib/ceph/osd/ceph-12 --pgid 17.36 --op export --file /tmp/recover.17.36
> >
> > Then see if you can find which syscall is returning EIO.
> >
> > # grep "= \-5" /tmp/strace.out
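> >
> > If that matches anything, pulling in some surrounding context should show
> > which file descriptor (and hence which file) the failing call was made
> > against, e.g. something like:
> >
> > # grep -B 20 "= \-5" /tmp/strace.out | grep -E 'open|read|pread'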
> >
> >>
> >> ...
> >> Read #17:6c9f811c:::rbd_data.1b42f52ae8944a.0000000000001a32:head#
> >> Read #17:6ca035fc:::rbd_data.1fff61238e1f29.000000000000b31a:head#
> >> Read #17:6ca0b4f8:::rbd_data.1fff61238e1f29.0000000000006fcc:head#
> >> Read #17:6ca0ffbc:::rbd_data.1fff61238e1f29.000000000000a214:head#
> >> Read #17:6ca10b29:::rbd_data.1fff61238e1f29.0000000000009923:head#
> >> Read #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.00000000000011b4:head#
> >> Read #17:6ca13bed:::rbd_data.1f114174b0dc51.00000000000002c6:head#
> >> Read #17:6ca1a791:::rbd_data.1fff61238e1f29.000000000000f101:head#
> >> Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4#
> >> export_files error -5
> >
> > Running the command with "--debug" appended will give more verbose output,
> > which may shed further light as well.
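> >
> > That is just the same invocation with the flag appended, i.e. something like:
> >
> > # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 17.36 \
> >        --op export --file /tmp/recover.17.36 --debug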
> >
> >
> >
> >
> > --
> > Cheers,
> > Brad
> >
> >
> 
> 
> 
> -- 
> Cheers,
> Brad
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
