Just realized there is a file called superblock in the ceph directory. ceph-1's
and ceph-2's superblock files are identical, and ceph-6's and ceph-7's are
identical, but not across the two groups. When I originally created the OSDs, I
created ceph-0 through 5. Can the superblock file be copied over from
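Before copying anything, it is worth verifying which superblock files really are byte-identical. A minimal sketch, assuming the usual /var/lib/ceph/osd/ceph-N/superblock layout (verify the paths on your own nodes):

```shell
# Hedged sketch: check whether two OSD superblock files are byte-identical.
# The /var/lib/ceph/osd/ceph-N/superblock paths are an assumption based on
# the standard FileStore data-dir layout.
same_superblock() {
  # cmp -s exits 0 only if the two files match byte for byte
  cmp -s "$1" "$2" && echo identical || echo different
}

# On a live node you would run, e.g.:
#   same_superblock /var/lib/ceph/osd/ceph-1/superblock /var/lib/ceph/osd/ceph-2/superblock
```

Whether a superblock can safely be copied between OSDs depends on whether it encodes per-OSD state, so comparing first at least tells you what the cluster itself considers interchangeable.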
Tried connecting the recovered OSD. It looks like some of the files in lost+found
are superblocks. Below is the log. What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9
Found the partition, but wasn't able to mount it right away... Did an
xfs_repair on that drive.
Got a bunch of messages like this... =(
entry "10a89fd.__head_AE319A25__0" in shortform directory 845908970
references non-existent inode 605294241, junking entry
Hello,
We have checked all the drives, and there is no problem with them. If a drive
were failing, I would expect the slow requests to also show up under normal
traffic, since the ceph cluster uses all the OSDs as primaries for some PGs.
But these slow requests are appearing
On Fri, 1 Sep 2017, Felix, Evan J wrote:
> Is there documentation about how to deal with a pool application
> association that is not one of cephfs, rbd, or rgw? We have multiple
> pools that have nothing to do with those applications, we just use the
> objects in them directly using the
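Since Luminous, a pool's application association does not have to be one of the built-in names; an arbitrary tag can be enabled on the pool. A minimal sketch, where "mypool" and "myapp" are placeholder names:

```shell
# Hedged sketch: associate a custom application tag with a pool (Luminous+).
# "mypool" and "myapp" are placeholders, not names from the thread.
# The helper only builds the command string, so the sketch can be shown
# (and checked) without a live cluster.
build_enable_cmd() {
  printf 'ceph osd pool application enable %s %s\n' "$1" "$2"
}

build_enable_cmd mypool myapp
```

On a live cluster you would run the printed command directly; the tag then silences the "application not enabled" health warning for pools used outside cephfs/rbd/rgw.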
Don't discount failing drives. You can have drives in a "ready-to-fail"
state that doesn't show up in SMART or anywhere else easy to track. When
backfilling, a drive uses sectors it may not normally use. I managed
a 1400-OSD cluster that would lose 1-3 drives in random nodes when I added
new
Hi David,
Well, most of our PGs will probably have to be reorganized, as we are
moving from 9 hosts to 3 chassis. But I was hoping to be able to throttle the
backfilling to the point where it has minimal impact on our user traffic.
Unfortunately I wasn't able to do it. I saw
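For reference, backfill pressure is usually throttled with the OSD recovery options; a minimal ceph.conf sketch (the values are illustrative, and on a running cluster the same options can be injected with `ceph tell osd.* injectargs`):

```ini
[osd]
; limit concurrent backfills per OSD (illustrative values)
osd_max_backfills = 1
osd_recovery_max_active = 1
; deprioritize recovery I/O relative to client I/O
osd_recovery_op_priority = 1
osd_client_op_priority = 63
```

Even at these settings, a single slow drive on the backfill path can still stall client requests, which is why the failing-drive theory keeps coming up in this thread.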
Hi:
I want to ask a question about the CEPH_IOC_SYNCIO flag.
I know that when using the O_SYNC flag or the O_DIRECT flag, the write call
takes two other code paths, different from the one taken with the CEPH_IOC_SYNCIO flag.
And I found the comments about CEPH_IOC_SYNCIO here:
/*
*
Looks like it has been rescued... Only 1 error, as we saw before in the smart
log!

# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos: 1508 GB, non-trimmed: 0 B, current rate: 0 B/s
     opos: 1508 GB, non-scraped: 0 B,
It is normal to have backfilling because the CRUSH map did change. The
host and the chassis each have a CRUSH id and their own weight, which is the
sum of the OSDs under them. By moving the host into the chassis you
changed the weight of the chassis, and that affects the PG placement even
though
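The "weight is the sum of the OSDs under it" point can be illustrated with a trivial calculation; the weights below are made up, in the TB-scaled units that `ceph osd tree` displays:

```shell
# Hedged illustration: a CRUSH bucket's weight is the sum of the items
# beneath it. The three OSD weights are hypothetical, in the TB-scaled
# units shown by "ceph osd tree".
host_weight=$(printf '%s\n' 1.81 1.81 1.81 | awk '{s += $1} END {printf "%.2f", s}')
echo "$host_weight"   # 5.43
```

Moving that host under a chassis bucket adds 5.43 to the chassis weight, which is exactly the change that triggers the PG remapping described above.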
Hi All,
Is there a known procedure to debug the PG state in case of problems like
this?
Best regards,
Yuri.
2017-08-28 14:05 GMT+03:00 Yuri Gorshkov :
> Hi.
>
> When trying to take down a host for maintenance purposes I encountered an
> I/O stall along with some PGs
Hi,
I have RAID0 for each disk; unfortunately my RAID controller doesn't support JBOD.
Apart from this I also run a separate cluster with Jewel 10.2.9 on RAID0,
and there is no such problem (I just tested it). Moreover, the cluster that
has this issue used to run Firefly with RAID0 and everything was fine.
Hello,
thank you very much for the hint, you are right!
Kind regards, Thomas
Marc Roos wrote on 30.08.2017 at 14:26:
>
> I had this also once. If you update all nodes and then systemctl restart
> 'ceph-osd@*' on all nodes, you should be fine. But first the monitors of
> course