Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
Original Message:
>
>
> On 7/25/19 7:49 AM, Sangwhan Moon wrote:
> > Hello,
> >
> > Original Message:
> >>
> >>
> >> On 7/25/19 6:49 AM, Sangwhan Moon wrote:
> >>> Hello,
> >>>
> >>> I've inherited a Ceph cluster from someone who has left zero
> >>> documentation or any handover. A couple days ago it decided to show the
> >>> entire company what it is capable of..
> >>>
> >>> The health report looks like this:
> >>>
> >>> [root@host mnt]# ceph -s
> >>>   cluster:
> >>>     id:     809718aa-3eac-4664-b8fa-38c46cdbfdab
> >>>     health: HEALTH_ERR
> >>>             1 MDSs report damaged metadata
> >>>             1 MDSs are read only
> >>>             2 MDSs report slow requests
> >>>             6 MDSs behind on trimming
> >>>             Reduced data availability: 2 pgs stale
> >>>             Degraded data redundancy: 2593/186803520 objects degraded
> >>>             (0.001%), 2 pgs degraded, 2 pgs undersized
> >>>             1 slow requests are blocked > 32 sec. Implicated osds
> >>>             716 stuck requests are blocked > 4096 sec. Implicated osds
> >>>             25,31,38
> >>
> >> I would start here:
> >>
> >>>
> >>>   services:
> >>>     mon: 3 daemons, quorum f,rook-ceph-mon2,rook-ceph-mon0
> >>>     mgr: a(active)
> >>>     mds: ceph-fs-2/2/2 up odd-fs-2/2/2 up {[ceph-fs:0]=ceph-fs-5b997cbf7b-5tjwh=up:active,[ceph-fs:1]=ceph-fs-5b997cbf7b-nstqz=up:active,[user-fs:0]=odd-fs-5668c75f9f-hflps=up:active,[user-fs:1]=odd-fs-5668c75f9f-jf59x=up:active}, 4 up:standby-replay
> >>>     osd: 39 osds: 39 up, 38 in
> >>>
> >>>   data:
> >>>     pools:   5 pools, 706 pgs
> >>>     objects: 91212k objects, 4415 GB
> >>>     usage:   10415 GB used, 13024 GB / 23439 GB avail
> >>>     pgs:     2593/186803520 objects degraded (0.001%)
> >>>              703 active+clean
> >>>              2   stale+active+undersized+degraded
> >>
> >> This is a problem! Can you check:
> >>
> >> $ ceph pg dump_stuck
> >>
> >> The PGs will start with a number like 8.1a where '8' it the pool ID.
> >>
> >> Then check:
> >>
> >> $ ceph df
> >>
> >> To which pools to those PGs belong?
> >>
> >> Then check:
> >>
> >> $ ceph pg query
> >>
> >> And the bottom somewhere should show why these PGs are not active. You
> >> might even want to try a restart of these OSDs involved with those two PGs.
> >
> > Thanks a lot for the suggestions - I just checked and it says that the
> > problematic PGs are 4.4f and 4.59 - but querying those seem result in the
> > following error:
> >
> > Error ENOENT: i don't have pgid 4.4f
> >
> > (same applies for 4.59 - they do seem to show up in "ceph pg ls" though.)
> >
> > In ceph pg ls, it shows that for these PGs UP, UP_PRIMARY ACTING,
> > ACTING_PRIMARY all only have one OSD associated with it. (24, 13 - although
> > both the PG ID mentioned above and these numbers probably don't help much
> > with the diagnosis) Should restarting be a safe thing to try first?
> >
> > ceph health detail says the following:
> >
> > MDS_DAMAGE 1 MDSs report damaged metadata
> >     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Metadata damage detected
> > MDS_READ_ONLY 1 MDSs are read only
> >     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): MDS in read-only mode
> > MDS_SLOW_REQUEST 2 MDSs report slow requests
> >     mdsuser-fs-5668c75f9f-hflps(mds.0): 3 slow requests are blocked > 30 sec
> >     mdsuser-fs-5668c75f9f-jf59x(mds.1): 980 slow requests are blocked > 30 sec
> > MDS_TRIM 6 MDSs behind on trimming
> >     mdsuser-fs-5668c75f9f-hflps(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
> >     mdsuser-fs-5668c75f9f-jf59x(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
> >     mdsuser-fs-5668c75f9f-h8p2t(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
> >     mdsuser-fs-5668c75f9f-7gs67(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
> >     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
> >     mdsceph-fs-5b997cbf7b-hmrxr(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
> > PG_AVAILABILITY Reduced data availability: 2 pgs stale
> >     pg 4.4f is stuck stale for 171783.855465, current state stale+active+undersized+degraded, last acting [24]
> >     pg 4.59 is stuck stale for 171751.961506, current state stale+active+undersized+degraded, last acting [13]
> > PG_DEGRADED Degraded data redundancy: 2593/186805106 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
> >     pg 4.4f is stuck undersized for 171797.245359, current state stale+active+undersized+degraded, last acting [24]
> >     pg 4.59 is stuck undersized for 171797.257707, current state stale+active+undersized+degraded, last acting [13]
>
> So where are osd.24 and osd.13?
>
> To which pool do these PGs belong?
>
> But these PGs are probably the root-cause of all the issues you are seeing.
> Both
Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
On 7/25/19 7:49 AM, Sangwhan Moon wrote:
> Hello,
>
> Original Message:
>>
>>
>> On 7/25/19 6:49 AM, Sangwhan Moon wrote:
>>> Hello,
>>>
>>> I've inherited a Ceph cluster from someone who has left zero documentation
>>> or any handover. A couple days ago it decided to show the entire company
>>> what it is capable of..
>>>
>>> The health report looks like this:
>>>
>>> [root@host mnt]# ceph -s
>>>   cluster:
>>>     id:     809718aa-3eac-4664-b8fa-38c46cdbfdab
>>>     health: HEALTH_ERR
>>>             1 MDSs report damaged metadata
>>>             1 MDSs are read only
>>>             2 MDSs report slow requests
>>>             6 MDSs behind on trimming
>>>             Reduced data availability: 2 pgs stale
>>>             Degraded data redundancy: 2593/186803520 objects degraded
>>>             (0.001%), 2 pgs degraded, 2 pgs undersized
>>>             1 slow requests are blocked > 32 sec. Implicated osds
>>>             716 stuck requests are blocked > 4096 sec. Implicated osds
>>>             25,31,38
>>
>> I would start here:
>>
>>>
>>>   services:
>>>     mon: 3 daemons, quorum f,rook-ceph-mon2,rook-ceph-mon0
>>>     mgr: a(active)
>>>     mds: ceph-fs-2/2/2 up odd-fs-2/2/2 up {[ceph-fs:0]=ceph-fs-5b997cbf7b-5tjwh=up:active,[ceph-fs:1]=ceph-fs-5b997cbf7b-nstqz=up:active,[user-fs:0]=odd-fs-5668c75f9f-hflps=up:active,[user-fs:1]=odd-fs-5668c75f9f-jf59x=up:active}, 4 up:standby-replay
>>>     osd: 39 osds: 39 up, 38 in
>>>
>>>   data:
>>>     pools:   5 pools, 706 pgs
>>>     objects: 91212k objects, 4415 GB
>>>     usage:   10415 GB used, 13024 GB / 23439 GB avail
>>>     pgs:     2593/186803520 objects degraded (0.001%)
>>>              703 active+clean
>>>              2   stale+active+undersized+degraded
>>
>> This is a problem! Can you check:
>>
>> $ ceph pg dump_stuck
>>
>> The PGs will start with a number like 8.1a where '8' it the pool ID.
>>
>> Then check:
>>
>> $ ceph df
>>
>> To which pools to those PGs belong?
>>
>> Then check:
>>
>> $ ceph pg query
>>
>> And the bottom somewhere should show why these PGs are not active. You
>> might even want to try a restart of these OSDs involved with those two PGs.
>
> Thanks a lot for the suggestions - I just checked and it says that the
> problematic PGs are 4.4f and 4.59 - but querying those seem result in the
> following error:
>
> Error ENOENT: i don't have pgid 4.4f
>
> (same applies for 4.59 - they do seem to show up in "ceph pg ls" though.)
>
> In ceph pg ls, it shows that for these PGs UP, UP_PRIMARY ACTING,
> ACTING_PRIMARY all only have one OSD associated with it. (24, 13 - although
> both the PG ID mentioned above and these numbers probably don't help much
> with the diagnosis) Should restarting be a safe thing to try first?
>
> ceph health detail says the following:
>
> MDS_DAMAGE 1 MDSs report damaged metadata
>     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Metadata damage detected
> MDS_READ_ONLY 1 MDSs are read only
>     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): MDS in read-only mode
> MDS_SLOW_REQUEST 2 MDSs report slow requests
>     mdsuser-fs-5668c75f9f-hflps(mds.0): 3 slow requests are blocked > 30 sec
>     mdsuser-fs-5668c75f9f-jf59x(mds.1): 980 slow requests are blocked > 30 sec
> MDS_TRIM 6 MDSs behind on trimming
>     mdsuser-fs-5668c75f9f-hflps(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
>     mdsuser-fs-5668c75f9f-jf59x(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
>     mdsuser-fs-5668c75f9f-h8p2t(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
>     mdsuser-fs-5668c75f9f-7gs67(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
>     mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
>     mdsceph-fs-5b997cbf7b-hmrxr(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
> PG_AVAILABILITY Reduced data availability: 2 pgs stale
>     pg 4.4f is stuck stale for 171783.855465, current state stale+active+undersized+degraded, last acting [24]
>     pg 4.59 is stuck stale for 171751.961506, current state stale+active+undersized+degraded, last acting [13]
> PG_DEGRADED Degraded data redundancy: 2593/186805106 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
>     pg 4.4f is stuck undersized for 171797.245359, current state stale+active+undersized+degraded, last acting [24]
>     pg 4.59 is stuck undersized for 171797.257707, current state stale+active+undersized+degraded, last acting [13]

So where are osd.24 and osd.13?

To which pool do these PGs belong?

But these PGs are probably the root-cause of all the issues you are seeing.

Wido

> REQUEST_SLOW 3 slow requests are blocked > 32 sec. Implicated osds
>     3 ops are blocked > 2097.15 sec
> REQUEST_STUCK 717 stuck requests are blocked > 4096 sec. Implicated osds
> 25,31,38
>     286 ops are blocked > 268435 sec
>     211 ops are blocked > 134218 sec
>
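A sketch of how the two questions above can be answered with the standard CLI, assuming a Luminous-era ceph client (the OSD and PG IDs are the ones from this thread, nothing here is from the original mail):

$ ceph osd find 24            # reports the host and CRUSH location of osd.24
$ ceph osd find 13            # same for osd.13
$ ceph pg map 4.4f            # the up/acting sets the monitors currently expect for this PG
$ ceph osd pool ls detail     # lists every pool with its numeric ID; the pool with ID 4 owns PGs 4.4f and 4.59
$ ceph df                     # per-pool usage, to see what actually lives in that pool

If "ceph osd find" points at hosts that are gone, or at OSDs that no longer exist, that would explain why "ceph pg query" returns ENOENT for these PGs.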
Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
Original Message:
> On Thu, 25 Jul 2019 13:49:22 +0900 Sangwhan Moon wrote:
>
>>     osd: 39 osds: 39 up, 38 in
>
> You might want to find that out OSD.

Thanks, I've identified the OSD and put it back in - doesn't seem to change
anything though. :(

Sangwhan
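For reference, marking an out OSD back in and checking whether the stale PGs react typically looks something like the sketch below; the <osd-id> placeholder stands for whichever OSD had been marked out, and whether this helps at all depends on whether that OSD ever held the two PGs in question:

$ ceph osd in <osd-id>        # put the OSD back into the data distribution
$ ceph -s                     # should now report "39 osds: 39 up, 39 in"
$ ceph pg dump_stuck stale    # check whether pg 4.4f and 4.59 are still listed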
Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
On Thu, 25 Jul 2019 13:49:22 +0900 Sangwhan Moon wrote:

>     osd: 39 osds: 39 up, 38 in

You might want to find that "out" OSD.

-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Rakuten Mobile Inc.
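A sketch of how to spot which of the 39 OSDs is the one marked out (standard CLI, not from the original mail):

$ ceph osd tree               # an OSD that has been marked out shows 0 in the REWEIGHT column
$ ceph osd dump | grep osd.   # the osdmap also lists each OSD's up/down and in/out state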
Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
Hello,

Original Message:
>
>
> On 7/25/19 6:49 AM, Sangwhan Moon wrote:
> > Hello,
> >
> > I've inherited a Ceph cluster from someone who has left zero documentation
> > or any handover. A couple days ago it decided to show the entire company
> > what it is capable of..
> >
> > The health report looks like this:
> >
> > [root@host mnt]# ceph -s
> >   cluster:
> >     id:     809718aa-3eac-4664-b8fa-38c46cdbfdab
> >     health: HEALTH_ERR
> >             1 MDSs report damaged metadata
> >             1 MDSs are read only
> >             2 MDSs report slow requests
> >             6 MDSs behind on trimming
> >             Reduced data availability: 2 pgs stale
> >             Degraded data redundancy: 2593/186803520 objects degraded
> >             (0.001%), 2 pgs degraded, 2 pgs undersized
> >             1 slow requests are blocked > 32 sec. Implicated osds
> >             716 stuck requests are blocked > 4096 sec. Implicated osds
> >             25,31,38
>
> I would start here:
>
> >
> >   services:
> >     mon: 3 daemons, quorum f,rook-ceph-mon2,rook-ceph-mon0
> >     mgr: a(active)
> >     mds: ceph-fs-2/2/2 up odd-fs-2/2/2 up {[ceph-fs:0]=ceph-fs-5b997cbf7b-5tjwh=up:active,[ceph-fs:1]=ceph-fs-5b997cbf7b-nstqz=up:active,[user-fs:0]=odd-fs-5668c75f9f-hflps=up:active,[user-fs:1]=odd-fs-5668c75f9f-jf59x=up:active}, 4 up:standby-replay
> >     osd: 39 osds: 39 up, 38 in
> >
> >   data:
> >     pools:   5 pools, 706 pgs
> >     objects: 91212k objects, 4415 GB
> >     usage:   10415 GB used, 13024 GB / 23439 GB avail
> >     pgs:     2593/186803520 objects degraded (0.001%)
> >              703 active+clean
> >              2   stale+active+undersized+degraded
>
> This is a problem! Can you check:
>
> $ ceph pg dump_stuck
>
> The PGs will start with a number like 8.1a where '8' it the pool ID.
>
> Then check:
>
> $ ceph df
>
> To which pools to those PGs belong?
>
> Then check:
>
> $ ceph pg query
>
> And the bottom somewhere should show why these PGs are not active. You
> might even want to try a restart of these OSDs involved with those two PGs.

Thanks a lot for the suggestions - I just checked and it says that the
problematic PGs are 4.4f and 4.59 - but querying those seems to result in the
following error:

Error ENOENT: i don't have pgid 4.4f

(same applies for 4.59 - they do seem to show up in "ceph pg ls" though.)

In ceph pg ls, it shows that for these PGs UP, UP_PRIMARY, ACTING, and
ACTING_PRIMARY all only have one OSD associated with it. (24, 13 - although
both the PG ID mentioned above and these numbers probably don't help much
with the diagnosis) Should restarting be a safe thing to try first?

ceph health detail says the following:

MDS_DAMAGE 1 MDSs report damaged metadata
    mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Metadata damage detected
MDS_READ_ONLY 1 MDSs are read only
    mdsceph-fs-5b997cbf7b-5tjwh(mds.0): MDS in read-only mode
MDS_SLOW_REQUEST 2 MDSs report slow requests
    mdsuser-fs-5668c75f9f-hflps(mds.0): 3 slow requests are blocked > 30 sec
    mdsuser-fs-5668c75f9f-jf59x(mds.1): 980 slow requests are blocked > 30 sec
MDS_TRIM 6 MDSs behind on trimming
    mdsuser-fs-5668c75f9f-hflps(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
    mdsuser-fs-5668c75f9f-jf59x(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
    mdsuser-fs-5668c75f9f-h8p2t(mds.0): Behind on trimming (342/128) max_segments: 128, num_segments: 342
    mdsuser-fs-5668c75f9f-7gs67(mds.1): Behind on trimming (461/128) max_segments: 128, num_segments: 461
    mdsceph-fs-5b997cbf7b-5tjwh(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
    mdsceph-fs-5b997cbf7b-hmrxr(mds.0): Behind on trimming (386/128) max_segments: 128, num_segments: 386
PG_AVAILABILITY Reduced data availability: 2 pgs stale
    pg 4.4f is stuck stale for 171783.855465, current state stale+active+undersized+degraded, last acting [24]
    pg 4.59 is stuck stale for 171751.961506, current state stale+active+undersized+degraded, last acting [13]
PG_DEGRADED Degraded data redundancy: 2593/186805106 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
    pg 4.4f is stuck undersized for 171797.245359, current state stale+active+undersized+degraded, last acting [24]
    pg 4.59 is stuck undersized for 171797.257707, current state stale+active+undersized+degraded, last acting [13]
REQUEST_SLOW 3 slow requests are blocked > 32 sec. Implicated osds
    3 ops are blocked > 2097.15 sec
REQUEST_STUCK 717 stuck requests are blocked > 4096 sec. Implicated osds 25,31,38
    286 ops are blocked > 268435 sec
    211 ops are blocked > 134218 sec
    5 ops are blocked > 67108.9 sec
    2 ops are blocked > 33554.4 sec
    134 ops are blocked > 16777.2 sec
    79 ops are blocked > 8388.61 sec
    osds 25,31,38 have stuck requests > 268435 sec

Cheers,
Sangwhan

>
> Wido
>
> > 1 active+clean+scrubbing+deep
> >
> >   io:
> >     client: 168
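The REQUEST_STUCK entry names osd.25, osd.31 and osd.38, so one avenue (a hedged sketch, not from the original mail) is to ask those OSDs what their oldest ops are waiting on via the admin socket. Since this is a Rook deployment, "ceph daemon" has to run where the OSD's admin socket lives, i.e. inside (or via exec into) the corresponding OSD pod:

$ ceph osd find 25                        # locate the host/pod running osd.25 (likewise for 31 and 38)
$ ceph daemon osd.25 dump_blocked_ops     # the ops blocked longest, with the event they are stuck on
$ ceph daemon osd.25 dump_ops_in_flight   # everything currently in flight on that OSD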
Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
On 7/25/19 6:49 AM, Sangwhan Moon wrote:
> Hello,
>
> I've inherited a Ceph cluster from someone who has left zero documentation or
> any handover. A couple days ago it decided to show the entire company what it
> is capable of..
>
> The health report looks like this:
>
> [root@host mnt]# ceph -s
>   cluster:
>     id:     809718aa-3eac-4664-b8fa-38c46cdbfdab
>     health: HEALTH_ERR
>             1 MDSs report damaged metadata
>             1 MDSs are read only
>             2 MDSs report slow requests
>             6 MDSs behind on trimming
>             Reduced data availability: 2 pgs stale
>             Degraded data redundancy: 2593/186803520 objects degraded
>             (0.001%), 2 pgs degraded, 2 pgs undersized
>             1 slow requests are blocked > 32 sec. Implicated osds
>             716 stuck requests are blocked > 4096 sec. Implicated osds
>             25,31,38

I would start here:

>
>   services:
>     mon: 3 daemons, quorum f,rook-ceph-mon2,rook-ceph-mon0
>     mgr: a(active)
>     mds: ceph-fs-2/2/2 up odd-fs-2/2/2 up {[ceph-fs:0]=ceph-fs-5b997cbf7b-5tjwh=up:active,[ceph-fs:1]=ceph-fs-5b997cbf7b-nstqz=up:active,[user-fs:0]=odd-fs-5668c75f9f-hflps=up:active,[user-fs:1]=odd-fs-5668c75f9f-jf59x=up:active}, 4 up:standby-replay
>     osd: 39 osds: 39 up, 38 in
>
>   data:
>     pools:   5 pools, 706 pgs
>     objects: 91212k objects, 4415 GB
>     usage:   10415 GB used, 13024 GB / 23439 GB avail
>     pgs:     2593/186803520 objects degraded (0.001%)
>              703 active+clean
>              2   stale+active+undersized+degraded

This is a problem! Can you check:

$ ceph pg dump_stuck

The PGs will start with a number like 8.1a where '8' is the pool ID.

Then check:

$ ceph df

To which pools do those PGs belong?

Then check:

$ ceph pg query

And the bottom somewhere should show why these PGs are not active. You
might even want to try a restart of these OSDs involved with those two PGs.

Wido

>              1   active+clean+scrubbing+deep
>
>   io:
>     client: 168 kB/s rd, 6336 B/s wr, 10 op/s rd, 1 op/s wr
>
> The offending broken MDS entry (damaged metadata) seems to be this:
>
> mds.ceph-fs-5b997cbf7b-5tjwh: [
>     {
>         "damage_type": "dir_frag",
>         "id": 1190692215,
>         "ino": 2199023258131,
>         "frag": "*",
>         "path": "/f/01/59"
>     }
> ]
>
> Is there any idea how I can diagnose and find out what is wrong? For the
> other issues I'm not even sure what/where I need to look into.
>
> Cheers,
> Sangwhan
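Spelled out, the three checks suggested above look roughly like the following sketch; 8.1a is only Wido's example PG ID, and "ceph pg query" needs the PG ID as an argument:

$ ceph pg dump_stuck stale    # dump_stuck also accepts inactive, unclean, undersized, degraded
$ ceph df                     # the POOLS section lists each pool with its numeric ID; PG 8.1a would belong to the pool with ID 8
$ ceph pg 8.1a query          # the recovery_state section near the bottom explains why the PG is not active+clean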
[ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth
Hello,

I've inherited a Ceph cluster from someone who has left zero documentation or
any handover. A couple days ago it decided to show the entire company what it
is capable of..

The health report looks like this:

[root@host mnt]# ceph -s
  cluster:
    id:     809718aa-3eac-4664-b8fa-38c46cdbfdab
    health: HEALTH_ERR
            1 MDSs report damaged metadata
            1 MDSs are read only
            2 MDSs report slow requests
            6 MDSs behind on trimming
            Reduced data availability: 2 pgs stale
            Degraded data redundancy: 2593/186803520 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
            1 slow requests are blocked > 32 sec. Implicated osds
            716 stuck requests are blocked > 4096 sec. Implicated osds 25,31,38

  services:
    mon: 3 daemons, quorum f,rook-ceph-mon2,rook-ceph-mon0
    mgr: a(active)
    mds: ceph-fs-2/2/2 up odd-fs-2/2/2 up {[ceph-fs:0]=ceph-fs-5b997cbf7b-5tjwh=up:active,[ceph-fs:1]=ceph-fs-5b997cbf7b-nstqz=up:active,[user-fs:0]=odd-fs-5668c75f9f-hflps=up:active,[user-fs:1]=odd-fs-5668c75f9f-jf59x=up:active}, 4 up:standby-replay
    osd: 39 osds: 39 up, 38 in

  data:
    pools:   5 pools, 706 pgs
    objects: 91212k objects, 4415 GB
    usage:   10415 GB used, 13024 GB / 23439 GB avail
    pgs:     2593/186803520 objects degraded (0.001%)
             703 active+clean
             2   stale+active+undersized+degraded
             1   active+clean+scrubbing+deep

  io:
    client: 168 kB/s rd, 6336 B/s wr, 10 op/s rd, 1 op/s wr

The offending broken MDS entry (damaged metadata) seems to be this:

mds.ceph-fs-5b997cbf7b-5tjwh: [
    {
        "damage_type": "dir_frag",
        "id": 1190692215,
        "ino": 2199023258131,
        "frag": "*",
        "path": "/f/01/59"
    }
]

Is there any idea how I can diagnose and find out what is wrong? For the
other issues I'm not even sure what/where I need to look into.

Cheers,
Sangwhan
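One hedged pointer for the dir_frag damage above, none of it from the original mail: on a Luminous-or-later MDS the damage table can be listed and, once the underlying RADOS objects are healthy again, cleared. The daemon name and damage ID below are the ones from the report; the scrub_path invocation is an admin-socket command whose exact form varies between releases, and in a Rook deployment it has to run inside the MDS pod:

$ ceph tell mds.ceph-fs-5b997cbf7b-5tjwh damage ls                               # list damage entries like the dir_frag one above
$ ceph daemon mds.ceph-fs-5b997cbf7b-5tjwh scrub_path /f/01/59 recursive repair  # re-scrub and attempt repair of the damaged path
$ ceph tell mds.ceph-fs-5b997cbf7b-5tjwh damage rm 1190692215                    # only once the damage has actually been repaired

As suggested earlier in the thread, the read-only MDS and the metadata damage are probably downstream of the two stale PGs, so getting those back to active+clean comes first.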