Under -327, the hung kernel tasks with XFS resulted in log verification
failures and completely locked up the hosts.
With BTRFS, we could get around the task timeouts by setting
kernel.hung_task_timeout_secs = 960
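
For anyone who wants to check what a host is currently set to, here is a
minimal Python sketch. It assumes the usual Linux convention that a dotted
sysctl key maps to a file under /proc/sys; the helper names sysctl_path and
read_sysctl are made up for illustration.

```python
from pathlib import Path

PROC_SYS = Path("/proc/sys")  # standard Linux location for sysctl values


def sysctl_path(key: str) -> Path:
    """Map a dotted sysctl key to its file under /proc/sys."""
    # Simple split on dots; good enough for keys like the one above.
    return PROC_SYS.joinpath(*key.split("."))


def read_sysctl(key: str) -> str:
    """Return the current value of a sysctl key as a stripped string."""
    return sysctl_path(key).read_text().strip()


if __name__ == "__main__":
    key = "kernel.hung_task_timeout_secs"
    try:
        print(f"{key} = {read_sysctl(key)}")
    except OSError as exc:
        # The file is missing on non-Linux hosts or kernels without hung-task detection.
        print(f"could not read {key}: {exc}")
```

This only reads the value; changing it still goes through sysctl as usual.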

The host would eventually become responsive again, but that doesn't really
matter: the ceph ops are blocked for so long that it all goes to hell
anyway.
I only found stability under high load with EXT4, or on -229 with BTRFS or
EXT4.
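
To keep an eye on how many ops are blocked during a test run, a rough sketch
like the one below could pull the count and the threshold out of the health
line in ceph status. The pattern is based only on the "2 requests are
blocked > 32 sec" line quoted further down in this thread; other ceph
releases may word it differently, and parse_blocked is a hypothetical helper
name.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "2 requests are blocked > 32 sec" as seen in the status output in this thread.
BLOCKED_RE = re.compile(r"(\d+) requests are blocked > (\d+) sec")


def parse_blocked(status_text: str) -> Optional[Tuple[int, int]]:
    """Return (blocked_count, threshold_secs), or None if no such line is present."""
    match = BLOCKED_RE.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = """
     health HEALTH_WARN
            2 requests are blocked > 32 sec
    """
    print(parse_blocked(sample))
```

Feeding it the saved output of ceph status at intervals would give a simple
record of when the blocking starts and stops.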

Bad story, sorry to have to tell it.

-Wade


On Tue, Dec 22, 2015 at 9:44 AM Dan Nica <[email protected]>
wrote:

> That is strange. Maybe there is a sysctl option to tweak on the OSDs? This
> will be nasty if it goes into our production!
>
>
>
> --
>
> Dan
>
>
>
> *From:* Wade Holler [mailto:[email protected]]
> *Sent:* Tuesday, December 22, 2015 4:36 PM
> *To:* Dan Nica <[email protected]>; [email protected]
> *Subject:* Re: [ceph-users] requests are blocked
>
>
>
> I had major host stability problems under load with -327. Repeatable
> test cases under high load with XFS or BTRFS would result in hung kernel
> tasks and, of course, the sympathetic behavior you mention.
>
> "requests are blocked" usually means that the op tracker in ceph hasn't
> received a timely response from the OSD. I'm sure someone more seasoned can
> provide a better explanation.
>
> -Wade
>
>
>
> On Tue, Dec 22, 2015 at 9:24 AM Dan Nica <[email protected]>
> wrote:
>
> Hi
>
>
>
> I am trying to run a bench test on an RBD image, and from time to time I get
> the following in ceph status:
>
>
>
>     cluster 046b0180-dc3f-4846-924f-41d9729d48c8
>      health HEALTH_WARN
>             2 requests are blocked > 32 sec
>      monmap e1: 3 mons at {alder=10.6.250.249:6789/0,ash=10.6.250.248:6789/0,aspen=10.6.250.247:6789/0}
>             election epoch 18, quorum 0,1,2 aspen,ash,alder
>      osdmap e114: 6 osds: 6 up, 6 in
>             flags sortbitwise
>       pgmap v3816: 192 pgs, 1 pools, 23062 MB data, 5814 objects
>             46406 MB used, 44624 GB / 44670 GB avail
>                  192 active+clean
>   client io 6083 B/s rd, 18884 kB/s wr, 75 op/s
>
>
>
>
>
> What does “requests are blocked” mean, and why does performance drop to
> almost 0?
>
> I am running the Infernalis release on CentOS 7, kernel
> 3.10.0-327.3.1.el7.x86_64.
>
>
>
> Thanks
>
> --
>
> Dan
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
