Hi Alexey,
> In general a soft-lockup isn't an error; it is just a notice that some
> operation is taking too long (more than 10s, I think).
> The attached soft-lockup says the OST is busy creating objects after the
> MDS<->OST reconnect.
Yes, I know that a soft-lockup doesn't mean that I hit a bug, but having
ll_ost_creat_* wasting 100% CPU doesn't seem normal.
> I think your disks are too busy or the node is overloaded.
Disk %busy is < 5% for all attached disks.
The OST is doing almost nothing (there are a few read() calls, that's all).
> If you have slow disks, a client can be disconnected before its request is
> processed, and that request then blocks the reconnect from that client.
Client recovery seems to be OK: all clients can read and write data on the
OST, but there is something wrong between the MDS and OST0005.
This might just be a side effect of the ll_ost_creat_* issue, though :-/
Regards,
Adrian
--
RFC 1925:
(11) Every old idea will be proposed again with a different name and
a different presentation, regardless of whether it works.