If you can open an SSH session to the IPMI console, run it inside screen and save the screen output to a file. You can then look at what was happening on the console when the server locked up. That's how I track down kernel panics.
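Once you have such a capture, a small script can pull out the interesting lines. The following is only a rough sketch, not a definitive tool. It assumes the console output was saved by screen's logging (for example `screen -L`, which by default writes `screenlog.0` in the current directory). The marker strings are an illustrative selection of messages the Linux kernel commonly prints around panics and lockups, and the function name `find_crash_context` is made up for this example.

```python
import re
from collections import deque

# Substrings that commonly show up on a Linux console around a crash or
# lockup. Illustrative only; extend the list for your own kernels.
MARKERS = (
    "Kernel panic",
    "soft lockup",
    "hard LOCKUP",
    "Oops",
    "general protection fault",
    "blocked for more than",
    "Call Trace",
)

# screen logs raw terminal output, so strip ANSI CSI escape sequences
# (colors, cursor movement) before matching.
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def find_crash_context(lines, before=5):
    """Return (line_number, context) for every line containing a marker.

    `context` is the matching line preceded by up to `before` earlier lines.
    """
    recent = deque(maxlen=before)
    hits = []
    for number, raw in enumerate(lines, start=1):
        line = ANSI_RE.sub("", raw).rstrip("\r\n")
        if any(marker in line for marker in MARKERS):
            hits.append((number, list(recent) + [line]))
        recent.append(line)
    return hits


def scan_file(path, before=5):
    """Scan a saved console log; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_crash_context(handle, before=before)
```

Calling `scan_file("screenlog.0")` after a lockup returns the line number of each marker together with the few console lines that preceded it, which is usually where the first hint of the cause shows up.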
On Fri, Oct 27, 2017 at 1:53 PM Bogdan SOLGA <[email protected]> wrote:
> Thank you very much for the reply, Ilya!
>
> The server was completely frozen / hard lockup, we had to restart it via
> IPMI. We grepped the logs trying to find the culprit, but to no avail.
> Any hint on how to troubleshoot the (eventual) freezes is highly
> appreciated.
>
> Understood on the kernel recommendation. We'll continue to use 4.10, then.
>
> Thanks, a lot!
>
> Kind regards,
> Bogdan
>
> On Fri, Oct 27, 2017 at 8:04 PM, Ilya Dryomov <[email protected]> wrote:
>
>> On Fri, Oct 27, 2017 at 6:33 PM, Bogdan SOLGA <[email protected]> wrote:
>> > Hello, everyone!
>> >
>> > We have recently upgraded our Ceph pool to the latest Luminous release.
>> > On one of the servers that we used as Ceph clients we had several freeze
>> > issues, which we empirically linked to the concurrent usage of some I/O
>> > operations - writing in an LXD container (backed by Ceph) while there
>> > was an ongoing PG rebalancing. We searched for the issue's cause through
>> > the logs, but we haven't found anything useful.
>>
>> What kind of freezes -- temporary slowdowns or hard lockups? Did they
>> resolve on their own or did you have to intervene?
>>
>> >
>> > At that time the server was running Ubuntu 16 with a 4.5 kernel. We
>> > thought an upgrade to the latest HWE kernel (4.10) would help, but we
>> > had the same freezing issues after the kernel upgrade. Of course, we're
>> > aware that we have tried to fix / avoid the issue without understanding
>> > its cause.
>> >
>> > After seeing the OS recommendations from the Ceph page, we reinstalled
>> > the server (and got the 4.4 kernel), we ran into a feature set mismatch
>> > issue when mounting a RBD image. We concluded that the feature set
>> > requires a kernel > 4.5.
>> >
>> > Our question - how would you recommend us to proceed? Shall we
>> > re-upgrade to the HWE kernel (4.10) or to another kernel version?
>> > Would you recommend an alternative solution?
>>
>> The OS recommendations page lists upstream kernels, as a general
>> guidance. As long as the kernel is fairly recent and maintained
>> (either upstream or by the distributor), it should be fine. 4.10 is
>> certainly better than 4.4-based kernels, at least for the kernel
>> client.
>>
>> Thanks,
>>
>> Ilya
>>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
