Hi,
I can confirm that running haproxy on a grsec kernel can sometimes be a
bit tricky.
For instance, 3.2.54 with grsec crashes for me after ~8 hours, while
3.2.55 and 3.2.52 with grsec do not. Kernels with grsec just need more
testing because their stability varies from release to release.
Greets,
Sander
On 27.02.2014 11:29, Cedric Maion wrote:
I agree that it does indeed look like a kernel issue (in the Intel eth
driver?); however, 1.5 is doing something new that triggers it.
Any idea of a significant 1.4 -> 1.5 change that could affect what is
happening in the kernel?
This kernel is indeed not the stock Ubuntu kernel, but the default one
provided by the hosting company (OVH in this case)... I would really
like to avoid recompiling the kernel and playing too much with the
production environment (sadly this issue never showed up in my dev & lab
environments).
So any haproxy related idea would be very welcome...!
On Thu, Feb 27, 2014 at 11:06:38AM +0100, Lukas Tribus wrote:
Hi,
> Just upgraded a production node from 1.4.18 to 1.5-dev22.
> Ran fine for a couple of minutes then crashed with the following kernel
> messages:
>
> WARNING: at mm/page_alloc.c:2107 __alloc_pages_nodemask+0x1fd/0x790()
> Hardware name: X9SRE/X9SRE-3F/X9SRi/X9SRi-3F
> Pid: 23190, comm: haproxy Not tainted 3.2.13-grsec-xxxx-grs-ipv6-64 #1
> Call Trace:
> [<ffffffff810f1ded>] ? __alloc_pages_nodemask+0x1fd/0x790
> [<ffffffff81089f3b>] warn_slowpath_common+0x7b/0xc0
> [<ffffffff81089f95>] warn_slowpath_null+0x15/0x20
> [<ffffffff810f1ded>] __alloc_pages_nodemask+0x1fd/0x790
That's definitely a kernel issue.
Are you building your own kernel? That doesn't look like the default
Ubuntu kernel.
I would suggest upgrading your kernel to 3.2.55 (of course use an
updated grsec patch as well). If that doesn't fix the issue, try
vanilla 3.2.55 (no grsec).
If the issue persists, report it upstream (either to lkml/netdev or
grsec, depending on whether the vanilla 3.2.55 has the issue or not).
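In case it helps, here is a rough Python sketch of that decision path.
The function names (parse_kernel_release, next_step) are made up for
illustration and are not part of any existing tool; the release string
is the one from your trace.

```python
import re


def parse_kernel_release(release):
    """Split a `uname -r` style string into (version tuple, is_grsec)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level is treated as 0 (assumption for this sketch).
    version = tuple(int(part) if part else 0 for part in match.groups())
    return version, "grsec" in release.lower()


def next_step(crashes_with_grsec, crashes_with_vanilla=None):
    """Return the next action once 3.2.55 has been tested.

    crashes_with_grsec:   result on 3.2.55 with an updated grsec patch
    crashes_with_vanilla: result on vanilla 3.2.55, or None if untested
    """
    if not crashes_with_grsec:
        return "done: the upgrade fixed it"
    if crashes_with_vanilla is None:
        return "test vanilla 3.2.55 (no grsec)"
    if crashes_with_vanilla:
        return "report to lkml/netdev"
    return "report to grsec"


version, is_grsec = parse_kernel_release("3.2.13-grsec-xxxx-grs-ipv6-64")
needs_upgrade = version < (3, 2, 55)  # tuples compare element by element
```

For the kernel in the trace this yields a grsec build older than 3.2.55,
so the upgrade is the first thing to try.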
Regards,
Lukas