On Fri, Jun 03, 2011 at 01:40:44PM +0300, Fyodor Ustinov wrote:
> Hi!
>
> kernel 2.6.39
> ceph - 0.28.2
>
> In sysctl.conf set
> vm.min_free_kbytes=262144
>
> Jun 2 03:08:17 amanda kernel: [35398.757055] libceph: msg_new can't
> allocate 4096 bytes
... so first you run out of memory ...
> Jun 3 13:33:10 amanda kernel: [159291.960881] ------------[ cut
> here ]------------
> Jun 3 13:33:10 amanda kernel: [159291.960930] kernel BUG at
> mm/mempool.c:186!
...
> Jun 3 13:33:10 amanda kernel: [159291.970496] Call Trace:
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a59e2>]
> ceph_msgpool_destroy+0x12/0x20 [libceph]
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a7fc3>]
> ceph_osdc_stop+0x83/0xb0 [libceph]
> Jun 3 13:33:10 amanda kernel: [159291.970496] [<ffffffffa02a158d>]
> ceph_destroy_client+0x1d/0x60 [libceph]
And then mempool_destroy() trips its BUG_ON. That's because...
/**
 * mempool_destroy - deallocate a memory pool
 * @pool:      pointer to the memory pool which was allocated via
 *             mempool_create().
 *
 * this function only sleeps if the free_fn() function sleeps. The caller
 * has to guarantee that all elements have been returned to the pool (ie:
 * freed) prior to calling mempool_destroy().
 */
void mempool_destroy(mempool_t *pool)
{
	/* Check for outstanding elements */
	BUG_ON(pool->curr_nr != pool->min_nr);
	free_pool(pool);
}
Not every element had been returned to the pool when we tried to
destroy it. The culprit is one of these two calls
ceph_msgpool_destroy(&osdc->msgpool_op);
ceph_msgpool_destroy(&osdc->msgpool_op_reply);
but I can't easily tell which one.
Summary so far: we're leaking msgpool_op or msgpool_op_reply entries
when unmounting kclient while out of memory.
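
To make the invariant concrete, here is a user-space toy model of it.
This is only a sketch, not kernel code: struct toy_pool and the toy_pool_*
helpers are made-up names, and only the curr_nr/min_nr counters mirror the
fields that mempool_destroy() checks. Instead of BUG()ing, it reports
which pool still has elements outstanding, which is roughly the kind of
check that might help tell the two msgpools apart.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for mempool_t; only the two counters matter here. */
struct toy_pool {
	const char *name;	/* hypothetical label, for diagnostics only */
	int min_nr;		/* elements the pool was created with */
	int curr_nr;		/* elements currently sitting in the pool */
};

/* Take one element out of the pool; returns 0 on success, -1 if empty. */
int toy_pool_get(struct toy_pool *p)
{
	if (p->curr_nr == 0)
		return -1;
	p->curr_nr--;
	return 0;
}

/* Hand one element back to the pool. */
void toy_pool_put(struct toy_pool *p)
{
	assert(p->curr_nr < p->min_nr);	/* catch a double put */
	p->curr_nr++;
}

/* Elements taken but not yet returned; destroy requires this to be 0. */
int toy_pool_outstanding(const struct toy_pool *p)
{
	return p->min_nr - p->curr_nr;
}

/*
 * Same condition as the BUG_ON in mempool_destroy(), but it reports the
 * offending pool instead of crashing. Returns 1 if destroying would be
 * safe, 0 otherwise.
 */
int toy_pool_can_destroy(const struct toy_pool *p)
{
	int out = toy_pool_outstanding(p);

	if (out)
		fprintf(stderr, "pool %s: %d element(s) not returned\n",
			p->name, out);
	return out == 0;
}
```

With two pools named "op" and "op_reply", each created with min_nr ==
curr_nr, a get that is never followed by a put on one of them makes
toy_pool_can_destroy() return 0 for that pool only. In the kernel, the
analogous step would be comparing curr_nr against min_nr for each of the
two pools just before the destroy calls.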
devs: If anyone else has a good idea where this is heading, please
take over.
--
:(){ :|:&};: