On 19-Jul-18 10:01 AM, Burakov, Anatoly wrote:
On 18-Jul-18 9:58 PM, Stephen Hemminger wrote:
On Wed, 18 Jul 2018 22:52:12 +0300
Andrew Rybchenko <arybche...@solarflare.com> wrote:
On 18.07.2018 20:18, Burakov, Anatoly wrote:
On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote:
Hi Anatoly,
I'm investigating issue which finally comes to the fact that memory
allocated using
rte_zmalloc() has non zeros.
If I add memset just after allocation, everything is perfect and
works fine.
I've found out that memset was removed from rte_zmalloc_socket() some
time ago:
>>>
commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5
Author: Sergio Gonzalez Monroy <sergio.gonzalez.mon...@intel.com>
Date: Tue Jul 5 12:01:16 2016 +0100
mem: do not zero out memory on zmalloc
Zeroing out memory on rte_zmalloc_socket is not required anymore
since all
allocated memory is already zeroed.
Signed-off-by: Sergio Gonzalez Monroy
<sergio.gonzalez.mon...@intel.com>
<<<
but may be something has changed now that made above statement false.
I observe the problem when memory is reallocated. I.e. I configure 7
queues,
start, stop, reconfigure to 3 queues, start. Memory is allocated on
start and
freed on stop, since we have less queues on the second start it is
allocated
in a different way and reuses previously allocated/freed memory.
Do you have any ideas what could be wrong?
Andrew.
Hi Andrew,
I will look into it first thing tomorrow. In general, we memset(0) on
free, and kernel gives us zeroed out pages initially, so the most
likely point of failure is that i'm not overwring some malloc headers
correctly on free.
OK, at least now I know how it is supposed to work in theory.
The following region was allocated (the second number below is pointer
plus size)
ALLOC 0x7fffa3264080-0x7fffa32640b8
Not zerod address is 16 bytes before:
(gdb) p/x ((uint64_t *)0x7fffa3264070)[0]
$4 = 0x4000000002
(gdb) p/x ((uint64_t *)0x7fffa3264070)[1]
$5 = 0x80
then freed
FREE 0x7fffa3264080-0x7fffa32640b8
but above values (gdb) are still the same
then it is allocated as the part of bigger memory chunk
ALLOC 0x7fffa3245b80-0x7fffa3265fd8
which should contain zeros, but above values are still the same.
It is interesting that it looks like it was the first block freed on the
port stop. I'm not 100% sure since I've put printouts to my allocation
wrapper, not EAL.
Many thanks,
Andrew.
memset here is what is supposed to clear the data.
struct malloc_elem *
malloc_elem_free(struct malloc_elem *elem)
{
void *ptr;
size_t data_len;
ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad);
data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD;
elem = malloc_elem_join_adjacent_free(elem);
malloc_elem_free_list_insert(elem);
elem->pad = 0;
/* decrease heap's count of allocated elements */
elem->heap->alloc_count--;
memset(ptr, 0, data_len);
Maybe data_len is not correct either because of bug, or your
application clobbered
the malloc reserved regions in the element.
More likely, gcc is incorrectly optimizing this away.
https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/
I tend to be very wary of blaming the compiler without exhausting any
other possibilities :) It used to work before without issues, so
presumably whatever is happening, our memset works correctly.
Andrew, you write:
<snip>
ALLOC 0x7fffa3264080-0x7fffa32640b8
Not zerod address is 16 bytes before:
<snip>
Of course the memory *before* your pointer would not be zero - it is
preceded by a 64-byte malloc header, so what you're seeing is the malloc
header data (which doesn't go away if you free it - it will go away only
if it is merged with an adjacent free malloc element). So, i'm failing
to see which problem you're describing, given that all memory regions
that are supposedly not free lie outside of your malloc-allocated memory.
However, after careful analysis, i can see that there is one possibility
where memory is not zeroed on free - if the original malloc element was
padded, and there aren't any more adjacent free elements, then newly
allocated memory may contain old pad header. I'll submit a patch for you
to try shortly.
Patch:
http://patches.dpdk.org/patch/43196/
--
Thanks,
Anatoly