Re: [Discuss-gnuradio] vector sink "malloc(): memory corruption" / object was probably modified after being freed
I'll do my best, but it might take me quite some time to get gdb set up with the symbols you mentioned. C++ and debugging don't come naturally to me.

What it smells like to me is that memory is being freed but then written to. Then, when the system goes to allocate memory, it notices that the memory isn't so free after all because it has been written to (freed-memory checksum mismatch). It isn't until the crime is discovered at the time of m'allocation that an exception is thrown. Correct me if I'm wrong, but the real problem (leak?) would then sit some to many instructions before the malloc exception that is triggered at the time of discovery?

The lead suspect in the case seems to be the vector sink. I know using vector sink blocks is shunned, and I feel very naughty for doing so, but it is making it possible for me to do significant research without having to deep dive into C++.

I'll try to think of some clever test cases that put the vector sink through its paces and trigger the error in a different context.

On Mon, Apr 30, 2018 at 1:28 PM, Müller, Marcus (CEL) wrote:
> Ah, by the way, the function names where that error occurs are mangled
> C++ names; I'll try to show what is what (using `c++filt` to demangle
> the names)
> [...]
Re: [Discuss-gnuradio] vector sink "malloc(): memory corruption" / object was probably modified after being freed
Hi Brad,

Sorry that I missed your mail for so long!

So, I'm pretty certain I've fixed a potential race condition when accessing the data vector in vector sink lately:

https://github.com/gnuradio/gnuradio/pull/1445

But that should be included in the 3.7.11.1 release you're using... hm.

We should probably be looking into what happens in reset(), right?

Best regards,
Marcus

On Sun, 2018-04-15 at 15:10 +, Brad Hein wrote:
> The Vector Sink is coming in very handy for some experimentation I'm doing.
> [...]
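To make the suspected access pattern concrete, here is a minimal pure-Python sketch of a sink whose read and reset happen as one locked step. It is not GNU Radio code: `LockedSink`, `work`, `take` and `stress` are hypothetical names invented for illustration, and nothing here confirms where the actual bug in the vector sink lies. A Python list would lose samples rather than corrupt the heap under such a race, so the sketch only shows the locking idea, assuming one producer thread and one reader thread.

```python
import threading


class LockedSink(object):
    """Hypothetical stand-in for a vector sink (not the GNU Radio block)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = []

    def work(self, samples):
        # Producer side: in a real flowgraph this would be the scheduler thread.
        with self._lock:
            self._data.extend(samples)

    def take(self):
        # Read and reset as ONE locked step, so no append can slip in
        # between "copy the data out" and "clear the buffer".
        with self._lock:
            taken, self._data = self._data, []
        return taken


def stress(n_items=100000, chunk=100):
    """Hammer the sink from two threads and return everything the reader saw."""
    sink = LockedSink()
    received = []
    done = threading.Event()

    def producer():
        for start in range(0, n_items, chunk):
            sink.work(range(start, min(start + chunk, n_items)))
        done.set()

    def consumer():
        while not done.is_set():
            received.extend(sink.take())
        # Final drain: the producer has finished, pick up what is left.
        received.extend(sink.take())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

If the locking is right, `stress(n)` returns exactly `list(range(n))`: no sample is lost, duplicated or reordered. A similar producer/reader harness pointed at the real block (reading its data and then calling reset() from a second thread) might be one way to reproduce the crash outside the full application.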
Re: [Discuss-gnuradio] vector sink "malloc(): memory corruption" / object was probably modified after being freed
Ah, by the way, the function names where that error occurs are mangled C++ names; I'll try to show what is what (using `c++filt` to demangle the names):

> /lib64/libc.so.6(+0x7dd4d)[0x7fa14f448d4d]
> /lib64/libc.so.6(__libc_malloc+0x4c)[0x7fa14f44afbc]

OK, that's malloc; are we certain that this has anything to do with `free`?

> /lib64/libstdc++.so.6(_Znwm+0x1d)[0x7fa145c0e0cd]

"operator new(unsigned long)"
So, we're reserving space, which calls malloc. Makes sense.

> /usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(_ZNSt6vectorIfSaIfEEaSERKS1_+0xd4)[0x7fa13df58e64]

"std::vector<float, std::allocator<float> >::operator=(std::vector<float, std::allocator<float> > const&)"
Assignment operator on a std::vector!

> /usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(+0x12c8a0)[0x7fa13deea8a0]

So, we are indeed in gr-blocks; but since this is the first frame called from libpython that isn't libpython, it's likely just SWIG wrapper code, not our own code (which doesn't imply our code isn't the one that's buggy here). In other words: This might be the place where SWIG or you *assign* something to something preallocated. Strange!

The rest is just Python internals:

> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6df0)[0x7fa150195af0]
> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7fa150197e3d]
> /lib64/libpython2.7.so.1.0(+0x7088d)[0x7fa15012188d]
> /lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x17fd)[0x7fa1501904fd]
> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7fa150197e3d]
> /lib64/libpython2.7.so.1.0(+0x70798)[0x7fa150121798]
> /lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
> /lib64/libpython2.7.so.1.0(+0x5a8d5)[0x7fa15010b8d5]
> /lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
> /lib64/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7fa15018e6f7]
> /lib64/libpython2.7.so.1.0(+0x1155c2)[0x7fa1501c65c2]
> /lib64/libpthread.so.0(+0x7dc5)[0x7fa14fe9cdc5]
> /lib64/libc.so.6(clone+0x6d)[0x7fa14f4c276d]

If this problem persists, I'd be very interested in you running the flowgraph in GDB, e.g.

    gdb --args python2 /path/to/your/fg.py
    ...
    (gdb) run
    ...
    crash
    ...
    (gdb) backtrace

The backtrace would be even more useful if your GDB knows the Python debug symbols, and maybe even has the scripting in place to interleave Python state with the backtrace (since we then would even see which Python line was being executed, as well as the state of the Python interpreter).

Best regards,
Marcus

On Mon, 2018-04-30 at 17:16 +, Müller, Marcus (CEL) wrote:
> Hi Brad,
>
> Sorry that I missed your mail for so long!
> [...]
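For anyone who wants to repeat the demangling step on their own traces, the following is a rough sketch in Python 3 (the thread itself runs Python 2.7). It splits a glibc backtrace line of the form `path(symbol+0xoffset)[0xaddress]` into its parts and hands the symbol to `c++filt` if that tool is on the PATH. `FRAME_RE`, `parse_frame` and `demangle` are made-up helper names, and the regular expression is only an assumption based on the frame lines quoted in this thread.

```python
import re
import shutil
import subprocess

# Frame layout as seen in this thread: path(symbol+0xoffset)[0xaddress].
# The symbol part is empty for frames without symbol information.
FRAME_RE = re.compile(
    r"^(?P<path>[^()\s]+)"
    r"\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<address>0x[0-9a-f]+)\]$"
)


def parse_frame(line):
    """Return the parts of one backtrace line as a dict, or None if it does not match."""
    # Drop surrounding whitespace and any leading '>' quote markers.
    cleaned = line.strip().lstrip("> ").strip()
    match = FRAME_RE.match(cleaned)
    return match.groupdict() if match else None


def demangle(symbol):
    """Demangle via c++filt when it is installed; otherwise return the symbol unchanged."""
    tool = shutil.which("c++filt")
    if tool is None or not symbol:
        return symbol
    result = subprocess.run([tool, symbol], capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    frame = parse_frame("> /lib64/libstdc++.so.6(_Znwm+0x1d)[0x7fa145c0e0cd]")
    print(frame["path"], demangle(frame["symbol"]))
```

Frames that carry no symbol, such as the `_blocks_swig1.so(+0x12c8a0)` one, come back with an empty `symbol`; for those, only debug symbols or a GDB session will tell which function the offset belongs to.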
[Discuss-gnuradio] vector sink "malloc(): memory corruption" / object was probably modified after being freed
The Vector Sink is coming in very handy for some experimentation I'm doing. I'm analyzing the output of an FFT block which terminates into a float vector sink. I read the sink's data every few seconds from a thread, which then calls reset() to clear the contents in preparation for another read. This seems problematic, as the program crashes quite frequently after seconds to minutes of operation. From my layman's perspective it seems to be a memory leak in the vector block.

I tried many things over the last couple of days. Nothing seems to mitigate it. For example, I tried self.lock and self.unlock to lock the flowgraph before reading from the vector (this results in the flowgraph never starting back up, which seems to be a whole different issue)... I also tried using Python's copy.deepcopy to make a copy of the vector contents before using it, but that didn't help either.

When the exception occurs, it seems to happen right after resetting the vector sink.

My flowgraph doesn't run standalone and requires a number of other applications to function, but I'll work on getting it up to GitHub for review if that helps.

Gnuradio 3.7.11.1 on a CentOS VM (Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)

BTW, when the exception occurs on OSX the error is:

Python(86100,0x7eab9000) malloc: *** error for object 0x7f94bf9042b8: incorrect checksum for freed object - object was probably modified after being freed.
Stack trace samples from Linux (they are all quite similar to each other, with only slight differences in the addresses):

*** Error in `python': malloc(): memory corruption: 0x7fa0cc019ba0 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7dd4d)[0x7fa14f448d4d]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7fa14f44afbc]
/lib64/libstdc++.so.6(_Znwm+0x1d)[0x7fa145c0e0cd]
/usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(_ZNSt6vectorIfSaIfEEaSERKS1_+0xd4)[0x7fa13df58e64]
/usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(+0x12c8a0)[0x7fa13deea8a0]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6df0)[0x7fa150195af0]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7fa150197e3d]
/lib64/libpython2.7.so.1.0(+0x7088d)[0x7fa15012188d]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x17fd)[0x7fa1501904fd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7fa1501954bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7fa150197e3d]
/lib64/libpython2.7.so.1.0(+0x70798)[0x7fa150121798]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
/lib64/libpython2.7.so.1.0(+0x5a8d5)[0x7fa15010b8d5]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7fa1500fc8e3]
/lib64/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7fa15018e6f7]
/lib64/libpython2.7.so.1.0(+0x1155c2)[0x7fa1501c65c2]
/lib64/libpthread.so.0(+0x7dc5)[0x7fa14fe9cdc5]
/lib64/libc.so.6(clone+0x6d)[0x7fa14f4c276d]
=== Memory map:
0040-00401000 r-xp fd:00 242126 /usr/bin/python2.7
0060-00601000 r--p fd:00 242126 /usr/bin/python2.7
--
*** Error in `python': malloc(): memory corruption: 0x7f03ac00f640 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7dd4d)[0x7f04382b6d4d]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7f04382b8fbc]
/lib64/libstdc++.so.6(_Znwm+0x1d)[0x7f042ea7c0cd]
/usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(_ZNSt6vectorIfSaIfEEaSERKS1_+0xd4)[0x7f0426dc6e64]
/usr/local/lib64/python2.7/site-packages/gnuradio/blocks/_blocks_swig1.so(+0x12c8a0)[0x7f0426d588a0]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6df0)[0x7f0439003af0]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f04390034bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f04390034bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f0439005e3d]
/lib64/libpython2.7.so.1.0(+0x7088d)[0x7f0438f8f88d]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f0438f6a8e3]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x17fd)[0x7f0438ffe4fd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f04390034bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f04390034bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f0439005e3d]
/lib64/libpython2.7.so.1.0(+0x70798)[0x7f0438f8f798]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f0438f6a8e3]
/lib64/libpython2.7.so.1.0(+0x5a8d5)[0x7f0438f798d5]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x43)[0x7f0438f6a8e3]
/lib64/libpython2.7.so.1.0(PyEval_CallObjectWithKeywords+0x47)[0x7f0438ffc6f7]
/lib64/libpython2.7.so.1.0(+0x1155c2)[0x7f04390345c2]
/lib64/libpthread.so.0(+0x7dc5)[0x7f0438d0adc5]
/lib64/libc.so.6(clone+0x6d)[0x7f043833076d]
=== Memory map:
0040-00401000 r-xp fd:00 242126