Hi Ankit,

M5 has its own internal memory allocator for objects that get
frequently allocated and deallocated.  Your valgrind trace is just
telling you that once the internal allocator grabs a chunk of memory,
valgrind loses track of it.  To reliably find memory leaks with
valgrind you have to turn off the internal allocator; just add
NO_FAST_ALLOC=True to your scons command line.
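
For example, assuming you normally build an X86_SE debug binary (adjust
the target to whatever you actually use):

    scons NO_FAST_ALLOC=True build/X86_SE/m5.debug

and then run that binary under valgrind as before.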

Steve

On Wed, Jun 2, 2010 at 11:46 AM, Ankit Sethia <[email protected]> wrote:
> Hi Steve,
>
> I have limited knowledge of M5; based on that, I drew the following
> conclusions. Let me know whether I am wrong and what the correct way
> of fixing it would be.
>
> On Wed, Jun 2, 2010 at 2:02 PM, Steve Reinhardt <[email protected]> wrote:
>>
>> Thanks for tracking this down and supplying these patches.  I'm
>> confused by a couple of things though:
>>
>> - For base.cc, how was there a leak in the original code?  I agree
>> it's pretty contorted; the one place where keep_trying used to be
>> sometimes set to true got deleted, see
>> http://repo.m5sim.org/m5/rev/874f2ee2f115, and the code never got
>> restructured to reflect the fact that keep_trying is always false.
>> Given that keep_trying is always false in the original code though I
>> don't see a memory leak, since we always just return the packet we
>> pop.  Your patch actually changes the behavior of the prefetcher
>> pretty significantly (sort of emulating the old !cacheCheckPush
>> behavior).
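>>
>> For reference, here is the original loop (reconstructed from the context
>> lines of your patch, so treat it as a sketch rather than the exact source):
>>
>>     PacketPtr pkt;
>>     bool keep_trying = false;
>>     do {
>>         pkt = *pf.begin();
>>         pf.pop_front();
>>
>>         if (keep_trying) {
>>             DPRINTF(HWPrefetch, "addr 0x%x in cache, skipping\n",
>>                     pkt->getAddr());
>>             delete pkt->req;
>>             delete pkt;
>>         }
>>
>>         if (pf.empty()) {
>>             cache->deassertMemSideBusRequest(BaseCache::Request_PF);
>>             if (keep_trying) {
>>                 return NULL; // None left, all were in cache
>>             }
>>         }
>>     } while (keep_trying);
>>
>> Since keep_trying never becomes true, the body runs exactly once and we
>> simply return the packet we popped.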
>
> The valgrind log showed the following to be a major source of lost memory.
> This is from a run that was only a fraction of the full run.
> ==18669== 12,816,136 (6,038,400 direct, 6,777,736 indirect) bytes in 3,145 blocks are definitely lost in loss record 3,109 of 3,110
> ==18669==    at 0x4A062CA: operator new[](unsigned long) (vg_replace_malloc.c:264)
> ==18669==    by 0x9D56EC: FastAlloc::moreStructs(int) (fast_alloc.cc:62)
> ==18669==    by 0x86D403: FastAlloc::allocate(unsigned long) (fast_alloc.hh:170)
> ==18669==    by 0x86D42A: FastAlloc::operator new(unsigned long) (fast_alloc.hh:200)
> ==18669==    by 0x905AA9: BasePrefetcher::notify(Packet*&, long) (base.cc:235)
> ==18669==    by 0x8E31BA: Cache<LRU>::timingAccess(Packet*) (cache_impl.hh:449)
> ==18669==    by 0x8E3ACB: Cache<LRU>::CpuSidePort::recvTiming(Packet*) (cache_impl.hh:1409)
> ==18669==    by 0x86A6A7: Port::sendTiming(Packet*) (port.hh:186)
> ==18669==    by 0x9C005F: TimingSimpleCPU::handleReadPacket(Packet*) (timing.cc:259)
> ==18669==    by 0x9C1AE8: TimingSimpleCPU::sendData(Request*, unsigned char*, unsigned long*, bool) (timing.cc:282)
> ==18669==    by 0x9C2D29: TimingSimpleCPU::finishTranslation(WholeTranslationState*) (timing.cc:660)
> ==18669==    by 0x9CB35C: DataTranslation<TimingSimpleCPU>::finish(RefCountingPtr<FaultBase>, Request*, ThreadContext*, BaseTLB::Mode) (translation.hh:233)
>
> The packets created in BasePrefetcher::notify() are put in a list and taken
> out in BasePrefetcher::getPacket(), and with keep_trying always set to false
> in getPacket() I was not sure that was the intended behavior: packets whose
> addresses are already in the cache get returned anyway and are subsequently
> leaked. There might be better ways of fixing this, though.
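>
> Schematically (this is only my reading of the code, not the exact source),
> the allocation and the missing cleanup look like this:
>
>     // BasePrefetcher::notify(), base.cc:235 in the trace above:
>     PacketPtr prefetch = new Packet(...);   // comes out of FastAlloc
>     pf.push_back(prefetch);
>
>     // Any packet that is dropped instead of being issued needs both of
>     // these, which is what the in-cache branch of my patch adds:
>     delete pkt->req;
>     delete pkt;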
>
>>
>> - The stride.cc fix makes sense, but isn't it adequate to just add:
>>    delete *min_pos;
>> in front of tab.erase(min_pos) without adding the min_entry variable?
>
> Yes. That should do it.
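>
> In other words (untested, but it is just your suggestion applied), the end
> of that block becomes:
>
>             DPRINTF(HWPrefetch, "  replacing PC %x\n",
>                     (*min_pos)->instAddr);
>             delete *min_pos;
>             tab.erase(min_pos);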
>
> On Wed, Jun 2, 2010 at 2:02 PM, Steve Reinhardt <[email protected]> wrote:
>>
>> On Wed, Jun 2, 2010 at 10:37 AM, Ankit Sethia <[email protected]> wrote:
>> > Hi,
>> >
>> > After running valgrind I found the leaks. I have fixed them locally, and
>> > my applications don't blow up in memory after the fix. Here is the diff;
>> > kindly let me know if it is correct:
>> >
>> > diff -r 84bd4089958b src/mem/cache/prefetch/base.cc
>> > --- a/src/mem/cache/prefetch/base.cc    Tue May 25 20:15:44 2010 -0700
>> > +++ b/src/mem/cache/prefetch/base.cc    Wed Jun 02 13:21:46 2010 -0400
>> > @@ -138,23 +138,25 @@
>> >      }
>> >
>> >      PacketPtr pkt;
>> > -    bool keep_trying = false;
>> > +    bool keep_trying = true;
>> >      do {
>> >          pkt = *pf.begin();
>> >          pf.pop_front();
>> >
>> > -        if (keep_trying) {
>> > +        if ((pkt != (*pf.end())) && inCache(pkt->getAddr())) {
>> > +            keep_trying = true;
>> >              DPRINTF(HWPrefetch, "addr 0x%x in cache, skipping\n",
>> >                      pkt->getAddr());
>> >              delete pkt->req;
>> >              delete pkt;
>> >          }
>> > +        else {
>> > +            keep_trying = false;
>> > +        }
>> >
>> > -        if (pf.empty()) {
>> > +        if (keep_trying && pf.empty()) {
>> >              cache->deassertMemSideBusRequest(BaseCache::Request_PF);
>> > -            if (keep_trying) {
>> > -                return NULL; // None left, all were in cache
>> > -            }
>> > +            return NULL; // None left, all were in cache
>> >          }
>> >      } while (keep_trying);
>> >
>> > diff -r 84bd4089958b src/mem/cache/prefetch/stride.cc
>> > --- a/src/mem/cache/prefetch/stride.cc    Tue May 25 20:15:44 2010 -0700
>> > +++ b/src/mem/cache/prefetch/stride.cc    Wed Jun 02 13:21:46 2010 -0400
>> > @@ -110,13 +110,16 @@
>> >          if (tab.size() >= 256) { //set default table size is 256
>> >              std::list<StrideEntry*>::iterator min_pos = tab.begin();
>> >              int min_conf = (*min_pos)->confidence;
>> > +           StrideEntry* min_entry = NULL;
>> >              for (iter = min_pos, ++iter; iter != tab.end(); ++iter) {
>> >                  if ((*iter)->confidence < min_conf){
>> >                      min_pos = iter;
>> >                      min_conf = (*iter)->confidence;
>> > +                    min_entry = *iter;
>> >                  }
>> >              }
>> >              DPRINTF(HWPrefetch, "  replacing PC %x\n",
>> >                      (*min_pos)->instAddr);
>> > +           delete min_entry;
>> >              tab.erase(min_pos);
>> >          }
>> >
>> > On Tue, Jun 1, 2010 at 7:46 PM, Korey Sewell <[email protected]> wrote:
>> >>
>> >>> Has anyone come across such an issue, or can anyone suggest the
>> >>> appropriate way of finding out where the problem lies?
>> >>
>> >> Try using "valgrind" to check where the memory leak is coming from. It's
>> >> a tremendously useful tool.
>> >>
>> >> You'll want to run on a debug binary and for a limited number of cycles
>> >> as well.
>> >>
>> >> A command line such as "valgrind --tool=memcheck --leak-check=yes
>> >> <cmd_line>" usually works for me (it's a lot of output, be warned).
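>> >>
>> >> For instance (the paths below are just an example -- substitute your own
>> >> binary, config script, and workload):
>> >>
>> >>     valgrind --tool=memcheck --leak-check=yes \
>> >>         ./build/X86_SE/m5.debug configs/example/se.py -c <your_binary>
>> >>
>> >> and kill it (or cap the simulated ticks) once it has run long enough to
>> >> show the leak.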
>> >>
>> >>
>> >>>
>> >>> The GHB prefetching mechanism works fine in all cases.
>> >>>
>> >>> Another problem that I am currently facing is as follows:
>> >>> there are only integers and floating-point numbers in the application
>> >>> that I am trying to run.
>> >>
>> >> Check the mailing list archives, but I'm not sure floating-point support
>> >> is all the way there for x86. I'll let someone else chime in.
>> >>
>> >>
>> >>>
>> >>> build/X86_SE/cpu/simple/timing.cc:438: Fault TimingSimpleCPU::read(Addr,
>> >>> T&, unsigned int) [with T = uint64_t]: Assertion `split_addr <= addr ||
>> >>> split_addr - addr < block_size' failed.
>> >>
>> >> That looks like an error in the code trying to determine something with
>> >> split accesses (accesses that span multiple cache blocks)...
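>> >>
>> >> My rough reading of what that assert is guarding (hypothetical -- I have
>> >> not gone back to timing.cc to check, and access_size below is just a
>> >> stand-in name):
>> >>
>> >>     Addr split_addr = roundDown(addr + access_size - 1, block_size);
>> >>     // Either the access fits in one block (split_addr <= addr), or it
>> >>     // spills into exactly one more block (split_addr - addr < block_size);
>> >>     // an access spanning more than two blocks trips the assertion.
>> >>     assert(split_addr <= addr || split_addr - addr < block_size);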
>> >>
>> >> It could be something to do with your prefetcher... or it could be
>> >> something with the memory leak corrupting values.
>> >>
>> >>
>> >>
>> >> --
>> >> - Korey
>> >>
>> >
>> >
>> >
>> > --
>> > Ankit
>> >
>
>
>
> --
> Ankit
>
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
