Thanks for your help. Yesterday we managed to find and fix the bug. For the record, here is a brief description of what the problem was and how we found it, in case someone runs into a similar issue.
First, we enabled the proxy.config.dump_mem_info_frequency setting. The dump clearly showed that our TSIOBuffers remained in use after we called TSIOBufferDestroy on them (the affected row is marked with asterisks):

     allocated      |       in-use       | type size  |   free list name
--------------------|--------------------|------------|----------------------------------
                  0 |                  0 |    2097152 | memory/ioBufAllocator[14]
                  0 |                  0 |    1048576 | memory/ioBufAllocator[13]
                  0 |                  0 |     524288 | memory/ioBufAllocator[12]
                  0 |                  0 |     262144 | memory/ioBufAllocator[11]
                  0 |                  0 |     131072 | memory/ioBufAllocator[10]
                  0 |                  0 |      65536 | memory/ioBufAllocator[9]
*          98566144 |           98304000 |      32768 | memory/ioBufAllocator[8]*
                  0 |                  0 |      16384 | memory/ioBufAllocator[7]
             262144 |                  0 |       8192 | memory/ioBufAllocator[6]
...

Then we found that the cause was that we were calling TSIOBufferDestroy while handling the TS_EVENT_HTTP_TXN_CLOSE event. When this event fires, ATS still holds references to the buffer's blocks, so calling Destroy decremented the blocks' reference counters, but they did not reach 0, and the blocks were never released.

Also, when handling the event, we were calling TSVConnShutdown(vc, 1, 1), which prevented further events from being received.

The fix was to remove the call to shutdown and to destroy the objects only once both connections were closed.

Best regards,
Acácio.

Acácio Centeno
Software Engineering
Azion Technologies
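For the record, the reference-counting behaviour described above can be illustrated with a small standalone C model. This is only a sketch: ModelBlock, ModelTxnState and the model_* functions are hypothetical names invented for illustration and are not part of the ATS API. It shows why dropping one reference does not release a block that another holder still references, and one possible way to defer destruction until every connection has reported its close.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted buffer block.
 * This models the idea only; it is not the ATS block type. */
typedef struct {
    int  refcount;  /* holders that still reference the block */
    bool released;  /* true once the block would return to its freelist */
} ModelBlock;

/* A holder (the plugin's buffer, or the server itself) takes a reference. */
static void model_block_acquire(ModelBlock *b)
{
    b->refcount++;
}

/* A holder drops its reference; the block is released only at zero. */
static void model_block_release(ModelBlock *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        b->released = true;
    }
}

/* Hypothetical per-transaction state used to defer destruction. */
typedef struct {
    int  open_conns;  /* connections that have not closed yet */
    bool destroyed;   /* guards against destroying twice */
} ModelTxnState;

/* Call once per closed connection. Returns true exactly once, when the
 * last connection has closed and the buffers may be destroyed. */
static bool model_conn_closed(ModelTxnState *s)
{
    assert(s->open_conns > 0);
    s->open_conns--;
    if (s->open_conns == 0 && !s->destroyed) {
        s->destroyed = true;
        return true;
    }
    return false;
}
```

In this model, a block acquired by two holders stays unreleased after the first model_block_release call, which mirrors the leak seen in the dump. With open_conns initialised to 2, model_conn_closed returns true only on the second call, which is the point where a plugin following the fix above would destroy its buffers.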
On Wed, Jul 16, 2014 at 11:30 PM, Yunkai Zhang <yunkai...@gmail.com> wrote:

> Why not use the following option to dump memory usage?
>
> # Great for tracking down memory leaks, but you need to use the
> # ink allocators
> CONFIG proxy.config.dump_mem_info_frequency INT 0
>
> I always use it to detect memory leak when working with freelist.
>
> There are two parts memory in ATS:
> 1) allocated by malloc()/new() directly.
> 2) managed by freelist memory pool.
>
> Firstly, we should judge out where the memory leak comes from? The above
> option can help you to judge.
>
> If memory leak comes from 2), you can find out which Class objects are leak
> by analyzing the dumping output.
>
> If memory leak comes from 1), you can use SystemTap to see whether the
> 'malloc()' and 'free()' are paired.
>
>
> On Thu, Jul 17, 2014 at 8:40 AM, Leif Hedstrom <zw...@apache.org> wrote:
>
> >
> > On Jul 16, 2014, at 12:07 PM, James Peach <jpe...@apache.org> wrote:
> >
> > > Sometimes I've gone to the extent of using placement new() in order to
> > > be consistent about using TSmalloc+TSfree, but there's no additional
> > > memory leak tracking in those. The only practical difference I can
> > > think of is that if ATS is using tcmalloc, then this would ensure your
> > > plugin gets it too.
> >
> > yeah, tcmalloc is nice (and easy to use with ATS). I'm not sure why
> > Valgrind isn't working for you, best I can suggest is that you setup a
> > dev environment with CentOS7 or Fedora Core 20, and run it there ? I
> > generally find Valgrind easy to use, albeit, incredibly slow.
> >
> > Cheers,
> >
> > -- Leif
> >
>
> --
> Yunkai Zhang
> Work at Taobao
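As a rough aid for analysing the dump output mentioned in the thread, the following C sketch parses one data row of the "allocated | in-use | type size | free list name" layout shown in the dump at the top of this message. FreelistRow, parse_freelist_row and objects_in_use are hypothetical names, not ATS functions. The sketch assumes the first two columns are byte totals, which the numbers in the dump are consistent with (98304000 / 32768 divides evenly), and that rows carry no emphasis markers such as the asterisks added above.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed data row of the freelist dump (hypothetical helper type). */
typedef struct {
    long long allocated;  /* assumed: bytes allocated for this freelist */
    long long in_use;     /* assumed: bytes currently in use */
    long long type_size;  /* size of one object, in bytes */
    char      name[128];  /* free list name, e.g. memory/ioBufAllocator[8] */
} FreelistRow;

/* Parses one line. Returns 1 for a data row, 0 for headers, separator
 * lines, "..." and anything else that does not match the layout. */
static int parse_freelist_row(const char *line, FreelistRow *row)
{
    assert(line != NULL && row != NULL);
    return sscanf(line, " %lld | %lld | %lld | %127s",
                  &row->allocated, &row->in_use,
                  &row->type_size, row->name) == 4;
}

/* Number of objects still in use, or 0 if the type size is not positive. */
static long long objects_in_use(const FreelistRow *row)
{
    assert(row != NULL);
    if (row->type_size <= 0) {
        return 0;
    }
    return row->in_use / row->type_size;
}
```

Applied to the ioBufAllocator[8] row above, this gives 3000 objects of 32768 bytes still in use. A freelist whose in-use count keeps growing across successive dumps, or stays high when the server is idle, is a reasonable candidate for the kind of leak discussed in this thread.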