I had similar situations a lot of times at different customers' locations.

It always turned out that, although the developers at the customer's site told me they had returned all the requested storage to the heap, in fact they hadn't. There was always a small area left which was not returned or freed, and this area was responsible for the heap growing on every call.

What I did to diagnose the problem has already been done here, as far as I can see. The most interesting tool is the alternate heap manager (CEL4MCHK, IIRC), which lets you track all memory allocations and frees, and it even gives you the stack trace at the point where each allocation was made.

I then wrote a procedure (in REXX, IIRC) which processed the output of CEL4MCHK, just to see whether there was a pattern in the areas that remained allocated. For example: I run exactly 1000 requests and then look for areas (of a certain size) which remain allocated and appear in the list 1000 times (or a multiple of 1000 times). Then I do the same with 2000 requests and check whether the same area now appears 2000 times, and so on. (There is no need, BTW, to run the tests until all the memory is used up; if you do, your traces will grow much too large.)

Areas whose count stays constant when I change the number of calls are of no interest. But if I find areas whose count changes exactly with the number of calls, I know the place where the allocation without a matching free was done. Then I call the developer responsible for the module that does that allocation and ask him or her to fix it.
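The counting idea above can be sketched roughly as follows (in Python rather than REXX, and with an invented record format; the real CEL4MCHK output looks different, so treat this purely as an illustration of the grouping-and-counting logic):

```python
from collections import Counter

def leak_suspects(records, n_calls):
    """Group still-allocated areas by (size, caller) and flag groups
    whose count is an exact multiple of the number of requests run.
    `records` is a list of (size, caller) tuples; this format is
    invented for illustration, not the real heap-checker output."""
    counts = Counter(records)
    return {key: cnt for key, cnt in counts.items()
            if cnt % n_calls == 0}

# Example: after 1000 requests, two areas show up 1000 and 2000 times
# (leak candidates), while a constant-count area (3 startup
# allocations) is filtered out.
records = ([(48, "parse_json")] * 1000 +
           [(16, "node_alloc")] * 2000 +
           [(256, "init_pool")] * 3)
print(leak_suspects(records, 1000))
# → {(48, 'parse_json'): 1000, (16, 'node_alloc'): 2000}
```

Re-running with a different request count (say 2000) and keeping only the groups flagged both times is what separates real per-call leaks from coincidental counts.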

I've done this many times, and every time I found the module or function responsible for the storage leak within a few hours. The languages involved (which led to the storage leaks) were C, C++ (often), and even PL/I - it doesn't really matter, because it's all LE.

AFAIK, for Windows, Linux, and similar platforms there is a tool called Valgrind which does a similar analysis.

The REXX procedure was used primarily to sort and group the requests in the CEL4MCHK output by size, caller sequence, etc., so that the numbers (like the 1000 and 2000 above) can easily be recognized.

That said: there is of course a small chance that the problem is not in the user's (or customer's) code but in some of the vendor's functions (in this case IBM's, for example the JSON processors mentioned). But IMO the probability is low ... although in my career there were some rare situations where, after four weeks of examining error situations, it REALLY turned out that the error was in the IBM part ... and it took me some time to convince IBM. If you want, I can tell you more about this ... but offline.

But even if that were the case in your situation, the CEL4MCHK method IMO would detect it. Honestly: I believe you will find that the error is in the customer's code.

HTH, kind regards

Bernd


Am 04.01.2024 um 22:45 schrieb Eric Erickson:
We are in a bit of a quandary here with some memory issues surrounding our 
application. This is a multitasking LE C application running in 31 bit mode 
that utilizes the IBM JSON and EZNOSQL Services. Some of the attributes are:

•       z/OS V2.5 Operation System
•       POSIX(OFF) - all tasks/subtasks
•       Single address space (31 Bit Mode)
•       ATTACHX Multi-tasking model (no pthreads)
•       Execute as started task – Problem State – Key 4
•       Drop in/out of supervisor state as needed
•       3 EZNOSQL Databases are opened at application start and remain open 
until termination
•       Open EZNOSQL connections tokens are passed to the worker task(s) along 
with the unit of work to be processed

Our issue is that the total heap grows until we exhaust all available memory and the 
application inevitably fails. The key here is that while the total heap grows with 
every unit of work processed by the tasks, the in-use amount shows no increment, or 
only a small one (<128 bytes), between units of work. For example, here is a heap 
report example (using the LE __heaprpt function). So we are fairly confident that 
our application code is not leaking memory.

HeapReport: ZdpQuery @Start  - Total/In Use/Available:   1048576/    888160/    160416.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   1048576/    888160/    160416.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   1560856/    888192/    672664.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   1560856/    888192/    672664.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   2073088/    888224/   1184864.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   2073088/    888224/   1184864.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   2073088/    888224/   1184864.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   2073088/    888224/   1184864.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   2585376/    888256/   1697120.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   2585376/    888256/   1697120.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   2585376/    888256/   1697120.
HeapReport: ZdpQuery @Enter  - Total/In Use/Available:   2585376/    888256/   1697120.
HeapReport: ZdpQuery @Exit   - Total/In Use/Available:   2585376/    888256/   1697120.
HeapReport: ZdpQuery @Finish - Total/In Use/Available:   2585376/    888256/   1697120.

The @Start and @Finish lines show the heap report results just after the task 
is attached and before it terminates. Each of the @Enter/@Exit lines show the 
heap at the unit of work start and end processing, respectively.

We are at a loss to explain why the heap keeps growing. We would expect the heap to 
grow to some high-water mark and stabilize, but the total size just keeps growing 
until the application fails with an out-of-memory condition, even though there is a 
significant amount of heap storage available. Our tasks are returning all the 
storage they directly allocate back to the heap, as indicated by the in-use figures 
at start and end. While there is a small increment in the in-use number, we think 
that may just be LE overhead in managing the heap; in any case it is generally less 
than 128 bytes per iteration, and only appears when the total heap size increases. 
What makes this example even more interesting is that we are processing the exact 
same request for each iteration.
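The pattern in a report like the one above can be checked mechanically: extract the Total/In Use/Available triples and compute per-step deltas. A minimal sketch in Python (the sample lines below are abbreviated copies of the report quoted above; the regex assumes that line format):

```python
import re

def heap_deltas(report_lines):
    """Extract Total/In Use/Available triples from __heaprpt-style
    report lines and return (total_delta, in_use_delta) per step.
    A growing total with a near-zero in-use delta points at heap
    segment growth rather than an ordinary application-level leak."""
    triples = []
    for line in report_lines:
        m = re.search(r"Total/In Use/Available:\s*(\d+)/\s*(\d+)/\s*(\d+)", line)
        if m:
            triples.append(tuple(int(g) for g in m.groups()))
    return [(b[0] - a[0], b[1] - a[1])
            for a, b in zip(triples, triples[1:])]

report = [
    "HeapReport: ZdpQuery @Start  - Total/In Use/Available: 1048576/ 888160/ 160416.",
    "HeapReport: ZdpQuery @Exit   - Total/In Use/Available: 1560856/ 888192/ 672664.",
    "HeapReport: ZdpQuery @Exit   - Total/In Use/Available: 2073088/ 888224/ 1184864.",
]
print(heap_deltas(report))
# → [(512280, 32), (512232, 32)]
```

On the numbers quoted, each growth step adds roughly 512 KB to the total while in-use moves by only 32 bytes, which is consistent with new heap increments being obtained even though previously freed storage should be available.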

We've turned on all the various LE memory analysis options (HEAPCHK, RPTSTG) 
and utilized the LE alternate heap manager to detect overlays, corruption, 
etc. This pointed us to a couple of minor leaks, which we plugged, but it has not 
led us to an answer on the growing heap. We make heavy use of the IBM JSON and 
EZNOSQL services during processing.

We are in search of any insight or recommendations on how to proceed in 
diagnosing this issue.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
