[ https://issues.apache.org/jira/browse/TRAFODION-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315550#comment-16315550 ]

Selvaganesan Govindarajan commented on TRAFODION-2888:
------------------------------------------------------

It is true that every effort should be made to ensure the process continues to 
run, but to retain the stability of the cluster, it should be acceptable to 
bring down a process or node.

I think the out-of-memory (OOM) condition and memory allocation failure are 
entirely orthogonal.
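
As a minimal sketch of the difference (illustrative code, not Trafodion 
source): an allocation failure is something the process itself can observe 
and react to, whereas an OOM kill gives it no chance at all.

#include <cstdio>
#include <cstdlib>
#include <new>

int main() {
  // malloc reports failure by returning nullptr (a deliberately absurd
  // request; on kernels with unlimited overcommit it may still succeed).
  void *p = std::malloc(1ULL << 46);
  if (p == nullptr)
    std::fprintf(stderr, "malloc failed; the process can still react\n");
  else
    std::free(p);

  // operator new reports failure by throwing std::bad_alloc.
  try {
    char *q = new char[1ULL << 46];
    delete[] q;
  } catch (const std::bad_alloc &) {
    std::fprintf(stderr, "new threw bad_alloc; the process can still react\n");
  }

  // An OOM kill, by contrast, is a SIGKILL from the kernel: no handler
  // runs, no cleanup happens, and no code like the above gets a chance.
  return 0;
}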

An OOM condition can happen when there is memory pressure or when RAM is 
exhausted. It could be due to:
a)      There being more processes in the system than the system can handle
b)      Some processes building up virtual memory due to a memory leak
c)      Running out of swap space
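
As a hedged sketch of how such pressure could be observed, assuming Linux 
and the standard /proc/meminfo keys (the helper and the interpretation of 
the values are illustrative, not existing Trafodion code):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Read one numeric field (reported in kB) from /proc/meminfo, or -1 if absent.
static long meminfoKb(const std::string &key) {
  std::ifstream in("/proc/meminfo");
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, key.size(), key) == 0 && line[key.size()] == ':') {
      std::istringstream fields(line.substr(key.size() + 1));
      long kb = 0;
      fields >> kb;
      return kb;
    }
  }
  return -1;
}

int main() {
  long availKb = meminfoKb("MemAvailable");
  long swapKb  = meminfoKb("SwapFree");
  std::cout << "MemAvailable: " << availKb << " kB, SwapFree: "
            << swapKb << " kB\n";
  // A monitor could treat low MemAvailable plus near-zero SwapFree as the
  // swap-exhaustion case (c) above and shed load before the OOM killer acts.
  return 0;
}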

Memory allocation failure rarely happens in a 64-bit addressing scheme unless 
some limit like the number of PTEs (page table entries) is reached at either 
the process or the system level. The process dump I analyzed had allocated a 
huge amount of memory, of which only 1.6 GB was SQL memory accounted for via 
Trafodion heap management.

An OOM condition can lead to memory allocation failure, but by then it is too 
late, because the OOM killer would already have kicked in and killed some 
process, which would have made the node unusable anyway.

If longjmp/setjmp needs to work with the heap correctly, it needs to be 
associated with the top-level heap cli_globals::executorMemory (even for the 
ESP process) because EsgynDB heap management is hierarchical. A lower-level 
heap requests that its parent heap allocate a block if it can't satisfy the 
request from an already allocated block. This continues until it reaches the 
top-level heap. In the case of multi-threaded ESPs, this heap is used from 
multiple threads by marking the heap as thread safe. But setjmp/longjmp are 
thread-safe only when the code never does a setjmp in one thread and a 
longjmp to that context from another thread. It is not possible to guarantee 
that for setjmp/longjmp in a multi-threaded ESP.
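
A simplified sketch of that upward walk (illustrative only, not the actual 
NAMemory/NAHeap implementation):

#include <cstddef>
#include <cstdlib>
#include <vector>

class SimpleHeap {
public:
  explicit SimpleHeap(SimpleHeap *parent = nullptr) : parent_(parent) {}

  void *allocate(std::size_t size) {
    if (void *mem = allocateLocally(size))
      return mem;                        // satisfied from an existing block
    if (parent_ == nullptr)
      return std::malloc(size);          // top-level heap goes to the OS
    // Can't satisfy locally: ask the parent heap for a fresh block,
    // which is the upward walk described above.
    void *block = parent_->allocate(blockSizeFor(size));
    if (block == nullptr)
      return nullptr;                    // failure propagates back down
    blocks_.push_back(block);
    return carveFrom(block, size);
  }

private:
  // Placeholder internals; a real heap manages free lists per block.
  void *allocateLocally(std::size_t) { return nullptr; }
  void *carveFrom(void *block, std::size_t) { return block; }
  static std::size_t blockSizeFor(std::size_t size) {
    return size < 4096 ? 4096 : size;
  }

  SimpleHeap *parent_;
  std::vector<void *> blocks_;
};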

NAMemory::setJmpBuf is supposed to assert when threadSafe is set to true. 
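
That guard would look something like the following, where the method and 
flag names are taken from the prose above rather than verified source:

#include <cassert>
#include <csetjmp>

class NAMemoryLike {
public:
  void setJmpBuf(jmp_buf *buf) {
    // A longjmp target cannot be honored safely when the heap is shared
    // across threads, so refuse to register one on a thread-safe heap.
    assert(!threadSafe_ && "setJmpBuf called on a thread-safe heap");
    jmpBuf_ = buf;
  }
  void setThreadSafe(bool on) { threadSafe_ = on; }

private:
  bool threadSafe_ = false;
  jmp_buf *jmpBuf_ = nullptr;
};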

In the legacy Trafodion code, all memory allocations in the executor came from 
the heap infrastructure. That isn't the case anymore. I have seen Trafodion 
heap memory constitute less than 10-20% of the total virtual memory of the 
process. In some scenarios it could be much worse because of memory 
fragmentation, as seen in the core dump. So a memory allocation failure (if it 
happens) is most likely to occur in other parts of the code.

So, it is imperative that memory growth/leaks be managed proactively in 
Trafodion processes. My suggestion would be to look for memory pressure in the 
cluster, or for virtual memory growth in the process, at some logical points. 
For example, it is possible to prevent new queries in mxosrvr if the virtual 
memory of the mxosrvr process exceeds a certain value. This restriction would 
make sense when the application needs to execute multiple statements 
simultaneously. If there is only one user SQL statement active at any point in 
time, then memory growth seen in the mxosrvr process is most likely due to a 
memory leak. Currently the Trafodion code doesn't detect such a leak and 
recover the mxosrvr before the next user SQL statement is submitted to it, but 
it is possible to incorporate such self-healing concepts.
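
A sketch of such an admission check, assuming Linux and reading VmSize from 
/proc/self/status; the ceiling and the function names are purely illustrative, 
not existing mxosrvr code:

#include <fstream>
#include <sstream>
#include <string>

// Return the process's virtual size in kB, or -1 if it can't be read.
static long vmSizeKb() {
  std::ifstream in("/proc/self/status");
  std::string line;
  while (std::getline(in, line)) {
    if (line.rfind("VmSize:", 0) == 0) {
      std::istringstream fields(line.substr(7));
      long kb = 0;
      fields >> kb;           // VmSize is reported in kB
      return kb;
    }
  }
  return -1;
}

// True if a new query may be admitted under the given virtual-memory ceiling.
bool mayAdmitNewQuery(long ceilingKb) {
  long vm = vmSizeKb();
  return vm >= 0 && vm < ceilingKb;
}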

To ensure that the process continues to run, the setjmp/longjmp concepts are 
retained in the compiler for all cases other than memory allocation failure 
(which shouldn't happen at all).
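
For reference, the retained pattern is the classic same-thread recovery 
idiom, sketched below (illustrative, not actual compiler code):

#include <csetjmp>
#include <cstdio>

static jmp_buf recoveryPoint;

static void deepInternalError() {
  // On an internal error, unwind directly to the recovery point. This is
  // only valid because the longjmp runs on the same thread that executed
  // the setjmp, and the frame holding recoveryPoint is still live.
  longjmp(recoveryPoint, 1);
}

int main() {
  if (setjmp(recoveryPoint) == 0) {
    deepInternalError();       // simulate a failure deep in a call chain
    std::puts("never reached");
  } else {
    std::puts("recovered at the setjmp point; the process keeps running");
  }
  return 0;
}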



> Streamline setjmp/longjmp concepts in Trafodion
> -----------------------------------------------
>
>                 Key: TRAFODION-2888
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2888
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-general
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>             Fix For: 2.3
>
>
> I happened to come across a core dump with longjmp in the executor layer 
> that brought down the node. Unfortunately, the core dump wasn't useful for 
> figuring out the root cause of the longjmp.  Hence,
> a)    I wonder if there is a way to figure out what caused the longjmp from 
> the core?
> b)    If not, why do a longjmp at all? It might be better to let the process 
> dump naturally by accessing the invalid address or null pointer right at the 
> point of failure. Was longjmp put in place in the legacy Trafodion code base 
> to avoid the node being brought down when privileged code runs into a 
> segmentation violation?
> If a) is not possible, I would want to remove the remnants of setjmp and 
> longjmp from the code to enable us to debug the issue better.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
