[ https://issues.apache.org/jira/browse/HDFS-12029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062105#comment-16062105 ]

Nandakumar commented on HDFS-12029:
-----------------------------------

We can also use {{-Xss1280k}} as a workaround, as suggested [here|https://issues.apache.org/jira/browse/DAEMON-363?focusedCommentId=16060779&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16060779]. This value is the safer option since it is closer to the default of 1024k.
It has to be set for the jsvc process.
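One way this could be wired in, as a sketch (it assumes, as in the stock {{bin/hdfs}} script, that {{HADOOP_DATANODE_OPTS}} is folded into the options handed to jsvc for the secure datanode):
{code}
# etc/hadoop/hadoop-env.sh: raise the thread stack size for the datanode JVM.
# Assumes bin/hdfs appends HADOOP_DATANODE_OPTS to the jsvc command line.
export HADOOP_DATANODE_OPTS="-Xss1280k $HADOOP_DATANODE_OPTS"

# For comparison, the JVM default can be checked (ThreadStackSize is in kB):
java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
{code}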
After setting this value, I verified the stack size allocated to threads in the Java process launched by jsvc:

{quote}
7f962acb2000-7f962adf0000 rw-p 00000000 00:00 0   stack:19767   Size: 1272 kB
--
7f962adf3000-7f962af31000 rw-p 00000000 00:00 0   stack:19766   Size: 1272 kB
--
7f962af34000-7f962b072000 rw-p 00000000 00:00 0   stack:19765   Size: 1272 kB
--
7f962b075000-7f962b1b3000 rw-p 00000000 00:00 0   stack:19764   Size: 1272 kB
--
7f962b1b6000-7f962b2f4000 rw-p 00000000 00:00 0   stack:19763   Size: 1272 kB
{quote}
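The figures above come from the kernel's smaps interface. A sketch of the kind of check used (the {{pgrep}}-based PID discovery is illustrative rather than taken from the original report, the {{stack:<tid>}} annotation is specific to 3.10-era kernels, and the exact output layout differs slightly):
{code}
# Locate the jsvc-launched secure datanode JVM (any way of finding the PID
# works) and print each thread stack mapping followed by its mapped size.
PID=$(pgrep -f SecureDataNodeStarter | head -n 1)
grep -A 1 'stack:' /proc/$PID/smaps
{code}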

>  Data node process crashes after kernel upgrade
> -----------------------------------------------
>
>                 Key: HDFS-12029
>                 URL: https://issues.apache.org/jira/browse/HDFS-12029
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Anu Engineer
>            Assignee: Nandakumar
>            Priority: Blocker
>
> We have seen that when the Linux kernel is upgraded to address a specific CVE
> (https://access.redhat.com/security/vulnerabilities/stackguard), it might
> cause a datanode crash.
> We have observed this issue while upgrading the kernel from 3.10.0-514.6.2 to
> 3.10.0-514.21.2.
> The original kernel fix is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1be7107fbe18eed3e319a6c3e83c78254b693acb
> The datanode fails with the following stack trace:
> {noformat}
> # 
> # A fatal error has been detected by the Java Runtime Environment: 
> # 
> # SIGBUS (0x7) at pc=0x00007f458d078b7c, pid=13214, tid=139936990349120 
> # 
> # JRE version: (8.0_40-b25) (build ) 
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b25 mixed mode 
> linux-amd64 compressed oops) 
> # Problematic frame: 
> # j java.lang.Object.<clinit>()V+0 
> # 
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again 
> # 
> # An error report file with more information is saved as: 
> # /tmp/hs_err_pid13214.log 
> # 
> # If you would like to submit a bug report, please visit: 
> # http://bugreport.java.com/bugreport/crash.jsp 
> # 
> {noformat}
> The root cause is a failure in jsvc. Passing a stack size argument greater
> than 1 MB mitigates the problem. Something like:
> {code}
> exec "$JSVC" \
>     -Xss2m \
>     org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
> {code}
> This JIRA tracks potential fixes for this problem. We don't have data on how
> this impacts other applications that run on the datanode, since a larger
> stack size might increase the datanode's memory usage.
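> As a rough, illustrative calculation of that memory concern (the thread
> count below is hypothetical, not a measurement from this cluster):
> {noformat}
> 500 threads x 2048 kB (-Xss2m)    ~ 1000 MB of reserved stack address space
> 500 threads x 1280 kB (-Xss1280k) ~  625 MB
> 500 threads x 1024 kB (default)   ~  500 MB
> {noformat}
> Stack pages are faulted in on demand, so the resident-memory impact is
> normally much smaller than the reserved address space.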



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
