[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345361#comment-16345361 ]

Jim Brennan commented on YARN-7796:
-----------------------------------

I've filed a new Jira for this binary compatibility issue: YARN-7857

> Container-executor fails with segfault on certain OS configurations
> -------------------------------------------------------------------
>
>                 Key: YARN-7796
>                 URL: https://issues.apache.org/jira/browse/YARN-7796
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>             Fix For: 3.1.0, 3.0.1
>
>         Attachments: YARN-7796.000.patch, YARN-7796.001.patch, YARN-7796.002.patch
>
>
> There is a relatively big (128K) buffer allocated on the stack in
> container-executor.c for the purpose of copying files. As indicated by the
> below gdb stack trace, this allocation can fail with SIGSEGV. This happens
> only on certain OS configurations - I can reproduce this issue on RHEL 6.9:
> {code:java}
> [Thread debugging using libthread_db enabled]
> main : command provided 0
> main : run as user is ***
> main : requested yarn user is ***
> Program received signal SIGSEGV, Segmentation fault.
> 0x004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
>     out_filename=0x932930 "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_01.tokens", perm=384)
>     at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> 966     char buffer[buffer_size];
> (gdb) bt
> #0  0x004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
>     out_filename=0x932930 "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_01.tokens", perm=384)
>     at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> #1  0x00409a81 in initialize_app (user=<value optimized out>, app_id=0x7ffd669fd2b7 "application_1516711246952_0001",
>     nmPrivate_credentials_file=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
>     local_dirs=0x9331c8, log_roots=<value optimized out>, args=0x7ffd669fb168)
>     at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:1122
> #2  0x00403f90 in main (argc=<value optimized out>, argv=<value optimized out>)
>     at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:558
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
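The issue description above points at a 128K buffer declared on the stack (char buffer[buffer_size]; at container-executor.c:966). One common way to sidestep a failing stack allocation is to take the buffer from the heap instead. The following is only a minimal sketch of that idea, not the code from the attached patches: the function name copy_fd_with_heap_buffer and its signature are hypothetical, and it copies between two already-open file descriptors rather than reproducing copy_file.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the YARN-7796 patches): copies all data
 * from in_fd to out_fd through a heap-allocated buffer.
 * Returns 0 on success and -1 on allocation or I/O failure. */
int copy_fd_with_heap_buffer(int in_fd, int out_fd, size_t buffer_size) {
  assert(buffer_size > 0);
  char *buffer = malloc(buffer_size);  /* heap instead of stack */
  if (buffer == NULL) {
    return -1;
  }
  int result = 0;
  while (result == 0) {
    ssize_t len = read(in_fd, buffer, buffer_size);
    if (len == 0) {
      break;  /* end of input */
    }
    if (len < 0) {
      if (errno == EINTR) {
        continue;  /* interrupted, retry the read */
      }
      result = -1;
      break;
    }
    /* write() may be partial, so loop until the chunk is flushed */
    ssize_t pos = 0;
    while (pos < len) {
      ssize_t written = write(out_fd, buffer + pos, (size_t)(len - pos));
      if (written < 0) {
        if (errno == EINTR) {
          continue;
        }
        result = -1;
        break;
      }
      pos += written;
    }
  }
  free(buffer);
  return result;
}
```

With this shape an allocation failure surfaces as a NULL return from malloc that the caller can report, rather than as a SIGSEGV while the stack frame is being set up.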
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343452#comment-16343452 ]

Jim Brennan commented on YARN-7796:
-----------------------------------

[~miklos.szeg...@cloudera.com] - I believe the max stack size reported by ulimit is in KB, so 10 MB for RHEL 6 and 8 MB for RHEL 7. The container-executor 128 KB stack allocation should be well within those limits.

It does appear that gcc is generating dynamic stack-checking code when compiled on RHEL 6, and that seems to work when run on RHEL 6. But the same binary does not seem to work on RHEL 7. This implies to me that the code generated for RHEL 6 is in some way incompatible with the RHEL 7 kernel.
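To make the unit question in the comment above concrete: the value printed by ulimit -s corresponds to the soft RLIMIT_STACK limit, which getrlimit() reports in bytes. The sketch below is an illustration only; the helper names stack_soft_limit_kb and fits_under_stack_limit are hypothetical and do not come from container-executor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Returns the soft stack limit in KB (the unit `ulimit -s` prints),
 * -1 if the limit is unlimited, or -2 if getrlimit() fails. */
long stack_soft_limit_kb(void) {
  struct rlimit limit;
  if (getrlimit(RLIMIT_STACK, &limit) != 0) {
    return -2;
  }
  if (limit.rlim_cur == RLIM_INFINITY) {
    return -1;
  }
  return (long)(limit.rlim_cur / 1024);
}

/* Returns 1 if `bytes` fits under a limit given in KB, 0 otherwise.
 * limit_kb == -1 means unlimited; the -2 error value must be handled
 * by the caller before calling this function. */
int fits_under_stack_limit(size_t bytes, long limit_kb) {
  assert(limit_kb >= -1);
  if (limit_kb == -1) {
    return 1;
  }
  return bytes <= (size_t)limit_kb * 1024;
}
```

For example, fits_under_stack_limit(128 * 1024, 8192) returns 1, since 128 KB is far below 8192 KB (8 MB), whereas the same buffer would not fit if the limit really were 8 KB.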
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343304#comment-16343304 ]

Gergo Repas commented on YARN-7796:
-----------------------------------

[~Jim_Brennan] I have the same gcc version (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)), and yes, when I removed the -fstack-check flag (with this patch reverted), I no longer experienced the segfault.

[~miklos.szeg...@cloudera.com] Your results may suggest that the container-executor issue is not related to the max stack size. I wonder if the following happens: "If neither STACK_CHECK_BUILTIN nor STACK_CHECK_STATIC_BUILTIN is defined, GCC will change its allocation strategy for large objects if the option -fstack-check is specified: they will always be allocated dynamically if their size exceeds STACK_CHECK_MAX_VAR_SIZE bytes." (from https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html).
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341948#comment-16341948 ]

Miklos Szegedi commented on YARN-7796:
--------------------------------------

Now the question is, how does a 128K allocation fill in a stack that is normally 8K? If it is the one that brought up the issue, there should be another big allocation. Do you have a ulimit -s value from a system that reproduces this?
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341932#comment-16341932 ]

Miklos Szegedi commented on YARN-7796:
--------------------------------------

[~Jim_Brennan], [~grepas], the stack depth is specified by {{ulimit -s}}. It is different on Redhat 6 and 7. I also checked below with -fstack-check and it has no impact on the limit.
{code:java}
*** REDHAT 6 ***
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
[root@mybox-rh69 ~]# curl https://gist.githubusercontent.com/szegedim/c583ccead8316b1035bc9148bcf588b9/raw/c0455196b47c76194e37a100964f3b3bf51d4a53/checkstack.cpp >./checkstack.cpp && gcc ./checkstack.cpp -lstdc++ -fstack-check && ./a.out
12051K succeededSegmentation fault (core dumped)
[root@mybox-rh69 ~]# curl https://gist.githubusercontent.com/szegedim/c583ccead8316b1035bc9148bcf588b9/raw/c0455196b47c76194e37a100964f3b3bf51d4a53/checkstack.cpp >./checkstack.cpp && gcc ./checkstack.cpp -lstdc++ && ./a.out
12051K succeededSegmentation fault (core dumped)
[root@mybox-rh69 ~]# ulimit -s
10240

*** REDHAT 7 ***
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
[root@mybox-rh74 ~]# curl https://gist.githubusercontent.com/szegedim/c583ccead8316b1035bc9148bcf588b9/raw/c0455196b47c76194e37a100964f3b3bf51d4a53/checkstack.cpp >./checkstack.cpp && gcc ./checkstack.cpp -lstdc++ -fstack-check && ./a.out
8016K Segmentation fault
[root@mybox-rh74 ~]# curl https://gist.githubusercontent.com/szegedim/c583ccead8316b1035bc9148bcf588b9/raw/c0455196b47c76194e37a100964f3b3bf51d4a53/checkstack.cpp >./checkstack.cpp && gcc ./checkstack.cpp -lstdc++ && ./a.out
8016K Segmentation fault
[root@mybox-rh74 ~]# ulimit -s
8192

*** REDHAT 6 BUILT CODE ON REDHAT 7 ***
[root@mybox-rh74 ~]# scp root@mybox-rh69:/root/a.out ./b.out
a.out 100% 6989 4.4MB/s 00:00
[root@mybox-rh74 ~]# ./b.out
8016K Segmentation fault
{code}
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341628#comment-16341628 ]

Jim Brennan commented on YARN-7796:
-----------------------------------

[~grepas] that is interesting. I wonder if it is the version of gcc that is the issue? This is what I was using on RHEL 6, which causes the problem when running on RHEL 7:

gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)

I'll be interested to hear if removing the -fstack-check flag works in your case.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340756#comment-16340756 ]

Gergo Repas commented on YARN-7796:
-----------------------------------

[~Jim_Brennan] Thanks for the info on -fstack-check, I'll look into it. In my case the compilation and execution both happened on RHEL 6.9.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340101#comment-16340101 ]

Jim Brennan commented on YARN-7796:
-----------------------------------

We ran into this same issue when running a container-executor that was compiled on RHEL 6 on a RHEL 7 system. While we have verified that the patch in this Jira does avoid the segmentation fault, we are concerned that the root cause of the problem remains, and may bite us later.

The -fstack-check flag was added to the command line in YARN-6721.

Based on my testing, a container-executor (without the patch from this Jira) compiled on RHEL 6 with the -fstack-check flag always hits this segmentation fault when run on RHEL 7. But if you compile without this flag, the container-executor runs on RHEL 7 with no problems. I also verified this with a simple program that just does the copy_file.

[~grepas] - was this the case for you? Were you running a container-executor that was compiled on an earlier Red Hat release?
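The simple test program mentioned in the comment above is not included in the thread. As a rough idea of what such a check could look like, here is a hypothetical stand-in that performs the same kind of allocation as container-executor.c:966, a variable-length array on the stack, and touches every byte so the buffer is really used. The function name fill_stack_buffer is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the allocation at container-executor.c:966:
 * declares a variable-length array of buffer_size bytes on the stack,
 * fills it, and returns how many bytes hold the fill value. */
size_t fill_stack_buffer(size_t buffer_size) {
  assert(buffer_size > 0);
  char buffer[buffer_size];  /* stack allocation sized at run time */
  memset(buffer, 'x', buffer_size);
  size_t count = 0;
  for (size_t i = 0; i < buffer_size; i++) {
    if (buffer[i] == 'x') {
      count++;
    }
  }
  return count;
}
```

Calling fill_stack_buffer((size_t)128 * 1024) from a small main and building the file once with gcc -fstack-check and once without it gives two binaries that can be compared across the build and run hosts discussed in this thread.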
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337103#comment-16337103 ]

Gergo Repas commented on YARN-7796:
-----------------------------------

Thanks [~miklos.szeg...@cloudera.com] for the reviews, commit, and pushing the failing builds through!
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336958#comment-16336958 ]

Hudson commented on YARN-7796:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13544 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13544/])
YARN-7796. Container-executor fails with segfault on certain OS (szegedim: rev e7642a3e6f540b4b56367babfbaf35ee6b3c7675)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336942#comment-16336942 ] Miklos Szegedi commented on YARN-7796: -- +1 LGTM. The docker issue is unrelated.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336708#comment-16336708 ] genericqa commented on YARN-7796: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 7s | trunk passed |
| +1 | compile | 0m 48s | trunk passed |
| +1 | mvnsite | 0m 32s | trunk passed |
| +1 | shadedclient | 24m 53s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 45s | the patch passed |
| +1 | cc | 0m 45s | the patch passed |
| +1 | javac | 0m 45s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 13s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| -1 | unit | 19m 5s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 56m 49s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7796 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907385/YARN-7796.002.patch |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux f3eb4970474c 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 39b999a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/19414/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19414/testReport/ |
| Max. process+thread count | 395 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19414/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336628#comment-16336628 ] Miklos Szegedi commented on YARN-7796: -- {{[ERROR] unable to create new native thread -> [Help 1]}} It looks like we have a busy node. I kicked off the test again. https://builds.apache.org/job/PreCommit-YARN-Build/19414/
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336618#comment-16336618 ] genericqa commented on YARN-7796: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 20s | trunk passed |
| -1 | compile | 0m 18s | hadoop-yarn-server-nodemanager in trunk failed. |
| +1 | mvnsite | 0m 41s | trunk passed |
| -1 | shadedclient | 24m 21s | branch has errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 0m 20s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | compile | 1m 11s | the patch passed |
| +1 | cc | 1m 11s | the patch passed |
| -1 | javac | 1m 11s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 16 new + 0 unchanged - 0 fixed = 16 total (was 0) |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 19s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| -1 | unit | 19m 12s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 68m 3s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7796 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907385/YARN-7796.002.patch |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux 225059e2d361 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 39b999a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| compile | https://builds.apache.org/job/PreCommit-YARN-Build/19413/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/19413/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| javac | https://builds.apache.org/job/PreCommit-YARN-Build/19413/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/19413/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19413/testReport/ |
| Max. process+thread count | 408 (vs. ulimit of 5000) |
| modules | C:
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336605#comment-16336605 ] Miklos Szegedi commented on YARN-7796: -- +1 pending Jenkins [https://builds.apache.org/job/PreCommit-YARN-Build/19413/]
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336516#comment-16336516 ] Gergo Repas commented on YARN-7796: --- [~miklos.szeg...@cloudera.com] Thanks for the comments. I also had to reformat the surroundings of the line with the leading tab, as the surrounding lines were using tabs too.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336506#comment-16336506 ] Gergo Repas commented on YARN-7796: --- [~miklos.szeg...@cloudera.com] I'm addressing those.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336469#comment-16336469 ] Miklos Szegedi commented on YARN-7796: -- Thank you [~grepas] for the patch. Can you fix the outstanding whitespace issue? Also, these declarations need to be at the beginning of the function per C coding style:
{code:java}
const int buffer_size = 128*1024;
char* buffer = malloc(buffer_size);
{code}
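For readers following the review, the change under discussion replaces the 128K stack array (`char buffer[buffer_size];`) with a heap allocation. The following is only a minimal sketch of that pattern, not the committed patch: the function name `copy_fd_with_heap_buffer`, its signature, and its error handling are assumptions made for illustration, and the real `copy_file` in container-executor.c takes different arguments.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not the actual container-executor code): copies
 * everything readable from in_fd to out_fd through a heap buffer.
 * Returns 0 on success, -1 on allocation or I/O failure. */
int copy_fd_with_heap_buffer(int in_fd, int out_fd) {
  /* Declarations come first, as the review comment requests. */
  const size_t buffer_size = 128 * 1024;
  char *buffer = malloc(buffer_size);
  ssize_t len;
  int ret = 0;

  if (buffer == NULL) {
    /* A failed malloc can be reported; a stack overflow just crashes. */
    return -1;
  }

  while ((len = read(in_fd, buffer, buffer_size)) != 0) {
    ssize_t off = 0;
    if (len < 0) {
      if (errno == EINTR) {
        continue;  /* interrupted before any data was read: retry */
      }
      ret = -1;
      break;
    }
    /* write() may be partial, so loop until the whole chunk is flushed. */
    while (off < len) {
      ssize_t written = write(out_fd, buffer + off, (size_t)(len - off));
      if (written < 0) {
        if (errno == EINTR) {
          continue;
        }
        ret = -1;
        break;
      }
      off += written;
    }
    if (ret != 0) {
      break;
    }
  }

  free(buffer);  /* single exit path, so the buffer is always released */
  return ret;
}
```

Keeping a single exit path after the allocation is one way to make sure the buffer is freed on every error branch, which is the main new obligation when moving a buffer from the stack to the heap.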
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336230#comment-16336230 ] genericqa commented on YARN-7796: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 15s | trunk passed |
| +1 | compile | 0m 53s | trunk passed |
| +1 | mvnsite | 0m 35s | trunk passed |
| +1 | shadedclient | 25m 14s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 42s | the patch passed |
| +1 | cc | 0m 42s | the patch passed |
| +1 | javac | 0m 42s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch 1 line(s) with tabs. |
| +1 | shadedclient | 9m 31s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| +1 | unit | 19m 21s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 56m 40s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7796 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907307/YARN-7796.001.patch |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux fde509a413ac 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f63d13f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/19405/artifact/out/whitespace-tabs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19405/testReport/ |
| Max. process+thread count | 409 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19405/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336151#comment-16336151 ]

Miklos Szegedi commented on YARN-7796:

I tried to start Jenkins manually: https://builds.apache.org/job/PreCommit-YARN-Build/19405/

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-7796.000.patch, YARN-7796.001.patch
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336143#comment-16336143 ]

genericqa commented on YARN-7796:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 27s | trunk passed |
| +1 | compile | 0m 51s | trunk passed |
| +1 | mvnsite | 0m 32s | trunk passed |
| +1 | shadedclient | 25m 45s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 47s | the patch passed |
| +1 | cc | 0m 47s | the patch passed |
| +1 | javac | 0m 47s | the patch passed |
| +1 | mvnsite | 0m 29s | the patch passed |
| -1 | whitespace | 0m 0s | The patch 1 line(s) with tabs. |
| +1 | shadedclient | 10m 8s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| -1 | unit | 20m 37s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 59m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7796 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907307/YARN-7796.001.patch |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux 1497d87b9a0a 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6347b22 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/19402/artifact/out/whitespace-tabs.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/19402/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19402/testReport/ |
| Max. process+thread count | 440 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19402/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336111#comment-16336111 ]

genericqa commented on YARN-7796:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 58s | trunk passed |
| +1 | compile | 0m 45s | trunk passed |
| +1 | mvnsite | 0m 31s | trunk passed |
| +1 | shadedclient | 24m 44s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 43s | the patch passed |
| +1 | cc | 0m 43s | the patch passed |
| +1 | javac | 0m 43s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 34s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| +1 | unit | 19m 21s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 56m 18s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7796 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907304/YARN-7796.000.patch |
| Optional Tests | asflicense compile cc mvnsite javac unit |
| uname | Linux 91d2e1d45175 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6347b22 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19401/testReport/ |
| Max. process+thread count | 443 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19401/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-7796.000.patch, YARN-7796.001.patch
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336046#comment-16336046 ]

Gergo Repas commented on YARN-7796:

[~miklos.szeg...@cloudera.com] Thanks for the comments, I addressed those issues in v001.

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-7796.000.patch, YARN-7796.001.patch
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336028#comment-16336028 ]

Miklos Szegedi commented on YARN-7796:

[~grepas], thank you for the patch. {{malloc}} may return NULL in case of out of memory. Could you update the patch to log an error and exit in this case?
{code:java}
if (write_result <= 0) {
  fprintf(LOGFILE, "Error writing to %s - %s\n", out_filename, strerror(errno));
  close(out_fd);
  return -1;
}
{code}
Also please make sure we do not leak the buffer in the error case above.

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-7796.000.patch
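
The two review points in the comment above (check {{malloc}} for NULL, and free the buffer on the write-error path) could look roughly like the sketch below. This is not the code from YARN-7796.001.patch: `copy_fd_heap` and `COPY_BUFFER_SIZE` are hypothetical names, the function works on two already-open descriptors, it logs to `stderr` instead of the project's `LOGFILE`, and it returns -1 rather than exiting.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define COPY_BUFFER_SIZE (128 * 1024) /* 128K, the size named in the issue */

/* Hypothetical helper, not the real copy_file() in container-executor.c.
 * Copies everything readable from in_fd to out_fd through a heap buffer.
 * Returns 0 on success and -1 on any failure; the buffer is freed on
 * every path that is reached after a successful malloc(). */
int copy_fd_heap(int in_fd, int out_fd, const char *out_name) {
  int rc = -1;
  char *buffer = malloc(COPY_BUFFER_SIZE);
  if (buffer == NULL) {
    /* First review point: malloc() can fail, so log and bail out. */
    fprintf(stderr, "Failed to allocate copy buffer - %s\n", strerror(errno));
    return -1;
  }
  for (;;) {
    ssize_t len = read(in_fd, buffer, COPY_BUFFER_SIZE);
    if (len == 0) { /* end of input: the copy is complete */
      rc = 0;
      break;
    }
    if (len < 0) {
      if (errno == EINTR) continue; /* interrupted, retry the read */
      fprintf(stderr, "Error reading input - %s\n", strerror(errno));
      break;
    }
    ssize_t pos = 0;
    while (pos < len) {
      ssize_t written = write(out_fd, buffer + pos, (size_t)(len - pos));
      if (written <= 0) {
        if (written < 0 && errno == EINTR) continue;
        fprintf(stderr, "Error writing to %s - %s\n", out_name, strerror(errno));
        goto cleanup; /* second review point: do not leak the buffer here */
      }
      pos += written;
    }
  }
cleanup:
  free(buffer);
  return rc;
}
```

Routing every exit after the allocation through one `cleanup` label is a design choice of this sketch; freeing the buffer separately before each early `return` would work as well, at the cost of one more place to forget it.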
[jira] [Commented] (YARN-7796) Container-executor fails with segfault on certain OS configurations
[ https://issues.apache.org/jira/browse/YARN-7796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336007#comment-16336007 ]

Gergo Repas commented on YARN-7796:

The issue is resolved if the above-mentioned buffer is allocated on the heap instead of the stack; I'm attaching a patch for this.

> Container-executor fails with segfault on certain OS configurations
> ---
>
> Key: YARN-7796
> URL: https://issues.apache.org/jira/browse/YARN-7796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 3.0.0
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-7796.000.patch
>
>
> There is a relatively big (128K) buffer allocated on the stack in
> container-executor.c for the purpose of copying files. As indicated by the
> below gdb stack trace, this allocation can fail with SIGSEGV. This happens
> only on certain OS configurations - I can reproduce this issue on RHEL 6.9:
> {code:java}
> [Thread debugging using libthread_db enabled]
> main : command provided 0
> main : run as user is systest
> main : requested yarn user is systest
> Program received signal SIGSEGV, Segmentation fault.
> 0x004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
> out_filename=0x932930
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_01.tokens",
> perm=384)
> at
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> 966 char buffer[buffer_size];
> (gdb) bt
> #0 0x004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
> out_filename=0x932930
> "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_01.tokens",
> perm=384)
> at
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
> #1 0x00409a81 in initialize_app (user=<value optimized out>,
> app_id=0x7ffd669fd2b7 "application_1516711246952_0001",
> nmPrivate_credentials_file=0x7ffd669fd2d6
> "/yarn/nm/nmPrivate/container_1516711246952_0001_02_01.tokens",
> local_dirs=0x9331c8, log_roots=<value optimized out>, args=0x7ffd669fb168)
> at
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:1122
> #2 0x00403f90 in main (argc=<value optimized out>, argv=<value
> optimized out>) at
> /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:558
> {code}
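
The issue text does not say why the 128K stack allocation faults on RHEL 6.9 and not elsewhere. One thing that could be checked on an affected host is the stack soft limit the process runs under. The sketch below is only a diagnostic idea and is not taken from any of the attached patches; `fits_under_stack_limit` is a hypothetical helper name, and a low limit is an assumed possible cause, not a confirmed one.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical diagnostic helper (not part of container-executor).
 * Returns 1 if `bytes` is smaller than the current stack soft limit,
 * 0 if it is not, and -1 if the limit cannot be read. */
int fits_under_stack_limit(size_t bytes) {
  struct rlimit limit;
  if (getrlimit(RLIMIT_STACK, &limit) != 0) {
    return -1; /* errno is set by getrlimit() */
  }
  if (limit.rlim_cur == RLIM_INFINITY) {
    return 1; /* no soft limit configured */
  }
  return (rlim_t)bytes < limit.rlim_cur ? 1 : 0;
}
```

Calling `fits_under_stack_limit(128 * 1024)` early in a test build would show whether the buffer size alone exceeds the limit. A result of 1 does not prove the allocation is safe, because the frames already on the stack count against the same limit.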