[jira] [Commented] (TS-934) Proxy Mutex null pointer crash
[ https://issues.apache.org/jira/browse/TS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274381#comment-13274381 ]

John Plevyak commented on TS-934:
---------------------------------

Is this still happening with the latest code?

Proxy Mutex null pointer crash
------------------------------

Key: TS-934
URL: https://issues.apache.org/jira/browse/TS-934
Project: Traffic Server
Issue Type: Bug
Components: Core
Affects Versions: 3.1.0
Environment: Debian 6.0.2 quadcore, forward transparent proxy.
Reporter: Alan M. Carroll
Assignee: Alan M. Carroll
Fix For: 3.1.4, 3.1.1
Attachments: ts-934-patch.txt

[Client report]
We had the cache crash gracefully twice last night on a segfault. Both times the callstack produced by trafficserver's signal handler was:

/usr/bin/traffic_server[0x529596]
/lib/libpthread.so.0(+0xef60)[0x2ab09a897f60]
[0x2ab09e7c0a10]
usr/bin/traffic_server(HttpServerSession::do_io_close(int)+0xa8)[0x567a3c]
/usr/bin/traffic_server(HttpVCTable::cleanup_entry(HttpVCTableEntry*)+0x4c)[0x56aff6]
/usr/bin/traffic_server(HttpVCTable::cleanup_all()+0x64)[0x56b07a]
/usr/bin/traffic_server(HttpSM::kill_this()+0x120)[0x57c226]
/usr/bin/traffic_server(HttpSM::main_handler(int, void*)+0x208)[0x571b28]
/usr/bin/traffic_server(Continuation::handleEvent(int, void*)+0x69)[0x4e4623]

I went through the disassembly, and the instruction it is on in ::do_io_close is loading the value of diags (not dereferencing it), so it is unlikely that that threw a segfault (unless this is somehow in thread local storage and that is corrupt).

The kernel message claimed that the instruction pointer was 0x4e438e, which in this build is in ProxyMutexPtr::operator->() on the instruction that dereferences the object pointer to get the stored mutex pointer (bingo!), so it would seem that at some point we are dereferencing a null safe pointer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
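The failure mode described in the report can be illustrated with a small sketch. It is not Traffic Server code: SketchMutex, SketchMutexPtr, SketchSession and guarded_close are hypothetical names invented here, and the layout is an assumption. The point it tries to show is that a smart pointer's operator->() first has to read its own stored pointer through "this", so if the object that embeds the smart pointer is itself reached through a null pointer, the fault occurs inside operator->() rather than in the caller.

```cpp
#include <cassert>  // assert, for the usage checks described below

// Hypothetical stand-in for a mutex object; not the real ProxyMutex.
struct SketchMutex {
    int lock_count = 0;
};

// Hypothetical stand-in for a smart pointer in the style of ProxyMutexPtr.
// operator->() reads this->m_ptr without any null check, so it faults
// if "this" is null and returns null if no mutex is stored.
class SketchMutexPtr {
public:
    explicit SketchMutexPtr(SketchMutex *m = nullptr) : m_ptr(m) {}
    SketchMutex *operator->() const { return m_ptr; }
    bool is_null() const { return m_ptr == nullptr; }

private:
    SketchMutex *m_ptr;
};

// Hypothetical session object that embeds the smart pointer.
struct SketchSession {
    explicit SketchSession(SketchMutex *m = nullptr) : mutex(m) {}
    SketchMutexPtr mutex;
    bool closed = false;
};

// Guarded close: refuses to touch the mutex when either the session
// pointer or the stored mutex pointer is null, instead of dereferencing.
bool guarded_close(SketchSession *session) {
    if (session == nullptr || session->mutex.is_null()) {
        return false;
    }
    ++session->mutex->lock_count;  // stands in for taking the lock
    session->closed = true;
    return true;
}
```

Under these assumptions, guarded_close(nullptr) and a call on a session with no mutex both return false without dereferencing anything, while a session constructed with a valid SketchMutex is closed normally. Whether a guard like this is the right fix for the real crash, or only hides an ordering problem elsewhere, is not something the sketch can answer.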
[jira] [Commented] (TS-934) Proxy Mutex null pointer crash
[ https://issues.apache.org/jira/browse/TS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274383#comment-13274383 ]

John Plevyak commented on TS-934:
---------------------------------

I think we should undo this as other changes fixed the bug.
[jira] [Commented] (TS-934) Proxy Mutex null pointer crash
[ https://issues.apache.org/jira/browse/TS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124201#comment-13124201 ]

B Wyatt commented on TS-934:
----------------------------

From the cores/callstacks I've seen, this is the same issue as TS-857.
[jira] [Commented] (TS-934) Proxy Mutex null pointer crash
[ https://issues.apache.org/jira/browse/TS-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096479#comment-13096479 ]

Alan M. Carroll commented on TS-934:
------------------------------------

bcall reports seeing something that looks very much like this problem (crash in do_io_close at the ProxyMutexPtr dereference, with a value of 0 for the this pointer). He reports that he doesn't see it at 65K TPS but does at 140K TPS on the 3.0.1 codebase. This codebase does not include the TS-911 fix.