David Calavera created TS-4444:
----------------------------------
Summary: Segfault accessing NULL connection buffer reader
Key: TS-4444
URL: https://issues.apache.org/jira/browse/TS-4444
Project: Traffic Server
Issue Type: Bug
Reporter: David Calavera
A few days ago, we hit a segfault while running the master branch in staging.
With ATS acting as a proxy in front of another HTTP server, simple client
requests were enough to crash it.
The problem is hard to reproduce: we've seen it with clients connecting from
both Mac OS X and Linux, but it doesn't happen every time.
This is the crash:
{code}
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2f83700 (LWP 20224)]
HttpTunnel::consumer_handler (this=this@entry=0x7fffe8fa4d28, event=103, c=0x7fffe8fa4dd8) at HttpTunnel.cc:1340
1340 c->buffer_reader->mbuf->dealloc_reader(c->buffer_reader);
(gdb) bt
#0 HttpTunnel::consumer_handler (this=this@entry=0x7fffe8fa4d28, event=103, c=0x7fffe8fa4dd8) at HttpTunnel.cc:1340
#1 0x00000000005ea395 in HttpTunnel::main_handler (this=0x7fffe8fa4d28, event=<optimized out>, data=<optimized out>) at HttpTunnel.cc:1574
#2 0x000000000068df8f in handleEvent (data=0x7fffe4218e00, event=<optimized out>, this=<optimized out>) at ../../iocore/eventsystem/I_Continuation.h:153
#3 CacheVC::calluser (this=0x7fffe4218bb0, event=<optimized out>) at ../../iocore/cache/P_CacheInternal.h:628
#4 0x0000000000704c95 in CacheVC::openWriteMain (this=0x7fffe4218bb0) at CacheWrite.cc:1350
#5 0x00000000007627d0 in handleEvent (data=0x7fffe41d4f80, event=1, this=<optimized out>) at I_Continuation.h:153
#6 EThread::process_event (this=0x7ffff3085010, e=0x7fffe41d4f80, calling_code=1) at UnixEThread.cc:130
#7 0x000000000076348b in EThread::execute (this=0x7ffff3085010) at UnixEThread.cc:184
#8 0x000000000076224a in spawn_thread_internal (a=0x10affa0) at Thread.cc:86
#9 0x00007ffff6890182 in start_thread (arg=0x7ffff2f83700) at pthread_create.c:312
#10 0x00007ffff5b9947d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{code}
We were able to reproduce the issue with this simple Go program:
{code:java}
package main

import (
	"fmt"
	"html"
	"log"
	"net/http"
)

func main() {
	// Every path gets the same small HTML response.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html; charset=UTF-8")
		// Cacheable, but must be revalidated on every request.
		w.Header().Set("Cache-Control", "public, max-age=0, must-revalidate")
		fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
	})

	// Listen on the port that the remap rule points at.
	log.Fatal(http.ListenAndServe(":9393", nil))
}
{code}
We redirect ATS to this program with a very simple remap rule:
{code}
regex_map http://.* http://10.1.10.17:9393
{code}
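For anyone trying to reproduce this, here is a minimal sketch of a client that
sends repeated plain GET requests at ATS. The listen address
(127.0.0.1:8080), the request path, the request count, the worker count, and
the helper name `hammer` are all assumptions made for illustration and are not
taken from our setup. Since the crash is intermittent, running this is not
guaranteed to trigger it.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
	"time"
)

// hammer sends n GET requests to target using the given number of
// concurrent workers and returns how many of them failed.
// (Hypothetical helper, written for this report.)
func hammer(target string, n, workers int) int {
	if workers < 1 {
		workers = 1
	}
	client := &http.Client{Timeout: 5 * time.Second}

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		failures int
	)
	jobs := make(chan int)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				resp, err := client.Get(target)
				if err == nil {
					// Read the whole body so the transfer completes.
					_, err = io.Copy(io.Discard, resp.Body)
					resp.Body.Close()
				}
				if err != nil {
					mu.Lock()
					failures++
					mu.Unlock()
				}
			}
		}()
	}

	for i := 0; i < n; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return failures
}

func main() {
	// Assumed address of the ATS instance under test.
	const target = "http://127.0.0.1:8080/hello"
	const total = 1000

	failed := hammer(target, total, 8)
	fmt.Printf("%d of %d requests failed\n", failed, total)
}
```

A sudden run of failed requests would suggest that the proxy went away, at
which point the traffic_server process can be checked for a core dump.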
Using `git bisect` we found that the issue was introduced in this commit,
although we don't really understand why:
https://github.com/apache/trafficserver/commit/af76977adb9f3c0296a232688bbcb5a1421a6768
I have a patch that seems to fix the problem; I'll open a pull request on
GitHub as soon as possible.