David Calavera created TS-4444:
----------------------------------

             Summary: Segfault accessing NULL connection buffer reader
                 Key: TS-4444
                 URL: https://issues.apache.org/jira/browse/TS-4444
             Project: Traffic Server
          Issue Type: Bug
            Reporter: David Calavera


A few days ago, we hit a segfault while running the master branch in staging. 
With ATS acting as a proxy to another HTTP server, simple client requests were 
enough to crash it. 

This problem is hard to reproduce. We've seen it with clients connecting from 
both Mac OS X and Linux, but it doesn't always happen.

This is the crash:

{code}
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2f83700 (LWP 20224)]
HttpTunnel::consumer_handler (this=this@entry=0x7fffe8fa4d28, event=103, c=0x7fffe8fa4dd8) at HttpTunnel.cc:1340
1340        c->buffer_reader->mbuf->dealloc_reader(c->buffer_reader);
(gdb) bt
#0  HttpTunnel::consumer_handler (this=this@entry=0x7fffe8fa4d28, event=103, c=0x7fffe8fa4dd8) at HttpTunnel.cc:1340
#1  0x00000000005ea395 in HttpTunnel::main_handler (this=0x7fffe8fa4d28, event=<optimized out>, data=<optimized out>) at HttpTunnel.cc:1574
#2  0x000000000068df8f in handleEvent (data=0x7fffe4218e00, event=<optimized out>, this=<optimized out>) at ../../iocore/eventsystem/I_Continuation.h:153
#3  CacheVC::calluser (this=0x7fffe4218bb0, event=<optimized out>) at ../../iocore/cache/P_CacheInternal.h:628
#4  0x0000000000704c95 in CacheVC::openWriteMain (this=0x7fffe4218bb0) at CacheWrite.cc:1350
#5  0x00000000007627d0 in handleEvent (data=0x7fffe41d4f80, event=1, this=<optimized out>) at I_Continuation.h:153
#6  EThread::process_event (this=0x7ffff3085010, e=0x7fffe41d4f80, calling_code=1) at UnixEThread.cc:130
#7  0x000000000076348b in EThread::execute (this=0x7ffff3085010) at UnixEThread.cc:184
#8  0x000000000076224a in spawn_thread_internal (a=0x10affa0) at Thread.cc:86
#9  0x00007ffff6890182 in start_thread (arg=0x7ffff2f83700) at pthread_create.c:312
#10 0x00007ffff5b9947d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{code}

We were able to reproduce the issue with this simple Go program:

{code:java}
package main

import (
        "fmt"
        "html"
        "log"
        "net/http"
)

func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
                w.Header().Set("Content-Type", "text/html; charset=UTF-8")
                w.Header().Set("Cache-Control", "public, max-age=0, must-revalidate")
                fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
        })
        log.Fatal(http.ListenAndServe(":9393", nil))
}
{code}

We redirect ATS to this program with a very simple remap rule:

{code}
regex_map http://.*  http://10.1.10.17:9393
{code}

Using `git bisect` we found that the issue was introduced in this commit, 
although we don't really understand why:

https://github.com/apache/trafficserver/commit/af76977adb9f3c0296a232688bbcb5a1421a6768

I have a patch ready that seems to fix the problem; I'll push it as a pull 
request on GitHub as soon as possible.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
