The following pull request was submitted through Github. It can be accessed and reviewed at: https://github.com/lxc/lxd/pull/2076
This e-mail was sent by the LXC bot, direct replies will not reach the author unless they happen to be subscribed to this list. === Description (from pull-request) === Consider, t=2016-05-31T20:49:07+0000 lvl=dbug msg=MigrationSink driver=storage/dir live=false container=nonlive objects=[nonlive/snap0] t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier" t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream" t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier" t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream" panic: concurrent write to websocket connection goroutine 560 [running]: panic(0x9b9da0, 0xc42012a000) /lxd/build/tmp.xTgWRy05x1/go/golang/src/runtime/panic.go:500 +0x1a1 github.com/gorilla/websocket.(*Conn).flushFrame(0xc42000bb00, 0xc42034e601, 0x0, 0x0, 0x0, 0xc42034e778, 0xc42046c001) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:481 +0x414 github.com/gorilla/websocket.(*Conn).NextWriter(0xc42000bb00, 0x2, 0xc42034e778, 0x1044e01, 0xc42028aa20, 0x0) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:407 +0x1a1 github.com/lxc/lxd/shared.WebsocketMirror.func2(0xc420147180, 0xc42000bb00, 0x10482e0, 0xc42028aa20) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:257 +0x125 created by github.com/lxc/lxd/shared.WebsocketMirror /lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:274 +0xe2 error: Get https://127.0.0.1:22441/1.0/operations/eb592bbb-aa8c-4a0f-a57f-cac9285fedba/wait: Unable to connect to: 127.0.0.1:22441 One way this could happen is that the goroute that is sending the snapshot signals readDone, and then the storage driver starts sending the next bits over the websocket before the message barrier is actually sent. This seems really unlikley to happen since the goroutine sending the message needs to be starved for a long time, but given that we've only seen this stack trace once in the last few months it seems quite rare. There may be other ways for this to occur, but this is the only one I can see right now. Signed-off-by: Tycho Andersen <tycho.ander...@canonical.com>
From 38c7a38f12e0f2928d7bcb74a20410d09d23cd67 Mon Sep 17 00:00:00 2001 From: Tycho Andersen <tycho.ander...@canonical.com> Date: Wed, 1 Jun 2016 21:26:02 +0000 Subject: [PATCH] probably fix a concurrent websocket write Consider, t=2016-05-31T20:49:07+0000 lvl=dbug msg=MigrationSink driver=storage/dir live=false container=nonlive objects=[nonlive/snap0] t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier" t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream" t=2016-05-31T20:49:07+0000 lvl=dbug msg="sending write barrier" t=2016-05-31T20:49:07+0000 lvl=dbug msg="Got message barrier, resetting stream" panic: concurrent write to websocket connection goroutine 560 [running]: panic(0x9b9da0, 0xc42012a000) /lxd/build/tmp.xTgWRy05x1/go/golang/src/runtime/panic.go:500 +0x1a1 github.com/gorilla/websocket.(*Conn).flushFrame(0xc42000bb00, 0xc42034e601, 0x0, 0x0, 0x0, 0xc42034e778, 0xc42046c001) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:481 +0x414 github.com/gorilla/websocket.(*Conn).NextWriter(0xc42000bb00, 0x2, 0xc42034e778, 0x1044e01, 0xc42028aa20, 0x0) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/gorilla/websocket/conn.go:407 +0x1a1 github.com/lxc/lxd/shared.WebsocketMirror.func2(0xc420147180, 0xc42000bb00, 0x10482e0, 0xc42028aa20) /lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:257 +0x125 created by github.com/lxc/lxd/shared.WebsocketMirror /lxd/build/tmp.xTgWRy05x1/go/src/github.com/lxc/lxd/shared/network.go:274 +0xe2 error: Get https://127.0.0.1:22441/1.0/operations/eb592bbb-aa8c-4a0f-a57f-cac9285fedba/wait: Unable to connect to: 127.0.0.1:22441 One way this could happen is that the goroute that is sending the snapshot signals readDone, and then the storage driver starts sending the next bits over the websocket before the message barrier is actually sent. This seems really unlikley to happen since the goroutine sending the message needs to be starved for a long time, but given that we've only seen this stack trace once in the last few months it seems quite rare. There may be other ways for this to occur, but this is the only one I can see right now. Signed-off-by: Tycho Andersen <tycho.ander...@canonical.com> --- shared/network.go | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/shared/network.go b/shared/network.go index cfd2017..9ad35e4 100644 --- a/shared/network.go +++ b/shared/network.go @@ -275,10 +275,10 @@ func WebsocketMirror(conn *websocket.Conn, w io.WriteCloser, r io.ReadCloser) (c for { buf, ok := <-in if !ok { - readDone <- true r.Close() Debugf("sending write barrier") conn.WriteMessage(websocket.TextMessage, []byte{}) + readDone <- true return } w, err := conn.NextWriter(websocket.BinaryMessage)
_______________________________________________ lxc-devel mailing list lxc-devel@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-devel