Re: How to verify HAProxy build on Solaris/SPARC ?

2018-08-29 Thread Willy Tarreau
Hi Tom,

On Wed, Aug 29, 2018 at 05:05:20PM -0700, Tom Hood wrote:
> Hi,
> 
> I've built haproxy 1.8.9 with gcc 4.2.0 on Solaris 10, but I'm not sure how
> to verify my build.

Well, first and foremost, you must absolutely use an up-to-date version
in the branch you choose. Latest version in the 1.8 branch currently is
1.8.13 so there is no reason to purposely install a version containing a
number of well known bugs that were already fixed in more recent versions.

> I had to define THREAD_LOCAL to be empty in
> include/common/config.h, but otherwise it seemed to build cleanly.

I find this strange because it is already defined as empty there if
USE_THREAD is not set. And if USE_THREAD is set, you absolutely need
to have this defined to __thread as it's a compiler indication that
the variable is thread-local (per thread) instead of shared.

> Build command:  gmake TARGET=solaris CPU=ultrasparc USE_OPENSSL=1
> SSL_INC= SSL_LIB=
> 
> It "seems" to work for what I'm using it for on Solaris 11.3 (TCP reverse
> proxy with SSL termination).

If you're using threads it will definitely fail without __thread.

> Based on the forum thread "regression testing for haproxy", it looks
> like maybe there isn't any automated testing of my haproxy build I can do
> yet?
> 
> I see the "tests" directory, but I'm not sure what I'm supposed to do with
> that.  I don't see any expected result files for each test case?

The problem with threads is that a broken build may appear to work purely by
luck, so by definition it's hard to build a test to verify that what you've
built uses correct locking. The simple fact that it did not build and you had
to modify it is the problematic part. If you don't need threads, please try
building with "USE_THREAD=" (no value) added to your make command to disable
the use of threads. At least you will not risk facing any thread-related issue.
However I'm definitely interested in figuring out why __thread does not
work there! At least from what I'm reading below, it is expected to
work just like on any other system:

   https://docs.oracle.com/cd/E26502_01/html/E26507/gentextid-23018.html

Regards,
Willy



How to verify HAProxy build on Solaris/SPARC ?

2018-08-29 Thread Tom Hood
Hi,

I've built haproxy 1.8.9 with gcc 4.2.0 on Solaris 10, but I'm not sure how
to verify my build.  I had to define THREAD_LOCAL to be empty in
include/common/config.h, but otherwise it seemed to build cleanly.

Build command:  gmake TARGET=solaris CPU=ultrasparc USE_OPENSSL=1
SSL_INC= SSL_LIB=

It "seems" to work for what I'm using it for on Solaris 11.3 (TCP reverse
proxy with SSL termination).

Based on the forum thread "regression testing for haproxy", it looks
like maybe there isn't any automated testing of my haproxy build I can do
yet?

I see the "tests" directory, but I'm not sure what I'm supposed to do with
that.  I don't see any expected result files for each test case?

Please let me know.

Thanks,
-- Tom


Re: [PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.

2018-08-29 Thread PiBa-NL

On 29-8-2018 at 14:29, Olivier Houchard wrote:

On Wed, Aug 29, 2018 at 02:11:45PM +0200, Frederic Lecaille wrote:

This patch relates to a bug reproduced by the reg testing file
sent by Pieter in this thread:
https://www.mail-archive.com/haproxy@formilux.org/msg31079.html

Must be checked by Thierry.
Must be backported to 1.8.

Note that Pieter's reg testing files reg-tests/lua/b2.* come with this
patch.


Fred.
 From d6d38a354a89b55f91bb9962c5832a089d960b60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?= 
Date: Wed, 29 Aug 2018 13:46:24 +0200
Subject: [PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.

When calling the ->prepare_srv() callback for an SSL server, which
depends on the global "nbthread" value, the latter had not been parsed yet
and so was still equal to its default value of 1. This led to bad memory
accesses.

Thank you to Pieter (PiBa-NL) for having reported this issue and
for having provided a very helpful reg testing file to reproduce
this issue (reg-tests/lua/b2.*).


That sounds good, nice catch !

And yes thanks Pieter, as usual :)

Olivier


As you've probably already verified, the issue is indeed fixed with this 
patch applied on top of master.


Thanks Frederic & Olivier.

@Thierry, can you give the 'all okay' ? (or not okay, if it needs a 
different fix..)


Regards,
PiBa-NL (Pieter)



Re: [PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.

2018-08-29 Thread Olivier Houchard
On Wed, Aug 29, 2018 at 02:11:45PM +0200, Frederic Lecaille wrote:
> This patch relates to a bug reproduced by the reg testing file
> sent by Pieter in this thread:
> https://www.mail-archive.com/haproxy@formilux.org/msg31079.html
> 
> Must be checked by Thierry.
> Must be backported to 1.8.
> 
> Note that Pieter's reg testing files reg-tests/lua/b2.* come with this
> patch.
> 
> 
> Fred.

> From d6d38a354a89b55f91bb9962c5832a089d960b60 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?= 
> Date: Wed, 29 Aug 2018 13:46:24 +0200
> Subject: [PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.
> 
> When calling the ->prepare_srv() callback for an SSL server, which
> depends on the global "nbthread" value, the latter had not been parsed yet
> and so was still equal to its default value of 1. This led to bad memory
> accesses.
> 
> Thank you to Pieter (PiBa-NL) for having reported this issue and
> for having provided a very helpful reg testing file to reproduce
> this issue (reg-tests/lua/b2.*).
> 

That sounds good, nice catch !

And yes thanks Pieter, as usual :)

Olivier



[PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.

2018-08-29 Thread Frederic Lecaille
This patch relates to a bug reproduced by the reg testing 
file sent by Pieter in this thread: 
https://www.mail-archive.com/haproxy@formilux.org/msg31079.html


Must be checked by Thierry.
Must be backported to 1.8.

Note that Pieter's reg testing files reg-tests/lua/b2.* come with this 
patch.



Fred.
From d6d38a354a89b55f91bb9962c5832a089d960b60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?= 
Date: Wed, 29 Aug 2018 13:46:24 +0200
Subject: [PATCH] BUG/MAJOR: thread: lua: Wrong SSL context initialization.

When calling the ->prepare_srv() callback for an SSL server, which
depends on the global "nbthread" value, the latter had not been parsed yet
and so was still equal to its default value of 1. This led to bad memory
accesses.

Thank you to Pieter (PiBa-NL) for having reported this issue and
for having provided a very helpful reg testing file to reproduce
this issue (reg-tests/lua/b2.*).

Must be backported to 1.8.
---
 reg-tests/lua/b2.lua         | 180 +++
 reg-tests/lua/b2.vtc         |  33 +++
 reg-tests/lua/b2_print_r.lua |  96 +
 reg-tests/lua/common.pem     |   1 +
 src/hlua.c                   |  19 +++--
 5 files changed, 321 insertions(+), 8 deletions(-)
 create mode 100644 reg-tests/lua/b2.lua
 create mode 100644 reg-tests/lua/b2.vtc
 create mode 100644 reg-tests/lua/b2_print_r.lua
 create mode 120000 reg-tests/lua/common.pem

diff --git a/reg-tests/lua/b2.lua b/reg-tests/lua/b2.lua
new file mode 100644
index 00000000..41e5eeeb
--- /dev/null
+++ b/reg-tests/lua/b2.lua
@@ -0,0 +1,180 @@
+Luacurl = {}
+Luacurl.__index = Luacurl
+setmetatable(Luacurl, {
+	__call = function (cls, ...)
+		return cls.new(...)
+	end,
+})
+function Luacurl.new(server, port, ssl)
+	local self = setmetatable({}, Luacurl)
+	self.sockconnected = false
+	self.server = server
+	self.port = port
+	self.ssl = ssl
+	self.cookies = {}
+	return self
+end
+
+function Luacurl:get(method,url,headers,data)
+	core.Info("MAKING SOCKET")
+	if self.sockconnected == false then
+	  self.sock = core.tcp()
+	  if self.ssl then
+		local r = self.sock:connect_ssl(self.server,self.port)
+	  else
+		local r = self.sock:connect(self.server,self.port)
+	  end
+	  self.sockconnected = true
+	end
+	core.Info("SOCKET MADE")
+	local request = method.." "..url.." HTTP/1.1"
+	if data ~= nil then
+		request = request .. "\r\nContent-Length: "..string.len(data)
+	end
+	if headers ~= null then
+		for h,v in pairs(headers) do
+			request = request .. "\r\n"..h..": "..v
+		end
+	end
+	cookstring = ""
+	for cook,cookval in pairs(self.cookies) do
+		cookstring = cookstring .. cook.."="..cookval.."; "
+	end
+	if string.len(cookstring) > 0 then
+		request = request .. "\r\nCookie: "..cookstring
+	end
+
+	request = request .. "\r\n\r\n"
+	if data and string.len(data) > 0 then
+		request = request .. data
+	end
+--print(request)
+	core.Info("SENDING REQUEST")
+	self.sock:send(request)
+
+--	core.Info("PROCESSING RESPONSE")
+	return processhttpresponse(self.sock)
+end
+
+function processhttpresponse(socket)
+	local res = {}
+core.Info("1")
+	res.status = socket:receive("*l")
+core.Info("2")
+
+	if res.status == nil then
+		core.Info(" processhttpresponse RECEIVING status: NIL")
+		return res
+	end
+	core.Info(" processhttpresponse RECEIVING status:"..res.status)
+	res.headers = {}
+	res.headerslist = {}
+	repeat
+core.Info("3")
+		local header = socket:receive("*l")
+		if header == nil then
+			return "error"
+		end
+		local valuestart = header:find(":")
+		if valuestart ~= nil then
+			local head = header:sub(1,valuestart-1)
+			local value = header:sub(valuestart+2)
+			table.insert(res.headerslist, {head,value})
+			res.headers[head] = value
+		end
+	until header == ""
+	local bodydone = false
+	if res.headers["Connection"] ~= nil and res.headers["Connection"] == "close" then
+--		core.Info("luacurl processresponse with connection:close")
+		res.body = ""
+		repeat
+core.Info("4")
+			local d = socket:receive("*a")
+			if d ~= nil then
+res.body = res.body .. d
+			end
+		until d == nil or d == 0
+		bodydone = true
+	end
+	if bodydone == false and res.headers["Content-Length"] ~= nil then
+		res.contentlength = tonumber(res.headers["Content-Length"])
+		if res.contentlength == nil then
+		  core.Warning("res.contentlength ~NIL = "..res.headers["Content-Length"])
+		end
+--		core.Info("luacur, contentlength="..res.contentlength)
+		res.body = ""
+		repeat
+			local d = socket:receive(res.contentlength)
+			if d == nil then
+--core.Info("luacurl, ERROR?: recieved NIL, expecting "..res.contentlength.." bytes only got "..string.len(res.body).." sofar")
+return
+			else
+res.body = res.body..d
+--core.Info("luacurl, COMPLETE?: expecting "..res.contentlength.." bytes, got "..string.len(res.body))
+if string.len(res.body) >= res.contentlength then
+--	core.Info("luacurl, COMPLETE?: expecting "..res.contentlength.." 

Re: lua script, 200% cpu usage with nbthread 3 - haproxy hangs - __spin_lock - HA-Proxy version 1.9-dev1-e3faf02 2018/08/25

2018-08-29 Thread Frederic Lecaille

On 08/28/2018 11:19 AM, Frederic Lecaille wrote:

On 08/27/2018 10:46 PM, PiBa-NL wrote:

Hi Frederic, Oliver,


Hi Pieter,


Thanks for your investigations :).
I've made a little reg-test (files attached). It's probably not 
'correct' to commit as-is, but should be enough to get a 
reproduction.. I hope..


changing it to nbthread 1 makes it work every time..(that i tried)


Your script is correct. Thank you a lot for this Pieter.


The test actually seems to show a variety of issues.
## Every once in a while it takes like 7 seconds to run a test.. 
During which cpu usage is high..


Sounds like the first issue you reported. You can use the varnishtest -t 
option to set a large timeout so that you have enough time to interrupt 
varnishtest (Ctrl+C) before it kills haproxy. Then you can attach 
gdb to the haproxy process.



  c0    7.6 HTTP rx timeout (fd:5 7500 ms)

## But most of the time, it just doesn't finish with a correct result 
(I've seen haproxy dump core while testing as well). There is of 
course the possibility that I did something wrong in the Lua as well...


Does the test itself work for you guys? (with nbthread 1)


I have not managed to make this script fail with "nbthread 1".

I have also seen coredumps with "nbthread 3" even with only one HTTP 
request from c0 client:



     client c0 -connect ${h1_fe1_sock} {
     txreq -url "/"
     rxresp
     expect resp.status == 200
     }

If you run varnishtest with -l option, it leaves the temporary vtc.* 
directory if the test failed.


If you set your environment to produce coredumps (ulimit -c unlimited) 
you should find coredump files in /tmp/vtc.*// 
directory (/tmp/vtc.*/h1/ in our case).


According to gdb we have an issue in src/ssl_sock.c, so I have CC'd this 
mail to Emeric:


Reading symbols from haproxy...done.
[New LWP 32432]
[New LWP 32431]
[New LWP 32428]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/flecaille/src/haproxy/haproxy -d -f 
/tmp/vtc.32410.6f80f987/h1/cfg'.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f78f98bba56 in ASN1_get_object ()
    from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
[Current thread is 1 (Thread 0x7f78f8522700 (LWP 32432))]
(gdb) bt full
#0  0x7f78f98bba56 in ASN1_get_object ()
    from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
No symbol table info available.
#1  0x7f78f98c2ff8 in ?? () from 
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1

No symbol table info available.
#2  0x7f78f98c41b5 in ?? () from 
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1

No symbol table info available.
#3  0x7f78f98c4ead in ASN1_item_ex_d2i ()
    from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
No symbol table info available.
#4  0x7f78f98c4f2b in ASN1_item_d2i ()
    from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
No symbol table info available.
#5  0x7f78f9cdac98 in d2i_SSL_SESSION ()
    from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
No symbol table info available.
#6  0x55e2078be006 in ssl_sock_init (conn=0x7f78e8012220) at 
src/ssl_sock.c:4985
     ptr = 0xf800 <error: Cannot access memory at address 0xf800>

     sess = 
     may_retry = 
     conn = 0x7f78e8012220
#7  0x55e20797dfc1 in conn_xprt_init (conn=0x7f78e8012220)
     at include/proto/connection.h:84
     ret = 0
#8  tcp_connect_server (conn=0x7f78e8012220, data=0, delack=<optimized out>)
     at src/proto_tcp.c:545
     fd = 18
     srv = 
     be = 0x55e207c567e0 
     src = 
#9  0x55e207981aba in si_connect (si=0x7f78e8017680)
     at include/proto/stream_interface.h:366
     ret = 0
#10 connect_server (s=s@entry=0x7f78e80173b0) at src/backend.c:1223
     cli_conn = 0x0
     srv_conn = 0x7f78e8012220
     srv_cs = 
     old_cs = 
     reuse = 
     err = 
#11 0x55e207924295 in sess_update_stream_int (s=0x7f78e80173b0) at 
src/stream.c:885

     conn_err = 
     si = 0x7f78e8017680
     req = 0x7f78e80173c0
#12 process_stream (t=, context=0x7f78e80173b0, 
state=)

     at src/stream.c:2240
     s = 0x7f78e80173b0
     sess = 
     rqf_last = 
     rpf_last = 2147483648
     rq_prod_last = 
     rq_cons_last = 
     rp_cons_last = 
     rp_prod_last = 
     req_ana_back = 
     req = 0x7f78e80173c0
     res = 0x7f78e8017420
     si_f = 0x7f78e8017638
     si_b = 0x7f78e8017680
#13 0x55e2079ab1f8 in process_runnable_tasks () at src/task.c:381
     t = 
     state = 
     ctx = 
---Type <return> to continue, or q <return> to quit---
     process = 
     t = 
     max_processed = 
#14 0x55e207959c51 in run_poll_loop () at src/haproxy.c:2386
     next = 
     exp = 
#15 run_thread_poll_loop (data=) at src/haproxy.c:2451
     ptif = 
     ptdf = 
     start_lock = 0
#16 0x7f78f9f27494 in start_thread (arg=0x7f78f8522700) at