Re: regtest: abns should work now :-)
On Fri, 3 Apr 2020 at 16:56, Илья Шипицин wrote:
> I tried custom Docker images in GitHub Actions.
>
> Some parts of the GitHub runner are executed inside the container, which
> breaks CentOS 6, for example:
> https://github.com/actions/runner/issues/337

Here is the corresponding workflow:
https://github.com/chipitsine/haproxy/commit/20fabcd005dc9e3bac54a84bf44631f177fa79c2

> However, I was able to run Fedora Rawhide.
>
> If that will work, why not?
> If you get it working on CircleCI, I do not mind. CircleCI is nice.
Re: regtest: abns should work now :-)
On Fri, 3 Apr 2020 at 16:33, Martin Grigorov wrote:
> Here is a PR for the Varnish Cache project where I use Docker + QEMU to build
> and package for several Linux distros and two architectures:
> https://github.com/varnishcache/varnish-cache/pull/3263
> They use CircleCI, but I guess the same approach can be applied on GitHub
> Actions.
> If you are interested in this approach, I could give it a try.

I tried custom Docker images in GitHub Actions.

Some parts of the GitHub runner are executed inside the container, which
breaks CentOS 6, for example:
https://github.com/actions/runner/issues/337

However, I was able to run Fedora Rawhide.

If that will work, why not?
If you get it working on CircleCI, I do not mind. CircleCI is nice.
Re: regtest: abns should work now :-)
Hi everyone,

On Mon, Mar 23, 2020 at 11:11 AM Martin Grigorov wrote:
> Have you tried with multiarch Docker?
>
> 1) execute
>    docker run --rm --privileged multiarch/qemu-user-static:register --reset
>    to register QEMU
>
> 2) create Dockerfile
>    for Centos use: FROM multiarch/centos:7-aarch64-clean
>    for Ubuntu use: FROM multiarch/ubuntu-core:arm64-bionic
>
> 3) enjoy :-)

Here is a PR for the Varnish Cache project where I use Docker + QEMU to build
and package for several Linux distros and two architectures:
https://github.com/varnishcache/varnish-cache/pull/3263
They use CircleCI, but I guess the same approach can be applied on GitHub
Actions.
If you are interested in this approach, I could give it a try.

Regards,
Martin
Re: regtest: abns should work now :-)
On Mon, Mar 23, 2020 at 01:51:23PM +0500, Илья Шипицин wrote:
> The OSX builds have not been stable on Travis for a few days, and neither
> has s390x.
>
> I think we should wait a few days, and if they are not repaired, mark
> all of them as "allowed failures" for good.

Sounds good to me. Thanks!

Willy
Re: regtest: abns should work now :-)
Hi Илья,

On Mon, Mar 23, 2020 at 10:52 AM Илья Шипицин wrote:
> Well, I tried to reproduce the ABNS failures on x86_64.
> I chose an MS Azure VM of a completely different size, in both number of
> CPUs and RAM.
> The failure never reproduced, even over 1000 runs in a loop.
>
> So I decided "it looks like something with memory alignment".
> I also tried to run arm64 emulation on VirtualBox. No luck yet.

Have you tried with multiarch Docker?

1) execute
   docker run --rm --privileged multiarch/qemu-user-static:register --reset
   to register QEMU

2) create Dockerfile
   for Centos use: FROM multiarch/centos:7-aarch64-clean
   for Ubuntu use: FROM multiarch/ubuntu-core:arm64-bionic

3) enjoy :-)
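The three multiarch Docker steps above could be strung together roughly as follows. This is only a sketch: the `RUN_DOCKER` switch, the image tag `abns-arm64-test`, and the `uname -m` check are assumptions added for illustration and are not part of the thread. Docker is only invoked when `RUN_DOCKER=1` is set, since registering QEMU needs a privileged container.

```shell
#!/bin/sh
# Sketch of the multiarch Docker steps from the message above.
# RUN_DOCKER and the image tag are hypothetical names used only in this sketch.
set -e

workdir=$(mktemp -d)

# Step 2: a minimal Dockerfile based on the arm64 Ubuntu image named in the thread.
# The CMD line is an assumption: it just reports the architecture seen in the container.
cat > "$workdir/Dockerfile" <<'EOF'
FROM multiarch/ubuntu-core:arm64-bionic
CMD ["uname", "-m"]
EOF
echo "Dockerfile written to $workdir/Dockerfile"

if [ "${RUN_DOCKER:-0}" = "1" ]; then
    # Step 1: register the QEMU handlers on the host (needs --privileged).
    docker run --rm --privileged multiarch/qemu-user-static:register --reset
    # Step 3: build and run the image; the container should report the emulated architecture.
    docker build -t abns-arm64-test "$workdir"
    docker run --rm abns-arm64-test
fi
```

To run the tests under emulation, the `CMD` line would be replaced by whatever builds and runs them inside the container; the thread does not say what that command is.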
Re: regtest: abns should work now :-)
The OSX builds have not been stable on Travis for a few days, and neither
has s390x.

I think we should wait a few days, and if they are not repaired, mark
all of them as "allowed failures" for good.
Re: regtest: abns should work now :-)
Well, I tried to reproduce the ABNS failures on x86_64.
I chose an MS Azure VM of a completely different size, in both number of
CPUs and RAM.
The failure never reproduced, even over 1000 runs in a loop.

So I decided "it looks like something with memory alignment".
I also tried to run arm64 emulation on VirtualBox. No luck yet.
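A loop like the "1000 runs" mentioned above might look like the sketch below. The helper name `repeat_until_fail` is hypothetical, and the command to repeat is left to the caller because the thread does not give the exact test invocation.

```shell
#!/bin/sh
# Run a command up to COUNT times and stop at the first failure.
# Usage: repeat_until_fail COUNT COMMAND [ARGS...]
# (hypothetical helper, not part of the HAProxy test suite)
repeat_until_fail() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            echo "failed on run $i of $count"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count runs passed"
}
```

It would be called as `repeat_until_fail 1000 <test command>`. Stopping at the first failure keeps the machine in the failing state for inspection, which matters for a timing-dependent bug.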
regtest: abns should work now :-)
Hi Ilya,

I think this time I managed to fix the ABNS test. To make a long story
short, it was by design extremely sensitive to the new process's startup
time, which is increased with larger FD counts and/or less powerful VMs
and/or noisy neighbors. This explains why it started to misbehave with
the commit which relaxed the maxconn limitations. A starting process
stealing a few ms of CPU from the old one could make its keep-alive
timeout expire before it got a new request on a reused connection,
resulting in an empty response as reported by the client.

I'm going to issue dev5 now. s390x is currently down but all x86 ones
build and run fine for now.

Cheers,
Willy
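The timing sensitivity described above can be pictured with a toy model. Everything in it is an assumption made for illustration: the function name, the millisecond values, and the simplification that the stolen CPU time simply adds to the idle time of the reused connection. It is not how the test or HAProxy measures anything.

```shell
#!/bin/sh
# Toy model only: all numbers are made up, none come from HAProxy or the regtest.
# The old process closes an idle keep-alive connection once the idle time
# reaches its timeout. CPU time stolen by the starting process delays the old
# one, so the next request on the reused connection can arrive too late.
outcome() {
    gap_ms=$1       # time between the previous response and the next request
    stall_ms=$2     # CPU time the starting process steals from the old one
    timeout_ms=$3   # keep-alive timeout of the old process
    if [ $((gap_ms + stall_ms)) -ge "$timeout_ms" ]; then
        echo "empty response"
    else
        echo "normal response"
    fi
}

outcome 8 0 10   # no contention: the request is served in time
outcome 8 5 10   # a few ms stolen: the timeout expires first
```

In this model a slower VM or a larger FD count only increases `stall_ms`, which is enough to flip the result without any change to the test itself.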