[Gluster-infra] [Bug 1621987] New: Glusto-tests build is failing because of some dependency issue

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1621987 Bug ID: 1621987 Summary: Glusto-tests build is failing because of some dependency issue Product: GlusterFS Version: 3.12 Component: project-infrastructure Assigne

[Gluster-infra] [Bug 1621978] New: Create a new list for ci results

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1621978 Bug ID: 1621978 Summary: Create a new list for ci results Product: GlusterFS Version: mainline Component: project-infrastructure Assignee: b...@gluster.org Reporter: nig...

[Gluster-infra] Post mortem of 2018-08-23 (2 for the price of one)

2018-08-23 Thread Michael Scherer
Hi, so we had 3 incidents in the last 24h, and while all of them are different, they are also linked. We faced several issues, starting with Gerrit showing error 500 last night, around 23:00 Paris time. That was https://bugzilla.redhat.com/show_bug.cgi?id=1620243 , and resulted in a memory upgra

Re: [Gluster-infra] More proxy cleanup coming

2018-08-23 Thread Michael Scherer
On Thursday 15 March 2018 at 15:35 +0100, Michael Scherer wrote: > Hi, > > So now that we have a new proxy (yes, I am almost as proud of it as the > firewall), I need to move the old service on the old proxy to the new > one. It will imply some time of unavailability, because DNS has > latency > to propa

Re: [Gluster-infra] 2 outages: http.int.rht.gluster.org disk full and DNS issue

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 17:12 +0200, Michael Scherer wrote: > Hi, > > quick note, we have 2 outages at the moment: > > - I changed build.gluster.org DNS, but somehow, it redirects to > supercolony.gluster and jenkins. Why, I am not sure, but I reverted my > DNS change and will investigate further, ca

[Gluster-infra] 2 outages: http.int.rht.gluster.org disk full and DNS issue

2018-08-23 Thread Michael Scherer
Hi, quick note, we have 2 outages at the moment: - I changed build.gluster.org DNS, but somehow, it redirects to supercolony.gluster and jenkins. Why, I am not sure, but I reverted my DNS change and will investigate further, because the syntax looked OK to me. So we have to wait until DNS is propagated

[Gluster-infra] Reboot of gerrit notes

2018-08-23 Thread Michael Scherer
So, a few notes, before I forget, since I am going to lunch: - it seems Gerrit didn't start automatically at boot. I thought this was supposed to be fixed and handled by systemd, but nope. Need to investigate more. - Gerrit, as usual, says it is ready, but it is not, since logging in to GitHub only works after

Re: [Gluster-infra] Emergency reboot on Gerrit today at 09h45 UTC

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 11:45 +0200, Michael Scherer wrote: > On Thursday 23 August 2018 at 10:46 +0200, Michael Scherer wrote: > > Hi, > > > > as noted in https://bugzilla.redhat.com/show_bug.cgi?id=1620243 , we > > have found that Gerrit can't sustain the load when receiving too > > many > > patc

Re: [Gluster-infra] Emergency reboot on Gerrit today at 09h45 UTC

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 10:46 +0200, Michael Scherer wrote: > Hi, > > as noted in https://bugzilla.redhat.com/show_bug.cgi?id=1620243 , we > have found that Gerrit can't sustain the load when receiving too many > patches at once. We are going to reboot the VM to bump the memory and > CPUs. We alre

Re: [Gluster-infra] Urgent Gerrit reboot today

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 14:01 +0530, Nigel Babu wrote: > Hello folks, > > We're going to do an urgent reboot of the Gerrit server in the next > 1h or > so. For some reason, hot-adding RAM on this machine isn't working, so > we're > going to do a reboot to get this working. This is needed to pre

Re: [Gluster-infra] [Gluster-devel] Reboot policy for the infra

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 11:37 +0300, Yaniv Kaul wrote: > On Thu, Aug 23, 2018 at 10:49 AM, Michael Scherer > wrote: > > > On Thursday 23 August 2018 at 11:21 +0530, Nigel Babu wrote: > > > One more piece that's missing is when we'll restart the physical > > > servers. > > > That seems to be en

[Gluster-infra] Emergency reboot on Gerrit today at 09h45 UTC

2018-08-23 Thread Michael Scherer
Hi, as noted in https://bugzilla.redhat.com/show_bug.cgi?id=1620243 , we have found that Gerrit can't sustain the load when receiving too many patches at once. We are going to reboot the VM to bump the memory and CPUs. We already used the hotplug feature of libvirt, so a reboot is required. We are g

Re: [Gluster-infra] [Gluster-devel] Reboot policy for the infra

2018-08-23 Thread Yaniv Kaul
On Thu, Aug 23, 2018 at 10:49 AM, Michael Scherer wrote: > On Thursday 23 August 2018 at 11:21 +0530, Nigel Babu wrote: > > One more piece that's missing is when we'll restart the physical > > servers. > > That seems to be entirely missing. The rest looks good to me and I'm > > happy > > to add an i

[Gluster-infra] Urgent Gerrit reboot today

2018-08-23 Thread Nigel Babu
Hello folks, We're going to do an urgent reboot of the Gerrit server in the next 1h or so. For some reason, hot-adding RAM on this machine isn't working, so we're going to do a reboot to get this working. This is needed to prevent the OOM Kill problems we've been running into since last night. --

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #14 from M. Scherer --- Also, I can't increase the root partition for some reason, and I need to figure out why, because LVM says there is enough space. I changed the configuration to have a maximum of 8G of RAM, so a reboot (not just reb

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #13 from M. Scherer --- It would need a reboot for that. -- You are receiving this mail because: You are on the CC list for the bug. Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=8sQgRd5Qin&a=cc_unsubscribe _

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #12 from Nigel Babu --- Alright, so 1 GB of swap isn't enough. Michael, can you give the VM 4 more GB of RAM? Please add another 10 Gig of disk space as well so we can have a larger swap partition. For each patch, there's a git pr

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #11 from Yaniv Kaul --- Do you really want to submit the above commits? Type 'yes' to confirm, other to cancel: yes remote: remote: Processing changes: (\) remote: Processing changes: updated: 15 (|) remote: Processing changes: up

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 Yaniv Kaul changed: What|Removed |Added Status|CLOSED |NEW Resolution|CURRENTRELEASE

Re: [Gluster-infra] Reboot policy for the infra

2018-08-23 Thread Michael Scherer
On Thursday 23 August 2018 at 11:21 +0530, Nigel Babu wrote: > One more piece that's missing is when we'll restart the physical > servers. > That seems to be entirely missing. The rest looks good to me and I'm > happy > to add an item to next sprint to automate the node rebooting. That's covered as "

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #9 from M. Scherer --- If swap is added, can it be also added in ansible ?

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 --- Comment #8 from Yaniv Kaul --- Note, Gerrit is very slow. Commands take a lot of time. The UI also seems sluggish.

[Gluster-infra] [Bug 1620243] Gerrit is non-responsive (503)

2018-08-23 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1620243 Nigel Babu changed: What|Removed |Added Status|VERIFIED|CLOSED Resolution|---