[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Chris McMahon  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #35 from Chris McMahon  ---
Between changing the timeout value for Mobile varnish, putting the first page
load in a try/catch clause in the automated browser test, and hopefully
reducing parsing time everywhere, I'm going to go ahead and close this.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-31 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #34 from Antoine "hashar" Musso  ---
Is that still an issue? We can enable varnish log on the mobile cache as
described in Comment #17.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Faidon Liambotis  changed:

   What|Removed |Added

 Status|PATCH_TO_REVIEW |NEW

--- Comment #33 from Faidon Liambotis  ---
Chris, your theory sounds very plausible. As seen above, I fixed & merged your
patch to increase the timeout for mobile as well, which should help reduce the
occurrence of this or even eliminate it.

It's not the underlying cause, though. In comment #6 I theorized that these are
lingering issues from #57026; that bug has since been reopened (by others :),
and Brad has investigated, identified another issue, and proposed a fix that
has not yet been merged.

What probably happens is that, due to that bug, double-parsing occurs and
doubles the time it takes to render large pages such as Barack Obama, which
pushes response times past the 30s Varnish timeout mark.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #32 from Gerrit Notification Bot  ---
Change 103376 merged by Faidon Liambotis:
varnish: adjust first_byte_timeout for mobile too

https://gerrit.wikimedia.org/r/103376



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Gerrit Notification Bot  changed:

   What|Removed |Added

 Status|NEW |PATCH_TO_REVIEW



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #31 from Gerrit Notification Bot  ---
Change 103376 had a related patch set uploaded by Faidon Liambotis:
varnish: adjust first_byte_timeout for mobile too

https://gerrit.wikimedia.org/r/103376



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #30 from Antoine "hashar" Musso  ---
Some entries from /data/project/logs/archive/slow-parse.log, which logs the
parsing time of articles that take longer than X seconds to parse:

2013-12-18 06:56:11 deployment-apache32 enwiki: 34.53 Barack_Obama
2013-12-18 15:11:21 deployment-apache32 enwiki: 34.99 Barack_Obama
2013-12-18 15:11:35 deployment-apache33 enwiki: 14.24 Barack_Obama
2013-12-18 21:29:00 deployment-apache33 enwiki: 37.49 Barack_Obama
2013-12-18 21:29:11 deployment-apache33 enwiki: 13.17 Barack_Obama
2013-12-19 06:16:51 deployment-apache32 enwiki: 13.79 Barack_Obama
2013-12-19 06:16:51 deployment-apache32 enwiki: 48.87 Barack_Obama
2013-12-19 06:16:51 deployment-apache33 enwiki: 83.96 Barack_Obama
2013-12-19 17:27:58 deployment-apache33 enwiki: 43.98 Barack_Obama
2013-12-19 17:28:01 deployment-apache32 enwiki: 12.60 Barack_Obama
2013-12-19 21:32:00 deployment-apache32 enwiki: 41.26 Barack_Obama
2013-12-19 21:32:07 deployment-apache32 enwiki: 12.45 Barack_Obama
2013-12-20 06:09:31 deployment-apache33 enwiki: 35.53 Barack_Obama
2013-12-20 06:09:43 deployment-apache32 enwiki: 12.77 Barack_Obama
2013-12-20 14:43:16 deployment-apache33 simplewiki: 3.17  Barack_Obama

As of Fri Dec 20 19:58:03 UTC 2013

The text caches have a 185-second first byte timeout.

The mobile caches apparently have a 35-second first byte timeout. If the 503s
are mostly encountered in mobile tests, that would explain them.


One might want to talk about it with ops and find out why the timeouts are
different on text and mobile.  We might be able to raise it on mobile for the
beta cluster (puppet change in manifests/role/cache.pp).
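
To compare the log entries above against the two timeouts, a minimal Ruby
sketch follows. It assumes the line layout seen in the sample entries (date,
time, host, wiki, seconds, title). The constant, struct, and method names are
hypothetical, and the 35 and 185 second thresholds are simply the values
quoted in this comment.

```ruby
# Hypothetical helper: the field layout is inferred from the sample lines above.
MOBILE_FIRST_BYTE_TIMEOUT = 35.0   # seconds, as reported for the mobile caches
TEXT_FIRST_BYTE_TIMEOUT   = 185.0  # seconds, as reported for the text caches

SlowParse = Struct.new(:timestamp, :host, :wiki, :seconds, :title)

# Parse one slow-parse.log line; returns nil if the line does not match.
def parse_slow_parse_line(line)
  m = line.match(
    /\A(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+) (\S+): ([\d.]+)\s+(\S+)\s*\z/
  )
  return nil unless m
  SlowParse.new(m[1], m[2], m[3], m[4].to_f, m[5])
end

# Entries whose parse time alone exceeds the given timeout (in seconds).
def over_timeout(lines, timeout)
  lines.map { |l| parse_slow_parse_line(l) }
       .compact
       .select { |e| e.seconds > timeout }
end

# A few of the entries quoted above.
SAMPLE = [
  '2013-12-18 06:56:11 deployment-apache32 enwiki: 34.53 Barack_Obama',
  '2013-12-18 21:29:00 deployment-apache33 enwiki: 37.49 Barack_Obama',
  '2013-12-18 21:29:11 deployment-apache33 enwiki: 13.17 Barack_Obama',
  '2013-12-19 06:16:51 deployment-apache33 enwiki: 83.96 Barack_Obama',
  '2013-12-20 14:43:16 deployment-apache33 simplewiki: 3.17  Barack_Obama'
].freeze

over_timeout(SAMPLE, MOBILE_FIRST_BYTE_TIMEOUT).each do |e|
  puts format('%s %s %s %.2fs %s', e.timestamp, e.host, e.wiki, e.seconds, e.title)
end
```

Parse time is only part of the time to first byte, so entries just under the
threshold (such as 34.53 and 34.99) may still run into the mobile timeout.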



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #29 from Arthur Richards  ---
Glad we at least have a workaround for now, thanks Chris!



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-20 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #28 from Chris McMahon  ---

I figured out what's going on here, and I created a workaround in the Mobile
browser tests for it.  Here is some more information: 

We have four tests that use the Barack_Obama page.  

The first test to hit the Barack_Obama page (or any number of other pages in
beta labs I assume) gets a 503 error. 

After that, the Barack_Obama page is in the Varnish cache, so the next three
tests to load the page succeed without error.  

At some point after the Mobile browser test build finishes, the version of the
page in Varnish expires, so the next browser test build encounters the 503
again. 

The workaround is to have the test load the page while disregarding any error,
then load the page again for real.  It is not a significant performance hit to
do this: 

Given /^I am on the (.+) article$/ do |article|
  begin
    # Put the page into the Varnish cache to avoid 503 errors.
    visit(ArticlePage, :using_params => {:article_name => article})
  rescue
    # Ignore any error from this warm-up load.
  end
  visit(ArticlePage, :using_params => {:article_name => article})
end
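
The same warm-up-then-load pattern can be sketched outside of Cucumber. This
is only an illustration under stated assumptions: warm_then_run is a
hypothetical helper, not part of the MobileFrontend test suite, and the block
below simulates a backend whose first request fails.

```ruby
# Hypothetical helper, not part of the MobileFrontend test suite.
# Runs the block once and discards any StandardError (the warm-up request),
# then runs it again and lets errors propagate (the real request).
def warm_then_run
  begin
    yield :warmup
  rescue StandardError
    # Ignore: the first request may get a 503 before the page is cached.
  end
  yield :real
end

# Simulated backend: the first call fails, later calls succeed.
calls = 0
result = warm_then_run do |phase|
  calls += 1
  raise 'simulated 503' if calls == 1
  "ok (#{phase})"
end
puts result
```

Only the warm-up call swallows errors, so a failure on the second load still
fails the test.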



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-19 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #27 from Chris McMahon  ---
As of Dec 12 we see 503s on beta for Firefox and also for Chrome: 
https://wmf.ci.cloudbees.com/job/MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox/218/



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Antoine "hashar" Musso  changed:

   What|Removed |Added

 Status|PATCH_TO_REVIEW |NEW

--- Comment #26 from Antoine "hashar" Musso  ---
Puppet is no longer restarting Apache on the beta cluster.

I have cleared the log file mentioned in Comment #17.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #25 from Gerrit Notification Bot  ---
Change 100988 merged by ArielGlenn:
beta: dont restart apache on fake mwsync

https://gerrit.wikimedia.org/r/100988



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Gerrit Notification Bot  changed:

   What|Removed |Added

 Status|NEW |PATCH_TO_REVIEW



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #24 from Gerrit Notification Bot  ---
Change 100988 had a related patch set uploaded by Hashar:
beta: dont restart apache on fake mwsync

https://gerrit.wikimedia.org/r/100988



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #23 from Antoine "hashar" Musso  ---
RT #6500 is fixed; puppet no longer restarts Apache every time it runs. That
impacted production as well.  It was definitely the root cause of some of the
503s we were receiving, albeit probably not the only cause.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #22 from Antoine "hashar" Musso  ---
One issue is that puppet keeps restarting Apache. I haven't correlated the
restarts mentioned in /data/project/logs/apache-error.log with the 503 errors
reported in the varnish log.  Anyway, I filed an RT ticket for ops to look at
the puppet conf: RT #6500.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #21 from Ryan Kaldari  ---
I'm getting the 503s in Firefox (regarding comment 15):

Request: GET
http://en.wikipedia.beta.wmflabs.org/wiki/This_page_has_issues?action=purge,
from 216.38.130.164 via deployment-cache-mobile01 frontend ([10.4.1.82]:80),
Varnish XID 2095358945
Forwarded for: 216.38.130.164
Error: 503, Service Unavailable at Thu, 12 Dec 2013 00:06:23 GMT



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Ryan Kaldari  changed:

   What|Removed |Added

 CC||rkald...@wikimedia.org

--- Comment #20 from Ryan Kaldari  ---
Almost all of my API requests on beta labs are returning 503 errors today.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #19 from Arthur Richards  ---
Thanks so much Antoine, I am really happy you are on the path to getting to the
bottom of this :)



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #18 from Antoine "hashar" Musso  ---
The log file is not that great for human consumption/filtering. One can review
the logs by connecting to deployment-cache-text1 and then using

 varnishlog -r /data/project/logs/varnish-cache-text1.log

That will replay all the logs. One can later filter for the 503s and hopefully
find out the reason for the failure.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #17 from Antoine "hashar" Musso  ---
I have started a background varnishlog process on the
deployment-cache-text1.pmtpa.wmflabs instance. Started as root using:

 nohup varnishlog -w /data/project/logs/varnish-cache-text1.log \
  2>&1 > /data/project/logs/varnishlogger-cache-text1.log &

Since the log file is in the shared directory, it is accessible from
deployment-bastion.pmtpa.wmflabs.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #16 from Matthew Flaschen  ---
(In reply to comment #15)
> I took a look at the historical browser test failures and to my surprise
> discovered that all of the 503 errors are happening in Chrome and none (or
> very few) are happening in Firefox. 

I've had it in Firefox (manual testing, not an automated browser test).  I
believe the examples I posted above were.  The next time it happens in Firefox,
I'll try to post the debugging info here.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

--- Comment #15 from Chris McMahon  ---
I took a look at the historical browser test failures and to my surprise
discovered that all of the 503 errors are happening in Chrome and none (or very
few) are happening in Firefox. 

I asked about this in #wikimedia-ops.  It may be that "Chrome presents a header
that the varnish setup has a Vary on". 

I haven't yet investigated the headers being sent by MF in the browsers, but
this might be a factor in the ongoing 503s.



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

dan  changed:

   What|Removed |Added

 CC||d_ent...@yahoo.com

--- Comment #14 from dan  ---
Request: POST http://commons.wikimedia.beta.wmflabs.org/wiki/Special:GWToolset,
from 83.163.0.31 via deployment-cache-text1 frontend ([10.4.1.133]:80), Varnish
XID 2142135504
Forwarded for: 83.163.0.31
Error: 503, Service Unavailable at Mon, 09 Dec 2013 20:20:02 GMT



[Bug 57249] intermittent 503 errors on beta.wmflabs.org

2013-12-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=57249

Krinkle  changed:

   What|Removed |Added

Summary|intermittent 503 errors |intermittent 503 errors on
   ||beta.wmflabs.org
