Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-21 Thread shane knapp
i've seen a few more builds fail w/timeouts and it appears that we're
definitely NOT hitting any rate limiting.

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22005/console

[jenkins@amp-jenkins-slave-01 ~]$ curl -i -H "Authorization: token REDACTED" https://api.github.com | grep Rate
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4997
X-RateLimit-Reset: 1413929848
Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit,
X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes,
X-Accepted-OAuth-Scopes, X-Poll-Interval
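As an aside, since checking the limit itself costs an API hit (as noted further down-thread), one cheap pattern is to capture the response headers once and parse them locally. A minimal sketch; the `rate_remaining` helper is hypothetical and not part of the Jenkins jobs:

```shell
#!/bin/sh
# Hypothetical helper: pull the X-RateLimit-Remaining value out of a set of
# HTTP response headers captured earlier, so repeated checks don't cost
# extra API hits. Not part of the actual Jenkins jobs.
rate_remaining() {
  # reads response headers on stdin, prints the remaining-request count
  grep -i '^X-RateLimit-Remaining:' | tr -d '\r' | awk '{print $2}'
}

# headers as captured once via:
#   curl -si -H "Authorization: token $TOKEN" https://api.github.com
headers='X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4997
X-RateLimit-Reset: 1413929848'

echo "$headers" | rate_remaining   # prints 4997
```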


Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-18 Thread Davies Liu
How can we tell whether the changes have been applied? I checked several
recent builds, and they all use the original configs.

Davies


Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-18 Thread Josh Rosen
I think that the fix was applied.  Take a look at 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21874/consoleFull

Here, I see a fetch command that mentions this specific PR branch rather than 
the wildcard that we had before:

  git fetch --tags --progress https://github.com/apache/spark.git 
  +refs/pull/2840/*:refs/remotes/origin/pr/2840/* # timeout=15

Do you have an example of a Spark PRB build that’s still failing with the old 
fetch failure?

- Josh

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-18 Thread Davies Liu
Cool, the most recent 4 builds used the new configs, thanks!

Let's run more builds.

Davies


Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-17 Thread Davies Liu
One finding is that all the timeout happened with this command:

git fetch --tags --progress https://github.com/apache/spark.git
+refs/pull/*:refs/remotes/origin/pr/*

I'm thinking this may be an expensive call; we could try a cheaper one:

git fetch --tags --progress https://github.com/apache/spark.git
+refs/pull/XXX/*:refs/remotes/origin/pr/XXX/*

where XXX is the pull request ID.

The configuration supports parameters [1], so we could use this:

+refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*

I have not tested this yet, could you give this a try?

Davies


[1] 
https://wiki.jenkins-ci.org/display/JENKINS/GitHub+pull+request+builder+plugin

On Fri, Oct 17, 2014 at 5:00 PM, shane knapp skn...@berkeley.edu wrote:
 actually, nvm, you have to run that command from our servers to affect
 our limit.  run it all you want from your own machines!  :P

 On Fri, Oct 17, 2014 at 4:59 PM, shane knapp skn...@berkeley.edu wrote:

 yep, and i will tell you guys ONLY if you promise to NOT try this
 yourselves...  checking the rate limit also counts as a hit and increments
 our numbers:

 # curl -i https://api.github.com/users/whatever 2> /dev/null | egrep '^X-Rate'
 X-RateLimit-Limit: 60
 X-RateLimit-Remaining: 51
 X-RateLimit-Reset: 1413590269

 (yes, that is the exact url that they recommended on the github site lol)

 so, earlier today, we had a spark build fail w/a git timeout at 10:57am,
 but there were only ~7 builds run that hour, so that points to us NOT
 hitting the rate limit...  at least for this fail.  whee!

 is it beer-thirty yet?

 shane



 On Fri, Oct 17, 2014 at 4:52 PM, Nicholas Chammas 
 nicholas.cham...@gmail.com wrote:

 Wow, thanks for this deep dive Shane. Is there a way to check if we are
 getting hit by rate limiting directly, or do we need to contact GitHub
 for that?

 2014년 10월 17일 금요일, shane knappskn...@berkeley.edu님이 작성한 메시지:

 quick update:

 here are some stats i scraped over the past week of ALL pull request
 builder projects and timeout failures.  due to the large number of spark
 ghprb jobs, i don't have great records earlier than oct 7th.  the data is
 current up until ~230pm today:

 spark and new spark ghprb total builds vs git fetch timeouts:
 $ for x in 10-{09..17}; do passed=$(grep $x SORTED.passed | grep -i
 spark | wc -l); failed=$(grep $x SORTED | grep -i spark | wc -l); let
 total=passed+failed; fail_percent=$(echo "scale=2; $failed/$total" | bc |
 sed 's/^\.//g'); line="$x -- total builds: $total\tp/f:
 $passed/$failed\tfail%: $fail_percent%"; echo -e "$line"; done
 10-09 -- total builds: 140 p/f: 92/48 fail%: 34%
 10-10 -- total builds: 65 p/f: 59/6 fail%: 09%
 10-11 -- total builds: 29 p/f: 29/0 fail%: 0%
 10-12 -- total builds: 24 p/f: 21/3 fail%: 12%
 10-13 -- total builds: 39 p/f: 35/4 fail%: 10%
 10-14 -- total builds: 7 p/f: 5/2 fail%: 28%
 10-15 -- total builds: 37 p/f: 34/3 fail%: 08%
 10-16 -- total builds: 71 p/f: 59/12 fail%: 16%
 10-17 -- total builds: 26 p/f: 20/6 fail%: 23%

 all other ghprb builds vs git fetch timeouts:
 $ for x in 10-{09..17}; do passed=$(grep $x SORTED.passed | grep -vi
 spark | wc -l); failed=$(grep $x SORTED | grep -vi spark | wc -l); let
 total=passed+failed; fail_percent=$(echo "scale=2; $failed/$total" | bc |
 sed 's/^\.//g'); line="$x -- total builds: $total\tp/f:
 $passed/$failed\tfail%: $fail_percent%"; echo -e "$line"; done
 10-09 -- total builds: 16 p/f: 16/0 fail%: 0%
 10-10 -- total builds: 46 p/f: 40/6 fail%: 13%
 10-11 -- total builds: 4 p/f: 4/0 fail%: 0%
 10-12 -- total builds: 2 p/f: 2/0 fail%: 0%
 10-13 -- total builds: 2 p/f: 2/0 fail%: 0%
 10-14 -- total builds: 10 p/f: 10/0 fail%: 0%
 10-15 -- total builds: 5 p/f: 5/0 fail%: 0%
 10-16 -- total builds: 5 p/f: 5/0 fail%: 0%
 10-17 -- total builds: 0 p/f: 0/0 fail%: 0%

 note:  the 15th was the day i rolled back to the earlier version of the
 git plugin.  it doesn't seem to have helped much, so i'll probably bring us
 back up to the latest version soon.
 also note:  rocking some floating point math on the CLI!  ;)

 i also compared the distribution of git timeout failures vs time of day,
 and there appears to be no correlation.  the failures are pretty evenly
 distributed over each hour of the day.

 we could be hitting the rate limit due to the ghprb hitting github a
 couple of times for each build, but we're averaging ~10-20 builds per hour
 (a build hits github 2-4 times, from what i can tell).  i'll have to look
 more in to this on monday, but suffice to say we may need to move from
 unauthorized https fetches to authorized requests.  this means retrofitting
 all of our jobs.  yay!  fun!  :)

 another option is to have local mirrors of all of the repos.  the
 problem w/this is that there might be a window where changes haven't made
 it to the local mirror and tests run against it.  more fun stuff to think
 about...

 now that i have some stats, and a list of all of the times/dates of the
 failures, i will be drafting my email to github and 

Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-17 Thread Josh Rosen
FYI, I edited the Spark Pull Request Builder job to try this out.  Let’s see if 
it works (I’ll be around to revert if it doesn’t).


Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-16 Thread shane knapp
the bad news is that we've had a couple more failures due to timeouts, but
the good news is that the frequency at which they happen has decreased
significantly (3 in the past ~18hr).

seems like the git plugin downgrade has helped relieve the problem, but
hasn't fixed it.  i'll be looking in to this more today.

On Wed, Oct 15, 2014 at 7:05 PM, Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

 A quick scan through the Spark PR board https://spark-prs.appspot.com/ shows
 no recent failures related to this git checkout problem.

 Looks promising!

 Nick

 On Wed, Oct 15, 2014 at 6:10 PM, shane knapp skn...@berkeley.edu wrote:

 ok, we've had about 10 spark pull request builds go through w/o any git
 timeouts.  it seems that the git timeout issue might be licked.

 i will be definitely be keeping an eye on this for the next few days.

 thanks for being patient!

 shane

 On Wed, Oct 15, 2014 at 2:27 PM, shane knapp skn...@berkeley.edu wrote:

  four builds triggered  and no timeouts.  :crossestoes:  :)
 
  On Wed, Oct 15, 2014 at 2:19 PM, shane knapp skn...@berkeley.edu
 wrote:
 
  ok, we're up and building...  :crossesfingersfortheumpteenthtime:
 
  On Wed, Oct 15, 2014 at 1:59 PM, Nicholas Chammas 
  nicholas.cham...@gmail.com wrote:
 
  I support this effort. :thumbsup:
 
  On Wed, Oct 15, 2014 at 4:52 PM, shane knapp skn...@berkeley.edu
  wrote:
 
  i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to
 see
  if
  that helps w/the git fetch timeouts.
 
  this will require a short downtime (~20 mins for builds to finish,
 ~20
  mins
  to downgrade), and will hopefully give us some insight in to wtf is
  going
  on.
 
  thanks for your patience...
 
  shane
 
 
   --
  You received this message because you are subscribed to the Google
  Groups amp-infra group.
  To unsubscribe from this group and stop receiving emails from it, send
  an email to amp-infra+unsubscr...@googlegroups.com.
  For more options, visit https://groups.google.com/d/optout.
 
 
 
 





Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-16 Thread Nicholas Chammas
Thanks for continuing to look into this, Shane.

One suggestion that Patrick brought up, if we have trouble getting to the
bottom of this, is doing the git checkout ourselves in the run-tests-jenkins
script and cutting out the Jenkins git plugin entirely. That way we could
script retries ourselves and post friendlier messages about timeouts if
they still occur.

Do you think that’s worth trying at some point?

Nick
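
(A retry wrapper along these lines might look like the following -- a
hypothetical sketch only, not the actual run-tests-jenkins code; the
`retry` helper name, the retry count, the delay, and the commented example
invocation are all assumptions:)

```shell
#!/usr/bin/env bash
# Hypothetical retry helper: run a command up to N times, sleeping between
# attempts, and print a friendlier message if it still fails at the end.
retry() {
  retries="$1"; delay="$2"; shift 2
  attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$retries" ]; then
      echo "command failed after $retries attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt/$retries failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example usage (the refspec mirrors the one seen in the build console
# logs; the counts are made up):
#   retry 3 15 git fetch --tags --progress https://github.com/apache/spark.git \
#     "+refs/pull/2840/*:refs/remotes/origin/pr/2840/*"
```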

On Thu, Oct 16, 2014 at 2:04 PM, shane knapp skn...@berkeley.edu wrote:

 the bad news is that we've had a couple more failures due to timeouts, but
 the good news is that the frequency at which they happen has decreased
 significantly (3 in the past ~18hr).

 seems like the git plugin downgrade has helped relieve the problem, but
 hasn't fixed it.  i'll be looking in to this more today.



Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-16 Thread shane knapp
yeah, at this point it might be worth trying.  :)

the absolutely irritating thing is that i am not seeing this happen w/any
jobs other than the spark prb, nor does it seem to correlate w/time of
day, network or system load, or what slave it runs on.  nor are we
hitting our limit of connections on github.  i really, truly hate
non-deterministic failures.

i'm also going to write an email to support@github and see if they have any
insight in to this as well.

On Thu, Oct 16, 2014 at 12:51 PM, Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

 Thanks for continuing to look into this, Shane.

 One suggestion that Patrick brought up, if we have trouble getting to the
 bottom of this, is doing the git checkout ourselves in the
 run-tests-jenkins script and cutting out the Jenkins git plugin entirely.
 That way we could script retries ourselves and post friendlier messages
 about timeouts if they still occur.

 Do you think that’s worth trying at some point?

 Nick



Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-16 Thread Nicholas Chammas
On Thu, Oct 16, 2014 at 3:55 PM, shane knapp skn...@berkeley.edu wrote:

 i really, truly hate non-deterministic failures.


Amen bruddah.


short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to see if
that helps w/the git fetch timeouts.

this will require a short downtime (~20 mins for builds to finish, ~20 mins
to downgrade), and will hopefully give us some insight in to wtf is going
on.

thanks for your patience...

shane


Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread Nicholas Chammas
I support this effort. :thumbsup:

On Wed, Oct 15, 2014 at 4:52 PM, shane knapp skn...@berkeley.edu wrote:

 i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to see if
 that helps w/the git fetch timeouts.

 this will require a short downtime (~20 mins for builds to finish, ~20 mins
 to downgrade), and will hopefully give us some insight in to wtf is going
 on.

 thanks for your patience...

 shane



Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
ok, we're up and building...  :crossesfingersfortheumpteenthtime:

On Wed, Oct 15, 2014 at 1:59 PM, Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

 I support this effort. :thumbsup:




Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
four builds triggered  and no timeouts.  :crossestoes:  :)

On Wed, Oct 15, 2014 at 2:19 PM, shane knapp skn...@berkeley.edu wrote:

 ok, we're up and building...  :crossesfingersfortheumpteenthtime:






Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread shane knapp
ok, we've had about 10 spark pull request builds go through w/o any git
timeouts.  it seems that the git timeout issue might be licked.

i will definitely be keeping an eye on this for the next few days.

thanks for being patient!

shane

On Wed, Oct 15, 2014 at 2:27 PM, shane knapp skn...@berkeley.edu wrote:

 four builds triggered  and no timeouts.  :crossestoes:  :)







Re: short jenkins downtime -- trying to get to the bottom of the git fetch timeouts

2014-10-15 Thread Nicholas Chammas
A quick scan through the Spark PR board https://spark-prs.appspot.com/ shows
no recent failures related to this git checkout problem.

Looks promising!

Nick

On Wed, Oct 15, 2014 at 6:10 PM, shane knapp skn...@berkeley.edu wrote:

 ok, we've had about 10 spark pull request builds go through w/o any git
 timeouts.  it seems that the git timeout issue might be licked.

 i will definitely be keeping an eye on this for the next few days.

 thanks for being patient!

 shane
