Re: [build system] jenkins wedged, had to do a quick restart

2016-01-11 Thread shane knapp
...aaand we're back up and building. shane On Mon, Jan 11, 2016 at 9:47 AM, shane knapp <skn...@berkeley.edu> wrote: > jenkins looked to be wedged, and nothing was showing up in the logs. > i tried a restart, and am still looking in to the problem. > > we should be ba

[build system] jenkins wedged, had to do a quick restart

2016-01-11 Thread shane knapp
jenkins looked to be wedged, and nothing was showing up in the logs. i tried a restart, and am still looking in to the problem. we should be back up and building shortly. sorry for the inconvenience. shane - To unsubscribe,

Re: [build system] brief downtime, 8am PST thursday feb 10th

2016-02-10 Thread shane knapp
reminder: this is happening tomorrow morning. On Mon, Feb 8, 2016 at 9:27 AM, shane knapp <skn...@berkeley.edu> wrote: > happy monday! > > i will be bringing down jenkins and the workers thursday morning to > upgrade docker on all of the workers from 1.5.0-1 to 1.7.1-2. >

[build system] brief downtime, 8am PST thursday feb 10th

2016-02-08 Thread shane knapp
happy monday! i will be bringing down jenkins and the workers thursday morning to upgrade docker on all of the workers from 1.5.0-1 to 1.7.1-2. as of december last year, docker 1.5 and older lost the ability to pull from the docker hub. since we're running centos 6.X on our workers, and can't

Re: [build system] brief downtime, 8am PST thursday feb 10th

2016-02-11 Thread shane knapp
reminder: this is happening in ~30 minutes On Wed, Feb 10, 2016 at 10:58 AM, shane knapp <skn...@berkeley.edu> wrote: > reminder: this is happening tomorrow morning. > > On Mon, Feb 8, 2016 at 9:27 AM, shane knapp <skn...@berkeley.edu> wrote: >> happy monday! >

Re: [build system] brief downtime, 8am PST thursday feb 10th

2016-02-11 Thread shane knapp
this is now done. On Thu, Feb 11, 2016 at 7:35 AM, shane knapp <skn...@berkeley.edu> wrote: > reminder: this is happening in ~30 minutes > > > On Wed, Feb 10, 2016 at 10:58 AM, shane knapp <skn...@berkeley.edu> wrote: >> reminder: this is happening tomorrow morn

Re: [build system] additional jenkins downtime next thursday

2016-02-24 Thread shane knapp
-worker-08 will also be getting a reboot to test out a fix for: https://github.com/apache/spark/pull/9893 shane On Wed, Feb 17, 2016 at 10:47 AM, shane knapp <skn...@berkeley.edu> wrote: > the security release has been delayed until next wednesday morning, > and i'll be doing the upgrade

Re: [build system] additional jenkins downtime next thursday

2016-02-25 Thread shane knapp
this is happening now. On Wed, Feb 24, 2016 at 6:08 PM, shane knapp <skn...@berkeley.edu> wrote: > the security update has been released, and it's a doozy! > > https://wiki.jenkins-ci.org/display/SECURITY/Security+Advisory+2016-02-24 > > i will be putting jenkins in t

Re: [build system] additional jenkins downtime next thursday

2016-02-25 Thread shane knapp
alright, the update is done and worker-08 rebooted. we're back up and building already! On Thu, Feb 25, 2016 at 8:15 AM, shane knapp <skn...@berkeley.edu> wrote: > this is happening now. > > On Wed, Feb 24, 2016 at 6:08 PM, shane knapp <skn...@berkeley.edu> wrote: >> t

Re: [build system] additional jenkins downtime next thursday

2016-02-17 Thread shane knapp
the security release has been delayed until next wednesday morning, and i'll be doing the upgrade first thing thursday morning. i'll update everyone when i get more information. thanks! shane On Thu, Feb 11, 2016 at 10:19 AM, shane knapp <skn...@berkeley.edu> wrote: > there's a big

FYI: github is getting DDOSed

2016-02-17 Thread shane knapp
this may cause builds to time out on the git fetch much more than usual[1]. https://status.github.com/messages just thought people might want to know... shane 1 -- this actually happens pretty often, sadly.

[build system] additional jenkins downtime next thursday

2016-02-11 Thread shane knapp
there's a big security patch coming out next week, and i'd like to upgrade our jenkins installation so that we're covered. it'll be around 8am, again, and i'll send out more details about the upgrade when i get them. thanks! shane

[build system] taking amp-jenkins-worker-06 and -07 offline due to disk space issues

2016-04-08 Thread shane knapp
looks like something filled up /home (0% space left), and i'll need to figure out what that is as well as clean up some space. once we're good, i'll put them back online and let everyone know.
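A minimal sketch of the kind of free-space check that would flag a full partition like the one described above. This is not the monitoring used on the workers; the function names and the 5% threshold are assumptions for illustration, and the example runs against `/` only because that path exists everywhere (on the workers the path of interest would be `/home`).

```python
import shutil


def percent_free(path):
    """Return the percentage of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.free / usage.total


def is_nearly_full(path, min_free_percent=5.0):
    """True when free space drops below the (hypothetical) threshold."""
    return percent_free(path) < min_free_percent


# Example run against the root filesystem.
free_pct = percent_free("/")
print(f"/: {free_pct:.1f}% free, nearly full: {is_nearly_full('/')}")
```

A check like this only says that a partition is full, not what filled it; finding the culprit still needs a per-directory size listing.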

Re: [build system] taking amp-jenkins-worker-06 and -07 offline due to disk space issues

2016-04-08 Thread shane knapp
ticket for further investigation. On Fri, Apr 8, 2016 at 11:18 AM, shane knapp <skn...@berkeley.edu> wrote: > looks like something filled up /home (0% space left), and i'll need to > figure out what that is as well as clean up some space. > > once we're good, i'll put them b

[build system] short downtime wednesday morning (4-27-16), 7-9am

2016-04-25 Thread shane knapp
another project hosted on our jenkins (e-mission) needs anaconda scipy upgraded from 0.15.1 to 0.17.0. this will also upgrade a few other libs, which i've included at the end of this email. i've spoken w/josh @ databricks and we don't believe that this will impact the spark builds at all. if

Re: [build system] short downtime wednesday morning (4-27-16), 7-9am

2016-04-27 Thread shane knapp
this will be postponed due to the 2.0 code freeze. sorry for the late notice. On Mon, Apr 25, 2016 at 4:50 PM, shane knapp <skn...@berkeley.edu> wrote: > another project hosted on our jenkins (e-mission) needs anaconda scipy > upgraded from 0.15.1 to 0.17.0. this will also upgrade

Re: [build system] short downtime wednesday morning (4-27-16), 7-9am

2016-04-27 Thread shane knapp
we're going to go ahead and do this on monday. i'll send out another email later this week w/the details. On Wed, Apr 27, 2016 at 8:50 AM, shane knapp <skn...@berkeley.edu> wrote: > this will be postponed due to the 2.0 code freeze. sorry for the late notice. > > On Mon, Apr 25,

[build system] short downtime monday morning (5-2-16), 7-9am PDT

2016-04-29 Thread shane knapp
(copy-pasta of previous message) another project hosted on our jenkins (e-mission) needs anaconda scipy upgraded from 0.15.1 to 0.17.0. this will also upgrade a few other libs, which i've included at the end of this email. i've spoken w/josh @ databricks and we don't believe that this will

Re: Using Travis for JDK7/8 compilation and lint-java.

2016-05-23 Thread shane knapp
chiming in, as i'm the one who currently maintains the CI infrastructure... :) +1 on not having more than one CI system... there's no way i can commit to keeping an eye on anything else other than jenkins. and i agree wholeheartedly w/michael: if it's this important, let's add it to the

Re: Using Travis for JDK7/8 compilation and lint-java.

2016-05-24 Thread shane knapp
> Sure, could you give me the permission for Spark Jira? > > Although we haven't decided yet, I can add Travis related section > (summarizing current configurations and expected VM HW, etc). > i can't give you permissions -- that has to be (most likely) through someone @ databricks, like michael.

Re: Using Travis for JDK7/8 compilation and lint-java.

2016-05-24 Thread shane knapp
> Another clarification: not databricks, but the Apache Spark PMC grants > access to the JIRA / wiki. That said... I'm not actually sure how it's done. word. i'll make the changes if we need to.

Re: [build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-09 Thread shane knapp
reminder: this is happening thursday morning. On Wed, May 4, 2016 at 11:38 AM, shane knapp <skn...@berkeley.edu> wrote: > there's a security update coming out for jenkins next week, and i'm > going to install the update first thing thursday morning. > > i'll send out another r

Re: [build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-12 Thread shane knapp
ok, i've decided to roll back the upgrade and do this again early next week. some of the new features/security fixes break the pull request builder, so i will need to revisit my plan. sorry for the downtime -- we're back up and running now. On Thu, May 12, 2016 at 8:41 AM, shane knapp <

Re: [build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-12 Thread shane knapp
, 2016 at 8:00 AM, shane knapp <skn...@berkeley.edu> wrote: > this is happening now. > > On Wed, May 11, 2016 at 4:42 PM, shane knapp <skn...@berkeley.edu> wrote: >> reminder: this is happening tomorrow morning! >> >> 7am PDT: builds paused >> 8am PD

Re: [build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-12 Thread shane knapp
this is happening now. On Wed, May 11, 2016 at 4:42 PM, shane knapp <skn...@berkeley.edu> wrote: > reminder: this is happening tomorrow morning! > > 7am PDT: builds paused > 8am PDT: master reboot, upgrade happens > 9am PDT: builds restarted > > On Mon, May 9, 2

Re: [build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-11 Thread shane knapp
reminder: this is happening tomorrow morning! 7am PDT: builds paused 8am PDT: master reboot, upgrade happens 9am PDT: builds restarted On Mon, May 9, 2016 at 4:17 PM, shane knapp <skn...@berkeley.edu> wrote: > reminder: this is happening thursday morning. > > On Wed, May 4, 2

Re: [build system] short downtime monday morning (5-2-16), 7-9am PDT

2016-05-02 Thread shane knapp
this is happening now. On Fri, Apr 29, 2016 at 12:52 PM, shane knapp <skn...@berkeley.edu> wrote: > (copy-pasta of previous message) > > another project hosted on our jenkins (e-mission) needs anaconda scipy > upgraded from 0.15.1 to 0.17.0. this will also upgrade a few other

Re: [build system] short downtime monday morning (5-2-16), 7-9am PDT

2016-05-02 Thread shane knapp
of these two machines. i'm also unpausing builds. On Mon, May 2, 2016 at 8:26 AM, shane knapp <skn...@berkeley.edu> wrote: > this is happening now. > > On Fri, Apr 29, 2016 at 12:52 PM, shane knapp <skn...@berkeley.edu> wrote: >> (copy-pasta of previous message) >>

Re: [build system] short downtime monday morning (5-2-16), 7-9am PDT

2016-05-02 Thread shane knapp
workers -01 and -04 are back up, as is -06 (as i hit the wrong power button by accident). :) -01 and -04 got hung on shutdown, so i'll investigate them and see what exactly happened. regardless, we should be building happily! On Mon, May 2, 2016 at 8:44 AM, shane knapp <skn...@berkeley.

[build system] short downtime next thursday morning, 5-12-16 @ 8am PDT

2016-05-04 Thread shane knapp
there's a security update coming out for jenkins next week, and i'm going to install the update first thing thursday morning. i'll send out another reminder early next week. thanks! shane

Re: [build system] issue w/jenkins

2016-04-18 Thread shane knapp
AM, shane knapp <skn...@berkeley.edu> wrote: > for now, you can log in to jenkins by ignoring the http reverse proxy: > https://hadrian.ist.berkeley.edu/jenkins/ > > this still doesn't allow for things like the pull request builder and > whatnot to run... i'm still digging i

Re: [build system] issue w/jenkins

2016-04-18 Thread shane knapp
for now, you can log in to jenkins by ignoring the http reverse proxy: https://hadrian.ist.berkeley.edu/jenkins/ this still doesn't allow for things like the pull request builder and whatnot to run... i'm still digging in to this. thanks, shane On Mon, Apr 18, 2016 at 10:02 AM, shane knapp

Re: Using Travis for JDK7/8 compilation and lint-java.

2016-05-24 Thread shane knapp
> As Sean said, Vanzin made a PR for JDK7 compilation. We can ignore the issue > of JDK7 compilation. > vanzin and i are working together on this right now... we currently have java 7u79 installed on all of the workers. if some random test failures keep happening during his tests, i will roll

[build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-25 Thread shane knapp
around 1pm friday, july 29th, we will be taking jenkins down for a rack move and celebrating national systems administrator day. the outage should only last a couple of hours at most, and will be concluded with champagne toasts. yes, the outage and holiday are real, but the champagne in the

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-28 Thread shane knapp
reminder -- this is happening TOMORROW. On Wed, Jul 27, 2016 at 5:39 PM, shane knapp <skn...@berkeley.edu> wrote: > reminder -- this is happening friday afternoon. > > i will pause the build queue late friday morning. > > On Mon, Jul 25, 2016 at 2:29 PM, shane knapp <sk

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
the move is complete and the machines powered back up right away, with no problems. we're doing a quick update on the firewall, and then we'll be done! On Fri, Jul 29, 2016 at 1:03 PM, shane knapp <skn...@berkeley.edu> wrote: > machines are going down NOW > > On Fri, Jul 29, 2

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
we're done and building! a bunch of builds failed w/git auth issues, due to me cancelling the quiet period early (as i thought the firewall update was done). this is no longer the case as i was more patient this time. :) happy friday! shane On Fri, Jul 29, 2016 at 1:45 PM, shane knapp <

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
machines are going down NOW On Fri, Jul 29, 2016 at 10:53 AM, shane knapp <skn...@berkeley.edu> wrote: > reminder -- this is happening TODAY. jenkins is currently in quiet mode. > > i will post updates over the course of the afternoon, and we should be > back up and b

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-27 Thread shane knapp
reminder -- this is happening friday afternoon. i will pause the build queue late friday morning. On Mon, Jul 25, 2016 at 2:29 PM, shane knapp <skn...@berkeley.edu> wrote: > around 1pm friday, july 29th, we will be taking jenkins down for a > rack move and celebrating nati

[build system] hanging procs on the jenkins master, emergency reboot

2016-06-28 Thread shane knapp
jenkins got itself in to another state and was killing the master. while poking around, i noticed lots of sleeping processes that were using a TON of cpu and had a bunch of log files open, but not writing to them. anyways, it looked like it needed a quick restart and that seems to have fixed the

Re: [build system] jenkins downtime friday afternoon, july 29th 2016

2016-07-29 Thread shane knapp
reminder -- this is happening TODAY. jenkins is currently in quiet mode. i will post updates over the course of the afternoon, and we should be back up and building before COB. On Thu, Jul 28, 2016 at 4:06 PM, shane knapp <skn...@berkeley.edu> wrote: > reminder -- this is happening

Re: Jenkins networking / port contention

2016-07-01 Thread shane knapp
i assume you're talking about zinc ports? the tests are designed to run one at a time on randomized ports -- no containerization. we're on bare metal. the test launch code executes this for each build: # Generate random point for Zinc export ZINC_PORT ZINC_PORT=$(python -S -c "import random;
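The launch snippet quoted above is cut off in this archive, so what follows is only a sketch of the general idea (each build picks its own random port), not the actual script. The port bounds and both function names are assumptions made for illustration.

```python
import random
import socket

# Hypothetical bounds: the range used by the real launch code is not
# visible in the truncated message.
PORT_LOW = 20000
PORT_HIGH = 30000


def pick_random_port(low=PORT_LOW, high=PORT_HIGH):
    """Return a random port number in the half-open range [low, high)."""
    return random.randrange(low, high)


def port_is_free(port, host="127.0.0.1"):
    """Best-effort check: try to bind the port, then release it right away."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


port = pick_random_port()
print(port, port_is_free(port))
```

A random pick lowers the chance of two builds on the same bare-metal worker colliding, but it does not rule it out; the bind check is also racy, since another process can grab the port between the check and the real use.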

Re: [build system] quick jenkins restart

2016-07-01 Thread shane knapp
aand we're back. On Fri, Jul 1, 2016 at 10:10 AM, shane knapp <skn...@berkeley.edu> wrote: > i put jenkins in quiet mode as i noticed we have almost no builds > queued. one of our students needed rust installed on the workers, and > i need to update the PATH on all of the

Re: Jenkins networking / port contention

2016-07-01 Thread shane knapp
time on the same machine don't actually behave correctly. > > I already updated the kafka 0.10 consumer tests to use a random port, > and can do the same for the 0.8 consumer tests, but wanted to make > sure I understood what was happening in the Jenkins environment. > > O

[build system] quick jenkins restart

2016-07-01 Thread shane knapp
i put jenkins in quiet mode as i noticed we have almost no builds queued. one of our students needed rust installed on the workers, and i need to update the PATH on all of the workers. we should be back up and building within 30 minutes. thanks! shane

Re: welcoming Burak and Holden as committers

2017-01-24 Thread shane knapp
congrats to the both of you! :) On Tue, Jan 24, 2017 at 10:13 AM, Reynold Xin wrote: > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > Burak has been very active in a large number of areas in Spark, including > linear algebra,

[build system] jenkins restart in ~1 hour

2017-02-16 Thread shane knapp
we don't have many builds running right now, and i need to restart the daemon quickly to enable a new plugin. i'll wait until the pull request builder jobs are finished and then (gently) kick jenkins. updates as they come, shane (who's always nervous about touching this house of cards)

Re: [build system] jenkins restart in ~1 hour

2017-02-16 Thread shane knapp
and we're back! :) On Thu, Feb 16, 2017 at 10:22 AM, shane knapp <skn...@berkeley.edu> wrote: > we don't have many builds running right now, and i need to restart the > daemon quickly to enable a new plugin. > > i'll wait until the pull request builder jobs are finished and th

Re: File JIRAs for all flaky test failures

2017-02-15 Thread shane knapp
it's not an open-file limit -- i have the jenkins workers set up w/a soft file limit of 100k, and a hard limit of 200k. On Wed, Feb 15, 2017 at 12:48 PM, Armin Braun wrote: > I think one thing that is contributing to this a lot too is the general > issue of the tests taking up a
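For anyone wanting to confirm the limits a test process actually sees, here is a small sketch using Python's standard `resource` module (Unix only). The helper names are made up for this example; on the workers described above one would expect it to report a soft limit of 100k and a hard limit of 200k, but that is not verified here.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to a readable word."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


soft, hard = open_file_limits()
print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

Limits are inherited per process, so a value set in a login shell is not necessarily what a daemon-launched build inherits; checking from inside the test process avoids that ambiguity.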

Re: [build system] emergency jenkins master reboot, got some wedged processes

2017-02-27 Thread shane knapp
we're back and things are much snappier! sorry for the downtime. On Mon, Feb 27, 2017 at 1:58 PM, shane knapp <skn...@berkeley.edu> wrote: > the jenkins master is wedged and i'm going to reboot it to increase > it's happiness. > > more up

[build system] emergency jenkins master reboot, got some wedged processes

2017-02-27 Thread shane knapp
the jenkins master is wedged and i'm going to reboot it to increase its happiness. more updates as they come. - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: [build system] brief jenkins downtime this morning

2016-09-12 Thread shane knapp
the backup is done and we're building again! On Mon, Sep 12, 2016 at 9:31 AM, shane knapp <skn...@berkeley.edu> wrote: > our weekly backups failed due to a hung job. even though i tried to > change the backup scheduler (internal to jenkins) to run tonite, it's > still insistin

[build system] brief jenkins downtime this morning

2016-09-12 Thread shane knapp
our weekly backups failed due to a hung job. even though i tried to change the backup scheduler (internal to jenkins) to run tonite, it's still insisting that it needs to run immediately and is continually putting jenkins in to quiet mode. short of killing all of the current jobs and restarting

[build system] jenkins wedged itself this weekend, just restarted

2016-08-29 Thread shane knapp
jenkins got in to one of its "states" and wasn't accepting new builds starting this past saturday night. i restarted it, and now it's catching up on the weekend's queue. shane

Re: [build system] massive jenkins infrastructure changes forthcoming

2016-11-18 Thread shane knapp
On Thu, Nov 17, 2016 at 4:52 PM, Reynold Xin wrote: > Thanks for the headsup, Shane. > no problem! i'm really looking forward to starting w/a much cleaner slate than what we have now. not only are we locked in a jenkins/plugin version dependency hell that keeps us from

[build system] massive jenkins infrastructure changes forthcoming

2016-11-17 Thread shane knapp
TL;DR: amplab is becoming riselab, and is much more C++ oriented. centos 6 is so far behind, and i'm already having to roll C++ compilers and various libraries by hand. centos 7 is an absolute no-go, so we'll be moving the jenkins workers over to a recent (TBD) version of ubuntu server. also,

[build system] jenkins downtime for backups delayed by a hung build

2016-10-17 Thread shane knapp
i just noticed that jenkins was still in quiet mode this morning due to a hung build. i killed the build, backups happened, and the queue is now happily building. sorry for any delay! shane

Re: Tests failing with GC limit exceeded

2017-01-03 Thread shane knapp
nope, no changes to jenkins in the past few months. ganglia graphs show higher, but not worrying, memory usage on the workers when the jobs failed... i'll take a closer look later tonite/first thing tomorrow morning. shane On Tue, Jan 3, 2017 at 4:35 PM, Kay Ousterhout

Re: Tests failing with GC limit exceeded

2017-01-06 Thread shane knapp
On Fri, Jan 6, 2017 at 12:20 PM, shane knapp <skn...@berkeley.edu> wrote: > FYI, this is happening across all spark builds... not just the PRB. s/all/almost all/

Re: Tests failing with GC limit exceeded

2017-01-06 Thread shane knapp
FYI, this is happening across all spark builds... not just the PRB. i'm compiling a report now and will email that out this afternoon. :( On Thu, Jan 5, 2017 at 9:00 PM, shane knapp <skn...@berkeley.edu> wrote: > unsurprisingly, we had another GC: > > https://amplab.cs.berkeley.

Re: Tests failing with GC limit exceeded

2017-01-06 Thread shane knapp
(adding michael armbrust and josh rosen for visibility) ok. roughly 9% of all spark tests builds (including both PRB builds are failing due to GC overhead limits. $ wc -l SPARK_TEST_BUILDS GC_FAIL 1350 SPARK_TEST_BUILDS 125 GC_FAIL here are the affected builds (over the past ~2 weeks): $
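The "roughly 9%" figure follows directly from the two `wc -l` counts quoted above (125 failing builds out of 1350). A tiny worked check, with a helper name invented for this example:

```python
def failure_rate(failed, total):
    """Fraction of builds that failed; both counts come from the message above."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


rate = failure_rate(125, 1350)
print(f"{rate:.1%}")  # prints 9.3%
```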

Re: Tests failing with GC limit exceeded

2017-01-04 Thread shane knapp
all, there seems to be no pattern to which tests are failing (different each time). i'll look a little deeper and decide what to do next. On Tue, Jan 3, 2017 at 6:49 PM, shane knapp <skn...@berkeley.edu> wrote: > nope, no changes to jenkins in the past few months. ganglia graphs >

Re: Tests failing with GC limit exceeded

2017-01-05 Thread shane knapp
unsurprisingly, we had another GC: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70949/console so, definitely not the system (everything looks hunky dory on the build node). > It can always be some memory leak; if we increase the memory settings > and OOMs still happen,

Re: Tests failing with run-tests.py SyntaxError

2017-07-31 Thread shane knapp
(for the wider dev audience) there's an apache issue open for this, which i have commented on. https://issues.apache.org/jira/browse/SPARK-21573 i have a workaround and will be getting this sorted asap. On Fri, Jul 28, 2017 at 7:27 PM, Hyukjin Kwon wrote: > Or maybe in

jenkins is going down NOW -- POWER OUTAGE DUE TO FIRE

2017-08-02 Thread shane knapp
we have a massive fire in the hills behind campus, and PG is shutting down all of the transformers on campus as a precaution. this will impact jenkins. i will be shutting down the workers immediately. http://www.berkeleyside.com/2017/08/02/crews-respond-wildland-fire-east-bay-hills/

Re: jenkins is going down NOW -- POWER OUTAGE DUE TO FIRE

2017-08-03 Thread shane knapp
of months. :\ shane On Wed, Aug 2, 2017 at 5:30 PM, shane knapp <skn...@berkeley.edu> wrote: > we just got the all clear, and power was not cut off. however, the > remote consoles on most of the workers aren't working, and i > currently can't power them back on. > > right

Re: spark pypy support?

2017-08-14 Thread shane knapp
actually, we *have* locked on a particular pypy version for the jenkins workers: 2.5.1 this applies to both the 2.7 and 3.5 conda environments. (py3k)-bash-4.1$ pypy --version Python 2.7.9 (9c4588d731b7fe0b08669bd732c2b676cb0a8233, Apr 09 2015, 02:17:39) [PyPy 2.5.1 with GCC 4.4.7 20120313

[build system] jenkins back up and building

2017-08-11 Thread shane knapp
there was some network work being done last night (~945pm PDT) at our colo, and it had the unintended consequence of kicking a lot of services off the network. jenkins was affected, and the connection to github was lost. i just kicked the jenkins master and things are happily building again.

Re: [build system] important: potential upgrading of the python build environment

2017-07-07 Thread shane knapp
this is done. i'll babysit builds today. On Fri, Jul 7, 2017 at 8:40 AM, shane knapp <skn...@berkeley.edu> wrote: > doing this now. > > On Thu, Jul 6, 2017 at 1:34 PM, shane knapp <skn...@berkeley.edu> wrote: >> (big CC list so the people involved have visibility)

Re: [build system] important: potential upgrading of the python build environment

2017-07-07 Thread shane knapp
doing this now. On Thu, Jul 6, 2017 at 1:34 PM, shane knapp <skn...@berkeley.edu> wrote: > (big CC list so the people involved have visibility) > > we're currently using a (very old) installation of anaconda python to > manage the python3 build deps, but it turns out

upgrade Roxygen2 to 5.0.0

2017-07-10 Thread shane knapp
see: https://issues.apache.org/jira/browse/SPARK-21367 i'm doing this now, and it shouldn't interrupt any running builds. shane

Re: upgrade Roxygen2 to 5.0.0

2017-07-10 Thread shane knapp
...and done. On Mon, Jul 10, 2017 at 11:36 AM, shane knapp <skn...@berkeley.edu> wrote: > see: https://issues.apache.org/jira/browse/SPARK-21367 > > i'm doing this now, and it shouldn't interrupt any running bu

Re: upgrade Roxygen2 to 5.0.0

2017-07-10 Thread shane knapp
reverting On Mon, Jul 10, 2017 at 11:37 AM, shane knapp <skn...@berkeley.edu> wrote: > ...and done. > > On Mon, Jul 10, 2017 at 11:36 AM, shane knapp <skn...@berkeley.edu> wrote: >> see: https://issues.apache.org/jira/browse/SPARK-21367 >> >> i'm doing t

[build system] important: potential upgrading of the python build environment

2017-07-06 Thread shane knapp
(big CC list so the people involved have visibility) we're currently using a (very old) installation of anaconda python to manage the python3 build deps, but it turns out that our (very old) versions of numpy and pandas are starting to hold us back. reference:

Re: Is there something wrong with jenkins?

2017-06-27 Thread shane knapp
(adding holden and bryan cutler to the CC on this) we're currently fixing this. i've installed pyarrow 0.4.0 in the default conda environment used by the spark tests (py3k), and either bryan or holden will be removing the pip install from run-pip-tests and adding the arrow tests to the regular

Re: Is there something wrong with jenkins?

2017-06-27 Thread shane knapp
the discussion about this is located here: https://github.com/apache/spark/pull/15821 On Tue, Jun 27, 2017 at 12:32 PM, shane knapp <skn...@berkeley.edu> wrote: > (adding holden and bryan cutler to the CC on this) > > we're currently fixing this. i've installed pyarrow 0.4.0 i

Re: Is there something wrong with jenkins?

2017-06-27 Thread shane knapp
PR being tested now: https://github.com/apache/spark/pull/18443 On Tue, Jun 27, 2017 at 12:33 PM, shane knapp <skn...@berkeley.edu> wrote: > the discussion about this is located here: > https://github.com/apache/spark/pull/15821 > > On Tue, Jun 27, 2017 at 12:32 PM

Re: Question, Flaky tests: pyspark.sql.tests.ArrowTests tests in Jenkins worker 5(?)

2017-08-05 Thread shane knapp
ok, first test to run post-fix is green: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80289/ i'll keep an eye on this worker over the next few days. shane On Sat, Aug 5, 2017 at 11:06 AM, shane knapp <skn...@berkeley.edu> wrote: > amp-jenkins-worker-05 had 0.20.3

Re: Question, Flaky tests: pyspark.sql.tests.ArrowTests tests in Jenkins worker 5(?)

2017-08-05 Thread shane knapp
amp-jenkins-worker-05 had 0.20.3 installed for some reason. it's now been downgraded to 0.19.2 and matches the other workers. shane On Sat, Aug 5, 2017 at 2:01 AM, Liang-Chi Hsieh wrote: > > Maybe a possible fix: >

Re: Welcoming Hyukjin Kwon and Sameer Agarwal as committers

2017-08-08 Thread shane knapp
On Tue, Aug 8, 2017 at 10:16 AM, Sameer Agarwal wrote: > Thanks all. It's really humbling to be part of such an innovative community! > you just can't seem to get enough of your old amplab compatriots, sameer! ;) anyways, congrats to the both of you!

Re: [build system] jenkins got itself wedged...

2017-05-17 Thread shane knapp
i'm going to need to perform a quick reboot on the jenkins master. it looks like it's hung again. sorry about this! shane On Tue, May 16, 2017 at 12:55 PM, shane knapp <skn...@berkeley.edu> wrote: > ...but just now i started getting alerts on system load, which was > rather

Re: [build system] jenkins got itself wedged...

2017-05-17 Thread shane knapp
ok, we're back up, system load looks cromulent and we're happily building (again). shane On Wed, May 17, 2017 at 9:50 AM, shane knapp <skn...@berkeley.edu> wrote: > i'm going to need to perform a quick reboot on the jenkins master. it > looks like it's hung again. >

Re: [build system] jenkins got itself wedged...

2017-05-17 Thread shane knapp
in service. shane On Wed, May 17, 2017 at 9:59 AM, shane knapp <skn...@berkeley.edu> wrote: > ok, we're back up, system load looks cromulent and we're happily > building (again). > > shane > > On Wed, May 17, 2017 at 9:50 AM, shane knapp <skn...@berkeley.edu> wrote: >

[build system] jenkins got itself wedged...

2017-05-16 Thread shane knapp
...so i kicked it and it's now back up and happily building.

[build system] rolling back R to working version

2017-06-20 Thread shane knapp
i accidentally updated R during the system update, and will be rolling everything back to the known working versions. again, i'm really sorry about this. our jenkins is old, and the new ubuntu one is almost ready to go. i really can't wait to shut down the centos boxes... they're old and

Re: [build system] rolling back R to working version

2017-06-20 Thread shane knapp
at 8:31 PM, shane knapp <skn...@berkeley.edu> wrote: > i accidentally updated R during the system update, and will be rolling > everything back to the known working versions. > > again, i'm really sorry about this. our jenkins is old, and the new > ubuntu one is almost ready t

[build system] [fixed] system update broke symlink for pypy-2.5.1, PRB builds failing

2017-06-20 Thread shane knapp
this is currently fixed, but did cause PRB failures this afternoon. i'll go retrigger as many as i can as penance. :\

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-20 Thread shane knapp
to allow some spark PRB builds to finish. On Tue, Jun 20, 2017 at 9:39 AM, shane knapp <skn...@berkeley.edu> wrote: > and we're back up and building! > > On Tue, Jun 20, 2017 at 8:23 AM, shane knapp <skn...@berkeley.edu> wrote: >> ok, the centos packages have been

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-20 Thread shane knapp
(hopefully this is my last email on this subject...) jenkins is back up. the ray and alluxio-master builds have been de-zombified and are happily building (as well as everything else). :) shane On Tue, Jun 20, 2017 at 12:27 PM, shane knapp <skn...@berkeley.edu> wrote: > i have to

Re: [build system] when it rains... berkeley lost power. again. use new url to visit jenkins

2017-06-21 Thread shane knapp
a lot of berkeley cs infrastructure we depend on is still down. no ETA as to when they'll be up. On Wed, Jun 21, 2017 at 3:43 PM, shane knapp <skn...@berkeley.edu> wrote: > a construction crew working outside hit an underground power line, and > power has just been restored.

[build system] when it rains... berkeley lost power. again. use new url to visit jenkins

2017-06-21 Thread shane knapp
...it pours. we lost power in our building, including the machine room where amplab.cs.berkeley.edu lives. jenkins is still up and you can visit the site by ignoring the reverse proxy: https://hadrian.ist.berkeley.edu/jenkins/ the bad news is that pull request builds won't run. ETA on power

Re: [build system] when it rains... berkeley lost power. again. use new url to visit jenkins

2017-06-21 Thread shane knapp
a construction crew working outside hit an underground power line, and power has just been restored. our servers are coming back up, and access to jenkins should be restored shortly. On Wed, Jun 21, 2017 at 2:14 PM, shane knapp <skn...@berkeley.edu> wrote: > ...it pours. > > we lo

[build system] patching post-mortem: back to normal!

2017-06-21 Thread shane knapp
all systems were updated fully, as it had been over a year since i'd last done it. risky, i know but... things that went right: * a lot of vulnerabilities in the systems were patched. short list: - CVE-2017-1000364 (stack guard) - CVE-2017-1000363 (stack overflow) - CVE-2017-1000366 (gnu

Re: [build system] when it rains... berkeley lost power. again. use new url to visit jenkins

2017-06-21 Thread shane knapp
ok, amplab.cs.berkeley.edu is back up and you can reach jenkins. On Wed, Jun 21, 2017 at 4:18 PM, shane knapp <skn...@berkeley.edu> wrote: > a lot of berkeley cs infrastructure we depend on is still down. no > ETA as to when they'll be up. > > On Wed, Jun 21, 2017 at 3:43 PM

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-20 Thread shane knapp
ok, the centos packages have been released. i've put jenkins in to quiet mode, and will be updating rpms and rebooting ASAP. updates as they come. shane On Mon, Jun 19, 2017 at 2:43 PM, shane knapp <skn...@berkeley.edu> wrote: > i've updated the two ubuntu workers (amp-jenkins-s

[build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-19 Thread shane knapp
jenkins is affected: https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt https://access.redhat.com/security/vulnerabilities/stackguard i'm shutting down jenkins, applying patches and rebooting immediately. ETA unknown. hopefully quick. i'll update here when i find out.

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-19 Thread shane knapp
ok, we're in a holding pattern as the centos packages haven't been released yet. once they're out i'll update this thread and start rebooting. On Mon, Jun 19, 2017 at 10:52 AM, shane knapp <skn...@berkeley.edu> wrote: > jenkins is affected: > > https://www.qualys.com/2017/06/19/st

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-20 Thread shane knapp
and we're back up and building! On Tue, Jun 20, 2017 at 8:23 AM, shane knapp <skn...@berkeley.edu> wrote: > ok, the centos packages have been released. i've put jenkins in to > quiet mode, and will be updating rpms and rebooting ASAP. > > updates as they come. > > shane

Re: [build system] immediate emergency updates and reboot to deal w/stack clash vulnerability

2017-06-19 Thread shane knapp
i've updated the two ubuntu workers (amp-jenkins-staging-01 and -02), and am still twiddling my thumbs and waiting for centos packages to be released. i'm guessing we'll have those some time today, and will update everyone then. On Mon, Jun 19, 2017 at 11:02 AM, shane knapp <skn...@berkeley.

Re: [build system] jenkins got itself wedged...

2017-05-19 Thread shane knapp
last update of the week: things are looking great... we're GCing happily and staying well within our memory limits. i'm going to do one more restart after the two pull request builds finish to re-enable backups, and call it a weekend. :) shane On Fri, May 19, 2017 at 8:29 AM, shane knapp
