> On 19 Sep 2015, at 08:28, shane knapp <skn...@berkeley.edu> wrote:
>
> TL;DR: jenkins is currently down and will probably not be brought
> back up until monday morning.
>
> a machine caught fire in the colo this evening, and this tripped the
> halon, and now IST is overheating... it looks like it may have been
> one of our servers that popped and caused the event, and thankfully no
> one was hurt.
>
> http://ucbsystems.org/
>
> amplab jenkins is currently down. some other university services are
> also down as well.
>
> jon is currently at the colo unplugging the remaining machines of the
> type that caught fire and we've reached out to the vendor who supplied
> them to see about an investigation.

Hope things recover: once a rack has overheated you are in trouble. I know of some clusters that keep the ToR switches in the middle of the racks for this reason: it's less exposed to the hot air near the ceiling, so the most valuable H/W on the rack gets more protection. As an added benefit, your Ethernet cables are shorter, which makes a big difference in cost when you go to 4x1 bonded.

>
> IST staff will be starting their investigation tomorrow morning, and
> jon or i will post some updates as soon as we get them.
>
> sorry for the inconvenience,
>
> shane
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>