> On 19 Sep 2015, at 08:28, shane knapp <skn...@berkeley.edu> wrote:
> 
> TL; DR:  jenkins is currently down and will probably not be brought
> back up until monday morning.
> 
> a machine caught fire in the colo this evening, and this tripped the
> halon, and now IST is overheating...  it looks like it may have been
> one of our servers that popped and caused the event, and thankfully no
> one was hurt.
> 
> http://ucbsystems.org/
> 
> amplab jenkins is currently down.  some other university services are
> also down as well.
> 
> jon is currently at the colo unplugging the remaining machines of the
> type that caught fire and we've reached out to the vendor who supplied
> them to see about an investigation.


hope things recover: once a rack has overheated you are in trouble.

I know some clusters that keep the ToR switches in the middle of the racks for
this reason: they're less exposed to the hot air near the ceiling, so the most
valuable H/W on the rack gets more protection.

As an added benefit, your Ethernet cables are shorter, which, when you go to 4x1
bonded, makes a big difference in cost.

> 
> IST staff will be starting their investigation tomorrow morning, and
> jon or i will post some updates as soon as we get them.
> 
> sorry for the inconvenience,
> 
> shane
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
> 
> 


