Thanks Sean!
On Jun 9, 2014, at 3:56 AM, Sean Pringle <[email protected]> wrote:

> Hi!
>
> A bit more info:
>
> Being the earliest Opsen around I poked around on analytics1021 and 1022 (the
> brokers) and found a disk failure for /dev/sdf on analytics1021, along with
> corresponding java call stack in the log when the broker died due to the fs
> remounting as read-only.
>
> I unmounted the disk and found more than a simple fsck is required. I
> therefore disabled puppet to avoid the endless broker service restart loop,
> and to avoid filling up /.
>
> Faidon silenced the Icinga noise with a patch.
>
> The problems are at least two fold:
>
> 1. Only 1 of 2 brokers alive evidently isn't quite enough capacity. Jgage
> mentioned on IRC that additional capacity is planned.
>
> 2. Ori observed: < ori> presumably the alert is flapping because because the
> script manages to poll twice between flushes, in which case drerr has not
> gone up
>
> Sean
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
