Hi! A bit more info:
Being the earliest Opsen around, I poked around on analytics1021 and analytics1022 (the brokers) and found a disk failure for /dev/sdf on analytics1021, along with the corresponding Java call stack in the log from when the broker died due to the fs remounting as read-only.

I unmounted the disk and found that more than a simple fsck is required. I therefore disabled puppet to avoid the endless broker service restart loop, and to avoid filling up /. Faidon silenced the Icinga noise with a patch.

The problems are at least twofold:

1. Only 1 of 2 brokers alive evidently isn't quite enough capacity. Jgage mentioned on IRC that additional capacity is planned.

2. Ori observed:
< ori> presumably the alert is flapping because the script manages to poll twice between flushes, in which case drerr has not gone up

Sean
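To illustrate Ori's point about the flapping: the following is only a minimal sketch, not the actual check script. It assumes the error counter (standing in for drerr) becomes visible only at each flush, and that the alert fires whenever the counter has grown since the previous poll. The function name, intervals, and error rate are all hypothetical.

```python
def simulate(poll_interval, flush_interval, errors_per_second, duration):
    """Return the alert state (True = firing) seen at each poll.

    Hypothetical model: errors accrue at a steady rate, but the counter
    the poller reads is only updated at each flush.
    """
    states = []
    previous = 0
    for t in range(poll_interval, duration + 1, poll_interval):
        # The visible counter reflects errors only up to the last flush.
        last_flush = (t // flush_interval) * flush_interval
        visible = last_flush * errors_per_second
        # The alert fires only if the counter grew since the previous poll.
        states.append(visible - previous > 0)
        previous = visible
    return states


# Polling twice per flush: every other poll sees no change, so the alert flaps.
fast = simulate(poll_interval=5, flush_interval=10, errors_per_second=2, duration=60)

# Polling once per flush: every poll sees growth, so the alert stays up.
slow = simulate(poll_interval=10, flush_interval=10, errors_per_second=2, duration=60)

print(fast)
print(slow)
```

Under these assumptions, the errors never stop, yet the faster poller alternates between firing and clear simply because half of its polls land between flushes.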
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
