Hi,

in the week from 2014-12-01 to 2014-12-07, Andrew and I worked on the following items around the Analytics Cluster and Analytics-related Ops:

* Change in SSL setup causing pagecounts-raw to be off ... temporary
* Preparing for vlan move of stats machines
* Ganglia -> Graphite -> Grafana
* Wikipedia Zero graph comparability

(details below)

Have fun,
Christian



* Change in SSL setup causing pagecounts-raw to be off ... temporary

Ops changed the SSL setup from dedicated SSL terminators to cache-local SSL terminators for eqiad and esams. This change came as a bit of a surprise to us, and (as expected) made webstatscollector's C implementation (pagecounts-raw) overcount HTTPS traffic. We adjusted webstatscollector's C implementation accordingly.

A few weeks back, that would have been the end of the story, and we would have been left with a few days of broken data. But we now have the data in the cluster, and we have a Hive implementation too, so we could effectively backfill pagecounts-raw for the affected days.

To my knowledge, this is the first time we could mitigate an issue with webstatscollector on the udp2log pipeline through the cluster. And pagecounts-raw has good data again for the affected period :-)



* Preparing for vlan move of stats machines

To develop infrastructure and research pipelines, devs and researchers need some more basic development tools (e.g. Maven, Virtualenv) on stat100[123], which Ops would prefer us not to use in the machines' current vlan. Hence, preparations have started to move stat100[123] into the separate analytics vlan. This will address the concerns of Ops while still allowing the needed tools to be installed.



* Ganglia -> Graphite -> Grafana

Ops is increasingly moving from ganglia to graphite for checks on numbers. So work has started on looking into graphite a bit more, and on how to instrument it to perform checks. The cluster got re-configured so that the key metrics are fed into graphite. For dashboarding, it seems grafana might give a kibana-like interface, and http://grafana.wikimedia.org/#/dashboard/db/kafka got set up to provide a high-level, realtime view on kafka.

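To illustrate what feeding a metric into graphite and checking it can look like, here is a minimal sketch, not the setup actually used on the cluster. It relies on graphite's plaintext protocol (one "<path> <value> <timestamp>" line per datapoint, by default on carbon port 2003). The metric name, the host, and the is_below_threshold helper are assumptions made up for this example.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="graphite.example.org", port=2003):
    """Send one plaintext line to a carbon listener.

    The host is a placeholder; 2003 is carbon's default plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


def is_below_threshold(values, threshold):
    """Hypothetical check: is the latest non-null datapoint below threshold?

    Graphite series often contain nulls (None) for missing intervals,
    so those are skipped. An all-null series counts as "not below".
    """
    present = [v for v in values if v is not None]
    return bool(present) and present[-1] < threshold


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    print(format_metric("analytics.kafka.messages_in", 42), end="")
```

A check built along these lines would fetch a recent series from graphite and alert when is_below_threshold returns True, for example when the message rate drops below an expected floor.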
* Wikipedia Zero graph comparability

Following up from the previous week, the Wikipedia Zero team had further concerns about the differences between their new on-wiki graphs and the Analytics team's dashboards. We identified and explained the differences for them.


-- 
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
                            Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3     Email:  [email protected]
4293 Gutau, Austria          Phone:    +43 7946 / 20 5 81
                             Fax:      +43 7946 / 20 5 81
                             Homepage: http://quelltextlich.at/
---------------------------------------------------------------
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
