Hi yalls,

Christian and I were talking today about how to figure out why the high-traffic 
(bits and upload) esams varnishes occasionally have latency issues[1], which 
cause buffers to fill up, which in turn causes a small amount of message 
loss[2].  We aren’t sure whether this is an overall system throughput issue 
(network and/or Kafka brokers) or something that might be fixable by tweaking 
more configs on the individual varnishkafkas.

We don’t think that anyone is using the webrequest bits data.  It isn’t 
included in udp2log, and therefore doesn’t affect any legacy analytics.  Nor 
are we using it for any productionized analytics in Hadoop.  We aren’t sure if 
others are relying on this data for ad hoc analysis, though.

If no one objects, we’d like to temporarily disable the bits varnishkafka 
instances.  If we do and the esams upload varnishes then stop having problems, 
we will know that this is a system throughput issue.  If we continue to have 
problems with esams upload, then we will know that it is more likely a local 
varnishkafka issue.

So, are there any objections to removing webrequest bits from Kafka webrequest 
logs for a little while?

-Ao


[1] varnishkafka rtt average:
http://grafana.wikimedia.org/#/dashboard/db/kafka?from=1420484972895&to=1420571372895&panelId=10&fullscreen
or 
http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&hreg[]=%28amssq%7Ccp%29.%2B&mreg[]=kafka.rdkafka.brokers..%2B%5C.rtt%5C.avg&gtype=line&title=kafka.rdkafka.brokers..%2B%5C.rtt%5C.avg&aggregate=1

[2] varnishkafka delivery errors: 
http://grafana.wikimedia.org/#/dashboard/db/kafka?from=1420484972895&to=1420571372895&panelId=9&fullscreen
or
http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&hreg[]=%28amssq%7Ccp%29.%2B&mreg[]=kafka.varnishkafka%5C.kafka_drerr.per_second&gtype=line&title=kafka.varnishkafka%5C.kafka_drerr.per_second&aggregate=1

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
