Elukey has uploaded a new change for review.


Change subject: Tune varnishkafka-webrequest parameters

Tune varnishkafka-webrequest parameters

The Analytics team discovered many webrequests missing the end
datetime field, which resulted in data consistency errors.
Varnishlog has been used on various cp hosts with the following
configuration to spot anomalies:

sudo varnishlog -c -n frontend -L 5000 -T 1500 \
  -q 'VSL or (Timestamp:Start and not Timestamp:Resp)' | tee timeouts.txt

The VSL timeout settings (-L and -T) are the same as those used by Varnishkafka.
This query asks for any request that is either logged with a VSL timeout
or with a Start timestamp but not a Resp one. Two things came up:

1) A lot of requests with the HttpGarbage tag are discarded by Varnish but
2) The VSL store overflow error is still present but happens less frequently.
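
A capture such as timeouts.txt can be tallied per anomaly class with a
short script. The following is only a sketch, not part of this change. It
assumes varnishlog's default output layout, where the second whitespace
separated field of each record line is the VSL tag, and it assumes the
synthetic VSL records contain the text "timeout" or "store overflow". The
helper names count_tag and count_vsl are hypothetical.

```shell
#!/bin/sh
# Sketch: count anomaly records in a saved varnishlog capture.
# Assumption: record lines look like "-   <Tag>   <content>".

# count_tag FILE TAG: number of records whose tag equals TAG.
count_tag() {
    awk -v tag="$2" '$2 == tag { n++ } END { print n + 0 }' "$1"
}

# count_vsl FILE TEXT: number of VSL records whose line contains TEXT.
count_vsl() {
    awk -v msg="$2" '$2 == "VSL" && index($0, msg) { n++ } END { print n + 0 }' "$1"
}

# Only report when a capture is present in the current directory.
if [ -f timeouts.txt ]; then
    echo "HttpGarbage records:    $(count_tag timeouts.txt HttpGarbage)"
    echo "VSL timeouts:           $(count_vsl timeouts.txt 'timeout')"
    echo "VSL store overflows:    $(count_vsl timeouts.txt 'store overflow')"
fi
```

Comparing these counts before and after a parameter change gives a rough
indication of whether the change reduced the anomalies.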

The proposed solution for 1) is to avoid logging any request with
the HttpGarbage tag; for 2), the maximum number of incomplete requests
kept in memory is raised from 5000 to 10000.

Change-Id: I68ada5789a848a676989c08590819625740b6bd8
M modules/role/manifests/cache/kafka/webrequest.pp
1 file changed, 11 insertions(+), 6 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet 

diff --git a/modules/role/manifests/cache/kafka/webrequest.pp 
index 65fcdd7..f541861 100644
--- a/modules/role/manifests/cache/kafka/webrequest.pp
+++ b/modules/role/manifests/cache/kafka/webrequest.pp
@@ -15,10 +15,11 @@
     # Set varnish.arg.q or varnish.arg.m according to Varnish version
     if (hiera('varnish_version4', false)) {
-        # Background from T136314:
+        # Background task: T136314
+        # Background info about the parameters used:
         # 'q':
-        # Filter out PURGE requests and Pipe creation traffic.
-        # A Varnish log containing Timestamp:Pipe does not carry 
+        # 1) Filter out PURGE requests and Pipe creation traffic.
+        # 2) A Varnish log containing Timestamp:Pipe does not carry 
         # used by Analytics to bucket data on Hadoop and for data consistency
         # checks. These requests indicate that Varnish tried to establish a 
         # channel between the client and the backend, an information that
@@ -30,13 +31,15 @@
         # At the moment these requests get logged incorrectly and with partial
         # data (due to the VSL timeout) so it makes sense to filter them out to
         # remove noise from Analytics data.
+        # 3) A request marked with the VSL tag 'HttpGarbage' indicates 
+        # HTTP requests, generating spurious Varnish logs.
         # 'T':
         # VLS API timeout is the maximum time that Varnishkafka will wait 
         # "Begin" and "End" timestamps before flushing the available tags to a 
         # When a timeout occurs most of the times the result is a webrequest 
         # missing values like the end timestamp.
-        # Parameters modified during the upload migration:
+        # VSL Timeout parameters modified during the upload migration:
         # 'L':
         # Sets the upper limit of incomplete transactions kept before the 
         # one is force completed. This setting keeps an upper bound
@@ -44,14 +47,16 @@
         # A change in the -T timeout value has the side effect of keeping more
         # incomplete transactions in memory for each varnishkafka query (in our case
         # it directly corresponds to a varnishkafka instance running).
+        # The threshold has been raised to '5000' the first time (which removed
+        # the bulk of the timeouts) and to '10000' the second time.
         # 'T':
         # Raised the maximum timeout for incomplete records from '700' to 
         # after setting the -L to '5000'. VSL timeouts were masked
         # by VSL store overflow errors.
         $varnish_opts = {
-            'q' => 'ReqMethod ne "PURGE" and not Timestamp:Pipe and not ReqHeader:Upgrade ~ "[wW]ebsocket"',
+            'q' => 'ReqMethod ne "PURGE" and not Timestamp:Pipe and not ReqHeader:Upgrade ~ "[wW]ebsocket" and not HttpGarbage',
             'T' => '1500',
-            'L' => '5000'
+            'L' => '10000'
         $conf_template = 'varnishkafka/varnishkafka_v4.conf.erb'
     } else {

To view, visit https://gerrit.wikimedia.org/r/316306

Gerrit-MessageType: newchange
Gerrit-Change-Id: I68ada5789a848a676989c08590819625740b6bd8
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Elukey <ltosc...@wikimedia.org>
