On Tue, Sep 29, 2020 at 8:10 PM Marian Velez <marianmve...@gmail.com> wrote: > > Hiya all! > > I've been trying to implement an abtest engine on my varnish. The problem > I've come across is related to the manipulation of the original req.url > property. > > What I'm experiencing is that when setting a new "req.url" value and after > forcing a restart(), the URL that varnishncsa uses to log is just the one it > has set when it re-enters recv flow, but not the one it uses from the backend > or the last req.URL object either (which will even have much more sense > since). > > I've made a sample snippet of what I observe just for the sake of reproducing > it and to take away all the abtest logic that may be just confusing. > > vcl 4.1; > backend default { > .host = "127.0.0.1"; > .port = "8000"; > } > > sub vcl_recv { > if (req.restarts == 0) { > set req.http.X-Original-URL = req.url; > set req.url = "/this_url_does_not_exist"; > return(restart); > } > else { > #restore original URL > set req.url = req.http.X-Original-URL; > } > } > > That vcl just sets an alternative req.url and triggers a flow restart. > varnishncsa logs this entry > > ::1 - - [29/Sep/2020:16:49:08 -0300] "GET > http://localhost:6081/this_url_does_not_exist HTTP/1.1" 200 12 "-" > "curl/7.72.0" > > > Of course that URL does not exist, but since its using the req.URL object > from which varnish recv flow started, it just keeps that for logging, and not > any later req.URL set. > > Here attached is the varnishlog output from a fresh start: > > * << Request >> 2 > - Begin req 1 rxreq > - Timestamp Start: 1601409664.461391 0.000000 0.000000 > - Timestamp Req: 1601409664.461391 0.000000 0.000000 > - VCL_use boot > - ReqStart ::1 55180 a0 > - ReqMethod GET > - ReqURL /this_object_exists > - ReqProtocol HTTP/1.1 > - ReqHeader Host: localhost:6081 > - ReqHeader User-Agent: curl/7.72.0 > - ReqHeader Accept: */* > - ReqHeader X-Forwarded-For: ::1 > - VCL_call RECV > - ReqHeader X-Original-URL: /this_object_exists > - ReqURL /this_url_does_not_exist > - VCL_return restart > - VCL_call HASH > - VCL_return lookup > - Timestamp Restart: 1601409664.461429 0.000037 0.000037 > - Link req 3 restart > - End
The transaction above does not reply to the client, it's not picked up by varnishncsa. > * << BeReq >> 4 > - Begin bereq 3 fetch > - VCL_use boot > - Timestamp Start: 1601409664.461553 0.000000 0.000000 > - BereqMethod GET > - BereqURL /this_object_exists > - BereqProtocol HTTP/1.1 > - BereqHeader User-Agent: curl/7.72.0 > - BereqHeader Accept: */* > - BereqHeader X-Forwarded-For: ::1 > - BereqHeader host: localhost:6081 > - BereqHeader Accept-Encoding: gzip > - BereqHeader X-Varnish: 4 > - VCL_call BACKEND_FETCH > - VCL_return fetch > - BackendOpen 26 default 127.0.0.1 8000 127.0.0.1 59480 connect > - Timestamp Bereq: 1601409664.462196 0.000642 0.000642 > - Timestamp Beresp: 1601409664.463115 0.001561 0.000918 > - BerespProtocol HTTP/1.0 > - BerespStatus 200 > - BerespReason OK > - BerespHeader Server: SimpleHTTP/0.6 Python/3.8.6 > - BerespHeader Date: Tue, 29 Sep 2020 20:01:04 GMT > - BerespHeader Content-type: application/octet-stream > - BerespHeader Content-Length: 12 > - BerespHeader Last-Modified: Tue, 29 Sep 2020 19:36:31 GMT > - TTL RFC 120 10 0 1601409664 1601409664 1601409664 0 0 cacheable > - VCL_call BACKEND_RESPONSE > - VCL_return deliver > - Filters > - Storage malloc s0 > - Fetch_Body 3 length stream > - BackendClose 26 default close > - Timestamp BerespBody: 1601409664.463236 0.001682 0.000120 > - Length 12 > - BereqAcct 155 0 155 199 12 211 > - End > > * << Request >> 3 > - Begin req 2 restart > - Timestamp Start: 1601409664.461429 0.000037 0.000000 > - ReqStart ::1 55180 a0 > - ReqMethod GET > - ReqURL /this_url_does_not_exist > - ReqProtocol HTTP/1.1 > - ReqHeader Host: localhost:6081 > - ReqHeader User-Agent: curl/7.72.0 > - ReqHeader Accept: */* > - ReqHeader X-Forwarded-For: ::1 > - ReqHeader X-Original-URL: /this_object_exists This transaction replies to the client but starts with the wrong URL, but we can find the correct one in the header above. > - VCL_call RECV > - ReqURL /this_object_exists > - ReqUnset X-Original-URL: /this_object_exists > - ReqUnset Host: localhost:6081 > - ReqHeader host: localhost:6081 > - VCL_return hash > - VCL_call HASH > - VCL_return lookup > - VCL_call MISS > - VCL_return fetch > - Link bereq 4 fetch > - Timestamp Fetch: 1601409664.463212 0.001821 0.001783 > - RespProtocol HTTP/1.0 > - RespStatus 200 > - RespReason OK > - RespHeader Server: SimpleHTTP/0.6 Python/3.8.6 > - RespHeader Date: Tue, 29 Sep 2020 20:01:04 GMT > - RespHeader Content-type: application/octet-stream > - RespHeader Content-Length: 12 > - RespHeader Last-Modified: Tue, 29 Sep 2020 19:36:31 GMT > - RespProtocol HTTP/1.1 > - RespHeader X-Varnish: 3 > - RespHeader Age: 0 > - RespHeader Via: 1.1 varnish (Varnish/6.5) > - VCL_call DELIVER > - VCL_return deliver > - Timestamp Process: 1601409664.463225 0.001833 0.000012 > - Filters > - RespHeader Accept-Ranges: bytes > - RespHeader Connection: keep-alive > - Timestamp Resp: 1601409664.463286 0.001894 0.000061 > - ReqAcct 96 0 96 299 12 311 > - End > > * << Session >> 1 > - Begin sess 0 HTTP/1 > - SessOpen ::1 55180 a0 ::1 6081 1601409664.461310 18 > - Link req 2 rxreq > - SessClose REM_CLOSE 0.002 > - End > > As you can see from the FETCH block, what it effectively fetches from backend > is not what varnishncsa logs. > I think it has something to do with the order in which the restart affects > the flow, since its not replacing the req.url object, just appending a new > one. BACKEND FETCH uses the right URL object since it uses the last one, but > varnishncsa logs the first it picked up from that restart. > > I've tried setting a blank req.url or unsetting in an attempt to avoid having > several req.URL objects introducing problems, but it seems that property is > readonly since I get errors like "Variable cannot be unset" or "Setting > req.url to empty sting - FAILED". > > Unfortunately I have no other way to implement my abtest engine than by using > the restart() order from varnish, but this logging issue is extremely > affecting what I can see to debug my stuff. > > Any clue of what trick I could use to workaround this? You can grab the right URL from the header, but you need to always set that header, even when you don't use the restart feature. Simply replace the URL pattern from the default format with the header instead: > varnishncsa -F '%h %l %u %t "%m http://%{Host}i%{X-Original-URL}i%q %H" %s %b > "%{Referer}i" "%{User-agent}i"' Based on the varnishncsa manual [1] it should be something like that. Dridi [1] https://varnish-cache.org/docs/6.5/reference/varnishncsa.html _______________________________________________ varnish-misc mailing list varnish-misc@varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc