On Fri, May 3, 2019 at 12:07 PM Geoff Simmons <ge...@uplex.de> wrote:
>
> On 5/3/19 10:51, Dridi Boukelmoune wrote:
> >
> > First we lose the ability to treat the whole record as a single string
> > out of the box. The client will need to reconstruct the string from
> > the various field types that may constitute a record. Think string and
> > regex operators.
>
> Yes, Martin and I recently had a conversation about this on #hacking.
> For the string equality comparisons (eq and ne), we could conceivably
> look into the comparison string and see if it has blanks that cross
> field boundaries. But as you pointed out, some VSL payloads have blanks
> that are not field separators (VCL_Log, FetchError, etc).
>
> For regex comparisons, where the pattern could include things like "\s",
> ".*", "\W" and so forth, detecting whether the pattern might match a
> field-separating blank has the look of a sophisticated computer
> science-y problem.

Does it have to though? We only use those as raw strings, and never as
regular expression patterns.

> Martin and I agreed that, at least in the first iteration, for VSLQ
> queries with a string operator (equality or regex) we should render the
> formatted payload into a temporary buffer and run the comparison against
> that. Once we have it all working, maybe we could consider more clever
> solutions, but I'd be surprised if we'd gain much from getting clever.

Ack, that's the logical thing to start with.

> I think that an important part of this will be to emphasize in the docs
> that authors of VSL queries should use field[n] indices and numeric
> comparisons whenever it's appropriate. My (admittedly evidence-free)
> impression is that not everyone uses them when they could.

Well, today the fields need to be dup'ed anyway whenever we need a
null-terminated string, so it will indeed need to be documented that
field operations become much cheaper than record operations.

> Say you're scanning backend logs for fetches to backend foo, like this:
>
>     -q 'BackendOpen ~ "foo"'
>
> Since the backend name is the second field, it would be better as:
>
>     -q 'BackendOpen[2] eq "foo"'
>
> Or to look for backend fetches that take longer than 1 second, it
> wouldn't surprise me if some people are doing something like:
>
>     -q 'Timestamp:Beresp ~ "[1-9]\.\d*$"'
>
> When it should be:
>
>     -q 'Timestamp:Beresp[3] > 1.0'

Agreed, I sometimes realize after running a long query on a VSL of the
GB persuasion that I could have done better.

> We'll do much better if we can narrow the query to a field, and if we
> don't have to convert numbers at all. And we should make sure that users
> know about it.

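To make the field-versus-record point concrete, here is a minimal sketch
in C of what a field-level comparison boils down to on today's
blank-separated payloads. The helpers nth_field, field_eq and field_gt
are hypothetical names made up for illustration, not libvarnishapi
functions, and the sample payloads in the usage note are invented too.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not libvarnishapi): find the n-th (1-based)
 * blank-separated field of a payload. Returns a pointer to its first
 * byte and stores its length in *len, or NULL if there is no such field.
 */
static const char *
nth_field(const char *rec, int n, size_t *len)
{
	const char *p = rec;

	assert(rec != NULL && len != NULL && n >= 1);
	for (;;) {
		/* Skip the blanks in front of the next field. */
		while (*p != '\0' && isspace((unsigned char)*p))
			p++;
		if (*p == '\0')
			return (NULL);
		const char *start = p;
		/* Walk to the end of this field. */
		while (*p != '\0' && !isspace((unsigned char)*p))
			p++;
		if (--n == 0) {
			*len = (size_t)(p - start);
			return (start);
		}
	}
}

/* String equality on one field, in the spirit of 'Tag[n] eq "str"'. */
static int
field_eq(const char *rec, int n, const char *str)
{
	size_t len;
	const char *f = nth_field(rec, n, &len);

	return (f != NULL && strlen(str) == len && memcmp(f, str, len) == 0);
}

/* Numeric comparison on one field, in the spirit of 'Tag[n] > val'. */
static int
field_gt(const char *rec, int n, double val)
{
	size_t len;
	char *end;
	const char *f = nth_field(rec, n, &len);

	if (f == NULL)
		return (0);
	double d = strtod(f, &end);
	/* Only match if the whole field parsed as a number. */
	return (end == f + len && d > val);
}
```

With a made-up payload such as "31 foo 192.0.2.10 8080",
field_eq(payload, 2, "foo") only looks at the second field, and
field_gt("1556877600.123456 1.250000 1.249000", 3, 1.0) parses a single
number, with no regex and no scan of the rest of the record. With typed
fields in the new format, even the strtod() step could go away.
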
Make the field index mandatory with [0] meaning the whole record and
people will learn :p

> > Finally, regarding the VSL file header that we'd hypothetically dump
> > with varnishlog -w in the future (which I really hope we do), we could
> > consider making this a VSL record itself that is not subject to
> > vsl_reclen, querying or filtering, for the needs of dynamic VSLs, so
> > that whenever something changes VSL clients see an update. It would
> > then be varnishlog -w's job to build such a synthetic record in the
> > VSL file before dumping actual logs.
>
> I agree, and I'd say that when we do have meta-data entries in the log,
> it should be possible for them to appear at any time, in both log files
> and in-memory logs. That would put us on the path to dynamic SLT tags.
>
> But we're not planning for that in the first iteration. phk has said
> that he doesn't foresee dynamic VSL by the September release. And I
> agree that we should get the rest working first.

I disagree. If we draft a plan that 1) changes the in-memory AND on-disk
format and 2) leaves room for dynamic VSLs later, we should make this
part of the first iteration. We shouldn't change the format of such a
critical piece of the infrastructure every other release.

Also, by making this both a synthetic record built by libvarnishapi for
live log consumers and, in the future, something that varnishd may
produce, it becomes a no-brainer for varnishlog: it just writes it to
disk like any other log record. It may need some intelligence though
when logs are rotated, like varnishlog asking for a synthetic record
again.

Dridi