I've been pondering whether we could have a generalized vmod interface for
iterating over blob lists, and bodies in particular. Ideally, I'd like to have
a single interface covering all of the following pseudo-VCL examples.

It's not that I personally need all of these now; the hash(req.body) case is
really the only one I need to get working ASAP (and the plan is to fix the
bodyaccess vmod). But while at it, I couldn't help reflecting on how this could
be solved for the general case.

So here's a vcl mock:

vcl_init {
        # vmod-re exists
        new re_evil = re.regex("SQL.*INJECTION");
}

vcl_recv {
        cache_req_body(1MB);

        # .match(STRING) exists, .matchb(BODY) doesn't
        if (re_evil.match(req.url) || re_evil.matchb(req.body)) {
                return (synth(400, "you're evil"));
        }
}

vcl_hash {
        if (req.method != "GET" && req.method != "HEAD") {
                # not possible ATM
                hash_blob(req.body);
        }
}

vcl_backend_response {
        # may be a stupid example, could not come up with anything better
        if (beresp.http.Content-Type == "image/png") {
                image.recompress(beresp.body);
        }

        # blobcode/blobdigest exist, but hashing a body is not
        # possible
        set beresp.http.Etag = blobcode.encode(BASE64,
                                blobdigest.hashb(MD5, beresp.body));
}

vcl_deliver {
        if (req.http.Cookie ~ "loggedin=true") {
                if (resp.http.Content-Type == "audio/mp3") {
                        # another stupid example
                        mp3.watermark(resp.body, req.http.UserId);
                }
        }
}

The VCC declarations for all of these vmod methods/functions could then use a
common BLOB_LIST type:

# libvmod-re
$Method BOOL .matchb(BLOB_LIST)

# libvmod-image
$Function VOID recompress(BLOB_LIST)

# libvmod-blobdigest
$Function BLOB hashb(ENUM {MD5, ...}, BLOB_LIST)

# libvmod-mp3
$Function VOID watermark(BLOB_LIST, STRING)

Only one BLOB_LIST argument would be allowed per Function/Method.

The VCL/VMOD interface should have an init call, an iterator and a fini call.
The thing passed when iterating could be the existing vmod_priv:

struct vmod_priv_iter;
typedef struct vmod_priv *vmod_priv_iter_f(const struct vmod_priv_iter *,
    const struct vmod_priv *);

enum vmod_priv_iter_state_e {
        VI_INIT,
        VI_ITER,
        VI_FINI
};

struct vmod_priv_iter {
        void                            *priv;
        enum vmod_priv_iter_state_e     state;
        vmod_priv_iter_f                *func;
};

The C type of BLOB_LIST would be struct vmod_priv_iter *

the compiled VCL would then:

        - alloc the vmod_priv_iter (on the stack?)
        - zero it and set state=VI_INIT
        - call the vmod function once, ignoring the return value
          - the vmod function would alloc/init its priv data and fill
            in the priv and func members of the struct vmod_priv_iter
        - compiled VCL would set VI_ITER and loop through the object,
          calling the vmod_priv_iter_f
          - NULL return from iterator means "have not changed"
          - otherwise the iterator function MAY modify the object
            (if writable from the context) by referencing the returned
            vmod_priv, copying it, or copying and freeing it, as applicable
        - compiled VCL would set state=VI_FINI and call the vmod
          function one last time, using the return value unless VOID
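
The call sequence above can be sketched in plain C. This is only a minimal
illustration under stated assumptions: struct vmod_priv is a cut-down stand-in
with just the two members used here, and count_iter(), vmod_count() and
vcl_call_count() are hypothetical names, not existing Varnish or vmod code.
The "vmod" simply adds up the chunk lengths and never modifies a chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down stand-in for Varnish's struct vmod_priv (assumption). */
struct vmod_priv {
        void    *priv;
        int     len;
};

struct vmod_priv_iter;
typedef struct vmod_priv *vmod_priv_iter_f(const struct vmod_priv_iter *,
    const struct vmod_priv *);

enum vmod_priv_iter_state_e { VI_INIT, VI_ITER, VI_FINI };

struct vmod_priv_iter {
        void                            *priv;
        enum vmod_priv_iter_state_e     state;
        vmod_priv_iter_f                *func;
};

/* Hypothetical per-chunk callback: add the chunk length to the total. */
static struct vmod_priv *
count_iter(const struct vmod_priv_iter *it, const struct vmod_priv *chunk)
{
        size_t *total = it->priv;

        assert(it->state == VI_ITER);
        *total += (size_t)chunk->len;
        return (NULL);          /* NULL: chunk has not changed */
}

/* Hypothetical vmod function, called once for VI_INIT and once for VI_FINI. */
static size_t
vmod_count(struct vmod_priv_iter *it)
{
        size_t total;

        if (it->state == VI_INIT) {
                /* alloc/init private data, fill in priv and func */
                it->priv = calloc(1, sizeof(size_t));
                assert(it->priv != NULL);
                it->func = count_iter;
                return (0);     /* return value ignored by the caller */
        }
        assert(it->state == VI_FINI);
        total = *(size_t *)it->priv;
        free(it->priv);
        it->priv = NULL;
        return (total);
}

/* Roughly what compiled VCL would do for a call with a BLOB_LIST argument. */
static size_t
vcl_call_count(const struct vmod_priv *chunks, size_t n)
{
        struct vmod_priv_iter it;
        size_t i;

        memset(&it, 0, sizeof it);
        it.state = VI_INIT;
        (void)vmod_count(&it);
        assert(it.func != NULL);

        it.state = VI_ITER;
        for (i = 0; i < n; i++)
                (void)it.func(&it, &chunks[i]);

        it.state = VI_FINI;
        return (vmod_count(&it));
}
```

Here the chunks come from a plain array; in the real thing they would come
from the object's storage segments or from the fetch pipeline.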

Regarding the interfaces with varnish core we need to differentiate the use 
cases:

* vcl_recv {} / vcl_hash {} req.body access

  We get this as a storage object, so we would wrap the vmod
  iterator in an objiterate_f -> _should_ be easy, I think
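
  A rough sketch of such a wrapper follows. The objiterate_f typedef is
  declared locally and only modeled on the callback shape believed to be
  used by varnishd, so treat its exact signature as an assumption.
  struct vmod_priv is again a cut-down stand-in, and
  vmod_iter_objiterate(), sum_len_iter() and fake_obj_iterate() are
  hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Cut-down stand-in for Varnish's struct vmod_priv (assumption). */
struct vmod_priv {
        void    *priv;
        int     len;
};

struct vmod_priv_iter;
typedef struct vmod_priv *vmod_priv_iter_f(const struct vmod_priv_iter *,
    const struct vmod_priv *);

enum vmod_priv_iter_state_e { VI_INIT, VI_ITER, VI_FINI };

struct vmod_priv_iter {
        void                            *priv;
        enum vmod_priv_iter_state_e     state;
        vmod_priv_iter_f                *func;
};

/* Assumed callback shape, declared locally for this sketch. */
typedef int objiterate_f(void *priv, int flush, const void *ptr, ssize_t len);

/* Adapter: hand each storage segment to the vmod iterator as a vmod_priv. */
static int
vmod_iter_objiterate(void *priv, int flush, const void *ptr, ssize_t len)
{
        struct vmod_priv_iter *it = priv;
        struct vmod_priv chunk;

        (void)flush;
        assert(it->state == VI_ITER);
        if (len <= 0)
                return (0);
        chunk.priv = (void *)(uintptr_t)ptr;    /* treated as read-only */
        chunk.len = (int)len;
        /* req.body is not writable here, so a non-NULL return is ignored */
        (void)it->func(it, &chunk);
        return (0);
}

/* Hypothetical vmod iterator: sum up the segment lengths. */
static struct vmod_priv *
sum_len_iter(const struct vmod_priv_iter *it, const struct vmod_priv *chunk)
{
        size_t *total = it->priv;

        *total += (size_t)chunk->len;
        return (NULL);
}

/* Hypothetical stand-in for the core walking an object's segments. */
static int
fake_obj_iterate(objiterate_f *func, void *priv,
    const char *const *segs, size_t n)
{
        size_t i;

        for (i = 0; i < n; i++)
                if (func(priv, i + 1 == n, segs[i], (ssize_t)strlen(segs[i])))
                        return (-1);
        return (0);
}
```

  The adapter keeps no state of its own: everything it needs travels in
  the priv pointer, which is the struct vmod_priv_iter set up by the
  VI_INIT call.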

* vcl_backend_response { }

  Trouble here is that we do not have the body, so in principle I see
  a couple of options and I am having a hard time making up my mind
  which would be best

        - early fetch of the body, wrap the vmod iterator in
          a vfp (but where in the vfp stack would we put it?)

        - early fetch of the body, use objiterate_f when done

        Both would disable streaming whenever the vmod iterator
        returns anything but VOID; only the vfp option would still
        allow streaming for a VOID return

* vcl_deliver { }

  Here we could use the objiterate_f again, but we would need to
  create some dummy OC_F_PRIVATE object, filling in the modified bits.

Nils
