On Mon, 14 Mar 2011 10:46:21 -0600, Alex Rousskov wrote:
On 03/14/2011 10:05 AM, Kinkie wrote:
You could handle encodings in the low-level value parser as well if you
make them explicit:
name="string with some documented quoting rules"
name=base64:base64encodedvalue
name=rawvalueterminatedbythefirstwhitespace
The latter assumes no raw value can start with "base64:". Other
quoting/encoding mechanisms are possible, of course.
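
For illustration only, the three value forms above could be decoded roughly like this. This is a minimal Python sketch, not Squid code: decode_value is a hypothetical name, the backslash-escape rule for quoted strings is an assumption (the actual quoting rules are not specified in this thread), and base64 payloads are assumed to hold UTF-8 text.

```python
import base64


def decode_value(raw):
    """Decode one value token written in one of the three explicit forms."""
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        # Quoted form. ASSUMPTION: a backslash escapes the next character.
        body = raw[1:-1]
        out = []
        i = 0
        while i < len(body):
            if body[i] == "\\" and i + 1 < len(body):
                i += 1  # skip the backslash, keep the escaped character
            out.append(body[i])
            i += 1
        return "".join(out)
    if raw.startswith("base64:"):
        # Explicit base64 form. ASSUMPTION: the payload is UTF-8 text.
        return base64.b64decode(raw[len("base64:"):]).decode("utf-8")
    # Raw form: the token was already cut at the first whitespace.
    return raw
```

With these assumptions, decode_value("base64:aGVsbG8=") yields "hello", and a raw token is returned unchanged.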
Why put that in the message, if we can do it in the label, and avoid
all quoting issues altogether?
Not sure what you mean by "label", but it is easier to write parsers and
extend functionality when parsers and packers do not rely on name
semantics to properly handle the message.
If you meant moving the value "base64:" prefix to the "name" suffix, that
is possible, of course, but keeping the encoding of the value closer to
the value itself is more "natural", and it would be a little easier to
write value encoders that way, IMO. I do agree that using a name suffix
will help avoid clashes between encoding specs and values.
I didn't really want it to get that complicated just yet. Once the
receiving parsers in Squid are unified they can be extended to do all
this. FWIW there are base-64, RFC-1738, shell, quoted, raw binary and
two-way crypto encodings which ALL might appear here.
The parameter message= is added with a quoted string value to allow
other parameters on the same result line when an error reason/message is
sent back.
The parameter user= is added to hold the username whenever relevant for
any reply.
Other parameters are on the planning board for addition after the
changes. So far I have: ttl= for setting a desired credentials-TTL,
group= for associating a group name with the user=, tag= extended from
external ACL to auth.
Opinions? problems? other ideas?
I do not know much about existing helpers, but I would recommend using
the same set of basic syntax rules for _all_ messages:
Agreed. That was the plan. My description for the semantics of several
name= tokens has no bearing on the underlying receiver parser. That is
purely semantics in the end helper and handling components.
The line receiver only needs to care which of these it is looking at:
* a "" pair around something containing whitespace, or
* a single series of bytes guaranteed to contain no space characters.
There is no low-level difference between token=<base-64_blob> and
user=<username>.
There is one between message="raw text with possible binary" and
message=<base-64 encoded blob>.
<responsecodeterminatedbywhitespace> [name=value]*
where the value syntax is as discussed above. This approach would make
writing a general response-buffer-to-response-structure parser easy and
simplify addition of new fields or extensions.
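
To make the "general parser" idea concrete, here is a rough Python sketch of a reply-line parser for the grammar above. It is only an illustration under assumptions: parse_reply is a hypothetical name, quoted values are assumed to use backslash escapes, and the OK/ERR codes in the usage note are examples only.

```python
def parse_reply(line):
    """Split '<code> [name=value]*' into (code, {name: value})."""
    code, _, rest = line.strip().partition(" ")
    params = {}
    i, n = 0, len(rest)
    while i < n:
        if rest[i].isspace():
            i += 1  # skip separators between parameters
            continue
        eq = rest.find("=", i)
        name = rest[i:eq] if eq != -1 else ""
        if not name or any(c.isspace() for c in name):
            raise ValueError("expected name=value at offset %d" % i)
        i = eq + 1
        if i < n and rest[i] == '"':
            # Quoted value: may contain whitespace.
            # ASSUMPTION: a backslash escapes the next character.
            i += 1
            chars = []
            while i < n and rest[i] != '"':
                if rest[i] == "\\" and i + 1 < n:
                    i += 1
                chars.append(rest[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quoted value for %r" % name)
            i += 1  # step over the closing quote
            value = "".join(chars)
        else:
            # Bare value: runs until the next whitespace character.
            start = i
            while i < n and not rest[i].isspace():
                i += 1
            value = rest[start:i]
        params[name] = value
    return code, params
```

For example, parse_reply('ERR message="bad password" user=fred') gives the code "ERR" plus a mapping with message and user. The parser never looks at what a name means, so new fields need no parser changes.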
I agree, with a proposed modification (pseudo-regex syntax):
<responsecodeterminatedbywhitespace> (name(:enc)?=value " ")*
Where :enc, if specified, may be ":b" (base64-encoded), with further
codes to be added as needed. As a separator we could use something
different from space to further reduce the risk of clashes.
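
A small Python sketch of how one name(:enc)?=value token might be handled, purely as an illustration. parse_param, PARAM_RE and DECODERS are hypothetical names; only the ":b" code comes from the proposal above, and base64 payloads are assumed to hold UTF-8 text.

```python
import base64
import re

# name, optional ':enc' suffix, then '=' and a whitespace-free value
PARAM_RE = re.compile(r"^([^:=\s]+)(?::([^=\s]+))?=(\S*)$")

# Encoding codes known to this sketch; None means no suffix was given.
DECODERS = {
    None: lambda v: v,
    # ASSUMPTION: base64 payloads carry UTF-8 text.
    "b": lambda v: base64.b64decode(v).decode("utf-8"),
}


def parse_param(token):
    """Return (name, decoded_value) for one name(:enc)?=value token."""
    m = PARAM_RE.match(token)
    if m is None:
        raise ValueError("malformed parameter: %r" % token)
    name, enc, value = m.groups()
    if enc not in DECODERS:
        raise ValueError("unknown encoding code: %r" % enc)
    return name, DECODERS[enc](value)
```

Here parse_param("token:b=aGVsbG8=") returns ("token", "hello"), while parse_param("user=fred") passes the value through untouched. Adding another code would only mean adding one entry to the table.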
I like. Though this does make the low-level user-facing parts a bit
more complicated.
Amos