Hi David,

On Wed, Apr 16, 2014 at 01:03:37PM -0400, David S wrote:
(...)
>     This makes sense.  With all the possible fields, I would prefer a more
> machine friendly format. (I think that is your preference too.)

Actually yes.

>     How about a proxy protocol extension that can be appended to either V1
> or V2?

We cannot really use the same one for both, because v1 parsers really need
the CRLF at the end and the protocol was designed so that an fgets() could
return the whole line. And v2 needs to be cheap and avoid conversions. But
we can design something that works similarly and uses a different encoding.
We can also decide that we don't implement the extensions in v1, which
would encourage adoption of the new v2.
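
To illustrate the fgets() point, here is a rough Python sketch of a v1
reader that relies only on reading one CRLF-terminated line. The function
name read_v1_header is made up for the example, and the 107-byte cap is
assumed from the v1 line length limit. Anything binary appended before the
CRLF would break this kind of reader.

```python
import io


def read_v1_header(stream):
    """Read one CRLF-terminated v1 line and return its space-separated fields."""
    # Assumption: a v1 line never exceeds 107 bytes, CRLF included.
    line = stream.readline(107)
    if not line.endswith(b"\r\n"):
        raise ValueError("incomplete or oversized v1 header")
    fields = line[:-2].split(b" ")
    if fields[0] != b"PROXY":
        raise ValueError("not a v1 header")
    return fields[1:]


# The header is consumed as a single line; the payload stays in the stream.
stream = io.BytesIO(
    b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\nGET / HTTP/1.1\r\n"
)
header = read_v1_header(stream)
rest = stream.read()
```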

>      To configure, the command could look like:
> 
> send-proxy-extension 1,2,3,17,18,33,26

It's fun because I was wondering whether we should not simply send some
TLS code values :-)

>       The protocol could look like:
> 
> (2 bytes):    Total extension length (N) in network order
> (N-2 bytes):  A series of TLVs
> 
> TLVs:
> (1 byte)   :  Type Field as defined below
> (2 bytes) :  Total length (N) of vector, in network order
> (N-3 bytes) : Payload in format depending on type
> 
> Defined Vector Types
> --------------------------------
> 1: interface name (string)
> 2: source mac address (in binary)
> 3: destination mac address (in binary)
> 
> 17: ASN (string)
> 
> 33: SSL Version (string)
> 34: SNI (string)
> 35: ALPN (string)
> 36: Certificate received (byte)
> 37: Certificate verify result (32 bit integer, network order)
> 38: Fully Distinguished Name (string)
> 39: Common Name (string)

Actually I'd like to be able to have some containers with mandatory
information at the beginning. For example, passing ALPN without having
the SSL version is almost pointless because some servers might want
to only apply it with the proper TLS version.

So I think that we should define a container format and some subfields
formats, all using a TLV format. That would also bring the possibility
for the reader to skip a whole container it's not interested in.

We could have the following containers :

  - interface
    - fields: dev, vlan, src, dst

  - l3/l4 extensions
    - window size, mss, rtt, source ASN

  - NAT extensions
    - orig_src, orig_sport, orig_dst, orig_dport

  - local system info
    - pid, uid, gid

  - SSL/TLS
    - version, SNI, ALPN, CN, etc...

  - SMTP info
    - ehlo/helo, domain

  - generic auth
    - user id
    - authenticated yes/no
    - auth method

We'd have to assign an identifier to each container, and a list of
identifiers to each sub-field. We could reserve IDs zero and one as
placeholders for padding/alignment : 00 = single byte, 01 NN = skip
N following bytes.

In binary mode, this could result in something like this :

   [type=interface] [len=12]
      [type=dev] [len=4] [val="eth0"]
      [type=vlan] [len=2] [val=1]
   [type=SSL] [len=100]
      [mandatory information such as version and crt present Y/N]
      [type=sni] [len=15] [value="www.example.com"]
      ...
   [type=end]
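
To make the container idea a bit more concrete, here is a minimal Python
sketch of nested TLVs, only as an illustration and not as a proposal for
the final encoding. It assumes a 1-byte type and a 1-byte length, and all
identifiers other than the 00 and 01 padding codes are hypothetical since
nothing has been assigned yet. The SSL container is kept opaque here to
show that a reader can skip a whole container it does not care about.

```python
import struct

# 00 and 01 are the padding placeholders suggested above.
PAD_BYTE = 0x00      # single padding byte
PAD_SKIP = 0x01      # followed by NN, then NN bytes to skip
# Hypothetical identifiers, chosen only for this example.
T_INTERFACE = 0x10   # container
T_SSL = 0x11         # container
F_DEV = 0x02         # field inside the interface container
F_VLAN = 0x03        # field inside the interface container


def tlv(type_id, payload):
    """Encode one TLV with a 1-byte type and a 1-byte length (assumed sizes)."""
    if len(payload) > 255:
        raise ValueError("payload too long for a 1-byte length")
    return bytes([type_id, len(payload)]) + payload


def walk(buf):
    """Yield (type, payload) pairs from buf, skipping both padding forms."""
    pos = 0
    while pos < len(buf):
        type_id = buf[pos]
        if type_id == PAD_BYTE:
            pos += 1
            continue
        if pos + 2 > len(buf):
            raise ValueError("truncated TLV header")
        end = pos + 2 + buf[pos + 1]
        if end > len(buf):
            raise ValueError("truncated TLV payload")
        if type_id != PAD_SKIP:
            yield type_id, buf[pos + 2:end]
        pos = end


def find_container(buf, wanted):
    """Return the fields of the first matching container, or None."""
    for type_id, payload in walk(buf):
        if type_id == wanted:
            return dict(walk(payload))
    return None


interface = tlv(T_INTERFACE,
                tlv(F_DEV, b"eth0") + tlv(F_VLAN, struct.pack("!H", 1)))
ssl_blob = tlv(T_SSL, b"opaque")
message = interface + bytes([PAD_BYTE]) + tlv(PAD_SKIP, b"\x00\x00") + ssl_blob
```

A reader that only wants the interface fields calls
find_container(message, T_INTERFACE) and never has to look inside the SSL
container.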

In text mode, we could use a different format. First, we should absolutely
avoid passing any length in text, because it's a nightmare to produce and
it forces multiple copies, which are really cumbersome in a number of
languages. Additionally, some languages have issues determining string
lengths because of multi-byte encoding (in fact it's more that some
developers don't use the proper methods, but who could blame them for
misusing something already confusing). So that means that we should
probably only use text such as

   <type>=<value>

where <type> is an integer for containers, and ".<integer>" for a field,
then the value is dumped raw if it does not present any risk (eg: integer
or empty), or as a quoted string, or better, url-encoded string for anything
else, which would also remove spaces and CRLFs that are the only delimiters.
The benefit of url-encoding is that it's harder to get wrong during encoding
than quoted strings, and there are decoders available everywhere. For
containers, the value would hold the mandatory parts if any. So the example
above could look like this :

   <- interface -> <-------- ssl ------------>
   1= .1=eth0 .3=1 15=3.1,N .1=www.example.com
      ^dev    ^vlan   ^version, crt pres   ^SNI
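
As a quick check that this is easy to produce and parse, here is a small
Python sketch of the text form using the standard urllib.parse functions.
The helper names and the comma between mandatory parts are assumptions
taken from the example line above, nothing more.

```python
from urllib.parse import quote, unquote


def encode_value(value):
    """Dump integers raw and url-encode everything else."""
    if isinstance(value, int):
        return str(value)
    # safe="" also encodes "/", so spaces, CR, LF, "=" and "," never leak.
    return quote(str(value), safe="")


def encode_container(container_id, mandatory, fields):
    """Build '<id>=<mandatory parts>' followed by '.<field>=<value>' tokens."""
    head = "%d=%s" % (container_id,
                      ",".join(encode_value(v) for v in mandatory))
    parts = [head] + [".%d=%s" % (fid, encode_value(v)) for fid, v in fields]
    return " ".join(parts)


def decode(line):
    """Parse a line back into a list of containers (all values as strings)."""
    containers = []
    for token in line.split():
        key, _, raw = token.partition("=")
        if key.startswith("."):
            if not containers:
                raise ValueError("field found before any container")
            containers[-1]["fields"][int(key[1:])] = unquote(raw)
        else:
            mandatory = [unquote(p) for p in raw.split(",")] if raw else []
            containers.append(
                {"id": int(key), "mandatory": mandatory, "fields": {}})
    return containers


line = " ".join([
    encode_container(1, [], [(1, "eth0"), (3, 1)]),
    encode_container(15, ["3.1", "N"], [(1, "www.example.com")]),
])
```

With these inputs, line is exactly the example string shown above, and a
value containing spaces or CRLF comes out as %20, %0D and %0A so it cannot
be confused with a delimiter.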

It could be a good start, what do you think ?

> For V1, we could replace "\r\n" with "0xffff" to signify the presence of
> the extension.

For the reason above, I'd avoid this.

> For V2, we could borrow a bit somewhere in the header to signify the same.

Sure!

What's your opinion ?

Willy

