Hi Hari,
On Thu, Nov 03, 2016 at 04:08:37PM +0000, Hari Chandrasekhar wrote:
> Hello,
> Our team is planning to use HAProxy as a TCP load balancer for MQTT
> servers. We don't have much familiarity with the HAProxy set up.
> So we would like to get some clarity on how the process would work. Please
> let me know if this is not the right place to ask questions and Thanks in
> Advance.
>
> We are planning to use HAProxy's SSL termination feature and enforce client
> certificate validation. So far, we were able to get HAProxy to enforce
> clients to present client certificates.
> But we are trying to implement some additional client validations - mainly
> the following.
> 1. Add additional certificate validation by checking the client
> Identifier presented in the MQTT data (MQTT - connect packets) against the
> CN in the presented client certificate
> 2. Perform some authorizations based on the certificate and the type of
> packets (PUBLISH, SUBSCRIBE etc).
> (Here is the MQTT specification document -
> http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html)
>
> Here are some of our questions.
>
> - Is it possible to do the above using HAProxy & is using HAProxy for
> these purposes the right approach?
I think you could indeed extract the client identifier from the CONNECT
packet using Lua. But in order to validate authorization in the other
packets, you would need to implement a complete protocol parser and at
this point it doesn't sound like a good idea to me.
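
To give an idea of how little parsing the CONNECT part needs, here is a
rough sketch of the logic in Python (not Lua and not HAProxy code, only an
illustration of the byte layout from the MQTT 3.1.1 spec). The function
names are made up for this example, and it assumes the whole CONNECT packet
is already in the buffer:

```python
import struct


def decode_remaining_length(buf, pos):
    """Decode MQTT's variable-length "Remaining Length" starting at pos.

    Returns (value, next_pos). Each byte carries 7 bits of the value,
    least significant group first; bit 7 means "another byte follows".
    """
    value = 0
    for i in range(4):  # the spec allows at most 4 bytes
        if pos + i >= len(buf):
            raise ValueError("truncated remaining length")
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # continuation bit clear: last byte
            return value, pos + i + 1
    raise ValueError("remaining length longer than 4 bytes")


def read_prefixed(buf, pos):
    """Read a field prefixed by a 16-bit big-endian length.

    Returns (field_bytes, next_pos).
    """
    if pos + 2 > len(buf):
        raise ValueError("truncated length prefix")
    (size,) = struct.unpack_from(">H", buf, pos)
    end = pos + 2 + size
    if end > len(buf):
        raise ValueError("truncated field")
    return buf[pos + 2:end], end


def mqtt_client_id(buf):
    """Return the client identifier of an MQTT 3.1.1 CONNECT packet."""
    if not buf or buf[0] >> 4 != 1:  # packet type 1 = CONNECT
        raise ValueError("not a CONNECT packet")
    _, pos = decode_remaining_length(buf, 1)
    _proto_name, pos = read_prefixed(buf, pos)  # normally b"MQTT"
    pos += 1 + 1 + 2  # skip protocol level, connect flags, keep alive
    client_id, _ = read_prefixed(buf, pos)
    return client_id.decode("utf-8")


# Hand-built CONNECT packet with client identifier "sensor-42".
packet = b"\x10\x15\x00\x04MQTT\x04\x02\x00\x3c\x00\x09sensor-42"
print(mqtt_client_id(packet))
```

Note that when the Remaining Length fits in one byte and the protocol name
is "MQTT", the identifier's length prefix lands at offset 12 of the packet.
Larger packets use more Remaining Length bytes and shift that offset, which
is why a small dedicated parser is more robust than a fixed offset.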
> - Is Lua scripts the recommended way for extending functionalities
> like this? Are there other plugin mechanisms available?
At the moment there is only Lua to deal with this. Soon there will be an
option to delegate some content processing to an external process, so
that might open many other possibilities, but it's a bit early to suggest
trying it.
> - We were analyzing the sample-fetch options to parse data from the
> request body but found it hard to log these contents. We found the http
> capture options but that seem specific to http. Are there similar ones
> which are more generic than specific to http?
The HTTP ones rely on the HTTP decoder, which is the reason there are so
many. There are a few extra ones like RDP cookie extraction, and SSL hello
message analysis which rely on the beginning of the payload of such protocols.
It would be very easy to implement a new sample fetch function to extract the
client identifier and possibly other information from the beginning of an MQTT
connection, but keeping synchronized with a stream requires a much more
complete protocol implementation, basically like we have for HTTP. I'm not
sure why we're starting to see more and more MQTT over haproxy, and since
you're one of these users, I'm interested in getting your opinion on this.
Maybe there's a growing trend and we'll need to deal with it in the near
future, or maybe it will always remain marginal, I don't know.
> - According to the docs - payload_lv can read the content length at the
> given offset. What is the format used to represent the length of the
> contents? MQTT protocol uses similar approach to prefix the length before
> certain contents. We are trying to verify the format used are the same. If
> you could point us to a more detailed document or some examples around
> these - that would be helpful.
From what I'm seeing, the "length" argument specifies the size in bytes of
the length field (which is read as big endian). Thus for example :
req.payload_lv(10,2)
will read two bytes at offsets 10 and 11, consider them as a 16-bit big
endian integer representing the block size, then will start to read the
block at byte 12.
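
In other words, the behaviour described above can be emulated on a plain
byte buffer like this. It is only a Python sketch of the two-argument form
as I described it, not the actual haproxy code, and returning None when the
data is too short is an assumption made for the example:

```python
def payload_lv(buf, offset, length_size):
    """Emulate req.payload_lv(offset, length_size) on a bytes buffer.

    Reads length_size bytes at offset as a big-endian integer, then
    returns that many bytes starting right after the length field.
    Returns None if the buffer is too short (assumption of this sketch).
    """
    if offset + length_size > len(buf):
        return None
    size = int.from_bytes(buf[offset:offset + length_size], "big")
    start = offset + length_size
    if start + size > len(buf):
        return None
    return buf[start:start + size]


# 10 bytes of padding, a 16-bit length of 5 at offsets 10 and 11,
# then the 5-byte block starting at byte 12, then unrelated data.
data = bytes(10) + b"\x00\x05" + b"hello" + b"trailing"
print(payload_lv(data, 10, 2))
```

The length-prefixed strings in MQTT use the same 16-bit big-endian prefix,
so they match this format as long as the offset of the prefix is known.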
If you think it's too hard to extract these contents, please take a look
at smp_fetch_req_ssl_ver() in payload.c for example, you'll see that it's
easy to add a bit of code to implement simple protocol parsers that can
make it much easier to extract contents, especially when you have to
cascade multiple fields. Just keep in mind that payload_lv() is quite
generic and usable for many things, but there's no strong rule for not
adding specific protocol processing code :-)
Cheers,
Willy