> On Apr 19, 2017, at 5:24 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> It's not obvious to me what all of the implications are. I feel we need a 
> more comprehensive proposal here, can you do a small write up that shows 
> you've thought through all the implications?

Ok, let me summarize ...

In libprocess, actors (processes) are globally identified by their UPID. The 
UPID is a string of the format <ID>@<IP address>:<port>. When libprocess actors 
communicate, they listen on a single IP address and port and claim this 
endpoint in their UPID. The expectation of other actors is that they can send 
messages to this actor by connecting to the endpoint claimed in the UPID. Note 
that the peer address used to send messages is not the UPID, but whichever 
source address:port the sender happened to bind.

Due to various network configurations (e.g. NAT, multihoming), situations arise 
where a receiver is not able to easily predict which IP address it needs to 
claim in its UPID such that senders will be able to connect to it. This leads 
to the need for configuration options like LIBPROCESS_IP (specify the address 
to listen on) and LIBPROCESS_ADVERTISE_IP (specify the address claimed in the 
UPID). 

The fundamental problem this proposal attempts to mitigate is that libprocess 
has no mechanism to authenticate a UPID. This means that unless the 
network fabric is completely trusted, anyone with access to the network can 
inject arbitrary messages into any libprocess actor, impersonating any other 
actor.

This proposal adds an optional libprocess setting to bind the IP address 
claimed in the UPID to the IP address of the message sender. This is a 
receiver-side option: the receiver compares the result of getpeername(2) with 
the UPID from the libprocess message. The IP address is required to match, but 
the port and ID are allowed to differ. In practice, this means that libprocess 
actors are required to send from the same IP address they are listening on. So 
to impersonate a specific UPID you have to be able to send the message from the 
same IP address the impersonatee is using (i.e. be running on the same host), 
which increases the difficulty of impersonation.

In our deployments, all the cluster members are single-homed and remote access 
to the system is restricted. We know in advance that neither LIBPROCESS_IP nor 
LIBPROCESS_ADVERTISE_IP options are required. We have some confidence that we 
control the services running on cluster hosts. If we bind the UPID address to 
the socket peer address, then UPID impersonation requires that malicious code 
is already running on a cluster host, at which point we probably have bigger 
problems.

In terms of testing, the whole test suite passes except for 
ExamplesTest.DiskFullFramework. This test uses the LIBPROCESS_IP option to bind 
to the lo0 interface, which can result in the UPID claiming 127.0.0.1 but 
getpeername(2) returning one of the host's external IP addresses. This is 
really just one type of multi-homed host configuration, so it demonstrates that 
such configurations will break.

> 
> E.g. how does this affect the use of proxies, multihomed hosts, our testing 
> facilities for message spoofing, etc.
> 
> On Tue, Apr 18, 2017 at 2:07 PM, James Peach <jor...@gmail.com> wrote:
> 
> > On Apr 7, 2017, at 8:47 AM, Jie Yu <yujie....@gmail.com> wrote:
> >
> > + BenM
> >
> > James, I don't have immediate context on this issue. BenM will be back next
> > week and he should be able to give you more accurate feedback.
> 
> 
> I updated the review chain (from https://reviews.apache.org/r/58517/). Is 
> anyone able to shepherd this?
> 
> 
> >
> > - Jie
> >
> > On Fri, Apr 7, 2017 at 8:37 AM, James Peach <jor...@gmail.com> wrote:
> >
> >>
> >>> On Apr 5, 2017, at 5:42 PM, Jie Yu <yujie....@gmail.com> wrote:
> >>>
> >>> One comment here is that:
> >>>
> >>> We plan to support libprocess communication using domain socket. In other
> >>> words, we plan to make UPID a socket addr. Can we make sure this approach
> >>> also works for the case where UPID is a unix address in the future? For
> >>> instance, what will `socket->peer();` return for domain socket?
> 
> This would probably work, but it depends on the libprocess implementation. 
> Since this is a (default false) option, I think it is OK for now. I'll be 
> happy to revisit when libprocess messaging supports domain sockets.
> 
> >>
> >> I can look into that.
> >>
> >> So you would consider this approach a reasonable mitigation?
> >>
> >>>
> >>> - Jie
> >>>
> >>> On Wed, Apr 5, 2017 at 3:27 PM, James Peach <jor...@gmail.com> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> Currently, libprocess messages contain a UPID, which is sent by the peer
> >>>> in the HTTP message header. There's no validation of this, so generally
> >>>> messages are trusted to be from the UPID they claim to be.
> >>>>
> >>>> As an RFC, I've pushed https://reviews.apache.org/r/58224/. This patch
> >>>> constrains the UPID to not change for the lifetime of the socket
> 
> I dropped this constraint. Tests like DiskQuotaTest.SlaveRecovery depend on 
> the UPID being able to change. More accurately, libprocess only matches on 
> the address portion of the UPID when finding a socket to use, and for now I 
> don't think this change is beneficial enough to break that assumption.
> 
> >>>> , and
> >> also
> >>>> enforces that the IP address portion of the UPID matches the peer
> >>>> socket address. This makes UPIDs more reliable, but the latter check
> >> would
> >>>> break existing configurations. I'd appreciate any feedback on whether
> >> this
> >>>> is worth pursuing at the libprocess level and whether people feel that
> >>>> this specific mitigation is worthwhile.
> >>>>
> >>>> thanks,
> >>>> James
> >>
> >>
> 
> 
