On 7/26/15 11:39 AM, Matthieu Boutier wrote: > Hi John, > > You invalidate the mail I wrote in the plane! Thanks for that great > review. > >> it looks like good quality code. > > Thanks!! > >> * If I'm understanding correctly, the two major changes you make to the >> network message format are 1) inserting a flow ID into the message >> sequence number, and 2) adding a 16-bit field for a message type to the >> encrypted message string, immediately after the two timestamps. That >> second change allows you to add two new message types-- a probe message, >> and a message to report server addresses to the client. Have I got that >> more or less right? > > Yes. I describe it in details here (see fig. 3 and 7): > > http://www.pps.univ-paris-diderot.fr/~boutier/papiers/mpmosh.pdf
Thanks for reminding me about your paper, I'd forgotten about it. >> * Does the flow ID need to be in the sequence number? Is there any >> reason it couldn't be in a separate protobuf field? Along with the >> compatibility issues, this is a significant change to Mosh's >> cryptography. I'm no cryptographic expert, but reducing the namespace >> for sequence IDs and adding maybe-correlatable info to the sequence ID >> worries me. > > No. And I have no objection to separate fields: having a global > sequence number, and a seqno by flow. The message format will then > looks like: > > |<-- nonce -->| > [ dir & seqno | flowID | flow seqno | flags | timestamps | > > It will just take a few more bits, but I don't think that 64 bits more > is a big deal. Adding extra fields to the outer packet header leaves them unprotected by the encryption/authentication, and is probably not a good idea. (I think that's what you're suggesting here.) If the fields are in a protobuf, they might be less expensive, since protobuf compresses integers by stripping leading 0 or 1 bits. So, adding fields that carry small integers only costs two bytes for each field, and each field can be optional. >> * Does your message type need to be in the encrypted message string, but >> outside the protobuf message? In an ideal world, mosh would have used >> another level of protobuf serialization for the timestamps, and adding >> the message type wouldn't cause a compatibility problem. But mosh is in >> the same less-than-ideal world as the rest of us. > > I think so. If Chloé (the attacker) can change the message type, then > she just have to switch the addr bit. If the packet contains data, the > mosh network layer will try to get addresses (will "probably" fail), and > if the packet contains addresses... no clue, but I imagine that the > message will probably be considered as corrupted (does that kill the mosh > process, or just drop the packet?). I've thought a little more about this. My original idea was to add new elements to transportinstruction.proto: optional bytes probe; optional bytes addresses; Any Mosh message would be required to have one (and only one) of the diff, probe, or addresses elements. All of them would have mosh's encryption applied. In the case of the probe messages, the encrypted string would be empty-- but it would still be authenticated, and thus valuable for authenticating the entire message. This authentication eliminates your concern in the paper about malicious address-request packets, I think. I've spent a while trying to modify your branch to make it more compatible with existing versions of Mosh. Unfortunately, I've come to the conclusion that it will be difficult: The existing Mosh code really expects every incoming packet to have a valid Instruction protobuf, and every incoming Instruction affects the protocol state. The existing Mosh code throws exceptions on unexpected packet contents rather than discarding or ignoring them, making it difficult to vary from the existing packet format. The best idea I've come up with is for multipath-Mosh to inject a new flag or field into the Instruction protobuf, and for it to send packets otherwise exactly compatible with older versions of Mosh until it sees that flag/field from the other side. That would require significant change to your code to handle that behavior. >> * Your code makes the server responsible for discovering and reporting >> its listening addresses to the client. Can you explain why you chose >> this approach? > > Yes. First, the server should be able to know it's NATed addresses... > that's not implemented. Second, if you're at the office, and you open > a mosh session to your home server (or the converse), you will converge > with the addresses of your local network. Third, I like the idea to > be connected by link-local addresses: it's so often (not so much these > days) that I'm tweaking my OpenWRT configuration... and fail. The only > way is to connect via ll-addr. I have to disagree here. First, there are situations where the server does not and cannot know its NAT address: consider NAT-464 on your cellphone, for example. The second situation should be handled by having public and private DNS that corresponds to your public and private network. Link-local addresses *are* a plausible use case, but there are difficulties: Mosh must know which interface to use for link-local addresses and it may not know which one that is, unless an address report was delivered using that link local address >> My thinking is that the >> *client* should be responsible for discovering server addresses-- it >> should discover a list of possible server addresses at session start. > > This is (also) already done : > > At startup: > > AddrInfo ai( ip, port, &hints ); > > for ( struct addrinfo *ra_it = ai.res; ra_it != NULL; ra_it = > ra_it->ai_next ) { > if ( ra_it->ai_addr->sa_family == AF_INET || ra_it->ai_addr->sa_family == > AF_INET6 ) { > remote_addr.push_back( Addr( *ra_it->ai_addr, ra_it->ai_addrlen ) ); > } > } Running this loop is not currently useful in standard mosh, because the client is passed a numeric IPv4 or IPv6 address on the command line, and we will not get any more addresses. But I suspect you're not using the Perl script at all for mp-mosh, since you've not modified it at all. For multipath mosh, I think the Perl script will have to pass both the IP address that was chosen for the SSH connection, and the hostname. > Then, in check_flows(), we use "remote_addr". > > for ( std::vector< Addr >::iterator ra_it = remote_addr.begin(); > ra_it != remote_addr.end(); > ra_it++ ) { > new_flow( *la_it, *ra_it ); > } > >> Perhaps it should also update that list when it roams, because the >> client may roam from a public IPv4 network with public DNS records for >> the server to a private network with different routing and private DNS >> records with different addresses for the same server. If the server is >> behind NAT or an L3 proxy, then its local addresses might not be correct >> for either of these cases. > > This is not done! :-) > >> * Do we need to have a message type to describe the probe messages? >> Could we instead say that an empty TransportInstruction protobuf string >> represents a probe message? > > Good question again. What you propose should be possible. I feel that > it's bad, but no good reasons (flexibility, control…). > >> move some of it to session setup > > That's clearly my opinion. > >> or to optional protobuf fields that would offer better compatibility. > > How does it works? > >> I've recently done some work on the mosh script that I think might be >> relevant to multipath session setup, see >> <https://github.com/keithw/mosh/pull/653>. This collects server >> addresses in a way that might be useful for a multipath mosh >> implementation. I think clients may still need to do name lookups later >> to really fully handle roaming, though. (This isn't specific to your >> multipath work, though I think the need might be stronger there.) > > l. 250+: > - why are you connecting a "dummy" socket instead of directly trying to > do ssh? > - where is that socket close? > > (That's good at bootstrap.) An ssh host can be configured in .ssh/config to have a name or IP address that has no relation to the "host" that was given to ssh as an argument. In order to handle this, the mosh script configures ssh to use a proxy program to connect to the host, and report the IP address used. That proxy program is actually the Perl script itself, run as "mosh --fake-proxy". It's not actually fake, it is a real proxy, and must make a TCP connection to the remote SSH server. That's the code you're seeing there. >> Also have a look at my thoughts on implementing a multiplexing moshd >> server: >> <https://github.com/keithw/mosh/issues/295#issuecomment-109861097> This >> is not the same problem, but I think a similar solution for discovering >> unnamed flows might be appropriate here. > > Interesting. I consider using one daemon for probing and sharing the > probing results for multiple sessions. I think it will be great fun to > implement it, but I'm convinced it's clearly not urgent, and that the > overload probing trafic generated is negligible. But… who knows! I > like micro optimisation. ;-) I agree. We also need to think a bit about how a single daemon might affect security and information disclosure. > (And I understand your reasons to use another separate protocol.) > >> Also, do we need the full 5-tuple (ports, address, IPv4/6) to name >> flows? Can we safely use a 3-tuple of (client-port, client-address, >> IPv4/6) instead? > > Yes. Both the client and the server can be multihomed. > > > Ok, another point: I have around 20 patches, waiting for testing, > which gives to mpmosh the ability to evaluate loss ratio in both > directions, and tends to duplicate packets among paths to achieve > a given loss ratio. This extension to my extension [...] cut the > 16 bits flag field in two 8 bits fields: flag and outgoing-loss- > ratio. The flow ID is needed in this extension, to evaluate > correctly the loss ratio. I have two thoughts here: * Replication of packets will increase the bandwidth Mosh uses, obviously. * If the replicated packets share some or all of their network paths, the replication is effectively a more-aggressive congestion-avoidance algorithm. This doesn't matter much for mosh, which limits its bandwidth and doesn't really do congestion-avoidance, but it can be important with a TCP stack. > Now, I think that extending mosh should be done at session setup (by > the perl script). The session setup should be able to get the > supported extensions and options of the server, and choose among > then which ones to use. > > To get the server's options, I think that something like the attached > patch should be sufficient. We may be tempted to add a version for > each extension, but that's clearly not urgent, and no version is > equivalent to "an older one", so I prefer not using it. The option > list conversely is, IMHO, good to have. > > For the perl script... * Mosh is a small project with linear development. Maybe we can just detect extensions based on the Mosh version number, or a protocol version number. * Your patch adds a way to describe extensions that have command line options, but new features may take other forms. I also think that the extension types aren't necessary. Regards, --jh _______________________________________________ mosh-devel mailing list mosh-devel@mit.edu http://mailman.mit.edu/mailman/listinfo/mosh-devel