Re: [TYPES] Types in distributed systems

Ionuț G. Stan Tue, 02 Sep 2014 05:29:30 -0700

[ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]

Thank you very much for this short classification. It does, indeed, addsome order to my thoughts. I have added a few comments inline.


On 02/09/14 12:55, Andreas Rossberg wrote:

In addition to the good answers you already got, let me add a few 
clarifications perhaps addressing your question.

First, it is worth noting that “distributed” and “open” are quite different 
properties. A distributed program isn’t necessarily open, and an open program 
isn’t necessarily distributed. In terms of type system support, they manifest 
as very different problems. Openness typically necessitates some amount of 
runtime type checking (or more than that, depending on the possible complexity 
of the data and the degree of trust), to reconfirm static assumptions at 
runtime. Type systems for distribution are potentially concerned with notions 
like locality and mobility.

“Open” does not necessarily imply “upgradable”. A system with the ability to 
e.g. plug in modules at runtime, or communicate with statically unknown 
principals, is already open. In these scenarios, it should be sufficient to 
just perform a one-time check when a module is loaded or a connection is 
established. After that, it stays fixed. Upgrading adds a dimension of 
_mutability_ to the mix. It will typically be necessary to re-check each 
upgrade or reconnect.

Right. Part of the same stream of questions I had initially was wherewould the Java Virtual Machine be situated in this picture. A single JVMprocess isn't distributed, and yet, one may still encounter similarsituations to those that I imagined in original email, but due toclasspath conflicts this time. I'm pretty confident now that the JVM canbe classified as an open system.

Case in point. On the JVM, one may end up in a situation where twolibraries I'm depending upon, depend in turn on the same third library,*but* different versions of it. My code will compile just fine, becausetransitive dependencies aren't verified statically. At runtime, though,only one of those two versions of the third library will win and one ofthe possible outcomes is the dreaded java.lang.NoSuchMethodException.

Although I cannot articulate it precisely, I feel there's a gap between,e.g., having a runtime check for out of bounds array indeces, on onehand, and missing method definitions at runtime on the other hand. As alibrary author, if I'm calling a function or method from an externalmodule and my code compiles then I have a static *guarantee* that thatmethod will be there at runtime, isn't it?

Am I missing something or is this not a useful difference to make inpractice/theory?

An even more involved scenario is to allow upgrading with potentially 
_incompatible_ types. If that is desired, a single check is no longer possible. 
You need to check every single interaction, at least in those places where the 
types are incompatible (which you can define as “the provided type is not a 
subtype of the expected type”). This can be realised by having the language or 
runtime insert checked (higher-order) type coercions between incompatible 
interacting program parts, which will reject unexpected data as a runtime error 
(or take other measures).

The scenario you describe seems to be yet another case. IIUC, you want to 
upgrade _multiple_ components, in a way that is incompatible with the previous 
version, but upgrades _all_ involved parts, so that in the end, types are 
consistent again. Obviously, there is no real problem if the system supports 
atomic update of multiple components. But if it doesn’t, then this problem may 
turn out to be a variant of the previous one: the first update is incompatible 
initially, so a type checking layer must be inserted at the boundaries. 
However, when the second component is also updated in a consistent manner, the 
assumptions match again, and the checks can be eliminated. Such a scheme would 
also work with more than two components.

/Andreas

On Aug 30, 2014, at 22:44 , Ionuț G. Stan <[email protected]> wrote:

[ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]

Thank you all for the given pointers. Much more than I had initially 
anticipated. I skimmed over some of the mentioned papers and they all look 
great. This should keep me busy for a little while.


On 30/08/14 14:26, Philip Wadler wrote:

See the Acute system by Sewell and others.

http://www.cl.cam.ac.uk/~pes20/acute/index.html

-- P

.   \ Philip Wadler, Professor of Theoretical Computer Science
.   /\ School of Informatics, University of Edinburgh
.  /  \ http://homepages.inf.ed.ac.uk/wadler/


On 29 August 2014 03:24, "Ionuț G. Stan" <[email protected]
<mailto:[email protected]>> wrote:

    [ The Types Forum,
    http://lists.seas.upenn.edu/__mailman/listinfo/types-list
    <http://lists.seas.upenn.edu/mailman/listinfo/types-list> ]

    Hello,

    First, let me acknowledge upfront that I'm a complete dilettante
    when it comes to type theory (I'm still working through the first
    chapters of TaPL). That means that what I'm about to ask will be
    either stupid or extremely inexact. Nevertheless, I'm curious.

    Now, onto my question...

    The TL;DR version is: how does one specify types in a distributed
    programming model, like actors? And how much can we trust these types?

    I was debating today, on Twitter [1], how to type the sets of
    messages two actors can exchange; a producer (P) and a consumer (C).

    Let's say that the possible set of messages produced by P is M. We
    would like, by means of a type system, to know that C can handle: 1)
    just messages of type M and 2) all possible messages of type M.

    M could be represented by a sum type. Let's consider this particular
    one:

    M1 = A | B | C

    In a closed world this makes complete sense (to me, at least) and
    it's easy to verify statically. But in an open world setting, like a
    distributed system, where P and C are on different machines that may
    be upgraded separately, things look harder to me. You may guarantee
    statically that the two properties are met, but at runtime there may
    appear race conditions that violate the second property.

    For example, we deploy P1 and C1, both agreeing on M1. Next, we add
    a new variant to M1:

    M2 = A | B | C | D

    P1 and C1 are updated accordingly and we get P2 and C2, which we try
    to deploy, but a race condition appears and P2 send message D to C1.
    Obviously, C1 does not understand it. Even though the type system
    told us that C1 can handle all variants of M, it can't actually.

    A similar scenario appears when removing one of the variants of M1.

    Is there any typing approach to this kind of problem?

    It looks to me that types would have to include a version as well
    and all runtime communication between different versions should be
    prohibited by having versioned communication channels.

    Does anyone have any insights or pointers to articles, papers or
    books that discuss this sort of problem?

    Thank you for reading this!


    [1]: https://twitter.com/shajra/__status/504568858967953408
    <https://twitter.com/shajra/status/504568858967953408>

    --
    Ionuț G. Stan  | http://igstan.ro




The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



--
Ionuț G. Stan  |  http://igstan.ro



--
Ionuț G. Stan  |  http://igstan.ro

Re: [TYPES] Types in distributed systems

Reply via email to