[ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
Thank you very much for this short classification. It does, indeed, add
some order to my thoughts. I have added a few comments inline.
On 02/09/14 12:55, Andreas Rossberg wrote:
In addition to the good answers you already got, let me add a few
clarifications perhaps addressing your question.
First, it is worth noting that “distributed” and “open” are quite different
properties. A distributed program isn’t necessarily open, and an open program
isn’t necessarily distributed. In terms of type system support, they manifest
as very different problems. Openness typically necessitates some amount of
runtime type checking (or more than that, depending on the possible complexity
of the data and the degree of trust), to reconfirm static assumptions at
runtime. Type systems for distribution are potentially concerned with notions
like locality and mobility.
“Open” does not necessarily imply “upgradable”. A system with the ability to
e.g. plug in modules at runtime, or communicate with statically unknown
principals, is already open. In these scenarios, it should be sufficient to
just perform a one-time check when a module is loaded or a connection is
established. After that, it stays fixed. Upgrading adds a dimension of
_mutability_ to the mix. It will typically be necessary to re-check each
upgrade or reconnect.
Right. Part of the same stream of questions I had initially was where
would the Java Virtual Machine be situated in this picture. A single JVM
process isn't distributed, and yet, one may still encounter similar
situations to those that I imagined in original email, but due to
classpath conflicts this time. I'm pretty confident now that the JVM can
be classified as an open system.
Case in point. On the JVM, one may end up in a situation where two
libraries I'm depending upon, depend in turn on the same third library,
*but* different versions of it. My code will compile just fine, because
transitive dependencies aren't verified statically. At runtime, though,
only one of those two versions of the third library will win and one of
the possible outcomes is the dreaded java.lang.NoSuchMethodException.
Although I cannot articulate it precisely, I feel there's a gap between,
e.g., having a runtime check for out of bounds array indeces, on one
hand, and missing method definitions at runtime on the other hand. As a
library author, if I'm calling a function or method from an external
module and my code compiles then I have a static *guarantee* that that
method will be there at runtime, isn't it?
Am I missing something or is this not a useful difference to make in
practice/theory?
An even more involved scenario is to allow upgrading with potentially
_incompatible_ types. If that is desired, a single check is no longer possible.
You need to check every single interaction, at least in those places where the
types are incompatible (which you can define as “the provided type is not a
subtype of the expected type”). This can be realised by having the language or
runtime insert checked (higher-order) type coercions between incompatible
interacting program parts, which will reject unexpected data as a runtime error
(or take other measures).
The scenario you describe seems to be yet another case. IIUC, you want to
upgrade _multiple_ components, in a way that is incompatible with the previous
version, but upgrades _all_ involved parts, so that in the end, types are
consistent again. Obviously, there is no real problem if the system supports
atomic update of multiple components. But if it doesn’t, then this problem may
turn out to be a variant of the previous one: the first update is incompatible
initially, so a type checking layer must be inserted at the boundaries.
However, when the second component is also updated in a consistent manner, the
assumptions match again, and the checks can be eliminated. Such a scheme would
also work with more than two components.
/Andreas
On Aug 30, 2014, at 22:44 , Ionuț G. Stan <[email protected]> wrote:
[ The Types Forum, http://lists.seas.upenn.edu/mailman/listinfo/types-list ]
Thank you all for the given pointers. Much more than I had initially
anticipated. I skimmed over some of the mentioned papers and they all look
great. This should keep me busy for a little while.
On 30/08/14 14:26, Philip Wadler wrote:
See the Acute system by Sewell and others.
http://www.cl.cam.ac.uk/~pes20/acute/index.html
-- P
. \ Philip Wadler, Professor of Theoretical Computer Science
. /\ School of Informatics, University of Edinburgh
. / \ http://homepages.inf.ed.ac.uk/wadler/
On 29 August 2014 03:24, "Ionuț G. Stan" <[email protected]
<mailto:[email protected]>> wrote:
[ The Types Forum,
http://lists.seas.upenn.edu/__mailman/listinfo/types-list
<http://lists.seas.upenn.edu/mailman/listinfo/types-list> ]
Hello,
First, let me acknowledge upfront that I'm a complete dilettante
when it comes to type theory (I'm still working through the first
chapters of TaPL). That means that what I'm about to ask will be
either stupid or extremely inexact. Nevertheless, I'm curious.
Now, onto my question...
The TL;DR version is: how does one specify types in a distributed
programming model, like actors? And how much can we trust these types?
I was debating today, on Twitter [1], how to type the sets of
messages two actors can exchange; a producer (P) and a consumer (C).
Let's say that the possible set of messages produced by P is M. We
would like, by means of a type system, to know that C can handle: 1)
just messages of type M and 2) all possible messages of type M.
M could be represented by a sum type. Let's consider this particular
one:
M1 = A | B | C
In a closed world this makes complete sense (to me, at least) and
it's easy to verify statically. But in an open world setting, like a
distributed system, where P and C are on different machines that may
be upgraded separately, things look harder to me. You may guarantee
statically that the two properties are met, but at runtime there may
appear race conditions that violate the second property.
For example, we deploy P1 and C1, both agreeing on M1. Next, we add
a new variant to M1:
M2 = A | B | C | D
P1 and C1 are updated accordingly and we get P2 and C2, which we try
to deploy, but a race condition appears and P2 send message D to C1.
Obviously, C1 does not understand it. Even though the type system
told us that C1 can handle all variants of M, it can't actually.
A similar scenario appears when removing one of the variants of M1.
Is there any typing approach to this kind of problem?
It looks to me that types would have to include a version as well
and all runtime communication between different versions should be
prohibited by having versioned communication channels.
Does anyone have any insights or pointers to articles, papers or
books that discuss this sort of problem?
Thank you for reading this!
[1]: https://twitter.com/shajra/__status/504568858967953408
<https://twitter.com/shajra/status/504568858967953408>
--
Ionuț G. Stan | http://igstan.ro
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Ionuț G. Stan | http://igstan.ro
--
Ionuț G. Stan | http://igstan.ro