As food for discussion, here's a counter proposal for an XO upgrade mechanism, focusing on the network side. --------- Design goals: - minimize round-trips necessary for successful upgrade - minimize size of upgrades
Software versions are assigned sequential integers. Version 1, 2, etc. We assume that every machine should discover an available upgrade within a day, on average. The constants below can be tweaked to adjust this interval. Upgrades consist of a number of messages U_0 through U_N with a maximum size Umax. The maximum size is chosen either so that a) each U_i fits in a single network packet, or b) each U_i can fits into a reserved storage area on the XO (to allow it to be shared with neighbors). Upgrade messages are independent. It may be worthwhile for upgrade messages to be applied in any order, but for simplicity we'll constrain the order of application to be sequential. U_i is authenticated and self-labeling: it contains a signature as well as a field specifying the version # of the upgrade as well as i, the sequence number of the upgrade message. Part 1: discovering an upgrade. Every hour, the XO looks to see how many mesh neighbors it has, counting itself. Call this number N. N is always at least 1. It then generates a random number in [0, N*24). If the number is 0, does a DNS lookup on update.laptop.org to check for a new version. (Note that the DNS lookup may be cached by the mesh portal or school server.) XOs which have version V of the base system also listen to the multicast address xxxx:yyyy:zzzz::<V+1> Part 2: announcing the upgrade. If you are the person to discover a new version V', you are the Upgrade Leader. While the version on your laptop (V) is less than V' do the following: a) Set i=0. Stop listening to the multicast address. b) download U_i for (V+1) via http from update.laptop.org (again, these queries are often cached by a school server). Apply it to our local copy (using COW techniques to prevent the application from being globally visible until the upgrade is complete) c) Broadcast U_i to the multicast address xxxx:yyyy:zzzz::<V+1>. d) If U_i is not the last upgrade message, increment i, and repeat from step b) e) Otherwise, atomically make our updates visible to the system, increment V, and restart part 2 if necessary. f) Start listening to the multicast address xxxx:yyyy:zzzz::<V'+1> Part 3: casual bystanders. If you hear a piece of the upgrade U_i on the multicast address xxxx:yyyy:zzzz::<V+1>: a) If there is a piece U_j between U_0 and U_i which you have not already heard, ignore this packet. b) Otherwise, apply U_i and (re)set the Upgrade Timer to expire some random time in the future (some 10s of minutes). c) If U_i was the last upgrade message, atomically make the updates visible, delete the Upgrade timer, increment our local version, and rebind our multicast listening address to xxxx:yyyy:zzzz::<V+1> for our new version V. d) If the Upgrade Timer expires, become a new Upgrade Leader: set i = the smallest i such that we don't have U_i, and jump to Part 2 step b. ---------------------- Structure of upgrade messages: <Version # from> <Version # to> <Sequence #> <final message flag?> <msg length> 1 or more of: <filename to patch> <pre-patch hash> <post-patch hash> <bsdiff format diff> <signature> We might keep a git-style manifest for system, patching this like any other file. This can be used like fsck to sanity-check/verify the end-result. Further, although the 'from' version and 'to' version are separated by exactly 1 in typical usage, a sequence of upgrade messages can be used to perform any desired amount of up/down grade. I would propose keeping 'up-by-one' and 'down-by-one' upgrade message sequences on update.laptop.org, so that the user can downgrade to any chosen revision. We might either store 'skip-by-N' sequences on the server (for N=1,2,4,8,16,32,...) or generate upgrade messages on the fly if we want faster/further upgrades/downgrades. --------------------- Notes: a) the school server can trigger updates on all machines by becoming an Upgrade Leader and broadcasting bits. b) Upgrades are distributed very efficiently over the air if mesh broadcast is working (ie, not too many nodes drop packets). It could be even more efficient if we were allowed to cache entire upgrade message sequences on the XO and/or apply upgrade messages out-of-order. c) The Upgrade Timer timeout should probably depend on the number of messages in the upgrade and the # of the message you lost. d) Although I've described the process above as an automatic upgrade, once the sequence of upgrades has been applied, the actual atomic swap of current-tree to upgraded-tree could be deferred until some point in the future. e) Limiting the size of the upgrade messages U_i is primarily done by limiting the # of files patched by a particular message. However, we can also split bsdiff patches: bsdiff messages consist of a control block specifying a # of operations to perform on the input file to create the output file. We can split the control block into pieces to create multiple smaller bsdiffs from a monolithic diff. This is where it would be tricky to allow upgrade messages to be applied in any order. f) If we distribute any gzipped files (do we?) we should be sure to use the --rsyncable option to gzip to ensure efficient binary diffs. -- ( http://cscott.net/ ) _______________________________________________ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel