Re: A different proposal for XO upgrade.

2007-06-27 Thread C. Scott Ananian
On 6/27/07, Christopher Blizzard <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-06-26 at 16:19 -0400, Mike C. Fletcher wrote:
> Kind of.  Last time I checked bittorrent just used a random guess to
> figure out who to try and get bits from.  i.e. you were just as likely
> to get your bits from india as you would from the guy on the next lan.

a) There are location-sensitive extensions to bittorrent.
b) If we used bittorrent, we'd use a primary torrent tracker on the
schoolserver, with a backup in cambridge (or using DHT).  That would
keep downloads local unless the schoolserver was down.
 --scott

-- 
 ( http://cscott.net/ )
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-27 Thread Christopher Blizzard
On Tue, 2007-06-26 at 16:19 -0400, Mike C. Fletcher wrote:
> 
> You've just described BitTorrent, if I'm not missing something.  It's 
> already designed to optimise the bandwidth used by talking to the
> people 
> who have the fastest connection to you, it automatically allows each 
> person who pulls down a unit to share it, it's fault tolerant, allows 
> the "lead" to drop out (doesn't actually have one), and generally is 
> robust as all get out.  It also has a file-format that allows for a 
> downloading just that fragment you need (i.e. just one file, or just
> one 
> block, or whatever).
> 

Kind of.  Last time I checked bittorrent just used a random guess to
figure out who to try and get bits from.  i.e. you were just as likely
to get your bits from india as you would from the guy on the next lan.
We're talking about something here that's different than that and
specifically uses the high bandwith of the local wlan vs. the low
bandwith of upstream connections.  i.e. there's a sense of locality.  bt
doesn't really have that unless they have changed the way that they do
it.

--Chris

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread Mitch Bradley
C. Scott Ananian wrote:
> On 6/26/07, Mike C. Fletcher <[EMAIL PROTECTED]> wrote:
>   
>> You've just described BitTorrent, if I'm not missing something.  It's
>> 
>
> Hmm, that is interesting.  I hadn't made the connection.  I don't
> think bittorrent supports symlinks, hardlinks, permissions, ownership,
> xattrs, or ACLs, though, which would make its use strictly
> experimental for now.  But it would be something interesting to look
> at post-FRS.  Certainly we should take a hard look before rolling a
> new CDS from scratch; from the specification at
> http://www.bittorrent.org/protocol.html it seems like the missing
> filesystem metadata bits wouldn't be too hard to add.
>  --scott
>
>   
Distribute tar files over bittorrent?

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread C. Scott Ananian
On 6/26/07, Mike C. Fletcher <[EMAIL PROTECTED]> wrote:
> You've just described BitTorrent, if I'm not missing something.  It's

Hmm, that is interesting.  I hadn't made the connection.  I don't
think bittorrent supports symlinks, hardlinks, permissions, ownership,
xattrs, or ACLs, though, which would make its use strictly
experimental for now.  But it would be something interesting to look
at post-FRS.  Certainly we should take a hard look before rolling a
new CDS from scratch; from the specification at
http://www.bittorrent.org/protocol.html it seems like the missing
filesystem metadata bits wouldn't be too hard to add.
 --scott

-- 
 ( http://cscott.net/ )
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread Mike C. Fletcher
Christopher Blizzard wrote:
...
> I like the system that scott proposed for how often we should look at
> updates and the idea of a "lead" to say "hey, I'm getting this update so
> don't look."  (It sounds strangely familiar to an idea that I shared
> with scott over lunch a month or so ago so of course I like it!) We also
> might consider setting up other clients to start collecting the update
> before it's completely finished downloading to start spreading around
> the bandwidth over a longer period of time and making the entire process
> more fault tolerant and bandwidth efficient on the local lan.  i.e.
> someone else could pick up the rest of the update if the lead machine
> vanishes.  Also, two machines might be able to get the update faster
> than one, just due to latency over a lot of the links we're talking
> about using.
>   
You've just described BitTorrent, if I'm not missing something.  It's 
already designed to optimise the bandwidth used by talking to the people 
who have the fastest connection to you, it automatically allows each 
person who pulls down a unit to share it, it's fault tolerant, allows 
the "lead" to drop out (doesn't actually have one), and generally is 
robust as all get out.  It also has a file-format that allows for a 
downloading just that fragment you need (i.e. just one file, or just one 
block, or whatever).

Just a thought,
Mike

-- 

  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread C. Scott Ananian
On 6/26/07, Christopher Blizzard <[EMAIL PROTECTED]> wrote:
> (I know that some Red Hat
> guys did something like this for a customer where an entire bank's set
> of terminals could be completely re-imaged after a power failure in 20
> seconds using a mutlicast-rsync setup.  I should see more about that.)

That does sound very interesting.  Please do.

> I like the system that scott proposed for how often we should look at
> updates and the idea of a "lead" to say "hey, I'm getting this update so
> don't look."  (It sounds strangely familiar to an idea that I shared
> with scott over lunch a month or so ago so of course I like it!)

I also like it, but this is a situation where the best is the enemy of
the good-enough. =)  We can revisit broadcast updates,
multicast-rsync, and all sorts of fancy goodness once the basic system
is working -- and Ivan's proposal seems most likely to get that robust
basic system up quickly.
  --scott

-- 
 ( http://cscott.net/ )
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread Christopher Blizzard
Just some comments on this thread.

It seems odd to try to optimize the bandwith on the actual local lan as
we have a decent amount of bandwith to work with.  The only use case
that I can come up with to support that is during unboxing and/or a mass
re-install.  And I don't think that we're ready to try and support that
with this first iteration.  It's a distribution method and probably
something that we can add after the fact.  (I know that some Red Hat
guys did something like this for a customer where an entire bank's set
of terminals could be completely re-imaged after a power failure in 20
seconds using a mutlicast-rsync setup.  I should see more about that.)
As long as the formats that we pick support something like this we
should be pretty safe for now.

The case where we do worry about bandwidth is the client <-> main
updates server setup.  That's where the bandwidth is likely very slow
and expensive and looking at using a diff-style system is worth the
investment given our install base.  I think Alex is looking into this
now.

I like the system that scott proposed for how often we should look at
updates and the idea of a "lead" to say "hey, I'm getting this update so
don't look."  (It sounds strangely familiar to an idea that I shared
with scott over lunch a month or so ago so of course I like it!) We also
might consider setting up other clients to start collecting the update
before it's completely finished downloading to start spreading around
the bandwidth over a longer period of time and making the entire process
more fault tolerant and bandwidth efficient on the local lan.  i.e.
someone else could pick up the rest of the update if the lead machine
vanishes.  Also, two machines might be able to get the update faster
than one, just due to latency over a lot of the links we're talking
about using.

--Chris

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread Alexander Larsson
On Tue, 2007-06-26 at 12:26 +0200, Alexander Larsson wrote:
> My distribution model instead is more distributed. The idea is that
> while full mesh network traffic is spotty and slow the local wlan
> connections have no reason to be much worse than any other wlan
> connections (including ours). So, to download a full file from a
> neighbour isn't all that slow. Nor does it affect the general
> performance of the mesh. 

This seems to be a bit unclear. What i mean is that while general mesh
traversal is slow, if you're 1 hop away from someone, i.e. your two
laptops have a radio connection, things are likely to be quite fast.
And, communication between these two laptops doesn't affect other people
on the mesh (much).

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: A different proposal for XO upgrade.

2007-06-26 Thread Alexander Larsson
On Mon, 2007-06-25 at 17:51 -0400, C. Scott Ananian wrote:
> As food for discussion, here's a counter proposal for an XO upgrade
> mechanism, focusing on the network side.
> -
> Design goals:
>  - minimize round-trips necessary for successful upgrade
>  - minimize size of upgrades

(I'm using this reply as a general reply for the whole update thread
instead of replying to each mail.)

First, I must say that the numbers you posted for psdiff look pretty
nice for smaller updates like security problems. (I'm not sure they will
help as much for larger updates though.) It sounds interesting to try to
use this if possible.

The original discussion about diffs and updates talked about storing the
original "gold" image on all laptops, and then sending each update as a
diff from this image. This would mean that the diffs would keep growing
in size, being larger to send and we'd need to use more and more space
for the original image data not used anymore. However, your proposal
instead seems to be that all diffs are incremental (and availible as
reverse diffs for incremental reversal). This keeps each update small,
but instead you need to get all the updates and apply them in order.

There is a general difference in how data is distributed in these two
proposals. In your proposal each machine polls and downloads from a
central server. This relies on the mesh (and ultimately a connection to
the internet) for downloading the data. This makes it very sensitive to
the performance of the mesh network. To make this better it uses
multicast to avoid the same data being sent over the mesh to many times.

My distribution model instead is more distributed. The idea is that
while full mesh network traffic is spotty and slow the local wlan
connections have no reason to be much worse than any other wlan
connections (including ours). So, to download a full file from a
neighbour isn't all that slow. Nor does it affect the general
performance of the mesh. 

Instead of all machines downloading the update from the server via the
mesh (or multicast) upgrades happen in stages. The first stage is the
school server getting the new version and announching it. Then all
machines that are close to the server will find the upgrade and update.
Then machines close to these laptops will upgrade from the laptops and
then the update will propagate outwards one machine at a time. (Things
are also dynamic, as the machines move around.) The spread is
"exponential" as the number of machines with the update availible
increases, and at no time are we sending data over the mesh.

This distributed system makes it possible for laptops to get upgraded
that doesn't have a connection that reaches the school server, as long
as another updated laptop happens to at some time be in the
neighborhood. And it also is possible to update such a laptop to any
version, without having to store all the diffs between the two versions
in the update (i.e. if the laptop hasn't had connection to the net in a
while and don't have a recent update).

I must say that the multicasting portion of your proposal sounds risky
to me. It seems like it is pretty easy for there to be multiple Upgrade
Leaders for the same version, both over time and in parallel, (for
instance due to missed packages, laptops turned on at different times,
etc). Each of these sending multicast messages on the mesh as they
upgrade sound like it could easily fill the mesh with a lot of multicast
traffic.

When you say "multicast", how far do you mean these packages would be
sent? I guess on a mesh network multicast packages are limted by some
kind of hop limit, to avoid loops if nothing else? Some more info on how
multicast works on the mesh would be nice.

Also, i keep seeing references to vserver, but i still hasn't seen any
way this could be used to update the system image. In the case of your
proposal this doesn't change much though. For instance, if we used jffs2
filesystem transactions instead the only difference is that applying the
diffs as they come might not be a good idea, instead we keep them and
apply them all in a transaction when we're ready.

So, to sum it up, my approach certainly uses more bits on the network
for data transfer, but I don't think that is a large issue, since all
(most?) transfers are local, and thus should get good performance. It is
also less dependent on direct connections to the internet (or the school
server) and doesn't cause mesh network traffic affecting bandwidth for
other people. Its also, imho vastly simpler to implement, test, and
deploy.

However, it certainly would be nice if we could use the binary diff
approach in a distributed system too. A simple extension for this would
be to do diffs on the school server. We could easily store diffs between
"consecutive" blobs and manifests (say as
http://server/blobs/{first-sha1}-{second-sha1}), and the client could
detect that the server has both the version it already has and the
target version and request a diff be