Re: [Server-devel] Schoolserver development in Uruguay

2010-08-24 Thread Martin Langhoff
Hi Bernie, list.

Apologies for the latency! Thanks for the work you're doing in Uy, and
for the thoughtful email. Some notes below...


On Thu, Aug 19, 2010 at 8:25 PM, Bernie Innocenti ber...@codewiz.org wrote:
 == Debian vs Fedora ==

I have spoken with the Ceibal team several times. My recommendation
was that they look into repackaging, but that some aspects would be
hard to repackage. I am of course happy to include a debian dir in
the git repo.

I've also repeated this several times on this list.  I'm kind of
surprised this isn't mentioned at all. Anyone who's been in
server-devel would know anyway ;-)

[ So yes, your work and patches in this direction are very welcome! ]

I have also recommended that Ceibal look closely at a running XS to
see how it works, how it all fits together.

 == Jabber ==

 There are two people working on Jabber. They have been using ejabberd
 and, quite surprisingly, they've not seen any issues of high CPU load
 and database corruption. Tomorrow I'll get to work more with them.

If you have gathered more info, please share some if possible (in a
separate thread?).

Overall, there are a number of reasons that hint at Uy _not_ actually
using Jabber services much. The ejabberd release they are using...
just doesn't work or scale very well.

 recommended by Collabora. My hacker senses are telling me that switching
 from Erlang to Lua is a small step in the direction of sanity and
 simplicity.

My reaction to this line is unprintable.

At this moment, I will need very strong proof that another XMPP daemon
is more solid, significantly less resource hungry, and easily
hackable. Erlang is hard to hack but when it runs, OMG it runs. And
we've hacked the bit needed.

 == Backups ==

 This is a black hole in all deployments I visited.

Server backups, I assume? Let's break this into a separate thread if
you want. Much of it should not be backed up, or benefits from
aggressive hardlinking.

 The feasibility of remote backups varies depending on how much we care
 to back up. In Paraguay, it was decided that the journal backups are to
 be considered valuable if we are to instill the idea in teachers that
 the laptop is the same as a notebook with homework on it.

I like that.

 Journal backups, however, amount to a whopping 238GB of rapidly
 changing, mostly uncompressible and undeltable data.

Does aggressive hardlinking across users help? (Do users have big
journal entries that are related to resources many of them download?)
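
As a minimal sketch of what hardlinking across users could mean in practice, assuming byte-identical files are found by content hash and that all user directories sit on one filesystem. The function name and the directory layout are hypothetical; this is not how the XS backup code works.

```python
import hashlib
import os


def _digest(path):
    """SHA-256 of a file, read in chunks so large entries fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace byte-identical regular files under root with hardlinks.

    Returns the number of bytes that are no longer stored twice.
    Hypothetical helper: assumes a single filesystem and that no file
    changes while it runs.
    """
    first_seen = {}  # content digest -> path of the first copy found
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            keeper = first_seen.setdefault(_digest(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already hardlinked to it
            size = os.path.getsize(path)
            tmp = path + ".hardlink-tmp"
            os.link(keeper, tmp)   # make the new link next to the duplicate
            os.replace(tmp, path)  # then atomically swap it into place
            saved += size
    return saved
```

Note that hardlinked files share one inode, so owner, permissions and timestamps become common to every user holding a link; whether that is acceptable for journal data is an open question.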

 Yesterday Daniel Castelo and I discussed the idea of performing
 cross-backups between nearby schools. This solution would probably work
 well in terms of bandwidth distribution, but it would bring some
 logistic complexity. Probably an acceptable trade-off.

That probably gives you the worst case for network performance and usage.

Assuming the usual asymmetric bw caps, you transfer _double_ the
traffic over bw-limited links just to run the crossed backups.

And the restore case gets bogged down (perhaps to an unusable point)
by the upload limit on the other server.

Backup (using rsync) to an upstream server makes a lot more sense.

 == Content management ==
(...)
 Oddly, I've not yet met anyone using Moodle. When I ask why, I always
 hear some vague comment about it being designed for higher education.

And they are right. I _really_ need help to simplify its usage. Lots.

Everything other than Moodle (and I've used a very wide range of
related tools) is... just not fit for this role.

 After they have functioning backups, Uruguay would like to provide a
 wiki.

 - Moodle has a wiki (and a much improved wiki in the next release).
This wiki is 'course' or 'group' oriented.

 - MediaWiki can be easily installed sharing magic login credentials
with Moodle.

 I have a dream that one day each school will evaluate and choose their
 favorite tools autonomously...

More unprintable words follow.

Teachers are overwhelmed. The technical team is overwhelmed by a large
deployment. I am not sending anyone to evaluate (and then integrate!)
random software bits, especially when they usually have no experience
doing that.

The goal of XS is to provide a well-chosen, well integrated set of
tools. (If a deployment has the expertise, drive and _time_ to pick
and integrate something else, bravo! Most deployments don't have
them).

 == Server management tools ==

 Paraguay uses Puppet. We're very happy with it.
 Uruguay uses CFengine. They seem to be very happy with it as well.

Puppet is on its way to be the recommended tool to manage a herd of XSs.

Puppet is -- AFAIK -- a lot more tolerant of bad connectivity, and my
intention is to add a sneakernet mechanism for config updates.

 But no distro advocacy, please... they're all good, ok? :-)

I love/hate them all :-)

Official XS will follow the F/RH lineage, but more are welcome.

cheers,



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny 

Re: [Server-devel] Schoolserver development in Uruguay

2010-08-20 Thread Tony Anderson
Hi,

I must confess that I am not current on this list. However, I'll toss in 
two cents.

First, I would like to see a build process that starts with a generic 
(Fedora) LAMP system. There should then be a build script that creates a 
schoolserver on this system. The current process (as used in Nepal) 
starts with the 0.6 image which locks it to Fedora 9. A deployment could 
then build their own script to add (or remove) services.

Second, the idea of the DataManager activity is to replace the current
backup scheme by one controlled by the students. DataManager shows a
Journal-like listing of all the journal items on the server and on the
student's XO. If the item is on both, it is shown in Blue. If it is only on
the server, it is shown in Cyan. The student can delete a Blue item
(leaving it only on the server) or can click on a Cyan item causing it
to be downloaded to the XO. A 'fuel gauge' shows how much of the NAND is
free as a guide on whether to delete some local items. All newly created
items are copied to the schoolserver - provided they have an associated
data file; otherwise they are deleted. This policy is based on my
reading of the code that journal items which do not have a file are not
'resumed' (in 0.82). The 0.82 scheme gives the student no way to avoid
filling the NAND or to control what gets saved or discarded (e.g. if
he/she deletes a Journal item on the XO, it will be deleted on the
backup as well). One additional advantage is that the DataManager
supports a 'commons' folder on the schoolserver which acts as a Journal 
store but whose items are available to all XOs. Currently, the commons 
folder contains a copy of the Sugar activities. This way students have 
access to all of them and can decide which they want to have local. If 
one is removed, it can be downloaded again. If the local store is lost 
(e.g. by the student changing XOs resulting from a hardware failure), 
all of the journal items are still accessible via the DataManager (given 
suitable update of the schoolserver to reflect the new serial-number).
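
The Blue/Cyan rules above can be sketched roughly as follows. This is only an illustration of the listing logic as described, with journal items reduced to ids held in two sets; the function names and data representation are hypothetical and not taken from the DataManager source.

```python
def classify_items(on_xo, on_server):
    """Map each item id on the server to the colour it would be listed in.

    on_xo and on_server are sets of item ids (an assumed representation).
    Blue: on both the XO and the server. Cyan: only on the server.
    """
    return {item: ("blue" if item in on_xo else "cyan") for item in on_server}


def delete_local(item, on_xo, on_server):
    """Delete a Blue item from the XO; the server copy stays, so it turns Cyan."""
    if item not in on_xo or item not in on_server:
        raise ValueError("only Blue items (present on both sides) can be deleted")
    on_xo.discard(item)


def download(item, on_xo, on_server):
    """Click on a Cyan item: copy it back to the XO, so it turns Blue."""
    if item not in on_server:
        raise ValueError("item is not on the server")
    on_xo.add(item)
```

For example, with `on_xo = {"a"}` and `on_server = {"a", "c"}`, item `a` is listed Blue and `c` Cyan; after `delete_local("a", on_xo, on_server)` both are Cyan.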

Tony
___
Server-devel mailing list
Server-devel@lists.laptop.org
http://lists.laptop.org/listinfo/server-devel


[Server-devel] Schoolserver development in Uruguay

2010-08-19 Thread Bernie Innocenti
I'm currently at Plan Ceibal. As you may know, Uruguay developed its own
schoolserver based on Debian, running software developed in-house and
managed with CFengine. Yesterday we briefly discussed their future plans
for the school server.


== Debian vs Fedora ==

First of all, there's no way they're going to reinstall 2500
schoolservers with Fedora or even a newer release of Debian. Online
upgrades would be possible, though.

There's some interest in repackaging the datastore backup server and
other components of the OLPC XS for Debian. This work could be
contributed back to you or whoever will become the next schoolserver
architect.

Perhaps we could get one of the Debian maintainers in our community to
get these packages accepted. I could do the same for Fedora.

As you said, recommending or supporting multiple schoolserver
configurations in parallel doesn't make sense, but it wouldn't hurt if
some of the underlying components were shared horizontally, especially
for the configurations that are already widely deployed.


== Jabber ==

There are two people working on Jabber. They have been using ejabberd
and, quite surprisingly, they've not seen any issues of high CPU load
and database corruption. Tomorrow I'll get to work more with them.

I still had no time to review Prosody, the Jabber implementation
recommended by Collabora. My hacker senses are telling me that switching
from Erlang to Lua is a small step in the direction of sanity and
simplicity.

The Sugar Labs Infrastructure Team has set up a new dedicated VM for
collaboration, but at this time nobody has been working on it. It's an
Ubuntu Lucid machine, but we could reinstall it if needed.

Tomeu and Collabora overhauled the collaboration stack in Sugar 0.90
and seem to have plans to further evolve it. They should be consulted
prior to making any long-term decision on the server side.


== Backups ==

This is a black hole in all deployments I visited.

Redundant storage is too expensive. One cheap 500GB hard-drive is
typical. In one year, 3 of the 10 schoolservers in Caacupé developed a
hard drive failure.

Losing all data is sadly the status quo in both Uruguay and Paraguay. I
worked on implementing remote backups for a subset of /library using
rsync, but 2Mbit per school and 70Mbit on the backup server are
insufficient for the initial sync and probably also for nightly updates.

What numbers are we talking about, in terms of size? Here are some
numbers from an actual school which has been operating for over one year
with 530 registered laptops:

 262M   backup
 19G    cache
 3.4M   games
 1.7M   orug
 62M    pgsql-xs
 67M    uploads
 238G   users
 20K    webcontenido
 17M    xs-activation
 516M   xs-activity-server
 827M   xs-rsync
 2.7G   zope-var

The feasibility of remote backups varies depending on how much we care
to back up. In Paraguay, it was decided that the journal backups are to
be considered valuable if we are to instill the idea in teachers that
the laptop is the same as a notebook with homework on it.

Journal backups, however, amount to a whopping 238GB of rapidly
changing, mostly uncompressible and undeltable data. Not quite the ideal
case for an incremental backup. With today's available resources, we
could afford to back up everything *but* the journals.

Yesterday Daniel Castelo and I discussed the idea of performing
cross-backups between nearby schools. This solution would probably work
well in terms of bandwidth distribution, but it would bring some
logistic complexity. Probably an acceptable trade-off.


== Content management ==

Paraguay seems quite happy with Plone, but frankly I can't understand
why. Teachers heavily use a really simple php tool called PAFM, which
provides basic hierarchical file management with no access control or
versioning.

Oddly, I've not yet met anyone using Moodle. When I ask why, I always
hear some vague comment about it being designed for higher education.
Same goes for Schooltool. These more structured tools probably present
a steeper learning curve and are a bad fit for the unsophisticated
requirements of users who are being exposed to information systems for
the first time.

After they have functioning backups, Uruguay would like to provide a
wiki. They have already looked at Dokuwiki, with which I'm not familiar.
It seems to have a readable and easy to learn Creole-like syntax. I
would personally recommend going either for the simplest possible wiki
in this category, or straight to MediaWiki--the most feature-complete
out there.

Any mid-size solution such as MoinMoin is likely to bring the worst of
both worlds. Having written my own minimalist wiki, perhaps I'm slightly
biased on this topic. Just slightly, yeah :-) 

Seriously, the choice of wiki would depend on what other tools would
complement it. If you already have Moodle or Schooltool, you probably
need just a basic wiki for taking notes on the side. With MediaWiki, one
would probably install a bunch of powerful extensions to build

Re: [Server-devel] Schoolserver development in Uruguay

2010-08-19 Thread SgtPepper
Bernie, Guys:
A few of my ideas are below:

2010/8/19 Bernie Innocenti ber...@codewiz.org

 I'm currently at Plan Ceibal. As you may know, Uruguay developed its own
 schoolserver based on Debian, running software developed in-house and
 managed with CFengine. Yesterday we briefly discussed their future plans
 for the school server.


 == Debian vs Fedora ==

 First of all, there's no way they're going to reinstall 2500
 schoolservers with Fedora or even a newer release of Debian. Online
 upgrades would be possible, though.

 There's some interest in repackaging the datastore backup server and
 other components of the OLPC XS for Debian. This work could be
 contributed back to you or whoever will become the next schoolserver
 architect.

Whichever the distro, it should be easily maintainable and deployable.
I've no problem building the Debian packages of the XS components. I think
it's justified, since there is a base of 2500 servers. Let me tell you
that I've only worked with rpm before, but I've no problem learning the
Debian guidelines, and maybe trying to push the packages into Debian
testing.


 Perhaps we could get one of the Debian maintainers in our community to
 get these packages accepted. I could do the same for Fedora.

 As you said, recommending or supporting multiple schoolserver
 configurations in parallel doesn't make sense, but it wouldn't hurt if
 some of the underlying components were shared horizontally, especially
 for the configurations that are already widely deployed.


 == Jabber ==

 There are two people working on Jabber. They have been using ejabberd
 and, quite surprisingly, they've not seen any issues of high CPU load
 and database corruption. Tomorrow I'll get to work more with them.

 I still had no time to review Prosody, the Jabber implementation
 recommended by Collabora. My hacker senses are telling me that switching
 from Erlang to Lua is a small step in the direction of sanity and
 simplicity.

 The Sugar Labs Infrastructure Team has set up a new dedicated VM for
 collaboration, but at this time nobody has been working on it. It's an
 Ubuntu Lucid machine, but we could reinstall it if needed.

 Tomeu and Collabora overhauled the collaboration stack in Sugar 0.90
 and seem to have plans to further evolve it. They should be consulted
 prior to making any long-term decision on the server side.


 == Backups ==

 This is a black hole in all deployments I visited.

 Redundant storage is too expensive. One cheap 500GB hard-drive is
 typical. In one year, 3 of the 10 schoolservers in Caacupé developed a
 hard drive failure.

 Losing all data is sadly the status quo in both Uruguay and Paraguay. I
 worked on implementing remote backups for a subset of /library using
 rsync, but 2Mbit per school and 70Mbit on the backup server are
 insufficient for the initial sync and probably also for nightly updates.

 What numbers are we talking about, in terms of size? Here are some
 numbers from an actual school which has been operating for over one year
 with 530 registered laptops:

  262M   backup
  19G    cache
  3.4M   games
  1.7M   orug
  62M    pgsql-xs
  67M    uploads
  238G   users
  20K    webcontenido
  17M    xs-activation
  516M   xs-activity-server
  827M   xs-rsync
  2.7G   zope-var

 The feasibility of remote backups varies depending on how much we care
 to back up. In Paraguay, it was decided that the journal backups are to
 be considered valuable if we are to instill the idea in teachers that
 the laptop is the same as a notebook with homework on it.

 Journal backups, however, amount to a whopping 238GB of rapidly
 changing, mostly uncompressible and undeltable data. Not quite the ideal
 case for an incremental backup. With today's available resources, we
 could afford to back up everything *but* the journals.

 Yesterday Daniel Castelo and I discussed the idea of performing
 cross-backups between nearby schools. This solution would probably work
 well in terms of bandwidth distribution, but it would bring some
 logistic complexity. Probably an acceptable trade-off.


How about two 500GB drives in RAID-1? I mean, especially in Paraguay,
bandwidth is scarce.


 == Content management ==

 Paraguay seems quite happy with Plone, but frankly I can't understand
 why. Teachers heavily use a really simple php tool called PAFM, which
 provides basic hierarchical file management with no access control or
 versioning.

 Oddly, I've not yet met anyone using Moodle. When I ask why, I always
 hear some vague comment about it being designed for higher education.
 Same goes for Schooltool. These more structured tools probably present
 a steeper learning curve and are a bad fit for the unsophisticated
 requirements of users who are being exposed to information systems for
 the first time.

 After they have functioning backups, Uruguay would like to provide a
 wiki. They have already looked at Dokuwiki, with which I'm not familiar.
 It seems to have a readable and easy to learn Creole-like syntax. I
 would 

[Server-devel] Schoolserver development in Uruguay

2010-08-19 Thread Rodolfo D. Arce S.
Comments regarding the initial Paraguayan deployment; I'm not very
familiar with the current status.

Regarding distros: when the initial setup was made, the XS
(Fedora-based) schoolserver was the only straightforward installation
that could get anything working without much tampering, and it was
pretty automatic, so XS was chosen. I'm a regular sysadmin, and I got
the thing working given enough research time and Martin's help. It is
not always like that. We must take into account that there are not many
people who could get an XS working on any given distro, and although
there are many volunteers (like Bernie) who go around the world doing
these things, sustainability is very far away.

Whether there is one distro or many is not the real problem. The
real problem is that there needs to be a simple, straightforward and
automatic way to deploy a schoolserver without needing a master's in
computer science, or even a degree at all. It has to be fast and it
has to be simple.

Plan Ceibal worked fine using Debian because they have specialized
people who can do the development and the maintenance, not
because they chose this or that distro.

In Paraguay, Fedora was chosen for the same reasons: it was the
fastest way to get things done and the simplest way to sustain it in
the long term, with XS and Fedora patches, which I don't know if were
made later on.


 == Jabber ==

I don't really understand how collaboration works, so no comments.


 == Backups ==

These numbers make sense for the Paraguay deployment but may not make
sense for other deployments, so I'll explain the folders that I remember.

  262M   backup
Backup folder, where all data that was going to be rsynced to the
datacenter was stored. It would amount to a backup of Plone, the
databases, some configs, and other stuff.

  3.4M   games
This was a folder where a web-based game was going to be stored; it
would be published by the Apache web server.

  1.7M   orug
Same as before; it was a game developed by a Paraguayan legal team to
help kids learn about their rights.

  62M    pgsql-xs
I don't remember.

  67M    uploads
The PAFM web folder discussed later.

  238G   users
The datastores folder.

  20K    webcontenido
The Apache default webpage, with specialized links for games,
activities and others.

  17M    xs-activation
  516M   xs-activity-server
  827M   xs-rsync
I don't remember.

  2.7G   zope-var
Since Plone works with a self-contained filesystem for its webpage,
this _single_ file was going to be backed up to the datacenter as
well. I think this is the folder, but I remember that it had to go to
the backup folder anyway.


 The feasibility of remote backups varies depending on how much we care
 to back up. In Paraguay, it was decided that the journal backups are to
 be considered valuable if we are to instill the idea in teachers that
 the laptop is the same as a notebook with homework on it.

 Journal backups, however, amount to a whopping 238GB of rapidly
 changing, mostly uncompressible and undeltable data. Not quite the ideal
 case for an incremental backup. With today's available resources, we
 could afford to back up everything *but* the journals.

This problem is more related to the way the journal stores the files
and the metadata. I remember little about it, but the main problem
with backing up a laptop is that it is not just about taking files:
any single file in the datastore doesn't get you a backup; you have to
take the whole datastore folder.

Incremental or differential backups could be made if the datastore
treated the files differently. I'm sorry if I hurt some
sensibilities, but it is the truth: there's no simple way to back up
_just the data_ from the journal. You back up all of it or nothing,
because _part of it_ is useless.

I don't know if that improved in newer versions, but 0.82 (I think),
the one that was used in Paraguay, is like that.

 Yesterday Daniel Castelo and I discussed the idea of performing
 cross-backups between nearby schools. This solution would probably work
 well in terms of bandwidth distribution, but it would bring some
 logistic complexity. Probably an acceptable trade-off.

This is an interesting idea, and it is related to the sustainability
part of the deployment and of the XS.

Deploying a schoolserver should be made easy; this would help small
deployments and big deployments. The faster the server gets to the
school the better, we all know that, but the real advantage is when
maintenance can be done remotely or with simple and fast solutions
like Puppet, CFEngine, or even self-contained rpm/deb packages, because
this is the way we get the master's in computer science into every
schoolserver we want.

The faster a schoolserver can be installed, and for that matter
reinstalled and restored, the better, because then you would only need
to send a guy (not a sysadmin) to go, insert a CD, next, next, next,
voila. Even when changes need to be made, they should be made in a
way that can be applied to the 

Re: [Server-devel] Schoolserver development in Uruguay

2010-08-19 Thread Daniel Drake
On 19 August 2010 18:25, Bernie Innocenti ber...@codewiz.org wrote:
 == Jabber ==

 There are two people working on Jabber. They have been using ejabberd
 and, quite surprisingly, they've not seen any issues of high CPU load
 and database corruption. Tomorrow I'll get to work more with them.

XS-0.6 and some of the package updates that come later fix a few bugs
related to ejabberd CPU/DB. I guess in Paraguay they are still on 0.5.

 This is a black hole in all deployments I visited.

 Redundant storage is too expensive. One cheap 500GB hard-drive is
 typical. In one year, 3 of the 10 schoolservers in Caacupé developed a
 hard drive failure.

But it's not a huge issue because the XOs also have a copy of the
journal. So, if technical resources are available for a quick XS
repair, disruption should be minimal.

 Journal backups, however, amount to a whopping 238GB of rapidly
 changing, mostly uncompressible and undeltable data. Not quite the ideal
 case for an incremental backup. With today's available resources, we
 could afford to back up everything *but* the journals.

You're giving numbers but missing an important consideration - the XS
backup system makes multiple backups. And it'll continue to make
more and more copies until it meets a certain threshold based on disk
size (likely to be 238GB in your case). At this point, it will purge
the oldest backups before making new ones.
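
To illustrate the kind of policy described here (keep adding snapshots, then purge the oldest once a size threshold is crossed), a rough sketch follows. It is not the XS backup code: the function name, the snapshot representation and the threshold value are all assumptions made for the example, and since the real snapshots are not strict copies, summing per-snapshot sizes is only an approximation.

```python
def purge_oldest(snapshots, limit_bytes):
    """Return (kept, purged) after dropping the oldest snapshots over the limit.

    snapshots: list of (iso_date, size_bytes) tuples in any order; ISO dates
    sort chronologically as plain strings. limit_bytes is a hypothetical
    threshold derived from disk size.
    """
    kept = sorted(snapshots)  # oldest first
    purged = []
    total = sum(size for _date, size in kept)
    while kept and total > limit_bytes:
        oldest = kept.pop(0)
        purged.append(oldest)
        total -= oldest[1]
    return kept, purged
```

With a policy of this shape, total usage hovers near the threshold on any school that has been active long enough, which is why the raw total alone does not say how much data one backup of each journal needs.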

Saying that you've hit 238GB after a year isn't conclusive because it's
likely that you'll meet the threshold when you're measuring an active
school over such a long time period. It's the design - use the
available space.

It's possible that within that space you have 10 backups of every
journal. So you could possibly get away with a disk half the size, and
only retain 5 copies. I'm inventing numbers (and they aren't
strictly copies either), but you can provide real ones - how many
backups (on average) are there of a journal in this server? What's the
disk space used if you only total the space used by the most recent
backup of each journal? Also, is it possible that your space-measuring
script is counting a 5MB file with 2 hardlinks as 10MB of used disk
space?
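
One way to check the hardlink question is to total the tree twice, once per directory entry and once per inode. The sketch below is an illustration only; the function name is made up, and it sums apparent file sizes rather than allocated blocks.

```python
import os
import stat


def disk_usage(root):
    """Return (naive_bytes, unique_bytes) for regular files under root.

    naive_bytes adds the size of every directory entry; unique_bytes counts
    each (device, inode) pair once, so a file with several hardlinks is
    charged a single time.
    """
    naive = 0
    unique = 0
    seen = set()  # (device, inode) pairs already counted
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, etc.
            naive += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return naive, unique
```

If the two totals differ a lot, hardlinks are doing real work. GNU du already counts a hardlinked file once within a single run, so double counting is mainly a risk when separate per-directory totals are added up afterwards.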

 Paraguay uses Puppet. We're very happy with it.
 Uruguay uses CFengine. They seem to be very happy with it as well.

 Both employ a flat hierarchy with one puppet master controlling all the
 schools, which is simple and straightforward, but requires excellent
 connectivity.

"Excellent" is a bit subjective, but yes, the fact that it requires
any form of connectivity is a roadblock in many cases. However, we
came up with a way around this (ideas only, for now, but wouldn't be
hard to implement) for puppet:
- clone all the puppet repositories and the config files and put them
on a USB disk (and do this periodically)
- install puppet-server on all the XSs (but don't run it by default)
- go to a school with said USB disk, plug it in and run puppet-server
- run puppet-client, connecting to localhost
- stop puppet-server, unplug USB disk, go home

Daniel


Re: [Server-devel] Schoolserver development in Uruguay

2010-08-19 Thread Bernie Innocenti
On Fri, 20-08-2010 at 00:51 -0300, Bernie Innocenti wrote:

 Heh, these are good questions, but answering them all would take quite
 some time, and it's 1AM over here :-)

Meanwhile, my du run to find out the size of current backups completed:

 # du -sh --exclude='datastore-200*' /library/backup
 92G     /library/backup

So, backing up the last versions of all journals would take just 92GB,
which would take more than 4 days on a 2Mbit link for the initial
backup.
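
The arithmetic behind that estimate, as a small sketch. It assumes the 92G reported by du -h means 92 GiB, that the link runs at exactly 2,000,000 bit/s, and that it is fully used with no protocol overhead, so the real figure would be somewhat higher.

```python
def transfer_days(size_bytes, link_bits_per_second):
    """Days needed to push size_bytes over a fully used link, ignoring overhead."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 86400


GIB = 1024 ** 3
days = transfer_days(92 * GIB, 2_000_000)
print(f"{days:.1f} days")  # prints "4.6 days"
```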

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs   - http://sugarlabs.org/
