Checking filesystems periodically

2012-05-04 Thread Anish Mangal

Hi,

I am curious to know why we are not periodically checking file systems
after every N boots on the XO laptop.

The main reason I can think of is that every Nth boot would appear
very slow to the user, creating the impression that something is
wrong.

However, given that XOs often operate in harsh environments and are
therefore (probably) more prone to hardware failure, would it be a
good idea to do so?

Perhaps we could mount /boot and /home on different partitions and
check /boot periodically.
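
A minimal sketch of the "every N boots" bookkeeping, assuming a small
state file that survives reboots. The function name, the file format
and the interval are hypothetical and not part of any existing OLPC
tool; ext filesystems also keep their own mount count, which
`tune2fs -c` can cap, so a real implementation might rely on that
instead.

```python
import json
from pathlib import Path


def record_boot_and_decide(state_file: Path, interval: int) -> bool:
    """Count one more boot and report whether a check is due.

    Returns True on every `interval`-th call and resets the counter,
    otherwise stores the incremented counter and returns False.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    try:
        count = int(json.loads(state_file.read_text())["boots_since_check"])
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        # Missing or unreadable state: start counting from zero again.
        count = 0
    count += 1
    due = count >= interval
    state_file.write_text(json.dumps({"boots_since_check": 0 if due else count}))
    return due
```

Called once per boot with an interval of 3, this would report a check
as due on the third, sixth, ninth boot and so on.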

One issue that is pertinent to this discussion is [1]

Thoughts?

[1] http://dev.laptop.org.au/issues/1214

Cheers,
Anish

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: Checking filesystems periodically

2012-05-04 Thread Daniel Drake
On Fri, May 4, 2012 at 2:55 AM, Anish Mangal an...@activitycentral.com wrote:
 Hi,

 I am curious to know why we are not periodically checking file systems
 after every N boots on the XO laptop.

Historically, we have used a filesystem without a checker (jffs2).

Now that we use ext* on newer laptops, there are a few reasons:

Firstly, it's non-trivial, due to the design/setup of our initramfs and
filesystem.

Secondly, the user experience:

 The main reason I can think of is that every Nth boot would appear
 very slow to the user, creating the impression that something is
 wrong.

Until F17 we haven't had a good way of communicating this via the boot
animation. Now we can do that easily but it lacks implementation.

Thirdly, fsck is not magic. It cannot detect/repair all corruption. As
far as I know, we have not yet found a case of corruption which can be
meaningfully fixed by fsck. We did do quite a bit of testing for this
at an earlier point.

 However, given that XOs often operate in harsh environments and are
 therefore (probably) more prone to hardware failure, would it be a
 good idea to do so?

Despite all the above, yes, it would be a good idea; it's something
we'll hopefully get around to at some point. Your help implementing it
is welcome.

Daniel


Re: Checking filesystems periodically

2012-05-04 Thread Chris Ball
Hi,

On Fri, May 04 2012, Daniel Drake wrote:
 Until F17 we haven't had a good way of communicating this via the boot
 animation. Now we can do that easily but it lacks implementation.

 Thirdly, fsck is not magic. It cannot detect/repair all corruption. As
 far as I know, we have not yet found a case of corruption which can be
 meaningfully fixed by fsck. We did do quite a bit of testing for this
 at an earlier point.

Also, our users cannot be expected to understand (or obey) a requirement
that they not turn off the machine while it's doing something dangerous:
so if powering down half way through fsck leaves the filesystem in a
worse state than it was before fsck ran, we probably shouldn't do it
at all.

- Chris.
-- 
Chris Ball   c...@laptop.org   http://printf.net/
One Laptop Per Child


Re: Checking filesystems periodically

2012-05-04 Thread Chris Ball
Hi,

On Fri, May 04 2012, Chris Ball wrote:
 Hi,

 On Fri, May 04 2012, Daniel Drake wrote:
 Until F17 we haven't had a good way of communicating this via the boot
 animation. Now we can do that easily but it lacks implementation.

 Thirdly, fsck is not magic. It cannot detect/repair all corruption. As
 far as I know, we have not yet found a case of corruption which can be
 meaningfully fixed by fsck. We did do quite a bit of testing for this
 at an earlier point.

 Also, our users cannot be expected to understand (or obey) a requirement
 that they not turn off the machine while it's doing something dangerous:
 so if powering down half way through fsck leaves the filesystem in a
 worse state than it was before fsck ran, we probably shouldn't do it
 at all.

Another also: sometimes when fsck finds an inconsistency it asks you for
the root password, but some of our users don't have the root password,
so they might end up in a reboot loop where they can't progress.

- Chris.
-- 
Chris Ball   c...@laptop.org   http://printf.net/
One Laptop Per Child


Re: Checking filesystems periodically

2012-05-04 Thread Anish Mangal

On Fri 04 May 2012 07:57:47 PM IST, Chris Ball wrote:
 Hi,

 On Fri, May 04 2012, Chris Ball wrote:
 Hi,

 On Fri, May 04 2012, Daniel Drake wrote:
 Until F17 we haven't had a good way of communicating this via the boot
 animation. Now we can do that easily but it lacks implementation.

 Thirdly, fsck is not magic. It cannot detect/repair all corruption. As
 far as I know, we have not yet found a case of corruption which can be
 meaningfully fixed by fsck. We did do quite a bit of testing for this
 at an earlier point.

 Also, our users cannot be expected to understand (or obey) a requirement
 that they not turn off the machine while it's doing something dangerous:
 so if powering down half way through fsck leaves the filesystem in a
 worse state than it was before fsck ran, we probably shouldn't do it
 at all.


This is a valid point. _If_ there is a possibility that fsck will leave
the system in a worse state if the laptop is accidentally powered off,
this is a bad idea.

However, we can probably do a 'read-only' fsck, and provide a
notification (perhaps to contact technical support) if it finds
problems. That could be a fail-safe way of implementing this.

Am I right in assuming (with my limited knowledge in this area) that
the fsck-on-boot is by default read-only?
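
A rough sketch of such a read-only check, assuming the filesystem is
ext* and that `e2fsck` from e2fsprogs is installed. The `-n` option
opens the filesystem read-only and answers "no" to every question, and
`-f` forces a check even when the filesystem is marked clean. The
function names are hypothetical; the status bits are the ones
documented in the fsck manual page.

```python
import subprocess
from typing import List

# Exit status bits documented for fsck/e2fsck; the status is their sum.
FSCK_STATUS_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "cancelled by user request",
    128: "shared-library error",
}


def describe_exit_status(status: int) -> List[str]:
    """Translate an fsck exit status into human-readable conditions."""
    return [text for bit, text in FSCK_STATUS_BITS.items() if status & bit]


def read_only_check(device: str) -> int:
    """Run a forced, read-only e2fsck on `device` and return its exit status."""
    result = subprocess.run(
        ["e2fsck", "-n", "-f", device],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode
```

A caller would show the "contact technical support" notification
whenever `describe_exit_status(read_only_check(device))` is non-empty.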

 Another also: sometimes when fsck finds an inconsistency it asks you for
 the root password, but some of our users don't have the root password,
 so they might end up in a reboot loop where they can't progress.


As above, a non-intrusive way of doing this would be to show a
notification to contact technical support. The biggest caveat is that
the problem must not happen too often ;-) or else tech support will
start cursing us!

In general, I wouldn't expect users to open a terminal and type
commands and passwords to repair their machine. If it has to be
implemented, it has to be completely handled by a GUI.

 - Chris

-- 
Anish


Re: Checking filesystems periodically

2012-05-04 Thread Mikus Grinbergs
On Fri, May 4, 2012 at 2:55 AM, Anish Mangal an...@activitycentral.com wrote:

 I am curious to know why we are not periodically checking file systems
 after every N boots on the XO laptop.


I think the question is: should functions which affect the 'system' be
performed automatically, or should they be explicitly invoked?


The currently-existing OLPC precedent is the 'Software updates' facility 
-- which gets invoked manually to check whether the system's Activities 
are up-to-date.  My suggestion is to package the required software 
(fsck?) into a new 'Check system' facility - and add an icon for that 
new facility to the 'My Settings' panel - then the user who invokes it 
will know why his XO is tied up for a while.  [The local XO-laptop 
distributing organization can publish guidelines as to how often such a 
'Check system' facility ought to be used.]


mikus



Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Sascha Silbe
Paul Fox p...@laptop.org writes:

 i've cherry-picked 65a5f2b3 onto olpc-2.6.35, and the autobuilder
 did the rest.  this implements a new libertas_disablemesh module
 parameter which should keep mesh from being enabled.  please test:

 
 http://rpmdropbox.laptop.org/f14-xo1/kernel-2.6.35.13_xo1-20120502.1603.olpc.bde819f.i586.rpm

Thanks! Just to be sure: as it's been merged into olpc-2.6.35, all
future official 2.6.35 based OLPC kernel builds will include this patch?

The reason we've not gone the module parameter route so far (in
Dextrose 3) is that we didn't want to diverge from upstream (OLPC in this
case) on the kernel level. If it's included now, that concern is
addressed and we can go this route, which IMO is technically the best
option. It avoids all possible race conditions and only needs a single
configuration file to be set up.
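
As an illustration of that single configuration file, here is a small
sketch that renders and parses a modprobe `options` line for the new
parameter. The helper names are hypothetical, and the value 1 is only
an assumption about what "disable mesh" would look like.

```python
from typing import Dict, Tuple


def render_options_line(module: str, params: Dict[str, object]) -> str:
    """Build one modprobe.d 'options' line from a parameter mapping."""
    rendered = " ".join("%s=%s" % (name, value) for name, value in params.items())
    return "options %s %s" % (module, rendered)


def parse_options_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split an 'options' line into the module name and its parameters."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        raise ValueError("not an options line: %r" % line)
    params = {}
    for field in fields[2:]:
        name, sep, value = field.partition("=")
        if not sep:
            raise ValueError("parameter without a value: %r" % field)
        params[name] = value
    return fields[1], params


# Assumed value: 1 is taken here to mean "mesh disabled".
line = render_options_line("libertas", {"libertas_disablemesh": 1})
```

The resulting line would go into a file under /etc/modprobe.d/, for
example libertas.conf.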

Sascha

-- 
http://sascha.silbe.org/
http://www.infra-silbe.de/




Re: Checking filesystems periodically

2012-05-04 Thread Walter Bender
On Fri, May 4, 2012 at 2:46 PM, Mikus Grinbergs mi...@bga.com wrote:
 On Fri, May 4, 2012 at 2:55 AM, Anish Mangal an...@activitycentral.com wrote:

  I am curious to know why we are not periodically checking file systems
  after every N boots on the XO laptop.


 I think the question is: should functions which affect the 'system' be
 performed automatically, or should they be explicitly invoked?

 The currently-existing OLPC precedent is the 'Software updates' facility --
 which gets invoked manually to check whether the system's Activities are
 up-to-date.  My suggestion is to package the required software (fsck?) into
 a new 'Check system' facility - and add an icon for that new facility to the
 'My Settings' panel - then the user who invokes it will know why his XO is
 tied up for a while.  [The local XO-laptop distributing organization can
 publish guidelines as to how often such a 'Check system' facility ought to
 be used.]


+1

 mikus





-- 
Walter Bender
Sugar Labs
http://www.sugarlabs.org


Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Sascha Silbe
Ajay Garg ajaygargn...@gmail.com writes:

[...]
 b)
 Ensured that '/etc/modprobe.d/libertas.conf' contained only the
 following line ::

   options libertas libertas_disablemesh=0
[...]
 f)
 Upon resume-from-suspend, NO ICONS COULD BE SEEN IN NEIGHBORHOOD VIEW.

Interesting. 

 g)
 Observations e) and f) were observed, _every single time_.

OK, I suppose you repeated this often enough and using exactly the same
environment and procedures as the other test cases? I.e. you are sure
that it's really specific to libertas_disablemesh=0 rather than just
occurring at random even without the libertas_disablemesh setting or
based on how the suspend / resume was triggered (e.g. idle suspend
vs. lid or power switch)?

I don't see anything in the patch or the module params code that would
explain this behaviour. If it's reproducible, I'll have to dive into it
and debug a bit...

Thanks for testing, BTW!

Sascha

-- 
http://sascha.silbe.org/
http://www.infra-silbe.de/




Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Ajay Garg
On Sat, May 5, 2012 at 2:03 AM, Sascha Silbe si...@activitycentral.com wrote:
 Ajay Garg ajaygargn...@gmail.com writes:

 [...]
 b)
 Ensured that '/etc/modprobe.d/libertas.conf' contained only the
 following line ::

               options libertas libertas_disablemesh=0
 [...]
 f)
 Upon resume-from-suspend, NO ICONS COULD BE SEEN IN NEIGHBORHOOD VIEW.

 Interesting.

 g)
 Observations e) and f) were observed, _every single time_.

 OK, I suppose you repeated this often enough and using exactly the same
 environment and procedures as the other test cases? I.e. you are sure
 that it's really specific to libertas_disablemesh=0 rather than just
 occurring at random even without the libertas_disablemesh setting or
 based on how the suspend / resume was triggered (e.g. idle suspend
 vs. lid or power switch)?

I tried 3 times, and it happened every time. (I used the lid approach
every time, under identical conditions and procedures.)

 I don't see anything in the patch or the module params code that would
 explain this behaviour. If it's reproducible, I'll have to dive into it
 and debug a bit...

It is reproducible every time (at least on my end). :)

 Thanks for testing, BTW!

My pleasure :)

 Sascha

 --
 http://sascha.silbe.org/
 http://www.infra-silbe.de/


Regards,
Ajay


Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Paul Fox
sascha wrote:
  
   i've cherry-picked 65a5f2b3 onto olpc-2.6.35, and the autobuilder
   did the rest.  this implements a new libertas_disablemesh module
   parameter which should keep mesh from being enabled.  please test:
  
   
   http://rpmdropbox.laptop.org/f14-xo1/kernel-2.6.35.13_xo1-20120502.1603.olpc.bde819f.i586.rpm
  
  Thanks! Just to be sure: as it's been merged into olpc-2.6.35, all
  future official 2.6.35 based OLPC kernel builds will include this patch?

hi sascha -- 

yes.  i think we hope there won't actually be any more of those, but
if there are, that patch will be there.  current and future releases
get the patch for free, since it's upstream.  (thank you)

paul

p.s.  somehow the git hash i pasted above is incorrect.  the correct
cherry-pick was this one:
-
 commit 6bdbdbf4a151a3a1333818cd17a7d7795e936041
 Author: Sascha Silbe si...@activitycentral.com
 Date:   Wed May 11 14:52:34 2011 +0200

     libertas: Add libertas_disablemesh module parameter to disable mesh interface

     This allows individual users and deployments to disable mesh support at
     runtime, i.e. without having to build and maintain a custom kernel.

     Based on a patch by Paul Fox p...@laptop.org.
     Signed-off-by: Sascha Silbe si...@activitycentral.com
     Signed-off-by: John W. Linville linvi...@tuxdriver.com
-


  
  The reason we've not gone the module parameter route so far (in
  Dextrose 3) is that we didn't want to diverge from upstream (OLPC in this
  case) on the kernel level. If it's included now, that concern is
  addressed and we can go this route, which IMO is technically the best
  option. It avoids all possible race conditions and only needs a single
  configuration file to be set up.
  
  Sascha
  
  -- 
  http://sascha.silbe.org/
  http://www.infra-silbe.de/

=-
 paul fox, p...@laptop.org


Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Martin Abente
Did you remove the disable mesh script for the testing?
On May 4, 2012 10:39 p.m., Ajay Garg ajaygargn...@gmail.com wrote:

 On Sat, May 5, 2012 at 2:03 AM, Sascha Silbe si...@activitycentral.com
 wrote:
  Ajay Garg ajaygargn...@gmail.com writes:
 
  [...]
  b)
  Ensured that '/etc/modprobe.d/libertas.conf' contained only the
  following line ::
 
options libertas libertas_disablemesh=0
  [...]
  f)
  Upon resume-from-suspend, NO ICONS COULD BE SEEN IN NEIGHBORHOOD VIEW.
 
  Interesting.
 
  g)
  Observations e) and f) were observed, _every single time_.
 
  OK, I suppose you repeated this often enough and using exactly the same
  environment and procedures as the other test cases? I.e. you are sure
  that it's really specific to libertas_disablemesh=0 rather than just
  occurring at random even without the libertas_disablemesh setting or
  based on how the suspend / resume was triggered (e.g. idle suspend
  vs. lid or power switch)?

 I tried 3 times, and it happened every time. (I used the lid approach
 every time, under identical conditions and procedures.)

 
  I don't see anything in the patch or the module params code that would
  explain this behaviour. If it's reproducible, I'll have to dive into it
  and debug a bit...

 It is reproducible every time (at least at my end). :)




 
  Thanks for testing, BTW!

 My pleasure :)




 
  Sascha
 
  --
  http://sascha.silbe.org/
  http://www.infra-silbe.de/


 Regards,
 Ajay
 ___
 Sugar-devel mailing list
 sugar-de...@lists.sugarlabs.org
 http://lists.sugarlabs.org/listinfo/sugar-devel



Re: [Sugar-devel] Wanting to know a bit of (NetworkManager) workflow upon resume-from-suspend

2012-05-04 Thread Ajay Garg
On Sat, May 5, 2012 at 3:42 AM, Martin Abente
martin.abente.lah...@gmail.com wrote:
 Did you remove the disable mesh script for the testing?

Yes.

Both from ::

a)
my custom addition in '/etc/init.d/NetworkManager'.

b)
'/etc/powerd/postresume.d/disable_mesh.sh'


Regards,
Ajay
