On Tue, Mar 20, 2012 at 12:52:48PM +0100, Zygmunt Krynicki wrote:
> On 20.03.2012 11:48, Alexander Sack wrote:
> >On Tue, Mar 20, 2012 at 11:41:37AM +0100, Zygmunt Krynicki wrote:
> >>Hi
> >>
> >>Experimenting with the dispatcher made me realize that forced reboots
> >>(on timeouts, for example) are an excellent way to damage the master
> >>image. At the very best we are forced to re-check the master image. At
> >>the very worst we may damage the superblock and generally hose the
> >>master.
> >>
> >>Do you think it is feasible to mount the master read-only and only do
> >>r/w work on the test partitions?
> >
> >I like this idea... That combined with always-poweroff-on-reboot feels
> >like a good idea to compensate for potential issues...
>
> Just curious: why would we always poweroff on reboot? Do you mean actual
> power being cut or the equivalent of poweroff(8)?

That's a different concern. The key goal of the automation infrastructure
is to ensure that each individual test runs in a controlled environment
with as close to 100% reproducibility of state as possible. Soft-rebooting
the unit doesn't guarantee that it comes back to a known base state; hence
the requirement to always hard reboot, with a proper interval spent
unpowered in between.
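For illustration, a minimal sketch (in Python) of what such a hard reboot
could look like on the dispatcher side, assuming a network-controllable
PDU. The pdu_power_off/pdu_power_on helpers, the port argument and the
10-second delay are hypothetical placeholders, not an existing LAVA API:

    import time

    UNPOWERED_DELAY = 10  # seconds; long enough to guarantee a cold start


    def pdu_power_off(port):
        """Placeholder: switch off the given PDU port (telnet/SNMP/etc.)."""
        raise NotImplementedError


    def pdu_power_on(port):
        """Placeholder: switch the given PDU port back on."""
        raise NotImplementedError


    def hard_reboot(port):
        """Power-cycle a board so it comes back from a known base state."""
        pdu_power_off(port)
        time.sleep(UNPOWERED_DELAY)  # proper time unpowered in between
        pdu_power_on(port)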
> >Take the approach known from live-cd into account, such as
> >aufs/unionfs, and things should work well ... maybe the master image
> >doesn't even need a partition anymore, but can be just a .img file on
> >the fat boot partition, just like how the ubuntu live-cd etc. works...
>
> I wonder what is the complexity of this approach. I would also like to
> consider the memory requirements. As an alternative we could try to
> mount the master image from NBD. The NBD server already supports
> "reverting to snapshot" and keeping a delta for each connected client
> in a temporary file.

Considering that the master image is nano and boots to the console only, I
don't think the memory requirements would exceed what we target the LAVA
lab at. What I don't like about NBD is that it makes the LAVA
infrastructure more complex and harder to replicate. Every time we add a
new server/service that isn't the image/board itself, we diverge a bit
further from something that can be validated and released efficiently and
effectively.
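To make the live-cd-style approach concrete, here is a rough sketch of the
mount sequence, again in Python. The paths, the image name master.img and
the use of overlayfs (aufs would be the period-appropriate alternative)
are illustrative assumptions, not the actual LAVA layout:

    import os
    import subprocess


    def mount_master_readonly():
        """Loop-mount the master image read-only, divert writes to tmpfs."""
        subprocess.check_call(
            ["mount", "-o", "loop,ro", "/boot/master.img", "/mnt/master-ro"])
        subprocess.check_call(
            ["mount", "-t", "tmpfs", "tmpfs", "/mnt/master-rw"])
        os.makedirs("/mnt/master-rw/upper")
        os.makedirs("/mnt/master-rw/work")
        # A forced reboot can now at worst lose the throw-away tmpfs
        # delta; the master filesystem itself is never written.
        subprocess.check_call(
            ["mount", "-t", "overlay", "overlay",
             "-o", "lowerdir=/mnt/master-ro,upperdir=/mnt/master-rw/upper,"
                   "workdir=/mnt/master-rw/work",
             "/mnt/master"])

With a layout along these lines, a forced power cut discards only the
tmpfs delta, and the master would no longer need re-checking after every
unclean shutdown.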
> >Can anyone think of a reason not to put that into the backlog?
> >
> >If LAVA team decides to investigate that path, please check with
> >DevPlatform team on how they can help...
>
> I think we should seriously consider it as a milestone towards LAVA
> reliability and automation of master image construction.

Do we have a few empirical examples from the gathered list of LAVA
incidents that allow us to identify changes to the master image (not
talking about reproducibility here) as a recurring source of
unreliability?

--
Alexander Sack <[email protected]>
Technical Director, Linaro Platform Teams
http://www.linaro.org | Open source software for ARM SoCs
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog