Re: XO-1 OFW occasionally has difficulty reading from my permanent SD card [Devel Digest, Vol 61, Issue 38]
You might try patching OpenFirmware to not turn off the card between subsequent accesses. To do this, assuming you are using Q2E45, add the following early in your olpc.fth file:

  dev /sd
  patch 2drop cb! card-power-off

Is this also true for the XO-1.5 q3a62? Or is it an XO-1-only issue?

___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
XO 1.5 performance testing.
It has been a fun and fulfilling weekend tracking down performance regressions when using DRI on the XO 1.5 platform. One place I was looking at specifically was blitting solids from userspace to the kernel. I found that copy_from_user was really chewing up a significant amount of cpu. Looking further I found that for VIA C7 processors these two options, X86_INTEL_USERCOPY and X86_USE_PPRO_CHECKSUM, are not enabled. I have not done thorough testing on it, but after patching the kernel with the attached patch my gtkperf run dropped from 63 seconds down to 50 seconds. A 20% improvement is not bad, but more testing is definitely needed, and testing both options independently is also needed.

I have my test kernel available here: http://dev.laptop.org/~jnettlet/f14/kernel-2.6.35.9_xo1.5-20110313.2249.fc14.c06443f.i586.rpm

If anyone else reports back that they are also seeing improvements I will open a bug and we can track this improvement down further.

Jon

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 2ac9069..e320d51 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -364,11 +364,11 @@ config X86_ALIGNMENT_16
 
 config X86_INTEL_USERCOPY
 	def_bool y
-	depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
+	depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MVIAC7 || MCORE2
 
 config X86_USE_PPRO_CHECKSUM
 	def_bool y
-	depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
+	depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
 
 config X86_USE_3DNOW
 	def_bool y
Re: XO-1 developer key does not work
If your laptops contain your deployment keys, I would recommend using this [1] so you can easily generate and manage your activations in situations like this :).

1. http://wiki.paraguayeduca.org/index.php/Yaas_documentation

kind regards, tch

On Sat, Mar 12, 2011 at 1:46 AM, Sridhar Dhanapalan srid...@laptop.org.au wrote:

On 12 March 2011 05:10, C. Scott Ananian csc...@cscott.net wrote: Posting your machine's serial number as well as the contents of your develop.sig might help; your developer key might be malformed or correspond to a different XO than the one you are trying to use it on. You can also try the collection key method, as one more check on the process by which you are generating the developer key.

This is a bit strange - I am getting a different result for each method.

Printed in the battery compartment is the serial number SHC83102126

/home/.devkey.html contains: SN - SHC8320373E UUID - D5576981-BDA0-4271-ABFE-0183633847D1

/ofw/serial-number contains SHC832038CC

laptops.dat (created with a collection key) contains: SHC832038CC C95B2B75-18A6-4860-B834-9AEAC7A4C47F 20110312T034351Z

laptops.dat concurs with /ofw/serial-number. I've used the laptops.dat details to request a developer key. The page says that it'll be available in 24 hours.

There are three main questions raised by this process:
1. why am I getting different readings for each method?
2. what is the most trustworthy method?
3. why must I wait 24 hours to get the developer key?

Thanks, Sridhar
Re: XO-1 developer key does not work
Nice, you pre-empted my next question! :D

Thanks, Sridhar

On 15 March 2011 00:03, Martin Abente martin.abente.lah...@gmail.com wrote: If your laptops contain your deployment keys, I would recommend using this [1] so you can easily generate and manage your activations in situations like this :). 1. http://wiki.paraguayeduca.org/index.php/Yaas_documentation kind regards, tch

On Sat, Mar 12, 2011 at 1:46 AM, Sridhar Dhanapalan srid...@laptop.org.au wrote: On 12 March 2011 05:10, C. Scott Ananian csc...@cscott.net wrote: Posting your machine's serial number as well as the contents of your develop.sig might help; your developer key might be malformed or correspond to a different XO than the one you are trying to use it on. You can also try the collection key method, as one more check on the process by which you are generating the developer key.

This is a bit strange - I am getting a different result for each method. Printed in the battery compartment is the serial number SHC83102126. /home/.devkey.html contains: SN - SHC8320373E UUID - D5576981-BDA0-4271-ABFE-0183633847D1. /ofw/serial-number contains SHC832038CC. laptops.dat (created with a collection key) contains: SHC832038CC C95B2B75-18A6-4860-B834-9AEAC7A4C47F 20110312T034351Z. laptops.dat concurs with /ofw/serial-number. I've used the laptops.dat details to request a developer key. The page says that it'll be available in 24 hours.

There are three main questions raised by this process: 1. why am I getting different readings for each method? 2. what is the most trustworthy method? 3. why must I wait 24 hours to get the developer key?

Thanks, Sridhar
Re: Memory replacement
On Sunday 13 March 2011, Mikus Grinbergs wrote:

The tests have also helped expose other issues with things like sudden power off. In one case a SPO during a write would corrupt the card so badly it became useless. You could only recover them via a super secret tool from the manufacturer.

Is there any sledgehammer process available to users without a super secret tool ?

You can recover some cards by issuing an erase on the full drive. Unfortunately, this requires a patch to the SDHCI device driver, which is only now going into the kernel; I think it will be in 2.6.39. Issuing an erase (ioctl BLKDISCARD) also helps recover the performance on cards that get slower with increased internal fragmentation, but most cards use GC algorithms far too simple to get into that problem in the first place.

Arnd
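The full-drive erase Arnd describes can be issued from userspace through the BLKDISCARD block ioctl once a kernel with SD erase support is running. A minimal sketch follows; the ioctl numbers are the standard Linux values from <linux/fs.h>, the device path in the usage note is hypothetical, and running this destroys all data on the target device:

```python
import fcntl
import os
import struct

# Standard Linux block-device ioctls (values from <linux/fs.h>):
BLKGETSIZE64 = 0x80081272   # _IOR(0x12, 114, size_t) on a 64-bit kernel
BLKDISCARD = 0x1277         # _IO(0x12, 119)

def discard_whole_device(path):
    """Discard every LBA on a block device. DESTROYS ALL DATA."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # Ask the kernel for the device size in bytes.
        buf = bytearray(8)
        fcntl.ioctl(fd, BLKGETSIZE64, buf)
        (size,) = struct.unpack("Q", bytes(buf))
        # BLKDISCARD takes a (start, length) pair of 64-bit byte offsets.
        fcntl.ioctl(fd, BLKDISCARD, struct.pack("QQ", 0, size))
        return size
    finally:
        os.close(fd)
```

Invoked as root, e.g. discard_whole_device('/dev/mmcblk0'), this is the same range ioctl that the later blkdiscard(8) utility wraps.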
Re: Memory replacement
On 03/13/2011 06:34 PM, Mikus Grinbergs wrote:

The tests have also helped expose other issues with things like sudden power off. In one case a SPO during a write would corrupt the card so badly it became useless. You could only recover them via a super secret tool from the manufacturer.

Is there any sledgehammer process available to users without a super secret tool ?

Wasn't just secret to users. They would not give us the info on how to do it either. It was vendor specific so not really worth the effort of trying to reverse engineer.

-- Richard A. Smith rich...@laptop.org One Laptop per Child
Re: XO 1.5 performance testing.
On 14 March 2011 07:28, Jon Nettleton jon.nettle...@gmail.com wrote: It has been a fun and fulfilling weekend tracking down performance regressions when using DRI on the XO 1.5 platform. One place I was looking at specifically was blitting solids from userspace to the kernel. I found that copy_from_user was really chewing up a significant amount of cpu. Looking further I found that for VIA C7 processors these two options, X86_INTEL_USERCOPY and X86_USE_PPRO_CHECKSUM, are not enabled. I have not done thorough testing on it, but after patching the kernel with the attached patch my gtkperf run dropped from 63 seconds down to 50 seconds.

Wow! Nice work! But, I think your gains must have come from elsewhere. Please correct me if you spot something that I haven't.

Firstly, X86_INTEL_USERCOPY.

The function that decides when the alternative codepath enabled by this config option gets enabled is __movsl_is_ok() in arch/x86/lib/usercopy_32.c. This function has to return 0 for the interesting new functions such as __copy_user_zeroing_intel() to be called. The one way that this function can return 0 is:

	if (n >= 64 && ((a1 ^ a2) & movsl_mask.mask))
		return 0;

This return 0 will never happen on our platform: movsl_mask.mask is always 0. (On other platforms it is initialized in Intel-specific code in arch/x86/kernel/cpu/intel.c.)

So, X86_INTEL_USERCOPY doesn't seem to have any effect for us. I tried hacking movsl_mask.mask to 7 like Intel, and it resulted in a 0.9% speedup in copy_to_user (possibly just noise) when doing unaligned writes (it's important to realise that the codepaths enabled by this option are only an optimization for unaligned accesses; well-aligned accesses are unchanged).

X86_USE_PPRO_CHECKSUM looks like it is worth doing. It results in csum_partial() calls speeding up by a factor of 1.5.
But I'd be surprised if this is having any direct impact on your graphics work (this checksum-calculating function seems to have only a handful of users outside of networking) unless your new DRI code is calling this function directly? Your performance gains are very exciting, but I think they must have resulted from something else.

My benchmark code is here (including a hack to enable the unaligned write optimization when X86_INTEL_USERCOPY is set): http://dev.laptop.org/~dsd/20110314/benchmark-copy_from_user-and-csum_partial.patch

Daniel
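Daniel's reading of the quoted test can be checked with a quick model of the condition (a plain transcription of the `__movsl_is_ok()` logic into Python, not kernel code):

```python
def movsl_is_ok(a1, a2, n, mask):
    # Mirrors the test in __movsl_is_ok() in arch/x86/lib/usercopy_32.c:
    # return 0 (i.e. take the X86_INTEL_USERCOPY path) only for copies of
    # 64+ bytes whose source and destination alignments differ under mask.
    if n >= 64 and ((a1 ^ a2) & mask):
        return 0
    return 1

# movsl_mask.mask stays 0 on the VIA C7 (only intel.c initializes it),
# so the alternative path is never taken:
assert movsl_is_ok(0x1003, 0x2000, 4096, 0) == 1
# With the mask hacked to 7, a misaligned large copy diverts:
assert movsl_is_ok(0x1003, 0x2000, 4096, 7) == 0
# Well-aligned copies are unchanged either way:
assert movsl_is_ok(0x1000, 0x2000, 4096, 7) == 1
```

This makes the conclusion concrete: with the mask at zero the function always returns 1, so __copy_user_zeroing_intel() is never reached.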
Re: Memory replacement
On Sunday 13 March 2011, Richard A. Smith wrote: On 03/13/2011 01:21 PM, Arnd Bergmann wrote:

There's a 2nd round of test(s) that runs during the manufacturing and burn-in phases. One is a simple firmware test to see if you can talk to the card at all, and then one runs at burn-in. It doesn't have a minimum write size criteria, but during the run there must not be any bit errors.

ok. It does seem a bit crude, because many cards are not really suitable for this kind of file system when their wear leveling is purely optimized to the accesses defined in the SD card file system specification. If you did this on e.g. a typical Kingston card, it can have a write amplification 100 times higher than normal (FAT32, nilfs2, ...), so it gets painfully slow and wears out very quickly.

Crude as they are, they have been useful tests for us. Our top criteria is reliability. We want to ship the machines with an SD card that's going to last for the 5 year design life using the filesystem we ship. We tried to create an access pattern that was the worst possible and the highest stress on the wear leveling system.

I see. Using the 2 KB block size on ext3 as described in the Wiki should certainly do that, even on old cards that use 4 KB pages. I typically misalign the partition by a few sectors to get a similar effect, doubling the amount of internal garbage collection. I guess the real images use a higher block size, right?

I had hoped that someone already correlated the GC algorithms with the requirements of specific file systems to allow a more systematic approach.

At the time we started doing this testing, none of the log-structured filesystems were deemed to be mature enough for us to ship. So we didn't bother to try and torture test using them. If more precision tests were created that still allowed us to make a reasonable estimate of data write lifetime, we would be happy to start using them.
The tool that I'm working on is git://git.linaro.org/people/arnd/flashbench.git

It can be used to characterize a card in terms of its erase block size, number of open erase blocks, FAT-optimized sections of the card, and possible access patterns inside of erase blocks, all by doing raw block I/O. Using it is currently a more manual process than I'd hope to make it for giving it to regular users. It also needs to be correlated to block access patterns from the file system. When you have that, it should be possible to accurately predict the amount of write amplification, which directly relates to how long the card ends up living.

What I cannot determine right now is whether the card does static wear leveling. I have a Panasonic card that is advertised as doing it, but I haven't been able to pin down when that happens using timing attacks.

Another thing you might be interested in is my other work on a block remapper that is designed to reduce the garbage collection by writing data in a log-structured way, similar to how some SSDs work internally. This will also do static wear leveling, as a way to improve the expected life by multiple orders of magnitude in some cases.

https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashDeviceMapper lists some concepts I want to use, but I have done a lot of changes to the design that are not yet reflected in the Wiki. I need to talk to more people at the Embedded Linux Conference and Storage/FS summit in San Francisco to make sure I get that right.

Arnd
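To illustrate why write amplification "directly relates to how long the card ends up living", here is a toy lifetime model. All numbers (card capacity, rated P/E cycles, daily write volume) are hypothetical illustrations, not output of the tool described above:

```python
def card_life_days(capacity_bytes, pe_cycles, bytes_written_per_day, amplification):
    # Total raw endurance is capacity times the rated program/erase cycle
    # count; write amplification multiplies the physical writes performed
    # for each logical write, dividing the usable life accordingly.
    total_endurance = capacity_bytes * pe_cycles
    return total_endurance / (bytes_written_per_day * amplification)

GB = 1024 ** 3
MB = 1024 ** 2

# A hypothetical 4 GB card rated for 3000 P/E cycles, written 200 MB/day:
assert round(card_life_days(4 * GB, 3000, 200 * MB, 1.0)) == 61440
# The same workload under 32x write amplification wears it out 32x faster:
assert round(card_life_days(4 * GB, 3000, 200 * MB, 32.0)) == 1920
```

The model ignores static wear leveling and spare-block reserves; its point is only the linear trade-off between amplification and lifetime.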
Re: Memory replacement
On Mar 13, 2011, at 6:34 PM, Mikus Grinbergs wrote:

The tests have also helped expose other issues with things like sudden power off. In one case a SPO during a write would corrupt the card so badly it became useless. You could only recover them via a super secret tool from the manufacturer.

Is there any sledgehammer process available to users without a super secret tool ?

No. Such software does exist for every controller, but it doesn't necessarily use the SD interface as SD.

I've encountered SD cards which will be recognized as a device when plugged in to a running XO-1 (though 'ls' of a filesystem on that SD card is corrupt) -- but 'fdisk' is ineffective when I want to write a new partition table (and 'fsck' appears to loop). Since otherwise I'd just have to throw the card away, I'd be willing to apply EXTREME measures to get such a card into a reusable (blank slate) condition.

Cards that are in the state you describe are most likely dead due to running out of spare blocks. There is nothing that can be done to rehabilitate them, even using the manufacturer's secret code. In a disturbing trend, most of the cards I've returned for failure analysis in the past year have been worn out (and not just trashed meta-data due to a firmware error).

Bummer, wad
Re: Memory replacement
On Monday 14 March 2011 19:50:27 John Watlington wrote:

Cards that are in the state you describe are most likely dead due to running out of spare blocks. There is nothing that can be done to rehabilitate them, even using the manufacturer's secret code. In a disturbing trend, most of the cards I've returned for failure analysis in the past year have been worn out (and not just trashed meta-data due to a firmware error).

Part of the explanation for this could be the fact that erase block sizes have rapidly increased. AFAIK, the original XO builtin flash had 128KB erase blocks, which is also a common size for 1GB SD and CF cards. Cards made in 2010 or later typically have erase blocks of 2 MB, and combine two of them into an allocation unit of 4 MB. This means that in the worst case (random access over the whole medium), the write amplification has increased by a factor of 32.

Another effect is that the page size has increased by a factor of 8, from 2 or 4 KB to 16 or 32 KB. Writing data that is smaller than a page is more likely to get you into the worst case mentioned above. This is part of why FAT32 with 32 KB clusters still works reasonably well, but ext3 with 4 KB blocks has regressed so much.

Arnd
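The factors Arnd quotes fall straight out of the sizes involved; a back-of-envelope check:

```python
KB = 1024
MB = 1024 * KB

# Erase blocks: 128 KB on the original 1 GB-era media, versus a 4 MB
# allocation unit (two combined 2 MB erase blocks) on 2010+ cards.
# In the worst case every small random write rewrites a whole unit,
# so the amplification ceiling grows with the unit size:
assert (4 * MB) // (128 * KB) == 32

# Pages: 4 KB then, up to 32 KB now:
assert (32 * KB) // (4 * KB) == 8
```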
Re: XO-1 OFW occasionally has difficulty reading from my permanent SD card [Devel Digest, Vol 61, Issue 38]
On Sun, Mar 13, 2011 at 11:27:44PM -0700, Yioryos Asprobounitis wrote:

You might try patching OpenFirmware to not turn off the card between subsequent accesses. To do this, assuming you are using Q2E45, add the following early in your olpc.fth file:

  dev /sd
  patch 2drop cb! card-power-off

Is this also true for the XO-1.5 q3a62? Or is it an XO-1-only issue?

The workaround you quoted above is not correct for XO-1.5 Q3A62; it was correct only for Q2E45.

The root cause of the problem (failure to discharge the power supply to the card) is also present in XO-1.5 hardware. This is described in http://dev.laptop.org/ticket/10512

A workaround is in Q3A62 which increases the time that the firmware waits after turning off the power to the card. http://tracker.coreboot.org/trac/openfirmware/changeset/2065 shows the time was increased from 20ms to 40ms. This was so that a new batch of cards in manufacturing would pass testing. It is not possible for me to predict the behaviour on other cards, since I don't have them.

This change also adds support for changing the delay:

  dev /sd
  d# 60 to power-off-time

The change is lost on reboot.

-- James Cameron http://quozl.linux.org.au/
Re: XO-1 developer key does not work
On 14 March 2011 10:58, James Cameron qu...@laptop.org wrote: On Sat, Mar 12, 2011 at 03:46:02PM +1100, Sridhar Dhanapalan wrote: There are three main questions raised by this process: [...] 3. why must I wait 24 hours to get the developer key?

Presuming you are asking about an OLPC developer key rather than a deployment developer key ... the delay is to allow time for the laptop to be reported to OLPC as stolen.

If an XO has both the OLPC and our own deployment developer keys, would it be correct to say that it can receive a developer key from either OLPC or us? Hence, an XO theft must be reported to both OLPC and OLPCAU?

Sridhar
Re: XO-1 developer key does not work
Sridhar -

Yes, that's correct. Multiple valid keys weaken security, since the same rights can be obtained from multiple sources.

Handling your own key-issuing authority is something we fully support, but it is a complex and substantial undertaking. It requires a reasonable commitment to both initial and ongoing staffing infrastructure on your end. I won't advise you not to consider it, but if you're considering it you should take it very seriously. That is particularly true if you are interested in replacing OLPC's various keys with your own (rather than adding to them). If you do so you can get yourself into situations in which no one else can help you. The very well-organized and professional team at Plan Ceibal (who replace OLPC's keys with their own) have had a few difficulties in the field.

It's also important to realize that you'll need to provide support to Quanta's manufacturing team. Sometimes laptops require reworking due to test failures, and that can require them to be unlocked; if they're not using OLPC's keys you'll have to be able to provide those keys yourself.

- Ed

On Mar 14, 2011, at 7:46 PM, Sridhar Dhanapalan wrote: On 14 March 2011 10:58, James Cameron qu...@laptop.org wrote: On Sat, Mar 12, 2011 at 03:46:02PM +1100, Sridhar Dhanapalan wrote: There are three main questions raised by this process: [...] 3. why must I wait 24 hours to get the developer key? Presuming you are asking about an OLPC developer key rather than a deployment developer key ... the delay is to allow time for the laptop to be reported to OLPC as stolen. If an XO has both the OLPC and our own deployment developer keys, would it be correct to say that it can receive a developer key from either OLPC or us? Hence, an XO theft must be reported to both OLPC and OLPCAU? Sridhar
Re: XO-1 developer key does not work
On 15 March 2011 11:01, Ed McNierney e...@laptop.org wrote: Sridhar - Yes, that's correct. Multiple valid keys weaken security, since the same rights can be obtained from multiple sources. Handling your own key-issuing authority is something we fully support, but it is a complex and substantial undertaking. It requires a reasonable commitment to both initial and ongoing staffing infrastructure on your end. I won't advise you not to consider it, but if you're considering it you should take it very seriously. That is particularly true if you are interested in replacing OLPC's various keys with your own (rather than adding to them). If you do so you can get yourself into situations in which no one else can help you. The very well-organized and professional team at Plan Ceibal (who replace OLPC's keys with their own) have had a few difficulties in the field. It's also important to realize that you'll need to provide support to Quanta's manufacturing team. Sometimes laptops require reworking due to test failures, and that can require them to be unlocked; if they're not using OLPC's keys you'll have to be able to provide those keys yourself.

Thanks for that, Ed. At this point, we are having our deployment keys applied to XOs in the factory and field in addition to the standard OLPC keys. Our XOs are not developer locked, but I'm creating the option to lock them later on should the situation call for it. It likely will be quite a while before we seriously consider it.

Sridhar
Re: Memory replacement
On Mar 14, 2011, at 3:18 PM, Arnd Bergmann wrote: On Monday 14 March 2011 19:50:27 John Watlington wrote: Cards that are in the state you describe are most likely dead due to running out of spare blocks. There is nothing that can be done to rehabilitate them, even using the manufacturer's secret code. In a disturbing trend, most of the cards I've returned for failure analysis in the past year have been worn out (and not just trashed meta-data due to a firmware error).

Part of the explanation for this could be the fact that erase block sizes have rapidly increased. AFAIK, the original XO builtin flash had 128KB erase blocks, which is also a common size for 1GB SD and CF cards. Cards made in 2010 or later typically have erase blocks of 2 MB, and combine two of them into an allocation unit of 4 MB. This means that in the worst case (random access over the whole medium), the write amplification has increased by a factor of 32. Another effect is that the page size has increased by a factor of 8, from 2 or 4 KB to 16 or 32 KB. Writing data that is smaller than a page is more likely to get you into the worst case mentioned above. This is part of why FAT32 with 32 KB clusters still works reasonably well, but ext3 with 4 KB blocks has regressed so much.

The explanation is simple: manufacturers moved to two-bit/cell (MLC) NAND Flash over a year ago, and six months ago moved to three-bit/cell (TLC) NAND Flash. Reliability went down, then went through the floor (I cannot recommend TLC for anything but write-once devices). You might have noticed this as an increase in the size of the erase block, as it doubled or more with the change.

Cheers, wad
Emulating the XO
What would it take to get the current OLPC OS working on a standard computer, e.g. using VirtualBox?

I found these instructions, which are deprecated: http://wiki.laptop.org/go/Emulating_the_XO

I'm thinking of giving it a go, but is there anything I should be aware of?

Sridhar
[Server-devel] Hardware compatibility
Hi Server-Devel team,

Reuben referred me to your community. In preparation for a deployment in Gabon, we're testing with Logic Supply the XS school server on the following system:

SolidLogic Atom GS-L02 Fanless Mini-ITX System (Atom-GS-L02)
Mainboard: Intel D945GSEJT Johnstown and HP81
Case: Serener GS-L02 Fanless Mini-ITX Case - Black
Memory: 200-pin DDR2 800 SO-DIMM 2GB
HDD Standard Flash: 3.5 Seagate SATA 7200rpm 1TB
CD/DVD Drive: None
AC Adapter (brick): AC Power Adapter 60W, 12V (PW-12V5A)
Accessories: None
Backplate: None
Power Switch: None - Unit will be set to Auto-Power-On
Wireless: None
Wireless Antenna: None
Mounting: None
Operating System: None
Build and Test: Build Test: Fanless - Standard (3-5 full business days)
Warranty: 1 Year Warranty (Standard)

Also, we'll be using the IlovemyXO USB LAN adapters to link to a switch and 3 PoE access points.

Tomasz at Logic Supply did the Kickstart installation on the system using the last stable image http://xs-dev.laptop.org/xs/OLPC-School-Server-0.6-i386.iso . It installed properly. However, when he rebooted it he saw two errors before the login screen:

1. klogctl: Invalid argument
2. Kdump: failed

I think kdump was the logging mechanism; I'm not sure about the other one. After these two messages, the OS loads to the logon screen.

Can you let me know if these two are significant and if there's an incompatibility with the hardware?

Thanks,

Cordialement, Cordially,

Kaçandre Bourdelais
FranXOphonie http://www.franxophonie.org
kacan...@franxophonie.org
IRC Channel: irc://irc.freenode.net/franxophonie
skype: kacandre

___ Server-devel mailing list Server-devel@lists.laptop.org http://lists.laptop.org/listinfo/server-devel