Re: Memory replacement

2011-03-15 Thread Arnd Bergmann
On Tuesday 15 March 2011 01:29:19 John Watlington wrote:

 On Mar 14, 2011, at 3:18 PM, Arnd Bergmann wrote:

  Another effect is that the page size has increased by a factor of 8,
  from 2 or 4 KB to 16 or 32 KB. Writing data that is smaller than
  a page is more likely to get you into the worst case mentioned
  above. This is part of why FAT32 with 32 KB clusters still works
  reasonably well, but ext3 with 4 KB blocks has regressed so much.
 
 The explanation is simple: manufacturers moved to two-bit/cell (MLC) NAND Flash
 over a year ago, and six months ago moved to three-bit/cell (TLC) NAND Flash.
 Reliability went down, then went through the floor (I cannot recommend TLC for
 anything but write-once devices).   You might have noticed this as an increase in
 the size of the erase block, as it doubled or more with the change.

That, and the move to smaller structures (down to 25 nm) has of course
reduced reliability further, down to 2000 or so erase cycles per block,
but that effect is unrelated to the file system being used.

My point was that even if the card were optimized perfectly for FAT32 (maybe a
write amplification of 2), the changes I described are pessimising ext3
(data from my head, easily off by an order of magnitude):

            drive    block    page    erase     write ampl.     expected life
            size     size     size    cycles    FAT   ext3      FAT      ext3
2005 SLC    256 MB   64 KB    1 KB    100000    2     8         13 TB    3.2 TB
2005 MLC    512 MB   128 KB   2 KB    10000     2     16        2.5 TB   640 GB
2011 SLC    4 GB     2 MB     8 KB    50000     2     512       100 TB   200 GB
2011 MLC    8 GB     4 MB     16 KB   5000      2     1024      20 TB    40 GB
2011 TLC    16 GB    4 MB     16 KB   2000      2     1024      16 TB    32 GB

The manufacturers have probably mitigated this slightly by using more
spare blocks, better ECC and better GC over the years, but essentially
your measurements match the theory.
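
To make the arithmetic behind the table explicit: expected life is
roughly drive size * erase cycles / write amplification. A minimal C
sketch, plugging in the 2011 MLC row from the table (numbers as rough
as everything above):

  #include <stdio.h>

  /* expected life in bytes = capacity * erase_cycles / amplification */
  static double life(double capacity, double cycles, double amplification)
  {
          return capacity * cycles / amplification;
  }

  int main(void)
  {
          double gb = 1e9;        /* 2011 MLC row: 8 GB, 5000 cycles */

          printf("FAT:  %.0f GB\n", life(8 * gb, 5000, 2) / gb);
          printf("ext3: %.0f GB\n", life(8 * gb, 5000, 1024) / gb);
          return 0;
  }

which prints about 20000 GB (20 TB) for FAT and 39 GB for ext3,
matching the table.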

Arnd


Re: Memory replacement

2011-03-14 Thread Arnd Bergmann
On Sunday 13 March 2011, Mikus Grinbergs wrote:
  The tests have also helped expose other issues with things like sudden 
  power off.  In one case a SPO during a write would corrupt the card so 
  badly it became useless.  You could only recover them via a super secret 
  tool from the manufacturer.
 
 Is there any sledgehammer process available to users without a super
 secret tool ?

You can recover some cards by issuing an erase on the full drive.
Unfortunately, this requires a patch to the SDHCI device driver,
which is only now going into the kernel; I think it will be
in 2.6.39.

Issuing an erase (ioctl BLKDISCARD) also helps recover the performance
on cards that get slower with increased internal fragmentation, but
most cards use GC algorithms far too simple to get into that problem
in the first place.
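
For reference, a minimal sketch of issuing such a full-device discard
from user space once a kernel with that support is running (the device
node is only an example, and this wipes the entire card):

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdint.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/fs.h>           /* BLKDISCARD, BLKGETSIZE64 */

  int main(void)
  {
          uint64_t size, range[2];
          int fd = open("/dev/mmcblk0", O_WRONLY);  /* example device */

          if (fd < 0) {
                  perror("open");
                  return 1;
          }
          if (ioctl(fd, BLKGETSIZE64, &size)) {
                  perror("BLKGETSIZE64");
                  return 1;
          }
          range[0] = 0;           /* start of the drive */
          range[1] = size;        /* erase all of it */
          if (ioctl(fd, BLKDISCARD, range))
                  perror("BLKDISCARD");  /* fails without driver support */
          close(fd);
          return 0;
  }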

Arnd


Re: Memory replacement

2011-03-14 Thread Richard A. Smith
On 03/13/2011 06:34 PM, Mikus Grinbergs wrote:
 The tests have also helped expose other issues with things like sudden
 power off.  In one case a SPO during a write would corrupt the card so
 badly it became useless.  You could only recover them via a super secret
 tool from the manufacturer.

 Is there any sledgehammer process available to users without a super
 secret tool ?

Wasn't just secret to users.  They would not give us the info on how to
do it either.  It was vendor-specific, so not really worth the effort of
trying to reverse engineer.

-- 
Richard A. Smith  rich...@laptop.org
One Laptop per Child


Re: Memory replacement

2011-03-14 Thread Arnd Bergmann
On Sunday 13 March 2011, Richard A. Smith wrote:
 On 03/13/2011 01:21 PM, Arnd Bergmann wrote:
 There's a 2nd round of test(s) that runs during the manufacturing and
 burn-in phases. One is a simple firmware test to see if you can talk to
 the card at all and then one runs at burn-in.  It doesn't have a minimum
 write size criterion, but during the run there must not be any bit errors.

ok.

  It does seem a bit crude, because many cards are not really suitable
  for this kind of file system when their wear leveling is purely optimized
  for the accesses defined in the SD card file system specification.
 
  If you did this on e.g. a typical Kingston card, it can have a write
  amplification 100 times higher than normal (FAT32, nilfs2, ...), so
  it gets painfully slow and wears out very quickly.
 
 Crude as they are, they have been useful tests for us.  Our top criterion
 is reliability.  We want to ship the machines with an SD card that's going
 to last for the 5 year design life using the filesystem we ship.  We
 tried to create an access pattern that was the worst possible and the
 highest stress on the wear leveling system.

I see. Using the 2 KB block size on ext3 as described in the Wiki should
certainly do that, even on old cards that use 4 KB pages. I typically
misalign the partition by a few sectors to get a similar effect,
doubling the amount of internal garbage collection.

I guess the real images use a higher block size, right?

  I had hoped that someone already correlated the GC algorithms with
  the requirements of specific file systems to allow a more systematic
  approach.
 
 At the time we started doing this testing, none of the log-structured
 filesystems were deemed to be mature enough for us to ship. So we didn't
 bother to try and torture test using them.
 
 If more precise tests were created that still allowed us to make a
 reasonable estimate of data write lifetime, we would be happy to start
 using them.

The tool that I'm working on is git://git.linaro.org/people/arnd/flashbench.git
It can be used to characterize a card in terms of its erase block size,
number of open erase blocks, FAT optimized sections of the card, and
possible access patterns inside of erase blocks, all by doing raw block
I/O. Using it is currently a more manual process than I would like it to
be before giving it to regular users. It also needs to be correlated
to block access patterns from the file system. When you have that, it
should be possible to accurately predict the amount of write amplification,
which directly relates to how long the card ends up living.
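
To give an idea of the manual process as it stands: a typical session
starts with the timing-based alignment test to guess the erase block
size, then uses --open-au runs with varying --open-au-nr to count how
many erase blocks the card can keep open (the device node below is an
example, and the write tests are destructive to data on the card):

  ./flashbench -a /dev/mmcblk0 --blocksize=1024
  ./flashbench --open-au --open-au-nr=2 --erasesize=$[4 * 1024 * 1024] \
          --blocksize=$[256 * 1024] /dev/mmcblk0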

What I cannot determine right now is whether the card does static wear
leveling. I have a Panasonic card that is advertised as doing it, but
I haven't been able to pin down when that happens using timing attacks.

Another thing you might be interested in is my other work on a block
remapper that is designed to reduce the garbage collection by writing
data in a log-structured way, similar to how some SSDs work internally.
This will also do static wear leveling, as a way to improve the expected
life by multiple orders of magnitude in some cases.
https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashDeviceMapper
lists some concepts I want to use, but I have done a lot of changes
to the design that are not yet reflected in the Wiki. I need to
talk to more people at the Embedded Linux Conference and Storage/FS summit
in San Francisco to make sure I get that right.

Arnd


Re: Memory replacement

2011-03-14 Thread John Watlington

On Mar 13, 2011, at 6:34 PM, Mikus Grinbergs wrote:

 The tests have also helped expose other issues with things like sudden 
 power off.  In one case a SPO during a write would corrupt the card so 
 badly it became useless.  You could only recover them via a super secret 
 tool from the manufacturer.
 
 Is there any sledgehammer process available to users without a super
 secret tool ?

No.   Such software does exist for every controller, but it doesn't
necessarily use the SD interface as SD.

 I've encountered SD cards which will be recognized as a device when
 plugged in to a running XO-1 (though 'ls' of a filesystem on that SD
 card is corrupt) -- but 'fdisk' is ineffective when I want to write a
 new partition table (and 'fsck' appears to loop).  Since otherwise I'd
 just have to throw the card away, I'd be willing to apply EXTREME
 measures to get such a card into a reusable (blank slate) condition.


Cards that are in the state you describe are most likely dead due to
running out of spare blocks.   There is nothing that can be done to
rehabilitate them, even using the manufacturer's secret code.
In a disturbing trend, most of the cards I've returned for failure analysis
in the past year have been worn out (and not just trashed meta-data
due to a firmware error).

Bummer,
wad



Re: Memory replacement

2011-03-14 Thread Arnd Bergmann
On Monday 14 March 2011 19:50:27 John Watlington wrote:
 Cards that are in the state you describe are most likely dead due to
 running out of spare blocks.   There is nothing that can be done to
 rehabilitate them, even using the manufacturer's secret code.
 In a disturbing trend, most of the cards I've returned for failure analysis
 in the past year have been worn out (and not just trashed meta-data
 due to a firmware error).

Part of the explanation for this could be the fact that erase block
sizes have rapidly increased. AFAIK, the original XO's built-in flash
had 128 KB erase blocks, which is also a common size for 1 GB SD and
CF cards.

Cards made in 2010 or later typically have erase blocks of 2 MB, and
combine two of them into an allocation unit of 4 MB. This means that
in the worst case (random access over the whole medium), the write
amplification has increased by a factor of 32 (4 MB / 128 KB).

Another effect is that the page size has increased by a factor of 8,
from 2 or 4 KB to 16 or 32 KB. Writing data that is smaller than
a page is more likely to get you into the worst case mentioned
above. This is part of why FAT32 with 32 KB clusters still works
reasonably well, but ext3 with 4 KB blocks has regressed so much.

Arnd


Re: Memory replacement

2011-03-14 Thread John Watlington

On Mar 14, 2011, at 3:18 PM, Arnd Bergmann wrote:

 On Monday 14 March 2011 19:50:27 John Watlington wrote:
 Cards that are in the state you describe are most likely dead due to
 running out of spare blocks.   There is nothing that can be done to
 rehabilitate them, even using the manufacturer's secret code.
 In a disturbing trend, most of the cards I've returned for failure analysis
 in the past year have been worn out (and not just trashed meta-data
 due to a firmware error).
 
 Part of the explanation for this could be the fact that erase block
 sizes have rapidly increased. AFAIK, the original XO's built-in flash
 had 128 KB erase blocks, which is also a common size for 1 GB SD and
 CF cards.

 Cards made in 2010 or later typically have erase blocks of 2 MB, and
 combine two of them into an allocation unit of 4 MB. This means that
 in the worst case (random access over the whole medium), the write
 amplification has increased by a factor of 32 (4 MB / 128 KB).
 
 Another effect is that the page size has increased by a factor of 8,
 from 2 or 4 KB to 16 or 32 KB. Writing data that is smaller than
 a page is more likely to get you into the worst case mentioned
 above. This is part of why FAT32 with 32 KB clusters still works
 reasonably well, but ext3 with 4 KB blocks has regressed so much.

The explanation is simple: manufacturers moved to two-bit/cell (MLC) NAND Flash
over a year ago, and six months ago moved to three-bit/cell (TLC) NAND Flash.
Reliability went down, then went through the floor (I cannot recommend TLC for
anything but write-once devices).   You might have noticed this as an increase in
the size of the erase block, as it doubled or more with the change.

Cheers,
wad



Re: Memory replacement

2011-03-13 Thread C. Scott Ananian
On Sun, Mar 13, 2011 at 8:57 AM, Andrei Warkentin andr...@motorola.com wrote:
 Sorry to butt in, I think I'm missing most of the context
 here... nevertheless... I'm curious, ignoring outer packaging and
 product names, if you look at cards with the same CID (i.e. same
 manfid/oemid/date/firmware and hw rev), do you get same performance
 characteristics?

No.
 --scott

-- 
                         ( http://cscott.net/ )


Re: Memory replacement

2011-03-13 Thread C. Scott Ananian
On Sun, Mar 13, 2011 at 1:00 PM, C. Scott Ananian csc...@laptop.org wrote:
 On Sun, Mar 13, 2011 at 8:57 AM, Andrei Warkentin andr...@motorola.com 
 wrote:
 Sorry to butt in, I think I'm missing most of the context
 here... nevertheless... I'm curious, ignoring outer packaging and
 product names, if you look at cards with the same CID (i.e. same
 manfid/oemid/date/firmware and hw rev), do you get same performance
 characteristics?

 No.

To elaborate: see bunnie's blog post (cited above) on how the CID is
often forged or wrong.  I've also personally witnessed a
manufacturer's rep come to the factory floor to reprogram a compact
flash card's internal microcontroller with new firmware.  This did not
update any externally visible information reported by the chip.  I had
to convince the manufacturer to leave their proprietary hardware on
the factory floor in order to be able to verify that future units
would have the correct firmware.  (Granted, this was not an MMC unit,
but I would be surprised if MMC vendors were significantly different
in this regard.)

If you've spent any time working with Chinese/Taiwanese OEMs, you will
notice that version control methodologies are (in general)
disappointingly lax.
 --scott

-- 
                         ( http://cscott.net/ )


Re: Memory replacement

2011-03-13 Thread Arnd Bergmann
On Sunday 13 March 2011 02:01:22 C. Scott Ananian wrote:
 On Sat, Mar 12, 2011 at 5:51 PM, Arnd Bergmann a...@arndb.de wrote:
  I've had four cards with a Sandisk label that had unusual characteristics
  and manufacturer/OEM IDs that refer to other companies, three Samsung (SM)
  and one unknown (BE, possibly lexar). In all cases, the Sandisk support
  has confirmed from photos that the cards are definitely fake. They also
 
 Please see the blog post I cited in the email immediately prior to
 yours, which discusses this situation precisely.  Often the cards are
 not actually fake -- they may even be produced on the exact same
 equipment as the usual cards, but off the books during hours when
 the factory is officially closed.  This sort of thing is very very
 widespread, and fakes can come even via official distribution
 channels.  (Discussed in bunnie's post.)

I am very familiar with bunnie's research, and have referenced
it from my own page on the linaro wiki. I have also found Kingston
cards with the exact same symptoms that triggered his original
interest (very slow, manfid 0x41, oemid 42, low serial number).

Another interesting case of a fake card I found had a Sandisk
label and LEXAR in its MMC name field. Moreover, it actually
contained copyrighted software that Lexar ships in their real
cards. So what I'd assume is happening here is that either the factory
that produces the cards, or Lexar itself, had a graveyard shift where they
were just printing Sandisk labels on the cards.

 You're giving the OEMs too much credit.  As John says, unless you
 arrange for a special SKU, even the first source companies will give
 you whatever they've got cheap that day.

It's pretty clear that they are moving to cheaper NAND chips when
possible, and I also mentioned that. For the controller chips, I don't
understand how they would save money by buying them on the spot market.
On the contrary, using the smart controllers that Sandisk themselves
make allows them to use even slower NAND chips and still qualify for
a better nominal speed grade, while companies that don't have access
to decent controllers need to either use chips that are fast enough
to make up for the bad GC algorithm or lie about their speed grades.

  How we deal with this is constant testing and getting notification from
  the manufacturer that they are changing the internals (unfortunately,
  we aren't willing to pay the premium to have a special SKU).
 
  Do you have test results somewhere publicly available? We are currently
  discussing adding some tweaks to the linux mmc drivers to detect cards
  with certain features, and to do some optimizations in the block layer
  for common ones.
 
 http://wiki.laptop.org/go/NAND_Testing

Ok, so the testing essentially means you create an ext2/3/4 file system
and run tests on the file system until the card wears out, right?

It does seem a bit crude, because many cards are not really suitable
for this kind of file system when their wear leveling is purely optimized
for the accesses defined in the SD card file system specification.

If you did this on e.g. a typical Kingston card, it can have a write
amplification 100 times higher than normal (FAT32, nilfs2, ...), so
it gets painfully slow and wears out very quickly.

I had hoped that someone already correlated the GC algorithms with
the requirements of specific file systems to allow a more systematic
approach.

Arnd


Re: Memory replacement

2011-03-13 Thread Andrei Warkentin
On Sat, Mar 12, 2011 at 7:01 PM, C. Scott Ananian csc...@laptop.org wrote:
 On Sat, Mar 12, 2011 at 5:51 PM, Arnd Bergmann a...@arndb.de wrote:
 I've had four cards with a Sandisk label that had unusual characteristics
 and manufacturer/OEM IDs that refer to other companies, three Samsung (SM)
 and one unknown (BE, possibly lexar). In all cases, the Sandisk support
 has confirmed from photos that the cards are definitely fake. They also

 Please see the blog post I cited in the email immediately prior to
 yours, which discusses this situation precisely.  Often the cards are
 not actually fake -- they may even be produced on the exact same
 equipment as the usual cards, but off the books during hours when
 the factory is officially closed.  This sort of thing is very very
 widespread, and fakes can come even via official distribution
 channels.  (Discussed in bunnie's post.)

 However, they have apparently managed to make them work well
 for random access by using some erase blocks as SLC (writing only
 the pages that carry the most significant bit in each cell) and
 by doing log structured writes in there, something that apparently
 others have not figured out yet. Also, as I mentioned, they
 consistently use a relatively large number of open erase blocks.
 I've measured both effects on SD cards and USB sticks.

 You've been lucky.

 I believe you can get this level of sophistication only from
 companies that make the nand flash, the controller and the card:
 Sandisk, Samsung and Toshiba.
 Other brands that just get the controllers and the flash chips
 from whoever sells them cheaply (kingston, adata, panasonic,
 transcend, ...) apparently don't get the really good stuff.

 You're giving the OEMs too much credit.  As John says, unless you
 arrange for a special SKU, even the first source companies will give
 you whatever they've got cheap that day.

 How we deal with this is constant testing and getting notification from
 the manufacturer that they are changing the internals (unfortunately,
 we aren't willing to pay the premium to have a special SKU).

 Do you have test results somewhere publicly available? We are currently
 discussing adding some tweaks to the linux mmc drivers to detect cards
 with certain features, and to do some optimizations in the block layer
 for common ones.

 http://wiki.laptop.org/go/NAND_Testing

 But the testing wad is talking about is really *on the factory floor*:
  Regular sampling of chips as they come into the factory to ensure
 that the chips *you are actually about to put into the XOs* are
 consistent.  Relying on manufacturing data reported by the chips is
 not reliable.
  --scott


Sorry to butt in, I think I'm missing most of the context
here... nevertheless... I'm curious, ignoring outer packaging and
product names, if you look at cards with the same CID (i.e. same
manfid/oemid/date/firmware and hw rev), do you get same performance
characteristics?

Anyway, if you're curious about optimizing performance for certain
cards, I'm curious to see your results, your tests and (if any) vendor
recommendations. I'm collecting data and trying to re-validate some of
the vendor suggestions for Toshiba eMMCs... in particular - splitting
unaligned writes into an unaligned and aligned part. The only thing I
can say now is that the more data I collect the less it makes sense
:-).

I'm resubmitting a change to the MMC layer that allows creating block MMC
quirks... Skipping the actual quirks as I'm trying to revalidate data
taken by others and provide data I'm confident about, but you might be
interested in the overall quirks support if you're thinking about
adding your own.

A


Re: Memory replacement

2011-03-13 Thread Richard A. Smith
On 03/13/2011 01:21 PM, Arnd Bergmann wrote:

 Do you have test results somewhere publicly available? We are currently
 discussing adding some tweaks to the linux mmc drivers to detect cards
 with certain features, and to do some optimizations in the block layer
 for common ones.

 http://wiki.laptop.org/go/NAND_Testing

 Ok, so the testing essentially means you create an ext2/3/4 file system
 and run tests on the file system until the card wears out, right?

The qualifying test is that the card must pass 3TB of writes with no 
errors.  We run that on samples from the various mfg's.
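
As a rough sketch of the general idea (a hypothetical example, not the
actual test tool; device node, capacity and block size are placeholders):
write patterned blocks at random offsets through the raw device, read
them back, and stop at the first I/O error or mismatch.

  #define _GNU_SOURCE             /* O_DIRECT */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  #define BLK (64 * 1024)         /* placeholder write size */

  int main(void)
  {
          uint64_t total = 0, nblocks = (2ULL << 30) / BLK;  /* 2 GB card */
          int fd = open("/dev/mmcblk0", O_RDWR | O_DIRECT);  /* placeholder */
          char *wb, *rb;

          if (fd < 0 || posix_memalign((void **)&wb, 4096, BLK) ||
              posix_memalign((void **)&rb, 4096, BLK))
                  return 1;
          for (;;) {
                  off_t off = (off_t)(random() % nblocks) * BLK;

                  memset(wb, total & 0xff, BLK);  /* varying pattern */
                  if (pwrite(fd, wb, BLK, off) != BLK)
                          break;                  /* write error */
                  if (pread(fd, rb, BLK, off) != BLK || memcmp(wb, rb, BLK))
                          break;                  /* bit error */
                  total += BLK;
          }
          printf("%llu bytes written before first failure\n",
                 (unsigned long long)total);
          return 0;
  }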

There's a 2nd round of test(s) that runs during the manufacturing and
burn-in phases. One is a simple firmware test to see if you can talk to
the card at all and then one runs at burn-in.  It doesn't have a minimum
write size criterion, but during the run there must not be any bit errors.

 It does seem a bit crude, because many cards are not really suitable
 for this kind of file system when their wear leveling is purely optimized
 for the accesses defined in the SD card file system specification.

 If you did this on e.g. a typical Kingston card, it can have a write
 amplification 100 times higher than normal (FAT32, nilfs2, ...), so
 it gets painfully slow and wears out very quickly.

Crude as they are, they have been useful tests for us.  Our top criterion
is reliability.  We want to ship the machines with an SD card that's going
to last for the 5 year design life using the filesystem we ship.  We
tried to create an access pattern that was the worst possible and the
highest stress on the wear leveling system.

If a card passes the 3TB abuse test then we are pretty certain it's going
to meet that goal.  There were many cards that died very quickly.

The tests have also helped expose other issues with things like sudden 
power off.  In one case a SPO during a write would corrupt the card so 
badly it became useless.  You could only recover them via a super secret 
tool from the manufacturer.

 I had hoped that someone already correlated the GC algorithms with
 the requirements of specific file systems to allow a more systematic
 approach.

At the time we started doing this testing, none of the log-structured
filesystems were deemed to be mature enough for us to ship. So we didn't
bother to try and torture test using them.

If more precise tests were created that still allowed us to make a
reasonable estimate of data write lifetime, we would be happy to start
using them.

-- 
Richard A. Smith  rich...@laptop.org
One Laptop per Child


Re: Memory replacement

2011-03-13 Thread Mikus Grinbergs
 The tests have also helped expose other issues with things like sudden 
 power off.  In one case a SPO during a write would corrupt the card so 
 badly it became useless.  You could only recover them via a super secret 
 tool from the manufacturer.

Is there any sledgehammer process available to users without a super
secret tool ?

I've encountered SD cards which will be recognized as a device when
plugged in to a running XO-1 (though 'ls' of a filesystem on that SD
card is corrupt) -- but 'fdisk' is ineffective when I want to write a
new partition table (and 'fsck' appears to loop).  Since otherwise I'd
just have to throw the card away, I'd be willing to apply EXTREME
measures to get such a card into a reusable (blank slate) condition.

mikus



Re: Memory replacement

2011-03-12 Thread John Watlington

On Mar 11, 2011, at 5:35 AM, Arnd Bergmann wrote:

 I've tested around a dozen media from them, and while it is true
 that they use rather different algorithms and NAND chips inside, all
 of them can write to at least 5 erase blocks before getting into
 garbage collection, which is really needed for ext3 file systems.
 
 Contrast this with Kingston cards, which all use the same algorithm
 and can only write data linearly to one erase block at a time, resulting
 in one or two orders of magnitude higher internal write amplification.
 
 Most other vendors are somewhere in between, and you sometimes get
 fake cards that don't do what you expect, such as a bunch of Samsung
 microSDHC cards that I have which are labeled Sandisk on the outside.

Those aren't fakes.   That is what I'm trying to get across.

 I've also seen some really cheap noname cards outperform similar-spec'd
 Sandisk cards, both regarding maximum throughput and the garbage collection
 algorithms, but you can't rely on that.


My point is that you can't rely on Sandisk either.

I've been in discussion with both Sandisk and Adata about these issues,
as well as constantly testing batches of new SD cards from all major
vendors.  Unless you pay a lot extra and order at least 100K, you have no
control over what they give you.   They don't just change NAND chips,
they change the controller chip and its firmware.  Frequently.
And they don't update either the SKU number, part marking or the
identification fields available to software.  The manufacturing batch
number printed on the outside is the only thing that changes.

How we deal with this is constant testing and getting notification from
the manufacturer that they are changing the internals (unfortunately,
we aren't willing to pay the premium to have a special SKU).

Cheers,
wad



Re: Memory replacement

2011-03-12 Thread C. Scott Ananian
Canonical related blog post: http://www.bunniestudios.com/blog/?p=918

Mandatory reading for anyone who has to deal with flash memory.
 --scott

-- 
                         ( http://cscott.net/ )


Re: Memory replacement

2011-03-12 Thread Arnd Bergmann
On Friday 11 March 2011 18:28:49 John Watlington wrote:
 
 On Mar 11, 2011, at 5:35 AM, Arnd Bergmann wrote:
 
  I've tested around a dozen media from them, and while it is true
  that they use rather different algorithms and NAND chips inside, all
  of them can write to at least 5 erase blocks before getting into
  garbage collection, which is really needed for ext3 file systems.
  
  Contrast this with Kingston cards, which all use the same algorithm
  and can only write data linearly to one erase block at a time, resulting
  in one or two orders of magnitude higher internal write amplification.
  
  Most other vendors are somewhere in between, and you sometimes get
  fake cards that don't do what you expect, such as a bunch of Samsung
  microSDHC cards that I have which are labeled Sandisk on the outside.
 
 Those aren't fakes.   That is what I'm trying to get across.

I've had four cards with a Sandisk label that had unusual characteristics
and manufacturer/OEM IDs that refer to other companies, three Samsung (SM)
and one unknown (BE, possibly lexar). In all cases, the Sandisk support
has confirmed from photos that the cards are definitely fake. They also
explained that all authentic cards (possibly fake ones as well, but I have
not seen them) will be labeled "Made in China", not "Made in Korea" or
"Made in Taiwan" as my fake ones are, and that the authentic microSD cards have
the serial number on the front side, not on the back.

  I've also seen some really cheap noname cards outperform similar-spec'd
  Sandisk cards, both regarding maximum throughput and the garbage collection
  algorithms, but you can't rely on that.
 
 
 My point is that you can't rely on Sandisk either.
 
 I've been in discussion with both Sandisk and Adata about these issues,
 as well as constantly testing batches of new SD cards from all major
 vendors.

 Unless you pay a lot extra and order at least 100K, you have no
 control over what they give you.   They don't just change NAND chips,
 they change the controller chip and its firmware.  Frequently.
 And they don't update either the SKU number, part marking or the
 identification fields available to software.  The manufacturing batch
 number printed on the outside is the only thing that changes.

I agree that you cannot rely on specific behavior to stay the 
same with any vendor. One thing I noticed for instance is that
many new Sandisk cards are using TLC (triple-level cell) NAND,
which is inherently slower and cheaper than the regular two-bit MLC
used in older cards or those from other vendors.

However, they have apparently managed to make them work well
for random access by using some erase blocks as SLC (writing only
the pages that carry the most significant bit in each cell) and
by doing log structured writes in there, something that apparently
others have not figured out yet. Also, as I mentioned, they
consistently use a relatively large number of open erase blocks.
I've measured both effects on SD cards and USB sticks.

I believe you can get this level of sophistication only from
companies that make the nand flash, the controller and the card:
Sandisk, Samsung and Toshiba.

Other brands that just get the controllers and the flash chips
from whoever sells them cheaply (kingston, adata, panasonic,
transcend, ...) apparently don't get the really good stuff.
 
 How we deal with this is constant testing and getting notification from
 the manufacturer that they are changing the internals (unfortunately,
 we aren't willing to pay the premium to have a special SKU).

Do you have test results somewhere publicly available? We are currently
discussing adding some tweaks to the linux mmc drivers to detect cards
with certain features, and to do some optimizations in the block layer
for common ones.

Arnd


Re: Memory replacement

2011-03-12 Thread C. Scott Ananian
On Sat, Mar 12, 2011 at 5:51 PM, Arnd Bergmann a...@arndb.de wrote:
 I've had four cards with a Sandisk label that had unusual characteristics
 and manufacturer/OEM IDs that refer to other companies, three Samsung (SM)
 and one unknown (BE, possibly lexar). In all cases, the Sandisk support
 has confirmed from photos that the cards are definitely fake. They also

Please see the blog post I cited in the email immediately prior to
yours, which discusses this situation precisely.  Often the cards are
not actually fake -- they may even be produced on the exact same
equipment as the usual cards, but off the books during hours when
the factory is officially closed.  This sort of thing is very very
widespread, and fakes can come even via official distribution
channels.  (Discussed in bunnie's post.)

 However, they have apparently managed to make them work well
 for random access by using some erase blocks as SLC (writing only
 the pages that carry the most significant bit in each cell) and
 by doing log structured writes in there, something that apparently
 others have not figured out yet. Also, as I mentioned, they
 consistently use a relatively large number of open erase blocks.
 I've measured both effects on SD cards and USB sticks.

You've been lucky.

 I believe you can get this level of sophistication only from
 companies that make the nand flash, the controller and the card:
 Sandisk, Samsung and Toshiba.
 Other brands that just get the controllers and the flash chips
 from whoever sells them cheaply (kingston, adata, panasonic,
 transcend, ...) apparently don't get the really good stuff.

You're giving the OEMs too much credit.  As John says, unless you
arrange for a special SKU, even the first source companies will give
you whatever they've got cheap that day.

 How we deal with this is constant testing and getting notification from
 the manufacturer that they are changing the internals (unfortunately,
 we aren't willing to pay the premium to have a special SKU).

 Do you have test results somewhere publicly available? We are currently
 discussing adding some tweaks to the linux mmc drivers to detect cards
 with certain features, and to do some optimizations in the block layer
 for common ones.

http://wiki.laptop.org/go/NAND_Testing

But the testing wad is talking about is really *on the factory floor*:
 Regular sampling of chips as they come into the factory to ensure
that the chips *you are actually about to put into the XOs* are
consistent.  Relying on manufacturing data reported by the chips is
not reliable.
  --scott

-- 
                         ( http://cscott.net/ )


Re: Memory replacement

2011-03-11 Thread Arnd Bergmann
On Friday 11 March 2011, John Watlington wrote:
 On Mar 9, 2011, at 2:23 PM, Arnd Bergmann wrote:
 
  On Wednesday 09 March 2011 17:31:24 Kevin Gordon wrote:
  go, no-go, spend the extra pennies and get a Class 4/6/8/10
  
  Note that Class 8 does not exist (except fakes) and class 10 is
  usually not faster than class 6 if you run ext3 on it.
  
  Also, a Sandisk card is usually faster than a card from
  most other manufacturers even if they are one class faster
  nominally.
 
 I'll call BS on that claim.   Sandisk cards are all over the map,
 depending on the controller used internally.  Please understand
 that these manufacturers change controllers all the time --- test
 results from nine months ago are invalid.

I've tested around a dozen media from them, and while it is true
that they use rather different algorithms and NAND chips inside, all
of them can write to at least 5 erase blocks before getting into
garbage collection, which is really needed for ext3 file systems.

Contrast this with Kingston cards, which all use the same algorithm
and can only write data linearly to one erase block at a time, resulting
in one or two orders of magnitude higher internal write amplification.

Most other vendors are somewhere in between, and you sometimes get
fake cards that don't do what you expect, such as a bunch of Samsung
microSDHC cards that I have which are labeled Sandisk on the outside.

I've also seen some really cheap noname cards outperform similar-spec'd
Sandisk cards, both regarding maximum throughput and the garbage collection
algorithms, but you can't rely on that.

Arnd


Re: Memory replacement

2011-03-10 Thread Sridhar Dhanapalan
On 10 March 2011 05:51, Paul Fox p...@laptop.org wrote:
 kevin wrote:
   and having my anti-static wrist guard properly attached - advice please: go,
   no-go, spend the extra pennies and get a Class 4/6/8/10.  All I know for
   sure is the 2GiB card in there has to be replaced.  There are progressively

 if you're using the machine a lot, and you have the pennies, the difference
 a faster card makes will be noticeable.

I assume that this is just for personal use and not for deployment. A
few months ago I enquired about the possibility of getting Class 6
cards in our deployment XOs, and was informed that none of the Class 6
cards tested could pass reliability tests. That sort of thing becomes
critical if you want the XO to last for five years in the field.

Sridhar


Re: Memory replacement

2011-03-10 Thread John Watlington

On Mar 9, 2011, at 2:23 PM, Arnd Bergmann wrote:

 On Wednesday 09 March 2011 17:31:24 Kevin Gordon wrote:
 go, no-go, spend the extra pennies and get a Class 4/6/8/10
 
 Note that Class 8 does not exist (except fakes) and class 10 is
 usually not faster than class 6 if you run ext3 on it.
 
 Also, a Sandisk card is usually faster than a card from
 most other manufacturers even if they are one class faster
 nominally.

I'll call BS on that claim.   Sandisk cards are all over the map,
depending on the controller used internally.Please understand
that these manufacturers change controllers all the time --- test
results from nine months ago are invalid.

 See 
 https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashCardSurvey
 for a list of many cards. Go for a brand that has a larger
 number of open segments. Also make sure that the partition
 is aligned to 4 MB, otherwise you waste half the performance
 and expected life.

We align our images to 8 MB boundaries, as 4MB isn't enough
for some cards.
Since fs-update installs the partition table as well as the partition
images, this happens automatically.

Cheers,
wad



Re: Memory replacement

2011-03-09 Thread Kevin Gordon
Mikus and James and the gang:

OK, the little 8GiB microSD card inserted into an SD adapter, inserted into
the external SD slot, passed the dir test that James said to perform at
OFW.  Didn't complain.  However, it is a Class 2 Sandisk card, so it might
not really be the right way to go.  Before I do the surgery, armed with
silver heat sink paste, and being very careful about pressure on the m/b,
and having my anti-static wrist guard properly attached - advice please: go,
no-go, spend the extra pennies and get a Class 4/6/8/10.  All I know for
sure is the 2GiB card in there has to be replaced.  There are progressively
more and more red squares appearing on every refresh, and since this is a
'contributors machine' that I screw up regularly testing a billion USB
contraptions, I reload almost every day that I use it :-)

Thanks gents.

KG

On Sun, Mar 6, 2011 at 4:06 PM, Mikus Grinbergs mi...@bga.com wrote:

  how to upgrade the SD card ?

 All you have to do is stick the new card in, then perform the
 'fs-update' (with an appropriate-sized .zd image).  The ENTIRE
 SD-card-content will be written-over-anew, including the partition table.

 

 The catch is that the micro-SD card is beneath the heat spreader - and
 once you replace the card, you have to make sure that the heat spreader
 has thermal contact with the CPU chip.  [I suspect that too much
 physical pressure at the CPU chip might have contributed to the #10314
 failures of some pre-production XO-1.5 motherboards.]

 

  a fresh new 8GiB micro-SD card

 I had good luck with an 8GiB micro-SD card in the 2009 XO-1.5 I had (but
 that system eventually stopped working - I don't know if my having put
 in a different micro-SD card had any relationship to that failure).

 What appears to matter most with the micro-SD card is its SPEED (it
 would take a heck of a lot of Journal entries to fill even a 4GiB card).
  There was a discussion on the devel list a long while ago about the
 transfer speeds measured on the XO-1.5 with several different cards --
 apparently NOT ALL the available cards met OLPC reliability specs.


 mikus







Re: Memory replacement

2011-03-09 Thread Mikus Grinbergs
 advice please: go, no-go, 
 spend the extra pennies and get a Class 4/6/8/10

Go.

I was interested in having a higher-performing XO-1.5 -- so the card I
bought back then was a class 6.  It is likely the micro-SD card you have
now is a class 2 -- so your new card (Sandisk has a good reputation for
reliable SD cards) will perform the same as what you are used to.

Since your existing card is deteriorating - go ahead and replace it.

mikus




Re: Memory replacement

2011-03-09 Thread Paul Fox
kevin wrote:
  Mikus and James and the gang:
  
  OK, the little 8GiB microSD card inserted into an SD adapter, inserted into
  the external SD slot, passed the dir test that James said to perform at
  OFW.  Didn't complain.  However, it is a Class 2 Sandisk card, so it might
  not really be the right way to go.  Before I do the surgery, armed with
  silver heat sink paste, and being very careful about pressure on the m/b,

no paste is used when the laptops are manufactured, and none should
really be necessary afterward.  it's true that later heat spreaders
were modified (with an extra attachment point) to ensure proper
contact with the cpu, but you can help ensure the same thing by gently
bending each of the flat feet that holds a screw slightly downward. 
(not the ones that _don't_ hold a screw -- leave those flat.)  bending
the mounting tabs down will cause the center of the heat spreader to
bow towards the motherboard when the screws flatten out the feet.

there's a picture here:
 
http://lists.laptop.org/pipermail/devel/attachments/20091103/4e777bb2/attachment-0001.pdf
(it's an attachment to this devel message:
 http://lists.laptop.org/pipermail/devel/2009-November/026110.html )

  and having my anti-static wrist guard properly attached - advice please: go,
  no-go, spend the extra pennies and get a Class 4/6/8/10.  All I know for
  sure is the 2GiB card in there has to be replaced.  There are progressively

if you're using the machine a lot, and you have the pennies, the difference
a faster card makes will be noticeable.

paul

  more and more red squares appearing on every refresh, and since this is a
  'contributors machine' that I screw up regularly testing a billion USB
  contraptions, I reload almost every day that I use it :-)
  
  Thanks gents.

and ladies, i'm sure.

=-
 paul fox, p...@laptop.org


Re: Memory replacement

2011-03-09 Thread Arnd Bergmann
On Wednesday 09 March 2011 17:31:24 Kevin Gordon wrote:
 go, no-go, spend the extra pennies and get a Class 4/6/8/10

Note that Class 8 does not exist (except fakes) and class 10 is
usually not faster than class 6 if you run ext3 on it.

Also, a Sandisk card is usually faster than a card from
most other manufacturers even if they are one class faster
nominally.

See 
https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashCardSurvey
for a list of many cards. Go for a brand that has a larger
number of open segments. Also make sure that the partition
is aligned to 4 MB, otherwise you waste half the performance
and expected life.
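
One way to get that alignment when partitioning by hand (a sketch using
parted; the device node is an example, and mklabel replaces the existing
partition table):

  parted -s /dev/mmcblk0 mklabel msdos
  parted -s /dev/mmcblk0 mkpart primary ext2 4MiB 100%

Starting the first partition at 4MiB puts it exactly on a 4 MB boundary.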

Arnd


Re: Memory replacement

2011-03-09 Thread Mikus Grinbergs
 Also make sure that the partition
 is aligned to 4 MB, otherwise you waste half the performance
 and expected life.

I do this for every SD card onto which I myself write the partition table.

But I think the .zd files re-write the WHOLE SD card (including its
partition table).  If that is true, then the person replacing the SD
card has no control over where the partitions get placed -- only the
person who created the .zd file can customize that partition table.

mikus



Re: Memory replacement

2011-03-09 Thread James Cameron
On Wed, Mar 09, 2011 at 03:15:12PM -0600, Mikus Grinbergs wrote:
 But I think the .zd files re-write the WHOLE SD card (including its
 partition table).  If that is true, then the person replacing the SD
 card has no control over where the partitions get placed -- only the
 person who created the .zd file can customize that partition table.

This is true.

-- 
James Cameron
http://quozl.linux.org.au/


Re: Memory replacement

2011-03-09 Thread James Cameron
On Wed, Mar 09, 2011 at 11:31:24AM -0500, Kevin Gordon wrote:
 OK, the little 8GiB microSD card inserted into an SD adapter, inserted
 into the external SD slot, passed the dir test that James said to
 perform at OFW.  Didn't complain.

Good.  You must test again if you change cards, by the way.

 However, it is a Class 2 Sandisk card, so it might not really be the
 right way to go.

You can fs-update it in the external slot, and this will give you a good
indication of write performance in two ways; the flickering of the
storage LED during the fs-update, and the total time shown at the end of
the fs-update.

ok devalias fsdisk /sd/disk@1:0
ok fs-update ...

Then you can boot it, and the laptop should boot from the external card
and not use the internal card.

 Before I do the surgery, armed with silver heat sink paste,

I've never tried heat sink paste, I don't know if it is recommended, but
there's a thermal test in OpenFirmware you might try before and after.

ok test /switches

It instructs you to close the laptop, after which you should open it,
then it asks for e-book configuration.

Note the temperature rise.  Note your environment temperature so that
you can reproduce the test reliably.

 advice please: go, no-go, spend the extra pennies and get a Class
 4/6/8/10.

A faster card is certainly worth getting, but Class 4/6/8/10 doesn't
really indicate how fast it will be.  The definition of these classes
corresponds to sequential write rate by a camera to a FAT filesystem,
not random writes to an ext3 filesystem!

 All I know for sure is the 2GiB card in there has to be replaced.
 There are progressively more and red squares appearing on every
 refresh,

I don't agree with your assessment.  It might easily be another defect,
and not the card itself.  If it were in my hands I would do more write
testing on the internal card, using Tiny Core booted from USB.

You might defer the surgery and instead use the external slot for a
while.

-- 
James Cameron
http://quozl.linux.org.au/


Re: Memory replacement

2011-03-06 Thread Chris Ball
Hi,

On Sun, Mar 06 2011, Kevin Gordon wrote:
 Might someone be able to point me to the place where one can get
 instructions on how to upgrade the SD card from an old XO 1.5 currently
 with 2GiB, to a fresh new 8GiB micro-SD card?

Just:

wget http://build.laptop.org/10.1.3/xo-1.5/os860/os860-8g.zd
ok fs-update os860-8g.zd

(OFW does the formatting for you.)

- Chris.
-- 
Chris Ball   c...@laptop.org   http://printf.net/
One Laptop Per Child


Re: Memory replacement

2011-03-06 Thread Mikus Grinbergs
 how to upgrade the SD card ?

All you have to do is stick the new card in, then perform the
'fs-update' (with an appropriate-sized .zd image).  The ENTIRE
SD-card-content will be written-over-anew, including the partition table.



The catch is that the micro-SD card is beneath the heat spreader - and
once you replace the card, you have to make sure that the heat spreader
has thermal contact with the CPU chip.  [I suspect that too much
physical pressure at the CPU chip might have contributed to the #10314
failures of some pre-production XO-1.5 motherboards.]



 a fresh new 8GiB micro-SD card

I had good luck with an 8GiB micro-SD card in the 2009 XO-1.5 I had (but
that system eventually stopped working - I don't know if my having put
in a different micro-SD card had any relationship to that failure).

What appears to matter most with the micro-SD card is its SPEED (it
would take a heck of a lot of Journal entries to fill even a 4GiB card).
 There was a discussion on the devel list a long while ago about the
transfer speeds measured on the XO-1.5 with several different cards --
apparently NOT ALL the available cards met OLPC reliability specs.


mikus



