This is working moderately well for me now, well enough to
circulate to see if other folk have the same results.
Test platform is DM355 EVM.

These three patches go on top of current GIT code (including
the 07190aa9f93b2ff107c15ef2e6c2c4a6dd266275 cleanup patch):

 1) Support dm355 4-bit ECC as NAND_ECC_HW_SYNDROME mode,
    supporting SMALL page chips only.  These won't need
    to trash the manufacturer's bad block markings; and
    for now, they refuse to use a flash-based BBT.

 2) Bugfix to NAND core code:  make raw read/write
    work correctly for NAND_ECC_HW_SYNDROME and large
    page flash, when more than one chunk is needed.
    (I suspect more such fixes will be needed... this
    "subpage"-ish chunking mechanism seems unused.)

 3) Update that 4-bit ECC to handle LARGE page chips.
    These overwrite manufacturer bad block markings,
    so a BBT must be written by someone before Linux
    uses the chip.  And the OOB layout is special.
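
A rough way to see why the large page case loses the factory markings
while the small page case need not: the sketch below assumes a
syndrome-style layout where every 512-byte data chunk is followed
directly by 16 OOB bytes (ECC plus padding).  The chunk sizes, the
conventional marker offsets (OOB byte 5 on small page, OOB byte 0 on
large page) and the helper name marker_region are illustrative
assumptions, not values taken from these patches.

```python
# Sketch only: assumes each ECC chunk is stored as DATA bytes followed
# directly by OOB_PER_CHUNK bytes, repeated across the physical page.
DATA = 512           # assumed data bytes per ECC chunk
OOB_PER_CHUNK = 16   # assumed OOB bytes stored after each chunk


def marker_region(page_size, marker_oob_offset):
    """Report what the interleaved layout stores at the physical offset
    where the manufacturer wrote its bad block marker."""
    # Factory view of the page: all data first, then all OOB.
    physical = page_size + marker_oob_offset
    chunk, within = divmod(physical, DATA + OOB_PER_CHUNK)
    kind = "data" if within < DATA else "oob"
    return chunk, within, kind


# Small page (512+16), marker conventionally at OOB byte 5.
print(marker_region(512, 5))     # (0, 517, 'oob')
# Large page (2048+64), marker conventionally at OOB byte 0.
print(marker_region(2048, 0))    # (3, 464, 'data')
```

Under those assumptions the small page marker stays in the OOB region,
while the large page marker lands inside the data of the fourth chunk,
so ordinary page writes would clobber it.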

Tests to do:

 (a) Ideally someone has a small page flash chip they
     were using with the original LSP, and they can
     try using it with just that first patch applied:
     take the large page chip out of the socket, swap
     in the small page chip, boot ... and tell us it
     works fine.  (You'd probably want to specify some
     different partitioning scheme on the kernel command
     line...)

One technical issue I wonder about relates to whether
the OOB data is protected by the ECC.  This code does
protect the OOB data.  The LSP code seemed to expect it
was not protected ... but that doesn't match with the
test results I show below.

 (b) Anyone applying just the first patch, or the first
     two patches, should get a clean boot failure on
     the EVM when it uses its normal large page chip.
     They'd detect a 1 GByte chip, two instances.

 (c) The potentially data-destroying step involves
     the third patch.  My own test results follow,
     but that's after the bad block table got trashed
     before I came up with the second patch ... so I
     can't exactly know what other folk will see.

Other technical questions relate to how combining large
pages with ECC_HW_SYNDROME implies subpage/chunked I/O.
That second patch hits such an obvious, and early, bug
that I suspect these code paths are bug-infested and
have barely been used.
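
To picture the chunked raw I/O involved, here is a minimal sketch; it
is not the NAND core code.  It assumes the chip stores each chunk as
data followed by that chunk's OOB bytes, with any leftover OOB at the
end of the page.  split_raw_page is a hypothetical helper name.

```python
def split_raw_page(raw, data_size, oob_per_chunk, chunks):
    """Separate an interleaved raw page into (data, oob) buffers.

    Assumed layout: [data0][oob0][data1][oob1]...[leftover oob]
    """
    data, oob = bytearray(), bytearray()
    pos = 0
    for _ in range(chunks):
        # Data bytes of this chunk go to the data buffer.
        data += raw[pos:pos + data_size]
        pos += data_size
        # The OOB bytes stored right after them go to the OOB buffer.
        oob += raw[pos:pos + oob_per_chunk]
        pos += oob_per_chunk
    # OOB bytes not assigned to any chunk come last.
    oob += raw[pos:]
    return bytes(data), bytes(oob)


# Tiny example: 2 chunks of 4 data + 2 OOB bytes, plus 1 leftover byte.
raw = b"AAAAaaBBBBbbz"
data, oob = split_raw_page(raw, 4, 2, 2)
print(data, oob)   # b'AAAABBBB' b'aabbz'
```

With chunks=1 this degenerates to "data, then OOB", which is the
simple case small page chips exercise.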

Which is part of why it'd be particularly useful to get
test results with small page chips, one ECC chunk per
page, which won't be using that troublesome mode.  If
they pass, the core ECC code will behave/interoperate
and be relatively safe to merge.  Then the issues with
large page chips can be chased down later.

- Dave


My basic testing results on DM355 EVM with its 2 GByte
NAND chip:

  - u-boot reading, kernel writing:
      pass:     bbt updated by kernel
      pass:     kernels written to /dev/mtd2

  - kernel reading, u-boot writing:
      pass:     bbt written by u-boot
      FAIL:     read kernel written to /dev/mtd2 by u-boot
                (ALSO FAIL: u-boot reading the same thing)

  - drivers/mtd/tests
      pass:     page
      pass:     read
      pass:     speed (1.9 MiB/sec)
      pass:     stress (short)
      skip:     torture (don't want to kill the chip)
      FAIL:     oob ... this may be the ECC issue
      FAIL:     subpagetest ... ECC error in verify
