[Milkymist-devel] NOR corruption analysis 3/4, a subtle gotcha

Werner Almesberger Wed, 12 Oct 2011 11:38:23 -0700

Whenever the test encountered a standby failure, the script also ran
a CRC check on the remaining partitions. It found two CRC mismatches,
both in the rescue copy of flickernoise.fbi:


8 (3703): flickernoise.fbi(rescue)(CRC)
12 (4356): flickernoise.fbi(rescue)(CRC)

(No attempt was made to auto-recover these partitions. The script
simply tracks the calculated CRC to detect additional corruption.)

This is similar to the first round of testing, where I also found
two corruptions outside of standby in ~10500 cycles.

However, this time there was an important difference: everything but
standby was supposed to be safely locked. So was the protection
afforded by locking too weak to thwart NOR corruption?

To make a long story short, I found nothing quite so grim. The reason
is much simpler: all the blocks had become unlocked again. This was a
side-effect of restoring the standby partition with "flashmem".

"flashmem" unlocks a block, erases it, writes it, and then proceeds
with the next block. At least that's what it thinks it does.

The logic in UrJTAG is that each block is locked or unlocked
individually. However, the NOR doesn't work like this. While the
unlocking command is innocently named "Unlock Block" and table 8 of
http://www.micron.com/get-document/?documentId=6062&file=319942_J3_65_256M_MLC_DS.pdf
suggests that it does indeed unlock just one block, it does in fact
unlock all the blocks in the entire NOR chip.

Table 7 has the correct wording: "[...], all of the Block lock bits
that are set are cleared in parallel." Interestingly, section 10.1
calls the command "Clear Block Lock Bits".
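
To illustrate the mismatch between what a per-block tool assumes and
what the chip does, here is a toy model in Python. It is only a sketch:
the class and function names are made up for this example, it does not
talk to any hardware, and it assumes that setting a lock bit is
per-block while clearing is chip-wide, as described above.

```python
class ToyNor:
    """Toy model of the lock behaviour described above (not a driver)."""

    def __init__(self, num_blocks):
        self.locked = [False] * num_blocks

    def lock_block(self, block):
        # Assumption: setting a lock bit affects only the addressed block.
        self.locked[block] = True

    def unlock_block(self, block):
        # The command takes a block address, but all lock bits that are
        # set are cleared in parallel, so the address makes no difference.
        self.locked = [False] * len(self.locked)


def naive_rewrite(nor, block):
    """What a per-block tool believes it does: unlock, erase, write."""
    nor.unlock_block(block)
    # (erase and program of `block` would happen here)


def rewrite_and_relock(nor, block):
    """Same rewrite, but restore the lock bits that were set before."""
    previously_locked = [i for i, is_locked in enumerate(nor.locked)
                         if is_locked]
    nor.unlock_block(block)
    # (erase and program of `block` would happen here)
    for i in previously_locked:
        nor.lock_block(i)


if __name__ == "__main__":
    nor = ToyNor(4)
    for b in (1, 2, 3):          # block 0 plays the role of "standby"
        nor.lock_block(b)
    naive_rewrite(nor, 0)
    print(nor.locked)            # every block is now unlocked
```

In this model, rewriting the one unlocked block silently drops the
protection of all the others unless the caller re-applies the locks.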

Practical implications of this:

- UrJTAG has a subtle failure mode in the sense that explicitly
  unlocking a block or writing a block (with hard-coded implicit
  unlock) will unlock the entire NOR.

- if Flickernoise unlocks any NOR blocks during an update and
  doesn't lock everything that's meant to be read-only afterwards,
  this would also remove the protection.

- the protection we get is a little weaker than expected, because
  any random occurrence of the two-byte unlock sequence could unlock
  the NOR, after which it would be vulnerable to regular NOR
  corruption. This may still be a very unlikely event, though.

- the process for setting up the M1rc3 being shipped should be
  checked to make sure the lock bits are really set at the end. At
  least my version of reflash_m1.sh issues the "lockflash" before the
  last "flashmem", which would leave the entire NOR unlocked.
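
The ordering problem in the last item can be checked mechanically. The
following Python sketch takes a list of command lines and reports
whether the NOR ends up locked. It assumes that "flashmem" always
unlocks the whole chip and that "lockflash" locks what is meant to be
locked; the function name and the command arguments are placeholders
for illustration, not real invocations.

```python
def ends_locked(commands):
    """Return True if no unlocking command follows the last "lockflash"."""
    locked = False
    for line in commands:
        words = line.split()
        if not words:
            continue
        if words[0] == "lockflash":
            locked = True
        elif words[0] == "flashmem":
            # Assumption: every flashmem implicitly unlocks the whole NOR.
            locked = False
    return locked


if __name__ == "__main__":
    # Placeholder arguments, for illustration only.
    bad = ["flashmem ADDR FILE_A", "lockflash ARGS", "flashmem ADDR FILE_B"]
    good = ["flashmem ADDR FILE_A", "flashmem ADDR FILE_B", "lockflash ARGS"]
    print(ends_locked(bad))
    print(ends_locked(good))
```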

I've written a crude script that attempts a NOR write through JTAG:
http://projects.qi-hardware.com/index.php/p/wernermisc/source/tree/master/m1rc3/norruption/2/upset

Usage: upset [address [value]]

To just probe whether a location can be written, write 0xffff. Even
if the write succeeds, the original value will be preserved.
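
The reason 0xffff works as a harmless probe is that programming NOR can
only change bits from 1 to 0. A minimal Python sketch of that rule,
assuming an idealized 16-bit word (the function name is made up for
this example):

```python
def nor_program(old, value):
    """Idealized NOR word program: bits can go 1 -> 0, never 0 -> 1."""
    return old & value & 0xFFFF


if __name__ == "__main__":
    print(hex(nor_program(0xFFFF, 0xFFF0)))  # 0xfff0
    print(hex(nor_program(0x1234, 0xFFFF)))  # 0x1234, probe leaves it intact
```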

Here is an attempt to change the (locked) rescue bitstream:

./upset 0xa0004 0xfff0
...
URJ_BUS_READ(0x000a0004) = 0xFFFF (65535)
URJ_BUS_READ(0x000a0004) = 0x0092 (146)
URJ_BUS_READ(0x000a0004) = 0xFFFF (65535)

The values shown are: original value, status (0x92 = failed because
locked), and the resulting value. Without locking, the result would
look like this:

./upset 0xa0004 0xfff0
...
URJ_BUS_READ(0x000a0004) = 0xFFFF (65535)
URJ_BUS_READ(0x000a0004) = 0x0080 (128)
URJ_BUS_READ(0x000a0004) = 0xFFF0 (65520)
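
The two status values can be taken apart bit by bit. The Python sketch
below assumes the usual Intel-style status register layout (bit 7 =
ready, bit 4 = program error, bit 1 = block locked, and so on). This is
my reading of the convention, so please check it against the data sheet
before relying on it. The table and function names are made up for
this example.

```python
# Assumed Intel-style status register bits (verify against the data sheet).
STATUS_BITS = {
    0x80: "ready",
    0x40: "erase suspended",
    0x20: "erase / clear-lock-bits error",
    0x10: "program / set-lock-bit error",
    0x08: "programming voltage error",
    0x04: "program suspended",
    0x02: "block locked",
}


def decode_status(status):
    """Return the names of all status bits set in `status`."""
    return [name for mask, name in STATUS_BITS.items() if status & mask]


if __name__ == "__main__":
    print(decode_status(0x92))
    # ['ready', 'program / set-lock-bit error', 'block locked']
    print(decode_status(0x80))
    # ['ready']
```

Under that assumption, 0x92 is "ready, program failed, block locked",
and 0x80 is a clean "ready" with no error bits.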

"upset" uses UrJTAG. fjmem.bit must be in the current directory.

Next: further experiments.

- Werner
_______________________________________________
http://lists.milkymist.org/listinfo.cgi/devel-milkymist.org
IRC: #milkymist@Freenode
