Jeff Cooper wrote:
> Stephen,
>
> This has got to have something to do with the ECC correction being
> done by the hardware. The 'nand dump' command just reads the raw data
> from the chip and doesn't do any ECC on it where the 'nand read' is
> doing ECC.
>
It very well could be ECC. But I'm not sure how. I would think every block would fail ECC checking/correction with the OOB set to all zeros.
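When comparing raw 'nand dump' output against a memory dump, as in the dumps quoted further down this thread, keep in mind that the memory dump prints 32-bit words while 'nand dump' prints individual bytes. Here is a minimal Python sketch of that comparison, assuming the memory dump shows little-endian 32-bit words (the helper name `md_words_to_bytes` is made up for illustration):

```python
import struct

def md_words_to_bytes(words):
    """Return the bytes as they sit in memory for a list of 32-bit words,
    assuming the dump printed them as little-endian words."""
    return b"".join(struct.pack("<I", w) for w in words)

# First row of the DDR dump quoted in this thread (address 80765400).
ddr_row = [0x9e8767df, 0x55dde0ca, 0x8af2ffa0, 0x7ba362f3]

# First row of the 'nand dump' output quoted in this thread.
nand_row = bytes.fromhex("df 67 87 9e ca e0 dd 55 a0 ff f2 8a f3 62 a3 7b")

# True when the raw NAND bytes match what was in DDR.
print(md_words_to_bytes(ddr_row) == nand_row)
```

Running the same check over a whole page would show where the raw NAND contents differ from the DDR source, independent of whatever 'nand read' does with ECC.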
> I wonder if some pattern in the data is fooling the DM355 into thinking
> there is an error when there isn't one and the 'correction' is
> actually corrupting the data.
>
Possible, I suppose. One would hope that an ECC correction algorithm would not care about the data.

I think fixing this means someone has to figure out the data-ECC pattern being used by U-Boot and the kernel, and what method each is using. The last time I read about this on the list, ECC was being done in software somewhere (U-Boot or the kernel, I'm not sure which). I'm betting that all the pieces don't match up...

Unfortunately, I don't have time to do this now. At some point in the next couple of months I will have to solve this, or at least get a clearer picture in my mind of what is going on wrt ECC.

> Jeff
>
> Stephen Berry wrote:
>> Jeff,
>>
>> We have seen similar, but different issues with boot corruption on our
>> custom 355 board.
>>
>> In general booting from the kernel image is never a problem - but one
>> day we loaded a kernel into flash for a customer, and suddenly we have
>> CRC failures on 1 out of every 3 boots. Comparing the data, there was a
>> single byte of corrupted data, and always in the same spot.
>>
>> To make things even stranger - if we program the kernel image with
>> U-Boot, the problem goes away (JTAG image into memory then nandwrite).
>>
>> One thing we did note - if we program the NAND partition via Linux
>> nandwrite and compare it to "nand write" in U-Boot - it seems the OOB is
>> different. It looks like there is no ECC (all zeros) when Linux writes
>> the /dev/mtd2 block... not good.
>>
>> And the last strangeness - I recompiled the kernel (I think I took out a
>> debug statement) and burned that into NAND. Poof. Problem solved :(
>>
>> Steve
>>
>> Jeff Cooper wrote:
>>
>>> I'm having an odd issue with U-Boot reading a page of NAND that I hope
>>> someone can shed some light on.
>>>
>>> The issue came to light when I was creating a bootable SD card to load
>>> the NAND on my system. The SD card would boot and load the UBL,
>>> U-Boot, kernel and rootfs into DDR and then jump into U-Boot. Then
>>> U-Boot would handle writing from DDR to NAND.
>>> When I wrote the kernel image into NAND from U-Boot, U-Boot reported a
>>> CRC error from when it tried to boot that image. Investigating, I
>>> found that a couple of bytes of the image loaded from NAND didn't
>>> match the image in DDR.
>>>
>>> The odd thing is that the error seems to be dependent on where I start
>>> writing the data from DDR. If I write a page (512 bytes) of NAND
>>> starting at DDR address 0x80765400 and then read that page back from
>>> NAND into DDR, I'll get corrupted data read back. A 'nand dump'
>>> command shows that the data was correctly written to NAND.
>>>
>>> If I write the data starting from DDR address 0x80765500, then the
>>> error doesn't occur when I read the data back.
>>>
>>> It also doesn't matter where in NAND I try to write the data, any
>>> block written to starting at 0x80765400 will produce corrupted data
>>> when read back.
>>>
>>> Another odd thing is that the data will change depending on the number
>>> of times the nand is read. The first read after a erase/write cycle
>>> has 3 bytes that are in error. The second and all subsequent reads
>>> will have only 1 byte wrong.
>>>
>>> A memory dump of the DDR data looks like:
>>>
>>> 80765400: 9e8767df 55dde0ca 8af2ffa0 7ba362f3 .g.....U.....b.{
>>> 80765410: 27557dfc 12b9ea3d 519cc35e d7cda66e .}U'=...^..Qn...
>>> 80765420: bc2a2563 b9a953e6 0b7f7531 af38adee c%*..S..1u....8.
>>> 80765430: 8f31f473 da6acf9a 8de7a13d a9eb13be s.1...j.=.......
>>> 80765440: afb4dae6 f406566c d57414b3 aea1f15a ....lV....t.Z...
>>> 80765450: befee9b6 95f098a5 ce071973 b768f79a ........s.....h.
>>> 80765460: 5bc9f751 df776aaf be995e6f 70ab5acf Q..[.jw.o^...Z.p
>>> 80765470: 9c0ef02d 38d5cdf7 9aebc40e 1370732c -......8....,sp.
>>> 80765480: 13def6e9 3f5fbebe b87631e1 bd1ea039 ......_?.1v.9...
>>> 80765490: 632f82d6 cdb4d2df 57e27f67 d7486b75 ../c....g..WukH.
>>> 807654a0: 50e3ff5f df835f3a bad25b1c a035f3a2 _..P:_...[....5.
>>> 807654b0: ddce35f3 7ce9e76f e8afaf9d d14730ff .5..o..|.....0G.
>>> 807654c0: e4d48d58 d3cb9ae0 39870afa 3f5a79d6 X..........9.yZ?
>>> 807654d0: df3eb424 13b557d7 ad6fc2e3 a8e0bc8c $.>..W....o.....
>>> 807654e0: 44f02d78 d4f4eaf3 6976463d 2be5add8 x-.D....=Fvi...+
>>> 807654f0: 4133cfb5 58e282f0 a737ed8f 4b7bdda7 ..3A...X..7...{K
>>> 80765500: 3fb73fe2 c28b7b8c 2f7bb61d 761e29ef .?.?.{....{/.).v
>>> 80765510: c7b805cf fe0f79c0 ebf1a11e e26eaf18 .....y........n.
>>> 80765520: c4e4733c cbfb8e57 9fa912ab 82a7f87f <s..W...........
>>> 80765530: 6731e3df b15e0fdb 9f19f6cb c7bb4ca1 ..1g..^......L..
>>> 80765540: e783f527 f26ea792 8e39dc69 64d689dd '.....n.i.9....d
>>> 80765550: b2996b8a 0964e737 dff989be 8f20f1d0 .k..7.d....... .
>>> 80765560: 742e49b7 045b7819 50e39aeb a3ea1d4b .I.t.x[....PK...
>>> 80765570: e8c54c89 02d64a44 dd08cae6 cf817cea .L..DJ.......|..
>>> 80765580: 0f612f91 60fe6bf0 a13b5746 8ba9cc68 ./a..k.`FW;.h...
>>> 80765590: b18276ad fb093398 a6386562 671adcf9 .v...3..be8....g
>>> 807655a0: f565aea6 f58f9d61 e095cc69 d3ec8b7b ..e.a...i...{...
>>> 807655b0: f3885cee 7bf9c686 0bc571fb df18beef .\.....{.q......
>>> 807655c0: 5da1398d e158dab8 527e4f30 be847312 .9.]..X.0O~R.s..
>>> 807655d0: 90bcc8ec 31f17fea e420aef9 31add60e .......1.. ....1
>>> 807655e0: bda7cfac b8aecf96 71a9ebe2 e9e92dd6 ...........q.-..
>>> 807655f0: 8b729678 ea9e6219 c14f69cd 75ce66b5 x.r..b...iO..f.u
>>>
>>> A 'nand dump' of a block written from the DDR block above looks like:
>>>
>>> df 67 87 9e ca e0 dd 55 a0 ff f2 8a f3 62 a3 7b
>>> fc 7d 55 27 3d ea b9 12 5e c3 9c 51 6e a6 cd d7
>>> 63 25 2a bc e6 53 a9 b9 31 75 7f 0b ee ad 38 af
>>> 73 f4 31 8f 9a cf 6a da 3d a1 e7 8d be 13 eb a9
>>> e6 da b4 af 6c 56 06 f4 b3 14 74 d5 5a f1 a1 ae
>>> b6 e9 fe be a5 98 f0 95 73 19 07 ce 9a f7 68 b7
>>> 51 f7 c9 5b af 6a 77 df 6f 5e 99 be cf 5a ab 70
>>> 2d f0 0e 9c f7 cd d5 38 0e c4 eb 9a 2c 73 70 13
>>> e9 f6 de 13 be be 5f 3f e1 31 76 b8 39 a0 1e bd
>>> d6 82 2f 63 df d2 b4 cd 67 7f e2 57 75 6b 48 d7
>>> 5f ff e3 50 3a 5f 83 df 1c 5b d2 ba a2 f3 35 a0
>>> f3 35 ce dd 6f e7 e9 7c 9d af af e8 ff 30 47 d1
>>> 58 8d d4 e4 e0 9a cb d3 fa 0a 87 39 d6 79 5a 3f
>>> 24 b4 3e df d7 57 b5 13 e3 c2 6f ad 8c bc e0 a8
>>> 78 2d f0 44 f3 ea f4 d4 3d 46 76 69 d8 ad e5 2b
>>> b5 cf 33 41 f0 82 e2 58 8f ed 37 a7 a7 dd 7b 4b
>>> e2 3f b7 3f 8c 7b 8b c2 1d b6 7b 2f ef 29 1e 76
>>> cf 05 b8 c7 c0 79 0f fe 1e a1 f1 eb 18 af 6e e2
>>> 3c 73 e4 c4 57 8e fb cb ab 12 a9 9f 7f f8 a7 82
>>> df e3 31 67 db 0f 5e b1 cb f6 19 9f a1 4c bb c7
>>> 27 f5 83 e7 92 a7 6e f2 69 dc 39 8e dd 89 d6 64
>>> 8a 6b 99 b2 37 e7 64 09 be 89 f9 df d0 f1 20 8f
>>> b7 49 2e 74 19 78 5b 04 eb 9a e3 50 4b 1d ea a3
>>> 89 4c c5 e8 44 4a d6 02 e6 ca 08 dd ea 7c 81 cf
>>> 91 2f 61 0f f0 6b fe 60 46 57 3b a1 68 cc a9 8b
>>> ad 76 82 b1 98 33 09 fb 62 65 38 a6 f9 dc 1a 67
>>> a6 ae 65 f5 61 9d 8f f5 69 cc 95 e0 7b 8b ec d3
>>> ee 5c 88 f3 86 c6 f9 7b fb 71 c5 0b ef be 18 df
>>> 8d 39 a1 5d b8 da 58 e1 30 4f 7e 52 12 73 84 be
>>> ec c8 bc 90 ea 7f f1 31 f9 ae 20 e4 0e d6 ad 31
>>> ac cf a7 bd 96 cf ae b8 e2 eb a9 71 d6 2d e9 e9
>>> 78 96 72 8b 19 62 9e ea cd 69 4f c1 b5 66 ce 75
>>> OOB:
>>> ff ff ff ff ff ff 91 33
>>> 47 64 67 73 69 56 ef 20
>>>
>>> The first read has errors at:
>>>
>>> 0x190: 158276ad, it should be b18276ad
>>> 0x198: a6386515, it should be a6386562
>>> 0x1e4: 17aecf96, it should be b8aecf96
>>>
>>> The second read has an error at:
>>>
>>> 0x190: 158276ad, it should be b18276ad
>>>
>>> The NAND I'm currently working with doesn't have any factory bad
>>> blocks on it. Since my board has an NAND socket I've tried two other
>>> NAND devices and they both act the same. I've also tried a second
>>> board with a soldered NAND and it also does the same thing. I've
>>> tried writing to the first page in a block and pages with-in the block
>>> without any effect on the problem.
>>>
>>> The NAND circuit is laid out as on the EVM except we're using a ST
>>> NAND512W3A2CN6. It has 512 byte pages in 16k blocks.
>>>
>>> All of the above debugging was done on a custom U-Boot binary based on
>>> the EVM U-Boot.
>>>
>>> I've also tried the original U-Boot binary that came with the EVM. It
>>> exhibits the same issue, although slightly differently. The first
>>> read has 3 bytes wrong, the second has two (same bytes wrong as shown
>>> at 0x190 and 0x198 above), the third read has one (0x190), the fourth
>>> is the same as the second, the fifth and subsequent reads all have one
>>> (0x190).
>>>
>>> I'd appreciate any suggestions that anyone can offer.
>>>
>>> thanks,
>>> Jeff
>>>
>>>
>>
>>
>
>
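One quick way to look at the three bad words Jeff lists for the first read is to XOR each one against its expected value and count the flipped bits. Below is a rough Python sketch (the helper name `bit_diff` is made up; the values are copied from the error list above):

```python
def bit_diff(expected, actual):
    """Return (xor_mask, number_of_flipped_bits) for two 32-bit words."""
    mask = expected ^ actual
    return mask, bin(mask).count("1")

# (offset, word read back, expected word) from the first read after erase/write.
errors = [
    (0x190, 0x158276ad, 0xb18276ad),
    (0x198, 0xa6386515, 0xa6386562),
    (0x1e4, 0x17aecf96, 0xb8aecf96),
]

for offset, got, want in errors:
    mask, nbits = bit_diff(want, got)
    print(f"0x{offset:03x}: xor=0x{mask:08x} bits_flipped={nbits}")
```

The XOR masks come out as 0xa4000000, 0x00000077 and 0xaf000000, so each error is confined to a single byte but flips several bits in it (3, 6 and 6). That may look more like a bad 'correction' than a single flipped bit in the NAND, but that is only a guess on my part.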
_______________________________________________
Davinci-linux-open-source mailing list
Davinci-linux-open-source@linux.davincidsp.com
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source