Re: [U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
On Mon, 2008-06-30 at 10:26 -0500, Scott Wood wrote:
>>> We should probably default to doing it the right way, not the broken-but-compatible way for large pages, though.
>>
>> It depends if you put backwards compatibility over reliability though.
>
> In the long term, I value the latter -- compatibility should be possible, but it shouldn't cause new users to continue to generate bad ECC indefinitely (both causing them reliability problems and expanding the number of people that would be affected if the default were to change down the road).

I communicated with Bernard and he told me he has moved on to other things and probably will not have time to update his patch against the newest git repository. So I did it, and I inverted the meaning of the #define so that new users will generate correct ECC without having to define anything. I also named the #define CFG_DAVINCI_BROKEN_ECC since it is DaVinci-specific.

The patch will follow after this email.

Hugo Villeneuve.

___
U-Boot-Users mailing list
U-Boot-Users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/u-boot-users
Re: [U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
On Sat, Jun 28, 2008 at 11:31:18AM +0800, Bernard Blackham wrote:
>> It seems odd that backwards compatibility requires turning *off* an option with "compatible" in the name... I'd invert the sense of the ifdef, and have it be something like CFG_BROKEN_ECC_COMPATIBILITY.
>
> The concern with this is people that use their own custom config files will need to add this #define when they upgrade. How about just changing the name to CFG_NEW_NAND_ECC_FORMAT then?

How about having both, and #erroring if one or the other isn't defined (similar to what you suggest below, but for both small and large page)?

Also, both should probably be CFG_DAVINCI_xxx rather than CFG_xxx, to make clear that it's not a general NAND issue.

>> If the old way of doing small page ECC was valid, should we preserve that (and change Linux back)?
>
> That's a little controversial. Basically, the old OOB layout didn't match any other layout used (except by the MV kernel), the actual ECC layout meant that the method for correction was overly complex (with 170 non-obvious lines of code), and allegedly broken:
>
> http://article.gmane.org/gmane.comp.boot-loaders.u-boot/32035
>
> The new code is about 30 lines, really simple, and I can even prove its correctness (which I couldn't even begin to with the old code).

OK -- was just making sure that it wasn't a gratuitous change.

>> We should probably default to doing it the right way, not the broken-but-compatible way for large pages, though.
>
> It depends if you put backwards compatibility over reliability though.

In the long term, I value the latter -- compatibility should be possible, but it shouldn't cause new users to continue to generate bad ECC indefinitely (both causing them reliability problems and expanding the number of people that would be affected if the default were to change down the road).

> This forces the user to make a choice - they'll probably curse while they're doing it, but they can't plead ignorance when they find their large page NAND isn't detecting ECC errors.

Or when they end up getting lots of ECC errors when using non-MV Linux.

> I really do believe it should be a clean switch from one format to the other, for both small and large page NAND, with no run-time backwards compatibility. But that's just my POV.

Agreed, that's the simplest route if it can be managed without surprise breakage.

-Scott
Re: [U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
>>> +#define CFG_LINUX_COMPATIBLE_ECC
>>> + */
>>
>> It seems odd that backwards compatibility requires turning *off* an option with "compatible" in the name... I'd invert the sense of the ifdef, and have it be something like CFG_BROKEN_ECC_COMPATIBILITY.
>
> The concern with this is people that use their own custom config files will need to add this #define when they upgrade. How about just changing the name to CFG_NEW_NAND_ECC_FORMAT then?

I like CFG_NEW_NAND_ECC_FORMAT better as well.

> #if defined(CFG_NAND_LARGEPAGE) && !defined(CFG_LINUX_COMPATIBLE_ECC)
> /* Comment this #error out only if you really really have to. */
> #error You are using old ECC code that is broken on large page devices. See doc/README.davinci
> #endif
>
> This forces the user to make a choice - they'll probably curse while they're doing it, but they can't plead ignorance when they find their large page NAND isn't detecting ECC errors.

I like this too. Maybe a #warning for small pages as well. Of course both would also depend on #ifdef CFG_NAND_HW_ECC.

>> Perhaps we could use some currently unused OOB byte as a marker for new/old ECC layout?
>
> Could do, but any filesystems which use the OOB bytes might step on these. It also complicates the code even more and creates a lot more scenarios to test that could go wrong. I really do believe it should be a clean switch from one format to the other, for both small and large page NAND, with no run-time backwards compatibility. But that's just my POV.

I hope that eventually we can remove the old format, but this patch has my ack.

Thanks Bernard

Troy
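To make the suggestion concrete, here is a minimal sketch of a build-time guard that requires an explicit choice of ECC format whenever hardware ECC is enabled. The option names CFG_DAVINCI_NEW_ECC_FORMAT and CFG_DAVINCI_OLD_ECC_FORMAT and the helper davinci_uses_new_ecc() are hypothetical and only illustrate the "define exactly one, otherwise #error" idea from this thread; they are not taken from any posted patch.

```c
/*
 * Sketch only: CFG_DAVINCI_NEW_ECC_FORMAT, CFG_DAVINCI_OLD_ECC_FORMAT and
 * davinci_uses_new_ecc() are hypothetical names used for illustration.
 */
#define CFG_NAND_HW_ECC			/* normally set in the board config */
#define CFG_NAND_LARGEPAGE		/* normally set in the board config */
#define CFG_DAVINCI_NEW_ECC_FORMAT	/* the explicit choice made by the board */

#ifdef CFG_NAND_HW_ECC
#if defined(CFG_DAVINCI_NEW_ECC_FORMAT) && defined(CFG_DAVINCI_OLD_ECC_FORMAT)
#error "Define only one of CFG_DAVINCI_NEW_ECC_FORMAT and CFG_DAVINCI_OLD_ECC_FORMAT"
#elif !defined(CFG_DAVINCI_NEW_ECC_FORMAT) && !defined(CFG_DAVINCI_OLD_ECC_FORMAT)
#error "Choose an ECC format, see doc/README.davinci"
#endif
#endif

/* Returns 1 when the new (Linux-compatible) format was selected, else 0. */
int davinci_uses_new_ecc(void)
{
#ifdef CFG_DAVINCI_NEW_ECC_FORMAT
	return 1;
#else
	return 0;
#endif
}
```

With neither format defined the build stops at the second #error, so a custom board config cannot silently keep the old large page behaviour.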
Re: [U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
>  #endif
> +#endif
>
>  static void nand_davinci_enable_hwecc(struct mtd_info *mtd, int mode)
>  {
> @@ -141,12 +146,29 @@ static u_int32_t nand_davinci_readecc(st
>  static int nand_davinci_calculate_ecc(struct mtd_info *mtd, const u_char *dat, u_char *ecc_code)
>  {
> +#ifdef CFG_LINUX_COMPATIBLE_ECC
> +	unsigned int ecc_val = nand_davinci_readecc(mtd, 1);
> +	/* squeeze 0 middle bits out so that it fits in 3 bytes */
> +	unsigned int tmp = (ecc_val&0x0fff)|((ecc_val&0x0fff0000)>>4);

	unsigned int tmp = (ecc_val & 0x0fff) | ((ecc_val & 0x0fff0000) >> 4);

please add spaces around operators
please use the same alignment
add an empty line

> +	/* invert so that erased block ecc is correct */
> +	tmp = ~tmp;
> +	ecc_code[0] = (u_char)(tmp);
> +	ecc_code[1] = (u_char)(tmp >> 8);
> +	ecc_code[2] = (u_char)(tmp >> 16);
> +#else
>  	u_int32_t tmp;
>  	int region, n;
>  	struct nand_chip *this = mtd->priv;
>
>  	n = (this->eccmode == NAND_ECC_HW12_2048) ? 4 : 1;

> +				u_char *read_ecc, u_char *calc_ecc)
> +{
> +	struct nand_chip *chip = mtd->priv;
> +	u_int32_t ecc_nand = read_ecc[0] | (read_ecc[1] << 8) |
> +					(read_ecc[2] << 16);
> +	u_int32_t ecc_calc = calc_ecc[0] | (calc_ecc[1] << 8) |
> +					(calc_ecc[2] << 16);
> +	u_int32_t diff = ecc_calc ^ ecc_nand;
> +
> +	if (diff) {
> +		if ((((diff>>12)^diff) & 0xfff) == 0xfff) {

please add spaces around operators

> +			/* Correctable error */

please add spaces around operators

> +			if ((diff>>(12+3)) < chip->eccsize) {
> +				uint8_t find_bit = 1 << ((diff>>12)&7);

please add spaces around operators

> +				uint32_t find_byte = diff>>(12+3);

	uint32_t find_byte = diff >> 15;

please add spaces around operators

> +				dat[find_byte] ^= find_bit;
> +				DEBUG (MTD_DEBUG_LEVEL0, "Correcting single bit ECC error at offset: %d, bit: %d\n", find_byte, find_bit);

too long, please split

> +				return 1;
> +			} else {
> +				return -1;
> +			}
> +		} else if (!(diff & (diff-1))) {

please add spaces around operators

> +			/* Single bit ECC error in the ECC itself,
> +			nothing to fix */

please use this style of comment

	/*
	 *
	 */

> +			DEBUG (MTD_DEBUG_LEVEL0, "Single bit ECC error in ECC.\n");
> +			return 1;
> +		} else {
> +			/* Uncorrectable error */
> +			DEBUG (MTD_DEBUG_LEVEL0, "ECC UNCORRECTED_ERROR 1\n");
> +			return -1;

>  	/* Set address of hardware control function */
>  	nand->hwcontrol = nand_davinci_hwcontrol;
> Index: u-boot-1.3.3/include/configs/davinci_dvevm.h
> ===
> --- u-boot-1.3.3.orig/include/configs/davinci_dvevm.h 2008-05-19 18:47:11.0 +0800
> +++ u-boot-1.3.3/include/configs/davinci_dvevm.h 2008-06-27 13:04:07.0 +0800
> @@ -46,6 +46,18 @@
>  #define CONFIG_NOR_UART_BOOT
>  */
>
> +/*
> + * Previous versions of u-boot (1.3.3 and prior) and Montavista Linux kernels
> + * generated bogus ECCs on large-page NAND. Both large and small page NAND ECCs
> + * were incompatible with the Linux davinci git tree (since NAND was integrated
> + * in 2.6.24).
> + * Don't turn this on if you want backwards compatibility.
> + * Do turn this on if you want u-boot to be able to read and write NAND
> + * that can be written or read by the Linux davinci git kernel.
> + *
> +#define CFG_LINUX_COMPATIBLE_ECC
> + */

please move this to README.davinci

> +

Best Regards,
J.
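As an illustration of the calculate step under review, the following sketch packs a raw 32-bit ECC value into the three OOB bytes with the operator spacing requested above. It assumes the raw value carries two 12-bit fields at bits 0..11 and 16..27, so the second mask is taken to be 0x0fff0000; that mask is inferred from the "squeeze 0 middle bits" comment and the shift by 4, and should be checked against the real source. The function name davinci_pack_ecc() is hypothetical.

```c
#include <stdint.h>

/*
 * Pack a raw ECC register value into 3 bytes (illustrative sketch).
 * Assumption: two 12-bit parity fields sit at bits 0..11 and 16..27.
 */
void davinci_pack_ecc(uint32_t ecc_val, uint8_t ecc_code[3])
{
	/* Squeeze out the unused middle bits so that 24 bits remain. */
	uint32_t tmp = (ecc_val & 0x0fff) | ((ecc_val & 0x0fff0000) >> 4);

	/* Invert so that a raw value of 0 is stored as ff ff ff. */
	tmp = ~tmp;

	ecc_code[0] = (uint8_t)tmp;
	ecc_code[1] = (uint8_t)(tmp >> 8);
	ecc_code[2] = (uint8_t)(tmp >> 16);
}
```

For example, a raw value of 0x0ABC0123 squeezes to 0xABC123 and is stored as the bytes dc 3e 54, while a raw value of 0 is stored as ff ff ff, which is what an erased OOB area contains.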
Re: [U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
[offlist thread cc'd back to the list - hope you don't mind Scott!]

Hi Scott,

Thanks for your feedback.

On Fri, Jun 27, 2008 at 12:04:00PM -0500, Scott Wood wrote:
> Bernard Blackham wrote:
>> What I would like to know though, is are you the right person to push this through, and can it make it into U-boot 1.3.4? I'm asking because it contains potential compatibility breaks, and I'd like to document the specific U-boot versions that will be affected.
>
> There is a general ARM tree, but I can take it through mine if that's what is preferred.

That was the suggestion on #uboot. I'd say it's more NAND related than ARM related too.

>> +/*
>> + * Previous versions of u-boot (1.3.3 and prior) and Montavista Linux kernels
>> + * generated bogus ECCs on large-page NAND. Both large and small page NAND ECCs
>> + * were incompatible with the Linux davinci git tree (since NAND was integrated
>> + * in 2.6.24).
>> + * Don't turn this on if you want backwards compatibility.
>> + * Do turn this on if you want u-boot to be able to read and write NAND
>> + * that can be written or read by the Linux davinci git kernel.
>> + *
>> +#define CFG_LINUX_COMPATIBLE_ECC
>> + */
>
> It seems odd that backwards compatibility requires turning *off* an option with "compatible" in the name... I'd invert the sense of the ifdef, and have it be something like CFG_BROKEN_ECC_COMPATIBILITY.

The concern with this is people that use their own custom config files will need to add this #define when they upgrade. How about just changing the name to CFG_NEW_NAND_ECC_FORMAT then?

> If the old way of doing small page ECC was valid, should we preserve that (and change Linux back)?

That's a little controversial. Basically, the old OOB layout didn't match any other layout used (except by the MV kernel), the actual ECC layout meant that the method for correction was overly complex (with 170 non-obvious lines of code), and allegedly broken:

http://article.gmane.org/gmane.comp.boot-loaders.u-boot/32035

The new code is about 30 lines, really simple, and I can even prove its correctness (which I couldn't even begin to with the old code). Troy (cc'd) I believe was the original author. It could probably do with some comments though to make it dead obvious to the casual observer what's going on. I'll add them in.

> We should probably default to doing it the right way, not the broken-but-compatible way for large pages, though.

It depends if you put backwards compatibility over reliability though. Many davinci users are still running the MontaVista-supplied 2.6.10 kernel which has the same broken ECC code and I've heard no word from MV on fixing it yet (and they're probably struggling to deal with the same backwards compatibility issue).

How about this solution: in davinci/nand.c, we add something like this:

#if defined(CFG_NAND_LARGEPAGE) && !defined(CFG_LINUX_COMPATIBLE_ECC)
/* Comment this #error out only if you really really have to. */
#error You are using old ECC code that is broken on large page devices. See doc/README.davinci
#endif

This forces the user to make a choice - they'll probably curse while they're doing it, but they can't plead ignorance when they find their large page NAND isn't detecting ECC errors.

> Perhaps we could use some currently unused OOB byte as a marker for new/old ECC layout?

Could do, but any filesystems which use the OOB bytes might step on these. It also complicates the code even more and creates a lot more scenarios to test that could go wrong. I really do believe it should be a clean switch from one format to the other, for both small and large page NAND, with no run-time backwards compatibility. But that's just my POV.

Cheers,

Bernard.
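The claim that the new correction logic is simple enough to reason about can be illustrated with a small stand-alone sketch. It assumes a 24-bit code whose upper 12 bits accumulate the positions of the set data bits and whose lower 12 bits accumulate their complements; toy_ecc24() is a hypothetical software model written for this example and is not the DaVinci hardware algorithm. Under that assumption a single flipped data bit at position p gives a difference of (p << 12) | (~p & 0xfff), which is exactly what the patch tests for. toy_correct() mirrors the decision structure of the patch's correction function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a 24-bit ECC over up to 512 bytes. */
uint32_t toy_ecc24(const uint8_t *dat, size_t len)
{
	uint32_t ecc = 0;
	size_t i;

	for (i = 0; i < len * 8; i++) {
		/* Each set bit XORs in its position and its complement. */
		if (dat[i / 8] & (1u << (i % 8)))
			ecc ^= ((uint32_t)i << 12) | (~(uint32_t)i & 0xfff);
	}
	return ecc & 0xffffff;
}

/* Returns 0 if clean, 1 if a single-bit error was handled, -1 otherwise. */
int toy_correct(uint8_t *dat, size_t len, uint32_t ecc_read, uint32_t ecc_calc)
{
	uint32_t diff = (ecc_read ^ ecc_calc) & 0xffffff;

	if (!diff)
		return 0;

	if ((((diff >> 12) ^ diff) & 0xfff) == 0xfff) {
		/* Halves are complementary: one data bit flipped. */
		uint32_t find_byte = diff >> (12 + 3);
		uint8_t find_bit = (uint8_t)(1u << ((diff >> 12) & 7));

		if (find_byte >= len)
			return -1;
		dat[find_byte] ^= find_bit;
		return 1;
	}

	if (!(diff & (diff - 1)))
		return 1;	/* one bit flipped in the ECC itself */

	return -1;		/* uncorrectable */
}
```

Two flipped data bits at positions p and q give equal upper and lower halves (both p ^ q), so the first test fails and the difference has more than one bit set, which lands in the uncorrectable branch.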
[U-Boot-Users] [PATCH RFC] ARM: Davinci: NAND fix for large page ECC and linux compatibility
U-boot's HW ECC support for large page NAND on Davinci is completely broken. Some kernels, such as the 2.6.10 one supported by MontaVista for Davinci, rely upon this broken behaviour as they share the same code for ECCs. In the existing scheme, error detection *might* work on large page, but error correction definitely does not. Small page ECC correction works, but the format is not compatible with the mainline git kernel.

This patch adds ECC code that matches what is currently in the Davinci git repository (since NAND support was added in 2.6.24). This makes the ECC and OOB layout written by u-boot compatible with Linux for both small page and large page devices and fixes ECC correction for large page devices.

The code depends on a #define CFG_LINUX_COMPATIBLE_ECC, which is undefined by default, making the default state backwards compatible. I have verified this by compiling without the #define and producing a binary byte-for-byte identical to one without this patch.

[NOTE: I have not yet been able to get my hands on a board with small-page NAND to test, but large page does work. If anybody is interested in testing it, please do and let me know if it works for you (i.e. uboot with this patch and davinci git kernel can read/write the same NAND).]

Signed-off-by: Bernard Blackham [EMAIL PROTECTED]
---
 cpu/arm926ejs/davinci/nand.c    | 79 ++--
 include/configs/davinci_dvevm.h | 12 ++
 2 files changed, 89 insertions(+), 2 deletions(-)

Index: u-boot-1.3.3/cpu/arm926ejs/davinci/nand.c
===
--- u-boot-1.3.3.orig/cpu/arm926ejs/davinci/nand.c 2008-05-19 18:47:11.0 +0800
+++ u-boot-1.3.3/cpu/arm926ejs/davinci/nand.c 2008-06-27 13:04:03.0 +0800
@@ -87,6 +87,10 @@ static void nand_davinci_select_chip(str
 }
 
 #ifdef CFG_NAND_HW_ECC
+
+#ifndef CFG_LINUX_COMPATIBLE_ECC
+/* Linux-compatible ECC uses MTD defaults. */
+/* These layouts are not compatible with Linux or RBL/UBL. */
 #ifdef CFG_NAND_LARGEPAGE
 static struct nand_oobinfo davinci_nand_oobinfo = {
 	.useecc = MTD_NANDECC_AUTOPLACE,
@@ -104,6 +108,7 @@ static struct nand_oobinfo davinci_nand_
 #else
 #error "Either CFG_NAND_LARGEPAGE or CFG_NAND_SMALLPAGE must be defined!"
 #endif
+#endif
 
 static void nand_davinci_enable_hwecc(struct mtd_info *mtd, int mode)
 {
@@ -141,12 +146,29 @@ static u_int32_t nand_davinci_readecc(st
 static int nand_davinci_calculate_ecc(struct mtd_info *mtd, const u_char *dat, u_char *ecc_code)
 {
+#ifdef CFG_LINUX_COMPATIBLE_ECC
+	unsigned int ecc_val = nand_davinci_readecc(mtd, 1);
+	/* squeeze 0 middle bits out so that it fits in 3 bytes */
+	unsigned int tmp = (ecc_val&0x0fff)|((ecc_val&0x0fff0000)>>4);
+	/* invert so that erased block ecc is correct */
+	tmp = ~tmp;
+	ecc_code[0] = (u_char)(tmp);
+	ecc_code[1] = (u_char)(tmp >> 8);
+	ecc_code[2] = (u_char)(tmp >> 16);
+#else
 	u_int32_t tmp;
 	int region, n;
 	struct nand_chip *this = mtd->priv;
 
 	n = (this->eccmode == NAND_ECC_HW12_2048) ? 4 : 1;
 
+	/*
+	 * This is not how you should read ECCs on large page Davinci devices.
+	 * The region parameter gets you ECCs for flash chips on different chip
+	 * selects, not the 4x512 byte pages in a 2048 byte page.
+	 *
+	 * Preserved for backwards compatibility though.
+	 */
 	region = 1;
 	while (n--) {
 		tmp = nand_davinci_readecc(mtd, region);
@@ -155,9 +177,51 @@ static int nand_davinci_calculate_ecc(st
 		*ecc_code++ = ((tmp >> 8) & 0x0f) | ((tmp >> 20) & 0xf0);
 		region++;
 	}
+#endif
+
 	return(0);
 }
 
+#ifdef CFG_LINUX_COMPATIBLE_ECC
+static int nand_davinci_correct_data(struct mtd_info *mtd, u_char *dat,
+				u_char *read_ecc, u_char *calc_ecc)
+{
+	struct nand_chip *chip = mtd->priv;
+	u_int32_t ecc_nand = read_ecc[0] | (read_ecc[1] << 8) |
+					(read_ecc[2] << 16);
+	u_int32_t ecc_calc = calc_ecc[0] | (calc_ecc[1] << 8) |
+					(calc_ecc[2] << 16);
+	u_int32_t diff = ecc_calc ^ ecc_nand;
+
+	if (diff) {
+		if ((((diff>>12)^diff) & 0xfff) == 0xfff) {
+			/* Correctable error */
+			if ((diff>>(12+3)) < chip->eccsize) {
+				uint8_t find_bit = 1 << ((diff>>12)&7);
+				uint32_t find_byte = diff>>(12+3);
+				dat[find_byte] ^= find_bit;
+				DEBUG (MTD_DEBUG_LEVEL0, "Correcting single bit ECC error at offset: %d, bit: %d\n", find_byte, find_bit);
+				return 1;
+			} else {
+				return -1;
+			}
+		} else if