Re: [PATCH v2 11/53] mtd: nand: denali: fix bitflips calculation in handle_ecc()

2017-03-23 Thread Boris Brezillon
On Thu, 23 Mar 2017 16:02:02 +0900
Masahiro Yamada  wrote:

> Hi Boris,
> 
> 2017-03-23 5:57 GMT+09:00 Boris Brezillon:
> > On Wed, 22 Mar 2017 23:07:18 +0900
> > Masahiro Yamada  wrote:
> >  
> >> +	do {
> >> +		err_addr = ioread32(denali->flash_reg + ECC_ERROR_ADDRESS);
> >> +		err_sector = ECC_SECTOR(err_addr);
> >> +		err_byte = ECC_BYTE(err_addr);
> >> +
> >> +		err_cor_info = ioread32(denali->flash_reg + ERR_CORRECTION_INFO);
> >> +		err_cor_value = ECC_CORRECTION_VALUE(err_cor_info);
> >> +		err_device = ECC_ERR_DEVICE(err_cor_info);
> >> +
> >> +		/* reset the bitflip counter when crossing ECC sector */
> >> +		if (err_sector != prev_sector)
> >> +			bitflips = 0;
> >> +
> >> +		if (ECC_ERROR_UNCORRECTABLE(err_cor_info)) {
> >> +			/*
> >> +			 * if the error is not correctable, need to look at the
> >> +			 * page to see if it is an erased page. if so, then
> >> +			 * it's not a real ECC error
> >> +			 */
> >> +			ret = -EBADMSG;
> >
> > You should never return -EBADMSG directly. Just increment
> > ecc_stats.failed and let the core return -EBADMSG to the upper layer.
> >  
> 
> Here, -EBADMSG is used in the same way as the value returned from
> ->ecc.correct().
> 
> Please note that denali_read_page() never returns -EBADMSG.
> 
> -EBADMSG is used as a marker meaning "we need an erased page check".
> 
> I think nand_read_page_syndrome() does something similar;
> -EBADMSG is used internally.

That's not exactly what happens. nand_read_page_syndrome() calls
ecc->correct() for each chunk, and if this method returns -EBADMSG (and
nand_check_erased_ecc_chunk() returns -EBADMSG too) it increments the
ecc_stats.failed counter.

Here you check all chunks in the same function and increment
ecc_stats.failed only once in denali_read_page(), even if several chunks
are uncorrectable.
Your handle_ecc() should act like nand_read_page_syndrome() with respect
to ECC checking: check each ECC block one by one, call
nand_check_erased_ecc_chunk() if needed, increment ecc_stats.failed
each time an uncorrectable error is detected, and return max_bitflips at
the end.
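
The per-chunk flow described above can be sketched in plain user-space C.
This is only an illustrative model, not the denali driver or the NAND
core code: struct ecc_stats_model, struct chunk_result,
check_erased_chunk_model() and check_page_model() are hypothetical names,
and the erased-chunk check is reduced to comparing a precomputed count of
zero bits against a threshold (the real nand_check_erased_ecc_chunk()
inspects the data and ECC buffers itself).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters kept in mtd->ecc_stats. */
struct ecc_stats_model {
	unsigned int corrected;
	unsigned int failed;
};

/* Hypothetical per-chunk result, as a driver might decode it. */
struct chunk_result {
	int stat;	/* >= 0: bitflips corrected, -EBADMSG: uncorrectable */
	int zero_bits;	/* zero bits found in the raw chunk (data + ECC) */
};

/*
 * Simplified model of an erased-chunk check: an erased chunk reads as
 * all 0xff, so up to 'threshold' zero bits are treated as bitflips in
 * an erased chunk rather than as a real ECC failure.
 */
static int check_erased_chunk_model(const struct chunk_result *c,
				    int threshold)
{
	if (c->zero_bits > threshold)
		return -EBADMSG;
	return c->zero_bits;
}

/* Walk the chunks one by one; return the worst bitflip count seen. */
static int check_page_model(const struct chunk_result *chunks,
			    size_t nchunks, int threshold,
			    struct ecc_stats_model *stats)
{
	int max_bitflips = 0;
	size_t i;

	for (i = 0; i < nchunks; i++) {
		int stat = chunks[i].stat;

		/* -EBADMSG is only an internal marker at this point. */
		if (stat == -EBADMSG)
			stat = check_erased_chunk_model(&chunks[i], threshold);

		if (stat < 0) {
			stats->failed++;	/* once per uncorrectable chunk */
			continue;
		}

		stats->corrected += stat;
		if (stat > max_bitflips)
			max_bitflips = stat;
	}

	/* Never -EBADMSG: failures are reported through stats->failed. */
	return max_bitflips;
}
```

With this shape, a page containing two uncorrectable chunks bumps the
failed counter twice instead of once, and the function still returns the
maximum number of bitflips found in any single chunk.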


Re: [PATCH v2 11/53] mtd: nand: denali: fix bitflips calculation in handle_ecc()

2017-03-23 Thread Masahiro Yamada
Hi Boris,

2017-03-23 5:57 GMT+09:00 Boris Brezillon:
> On Wed, 22 Mar 2017 23:07:18 +0900
> Masahiro Yamada  wrote:
>
>> +	do {
>> +		err_addr = ioread32(denali->flash_reg + ECC_ERROR_ADDRESS);
>> +		err_sector = ECC_SECTOR(err_addr);
>> +		err_byte = ECC_BYTE(err_addr);
>> +
>> +		err_cor_info = ioread32(denali->flash_reg + ERR_CORRECTION_INFO);
>> +		err_cor_value = ECC_CORRECTION_VALUE(err_cor_info);
>> +		err_device = ECC_ERR_DEVICE(err_cor_info);
>> +
>> +		/* reset the bitflip counter when crossing ECC sector */
>> +		if (err_sector != prev_sector)
>> +			bitflips = 0;
>> +
>> +		if (ECC_ERROR_UNCORRECTABLE(err_cor_info)) {
>> +			/*
>> +			 * if the error is not correctable, need to look at the
>> +			 * page to see if it is an erased page. if so, then
>> +			 * it's not a real ECC error
>> +			 */
>> +			ret = -EBADMSG;
>
> You should never return -EBADMSG directly. Just increment
> ecc_stats.failed and let the core return -EBADMSG to the upper layer.
>

Here, -EBADMSG is used in the same way as the value returned from
->ecc.correct().

Please note that denali_read_page() never returns -EBADMSG.

-EBADMSG is used as a marker meaning "we need an erased page check".

I think nand_read_page_syndrome() does something similar;
-EBADMSG is used internally.



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 11/53] mtd: nand: denali: fix bitflips calculation in handle_ecc()

2017-03-22 Thread Boris Brezillon
On Wed, 22 Mar 2017 23:07:18 +0900
Masahiro Yamada  wrote:

> +	do {
> +		err_addr = ioread32(denali->flash_reg + ECC_ERROR_ADDRESS);
> +		err_sector = ECC_SECTOR(err_addr);
> +		err_byte = ECC_BYTE(err_addr);
> +
> +		err_cor_info = ioread32(denali->flash_reg + ERR_CORRECTION_INFO);
> +		err_cor_value = ECC_CORRECTION_VALUE(err_cor_info);
> +		err_device = ECC_ERR_DEVICE(err_cor_info);
> +
> +		/* reset the bitflip counter when crossing ECC sector */
> +		if (err_sector != prev_sector)
> +			bitflips = 0;
> +
> +		if (ECC_ERROR_UNCORRECTABLE(err_cor_info)) {
> +			/*
> +			 * if the error is not correctable, need to look at the
> +			 * page to see if it is an erased page. if so, then
> +			 * it's not a real ECC error
> +			 */
> +			ret = -EBADMSG;

You should never return -EBADMSG directly. Just increment
ecc_stats.failed and let the core return -EBADMSG to the upper layer.
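
To illustrate the division of labour suggested here, the following is a
small user-space model, not kernel code. It assumes the core snapshots
the failed counter before calling the driver's page-read hook and turns
any increase into -EBADMSG for the upper layer; struct stats_model,
driver_read_page_model() and core_read_model() are hypothetical names
introduced only for this sketch.

```c
#include <errno.h>

/* Hypothetical stand-in for the counters kept in mtd->ecc_stats. */
struct stats_model {
	unsigned int corrected;
	unsigned int failed;
};

/*
 * Hypothetical driver hook: it reports uncorrectable chunks only by
 * bumping the failed counter, and returns the max bitflips per chunk.
 */
static int driver_read_page_model(struct stats_model *stats,
				  unsigned int uncorrectable_chunks,
				  int max_bitflips)
{
	stats->failed += uncorrectable_chunks;
	return max_bitflips;
}

/*
 * Hypothetical core-side caller: it compares the failed counter before
 * and after the read, and is the only place that produces -EBADMSG.
 */
static int core_read_model(struct stats_model *stats,
			   unsigned int uncorrectable_chunks,
			   int max_bitflips)
{
	unsigned int failed_before = stats->failed;
	int ret;

	ret = driver_read_page_model(stats, uncorrectable_chunks,
				     max_bitflips);
	if (ret < 0)
		return ret;	/* some other error, e.g. an I/O failure */

	if (stats->failed != failed_before)
		return -EBADMSG;

	return ret;
}
```

In this model the driver never needs to return -EBADMSG itself; the
counter carries the information and the caller derives the error code.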


