Hi Boris & Miquel,

Could you please provide your thoughts on this driver to support HW-ECC?
As I said previously, there is no way to detect errors beyond N bit.
I am ok to update the driver based on your inputs.

Thanks,
Naga Sureshkumar Relli

> -----Original Message-----
> From: linux-mtd [mailto:linux-mtd-boun...@lists.infradead.org] On Behalf Of
> Naga Sureshkumar Relli
> Sent: Friday, December 21, 2018 1:06 PM
> To: Miquel Raynal <miquel.ray...@bootlin.com>
> Cc: r...@kernel.org; marek.va...@gmail.com; rich...@nod.at;
> martin.lund@keep-it-simple.com; linux-kernel@vger.kernel.org;
> Boris Brezillon <boris.brezil...@bootlin.com>;
> linux-...@lists.infradead.org; nagasures...@gmail.com;
> Michal Simek <mich...@xilinx.com>; computersforpe...@gmail.com;
> dw...@infradead.org
> Subject: RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for
> Arasan NAND Flash Controller
>
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:miquel.ray...@bootlin.com]
> > Sent: Wednesday, December 19, 2018 7:57 PM
> > To: Naga Sureshkumar Relli <nagas...@xilinx.com>
> > Cc: Boris Brezillon <boris.brezil...@bootlin.com>; r...@kernel.org;
> > rich...@nod.at; linux-kernel@vger.kernel.org; marek.va...@gmail.com;
> > linux-...@lists.infradead.org; nagasures...@gmail.com; Michal Simek
> > <mich...@xilinx.com>; computersforpe...@gmail.com;
> > dw...@infradead.org; martin.l...@keep-it-simple.com
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > for Arasan NAND Flash Controller
> >
> > Hi Naga,
> >
> > + Martin
> >
> > Naga Sureshkumar Relli <nagas...@xilinx.com> wrote on Tue, 18 Dec 2018
> > 05:33:53 +0000:
> >
> > > Hi Miquel,
> > >
> > > > -----Original Message-----
> > > > From: Miquel Raynal [mailto:miquel.ray...@bootlin.com]
> > > > Sent: Monday, December 17, 2018 10:11 PM
> > > > To: Naga Sureshkumar Relli <nagas...@xilinx.com>
> > > > Cc: Boris Brezillon <boris.brezil...@bootlin.com>;
> > > > r...@kernel.org; rich...@nod.at; linux-ker...@vger.kernel.org;
> > > > marek.va...@gmail.com; linux-...@lists.infradead.org;
> > > > nagasures...@gmail.com; Michal Simek <mich...@xilinx.com>;
> > > > computersforpe...@gmail.com; dw...@infradead.org
> > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add
> > > > support for Arasan NAND Flash Controller
> > > >
> > > > Hi Naga,
> > > >
> > > > [...]
> > > >
> > > > > Inserted biterror @ 48/7
> > > > > Successfully corrected 25 bit errors per subpage
> > > > > Inserted biterror @ 50/7
> > > > > ECC failure, invalid data despite read success
> > > > > root@xilinx-zc1751-dc2-2018_1:~#
> > > > >
> > > > > But even in this case also, driver is saying ECC failure but read
> > > > > success.
> > > > > That means controller is able to detect errors on read page up to 24
> > > > > bit only.
> > > > > After that there is no way to say to the upper layers that the
> > > > > page is bad because of the limitation in the controller.
> > > >
> > > > This is more than a "limitation", the design is broken. I am not
> > > > sure how to support such controller, and I am not sure if we even want
> > > > to.
> > >
> > > The number of errors that are correctable is limited by a parameter
> > > 't'(total number of errors), If there is a condition that the number
> > > of errors greater than 't', then the controller won't be able to
> > > detect that.
> > > I guess this concept is same for other controllers as well.
> > > In Arasan it is limited to 24-bit.
> > >
> > > Even, in case of Hamming, it is 1-bit error correction and 2-bit error
> > > detection.
> > > What will happen if there are multiple errors(greater than 2-bit)?
> >
> > Ok let's use the Hamming comparison in your ECC engine case.
> >
> > -> hamming:
> >    * 0 bf: everything is fine
> >    * 1 bf: will be detected, corrected, signaled
> >    * 2 bf: will be detected, not corrected, signaled
> >    * 3+ bf: don't care
> >
> > -> BCH:
> >    * 0 bf: everything is fine
> >    * 1-24 bf: will be detected, corrected, signaled
> >    * 25 bf: everything is fine
> >    * 26+ bf: don't care
> >
> > Do you see the problem?
> No.
> >
> > In the 25 bf case, the controller is reporting that everything went
> > fine while it should report that it detected an uncorrectable situation.
> >
> > Here are two leads to solve this issue, please investigate them both:
> > 1/ Talk to your colleagues that developed the RTL, ask if there is a
> >    hidden/reserved bit for that purpose that is not documented.
> I spoke to RTL guys, there is nothing hidden/reserved bit for this purpose.
> I tried reading the status registers reserved bits, but they are raz(read as
> zero)
>
> > 2/ Search for a status in the registers that might indicate that an
> >    error occurred, for instance "0 bf corrected" and "bf have been
> >    detected".
> I tried reading status registers and other registers as well, but no luck.
> >
> > NB: I know that, with a BCH ECC engine, error detection at (strength +
> > 1) is not 100% sure but statistically it will almost always be
> > detected and in this case we need the controller to warn the user!
> Ok, I understood now.
>
> Thanks,
> Naga Sureshkumar Relli
> >
> >
> > Thanks,
> > Miquèl
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
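
For reference, the Hamming/BCH outcome tables quoted above can be restated as a small C model. This is only a sketch under stated assumptions: the enum, the function names and the `strength` parameter are hypothetical and are not taken from the Arasan driver or the MTD core. `strength` would be 1 for the Hamming case and 24 for the BCH case discussed in this thread. The second function models the behaviour reported here, where (strength + 1) bitflips come back as "everything is fine".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes, for illustration only. */
enum ecc_outcome {
	ECC_CLEAN,         /* 0 bf: everything is fine */
	ECC_CORRECTED,     /* detected, corrected, signaled */
	ECC_UNCORRECTABLE, /* detected, not corrected, signaled */
	ECC_UNDEFINED,     /* beyond what the code is expected to flag */
};

/*
 * Outcome the thread expects from an engine that corrects up to
 * `strength` bitflips per ECC step. Detection at (strength + 1) is
 * guaranteed for the Hamming case above; for BCH it is only
 * statistically very likely, as noted in the NB.
 */
static enum ecc_outcome expected_outcome(unsigned int strength,
					 unsigned int bitflips)
{
	assert(strength > 0);

	if (bitflips == 0)
		return ECC_CLEAN;
	if (bitflips <= strength)
		return ECC_CORRECTED;
	if (bitflips == strength + 1)
		return ECC_UNCORRECTABLE;
	return ECC_UNDEFINED;
}

/*
 * Model of an engine with no "uncorrectable" indication: the
 * (strength + 1) case is reported as clean instead of as a failure.
 */
static enum ecc_outcome reported_without_fail_flag(unsigned int strength,
						   unsigned int bitflips)
{
	enum ecc_outcome out = expected_outcome(strength, bitflips);

	return out == ECC_UNCORRECTABLE ? ECC_CLEAN : out;
}

/* True when corrupted data would be handed upward with no error. */
static bool silent_corruption(unsigned int strength, unsigned int bitflips)
{
	return expected_outcome(strength, bitflips) == ECC_UNCORRECTABLE &&
	       reported_without_fail_flag(strength, bitflips) == ECC_CLEAN;
}
```

With `strength = 24`, `silent_corruption(24, 25)` is true while `silent_corruption(24, 24)` is false, which is the "25 bf: everything is fine" row from the BCH table.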