On 31.10.2014 at 16:34, Richard Weinberger wrote:
> Hi Tanya,
>
> On 31.10.2014 at 14:12, Tanya Brokhman wrote:
>> Hi Richard
>>
>> On 10/29/2014 2:00 PM, Richard Weinberger wrote:
>>> Tanya,
>>>
>>> On 29.10.2014 at 12:03, Tanya Brokhman wrote:
>>>> I'll try to address all your comments in one place.
>>>> You're right that the read counters don't have to be exact but they do
>>>> have to reflect the real state.
>>>
>>> But it does not really matter if the counters are way too high or too low?
>>> It also does not matter if a re-read of adjacent PEBs is issued too often.
>>> It won't hurt.
>>>
>>>> Regarding your idea of saving them to a file, or somehow with userspace
>>>> involved: this is doable, but such a solution will depend on the
>>>> userspace implementation:
>>>> - one needs to update the kernel with correct read counters (saved
>>>>   somewhere in userspace)
>>>> - it is required on every boot.
>>>> - saving the counters back to userspace should be periodically triggered
>>>>   as well.
>>>> So the minimal workflow for each boot life cycle will be:
>>>> - on boot: update kernel with correct values from userspace
>>>
>>> Correct.
>>>
>>>> - kernel updates the counters on each read operation
>>>
>>> Yeah, that's a plain simple in-kernel counter..
>>>
>>>> - on powerdown: save the updated kernel counters back to userspace
>>>
>>> Correct. The counters can also be saved once a day by cron.
>>> If one or two save operations are missed it won't hurt either.
>>>
>>>> The read-disturb handling is based on the kernel updating and monitoring
>>>> read counters. Taking this out of kernel space will result in an
>>>> incomplete and very fragile solution for the read-disturb problem since
>>>> the dependency on userspace is just too big.
>>>
>>> Why?
>>> We both agree on the fact that the counters don't have to be exact.
>>> Maybe I'm wrong but to my understanding they are just a rough indicator
>>> that sometime later UBI has to check for bitrot/flips.
>>
>> The idea is to prevent data loss, to prevent errors while reading, because
>> we might hit errors we can't fix. So although the read_disturb_threshold is
>> a rough estimation based on statistics, we can't ignore it and need to stay
>> close to the calculated statistics.
>>
>> It's really the same as wear-leveling. You have a limitation that each PEB
>> can be erased a limited number of times. This erase limit is also an
>> estimation based on statistics collected by the card vendor. But you do
>> want to know the exact erase count to prevent erasing the block
>> extensively.
>
> So you have to update the EC-Header every time we read a PEB...?
>
>>
>>>
>>>> Another issue to consider is that each SW upgrade will result in losing
>>>> the counters saved in userspace and resetting them all. Otherwise, the
>>>> system upgrade process will also have to be updated.
>>>
>>> Does it hurt if these counters are lost upon an upgrade?
>>> Why do we need them forever?
>>> If they start from 0 again after an upgrade, heavily read PEBs will quickly
>>> gain a high counter and will be checked.
>>
>> Yes, we do need the ACCURATE counters and can't lose them. For example: we
>> have a heavily read block. It was read from 100 times when the
>> read-threshold is 101. Meaning, the 101st read will most probably fail.
>
> Are you trying to tell me that the NAND is so crappy that it will die after
> 100 reads? I really hope this was just a bad example.
> You *will* lose counters unless you update the EC-Header upon every read,
> which is also not sane at all.
>
>> You do a SW upgrade, and set the read-counter for this block to 0 and don't
>> scrub it. Next time you try reading from it (since it's a heavily read
>> block), you'll get errors. If you're lucky, ECC will fix them for you, but
>> it's not guaranteed.
>>
>>>
>>> And of course these counters can be preserved. One can also place them into
>>> a UBI static volume.
>>> Or use a sane upgrade process...
>>
>> "Sane upgrade" means that in order to support read-disturb we twist the
>> user's hand into implementing non-trivial logic in userspace.
>>
>>>
>>> As I wrote in my last mail we could also create a new internal UBI volume
>>> to store these counters.
>>> Then you can have the logic in the kernel but don't have to change the UBI
>>> on-disk layout.
>>>
>>>> The read counters are very much like the EC counters used for
>>>> wear-leveling: one is updated on each erase, the other on each read; one
>>>> is used to handle issues caused by frequent writes (erase operations),
>>>> the other handles issues caused by frequent reads.
>>>> So how are the two different? Why isn't wear-leveling (and erase counters)
>>>> handled by userspace? My guess is that the decision to encapsulate the
>>>> wear-leveling into the kernel was due to the above mentioned reasons.
>>>
>>> The erase counters are crucial for UBI to operate. Even while booting up
>>> the kernel and mounting UBIFS the EC counters have to be available
>>> because UBI maybe needs to move LEBs around or has to find free PEBs which
>>> are not worn out. If UBI makes a bad decision here things will break.
>>
>> Same with read-counters and last_erase_timestamps. If EC counters are lost,
>> we might end up with bad blocks (since they are worn out) and have data
>> loss. If we ignore read-disturb and don't scrub heavily read blocks we will
>> have data loss as well.
>> The only difference between the 2 scenarios is "how long before it happens".
>> Read-disturb wasn't an issue since the average lifespan of a NAND device
>> was ~5 years. Read-disturb occurs in a longer lifespan. That's why it's
>> required now: a need for a "long life NAND".
>
> Okay, read-disturb will only happen if you read blocks *very* often. Do you
> have numbers, datasheets, etc...?
>
> Let's recap.
>
> We need to address two issues:
> a) If a PEB is read very often we need to scrub it.
> b) PEBs which are not read for a very long time need to be re-read/scrubbed
> to detect bit-rot
>
> Solving b) is easy, just re-read every PEB from time to time. No persistent
> data at all is needed.
> To solve a) you suggest adding the read-counter to the UBI on-disk layout
> like the erase-counter values.
> I don't think that this is a good solution.
> We can perfectly well save the read-counters from time to time and upon
> detach, either to a file on UBIFS or into a new internal volume. As
> read-disturb will only happen after a long time and hence at very high
> read-counters, it does not matter if we lose some values upon a powercut,
> i.e. such that a counter is 50000 instead of 50500.
> Btw: We also have to be very careful that reading data will not wear out the
> flash.
>
> So, we need logic within UBI which counts every read access and persists
> this data in some way.
> As suggested in an earlier mail this can also be done purely in userspace.
> It can also be done within the UBI kernel module, i.e. by storing the
> counters in an internal volume.
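To make the recap above a bit more concrete, here is a rough, self-contained
sketch of the "count in RAM, save from time to time" idea. It is not UBI code
and does not use any UBI API. All names (rd_state, rd_note_read, rd_save,
rd_restore, rd_demo_powercut) and all numbers (8 PEBs, a threshold of 100000
reads, a snapshot every 1000 reads) are made up for illustration, and
"saving" is only a copy into a second array standing in for a file or an
internal volume.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PEB_COUNT     8u       /* hypothetical number of PEBs */
#define RD_THRESHOLD  100000u  /* hypothetical read-disturb threshold */
#define SAVE_INTERVAL 1000u    /* hypothetical: snapshot every 1000 reads */

struct rd_state {
    uint32_t live[PEB_COUNT];   /* RAM counters, bumped on every read */
    uint32_t saved[PEB_COUNT];  /* last snapshot (file or internal volume) */
    uint32_t reads_since_save;  /* reads since the last snapshot */
    uint32_t scrubs;            /* how many scrubs were requested */
};

/* Persist the RAM counters; here "persisting" is just a copy. */
void rd_save(struct rd_state *s)
{
    memcpy(s->saved, s->live, sizeof(s->live));
    s->reads_since_save = 0;
}

/* Attach after an unclean shutdown: only the last snapshot survives. */
void rd_restore(struct rd_state *s)
{
    memcpy(s->live, s->saved, sizeof(s->live));
    s->reads_since_save = 0;
}

/* Count one read of a PEB. Returns 1 if the PEB should be scrubbed now. */
int rd_note_read(struct rd_state *s, uint32_t peb)
{
    int scrub = 0;

    assert(peb < PEB_COUNT);
    if (++s->live[peb] >= RD_THRESHOLD) {
        s->live[peb] = 0;   /* scrubbing moves the data, so start over */
        s->scrubs++;
        scrub = 1;
    }
    if (++s->reads_since_save >= SAVE_INTERVAL)
        rd_save(s);
    return scrub;
}

/* Replay the power-cut case: 50500 reads, then a restart from the snapshot. */
uint32_t rd_demo_powercut(void)
{
    struct rd_state s;

    memset(&s, 0, sizeof(s));
    for (uint32_t i = 0; i < 50500u; i++)
        rd_note_read(&s, 0);
    rd_restore(&s);
    return s.live[0];
}
```

With these made-up numbers rd_demo_powercut() returns 50000: the 500 reads
since the last snapshot are lost, so the PEB would be scrubbed 500 reads
later than an exact counter would have asked for. How much slack is
acceptable depends on the real threshold of the flash, which this sketch
does not know.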
Another point: What if we scrub every PEB once a week? Why would that not work?

Thanks,
//richard
