Interesting; I see what this method does for you. I still think there is a fairly high chance of getting a false positive with a 1 byte crc but that might not be a huge issue.
Anyway, thanks for this! Will > On Jul 31, 2018, at 12:47 AM, Laczen JMS <laczen...@gmail.com> wrote: > > Hi Marko and Will, > > I have experienced the same problem while developing nvs for zephyr. I > always got stuck when I would except that there would be changes to > flash by an "external" factor. As soon as this could happen there is a > problem with the meaning of the crc. The crc fails this means: > a. Incomplete write, > b. Some tampering occurred, > c. The len is not pointing to the correct location, > In all these cases the result will be that the data is not considered > valid, the only case where a increased crc might be helping is in b. > > In the latest version of nvs I have changed how the data is written to > flash. The data itself is written from the start of a sector, a > "allocation entry" is written from the end of a flash sector. This > allocation entry contains a sector offset and length of the data > written. The allocation entry is of fixed size. The writing of data > is done using the following method: > 1. Check if there is sufficient space (difference between allocation > entry location and next data location), > 2. Write the data at the specific location, this could include a data crc, > 3. Write the allocation entry, this includes a crc calculated over the > item length and offset, > > When there is insufficient space a allocation entry of all zeros is > written at the end of the allocation entries, and a new sector is > used. > > At startup the allocation entry write position is found by stepping > through the allocation entries until all 0xff are found. The data > write position is found by searching the position from which all data > up to the allocation entry write position is 0xff. > > The disadvantage is of course that the allocation entries take up some > space (only the offset is added). > > KInd regards, > > Jehudi > > Op ma 30 jul. 2018 om 23:16 schreef Cufi, Carles <carles.c...@nordicsemi.no>: >> >> + Andrzej >> >> On 30/07/2018, 18:45, "will sanfilippo" <wi...@runtime.io> wrote: >> >> I guess you could also use some error correction/detection coding scheme >> as well if you wanted to try and actually recover data that had errors in it >> (but that seems a bit much). Another thing I have seen done is that just the >> header has FEC on it. This way, you can almost always get the proper length. >> Again, not sure if any of this is warranted. >> >> Will >>> On Jul 30, 2018, at 6:44 AM, marko kiiskila <ma...@runtime.io> wrote: >>> >>> Hi, >>> >>>> On Jul 30, 2018, at 1:47 PM, Laczen JMS <laczen...@gmail.com> wrote: >>>> >>>> Hi Marko and Will, >>>> >>>> I have been studying fcb and I think you can leave it at 8 bit as it >>>> was. The crc can only be interpreted as a check that >>>> the closing was done properly. As soon as the crc check fails this >>>> only means that the closing failed. It could just as well >>>> be fixed to zero instead of a crc. >>> >>> That is a valid point. If you can’t trust your data path to flash, what can >>> you trust? >>> >>>> >>>> When writing errors (bit errors) would occur fcb would need a lot more >>>> protection, the length which is written first could >>>> no longer be trusted and it would be impossible to find the crc. The >>>> writing of the length can also be a more problematic >>>> case to solve, what happens when the write of the length fails and the >>>> length that is read is pointing outside the sector ? >>> >>> On the other hand, there are flashes where multiple write cycles to >>> same location are allowed; you can keep turning more bits to zero. >>> There you can corrupt the data after writing. And on some the default, >>> unwritten state is 0 (this we need to address for FCB, image management). >>> >>> You’re right; adding mechanisms to detect corruption of length field, for >>> example, starts to get complicated and costly. Recovery is easier to do, >>> however, if we use a stronger version of CRC. I.e. if CRC does not match, >>> then just go forward a byte at a time until it does. >>> >>> >>>> >>>> Kind regards, >>>> >>>> Jehudi >>>> >>>> >>>> Op ma 30 jul. 2018 om 11:10 schreef marko kiiskila <ma...@runtime.io>: >>>>> >>>>> Thanks for the response, Will. >>>>> >>>>> I made it one-byte because the scenario where the CRC check strength >>>>> comes into play is somewhat rare; >>>>> it is to detect partial writes. I.e. when fcb_append_finish() fails to >>>>> get called. This, presumably, would >>>>> onlly happen on crash/power outage within a specific time window. This is >>>>> not used as an error detection >>>>> mechanism on a channel where we expect bit errors. >>>>> >>>>> The way I did it was I added 2 syscfg knobs to control which CRC is >>>>> included in the build. In case you get >>>>> really tight on code space. Of course, newtmgr uses CRC16, so if you have >>>>> that enabled, >>>>> there is no gain. >>>>> There’s 3 different options when starting FCB: inherit from flash, force >>>>> 16 bit, or force 8 bits. If the flash region >>>>> has not been initialized with anything, then the ‘inherit’ option will >>>>> prefer 16 bits over 8 bits. >>>>> So if you need to worry about backwards compatibility, set ‘inherit’. If >>>>> you want to use 16 bit, and are flashing >>>>> a new device, use ‘inherit’ or set ‘force 16bits’. If you need to keep >>>>> FCB region backwards compatible, use >>>>> option ‘force 8 bits’. >>>>> >>>>> >>>>>> On Jul 27, 2018, at 11:05 PM, will sanfilippo <wi...@runtime.io> wrote: >>>>>> >>>>>> I realize I am a bit late with these comments but better late than never >>>>>> I guess. Well, maybe some or all of these comments should never be made >>>>>> :-) This could be a horrible idea but why not just make it backward >>>>>> incompatible? (for #2). Maybe have a syscfg for folks that want to use >>>>>> the old format in the code? I would even consider making all entries >>>>>> have a two-byte CRC (regardless of length of data written). I presume >>>>>> the resume that it was one-byte was to save flash space? Heck, maybe >>>>>> even allow the syscfg to specify a 32-bit CRC (for those with lots of >>>>>> flash and/or really do not want a false positive to occur). >>>>>> >>>>>> Will >>>>>>> On Jul 26, 2018, at 5:10 AM, marko kiiskila <ma...@runtime.io> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> there’s few issues with FCB which I need to fix. >>>>>>> >>>>>>> 1) Compressing flash sector in the middle of write. >>>>>>> Currently the cycle of operation is expected to be the following: >>>>>>> fcb_append() -> get offset to write data to >>>>>>> flash_area_write() -> write data >>>>>>> fcb_append_finish() -> writes CRC etc >>>>>>> >>>>>>> This is not too great if the location returned by fcb_append() gets >>>>>>> rotated to be >>>>>>> scratch area before fcb_append_finish() is called. Which is quite >>>>>>> possible >>>>>>> if you’re using FCB to store logs, and data comes from different tasks. >>>>>>> >>>>>>> 2) CRC is 1 byte only. It’s ok for short config entries, but for longer >>>>>>> things >>>>>>> (log entries) is not good enough. >>>>>>> >>>>>>> To fix 1 I was going to add linkage to fcb_entry such that FCB can track >>>>>>> of incomplete writes, and add a write routine which should be used when >>>>>>> doing the write. >>>>>>> >>>>>>> Issue 2 is a little bit more cumbersome. I want to be able to at least >>>>>>> allow backwards compatibility, where code figures out from data in >>>>>>> the flash which CRC format is there. But then be able to switch to >>>>>>> 16 bit CRC format as new flash areas are written. Or keep using >>>>>>> the old format, because you want to do downgrade to old SW version >>>>>>> if necessary. >>>>>>> >>>>>>> So that’s what I’m currently thinking of trying to achieve. Let me know >>>>>>> if you guys have other things to take into consideration here. >>>>>> >>>>> >>> >> >> >>