Re: FCB revamp

will sanfilippo Tue, 31 Jul 2018 09:24:50 -0700

Interesting; I see what this method does for you. I still think there is a 
fairly high chance of getting a false positive with a 1 byte crc but that might 
not be a huge issue.


Anyway, thanks for this!

Will

> On Jul 31, 2018, at 12:47 AM, Laczen JMS <laczen...@gmail.com> wrote:
> 
> Hi Marko and Will,
> 
> I have experienced the same problem while developing nvs for zephyr. I
> always got stuck when I would except that there would be changes to
> flash by an "external" factor. As soon as this could happen there is a
> problem with the meaning of the crc. The crc fails this means:
> a. Incomplete write,
> b. Some tampering occurred,
> c. The len is not pointing to the correct location,
> In all these cases the result will be that the data is not considered
> valid, the only case where a increased crc might be helping is in b.
> 
> In the latest version of nvs I have changed how the data is written to
> flash. The data itself is written from the start of a sector, a
> "allocation entry" is written from the end of a flash sector. This
> allocation entry contains a sector offset and length of the data
> written.  The allocation entry is of fixed size. The writing of data
> is done using the following method:
> 1. Check if there is sufficient space (difference between allocation
> entry location and next data location),
> 2. Write the data at the specific location, this could include a data crc,
> 3. Write the allocation entry, this includes a crc calculated over the
> item length and offset,
> 
> When there is insufficient space a allocation entry of all zeros is
> written at the end of the allocation entries, and a new sector is
> used.
> 
> At startup the allocation entry write position is found by stepping
> through the allocation entries until all 0xff are found. The data
> write position is found by searching the position from which all data
> up to the allocation entry write position is 0xff.
> 
> The disadvantage is of course that the allocation entries take up some
> space (only the offset is added).
> 
> KInd regards,
> 
> Jehudi
> 
> Op ma 30 jul. 2018 om 23:16 schreef Cufi, Carles <carles.c...@nordicsemi.no>:
>> 
>> + Andrzej
>> 
>> On 30/07/2018, 18:45, "will sanfilippo" <wi...@runtime.io> wrote:
>> 
>>    I guess you could also use some error correction/detection coding scheme 
>> as well if you wanted to try and actually recover data that had errors in it 
>> (but that seems a bit much). Another thing I have seen done is that just the 
>> header has FEC on it. This way, you can almost always get the proper length. 
>> Again, not sure if any of this is warranted.
>> 
>>    Will
>>> On Jul 30, 2018, at 6:44 AM, marko kiiskila <ma...@runtime.io> wrote:
>>> 
>>> Hi,
>>> 
>>>> On Jul 30, 2018, at 1:47 PM, Laczen JMS <laczen...@gmail.com> wrote:
>>>> 
>>>> Hi Marko and Will,
>>>> 
>>>> I have been studying fcb and I think you can leave it at 8 bit as it
>>>> was. The crc can only be interpreted as a check that
>>>> the closing was done properly. As soon as the crc check fails this
>>>> only means that the closing failed. It could just as well
>>>> be fixed to zero instead of a crc.
>>> 
>>> That is a valid point. If you can’t trust your data path to flash, what can 
>>> you trust?
>>> 
>>>> 
>>>> When writing errors (bit errors) would occur fcb would need a lot more
>>>> protection, the  length which is written first could
>>>> no longer be trusted and it would be impossible to find the crc. The
>>>> writing of the length can also be a more problematic
>>>> case to solve, what happens when the write of the length fails and the
>>>> length that is read is pointing outside the sector ?
>>> 
>>> On the other hand, there are flashes where multiple write cycles to
>>> same location are allowed; you can keep turning more bits to zero.
>>> There you can corrupt the data after writing. And on some the default,
>>> unwritten state is 0 (this we need to address for FCB, image management).
>>> 
>>> You’re right; adding mechanisms to detect corruption of length field, for
>>> example, starts to get complicated and costly. Recovery is easier to do,
>>> however, if we use a stronger version of CRC. I.e. if CRC does not match,
>>> then just go forward a byte at a time until it does.
>>> 
>>> 
>>>> 
>>>> Kind regards,
>>>> 
>>>> Jehudi
>>>> 
>>>> 
>>>> Op ma 30 jul. 2018 om 11:10 schreef marko kiiskila <ma...@runtime.io>:
>>>>> 
>>>>> Thanks for the response, Will.
>>>>> 
>>>>> I made it one-byte because the scenario where the CRC check strength 
>>>>> comes into play is somewhat rare;
>>>>> it is to detect partial writes. I.e. when fcb_append_finish() fails to 
>>>>> get called. This, presumably, would
>>>>> onlly happen on crash/power outage within a specific time window. This is 
>>>>> not used as an error detection
>>>>> mechanism on a channel where we expect bit errors.
>>>>> 
>>>>> The way I did it was I added 2 syscfg knobs to control which CRC is 
>>>>> included in the build. In case you get
>>>>> really tight on code space. Of course, newtmgr uses CRC16, so if you have 
>>>>> that enabled,
>>>>> there is no gain.
>>>>> There’s 3 different options when starting FCB: inherit from flash, force 
>>>>> 16 bit, or force 8 bits. If the flash region
>>>>> has not been initialized with anything, then the ‘inherit’ option will 
>>>>> prefer 16 bits over 8 bits.
>>>>> So if you need to worry about backwards compatibility, set ‘inherit’. If 
>>>>> you want to use 16 bit, and are flashing
>>>>> a new device, use ‘inherit’ or set ‘force 16bits’. If you need to keep 
>>>>> FCB region backwards compatible, use
>>>>> option ‘force 8 bits’.
>>>>> 
>>>>> 
>>>>>> On Jul 27, 2018, at 11:05 PM, will sanfilippo <wi...@runtime.io> wrote:
>>>>>> 
>>>>>> I realize I am a bit late with these comments but better late than never 
>>>>>> I guess. Well, maybe some or all of these comments should never be made 
>>>>>> :-) This could be a horrible idea but why not just make it backward 
>>>>>> incompatible? (for #2). Maybe have a syscfg for folks that want to use 
>>>>>> the old format in the code? I would even consider making all entries 
>>>>>> have a two-byte CRC (regardless of length of data written). I presume 
>>>>>> the resume that it was one-byte was to save flash space? Heck, maybe 
>>>>>> even allow the syscfg to specify a 32-bit CRC (for those with lots of 
>>>>>> flash and/or really do not want a false positive to occur).
>>>>>> 
>>>>>> Will
>>>>>>> On Jul 26, 2018, at 5:10 AM, marko kiiskila <ma...@runtime.io> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> there’s few issues with FCB which I need to fix.
>>>>>>> 
>>>>>>> 1) Compressing flash sector in the middle of write.
>>>>>>> Currently the cycle of operation is expected to be the following:
>>>>>>> fcb_append() -> get offset to write data to
>>>>>>> flash_area_write() -> write data
>>>>>>> fcb_append_finish() -> writes CRC etc
>>>>>>> 
>>>>>>> This is not too great if the location returned by fcb_append() gets 
>>>>>>> rotated to be
>>>>>>> scratch area before fcb_append_finish() is called. Which is quite 
>>>>>>> possible
>>>>>>> if you’re using FCB to store logs, and data comes from different tasks.
>>>>>>> 
>>>>>>> 2) CRC is 1 byte only. It’s ok for short config entries, but for longer 
>>>>>>> things
>>>>>>> (log entries) is not good enough.
>>>>>>> 
>>>>>>> To fix 1 I was going to add linkage to fcb_entry such that FCB can track
>>>>>>> of incomplete writes, and add a write routine which should be used when
>>>>>>> doing the write.
>>>>>>> 
>>>>>>> Issue 2 is a little bit more cumbersome. I want to be able to at least
>>>>>>> allow backwards compatibility, where code figures out from data in
>>>>>>> the flash which CRC format is there. But then be able to switch to
>>>>>>> 16 bit CRC format as new flash areas are written. Or keep using
>>>>>>> the old format, because you want to do downgrade to old SW version
>>>>>>> if necessary.
>>>>>>> 
>>>>>>> So that’s what I’m currently thinking of trying to achieve. Let me know
>>>>>>> if you guys have other things to take into consideration here.
>>>>>> 
>>>>> 
>>> 
>> 
>> 
>>

Re: FCB revamp

Reply via email to