On Tuesday, October 8, 2024 2:38 AM Alexander Kanavin <[email protected]> 
wrote:
>On Mon, 7 Oct 2024 at 22:09, Yu, Max <[email protected]> wrote:
>> Corruption is an unavoidable part of our life: disks and networks can inject
>> failures for unpredictable and unknown reasons, such as those described in
>> https://www.sciencedirect.com/science/article/abs/pii/S0026271421003723.
>> Even multi-layer protection is not perfect, as it depends on where the error
>> is injected.
> I'm sorry but I am not convinced. You are the only one reporting the issue,
> so I don't think it's unavoidable.
Being the only one reporting the issue does not mean the problem does not exist;
it may simply not be as big a problem for others as it is for us. A similar
problem exists for DRAM memory corruption. Most people don't care about it, but
for some it is an important problem: if you see 1 OS crash per 10 years per
server, it is not a big deal, but if you own 10k servers you see about 3 crashes
per day. That scale factor is what matters. Max has already described our scale.
In summary, manual work is not a solution for us because of our scale.
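
The estimate above can be checked with a quick calculation. This is only a
sketch: the function name is made up for illustration, and it assumes a rate of
one crash per server every 10 years and 365-day years.

```python
def expected_crashes_per_day(servers, years_between_crashes):
    """Expected fleet-wide crashes per day, given one crash per server
    every `years_between_crashes` years (365-day years assumed)."""
    return servers / (years_between_crashes * 365)


# 10k servers, one crash per server per 10 years: roughly 3 per day.
print(round(expected_crashes_per_day(10_000, 10), 2))  # 2.74
```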

>> We proposed this patch as it is aligned with how we currently handle
>> incorrect checksums:
>> https://github.com/yoctoproject/poky/commit/672c07de4a96eb67eaafba0873eced44ec9ae1a6.
> What the patch is doing is essentially -c cleansstate on a cache used by
> several consumers. This makes it a non-starter: you can't remove items from
> live sstate like that. Once an object is in the cache, it needs to be removed
> offline, or replaced atomically if it's corrupted.
I disagree: we can overwrite a bad artifact. Yocto indirectly does that, as it
has to rebuild the package. This is "by design" behavior. And to be honest, there
is no difference between (1) rebuilding the package every time and
(2) overwriting the sstate cache entry so that any other build can reuse it. Is
there any concern about uploading such a freshly built artifact?
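
For what it is worth, overwriting and the "replaced atomically" requirement
quoted above do not have to conflict. The following is a minimal sketch, not
what the patch does: it assumes the cache is a directory on a local or
POSIX-like filesystem (not an HTTP or object-store upload), and
replace_atomically is a hypothetical helper name.

```python
import os
import tempfile


def replace_atomically(dest_path, data):
    """Write `data` to a temporary file next to `dest_path`, then rename it
    over `dest_path`, so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(dest_path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```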

> Before we decide how to handle corrupted items, it's perhaps better to
> consider first how they should be reported. Then users can decide what they
> want to do with that information without yocto forcing something on them.
> So please: how can I observe the issue?
This is a random thing: we are not in control of, e.g., DRAM bit-flip errors;
they simply happen. To simulate the situation, you can inject an error yourself,
e.g. by overwriting a random byte of the zstd file before it is uploaded.
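
Such an injection could look roughly like the sketch below. It is only an
illustration: corrupt_random_byte is a hypothetical helper, not part of BitBake
or of the patch, and the path you pass in is whatever zstd file you want to
damage in a test setup.

```python
import os
import random


def corrupt_random_byte(path, rng=None):
    """Invert one randomly chosen byte of the file at `path` in place.

    Returns the offset of the corrupted byte.
    """
    rng = rng or random.Random()
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("cannot corrupt an empty file")
    offset = rng.randrange(size)
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(1)
        f.seek(offset)
        # XOR with 0xFF guarantees the new byte differs from the original.
        f.write(bytes([original[0] ^ 0xFF]))
    return offset
```

Run it on a copy of the file, never on a cache that other builds are using.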

>Alex

Regards,
Przemyslaw