On Mon, 7 Oct 2024 at 22:09, Yu, Max <[email protected]> wrote:
> Corruption is an unavoidable part of life: disks and networks can inject 
> failures for unpredictable and unknown reasons, such as those described in 
> https://www.sciencedirect.com/science/article/abs/pii/S0026271421003723. Even 
> multi-layer protection is not perfect, as it depends on where the error is 
> injected.
>
> We proposed this patch as it is aligned with how we currently handle 
> incorrect checksums: 
> https://github.com/yoctoproject/poky/commit/672c07de4a96eb67eaafba0873eced44ec9ae1a6.
>
> For context, we have builds running at a large scale, almost 24/7. This scale 
> contributes to the following challenges:
> 1. sstate corruption happens in roughly 1 in 10,000 of our builds. These 
> cases are extremely hard to reproduce and debug...
> 2. With the current behavior of re-uploading the corrupted sstate object, we 
> end up in an endless loop of rebuilding that we cannot break.
> We considered this Yocto behavior a bug, specifically because of #2.
>
> I hear your point, although following your reasoning we should fail the 
> entire build rather than rebuild the packages. We can explore other 
> options like parameterizing the behavior, but it will be very useful to be 
> able to break the bad loop of rebuilding somehow.

Perhaps you can describe how the issue can be reproduced and observed
in a local build? (e.g. build something, corrupt sstate for it,
observe the endless rebuild problem)
In particular, I'd like to understand where in the existing code the
endless cycle would be triggered.
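
For the "corrupt sstate" step, something along these lines might be enough
to simulate the damage locally. This is only a rough sketch: `corrupt_file`
is a hypothetical helper, not part of bitbake or oe-core, and the path you
pass is whichever sstate archive the build left in your SSTATE_DIR. It
inverts a few bytes in the middle of the file without changing its size;
whether that reproduces the rebuild loop is exactly what needs checking.

```python
import os


def corrupt_file(path, offset=None, length=16):
    """Invert `length` bytes of `path` in place and return the offset used.

    Hypothetical helper for simulating on-disk corruption; the file size
    is left unchanged so only the content differs.
    """
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("cannot corrupt an empty file")
    if offset is None:
        offset = size // 2  # default: damage the middle of the file
    if not 0 <= offset < size:
        raise ValueError("offset is outside the file")
    length = min(length, size - offset)  # never write past the end
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(length)
        f.seek(offset)
        # XOR with 0xFF guarantees every touched byte actually changes.
        f.write(bytes(b ^ 0xFF for b in original))
    return offset
```

After running it on one archive, rebuilding the corresponding recipe and
watching what happens to that object in the cache should show whether the
cycle appears.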

Alex