On Mon, 7 Oct 2024 at 22:09, Yu, Max <[email protected]> wrote:
> Corruptions are unavoidable part of our life, disks and network can inject
> failures due to unpredictable and unknown reasons like
> https://www.sciencedirect.com/science/article/abs/pii/S0026271421003723. Even
> multi-layer protection is not perfect as it depends on where the error is
> injected.
>
> We proposed this patch as it is aligned with how we currently handle
> incorrect checksums:
> https://github.com/yoctoproject/poky/commit/672c07de4a96eb67eaafba0873eced44ec9ae1a6.
>
> For context, we have builds running at a large scale, almost 24/7. This scale
> contributes to the following challenges:
> 1. sstate corruption for us happens ~1/10000 of builds. These are extremely
> hard to reproduce and debug...
> 2. with current behavior of reuploading the corrupted sstate object, we end
> up in an endless loop of rebuilding that we cannot break.
> We considered this yocto behavior as a bug, specifically because of #2.
>
> I hear your point, even though following your reasoning we should fail the
> entire build rather than rebuilding the packages. We can explore other
> options like parameterizing the behavior, but it will be very useful to be
> able to break the bad loop of rebuilding somehow.
Perhaps you can describe how the issue can be reproduced and observed
in a local build? (e.g. build something, corrupt sstate for it,
observe the endless rebuild problem)

I'd like to understand particularly where in existing code the endless
cycle would be triggered.

Alex
