Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, Feb 20, 2020, 9:22 AM Paul Barker wrote: > On Thu, 20 Feb 2020 at 12:04, Richard Purdie > wrote: > > > > On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote: > > > I'm now looking into this... > > > > > > In sstate_checkhashes() we mark sstate as available if > > > fetcher.checkstatus() succeeds. Then at a later point > > > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch() > > > calls fetcher.download() to actually get the sstate artifact. If the > > > artifact is removed from the mirror between these two accesses (due > > > to an sstate mirror clean up running in parallel to a build), or if > > > there is an intermittent download failure we could see checkstatus() > > > succeed then download() fail. > > > > > > I don't think we should ignore all setscene errors but in the > > > specific case where it's the download step that fails I think that > > > should be a warning. Or it could be an error by default with a > > > variable we can set to turn it into a warning. Does that sound > > > reasonable? If so I'll work up a patch. > > > > Thinking about the code, I'm not sure how you're generically going to > > tell the difference between a setscene task that fails as the file > > disappeared compared to a setscene failure with another real error? :/ > > > > We could make all failed setscene tasks warnings but I think that > > buries actual real errors. > > > > This is probably why I've not changed the code before now. > > > > Special exit code values? :/ > > > > I'm open to proposals. > > > > I know we could put in some configuration option but in general I hate > > these as it just means more test matrix combinations and more ways for > > people to see different behaviours. They have a time/place but I'm not > > sure its here. > > I agree - I really don't want to have to add additional complexity > here. But I do think we need to fix this in some way, others are > affected by this as can be seen from previous discussions. And in the > case of a public sstate mirror we can't control when users decide to > run builds, there will always be the chance of a user running a build > on an old commit while old sstate artifacts are cleaned or starting a > build just as the mirror is taken offline for some maintenance. > > I think we might be able to make this work if we can avoid adding any > new conditional logic to the fetcher itself. I can see that almost > every call to logger.error() is followed by raising an error - perhaps > we could rework the code to include all the relevant info in the > raised error object and allow higher level code to catch the exception > and decide what to do with it. Because once logger.error() is called, > knotty counts an error and bitbake will exit non-zero even if the > error is safely handled. Once the fetcher simply raises exceptions in > the case of failed downloads we could handle this neatly in > sstate.bbclass. Would that be a viable way forward? Or would that > break the other fetcher use cases? > FWIW we also have this problem because our CI nodes all update the sstate cache via rsync after they finish, which causes races. This hasn't affected our developers, but I suspect that is only because they aren't doing builds at 1 AM. The way we worked around it was to split up the build into two invocations of bitbake: bitbake --setscene-only || true bitbake --skip-setscene Although this will likely not work very well with hash equivalence. > Thanks, > Paul > > -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48565): https://lists.yoctoproject.org/g/yocto/message/48565 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, 20 Feb 2020 at 12:04, Richard Purdie wrote: > > On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote: > > I'm now looking into this... > > > > In sstate_checkhashes() we mark sstate as available if > > fetcher.checkstatus() succeeds. Then at a later point > > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch() > > calls fetcher.download() to actually get the sstate artifact. If the > > artifact is removed from the mirror between these two accesses (due > > to an sstate mirror clean up running in parallel to a build), or if > > there is an intermittent download failure we could see checkstatus() > > succeed then download() fail. > > > > I don't think we should ignore all setscene errors but in the > > specific case where it's the download step that fails I think that > > should be a warning. Or it could be an error by default with a > > variable we can set to turn it into a warning. Does that sound > > reasonable? If so I'll work up a patch. > > Thinking about the code, I'm not sure how you're generically going to > tell the difference between a setscene task that fails as the file > disappeared compared to a setscene failure with another real error? :/ > > We could make all failed setscene tasks warnings but I think that > buries actual real errors. > > This is probably why I've not changed the code before now. > > Special exit code values? :/ > > I'm open to proposals. > > I know we could put in some configuration option but in general I hate > these as it just means more test matrix combinations and more ways for > people to see different behaviours. They have a time/place but I'm not > sure its here. I agree - I really don't want to have to add additional complexity here. But I do think we need to fix this in some way, others are affected by this as can be seen from previous discussions. And in the case of a public sstate mirror we can't control when users decide to run builds, there will always be the chance of a user running a build on an old commit while old sstate artifacts are cleaned or starting a build just as the mirror is taken offline for some maintenance. I think we might be able to make this work if we can avoid adding any new conditional logic to the fetcher itself. I can see that almost every call to logger.error() is followed by raising an error - perhaps we could rework the code to include all the relevant info in the raised error object and allow higher level code to catch the exception and decide what to do with it. Because once logger.error() is called, knotty counts an error and bitbake will exit non-zero even if the error is safely handled. Once the fetcher simply raises exceptions in the case of failed downloads we could handle this neatly in sstate.bbclass. Would that be a viable way forward? Or would that break the other fetcher use cases? Thanks, Paul -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48535): https://lists.yoctoproject.org/g/yocto/message/48535 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, Feb 20, 2020 at 11:26:54AM +, Paul Barker wrote: > In my new CI setup I'm using an sstate mirror which seems to have some > occasional download issues. This results in the setscene task failing. > For example: > > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > do_package_write_ipk_setscene: Fetcher failure: Unable to find file > file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz > anywhere. The paths that were searched were: > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > do_package_write_ipk_setscene: No suitable staging package found > ERROR: Logfile of failure stored in: > /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524 > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > do_package_write_ipk_setscene: Failed > WARNING: Setscene task > (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene) > failed with exit code '1' - real task will be run instead > > As indicated in the final warning message there the real tasks run > since no sstate artifact is available. These tasks succeed: > > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > do_package_write_ipk: Succeeded > > The result is a successful build of the desired images. However, the > build is marked as a failure due to those sstate errors: > > Summary: There were 11 ERROR messages shown, returning a non-zero exit code. > > Is this the expected behaviour? The final images are built correctly. > I can't see any simple way to mask those setscene errors but I might > be missing something. > > The full log can be seen at > https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw. > I'm on the zeus branch here, I'll try to re-test on master later if I > can. See this previous discussion which includes patches for bitbake and oe-core to change them to warnings: https://marc.info/?l=openembedded-core=150403687120408=2 Because it was rejected I'm parsing the bitbake output to see if all ERROR: messages were only about setscene tasks as mentioned in the same thread much later: https://marc.info/?l=openembedded-core=157504616302317=4 signature.asc Description: PGP signature -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48533): https://lists.yoctoproject.org/g/yocto/message/48533 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote: > I'm now looking into this... > > In sstate_checkhashes() we mark sstate as available if > fetcher.checkstatus() succeeds. Then at a later point > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch() > calls fetcher.download() to actually get the sstate artifact. If the > artifact is removed from the mirror between these two accesses (due > to an sstate mirror clean up running in parallel to a build), or if > there is an intermittent download failure we could see checkstatus() > succeed then download() fail. > > I don't think we should ignore all setscene errors but in the > specific case where it's the download step that fails I think that > should be a warning. Or it could be an error by default with a > variable we can set to turn it into a warning. Does that sound > reasonable? If so I'll work up a patch. Thinking about the code, I'm not sure how you're generically going to tell the difference between a setscene task that fails as the file disappeared compared to a setscene failure with another real error? :/ We could make all failed setscene tasks warnings but I think that buries actual real errors. This is probably why I've not changed the code before now. Special exit code values? :/ I'm open to proposals. I know we could put in some configuration option but in general I hate these as it just means more test matrix combinations and more ways for people to see different behaviours. They have a time/place but I'm not sure its here. Cheers, Richard -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48532): https://lists.yoctoproject.org/g/yocto/message/48532 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, 20 Feb 2020 at 11:36, Richard Purdie wrote: > > On Thu, 2020-02-20 at 11:26 +, Paul Barker wrote: > > In my new CI setup I'm using an sstate mirror which seems to have > > some > > occasional download issues. This results in the setscene task > > failing. > > For example: > > > > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > > do_package_write_ipk_setscene: Fetcher failure: Unable to find file > > file://fd/sstate:qt3d:armv7at2hf-neon-linux- > > gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf- > > neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a > > 74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf- > > neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf- > > neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a > > 74_package_write_ipk.tgz > > anywhere. The paths that were searched were: > > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > > do_package_write_ipk_setscene: No suitable staging package found > > ERROR: Logfile of failure stored in: > > /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon- > > linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59- > > r0/temp/log.do_package_write_ipk_setscene.10524 > > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > > do_package_write_ipk_setscene: Failed > > WARNING: Setscene task > > (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes- > > qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene) > > failed with exit code '1' - real task will be run instead > > > > As indicated in the final warning message there the real tasks run > > since no sstate artifact is available. These tasks succeed: > > > > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > > do_package_write_ipk: Succeeded > > > > The result is a successful build of the desired images. However, the > > build is marked as a failure due to those sstate errors: > > > > Summary: There were 11 ERROR messages shown, returning a non-zero > > exit code. > > > > Is this the expected behaviour? The final images are built correctly. > > I can't see any simple way to mask those setscene errors but I might > > be missing something. > > > > The full log can be seen at > > https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw. > > I'm on the zeus branch here, I'll try to re-test on master later if I > > can. > > We've discussed this before and it can be argued either way. > > Personally, I worry about why artefacts "disappear" and this is why its > an error, files should not be disappearing part way through a build. > > From a bitbake perspective, a task really did fail and task failures > are errors. The fact it was able to recover is a bonus. > > Perhaps it should be a warning now we have levels of warnings that are > meaningful. Previously we threw so many, this would have been one more > lost amongst many. I know many people don't like the behaviour. I'm now looking into this... In sstate_checkhashes() we mark sstate as available if fetcher.checkstatus() succeeds. Then at a later point sstate_setscene() calls sstate_installpkg() calls pstaging_fetch() calls fetcher.download() to actually get the sstate artifact. If the artifact is removed from the mirror between these two accesses (due to an sstate mirror clean up running in parallel to a build), or if there is an intermittent download failure we could see checkstatus() succeed then download() fail. I don't think we should ignore all setscene errors but in the specific case where it's the download step that fails I think that should be a warning. Or it could be an error by default with a variable we can set to turn it into a warning. Does that sound reasonable? If so I'll work up a patch. Thanks, Paul -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48530): https://lists.yoctoproject.org/g/yocto/message/48530 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
Re: [yocto] Bitbake returning non-zero due to sstate errors
On Thu, 2020-02-20 at 11:26 +, Paul Barker wrote: > In my new CI setup I'm using an sstate mirror which seems to have > some > occasional download issues. This results in the setscene task > failing. > For example: > > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > do_package_write_ipk_setscene: Fetcher failure: Unable to find file > file://fd/sstate:qt3d:armv7at2hf-neon-linux- > gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf- > neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a > 74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf- > neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf- > neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a > 74_package_write_ipk.tgz > anywhere. The paths that were searched were: > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 > do_package_write_ipk_setscene: No suitable staging package found > ERROR: Logfile of failure stored in: > /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon- > linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59- > r0/temp/log.do_package_write_ipk_setscene.10524 > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > do_package_write_ipk_setscene: Failed > WARNING: Setscene task > (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes- > qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene) > failed with exit code '1' - real task will be run instead > > As indicated in the final warning message there the real tasks run > since no sstate artifact is available. These tasks succeed: > > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task > do_package_write_ipk: Succeeded > > The result is a successful build of the desired images. However, the > build is marked as a failure due to those sstate errors: > > Summary: There were 11 ERROR messages shown, returning a non-zero > exit code. > > Is this the expected behaviour? The final images are built correctly. > I can't see any simple way to mask those setscene errors but I might > be missing something. > > The full log can be seen at > https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw. > I'm on the zeus branch here, I'll try to re-test on master later if I > can. We've discussed this before and it can be argued either way. Personally, I worry about why artefacts "disappear" and this is why its an error, files should not be disappearing part way through a build. >From a bitbake perspective, a task really did fail and task failures are errors. The fact it was able to recover is a bonus. Perhaps it should be a warning now we have levels of warnings that are meaningful. Previously we threw so many, this would have been one more lost amongst many. I know many people don't like the behaviour. Cheers, Richard -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48528): https://lists.yoctoproject.org/g/yocto/message/48528 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-
[yocto] Bitbake returning non-zero due to sstate errors
In my new CI setup I'm using an sstate mirror which seems to have some occasional download issues. This results in the setscene task failing. For example: ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 do_package_write_ipk_setscene: Fetcher failure: Unable to find file file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz anywhere. The paths that were searched were: /builds/SanCloudLtd/sancloud-arago/build/sstate-cache /builds/SanCloudLtd/sancloud-arago/build/sstate-cache ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0 do_package_write_ipk_setscene: No suitable staging package found ERROR: Logfile of failure stored in: /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524 NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task do_package_write_ipk_setscene: Failed WARNING: Setscene task (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene) failed with exit code '1' - real task will be run instead As indicated in the final warning message there the real tasks run since no sstate artifact is available. These tasks succeed: NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task do_package_write_ipk: Succeeded The result is a successful build of the desired images. However, the build is marked as a failure due to those sstate errors: Summary: There were 11 ERROR messages shown, returning a non-zero exit code. Is this the expected behaviour? The final images are built correctly. I can't see any simple way to mask those setscene errors but I might be missing something. The full log can be seen at https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw. I'm on the zeus branch here, I'll try to re-test on master later if I can. Thanks, Paul -=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#48527): https://lists.yoctoproject.org/g/yocto/message/48527 Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656 Group Owner: yocto+ow...@lists.yoctoproject.org Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-