Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-21 Thread Joshua Watt
On Thu, Feb 20, 2020, 9:22 AM Paul Barker wrote:

> On Thu, 20 Feb 2020 at 12:04, Richard Purdie wrote:
> >
> > On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote:
> > > I'm now looking into this...
> > >
> > > In sstate_checkhashes() we mark sstate as available if
> > > fetcher.checkstatus() succeeds. Then at a later point
> > > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch()
> > > calls fetcher.download() to actually get the sstate artifact. If the
> > > artifact is removed from the mirror between these two accesses (due
> > > to an sstate mirror clean up running in parallel to a build), or if
> > > there is an intermittent download failure we could see checkstatus()
> > > succeed then download() fail.
> > >
> > > I don't think we should ignore all setscene errors but in the
> > > specific case where it's the download step that fails I think that
> > > should be a warning. Or it could be an error by default with a
> > > variable we can set to turn it into a warning. Does that sound
> > > reasonable? If so I'll work up a patch.
> >
> > Thinking about the code, I'm not sure how you're generically going to
> > tell the difference between a setscene task that fails as the file
> > disappeared compared to a setscene failure with another real error? :/
> >
> > We could make all failed setscene tasks warnings but I think that
> > buries actual real errors.
> >
> > This is probably why I've not changed the code before now.
> >
> > Special exit code values? :/
> >
> > I'm open to proposals.
> >
> > I know we could put in some configuration option but in general I hate
> > these as it just means more test matrix combinations and more ways for
> > people to see different behaviours. They have a time/place but I'm not
> > sure it's here.
>
> I agree - I really don't want to have to add additional complexity
> here. But I do think we need to fix this in some way, others are
> affected by this as can be seen from previous discussions. And in the
> case of a public sstate mirror we can't control when users decide to
> run builds, there will always be the chance of a user running a build
> on an old commit while old sstate artifacts are cleaned or starting a
> build just as the mirror is taken offline for some maintenance.
>
> I think we might be able to make this work if we can avoid adding any
> new conditional logic to the fetcher itself. I can see that almost
> every call to logger.error() is followed by raising an error - perhaps
> we could rework the code to include all the relevant info in the
> raised error object and allow higher level code to catch the exception
> and decide what to do with it. Because once logger.error() is called,
> knotty counts an error and bitbake will exit non-zero even if the
> error is safely handled. Once the fetcher simply raises exceptions in
> the case of failed downloads we could handle this neatly in
> sstate.bbclass. Would that be a viable way forward? Or would that
> break the other fetcher use cases?
>

FWIW we also have this problem because our CI nodes all update the sstate
cache via rsync after they finish, which causes races. This hasn't affected
our developers, but I suspect that is only because they aren't doing builds
at 1 AM.

The way we worked around it was to split up the build into two invocations
of bitbake:

 bitbake --setscene-only <target> || true
 bitbake --skip-setscene <target>

Although this will likely not work very well with hash equivalence.
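
For a CI wrapper written in Python, the same two invocations might look
like the rough sketch below. The function name two_phase_build and the
injectable run parameter are assumptions made for illustration; only the
two bitbake flags come from the commands above.

```python
import subprocess


def two_phase_build(target, run=subprocess.call):
    """Run the setscene phase, ignore its result, then run the real build.

    `run` is any callable that takes an argument list and returns an exit
    code; it defaults to subprocess.call and is injectable for testing.
    """
    # Phase 1: restore what we can from sstate. A failure here is tolerated,
    # which is the Python equivalent of the "|| true" in the shell version.
    run(["bitbake", "--setscene-only", target])
    # Phase 2: run the remaining tasks without setscene. Only this exit code
    # decides whether the CI job passes.
    return run(["bitbake", "--skip-setscene", target])
```

The exit code of the first phase is discarded on purpose, so a real
(non-download) setscene problem would be hidden as well.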


> Thanks,
> Paul
> 
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#48565): https://lists.yoctoproject.org/g/yocto/message/48565
Mute This Topic: https://lists.yoctoproject.org/mt/71426351/21656
Group Owner: yocto+ow...@lists.yoctoproject.org
Unsubscribe: https://lists.yoctoproject.org/g/yocto/unsub  
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Paul Barker
On Thu, 20 Feb 2020 at 12:04, Richard Purdie wrote:
>
> On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote:
> > I'm now looking into this...
> >
> > In sstate_checkhashes() we mark sstate as available if
> > fetcher.checkstatus() succeeds. Then at a later point
> > sstate_setscene() calls sstate_installpkg() calls pstaging_fetch()
> > calls fetcher.download() to actually get the sstate artifact. If the
> > artifact is removed from the mirror between these two accesses (due
> > to an sstate mirror clean up running in parallel to a build), or if
> > there is an intermittent download failure we could see checkstatus()
> > succeed then download() fail.
> >
> > I don't think we should ignore all setscene errors but in the
> > specific case where it's the download step that fails I think that
> > should be a warning. Or it could be an error by default with a
> > variable we can set to turn it into a warning. Does that sound
> > reasonable? If so I'll work up a patch.
>
> Thinking about the code, I'm not sure how you're generically going to
> tell the difference between a setscene task that fails as the file
> disappeared compared to a setscene failure with another real error? :/
>
> We could make all failed setscene tasks warnings but I think that
> buries actual real errors.
>
> This is probably why I've not changed the code before now.
>
> Special exit code values? :/
>
> I'm open to proposals.
>
> I know we could put in some configuration option but in general I hate
> these as it just means more test matrix combinations and more ways for
> people to see different behaviours. They have a time/place but I'm not
> sure it's here.

I agree - I really don't want to have to add additional complexity
here. But I do think we need to fix this in some way; others are
affected by this, as can be seen from previous discussions. And in the
case of a public sstate mirror we can't control when users decide to
run builds: there will always be the chance of a user running a build
on an old commit while old sstate artifacts are cleaned, or starting a
build just as the mirror is taken offline for some maintenance.

I think we might be able to make this work if we can avoid adding any
new conditional logic to the fetcher itself. I can see that almost
every call to logger.error() is followed by raising an error - perhaps
we could rework the code to include all the relevant info in the
raised error object and allow higher level code to catch the exception
and decide what to do with it. Because once logger.error() is called,
knotty counts an error and bitbake will exit non-zero even if the
error is safely handled. Once the fetcher simply raises exceptions in
the case of failed downloads we could handle this neatly in
sstate.bbclass. Would that be a viable way forward? Or would that
break the other fetcher use cases?
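
To illustrate the difference, here is a minimal sketch using only the
Python logging module. It is not BitBake code: ErrorCounter merely stands
in for the error tally knotty keeps, and DownloadFailed, download_logging,
download_raising, setscene and errors_after are hypothetical names.

```python
import logging


class ErrorCounter(logging.Handler):
    """Counts ERROR records, standing in for the UI's error tally."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.count = 0

    def emit(self, record):
        self.count += 1


class DownloadFailed(Exception):
    """Hypothetical exception carrying the details that would be logged."""

    def __init__(self, url, reason):
        super().__init__("%s: %s" % (url, reason))
        self.url = url
        self.reason = reason


def download_logging(logger, url):
    # Pattern described above: log an error first, then raise.
    logger.error("Fetcher failure for %s", url)
    raise DownloadFailed(url, "not found")


def download_raising(logger, url):
    # Proposed pattern: only raise, with the details inside the exception.
    raise DownloadFailed(url, "not found")


def setscene(logger, download, url):
    # Higher-level caller decides the severity; here it picks a warning.
    try:
        download(logger, url)
        return True
    except DownloadFailed as exc:
        logger.warning("sstate download failed (%s), running real task", exc)
        return False


def errors_after(download):
    """Return how many ERROR records were counted for one failed download."""
    logger = logging.getLogger("sstate_sketch")
    logger.propagate = False  # keep demo records away from the root logger
    counter = ErrorCounter()
    logger.addHandler(counter)
    try:
        setscene(logger, download, "file://example.tgz")
    finally:
        logger.removeHandler(counter)
    return counter.count
```

With download_logging the counter ends at 1 even though the caller
handled the exception; with download_raising it stays at 0 and only the
caller's warning is emitted.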

Thanks,
Paul
View/Reply Online (#48535): https://lists.yoctoproject.org/g/yocto/message/48535


Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Martin Jansa
On Thu, Feb 20, 2020 at 11:26:54AM +, Paul Barker wrote:
> In my new CI setup I'm using an sstate mirror which seems to have some
> occasional download issues. This results in the setscene task failing.
> For example:
> 
> ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> do_package_write_ipk_setscene: Fetcher failure: Unable to find file
> file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz
> anywhere. The paths that were searched were:
> /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> do_package_write_ipk_setscene: No suitable staging package found
> ERROR: Logfile of failure stored in:
> /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524
> NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> do_package_write_ipk_setscene: Failed
> WARNING: Setscene task
> (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene)
> failed with exit code '1' - real task will be run instead
> 
> As indicated in the final warning message there the real tasks run
> since no sstate artifact is available. These tasks succeed:
> 
> NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> do_package_write_ipk: Succeeded
> 
> The result is a successful build of the desired images. However, the
> build is marked as a failure due to those sstate errors:
> 
> Summary: There were 11 ERROR messages shown, returning a non-zero exit code.
> 
> Is this the expected behaviour? The final images are built correctly.
> I can't see any simple way to mask those setscene errors but I might
> be missing something.
> 
> The full log can be seen at
> https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw.
> I'm on the zeus branch here, I'll try to re-test on master later if I
> can.

See this previous discussion which includes patches for bitbake and
oe-core to change them to warnings:
https://marc.info/?l=openembedded-core&m=150403687120408&w=2

Because it was rejected I'm parsing the bitbake output to see if all
ERROR: messages were only about setscene tasks as mentioned in the same
thread much later:
https://marc.info/?l=openembedded-core&m=157504616302317&w=4
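
A rough sketch of that kind of output check follows. It is not the script
referred to above. It assumes each ERROR message sits on one line starting
with "ERROR:" (as the console prints it, before mail wrapping) and that
setscene-related errors contain "_setscene". The function names
only_setscene_errors and effective_exit_code are hypothetical.

```python
def only_setscene_errors(log_text):
    """Return True if the log has ERROR lines and all mention a setscene task."""
    errors = [line for line in log_text.splitlines()
              if line.startswith("ERROR:")]
    return bool(errors) and all("_setscene" in line for line in errors)


def effective_exit_code(returncode, log_text):
    """Map bitbake's exit code to the status the CI job should report."""
    if returncode == 0:
        return 0
    # Non-zero exit: forgive it only when every ERROR line is setscene-related.
    return 0 if only_setscene_errors(log_text) else returncode
```

A non-zero exit with no ERROR lines at all is still reported as a
failure, since nothing in the log explains it.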


View/Reply Online (#48533): https://lists.yoctoproject.org/g/yocto/message/48533


Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Richard Purdie
On Thu, 2020-02-20 at 11:59 +, Paul Barker wrote:
> I'm now looking into this...
> 
> In sstate_checkhashes() we mark sstate as available if
> fetcher.checkstatus() succeeds. Then at a later point
> sstate_setscene() calls sstate_installpkg() calls pstaging_fetch()
> calls fetcher.download() to actually get the sstate artifact. If the
> artifact is removed from the mirror between these two accesses (due
> to an sstate mirror clean up running in parallel to a build), or if
> there is an intermittent download failure we could see checkstatus()
> succeed then download() fail.
> 
> I don't think we should ignore all setscene errors but in the
> specific case where it's the download step that fails I think that
> should be a warning. Or it could be an error by default with a
> variable we can set to turn it into a warning. Does that sound
> reasonable? If so I'll work up a patch.

Thinking about the code, I'm not sure how you're generically going to
tell the difference between a setscene task that fails as the file
disappeared compared to a setscene failure with another real error? :/

We could make all failed setscene tasks warnings but I think that
buries actual real errors.

This is probably why I've not changed the code before now.

Special exit code values? :/

I'm open to proposals.

I know we could put in some configuration option but in general I hate
these as it just means more test matrix combinations and more ways for
people to see different behaviours. They have a time/place but I'm not
sure it's here.

Cheers,

Richard



View/Reply Online (#48532): https://lists.yoctoproject.org/g/yocto/message/48532


Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Paul Barker
On Thu, 20 Feb 2020 at 11:36, Richard Purdie wrote:
>
> On Thu, 2020-02-20 at 11:26 +, Paul Barker wrote:
> > In my new CI setup I'm using an sstate mirror which seems to have some
> > occasional download issues. This results in the setscene task failing.
> > For example:
> >
> > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> > do_package_write_ipk_setscene: Fetcher failure: Unable to find file
> > file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz
> > anywhere. The paths that were searched were:
> > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> > /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> > ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> > do_package_write_ipk_setscene: No suitable staging package found
> > ERROR: Logfile of failure stored in:
> > /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524
> > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> > do_package_write_ipk_setscene: Failed
> > WARNING: Setscene task
> > (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene)
> > failed with exit code '1' - real task will be run instead
> >
> > As indicated in the final warning message there the real tasks run
> > since no sstate artifact is available. These tasks succeed:
> >
> > NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> > do_package_write_ipk: Succeeded
> >
> > The result is a successful build of the desired images. However, the
> > build is marked as a failure due to those sstate errors:
> >
> > Summary: There were 11 ERROR messages shown, returning a non-zero exit code.
> >
> > Is this the expected behaviour? The final images are built correctly.
> > I can't see any simple way to mask those setscene errors but I might
> > be missing something.
> >
> > The full log can be seen at
> > https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw.
> > I'm on the zeus branch here, I'll try to re-test on master later if I
> > can.
>
> We've discussed this before and it can be argued either way.
>
> Personally, I worry about why artefacts "disappear" and this is why it's
> an error; files should not be disappearing part way through a build.
>
> From a bitbake perspective, a task really did fail and task failures
> are errors. The fact it was able to recover is a bonus.
>
> Perhaps it should be a warning now we have levels of warnings that are
> meaningful. Previously we threw so many, this would have been one more
> lost amongst many. I know many people don't like the behaviour.

I'm now looking into this...

In sstate_checkhashes() we mark sstate as available if
fetcher.checkstatus() succeeds. Then at a later point
sstate_setscene() calls sstate_installpkg() calls pstaging_fetch()
calls fetcher.download() to actually get the sstate artifact. If the
artifact is removed from the mirror between these two accesses (due to
an sstate mirror clean up running in parallel to a build), or if there
is an intermittent download failure we could see checkstatus() succeed
then download() fail.
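
To make the window concrete, here is a minimal sketch of the
check-then-download sequence. FakeMirror and build() are hypothetical
stand-ins, not BitBake code; only the method names checkstatus() and
download() echo the fetcher calls named above.

```python
class FakeMirror:
    """In-memory stand-in for an sstate mirror (not a BitBake class)."""

    def __init__(self, artifacts):
        self.artifacts = set(artifacts)

    def checkstatus(self, name):
        # Plays the role of fetcher.checkstatus(): reports existence only.
        return name in self.artifacts

    def download(self, name):
        # Plays the role of fetcher.download(): the file must still exist.
        if name not in self.artifacts:
            raise FileNotFoundError(name)
        return "contents of " + name


def build(mirror, name, cleanup=None):
    """Check availability early, fetch later, as in the sequence above."""
    available = mirror.checkstatus(name)  # early check (sstate_checkhashes stage)
    if cleanup is not None:
        cleanup(mirror)  # mirror clean-up running in parallel to the build
    if not available:
        return "real task"
    try:
        mirror.download(name)  # later fetch (setscene stage)
        return "setscene"
    except FileNotFoundError:
        return "setscene failed, real task"
```

Calling build() with a cleanup callback that empties the mirror between
the two steps reproduces the case where checkstatus() succeeds and
download() then fails.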

I don't think we should ignore all setscene errors but in the specific
case where it's the download step that fails I think that should be a
warning. Or it could be an error by default with a variable we can set
to turn it into a warning. Does that sound reasonable? If so I'll work
up a patch.

Thanks,
Paul
View/Reply Online (#48530): https://lists.yoctoproject.org/g/yocto/message/48530


Re: [yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Richard Purdie
On Thu, 2020-02-20 at 11:26 +, Paul Barker wrote:
> In my new CI setup I'm using an sstate mirror which seems to have some
> occasional download issues. This results in the setscene task failing.
> For example:
> 
> ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> do_package_write_ipk_setscene: Fetcher failure: Unable to find file
> file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz
> anywhere. The paths that were searched were:
> /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> /builds/SanCloudLtd/sancloud-arago/build/sstate-cache
> ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
> do_package_write_ipk_setscene: No suitable staging package found
> ERROR: Logfile of failure stored in:
> /builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524
> NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> do_package_write_ipk_setscene: Failed
> WARNING: Setscene task
> (/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene)
> failed with exit code '1' - real task will be run instead
> 
> As indicated in the final warning message there the real tasks run
> since no sstate artifact is available. These tasks succeed:
> 
> NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
> do_package_write_ipk: Succeeded
> 
> The result is a successful build of the desired images. However, the
> build is marked as a failure due to those sstate errors:
> 
> Summary: There were 11 ERROR messages shown, returning a non-zero exit code.
> 
> Is this the expected behaviour? The final images are built correctly.
> I can't see any simple way to mask those setscene errors but I might
> be missing something.
> 
> The full log can be seen at
> https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw.
> I'm on the zeus branch here, I'll try to re-test on master later if I
> can.

We've discussed this before and it can be argued either way.

Personally, I worry about why artefacts "disappear" and this is why it's
an error; files should not be disappearing part way through a build.

From a bitbake perspective, a task really did fail and task failures
are errors. The fact it was able to recover is a bonus.

Perhaps it should be a warning now we have levels of warnings that are
meaningful. Previously we threw so many, this would have been one more
lost amongst many. I know many people don't like the behaviour.

Cheers,

Richard



View/Reply Online (#48528): https://lists.yoctoproject.org/g/yocto/message/48528


[yocto] Bitbake returning non-zero due to sstate errors

2020-02-20 Thread Paul Barker
In my new CI setup I'm using an sstate mirror which seems to have some
occasional download issues. This results in the setscene task failing.
For example:

ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
do_package_write_ipk_setscene: Fetcher failure: Unable to find file
file://fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz;downloadfilename=fd/sstate:qt3d:armv7at2hf-neon-linux-gnueabi:5.13.2+gitAUTOINC+93361f1a59:r0:armv7at2hf-neon:3:fda6c3edff0205b07ff176cf16771247117fa310bc65a6a1df6befc4230e0a74_package_write_ipk.tgz
anywhere. The paths that were searched were:
/builds/SanCloudLtd/sancloud-arago/build/sstate-cache
/builds/SanCloudLtd/sancloud-arago/build/sstate-cache
ERROR: qt3d-5.13.2+gitAUTOINC+93361f1a59-r0
do_package_write_ipk_setscene: No suitable staging package found
ERROR: Logfile of failure stored in:
/builds/SanCloudLtd/sancloud-arago/build/tmp/work/armv7at2hf-neon-linux-gnueabi/qt3d/5.13.2+gitAUTOINC+93361f1a59-r0/temp/log.do_package_write_ipk_setscene.10524
NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
do_package_write_ipk_setscene: Failed
WARNING: Setscene task
(/builds/SanCloudLtd/sancloud-arago/sources/meta-qt5/recipes-qt/qt5/qt3d_git.bb:do_package_write_ipk_setscene)
failed with exit code '1' - real task will be run instead

As indicated in the final warning message there the real tasks run
since no sstate artifact is available. These tasks succeed:

NOTE: recipe qt3d-5.13.2+gitAUTOINC+93361f1a59-r0: task
do_package_write_ipk: Succeeded

The result is a successful build of the desired images. However, the
build is marked as a failure due to those sstate errors:

Summary: There were 11 ERROR messages shown, returning a non-zero exit code.

Is this the expected behaviour? The final images are built correctly.
I can't see any simple way to mask those setscene errors but I might
be missing something.

The full log can be seen at
https://gitlab.com/SanCloudLtd/sancloud-arago/-/jobs/443901140/raw.
I'm on the zeus branch here, I'll try to re-test on master later if I
can.

Thanks,
Paul
View/Reply Online (#48527): https://lists.yoctoproject.org/g/yocto/message/48527