Re: [Openembedded-architecture] Dependent task hashes in depsig.*

2022-04-27 Thread Richard Purdie
On Wed, 2022-04-27 at 08:26 -0500, Joshua Watt wrote:
> On Wed, Apr 27, 2022 at 6:04 AM Richard Purdie wrote:
> > 
> > On Wed, 2022-04-27 at 12:39 +0200, Jacob Kroon wrote:
> > > On 4/27/22 12:12, Richard Purdie wrote:
> > > > On Wed, 2022-04-27 at 11:06 +0200, Jacob Kroon wrote:
> > > > > Hi Richard and Joshua,
> > > > > 
> > > > > When using hash equivalency, since commits
> > > > > 
> > > > > https://git.openembedded.org/openembedded-core/commit/?id=d6c7b9f4f0e
> > > > > https://git.openembedded.org/openembedded-core/commit/?id=1cf62882bba
> > > > > 
> > > > > scrambling a header in one of the gcc patches causes all target packages
> > > > > to rebuild.
> > > > 
> > > > That is probably unfortunately inevitable. If the output has changed (i.e. the
> > > > headers are different), it shouldn't be matching a previous build as we can't
> > > > know what has changed.
> > > > 
> > > 
> > > I don't think I was being clear enough. With "scrambling a header in a
> > > gcc patch" I mean that I change something in the section before the
> > > "---" line in a .patch file, like modifying the "Upstream-Status" line.
> > > To me, hash equivalence should be able to optimize that scenario, since
> > > the output from building gcc-cross is not changed.
> > 
> > That definitely wasn't clear! I now understand better what you mean and yes,
> > we're supposed to be optimising that scenario.
> > 
> > > 
> > > > >  This is because the depsig.do_populate_sysroot in "libgcc"
> > > > > changes:
> > > > > 
> > > > > > > [jkroon@fedora work]$ diff -u i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1*
> > > > > > > --- i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1589812  2022-04-27 10:14:22.403251775 +0200
> > > > > > > +++ i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1674014  2022-04-27 10:26:45.329365448 +0200
> > > > > > > @@ -1,7 +1,7 @@
> > > > > > >  OEOuthashBasic
> > > > > > >  12
> > > > > > >  glibc: 8feab297dd38b103daa4f26eeabb5690a74b8b5700d16e4eca7b56e6fd667a5e
> > > > > > > -libgcc: dfd38409a4cc5320b781edc14de2af8321180c3f194a58b798870ad7ff6a9226
> > > > > > > +libgcc: 195f6a155dac8e450e72a7432ab91959a8e095e057d5b79e3adba41721dc7ea5
> > > > > > >  linux-libc-headers: 12a5aaf8aec9554ac3c778cdc6c65df4db52fc573e84b7110572d459a15c9d6a
> > > > > > >  SSTATE_PKGSPEC=sstate:libgcc:i686-oe-linux:11.3.0:r0:i686:8:
> > > > > > >  task=populate_sysroot
> > > > > 
> > > > > Is it the case that it is the dependent task hashes that are added
> > > > > > above, and that the checksums of patches are included in those task
> > > > > hashes ?
> > > > 
> > > > The dependent resolved hashes are used, as resolved by hashequiv which is a key
> > > > difference.
> > > > 
> > > 
> > > So it is the outhashes that are listed above?
> > 
> > No, they are hash equiv resolved hashes which would have a one to one mapping
> > with an outhash.
> > 
> > >  Then I don't understand
> > > the diff above. libgcc depends on itself ? But apparently no files in
> > > the sysroot changed, since the above is the only diff I get.
> > 
> > To be honest, I don't remember/understand offhand either. I'd need to go and
> > spend time trying to page in all the information.
> > 
> > We have too few people with the knowledge in these areas and I'm rapidly burning
> > out. I don't like this reply but I just don't have the time to dive into it and
> > debug it right now and I can't really give much more of a helpful comment/reply
> > without doing so.
> > 
> > I agree there is some issue here which does need investigation. At least do file
> > a bug so it doesn't get forgotten but we don't have many people taking on bugs
> > either. This one would get triaged to me or Joshua.
> > 
> > I'd also add that gcc is pretty horrific in that it bundles up a lot of its
> > build tree into the sysroot. It is possible those bundled files are varying
> > somehow reproducibility wise causing some instability. I've worried about this
> > kind of issue for a while but I don't scale and there are a load of other issues
> > going on too :(.
> 
> It looks like this could possibly be a bug in the code in
> meta/classes/staging.bbclass that injects the dependencies into
> do_populate_sysroot & do_package.
> 
> It doesn't explicitly filter out ${PN} from BB_TASKDEPDATA, although
> it perhaps should? I'm not sure if the bug is that
> libgcc:do_populate_sysroot is in BB_TASKDEPDATA to begin with, or that
> the code isn't filtering it out. If we do need to filter it out,
> that's a pretty easy change to make.

The current task has to be in BB_TASKDEPDATA so I think you're right, we should
filter out "ourselves". I'll send a patch.
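
A minimal sketch of what "filtering out ourselves" could look like, for anyone
following along. This is not the staging.bbclass code: the function name
extra_sigdata, the plain dict standing in for BB_TASKDEPDATA, and the
(pn, taskname, unihash) tuple layout are all assumptions made for illustration,
and the hashes are shortened.

```python
# Hypothetical stand-in for BB_TASKDEPDATA: id -> (pn, taskname, unihash).
# The real variable carries more fields; this layout is an assumption.
def extra_sigdata(taskdepdata, pn):
    """Build "name: hash" lines for sysroot dependencies, skipping pn itself."""
    deps = {}
    for dep_pn, taskname, unihash in taskdepdata.values():
        if taskname != "do_populate_sysroot":
            continue
        if dep_pn == pn:
            # The current recipe is always present; do not record it as a dependency.
            continue
        deps[dep_pn] = unihash
    return "\n".join("%s: %s" % (name, deps[name]) for name in sorted(deps))


taskdepdata = {
    0: ("libgcc", "do_populate_sysroot", "195f6a15"),
    1: ("glibc", "do_populate_sysroot", "8feab297"),
    2: ("linux-libc-headers", "do_populate_sysroot", "12a5aaf8"),
    3: ("glibc", "do_package", "ffff0000"),
}

print(extra_sigdata(taskdepdata, "libgcc"))
```

With the self entry skipped, the libgcc line from the diff above would no longer
appear in libgcc's own depsig file.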

Cheers,

Richard


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#1532): 

Re: [Openembedded-architecture] Dependent task hashes in depsig.*

2022-04-27 Thread Joshua Watt
On Wed, Apr 27, 2022 at 6:04 AM Richard Purdie wrote:
>
> On Wed, 2022-04-27 at 12:39 +0200, Jacob Kroon wrote:
> > On 4/27/22 12:12, Richard Purdie wrote:
> > > On Wed, 2022-04-27 at 11:06 +0200, Jacob Kroon wrote:
> > > > Hi Richard and Joshua,
> > > >
> > > > When using hash equivalency, since commits
> > > >
> > > > https://git.openembedded.org/openembedded-core/commit/?id=d6c7b9f4f0e
> > > > https://git.openembedded.org/openembedded-core/commit/?id=1cf62882bba
> > > >
> > > > scrambling a header in one of the gcc patches causes all target packages
> > > > to rebuild.
> > >
> > > That is probably unfortunately inevitable. If the output has changed (i.e. the
> > > headers are different), it shouldn't be matching a previous build as we can't
> > > know what has changed.
> > >
> >
> > I don't think I was being clear enough. With "scrambling a header in a
> > gcc patch" I mean that I change something in the section before the
> > "---" line in a .patch file, like modifying the "Upstream-Status" line.
> > To me, hash equivalence should be able to optimize that scenario, since
> > the output from building gcc-cross is not changed.
>
> That definitely wasn't clear! I now understand better what you mean and yes,
> we're supposed to be optimising that scenario.
>
> >
> > > >  This is because the depsig.do_populate_sysroot in "libgcc"
> > > > changes:
> > > >
> > > > > > [jkroon@fedora work]$ diff -u i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1*
> > > > > > --- i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1589812  2022-04-27 10:14:22.403251775 +0200
> > > > > > +++ i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1674014  2022-04-27 10:26:45.329365448 +0200
> > > > > > @@ -1,7 +1,7 @@
> > > > > >  OEOuthashBasic
> > > > > >  12
> > > > > >  glibc: 8feab297dd38b103daa4f26eeabb5690a74b8b5700d16e4eca7b56e6fd667a5e
> > > > > > -libgcc: dfd38409a4cc5320b781edc14de2af8321180c3f194a58b798870ad7ff6a9226
> > > > > > +libgcc: 195f6a155dac8e450e72a7432ab91959a8e095e057d5b79e3adba41721dc7ea5
> > > > > >  linux-libc-headers: 12a5aaf8aec9554ac3c778cdc6c65df4db52fc573e84b7110572d459a15c9d6a
> > > > > >  SSTATE_PKGSPEC=sstate:libgcc:i686-oe-linux:11.3.0:r0:i686:8:
> > > > > >  task=populate_sysroot
> > > >
> > > > Is it the case that it is the dependent task hashes that are added
> > > > > above, and that the checksums of patches are included in those task
> > > > hashes ?
> > >
> > > The dependent resolved hashes are used, as resolved by hashequiv which is a key
> > > difference.
> > >
> >
> > So it is the outhashes that are listed above?
>
> No, they are hash equiv resolved hashes which would have a one to one mapping
> with an outhash.
>
> >  Then I don't understand
> > the diff above. libgcc depends on itself ? But apparently no files in
> > the sysroot changed, since the above is the only diff I get.
>
> To be honest, I don't remember/understand offhand either. I'd need to go and
> spend time trying to page in all the information.
>
> We have too few people with the knowledge in these areas and I'm rapidly burning
> out. I don't like this reply but I just don't have the time to dive into it and
> debug it right now and I can't really give much more of a helpful comment/reply
> without doing so.
>
> I agree there is some issue here which does need investigation. At least do file
> a bug so it doesn't get forgotten but we don't have many people taking on bugs
> either. This one would get triaged to me or Joshua.
>
> I'd also add that gcc is pretty horrific in that it bundles up a lot of its
> build tree into the sysroot. It is possible those bundled files are varying
> somehow reproducibility wise causing some instability. I've worried about this
> kind of issue for a while but I don't scale and there are a load of other issues
> going on too :(.

It looks like this could possibly be a bug in the code in
meta/classes/staging.bbclass that injects the dependencies into
do_populate_sysroot & do_package.

It doesn't explicitly filter out ${PN} from BB_TASKDEPDATA, although
it perhaps should? I'm not sure if the bug is that
libgcc:do_populate_sysroot is in BB_TASKDEPDATA to begin with, or that
the code isn't filtering it out. If we do need to filter it out,
that's a pretty easy change to make.
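
To illustrate why an unfiltered self entry would matter, here is a toy model of
an output hash that mixes dependency lines with file contents. It is only a
sketch: toy_outhash is a hypothetical helper, it does not reproduce the real
OEOuthashBasic format, and the hashes and file contents are made up or
shortened.

```python
import hashlib


def toy_outhash(sigdata_lines, files):
    """Hash dependency lines plus file contents (toy format, not OEOuthashBasic)."""
    h = hashlib.sha256()
    for line in sigdata_lines:
        h.update(line.encode() + b"\n")
    for path in sorted(files):
        digest = hashlib.sha256(files[path]).hexdigest()
        h.update(path.encode() + b"\0" + digest.encode() + b"\n")
    return h.hexdigest()


# The sysroot files are identical before and after the patch header edit.
files = {"usr/lib/libgcc_s.so.1": b"same bytes"}
common = ["glibc: 8feab297", "linux-libc-headers: 12a5aaf8"]

# Unfiltered: the recipe's own hash is listed, and it changed with the edit.
with_self_before = toy_outhash(common + ["libgcc: dfd38409"], files)
with_self_after = toy_outhash(common + ["libgcc: 195f6a15"], files)

# Filtered: only real dependencies are listed, so nothing differs.
filtered_before = toy_outhash(common, files)
filtered_after = toy_outhash(common, files)

print(with_self_before == with_self_after)
print(filtered_before == filtered_after)
```

In this model the unfiltered hashes differ even though no file changed, which
matches the single-line diff Jacob posted.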

>
> Cheers,
>
> Richard
>
>

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#1531): 
https://lists.openembedded.org/g/openembedded-architecture/message/1531
Mute This Topic: https://lists.openembedded.org/mt/90726763/21656
Group Owner: openembedded-architecture+ow...@lists.openembedded.org
Unsubscribe: https://lists.openembedded.org/g/openembedded-architecture/unsub 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [Openembedded-architecture] Dependent task hashes in depsig.*

2022-04-27 Thread Richard Purdie
On Wed, 2022-04-27 at 12:39 +0200, Jacob Kroon wrote:
> On 4/27/22 12:12, Richard Purdie wrote:
> > On Wed, 2022-04-27 at 11:06 +0200, Jacob Kroon wrote:
> > > Hi Richard and Joshua,
> > > 
> > > When using hash equivalency, since commits
> > > 
> > > https://git.openembedded.org/openembedded-core/commit/?id=d6c7b9f4f0e
> > > https://git.openembedded.org/openembedded-core/commit/?id=1cf62882bba
> > > 
> > > scrambling a header in one of the gcc patches causes all target packages
> > > to rebuild.
> > 
> > That is probably unfortunately inevitable. If the output has changed (i.e. the
> > headers are different), it shouldn't be matching a previous build as we can't
> > know what has changed.
> > 
> 
> I don't think I was being clear enough. With "scrambling a header in a
> gcc patch" I mean that I change something in the section before the
> "---" line in a .patch file, like modifying the "Upstream-Status" line.
> To me, hash equivalence should be able to optimize that scenario, since
> the output from building gcc-cross is not changed.

That definitely wasn't clear! I now understand better what you mean and yes,
we're supposed to be optimising that scenario.

> 
> > >  This is because the depsig.do_populate_sysroot in "libgcc"
> > > changes:
> > > 
> > > > > [jkroon@fedora work]$ diff -u i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1*
> > > > > --- i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1589812  2022-04-27 10:14:22.403251775 +0200
> > > > > +++ i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1674014  2022-04-27 10:26:45.329365448 +0200
> > > > > @@ -1,7 +1,7 @@
> > > > >  OEOuthashBasic
> > > > >  12
> > > > >  glibc: 8feab297dd38b103daa4f26eeabb5690a74b8b5700d16e4eca7b56e6fd667a5e
> > > > > -libgcc: dfd38409a4cc5320b781edc14de2af8321180c3f194a58b798870ad7ff6a9226
> > > > > +libgcc: 195f6a155dac8e450e72a7432ab91959a8e095e057d5b79e3adba41721dc7ea5
> > > > >  linux-libc-headers: 12a5aaf8aec9554ac3c778cdc6c65df4db52fc573e84b7110572d459a15c9d6a

Re: [Openembedded-architecture] Dependent task hashes in depsig.*

2022-04-27 Thread Jacob Kroon
On 4/27/22 12:12, Richard Purdie wrote:
> On Wed, 2022-04-27 at 11:06 +0200, Jacob Kroon wrote:
>> Hi Richard and Joshua,
>>
>> When using hash equivalency, since commits
>>
>> https://git.openembedded.org/openembedded-core/commit/?id=d6c7b9f4f0e
>> https://git.openembedded.org/openembedded-core/commit/?id=1cf62882bba
>>
>> scrambling a header in one of the gcc patches causes all target packages
>> to rebuild.
> 
> That is probably unfortunately inevitable. If the output has changed (i.e. the
> headers are different), it shouldn't be matching a previous build as we can't
> know what has changed.
> 

I don't think I was being clear enough. With "scrambling a header in a
gcc patch" I mean that I change something in the section before the
"---" line in a .patch file, like modifying the "Upstream-Status" line.
To me, hash equivalence should be able to optimize that scenario, since
the output from building gcc-cross is not changed.
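
A small sketch of the scenario, under the assumption (raised as a question
later in this thread) that the checksum of the whole patch file feeds into the
task hash. The patch text and the split_patch helper are hypothetical; the
point is only that editing the header changes the file checksum while the part
after the "---" line stays byte-identical.

```python
import hashlib


def sha256_text(text):
    """Checksum of the full text, as a whole-file checksum would see it."""
    return hashlib.sha256(text.encode()).hexdigest()


def split_patch(text):
    """Split a patch into (header, body) at the first line consisting of '---'."""
    header, _sep, body = text.partition("\n---\n")
    return header, body


original = "Upstream-Status: Pending\n\nFix foo\n---\n foo.c | 2 +-\n"
edited = "Upstream-Status: Submitted\n\nFix foo\n---\n foo.c | 2 +-\n"

# The whole-file checksum changes with the header edit...
print(sha256_text(original) == sha256_text(edited))

# ...but the part that affects the build output does not.
print(split_patch(original)[1] == split_patch(edited)[1])
```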

>>  This is because the depsig.do_populate_sysroot in "libgcc"
>> changes:
>>
>>> [jkroon@fedora work]$ diff -u i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1*
>>> --- i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1589812  2022-04-27 10:14:22.403251775 +0200
>>> +++ i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1674014  2022-04-27 10:26:45.329365448 +0200
>>> @@ -1,7 +1,7 @@
>>>  OEOuthashBasic
>>>  12
>>>  glibc: 8feab297dd38b103daa4f26eeabb5690a74b8b5700d16e4eca7b56e6fd667a5e
>>> -libgcc: dfd38409a4cc5320b781edc14de2af8321180c3f194a58b798870ad7ff6a9226
>>> +libgcc: 195f6a155dac8e450e72a7432ab91959a8e095e057d5b79e3adba41721dc7ea5
>>>  linux-libc-headers: 12a5aaf8aec9554ac3c778cdc6c65df4db52fc573e84b7110572d459a15c9d6a
>>>  SSTATE_PKGSPEC=sstate:libgcc:i686-oe-linux:11.3.0:r0:i686:8:
>>>  task=populate_sysroot
>>
>> Is it the case that it is the dependent task hashes that are added
>> above, and that the checksums of patches are included in those task
>> hashes ?
> 
> The dependent resolved 

Re: [Openembedded-architecture] Dependent task hashes in depsig.*

2022-04-27 Thread Richard Purdie
On Wed, 2022-04-27 at 11:06 +0200, Jacob Kroon wrote:
> Hi Richard and Joshua,
> 
> When using hash equivalency, since commits
> 
> https://git.openembedded.org/openembedded-core/commit/?id=d6c7b9f4f0e
> https://git.openembedded.org/openembedded-core/commit/?id=1cf62882bba
> 
> scrambling a header in one of the gcc patches causes all target packages
> to rebuild.

That is probably unfortunately inevitable. If the output has changed (i.e. the
headers are different), it shouldn't be matching a previous build as we can't
know what has changed.

>  This is because the depsig.do_populate_sysroot in "libgcc"
> changes:
> 
> > [jkroon@fedora work]$ diff -u i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1*
> > --- i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1589812  2022-04-27 10:14:22.403251775 +0200
> > +++ i686-oe-linux/libgcc/11.3.0-r0/temp/depsig.do_populate_sysroot.1674014  2022-04-27 10:26:45.329365448 +0200
> > @@ -1,7 +1,7 @@
> >  OEOuthashBasic
> >  12
> >  glibc: 8feab297dd38b103daa4f26eeabb5690a74b8b5700d16e4eca7b56e6fd667a5e
> > -libgcc: dfd38409a4cc5320b781edc14de2af8321180c3f194a58b798870ad7ff6a9226
> > +libgcc: 195f6a155dac8e450e72a7432ab91959a8e095e057d5b79e3adba41721dc7ea5
> >  linux-libc-headers: 12a5aaf8aec9554ac3c778cdc6c65df4db52fc573e84b7110572d459a15c9d6a
> >  SSTATE_PKGSPEC=sstate:libgcc:i686-oe-linux:11.3.0:r0:i686:8:
> >  task=populate_sysroot
> 
> Is it the case that it is the dependent task hashes that are added
> above, and that the checksums of patches are included in those task
> hashes ?

The dependent resolved hashes are used, as resolved by hashequiv which is a key
difference.

> In order to solve the original problem that those patches were fixing,
> would it not be possible to instead include the *outhashes* of the
> dependent recipes ?

Since the resolved hashes should map to a single outhash, I don't think it would
change anything?
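
As a rough illustration of the "resolved hash maps to a single outhash" point,
here is a toy model of hash equivalence. ToyHashEquiv is hypothetical and much
simpler than the real hashequiv server (no methods, no networking); it only
shows that task hashes reporting the same outhash resolve to one shared hash.

```python
class ToyHashEquiv:
    """Toy model: the first task hash seen for an outhash becomes the resolved hash."""

    def __init__(self):
        self.by_outhash = {}   # outhash -> resolved (unified) hash
        self.by_taskhash = {}  # taskhash -> resolved (unified) hash

    def report(self, taskhash, outhash):
        # Reuse the existing resolved hash for this outhash, or start a new one.
        unihash = self.by_outhash.setdefault(outhash, taskhash)
        self.by_taskhash[taskhash] = unihash
        return unihash

    def resolve(self, taskhash):
        # Unknown task hashes resolve to themselves.
        return self.by_taskhash.get(taskhash, taskhash)


server = ToyHashEquiv()
first = server.report("task-a", "out-1")
second = server.report("task-b", "out-1")  # different inputs, same output
third = server.report("task-c", "out-2")   # different output

print(first, second, third)
print(server.resolve("task-b"))
```

In this model, listing resolved hashes or listing outhashes in the depsig file
carries the same information, which is the point made above.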

Cheers,

Richard



