On Mon, Aug 7, 2023 at 8:41 AM Maksim Chichikalov
<[email protected]> wrote:
>
> On Mon, Aug 7, 2023 at 4:01 AM Richard Purdie
> <[email protected]> wrote:
> >
> > On Sun, 2023-08-06 at 13:15 -0600, Joshua Watt wrote:
> > > On Sat, Aug 5, 2023 at 7:54 PM Maksim Chichikalov
> > > <[email protected]> wrote:
> > > > My name is Max; nice to e-meet you. I need your help with
> > > > Equivalence Server :)
> > > > I watched your presentation on Youtube - it 100% helped me to
> > > > understand the logic better and debug the issue we are facing.
> > > > Thank you for all your input into OE development.
> > > >
> > > > I found in OE repo that you introduced "def OEOuthashBasic()", so I
> > > > decided to write to you first before opening the topic on an email
> > > > list.
> > > >
> > > > The problem:
> > > >
> > > > Input data are propagated to the output hash generation function,
> > > > which generates a different out-hash.
> > > >
> > > > Description:
> > > >
> > > > All "_git.bb" recipes have to append SRCPV to PV. As a result, PV
> > > > is different on each commit.
> > > >
> > > > The OEOuthashBasic function includes SSTATE_PKGSPEC to generate
> > > > hash, which contains PV (PV contains git hash). As a result, there
> > > > is no way to generate the same out-hash even if changes introduced
> > > > within a commit were trivial.
> > > >
> > >
> > > Right, this is sort of on purpose, because the hash equivalence is
> > > basically trying to say that an sstate object can be used in place of
> > > another one, even when the task hashes aren't the same (but the
> > > output hashes are). However, the sstate code itself will only look
> > > for sstate object with a certain name (which include PV); hash
> > > equivalency does have _some_ control over the file name sstate looks
> > > for, since it will replace the taskhash portion of the name with the
> > > unified hash, but it doesn't have complete control.
> > >
> > > >
> > > > In our codebase, our components have API part, which is managed by
> > > > an independent recipe per component. The described above problem
> > > > caused the recompilation of all components dependent on API, even
> > > > in cases when API was not changed. CI for pull requests recompiles
> > > > mostly the entire code base, I need to do something with it.
> > > > (sorry, quite hard for me to explain it in a nutshell, let me know
> > > > if you like to know slightly more details)
> > > >
> > >
> > > Ya, sounds like a typical mono-repo design?
> > >
> > > >
> > > > I see a couple of options for us:
> > > > * Add a custom implementation of out-hash generated function and
> > > > overwrite SSTATE_HASHEQUIV_METHOD.
> > > > * Better understand why it's mandatory to append SRCPV to PV, and
> > > > maybe it's flexible in our cases to do it.
> > > >
> > >
> > > This might be the best option, at least for your recipes, but I've
> > > CC'd the list for additional feedback
> > >
> > > > * Propose a patch to fix OEOuthashBasic().
> > > >
> > > > In my humble opinion, the commit's hash shouldn't be included in
> > > > out-hash generation, it doesn't make sense. Unless I'm missing
> > > > something important - What are your thoughts?
> > > >
> > >
> > > Yes and no. It's not intentional, but a side effect of hash
> > > equivalency trying to make sure that the things it's marking as
> > > equivalent can actually be found in sstate (basically, because sstate
> > > include the commit hash, hash equivalency kinda has to include it).
> >
> > This all sounds a bit unfortunate.
> >
> > sstate only works as long as the filenames are predictable. Some
> > elements of the sstate filenames are essential to operation, e.g. the
> > package architecture since one input would result in multiple files
> > with the same hash in the filename of the output otherwise. The recipe
> > name and version are there mainly for debugging to allow someone to
> > more easily know where an sstate object came from and what it
> > represents. This is summarised by the comment in sstate.bbclass "Fields
> > 0,5,6 are mandatory, 1 is most useful, 2,3,4 are just for information"
> > in generate_sstatefn().
> >
> > When we added hash equivalence, we added the ability to equate the
> > hashes but we'd not considered that the version string mismatch may
> > stop significant artefact reuse. I suspect at the time we reasoned that
> > if the version changes, the output probably does too.
> >
> > Sadly fixing this isn't simple. Changing the hash algorithm isn't
> > enough, we need to stop the SRCREV part of PV being used in the sstate
> > filename. If we stop that happening, the output hash algorithm may well
> > "just work" at that point, I'm not sure if it directly encodes PV or
> > just the sstate filename, hopefully the latter.
>
> It's including SSTATE_PKGSPEC in the output hash calculation,
> specifically so that a change in sstate filename changes the outhash
> (that way, hash equivalence will never map two difference sstate files
> as equivalent).
>
> >
> > The hard part is how to do this generically without adding a lot of
> > complex and potentially fragile code. The datastore context in that
> > function is the core configuration, not the recipe's datastore. The
> > options I can think of offhand:
> >
> > a) Live with the issue
> >
> > b) Put a hack into generate_sstatefn() which changed the PV element if
> > it matched a pattern but we'd have to do this for each SCM as needed.
> > Ugly but lowest overhead.
> >
> > c) Indirect PV to SSTATEPV (or similar) and then alter SSTATEPV to drop
> > SRCPV to a fixed string. Nicer code in some ways but horrible
> > parsing/performance overhead since every recipe, even non-SCM ones
> > would be affected.
> >
> > d) A partial version of c) where recipes can set SSTATEPV to a function
> > if they need it. Solves your specific case with overhead without
> > affecting everyone else. Would not solve the issue generally without
> > manual user intervention.
> >
> > I'm not sure which of these will end up making the most sense. These
> > assume the output hash code uses the sstate filename and not PV. If it
> > uses PV there would be more work needed.
>
> I'm not seeing where SRCPV or SRCREV are being explicitly added to
> SSTATE_PKGSPEC; PV is added, but SRCPV or SRCREV would have to be
> manually added to PV in the recipe itself for that to happen correct?
> Maybe it's as simple as not doing that in these particular recipes?
>
>
> What are pitfalls of not adding SRCPV to PV for packages with SRCREV:${PN} = 
> ${AUTOREV}? Is it recommended just for improving traceability and debugging?
>
> When I remove, bitbake is not happy - for example: if I remove explicit 
> addition of SRCPV, the bitbake throws the following exception (need to dig 
> deeper tp figure out why it happens only for certain packages)
>
> raise bb.fetch2.FetchError("Recipe uses a floating tag/branch without a fixed 
> SRCREV yet doesn't call bb.fetch2.get_srcrev() (use SRCPV in PV for OE).")

Ah, I didn't realize you were using AUTOREV; I would recommend not
doing that and instead manually specifying the SRCREV you want to
build from instead; aside from giving a more reproducible history, I
think it means you can remove SRCPV from PV and hash equivalency will
work

At my $DAYJOB, we don't use AUTOREV. Instead, we have a (rather dumb)
script that makes a commit to update the SRCREV to match the current
head revision of the BRANCH specified in a recipe. We run this every
night for recipes where we want AUTOREV-like behavior. Upstream also
has the Auto-Upgrade-Helper (AUH) that does something similar if you
want to look at that too.

>
> BR,
> Max Chichikalov
>
> >
> > Cheers,
> >
> > Richard
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#185606): 
https://lists.openembedded.org/g/openembedded-core/message/185606
Mute This Topic: https://lists.openembedded.org/mt/100586415/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to