> On Mon, Aug 7, 2023 at 8:41 AM Maksim Chichikalov
> <[email protected]> wrote:
> >
> > On Mon, Aug 7, 2023 at 4:01 AM Richard Purdie
> > <[email protected]> wrote:
> > >
> > > On Sun, 2023-08-06 at 13:15 -0600, Joshua Watt wrote:
> > > > On Sat, Aug 5, 2023 at 7:54 PM Maksim Chichikalov
> > > > <[email protected]> wrote:
> > > > > My name is Max; nice to e-meet you. I need your help with
> > > > > Equivalence Server :)
> > > > > I watched your presentation on Youtube - it 100% helped me to
> > > > > understand the logic better and debug the issue we are facing.
> > > > > Thank you for all your input into OE development.
> > > > >
> > > > > I found in OE repo that you introduced "def OEOuthashBasic()", so I
> > > > > decided to write to you first before opening the topic on an email
> > > > > list.
> > > > >
> > > > > The problem:
> > > > >
> > > > > Input data are propagated to the output hash generation function,
> > > > > which generates a different out-hash.
> > > > >
> > > > > Description:
> > > > >
> > > > > All "_git.bb" recipes have to append SRCPV to PV. As a result, PV
> > > > > is different on each commit.
> > > > >
> > > > > The OEOuthashBasic function includes SSTATE_PKGSPEC to generate
> > > > > hash, which contains PV (PV contains git hash). As a result, there
> > > > > is no way to generate the same out-hash even if changes introduced
> > > > > within a commit were trivial.
> > > > >
> > > >
> > > > Right, this is sort of on purpose, because the hash equivalence is
> > > > basically trying to say that an sstate object can be used in place of
> > > > another one, even when the task hashes aren't the same (but the
> > > > output hashes are). However, the sstate code itself will only look
> > > > for sstate object with a certain name (which include PV); hash
> > > > equivalency does have _some_ control over the file name sstate looks
> > > > for, since it will replace the taskhash portion of the name with the
> > > > unified hash, but it doesn't have complete control.
> > > >
> > > > >
> > > > > In our codebase, our components have API part, which is managed by
> > > > > an independent recipe per component. The described above problem
> > > > > caused the recompilation of all components dependent on API, even
> > > > > in cases when API was not changed. CI for pull requests recompiles
> > > > > mostly the entire code base, I need to do something with it.
> > > > > (sorry, quite hard for me to explain it in a nutshell, let me know
> > > > > if you like to know slightly more details)
> > > > >
> > > >
> > > > Ya, sounds like a typical mono-repo design?
> > > >
> > > > >
> > > > > I see a couple of options for us:
> > > > > * Add a custom implementation of out-hash generated function and
> > > > > overwrite SSTATE_HASHEQUIV_METHOD.
> > > > > * Better understand why it's mandatory to append SRCPV to PV, and
> > > > > maybe it's flexible in our cases to do it.
> > > > >
> > > >
> > > > This might be the best option, at least for your recipes, but I've
> > > > CC'd the list for additional feedback
> > > >
> > > > > * Propose a patch to fix OEOuthashBasic().
> > > > >
> > > > > In my humble opinion, the commit's hash shouldn't be included in
> > > > > out-hash generation, it doesn't make sense. Unless I'm missing
> > > > > something important - What are your thoughts?
> > > > >
> > > >
> > > > Yes and no. It's not intentional, but a side effect of hash
> > > > equivalency trying to make sure that the things it's marking as
> > > > equivalent can actually be found in sstate (basically, because sstate
> > > > include the commit hash, hash equivalency kinda has to include it).
> > >
> > > This all sounds a bit unfortunate.
> > >
> > > sstate only works as long as the filenames are predictable. Some
> > > elements of the sstate filenames are essential to operation, e.g. the
> > > package architecture since one input would result in multiple files
> > > with the same hash in the filename of the output otherwise. The recipe
> > > name and version are there mainly for debugging to allow someone to
> > > more easily know where an sstate object came from and what it
> > > represents. This is summarised by the comment in sstate.bbclass "Fields
> > > 0,5,6 are mandatory, 1 is most useful, 2,3,4 are just for information"
> > > in generate_sstatefn().
> > >
> > > When we added hash equivalence, we added the ability to equate the
> > > hashes but we'd not considered that the version string mismatch may
> > > stop significant artefact reuse. I suspect at the time we reasoned that
> > > if the version changes, the output probably does too.
> > >
> > > Sadly fixing this isn't simple. Changing the hash algorithm isn't
> > > enough, we need to stop the SRCREV part of PV being used in the sstate
> > > filename. If we stop that happening, the output hash algorithm may well
> > > "just work" at that point, I'm not sure if it directly encodes PV or
> > > just the sstate filename, hopefully the latter.
> >
> > It's including SSTATE_PKGSPEC in the output hash calculation,
> > specifically so that a change in sstate filename changes the outhash
> > (that way, hash equivalence will never map two difference sstate files
> > as equivalent).
> >
> > >
> > > The hard part is how to do this generically without adding a lot of
> > > complex and potentially fragile code. The datastore context in that
> > > function is the core configuration, not the recipe's datastore. The
> > > options I can think of offhand:
> > >
> > > a) Live with the issue
> > >
> > > b) Put a hack into generate_sstatefn() which changed the PV element if
> > > it matched a pattern but we'd have to do this for each SCM as needed.
> > > Ugly but lowest overhead.
> > >
> > > c) Indirect PV to SSTATEPV (or similar) and then alter SSTATEPV to drop
> > > SRCPV to a fixed string. Nicer code in some ways but horrible
> > > parsing/performance overhead since every recipe, even non-SCM ones
> > > would be affected.
> > >
> > > d) A partial version of c) where recipes can set SSTATEPV to a function
> > > if they need it. Solves your specific case with overhead without
> > > affecting everyone else. Would not solve the issue generally without
> > > manual user intervention.
> > >
> > > I'm not sure which of these will end up making the most sense. These
> > > assume the output hash code uses the sstate filename and not PV. If it
> > > uses PV there would be more work needed.
> >
> > I'm not seeing where SRCPV or SRCREV are being explicitly added to
> > SSTATE_PKGSPEC; PV is added, but SRCPV or SRCREV would have to be
> > manually added to PV in the recipe itself for that to happen correct?
> > Maybe it's as simple as not doing that in these particular recipes?
> >
> >
> > What are pitfalls of not adding SRCPV to PV for packages with SRCREV:${PN} 
> > = ${AUTOREV}? Is it recommended just for improving traceability and 
> > debugging?
> >
> > When I remove, bitbake is not happy - for example: if I remove explicit 
> > addition of SRCPV, the bitbake throws the following exception (need to dig 
> > deeper tp figure out why it happens only for certain packages)
> >
> > raise bb.fetch2.FetchError("Recipe uses a floating tag/branch without a 
> > fixed SRCREV yet doesn't call bb.fetch2.get_srcrev() (use SRCPV in PV for 
> > OE).")
>
> Ah, I didn't realize you were using AUTOREV; I would recommend not
> doing that and instead manually specifying the SRCREV you want to
> build from instead; aside from giving a more reproducible history, I
> think it means you can remove SRCPV from PV and hash equivalency will
> work
>
> At my $DAYJOB, we don't use AUTOREV. Instead, we have a (rather dumb)
> script that makes a commit to update the SRCREV to match the current
> head revision of the BRANCH specified in a recipe. We run this every
> night for recipes where we want AUTOREV-like behavior. Upstream also
> has the Auto-Upgrade-Helper (AUH) that does something similar if you
> want to look at that too.
>

TY Joshua!
AUTOREV is in use for pull request builds, when a package(s -> multi PR builds) 
should be built from the latest commit of the specific branch. I don't think we 
can stop using it, unless we add some mechanism of setting SRCREV to hash of 
the latest commit on each build invocation.

Richard, what are your thoughts about:
What are pitfalls of not adding SRCPV to PV for packages with SRCREV:${PN} = 
${AUTOREV}?

Is it recommended just for improving traceability and debugging, or it has some 
bitabke/OE business logic implication?

BR,
Max Chichikalov

>
> BR,
> Max Chichikalov
>
> >
> > Cheers,
> >
> > Richard
>

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#185607): 
https://lists.openembedded.org/g/openembedded-core/message/185607
Mute This Topic: https://lists.openembedded.org/mt/100586415/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to