On Sun, 2023-08-06 at 13:15 -0600, Joshua Watt wrote:
> On Sat, Aug 5, 2023 at 7:54 PM Maksim Chichikalov
> <[email protected]> wrote:
> > My name is Max; nice to e-meet you. I need your help with
> > Equivalence Server :)
> > I watched your presentation on Youtube - it 100% helped me to
> > understand the logic better and debug the issue we are facing.
> > Thank you for all your input into OE development.
> >
> > I found in OE repo that you introduced "def OEOuthashBasic()", so I
> > decided to write to you first before opening the topic on an email
> > list.
> >
> > The problem:
> >
> > Input data are propagated to the output hash generation function,
> > which generates a different out-hash.
> >
> > Description:
> >
> > All "_git.bb" recipes have to append SRCPV to PV. As a result, PV
> > is different on each commit.
> >
> > The OEOuthashBasic function includes SSTATE_PKGSPEC to generate
> > hash, which contains PV (PV contains git hash). As a result, there
> > is no way to generate the same out-hash even if changes introduced
> > within a commit were trivial.
> >
>
> Right, this is sort of on purpose, because the hash equivalence is
> basically trying to say that an sstate object can be used in place of
> another one, even when the task hashes aren't the same (but the
> output hashes are). However, the sstate code itself will only look
> for sstate object with a certain name (which include PV); hash
> equivalency does have _some_ control over the file name sstate looks
> for, since it will replace the taskhash portion of the name with the
> unified hash, but it doesn't have complete control.
>
> >
> > In our codebase, our components have API part, which is managed by
> > an independent recipe per component. The described above problem
> > caused the recompilation of all components dependent on API, even
> > in cases when API was not changed. CI for pull requests recompiles
> > mostly the entire code base, I need to do something with it.
> > (sorry, quite hard for me to explain it in a nutshell, let me know
> > if you like to know slightly more details)
> >
>
> Ya, sounds like a typical mono-repo design?
>
> >
> > I see a couple of options for us:
> > * Add a custom implementation of out-hash generated function and
> > overwrite SSTATE_HASHEQUIV_METHOD.
> > * Better understand why it's mandatory to append SRCPV to PV, and
> > maybe it's flexible in our cases to do it.
> >
>
> This might be the best option, at least for your recipes, but I've
> CC'd the list for additional feedback
>
> > * Propose a patch to fix OEOuthashBasic().
> >
> > In my humble opinion, the commit's hash shouldn't be included in
> > out-hash generation, it doesn't make sense. Unless I'm missing
> > something important - What are your thoughts?
> >
>
> Yes and no. It's not intentional, but a side effect of hash
> equivalency trying to make sure that the things it's marking as
> equivalent can actually be found in sstate (basically, because sstate
> include the commit hash, hash equivalency kinda has to include it).

This all sounds a bit unfortunate.

sstate only works as long as the filenames are predictable. Some
elements of the sstate filenames are essential to operation, e.g. the
package architecture since one input would result in multiple files
with the same hash in the filename of the output otherwise. The recipe
name and version are there mainly for debugging to allow someone to
more easily know where an sstate object came from and what it
represents. This is summarised by the comment in sstate.bbclass
"Fields 0,5,6 are mandatory, 1 is most useful, 2,3,4 are just for
information" in generate_sstatefn().

When we added hash equivalence, we added the ability to equate the
hashes but we'd not considered that the version string mismatch may
stop significant artefact reuse. I suspect at the time we reasoned
that if the version changes, the output probably does too.

Sadly fixing this isn't simple. Changing the hash algorithm isn't
enough, we need to stop the SRCREV part of PV being used in the sstate
filename. If we stop that happening, the output hash algorithm may
well "just work" at that point, I'm not sure if it directly encodes PV
or just the sstate filename, hopefully the latter.

The hard part is how to do this generically without adding a lot of
complex and potentially fragile code. The datastore context in that
function is the core configuration, not the recipe's datastore.

The options I can think of offhand:

a) Live with the issue

b) Put a hack into generate_sstatefn() which changed the PV element if
it matched a pattern but we'd have to do this for each SCM as needed.
Ugly but lowest overhead.

c) Indirect PV to SSTATEPV (or similar) and then alter SSTATEPV to
drop SRCPV to a fixed string. Nicer code in some ways but horrible
parsing/performance overhead since every recipe, even non-SCM ones
would be affected.

d) A partial version of c) where recipes can set SSTATEPV to a
function if they need it. Solves your specific case with overhead
without affecting everyone else. Would not solve the issue generally
without manual user intervention.

I'm not sure which of these will end up making the most sense. These
assume the output hash code uses the sstate filename and not PV. If it
uses PV there would be more work needed.

Cheers,

Richard
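
As a rough illustration of option b) above, here is a minimal Python
sketch of what "change the PV element if it matched a pattern" could
look like, and why it would let a hash that mixes in the name/version
spec come out the same across commits. Everything in it is an
assumption made for illustration: normalize_pv(), toy_spec_hash(), the
"+gitFIXED" placeholder and the regex for the git suffix are
hypothetical, and the toy hash is not OEOuthashBasic or
generate_sstatefn().

```python
import hashlib
import re

# Assumed shape of a git-derived PV suffix, e.g. "+gitAUTOINC+0123456789".
# The real format depends on the fetcher and release; illustrative only.
_GIT_SRCPV = re.compile(r"\+git(?:AUTOINC|\d+)\+[0-9a-f]{7,40}")


def normalize_pv(pv: str) -> str:
    """Hypothetical helper: swap the SCM revision part of PV for a fixed string."""
    return _GIT_SRCPV.sub("+gitFIXED", pv)


def toy_spec_hash(pn: str, pv: str, output: bytes) -> str:
    """Toy stand-in for a hash that mixes a name/version spec with the
    task output. This is not the real output hash function."""
    h = hashlib.sha256()
    h.update(f"{pn}-{pv}\n".encode())
    h.update(output)
    return h.hexdigest()


output = b"identical build output"
pv_a = "1.0+gitAUTOINC+0123456789"  # PV at one commit (made-up value)
pv_b = "1.0+gitAUTOINC+abcdef0123"  # PV at the next commit (made-up value)

# Raw PV: the two hashes differ although the output bytes are the same.
print(toy_spec_hash("api", pv_a, output) == toy_spec_hash("api", pv_b, output))

# Normalized PV: the spec strings are identical, so the hashes match.
print(toy_spec_hash("api", normalize_pv(pv_a), output)
      == toy_spec_hash("api", normalize_pv(pv_b), output))
```

A pattern like this would need one variant per SCM, which is the
"ugly" part mentioned in b). Non-SCM versions such as "2.3.1" pass
through unchanged.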
