> -----Original Message-----
> From: [email protected] 
> <[email protected]> On Behalf Of Richard Purdie
> Sent: den 31 december 2022 18:03
> To: Mark Hatle <[email protected]>; Joshua Watt 
> <[email protected]>
> Cc: openembedded-architecture 
> <[email protected]>; Trevor Woerner 
> <[email protected]>
> Subject: Re: [Openembedded-architecture] WORKDIR fetcher interaction issue
> 
> On Sat, 2022-12-31 at 10:47 -0600, Mark Hatle wrote:
> >
> > On 12/29/22 9:06 AM, Richard Purdie wrote:
> > > On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:
> > > > On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
> > > > <[email protected]> wrote:
> > > > >
> > > > > I was asked about a WORKDIR/fetcher interaction problem and the bugs 
> > > > > it
> > > > > results in. I've tried to write down my thoughts.
> > > > >
> > > > > The unpack task writes it's output to WORKDIR as base.bbclass says:
> > > > >
> > > > >          fetcher = bb.fetch2.Fetch(src_uri, d)
> > > > >          fetcher.unpack(d.getVar('WORKDIR')
> > > > >
> > > > > We historically dealt with tarballs which usually have a NAME-VERSION
> > > > > directory within them, so when you extract them, they go into a sub
> > > > > directory which tar creates. We usually call that subdirectory "S".
> > > > >
> > > > > When we wrote the git fetcher, we emulated this by using a "git"
> > > > > directory to extract into rather than WORKDIR.
> > > > >
> > > > > For local files, there is no sub directory so they go into WORKDIR.
> > > > > This includes patches, which do_patch looks for in WORKDIR and applies
> > > > > them from there.
> > > > >
> > > > > What issues does this cause? If you have an existing WORKDIR and run a
> > > > > build with:
> > > > >
> > > > > SRC_URI = "file://a file://b"
> > > > >
> > > > > then change it to:
> > > > >
> > > > > SRC_URI = "file://a"
> > > > >
> > > > > and rebuild the recipe, the fetch and unpack tasks will rerun and 
> > > > > their
> > > > > hashes will change but the file "b" is still in WORKDIR. Nothing in 
> > > > > the
> > > > > codebase knows that it should delete "b" from there. If you have code
> > > > > which does "if exists(b)", which is common, it will break.
> > > > >
> > > > > There are variations on this, such as a conditional append on some
> > > > > override to SRC_URI but the fundamental problem is one of cleanup when
> > > > > unpack is to rerun.
> > > > >
> > > > > The naive approach is then to think "lets just delete WORKDIR" when
> > > > > running do_unpack. There is the small problem of WORKDIR/temp with 
> > > > > logs
> > > > > in. There is also the pseudo database and other things tasks could 
> > > > > have
> > > > > done. Basically, whilst tempting, it doesn't work out well in practise
> > > > > particularly as that whilst unpack might rerun, not all other tasks
> > > > > might.
> > > > >
> > > > > I did also try a couple of other ideas. We could fetch into a
> > > > > subdirectory, then either copy or symlink into place depending on 
> > > > > which
> > > > > set of performance/usabiity challenges you want to deal with. You 
> > > > > could
> > > > > involve a manifest of the files and then move into position so later
> > > > > you'd know which ones to delete.
> > > > >
> > > > > Part of the problem is that in some cases recipes do:
> > > > >
> > > > > S = "${WORKDIR}"
> > > > >
> > > > > for simplicity. This means that you also can't wipe out S as it might
> > > > > point at WORKDIR.
> > > > >
> > > > > SPDX users have requested a json file of file and checksums after the
> > > > > unpack and before do_patch. Such a manifest could also be useful for
> > > > > attempting cleanup of an existing WORKDIR so I suspect the solution
> > > > > probably lies in that direction, probably unpacking into a subdir,
> > > > > indexing it, then moving into position.
> > > >
> > > > By "moving it into position" do you mean moving the files from the
> > > > clean subdirectory to the locations they would occupy today?
> > > >
> > > > If so.... I don't understand why that's strictly necessary. It seems
> > > > like almost all of the complexity of this will be to support a
> > > > use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
> > > > great and all, but it causes a lot of problems if they get out of sync
> > > > and I suspect that would happen more often than we would like, e.g.
> > > > with devtool, make config, manual editing, etc. If we can keep it
> > > > simple and not rely on external state (e.g. a manifest) I think it
> > > > will be a lot easier to maintain in the long run.
> > >
> > > Dropping S = "${WORKDIR}" doesn't solve the problem being described
> > > here, it just removes something which complicates current code and
> > > makes that problem harder to solve. Even not supporting S =
> > > "${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
> > > created by the tarball.
> >
> > In this particular piece, it's always bugged me that I don't have control 
> > over
> > the place it unpacks (whatever it is), where it patches and the S directory.
> > (These are NOT the same thing in some cases of cases, but we ending up 
> > having to
> > "make them the same".)
> >
> > For instance, I've got software that is going to download (currently) into:
> >
> > WORKDIR/embeddedsw/
> >     apps/
> >        app1/
> >          variant1
> >          variant2
> >        app2/
> >          variant1
> >          variant2
> >
> > (I don't have ownership over the structure, so I have to live with it...)
> >
> > Each app & variant is a separate recipe.  So we end up having to play games 
> > with
> > the S and patchlevel and other thing so that I can have 2 recipes (app1 and
> > app2) that will build the correct variant for their machine.
> >
> > If I could say:
> >
> > unpack to $WORKDIR
> 
> FWIW unpack is controlled by the directory passed into the fetcher by
> the do_unpack task.
> 
> > patch in $WORKDIR/embeddedsw/apps
> 
> Patch isn't configurable today, making it configurable would probably
> cause the most problems.

Well, you can use "S:task-patch". I have had cases where the recipe fetches 
from Git, but uses a sub-directory, e.g., `S = "${WORKDIR}/git/subdir"`. In 
this case, setting `S:task-patch = "${WORKDIR}/git"` makes it easier to 
apply patches generated using `git format-patch` (especially if it is not 
only files in subdir that are modified by the patch, as otherwise striplevel=2 
normally works).

> 
> > source in $WORKDIR/embeddedsw/apps/app1/variant1
> 
> This is configurable with ${S}.
> 
> > it would make it easier in this extreme case.  I would guess in most other
> > cases, the complexity could be hidden by defaults to preserve existing 
> > behavior.
> >
> >
> > In some ways, what we're really trying to do is define for a given task what
> > directory it should be working within.  So maybe that is a better way of
> > thinking of this.  (adding that configure,compile,install would operate 
> > within B.)
> >
> > Anyway, just a few thoughts from reading through this.
> 
> I worry we already support too many corner cases and should be aiming
> to simplify, not add most possible configuration options. Thanks for
> the data though.
> 
> Cheers,
> 
> Richard

//Peter

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#1699): 
https://lists.openembedded.org/g/openembedded-architecture/message/1699
Mute This Topic: https://lists.openembedded.org/mt/95936561/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-architecture/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to