On Sat, 2022-12-31 at 10:47 -0600, Mark Hatle wrote:
> 
> On 12/29/22 9:06 AM, Richard Purdie wrote:
> > On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:
> > > On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
> > > <richard.pur...@linuxfoundation.org> wrote:
> > > > 
> > > > I was asked about a WORKDIR/fetcher interaction problem and the bugs it
> > > > results in. I've tried to write down my thoughts.
> > > > 
> > > > The unpack task writes it's output to WORKDIR as base.bbclass says:
> > > > 
> > > >          fetcher = bb.fetch2.Fetch(src_uri, d)
> > > >          fetcher.unpack(d.getVar('WORKDIR')
> > > > 
> > > > We historically dealt with tarballs which usually have a NAME-VERSION
> > > > directory within them, so when you extract them, they go into a sub
> > > > directory which tar creates. We usually call that subdirectory "S".
> > > > 
> > > > When we wrote the git fetcher, we emulated this by using a "git"
> > > > directory to extract into rather than WORKDIR.
> > > > 
> > > > For local files, there is no sub directory so they go into WORKDIR.
> > > > This includes patches, which do_patch looks for in WORKDIR and applies
> > > > them from there.
> > > > 
> > > > What issues does this cause? If you have an existing WORKDIR and run a
> > > > build with:
> > > > 
> > > > SRC_URI = "file://a file://b"
> > > > 
> > > > then change it to:
> > > > 
> > > > SRC_URI = "file://a"
> > > > 
> > > > and rebuild the recipe, the fetch and unpack tasks will rerun and their
> > > > hashes will change but the file "b" is still in WORKDIR. Nothing in the
> > > > codebase knows that it should delete "b" from there. If you have code
> > > > which does "if exists(b)", which is common, it will break.
> > > > 
> > > > There are variations on this, such as a conditional append on some
> > > > override to SRC_URI but the fundamental problem is one of cleanup when
> > > > unpack is to rerun.
> > > > 
> > > > The naive approach is then to think "lets just delete WORKDIR" when
> > > > running do_unpack. There is the small problem of WORKDIR/temp with logs
> > > > in. There is also the pseudo database and other things tasks could have
> > > > done. Basically, whilst tempting, it doesn't work out well in practise
> > > > particularly as that whilst unpack might rerun, not all other tasks
> > > > might.
> > > > 
> > > > I did also try a couple of other ideas. We could fetch into a
> > > > subdirectory, then either copy or symlink into place depending on which
> > > > set of performance/usabiity challenges you want to deal with. You could
> > > > involve a manifest of the files and then move into position so later
> > > > you'd know which ones to delete.
> > > > 
> > > > Part of the problem is that in some cases recipes do:
> > > > 
> > > > S = "${WORKDIR}"
> > > > 
> > > > for simplicity. This means that you also can't wipe out S as it might
> > > > point at WORKDIR.
> > > > 
> > > > SPDX users have requested a json file of file and checksums after the
> > > > unpack and before do_patch. Such a manifest could also be useful for
> > > > attempting cleanup of an existing WORKDIR so I suspect the solution
> > > > probably lies in that direction, probably unpacking into a subdir,
> > > > indexing it, then moving into position.
> > > 
> > > By "moving it into position" do you mean moving the files from the
> > > clean subdirectory to the locations they would occupy today?
> > > 
> > > If so.... I don't understand why that's strictly necessary. It seems
> > > like almost all of the complexity of this will be to support a
> > > use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
> > > great and all, but it causes a lot of problems if they get out of sync
> > > and I suspect that would happen more often than we would like, e.g.
> > > with devtool, make config, manual editing, etc. If we can keep it
> > > simple and not rely on external state (e.g. a manifest) I think it
> > > will be a lot easier to maintain in the long run.
> > 
> > Dropping S = "${WORKDIR}" doesn't solve the problem being described
> > here, it just removes something which complicates current code and
> > makes that problem harder to solve. Even not supporting S =
> > "${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
> > created by the tarball.
> 
> In this particular piece, it's always bugged me that I don't have control 
> over 
> the place it unpacks (whatever it is), where it patches and the S directory. 
> (These are NOT the same thing in some cases of cases, but we ending up having 
> to 
> "make them the same".)
> 
> For instance, I've got software that is going to download (currently) into:
> 
> WORKDIR/embeddedsw/
>     apps/
>        app1/
>          variant1
>          variant2
>        app2/
>          variant1
>          variant2
> 
> (I don't have ownership over the structure, so I have to live with it...)
> 
> Each app & variant is a separate recipe.  So we end up having to play games 
> with 
> the S and patchlevel and other thing so that I can have 2 recipes (app1 and 
> app2) that will build the correct variant for their machine.
> 
> If I could say:
> 
> unpack to $WORKDIR

FWIW unpack is controlled by the directory passed into the fetcher by
the do_unpack task.

> patch in $WORKDIR/embeddedsw/apps

Patch isn't configurable today, making it configurable would probably
cause the most problems.

> source in $WORKDIR/embeddedsw/apps/app1/variant1

This is configurable with ${S}.

> it would make it easier in this extreme case.  I would guess in most other 
> cases, the complexity could be hidden by defaults to preserve existing 
> behavior.
> 
> 
> In some ways, what we're really trying to do is define for a given task what 
> directory it should be working within.  So maybe that is a better way of 
> thinking of this.  (adding that configure,compile,install would operate 
> within B.)
> 
> Anyway, just a few thoughts from reading through this.

I worry we already support too many corner cases and should be aiming
to simplify, not add most possible configuration options. Thanks for
the data though.

Cheers,

Richard

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#1698): 
https://lists.openembedded.org/g/openembedded-architecture/message/1698
Mute This Topic: https://lists.openembedded.org/mt/95936561/21656
Group Owner: openembedded-architecture+ow...@lists.openembedded.org
Unsubscribe: https://lists.openembedded.org/g/openembedded-architecture/unsub 
[arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to