Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2023-01-13 Thread Alberto Pianon

On 2022-12-29 16:06, Richard Purdie wrote:

On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:

On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
 wrote:
>
> I was asked about a WORKDIR/fetcher interaction problem and the bugs it
> results in. I've tried to write down my thoughts.
>
> The unpack task writes its output to WORKDIR as base.bbclass says:
>
> fetcher = bb.fetch2.Fetch(src_uri, d)
> fetcher.unpack(d.getVar('WORKDIR'))
>
> We historically dealt with tarballs which usually have a NAME-VERSION
> directory within them, so when you extract them, they go into a sub
> directory which tar creates. We usually call that subdirectory "S".
>
> When we wrote the git fetcher, we emulated this by using a "git"
> directory to extract into rather than WORKDIR.
>
> For local files, there is no sub directory so they go into WORKDIR.
> This includes patches, which do_patch looks for in WORKDIR and applies
> them from there.
>
> What issues does this cause? If you have an existing WORKDIR and run a
> build with:
>
> SRC_URI = "file://a file://b"
>
> then change it to:
>
> SRC_URI = "file://a"
>
> and rebuild the recipe, the fetch and unpack tasks will rerun and their
> hashes will change but the file "b" is still in WORKDIR. Nothing in the
> codebase knows that it should delete "b" from there. If you have code
> which does "if exists(b)", which is common, it will break.
>
> There are variations on this, such as a conditional append on some
> override to SRC_URI but the fundamental problem is one of cleanup when
> unpack is to rerun.
>
> The naive approach is then to think "let's just delete WORKDIR" when
> running do_unpack. There is the small problem of WORKDIR/temp with logs
> in. There is also the pseudo database and other things tasks could have
> done. Basically, whilst tempting, it doesn't work out well in practice,
> particularly as, whilst unpack might rerun, not all other tasks
> might.
>
> I did also try a couple of other ideas. We could fetch into a
> subdirectory, then either copy or symlink into place depending on which
> set of performance/usability challenges you want to deal with. You could
> involve a manifest of the files and then move into position so later
> you'd know which ones to delete.
>
> Part of the problem is that in some cases recipes do:
>
> S = "${WORKDIR}"
>
> for simplicity. This means that you also can't wipe out S as it might
> point at WORKDIR.
>
> SPDX users have requested a JSON file of files and checksums after the
> unpack and before do_patch. Such a manifest could also be useful for
> attempting cleanup of an existing WORKDIR so I suspect the solution
> probably lies in that direction, probably unpacking into a subdir,
> indexing it, then moving into position.

By "moving it into position" do you mean moving the files from the
clean subdirectory to the locations they would occupy today?

If so I don't understand why that's strictly necessary. It seems
like almost all of the complexity of this will be to support a
use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
great and all, but they cause a lot of problems if they get out of sync
and I suspect that would happen more often than we would like, e.g.
with devtool, make config, manual editing, etc. If we can keep it
simple and not rely on external state (e.g. a manifest) I think it
will be a lot easier to maintain in the long run.


Dropping S = "${WORKDIR}" doesn't solve the problem being described
here, it just removes something which complicates current code and
makes that problem harder to solve. Even not supporting S =
"${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
created by the tarball.

Cheers,

Richard



Hi Richard,

I'm not sure I'm one of the SPDX guys you intended to refer to :) but I
definitely support the idea of having a separate manifest with file
relpaths and checksums (and also the download location) for each item in
SRC_URI before do_patch (and before putting everything together into S).
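
To make the request concrete, here is a rough sketch in plain Python (not
existing BitBake code; the function name, JSON layout, and example URL are
invented for illustration) of the kind of per-SRC_URI-entry record being
discussed: relative paths and checksums of the files an entry unpacked,
plus its download location:

    import hashlib
    import json
    import os

    def manifest_for_entry(unpack_dir, download_location):
        # Record relpath -> sha256 for every regular file the entry unpacked
        files = {}
        for root, _, names in os.walk(unpack_dir):
            for name in names:
                path = os.path.join(root, name)
                if os.path.islink(path):
                    continue
                with open(path, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                files[os.path.relpath(path, unpack_dir)] = digest
        return {"downloadLocation": download_location, "files": files}

    # Usage sketch: index the directory one SRC_URI entry was unpacked into
    print(json.dumps(manifest_for_entry(".", "https://example.com/foo-1.0.tar.gz"),
                     indent=2))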


That would be a game changer for software composition analysis and IP
compliance, and it would also help with the creation of an SPDX license
calculation tool that I was asked to contribute[^1]: such a manifest would
allow analyzing the real upstream sources and matching them with license
metadata coming from existing online resources[^2].


I understand Joshua's concerns about using such a manifest to handle
source cleanup in case of SRC_URI modifications, and I don't have an
answer for that (that is not my field). In any case, IMHO the requirement
from SPDX users would be a strong enough motivation to implement such
manifest generation.


I would be glad to contribute if you decide to do it.

Cheers,

Alberto

[^1]: https://bugzilla.yoctoproject.org/show_bug.cgi?id=4517#c2
[^2]: Apart from the well-known ones (ClearlyDefined, Software Heritage, 
Debian, etc.) there's an interesting new project by osadl.org, which may 
become a good source of trusted license metadata 

Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2023-01-02 Thread Peter Kjellerstedt
> -Original Message-
> From: openembedded-architecture@lists.openembedded.org 
>  On Behalf Of Richard Purdie
> Sent: 31 December 2022 18:03
> To: Mark Hatle ; Joshua Watt 
> 
> Cc: openembedded-architecture 
> ; Trevor Woerner 
> 
> Subject: Re: [Openembedded-architecture] WORKDIR fetcher interaction issue
> 
> On Sat, 2022-12-31 at 10:47 -0600, Mark Hatle wrote:
> >
> > On 12/29/22 9:06 AM, Richard Purdie wrote:
> > > On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:
> > > > On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
> > > >  wrote:
> > > > >
> > > > > I was asked about a WORKDIR/fetcher interaction problem and the bugs 
> > > > > it
> > > > > results in. I've tried to write down my thoughts.
> > > > >
> > > > > The unpack task writes its output to WORKDIR as base.bbclass says:
> > > > >
> > > > >  fetcher = bb.fetch2.Fetch(src_uri, d)
> > > > >  fetcher.unpack(d.getVar('WORKDIR'))
> > > > >
> > > > > We historically dealt with tarballs which usually have a NAME-VERSION
> > > > > directory within them, so when you extract them, they go into a sub
> > > > > directory which tar creates. We usually call that subdirectory "S".
> > > > >
> > > > > When we wrote the git fetcher, we emulated this by using a "git"
> > > > > directory to extract into rather than WORKDIR.
> > > > >
> > > > > For local files, there is no sub directory so they go into WORKDIR.
> > > > > This includes patches, which do_patch looks for in WORKDIR and applies
> > > > > them from there.
> > > > >
> > > > > What issues does this cause? If you have an existing WORKDIR and run a
> > > > > build with:
> > > > >
> > > > > SRC_URI = "file://a file://b"
> > > > >
> > > > > then change it to:
> > > > >
> > > > > SRC_URI = "file://a"
> > > > >
> > > > > and rebuild the recipe, the fetch and unpack tasks will rerun and 
> > > > > their
> > > > > hashes will change but the file "b" is still in WORKDIR. Nothing in 
> > > > > the
> > > > > codebase knows that it should delete "b" from there. If you have code
> > > > > which does "if exists(b)", which is common, it will break.
> > > > >
> > > > > There are variations on this, such as a conditional append on some
> > > > > override to SRC_URI but the fundamental problem is one of cleanup when
> > > > > unpack is to rerun.
> > > > >
> > > > > The naive approach is then to think "let's just delete WORKDIR" when
> > > > > running do_unpack. There is the small problem of WORKDIR/temp with 
> > > > > logs
> > > > > in. There is also the pseudo database and other things tasks could 
> > > > > have
> > > > > done. Basically, whilst tempting, it doesn't work out well in practice,
> > > > > particularly as, whilst unpack might rerun, not all other tasks
> > > > > might.
> > > > >
> > > > > I did also try a couple of other ideas. We could fetch into a
> > > > > subdirectory, then either copy or symlink into place depending on 
> > > > > which
> > > > > set of performance/usability challenges you want to deal with. You
> > > > > could
> > > > > involve a manifest of the files and then move into position so later
> > > > > you'd know which ones to delete.
> > > > >
> > > > > Part of the problem is that in some cases recipes do:
> > > > >
> > > > > S = "${WORKDIR}"
> > > > >
> > > > > for simplicity. This means that you also can't wipe out S as it might
> > > > > point at WORKDIR.
> > > > >
> > > > > SPDX users have requested a JSON file of files and checksums after the
> > > > > unpack and before do_patch. Such a manifest could also be useful for
> > > > > attempting cleanup of an existing WORKDIR so I suspect the solution
> > > > > probably lies in that direction, probably unpacking into a subdir,
> > > > > indexing it, then moving into position.
> > > >
>

Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-31 Thread Richard Purdie
On Sat, 2022-12-31 at 10:47 -0600, Mark Hatle wrote:
> 
> On 12/29/22 9:06 AM, Richard Purdie wrote:
> > On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:
> > > On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
> > >  wrote:
> > > > 
> > > > I was asked about a WORKDIR/fetcher interaction problem and the bugs it
> > > > results in. I've tried to write down my thoughts.
> > > > 
> > > > The unpack task writes its output to WORKDIR as base.bbclass says:
> > > > 
> > > >  fetcher = bb.fetch2.Fetch(src_uri, d)
> > > >  fetcher.unpack(d.getVar('WORKDIR'))
> > > > 
> > > > We historically dealt with tarballs which usually have a NAME-VERSION
> > > > directory within them, so when you extract them, they go into a sub
> > > > directory which tar creates. We usually call that subdirectory "S".
> > > > 
> > > > When we wrote the git fetcher, we emulated this by using a "git"
> > > > directory to extract into rather than WORKDIR.
> > > > 
> > > > For local files, there is no sub directory so they go into WORKDIR.
> > > > This includes patches, which do_patch looks for in WORKDIR and applies
> > > > them from there.
> > > > 
> > > > What issues does this cause? If you have an existing WORKDIR and run a
> > > > build with:
> > > > 
> > > > SRC_URI = "file://a file://b"
> > > > 
> > > > then change it to:
> > > > 
> > > > SRC_URI = "file://a"
> > > > 
> > > > and rebuild the recipe, the fetch and unpack tasks will rerun and their
> > > > hashes will change but the file "b" is still in WORKDIR. Nothing in the
> > > > codebase knows that it should delete "b" from there. If you have code
> > > > which does "if exists(b)", which is common, it will break.
> > > > 
> > > > There are variations on this, such as a conditional append on some
> > > > override to SRC_URI but the fundamental problem is one of cleanup when
> > > > unpack is to rerun.
> > > > 
> > > > The naive approach is then to think "let's just delete WORKDIR" when
> > > > running do_unpack. There is the small problem of WORKDIR/temp with logs
> > > > in. There is also the pseudo database and other things tasks could have
> > > > done. Basically, whilst tempting, it doesn't work out well in practice,
> > > > particularly as, whilst unpack might rerun, not all other tasks
> > > > might.
> > > > 
> > > > I did also try a couple of other ideas. We could fetch into a
> > > > subdirectory, then either copy or symlink into place depending on which
> > > > set of performance/usability challenges you want to deal with. You could
> > > > involve a manifest of the files and then move into position so later
> > > > you'd know which ones to delete.
> > > > 
> > > > Part of the problem is that in some cases recipes do:
> > > > 
> > > > S = "${WORKDIR}"
> > > > 
> > > > for simplicity. This means that you also can't wipe out S as it might
> > > > point at WORKDIR.
> > > > 
> > > > SPDX users have requested a JSON file of files and checksums after the
> > > > unpack and before do_patch. Such a manifest could also be useful for
> > > > attempting cleanup of an existing WORKDIR so I suspect the solution
> > > > probably lies in that direction, probably unpacking into a subdir,
> > > > indexing it, then moving into position.
> > > 
> > > By "moving it into position" do you mean moving the files from the
> > > clean subdirectory to the locations they would occupy today?
> > > 
> > > If so I don't understand why that's strictly necessary. It seems
> > > like almost all of the complexity of this will be to support a
> > > use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
> > > great and all, but they cause a lot of problems if they get out of sync
> > > and I suspect that would happen more often than we would like, e.g.
> > > with devtool, make config, manual editing, etc. If we can keep it
> > > simple and not rely on external state (e.g. a manifest) I think it
> > > will be a lot easier to maintain in the long run.
> > 
> > Dropping S = "${WORKDIR}" doesn't solve the problem being described
> > here, it just removes something which complicates current code and
> > makes that problem harder to solve. Even not supporting S =
> > "${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
> > created by the tarball.
> 
> In this particular piece, it's always bugged me that I don't have control 
> over 
> the place it unpacks (whatever it is), where it patches and the S directory. 
> (These are NOT the same thing in some cases, but we end up having 
> to 
> "make them the same".)
> 
> For instance, I've got software that is going to download (currently) into:
> 
> WORKDIR/embeddedsw/
> apps/
>app1/
>  variant1
>  variant2
>app2/
>  variant1
>  variant2
> 
> (I don't have ownership over the structure, so I have to live with it...)
> 
> Each app & variant is a separate recipe.  So we end up having to play games 
> with 
> the S and patchlevel and other things so that I can 

Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-31 Thread Mark Hatle



On 12/29/22 9:06 AM, Richard Purdie wrote:

On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:

On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
 wrote:


I was asked about a WORKDIR/fetcher interaction problem and the bugs it
results in. I've tried to write down my thoughts.

The unpack task writes its output to WORKDIR as base.bbclass says:

 fetcher = bb.fetch2.Fetch(src_uri, d)
 fetcher.unpack(d.getVar('WORKDIR'))

We historically dealt with tarballs which usually have a NAME-VERSION
directory within them, so when you extract them, they go into a sub
directory which tar creates. We usually call that subdirectory "S".

When we wrote the git fetcher, we emulated this by using a "git"
directory to extract into rather than WORKDIR.

For local files, there is no sub directory so they go into WORKDIR.
This includes patches, which do_patch looks for in WORKDIR and applies
them from there.

What issues does this cause? If you have an existing WORKDIR and run a
build with:

SRC_URI = "file://a file://b"

then change it to:

SRC_URI = "file://a"

and rebuild the recipe, the fetch and unpack tasks will rerun and their
hashes will change but the file "b" is still in WORKDIR. Nothing in the
codebase knows that it should delete "b" from there. If you have code
which does "if exists(b)", which is common, it will break.

There are variations on this, such as a conditional append on some
override to SRC_URI but the fundamental problem is one of cleanup when
unpack is to rerun.

The naive approach is then to think "let's just delete WORKDIR" when
running do_unpack. There is the small problem of WORKDIR/temp with logs
in. There is also the pseudo database and other things tasks could have
done. Basically, whilst tempting, it doesn't work out well in practice,
particularly as, whilst unpack might rerun, not all other tasks
might.

I did also try a couple of other ideas. We could fetch into a
subdirectory, then either copy or symlink into place depending on which
set of performance/usability challenges you want to deal with. You could
involve a manifest of the files and then move into position so later
you'd know which ones to delete.

Part of the problem is that in some cases recipes do:

S = "${WORKDIR}"

for simplicity. This means that you also can't wipe out S as it might
point at WORKDIR.

SPDX users have requested a JSON file of files and checksums after the
unpack and before do_patch. Such a manifest could also be useful for
attempting cleanup of an existing WORKDIR so I suspect the solution
probably lies in that direction, probably unpacking into a subdir,
indexing it, then moving into position.


By "moving it into position" do you mean moving the files from the
clean subdirectory to the locations they would occupy today?

If so I don't understand why that's strictly necessary. It seems
like almost all of the complexity of this will be to support a
use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
great and all, but they cause a lot of problems if they get out of sync
and I suspect that would happen more often than we would like, e.g.
with devtool, make config, manual editing, etc. If we can keep it
simple and not rely on external state (e.g. a manifest) I think it
will be a lot easier to maintain in the long run.


Dropping S = "${WORKDIR}" doesn't solve the problem being described
here, it just removes something which complicates current code and
makes that problem harder to solve. Even not supporting S =
"${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
created by the tarball.


In this particular piece, it's always bugged me that I don't have control over 
the place it unpacks (whatever it is), where it patches and the S directory. 
(These are NOT the same thing in some cases, but we end up having to 
"make them the same".)


For instance, I've got software that is going to download (currently) into:

WORKDIR/embeddedsw/
   apps/
  app1/
variant1
variant2
  app2/
variant1
variant2

(I don't have ownership over the structure, so I have to live with it...)

Each app & variant is a separate recipe.  So we end up having to play games with 
the S and patchlevel and other things so that I can have 2 recipes (app1 and 
app2) that will build the correct variant for their machine.


If I could say:

unpack to $WORKDIR
patch in $WORKDIR/embeddedsw/apps
source in $WORKDIR/embeddedsw/apps/app1/variant1

it would make it easier in this extreme case.  I would guess in most other 
cases, the complexity could be hidden by defaults to preserve existing behavior.
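
Something close to this can be approximated today with existing recipe
metadata, though only awkwardly; a hedged sketch (the repository URL, app
and variant names, and patch name are placeholders, and it assumes the
patchdir parameter accepts a path outside S):

    # app1 recipe sketch: everything unpacks under WORKDIR, patches apply
    # in the shared apps/ directory, and S points at the per-recipe variant
    SRC_URI = "git://example.com/embeddedsw.git;protocol=https;branch=master;destsuffix=embeddedsw \
               file://fix-build.patch;patchdir=${WORKDIR}/embeddedsw/apps"
    S = "${WORKDIR}/embeddedsw/apps/app1/variant1"

That still does not make "where to unpack", "where to patch" and "where the
source lives" independently controllable, first-class per-task settings,
which is what the request above is really about.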



In some ways, what we're really trying to do is define for a given task what 
directory it should be working within.  So maybe that is a better way of 
thinking of this.  (adding that configure, compile, and install would operate within B.)


Anyway, just a few thoughts from reading through this.

--Mark



Cheers,

Richard







Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Martin Jansa
On Thu, Dec 29, 2022 at 5:28 PM Trevor Woerner  wrote:

> On Thu 2022-12-29 @ 03:51:08 PM, Martin Jansa wrote:
> > On Thu, Dec 29, 2022 at 3:38 PM Trevor Woerner 
> wrote:
> >
> > > On Thu 2022-12-29 @ 01:56:51 PM, Richard Purdie wrote:
> > > > There are variations on this, such as a conditional append on some
> > > > override to SRC_URI but the fundamental problem is one of cleanup
> when
> > > > unpack is to rerun.
> > >
> > > ...just to elaborate a bit more on this variation for everyone's
> benefit
> > > (Richard already understands the details of my scenario):
> > >
> > > Some recipes require us to generate config files by hand in order to
> get a
> > > piece of software/service to work correctly in our environment. A
> > > concrete
> > > example could be specifying the IP address of a time server to use for
> > > clock
> > > synchronization in chrony's /etc/chrony.conf file. Another example
> could
> > > be to
> > > provide a /etc/network/interfaces file so networking works on a given
> > > device
> > > in our specific network.
> > >
> > > In my case I might want to build the same image, for the same device,
> but
> > > use
> > > two different sets of config files. If the device is going to run on my
> > > non-routable network then it will use CONDITION1 config files. If I
> want to
> > > build a set of images for devices running on my routable network then
> I'll
> > > need to use the CONDITION2 set of config files:
> > >
> > > meta-project
> > > ├── README
> > > ├── conf
> > > │   └── layer.conf
> > > └── recipes-configfiles
> > > ├── chrony
> > > │   ├── chrony_%.bbappend
> > > │   └── files
> > > │   ├── condition1
> > > │   │   └── chrony.conf
> > > │   └── condition2
> > > │   └── chrony.conf
> > > └── init-ifupdown
> > > ├── files
> > > │   ├── condition1
> > > │   │   └── interfaces
> > > │   └── condition2
> > > │   └── interfaces
> > > └── init-ifupdown_%.bbappend
> > >
> > > Then, somewhere, I either specify:
> > >
> > > MACHINEOVERRIDES .= ":condition1"
> > >
> > > or:
> > >
> > > MACHINEOVERRIDES .= ":condition2"
> > >
> > > NOTE: using OVERRIDES .= ":conditionX" doesn't work, it has to be a
> > >   MACHINEOVERRIDES since not all overrides are evaluated for the
> > > fetcher
> > >   in order to save parsing time (is that correct?)
> > >
> > > If I do a:
> > >
> > > $ bitbake  -c cleansstate
> > >
> > > (perhaps "-c clean" would be enough?) then perform a build, I always
> get
> > > the
> > > correct set of config files in my image. If I don't do a cleansstate
> between
> > > builds in which I change the override, then I simply get the last
> config
> > > file
> > > that's in the WORKDIR.
> >
> >
> > This example is a bit surprising to me.
> >
> > I understand the case mentioned by Richard that files aren't removed from
> > WORKDIR when they are no longer in SRC_URI (happens to me all the time
> when
> > e.g. renaming a .patch file and then seeing both old and new .patch file
> in
> > WORKDIR).
> >
> > But why doesn't fetcher overwrite your chrony.conf and interfaces file
> > after MACHINEOVERRIDES is changed?
>
> I spent a fair amount of time yesterday proving to myself that it wasn't
> changing the config file by simply changing the MACHINEOVERRIDES. But it
> wouldn't be the first time I was certain something was working a certain
> way,
> then later couldn't reproduce it.
>
> > And are you really changing MACHINEOVERRIDES while MACHINE stays the same?
> > I would expect 2 MACHINEs, each with its own set of MACHINEOVERRIDES, and
> > recipes like this being MACHINE_ARCH not TUNE_PKGARCH, and then each will
> > have its own WORKDIR with its own set of files.
>
> If there's a better way, I'd be quite interested in learning it. I'm pretty
> sure MACHINEOVERRIDES wasn't designed for this, and probably isn't the
> right
> way to go about it, but it's a tool that I have and a tool that, in theory,
> should do what I want (?)
>

I was always using MACHINEOVERRIDES to have a common override for multiple
different MACHINEs (like SOC_FAMILY, MACHINE_CLASS, MACHINE_VARIANT
variables various projects use).

I guess changing MACHINE_FEATURES or something like that would be a slightly
better fit for the use case, but I understand that you wanted to take
advantage of MACHINEOVERRIDES being included in:
meta/conf/bitbake.conf:FILESOVERRIDES =
"${TRANSLATED_TARGET_ARCH}:${MACHINEOVERRIDES}:${DISTROOVERRIDES}"

> it has to be a MACHINEOVERRIDES since not all overrides are evaluated for
> the fetcher
> in order to save parsing time (is that correct?)

It's partially correct: it's the FILESOVERRIDES which is used to define
FILESPATH, which is already quite long, and all possible locations are tried
until the first existing file is found 
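
For reference, a quick way to see the resulting search configuration for a
given recipe (the recipe name here is just an example):

    $ bitbake -e chrony | grep -E '^(OVERRIDES|FILESOVERRIDES|FILESPATH)='

The first directory in FILESPATH that contains the requested file:// entry
wins, so an override-named subdirectory such as files/condition1/ is
searched before the plain files/ directory.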

Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Trevor Woerner
On Thu 2022-12-29 @ 03:51:08 PM, Martin Jansa wrote:
> On Thu, Dec 29, 2022 at 3:38 PM Trevor Woerner  wrote:
> 
> > On Thu 2022-12-29 @ 01:56:51 PM, Richard Purdie wrote:
> > > There are variations on this, such as a conditional append on some
> > > override to SRC_URI but the fundamental problem is one of cleanup when
> > > unpack is to rerun.
> >
> > ...just to elaborate a bit more on this variation for everyone's benefit
> > (Richard already understands the details of my scenario):
> >
> > Some recipes require us to generate config files by hand in order to get a
> > piece of software/service to work correctly in our environment. A
> > concrete
> > example could be specifying the IP address of a time server to use for
> > clock
> > synchronization in chrony's /etc/chrony.conf file. Another example could
> > be to
> > provide a /etc/network/interfaces file so networking works on a given
> > device
> > in our specific network.
> >
> > In my case I might want to build the same image, for the same device, but
> > use
> > two different sets of config files. If the device is going to run on my
> > non-routable network then it will use CONDITION1 config files. If I want to
> > build a set of images for devices running on my routable network then I'll
> > need to use the CONDITION2 set of config files:
> >
> > meta-project
> > ├── README
> > ├── conf
> > │   └── layer.conf
> > └── recipes-configfiles
> > ├── chrony
> > │   ├── chrony_%.bbappend
> > │   └── files
> > │   ├── condition1
> > │   │   └── chrony.conf
> > │   └── condition2
> > │   └── chrony.conf
> > └── init-ifupdown
> > ├── files
> > │   ├── condition1
> > │   │   └── interfaces
> > │   └── condition2
> > │   └── interfaces
> > └── init-ifupdown_%.bbappend
> >
> > Then, somewhere, I either specify:
> >
> > MACHINEOVERRIDES .= ":condition1"
> >
> > or:
> >
> > MACHINEOVERRIDES .= ":condition2"
> >
> > NOTE: using OVERRIDES .= ":conditionX" doesn't work, it has to be a
> >   MACHINEOVERRIDES since not all overrides are evaluated for the
> > fetcher
> >   in order to save parsing time (is that correct?)
> >
> > If I do a:
> >
> > $ bitbake  -c cleansstate
> >
> > (perhaps "-c clean" would be enough?) then perform a build, I always get
> > the
> > correct set of config files in my image. If I don't do a cleansstate between
> > builds in which I change the override, then I simply get the last config
> > file
> > that's in the WORKDIR.
> 
> 
> This example is a bit surprising to me.
> 
> I understand the case mentioned by Richard that files aren't removed from
> WORKDIR when they are no longer in SRC_URI (happens to me all the time when
> e.g. renaming a .patch file and then seeing both the old and new .patch files in
> WORKDIR).
> 
> But why doesn't fetcher overwrite your chrony.conf and interfaces file
> after MACHINEOVERRIDES is changed?

I spent a fair amount of time yesterday proving to myself that it wasn't
changing the config file by simply changing the MACHINEOVERRIDES. But it
wouldn't be the first time I was certain something was working a certain way,
then later couldn't reproduce it.

> And are you really changing MACHINEOVERRIDES while MACHINE stays the same?
> I would expect 2 MACHINEs, each with its own set of MACHINEOVERRIDES, and recipes
> like this being MACHINE_ARCH not TUNE_PKGARCH, and then each will have its
> own WORKDIR with its own set of files.

If there's a better way, I'd be quite interested in learning it. I'm pretty
sure MACHINEOVERRIDES wasn't designed for this, and probably isn't the right
way to go about it, but it's a tool that I have and a tool that, in theory,
should do what I want (?)

There are a couple of things I'm doing, and maybe I'm not doing them the right
way. First off, the decision as to which set of config files to use should be
done by the user at build time. As such I'm tweaking MACHINEOVERRIDES in
conf/local.conf. Maybe that's too late in the parsing process?

Second, I'm checking the contents of the config file by looking in chrony's
packages-split area. Maybe that's the wrong place?

Would I have to create multiple machine.conf files? If so, that's not really
the correct semantics for this use-case either. Creating multiple binary
packages that are just dropped in at the end could work too, but would also be
cumbersome (assuming the set of config files would have to be tarballed up).


Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Richard Purdie
On Thu, 2022-12-29 at 08:50 -0600, Joshua Watt wrote:
> On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
>  wrote:
> > 
> > I was asked about a WORKDIR/fetcher interaction problem and the bugs it
> > results in. I've tried to write down my thoughts.
> > 
> > The unpack task writes its output to WORKDIR as base.bbclass says:
> > 
> > fetcher = bb.fetch2.Fetch(src_uri, d)
> > fetcher.unpack(d.getVar('WORKDIR'))
> > 
> > We historically dealt with tarballs which usually have a NAME-VERSION
> > directory within them, so when you extract them, they go into a sub
> > directory which tar creates. We usually call that subdirectory "S".
> > 
> > When we wrote the git fetcher, we emulated this by using a "git"
> > directory to extract into rather than WORKDIR.
> > 
> > For local files, there is no sub directory so they go into WORKDIR.
> > This includes patches, which do_patch looks for in WORKDIR and applies
> > them from there.
> > 
> > What issues does this cause? If you have an existing WORKDIR and run a
> > build with:
> > 
> > SRC_URI = "file://a file://b"
> > 
> > then change it to:
> > 
> > SRC_URI = "file://a"
> > 
> > and rebuild the recipe, the fetch and unpack tasks will rerun and their
> > hashes will change but the file "b" is still in WORKDIR. Nothing in the
> > codebase knows that it should delete "b" from there. If you have code
> > which does "if exists(b)", which is common, it will break.
> > 
> > There are variations on this, such as a conditional append on some
> > override to SRC_URI but the fundamental problem is one of cleanup when
> > unpack is to rerun.
> > 
> > The naive approach is then to think "let's just delete WORKDIR" when
> > running do_unpack. There is the small problem of WORKDIR/temp with logs
> > in. There is also the pseudo database and other things tasks could have
> > done. Basically, whilst tempting, it doesn't work out well in practice,
> > particularly as, whilst unpack might rerun, not all other tasks
> > might.
> > 
> > I did also try a couple of other ideas. We could fetch into a
> > subdirectory, then either copy or symlink into place depending on which
> > set of performance/usability challenges you want to deal with. You could
> > involve a manifest of the files and then move into position so later
> > you'd know which ones to delete.
> > 
> > Part of the problem is that in some cases recipes do:
> > 
> > S = "${WORKDIR}"
> > 
> > for simplicity. This means that you also can't wipe out S as it might
> > point at WORKDIR.
> > 
> > SPDX users have requested a JSON file of files and checksums after the
> > unpack and before do_patch. Such a manifest could also be useful for
> > attempting cleanup of an existing WORKDIR so I suspect the solution
> > probably lies in that direction, probably unpacking into a subdir,
> > indexing it, then moving into position.
> 
> By "moving it into position" do you mean moving the files from the
> clean subdirectory to the locations they would occupy today?
> 
> If so I don't understand why that's strictly necessary. It seems
> like almost all of the complexity of this will be to support a
> use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
> great and all, but they cause a lot of problems if they get out of sync
> and I suspect that would happen more often than we would like, e.g.
> with devtool, make config, manual editing, etc. If we can keep it
> simple and not rely on external state (e.g. a manifest) I think it
> will be a lot easier to maintain in the long run.

Dropping S = "${WORKDIR}" doesn't solve the problem being described
here, it just removes something which complicates current code and
makes that problem harder to solve. Even not supporting S =
"${WORKDIR}", do_unpack still unpacks to WORKDIR with the S directory
created by the tarball.

Cheers,

Richard





Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Martin Jansa
On Thu, Dec 29, 2022 at 3:38 PM Trevor Woerner  wrote:

> On Thu 2022-12-29 @ 01:56:51 PM, Richard Purdie wrote:
> > There are variations on this, such as a conditional append on some
> > override to SRC_URI but the fundamental problem is one of cleanup when
> > unpack is to rerun.
>
> ...just to elaborate a bit more on this variation for everyone's benefit
> (Richard already understands the details of my scenario):
>
> Some recipes require us to generate config files by hand in order to get a
> piece of software/service to work correctly in our environment. A concrete
> concrete
> example could be specifying the IP address of a time server to use for
> clock
> synchronization in chrony's /etc/chrony.conf file. Another example could
> be to
> provide a /etc/network/interfaces file so networking works on a given
> device
> in our specific network.
>
> In my case I might want to build the same image, for the same device, but
> use
> two different sets of config files. If the device is going to run on my
> non-routable network then it will use CONDITION1 config files. If I want to
> build a set of images for devices running on my routable network then I'll
> need to use the CONDITION2 set of config files:
>
> meta-project
> ├── README
> ├── conf
> │   └── layer.conf
> └── recipes-configfiles
> ├── chrony
> │   ├── chrony_%.bbappend
> │   └── files
> │   ├── condition1
> │   │   └── chrony.conf
> │   └── condition2
> │   └── chrony.conf
> └── init-ifupdown
> ├── files
> │   ├── condition1
> │   │   └── interfaces
> │   └── condition2
> │   └── interfaces
> └── init-ifupdown_%.bbappend
>
> Then, somewhere, I either specify:
>
> MACHINEOVERRIDES .= ":condition1"
>
> or:
>
> MACHINEOVERRIDES .= ":condition2"
>
> NOTE: using OVERRIDES .= ":conditionX" doesn't work, it has to be a
>   MACHINEOVERRIDES since not all overrides are evaluated for the
> fetcher
>   in order to save parsing time (is that correct?)
>
> If I do a:
>
> $ bitbake  -c cleansstate
>
> (perhaps "-c clean" would be enough?) then perform a build, I always get
> the
> correct set of config files in my image. If I don't do a cleansstate between
> builds in which I change the override, then I simply get the last config
> file
> that's in the WORKDIR.


This example is a bit surprising to me.

I understand the case mentioned by Richard that files aren't removed from
WORKDIR when they are no longer in SRC_URI (happens to me all the time when
e.g. renaming a .patch file and then seeing both the old and new .patch files in
WORKDIR).

But why doesn't fetcher overwrite your chrony.conf and interfaces file
after MACHINEOVERRIDES is changed?

And are you really changing MACHINEOVERRIDES while MACHINE stays the same?
I would expect 2 MACHINEs, each with its own set of MACHINEOVERRIDES, and recipes
like this being MACHINE_ARCH not TUNE_PKGARCH, and then each will have its
own WORKDIR with its own set of files.




Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Joshua Watt
On Thu, Dec 29, 2022 at 7:56 AM Richard Purdie
 wrote:
>
> I was asked about a WORKDIR/fetcher interaction problem and the bugs it
> results in. I've tried to write down my thoughts.
>
> The unpack task writes its output to WORKDIR as base.bbclass says:
>
> fetcher = bb.fetch2.Fetch(src_uri, d)
> fetcher.unpack(d.getVar('WORKDIR'))
>
> We historically dealt with tarballs which usually have a NAME-VERSION
> directory within them, so when you extract them, they go into a sub
> directory which tar creates. We usually call that subdirectory "S".
>
> When we wrote the git fetcher, we emulated this by using a "git"
> directory to extract into rather than WORKDIR.
>
> For local files, there is no sub directory so they go into WORKDIR.
> This includes patches, which do_patch looks for in WORKDIR and applies
> them from there.
>
> What issues does this cause? If you have an existing WORKDIR and run a
> build with:
>
> SRC_URI = "file://a file://b"
>
> then change it to:
>
> SRC_URI = "file://a"
>
> and rebuild the recipe, the fetch and unpack tasks will rerun and their
> hashes will change but the file "b" is still in WORKDIR. Nothing in the
> codebase knows that it should delete "b" from there. If you have code
> which does "if exists(b)", which is common, it will break.
>
> There are variations on this, such as a conditional append on some
> override to SRC_URI but the fundamental problem is one of cleanup when
> unpack is to rerun.
>
> The naive approach is then to think "let's just delete WORKDIR" when
> running do_unpack. There is the small problem of WORKDIR/temp with logs
> in. There is also the pseudo database and other things tasks could have
> done. Basically, whilst tempting, it doesn't work out well in practice,
> particularly as, whilst unpack might rerun, not all other tasks
> might.
>
> I did also try a couple of other ideas. We could fetch into a
> subdirectory, then either copy or symlink into place depending on which
> set of performance/usability challenges you want to deal with. You could
> involve a manifest of the files and then move into position so later
> you'd know which ones to delete.
>
> Part of the problem is that in some cases recipes do:
>
> S = "${WORKDIR}"
>
> for simplicity. This means that you also can't wipe out S as it might
> point at WORKDIR.
>
> SPDX users have requested a JSON file of files and checksums after the
> unpack and before do_patch. Such a manifest could also be useful for
> attempting cleanup of an existing WORKDIR so I suspect the solution
> probably lies in that direction, probably unpacking into a subdir,
> indexing it, then moving into position.

By "moving it into position" do you mean moving the files from the
clean subdirectory to the locations they would occupy today?

If so I don't understand why that's strictly necessary. It seems
like almost all of the complexity of this will be to support a
use-case we don't really like anyway (S = "${WORKDIR}"). Manifests are
great and all, but they cause a lot of problems if they get out of sync
and I suspect that would happen more often than we would like, e.g.
with devtool, make config, manual editing, etc. If we can keep it
simple and not rely on external state (e.g. a manifest) I think it
will be a lot easier to maintain in the long run.

>
> Personally, I'd also like to see S = "${WORKDIR}" deprecated and
> dropped so that a subdir is always used, just to stop our code getting
> too full of corner cases which are hard to maintain.
>
> I've had a few experiments with variations on both issues on various
> branches at different times, I just haven't had enough time to
> socialise the changes, migrate code and handle the inevitable fallout.
>
> Cheers,
>
> Richard
>
>
> 
>




Re: [Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Trevor Woerner
On Thu 2022-12-29 @ 01:56:51 PM, Richard Purdie wrote:
> There are variations on this, such as a conditional append on some
> override to SRC_URI but the fundamental problem is one of cleanup when
> unpack is to rerun.

...just to elaborate a bit more on this variation for everyone's benefit
(Richard already understands the details of my scenario):

Some recipes require us to generate config files by hand in order to get a
piece of software/service to work correctly in our environment. A concrete
example could be specifying the IP address of a time server to use for clock
synchronization in chrony's /etc/chrony.conf file. Another example could be to
provide a /etc/network/interfaces file so networking works on a given device
in our specific network.

In my case I might want to build the same image, for the same device, but use
two different sets of config files. If the device is going to run on my
non-routable network then it will use CONDITION1 config files. If I want to
build a set of images for devices running on my routable network then I'll
need to use the CONDITION2 set of config files:

meta-project
├── README
├── conf
│   └── layer.conf
└── recipes-configfiles
    ├── chrony
    │   ├── chrony_%.bbappend
    │   └── files
    │   ├── condition1
    │   │   └── chrony.conf
    │   └── condition2
    │   └── chrony.conf
    └── init-ifupdown
    ├── files
    │   ├── condition1
    │   │   └── interfaces
    │   └── condition2
    │   └── interfaces
    └── init-ifupdown_%.bbappend

Then, somewhere, I either specify:

MACHINEOVERRIDES .= ":condition1"

or:

MACHINEOVERRIDES .= ":condition2"

NOTE: using OVERRIDES .= ":conditionX" doesn't work, it has to be a
  MACHINEOVERRIDES since not all overrides are evaluated for the fetcher
  in order to save parsing time (is that correct?)
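
For anyone reproducing this, the mechanism being relied on looks roughly
like the following sketch (hypothetical file names; the key point is that
the extra override ends up in FILESOVERRIDES, so the file:// search path
prefers the matching subdirectory):

    # conf/local.conf (chosen by the user at build time)
    MACHINEOVERRIDES .= ":condition1"

    # recipes-configfiles/chrony/chrony_%.bbappend
    FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
    SRC_URI += "file://chrony.conf"

    # With "condition1" in FILESOVERRIDES, the fetcher looks for
    # files/condition1/chrony.conf before files/chrony.conf.

(The SRC_URI += line is only needed if the base recipe does not already
fetch a chrony.conf of its own.)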

If I do a:

$ bitbake  -c cleansstate

(perhaps "-c clean" would be enough?) then perform a build, I always get the
correct set of config files in my image. If I don't do a cleansstate between
builds in which I change the override, then I simply get the last config file
that's in the WORKDIR.




[Openembedded-architecture] WORKDIR fetcher interaction issue

2022-12-29 Thread Richard Purdie
I was asked about a WORKDIR/fetcher interaction problem and the bugs it
results in. I've tried to write down my thoughts.

The unpack task writes its output to WORKDIR as base.bbclass says:

fetcher = bb.fetch2.Fetch(src_uri, d)
fetcher.unpack(d.getVar('WORKDIR'))

We historically dealt with tarballs which usually have a NAME-VERSION
directory within them, so when you extract them, they go into a sub
directory which tar creates. We usually call that subdirectory "S".

When we wrote the git fetcher, we emulated this by using a "git"
directory to extract into rather than WORKDIR.

For local files, there is no sub directory so they go into WORKDIR.
This includes patches, which do_patch looks for in WORKDIR and applies
them from there.
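
In recipe terms, the three layouts just described usually correspond to the
following S settings (a sketch; the first is equivalent to the usual
bitbake.conf default of "${WORKDIR}/${BP}"):

    # tarball with a NAME-VERSION top-level directory
    S = "${WORKDIR}/${BPN}-${PV}"
    # git fetcher unpacks into the "git" subdirectory
    S = "${WORKDIR}/git"
    # local files only: everything lands directly in WORKDIR
    S = "${WORKDIR}"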

What issues does this cause? If you have an existing WORKDIR and run a
build with:

SRC_URI = "file://a file://b"

then change it to:

SRC_URI = "file://a"

and rebuild the recipe, the fetch and unpack tasks will rerun and their
hashes will change but the file "b" is still in WORKDIR. Nothing in the
codebase knows that it should delete "b" from there. If you have code
which does "if exists(b)", which is common, it will break.
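
As a concrete illustration (a made-up recipe fragment, not taken from any
real layer), a task that keys off the presence of an optional file keeps
seeing the stale copy left behind in WORKDIR:

    # "file://b" was removed from SRC_URI, but an old b may still sit in WORKDIR
    SRC_URI = "file://a"

    do_install() {
        install -d ${D}${sysconfdir}
        install -m 0644 ${WORKDIR}/a ${D}${sysconfdir}/a
        # Fragile pattern: behaviour now depends on leftovers from earlier builds
        if [ -e ${WORKDIR}/b ]; then
            install -m 0644 ${WORKDIR}/b ${D}${sysconfdir}/b
        fi
    }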

There are variations on this, such as a conditional append on some
override to SRC_URI but the fundamental problem is one of cleanup when
unpack is to rerun.

The naive approach is then to think "let's just delete WORKDIR" when
running do_unpack. There is the small problem of WORKDIR/temp with logs
in. There is also the pseudo database and other things tasks could have
done. Basically, whilst tempting, it doesn't work out well in practice,
particularly as, whilst unpack might rerun, not all other tasks
might.

I did also try a couple of other ideas. We could fetch into a
subdirectory, then either copy or symlink into place depending on which
set of performance/usability challenges you want to deal with. You could
involve a manifest of the files and then move into position so later
you'd know which ones to delete.

Part of the problem is that in some cases recipes do:

S = "${WORKDIR}"

for simplicity. This means that you also can't wipe out S as it might
point at WORKDIR.

SPDX users have requested a JSON file of files and checksums after the
unpack and before do_patch. Such a manifest could also be useful for
attempting cleanup of an existing WORKDIR so I suspect the solution
probably lies in that direction, probably unpacking into a subdir,
indexing it, then moving into position.
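
A hedged sketch of the cleanup half of that idea, in plain Python (the
function name and manifest layout are invented for illustration): record
what unpack produced, and on the next run remove anything the previous
manifest listed that the fresh unpack did not recreate:

    import json
    import os

    def clean_stale_files(workdir, old_manifest_path, new_files):
        # Delete files recorded by the previous unpack but absent from the new one
        if not os.path.exists(old_manifest_path):
            return
        with open(old_manifest_path) as f:
            old_files = set(json.load(f)["files"])
        for relpath in sorted(old_files - set(new_files)):
            path = os.path.join(workdir, relpath)
            if os.path.exists(path):
                os.remove(path)

    # Usage sketch: new_files would come from indexing the fresh unpack output
    # clean_stale_files(workdir, os.path.join(workdir, "unpack-manifest.json"), new_files)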

Personally, I'd also like to see S = "${WORKDIR}" deprecated and
dropped so that a subdir is always used, just to stop our code getting
too full of corner cases which are hard to maintain.

I've had a few experiments with variations on both issues on various
branches at different times, I just haven't had enough time to
socialise the changes, migrate code and handle the inevitable fallout.

Cheers,

Richard

