On 2022-01-14 14:38, Stefan Herbrechtsmeier wrote:
Hi Mark,

On 2022-01-14 17:58, Mark Asselstine wrote:
On 2022-01-14 11:35, Stefan Herbrechtsmeier wrote:
On 2022-01-14 16:22, Mark Asselstine wrote:
On 2022-01-14 10:05, Stefan Herbrechtsmeier wrote:
On 2022-01-14 15:15, Mark Asselstine via lists.openembedded.org wrote:


On 2022-01-14 07:18, Alexander Kanavin wrote:
If we do seriously embark on making npm/go better, the first step could be to make npm/go write out a reproducible manifest for the licenses and sub-packages that can be verified against two recipe checksums during fetch, and ensures no further network access is necessary. That alone would make it a viable fetcher. Those manifests could contain all needed information for further processing (e.g. versions, and what depends on what etc.) And yes, it's a bundled self-contained approach, but that matches how the rest of the world is using npm.


I can't speak to npm, but for go this is where I wanted to see things go. Just as work was done to avoid unexpected downloads of Python eggs, I always felt the key to improving go integration was some form of automated SRC_URI generation. Once available, this could be leveraged for licensing and such.
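To make the idea concrete, a generator along these lines could walk a project's go.mod and emit candidate SRC_URI entries. This is only a minimal sketch with a hypothetical output format; a real implementation would also need vanity-import resolution, protocol/branch parameters, and pinned SRCREVs:

```python
import re

def gomod_to_src_uri(gomod_text):
    """Translate the require block of a go.mod into candidate SRC_URI
    entries (one git fetch per module). Purely illustrative: real module
    paths need the Go module proxy's case-encoding and vanity-import
    resolution, which this sketch ignores."""
    uris = []
    in_require = False
    for line in gomod_text.splitlines():
        line = line.strip()
        if line.startswith("require ("):
            in_require = True
            continue
        if in_require and line == ")":
            in_require = False
            continue
        m = re.match(r"^([\w./-]+)\s+v([\w.+-]+)", line) if in_require else None
        if m:
            path, name = m.group(1), m.group(1).replace("/", ".")
            uris.append(
                'SRC_URI += "git://%s;name=%s;destsuffix=vendor/%s"' % (path, name, path)
            )
    return uris

example = """module example.com/app

go 1.17

require (
    github.com/pkg/errors v0.9.1
    golang.org/x/sys v0.0.0-20211216021012-1d35b9e2eb4e
)
"""

for entry in gomod_to_src_uri(example):
    print(entry)
```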

Stefan, by the way the reason (a) is not possible is that multiple go applications can use a shared 'library' but different versions (or even different git commit ids).
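For reference, Go makes this explicit: major versions beyond v1 carry a distinct import path, so two major versions of the same library can coexist in one dependency graph, and pseudo-versions can pin exact commits (module names below are hypothetical):

```
require (
    github.com/example/lib    v1.4.2
    github.com/example/lib/v2 v2.0.1
    github.com/example/other  v0.0.0-20211216021012-1d35b9e2eb4e  // exact commit
)
```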

Why is this simpler? The recipes need to list every piece of information about their dependencies. That means you repeat a lot of code and need to change a lot of files to update a single dependency.

We went through this with go recipes in meta-virt. It didn't work. You end up producing a lot of Yocto Project specific files containing information which is already available in other forms. Throw in the multiple-versions issue I described before and you get a mess.

I assume you want to use the versions the project recommends and not a single major version. What makes Go so special that the reasons for a single major version are irrelevant? Why don't we use multiple versions for C/C++ projects?

Sure, go projects can opt to only use released versions of dependencies, but this doesn't always happen. For now, and possibly into the future, we have to accept that.

Not using the versions recommended in a go project will, to a degree, invalidate their testing and validation as well as their CVE tracking.

Who does this work, and for how long will they do it?

This work is done at the project level when we use the upstream configuration verbatim. Only if we decide to deviate in the YP would there be additional work. Which is why I propose we stick to what we know the project is using, testing with and shipping in their releases.



This actually maps to what happens with C/C++ and results in issues like "works for me", when one user happens to build against a slightly different library version than others use and hits an issue.

Because we define the version, this doesn't happen for our users.

The Yocto Project testing coverage is not as extensive as what is done in upstream projects. There are certainly cases where problems exist in an application not due to the application code but due to the library versions in use. Even absent such issues, this requires us to duplicate testing that the golang approach avoids (if we replicate the project configuration).


What's more important, having consistently working software for the end user, or enforcing alternative dependencies to be used to fit a model needed to complete a build?

What do you mean by this?

What is your motivation for dropping the dependency versions outlined in a go project's go.mod and instead forcing it to build against versions of your choosing? My point is that you are attempting to enforce an arbitrary decision that may affect how the software runs, as opposed to sticking with what the upstream project has tested and validated and knows to work. You are worried about the build and not the end user of the application.



The problem with the focus on working software is that security suffers. You rely on others and get into trouble if a vulnerability appears or somebody corrupts their own project.

By not building the upstream project as it is published, you make the YP an exception. This does not improve tracking or fixing vulnerabilities.


Large reviews of content where maintainers will have to waste time determining what needs review and what can be ignored as merely transposed information... Again, the key is automation; that is what makes things simpler.

Without structured information any automation is impossible. Does the Go manifest contain all the information a recipe needs (license, CVE product name)?
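For what it's worth, `go mod download -json` already emits structured per-module records roughly like the following (field values elided/hypothetical). Version and content hashes are there; license and CVE product name are not:

```json
{
    "Path": "github.com/pkg/errors",
    "Version": "v0.9.1",
    "Sum": "h1:...",
    "GoModSum": "h1:...",
    "Zip": "/path/to/module/cache/...zip"
}
```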

It contains much of it and things that are missing would be valid suggestions to bring up with the golang community.

Do you have an example what is missing and how golang should provide it?

What happens if we detect a CVE in a dependency? How can we fix it?

The CVE would be applicable to the upstream project and so with this approach we are in a position to work with the go project to resolve the CVE.

This means you have to work with every project in the dependency chain until the change reaches your root project.


You are phrasing this in an unfair way. Typically dependencies are not linear, nor will a CVE always sit at the far end of a chain, so rarely would a CVE mean working with every project in the dependency chain. Beyond your wording: yes, you will need to work CVE fixes through a set of dependencies, in the same way you would when dealing with recipes.
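Worth noting: while a fix propagates through the chain, Go's `replace` directive lets the root project pin a patched version of a transitive dependency directly in its go.mod (module name and versions below are hypothetical):

```
// Temporary pin until upstream picks up the fixed release.
replace github.com/example/vulnerable => github.com/example/vulnerable v1.2.4
```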

The number of recipes and versions of recipes required to support go applications quickly becomes difficult to manage.

How can one big recipe, instead of many small recipes, solve this problem?

I am not pushing a big recipe. Keep the go recipes much as they are now, but leverage the go tools to generate support artifacts.

What are the artifacts?

These would be YP specific artifacts required to perform things like SBOM generation, etc...

What is needed for the SBOM generation?

Right now in meta-virt we only build up SRC_URI and provide some hints on where to put dependencies so that the main application can be built. We would want some additional information, such as licensing, to complete an SBOM. But I will leave this to those currently working on SBOM to chime in.
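As an illustration of the meta-virt style being described, the generated portion of a go recipe today looks roughly like this (module path, destsuffix layout, and the elided SRCREV are illustrative, not copied from a real recipe):

```
# Generated from the project's go.mod; do not edit by hand.
SRC_URI += "git://github.com/pkg/errors;name=errors;protocol=https;destsuffix=src/import/vendor.fetch/github.com/pkg/errors"
SRCREV_errors = "..."
```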


Does the Go community need these artifacts too?

Not the artifacts themselves, but the base information to generate them would be of interest for the go projects to have, since the data could be reused by other, non-YP projects.

What information is missing beside the license? The problem with the license is that you have to trust the maintainer of the repository or you have to guess the license at every build.

This level of trust is universal. I suspect folks have transposed license details from Pypi without digging into source code to validate licensing.


This is similar to how the near-ubiquitous use of PyPI led Python projects to line up a complete set of metadata that was useful not only in PyPI but elsewhere. We have seen a steady improvement in these matters, from poor execution in Java (to support Maven), to Python, Ruby, and now the new generation of this type of data in Golang and Rust. We can exploit this in YP so as not to be stuck in the box of writing recipes.

Do you really think YP should switch to a distributed approach? Don't Log4Shell, 'colors' and 'faker' show the disadvantages of this approach?

I think a project like YP can only exist at its current scale, or grow larger, using a distributed approach. What I am pushing, and have pushed in the past, is that YP is best served by not repeating work that is already done, freeing up time to improve the things it needs to perform.

What YP can do that individual projects can't is system-level testing: where individual packages are typically tested in isolation, YP has the opportunity to test them in concert. But this is another discussion for another day.


The recipes give you the possibility to fine-tune and override settings and to use a unified style across different languages.

And yet I would guess 95% of the python bitbake recipe files are the same set of 5 lines, and half of those are includes. We aren't removing the possibility of customization, but why write recipes for everything when only a few need to be customized?

MarkA


Again, just my thoughts, I appreciate the back and forth and remain open to being convinced of alternative ideas in this matter.

Thanks for your thoughts.

Regards
   Stefan
View/Reply Online (#1423): https://lists.openembedded.org/g/openembedded-architecture/message/1423