Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-05-06 Thread Panu Matilainen
v6 will have signed payload size information: https://github.com/rpm-software-management/rpm/pull/3017/commits/784bb9076d614da33d29123f5ef6236a57d38463

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
> Thing is, since the payload is typically compressed, offsets are useless because they'd just point to somewhere in the middle of a compression stream, you can't jump to it without reading the whole thing anyhow.
zstd frame headers can carry both compressed and uncompressed sizes, which

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread Demi Marie Obenour
@ddiss fsverity would also be suitable. If you go with this approach, I recommend also including the total length of the payload in the (signed) header, to avoid vulnerabilities where extra data somehow doesn’t get hashed.
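The recommendation above, checking a signed total length alongside the digest, can be sketched in Python. This is only an illustration under stated assumptions: `verify_payload`, `signed_length` and `signed_digest` are hypothetical names, and SHA-256 is chosen for the example rather than taken from the thread.

```python
import hashlib
import io


def verify_payload(stream, signed_length, signed_digest, chunk_size=65536):
    """Hash a payload stream and reject it if its size differs from signed_length.

    Hypothetical helper: signed_length and signed_digest are assumed to come
    from an already verified (signed) header.
    """
    hasher = hashlib.sha256()
    seen = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        seen += len(chunk)
        if seen > signed_length:
            # Trailing data that the signed length does not cover.
            return False
        hasher.update(chunk)
    # A short payload is rejected as well as a digest mismatch.
    return seen == signed_length and hasher.hexdigest() == signed_digest


# Example inputs for the sketch.
payload = b"example payload bytes" * 100
good_digest = hashlib.sha256(payload).hexdigest()
```

With the length check in place, a payload followed by extra bytes is rejected even when the first `signed_length` bytes hash correctly.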

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
@DemiMarie I wonder whether packaging fsverity (Merkle tree) metadata into the rpm header would be an option for performant block-based hashing. It'd also bloat the rpm header, but may have the benefit of allowing the metadata to be reused for post-installation integrity and authenticity

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
I think as long as the zstd frame boundaries align with individual file data segments then it should still work fine with `BTRFS_IOC_ENCODED_WRITE` and reflink, although we'd need to write out individual files immediately rather than staging the complete cpio archive.

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread Panu Matilainen
Thing is, since the payload is typically compressed, offsets are useless because they'd just point to somewhere in the middle of a compression stream. If the files in the payload were individually compressed, it'd seem quite reasonable to have offsets stored. Of course that would likely lose

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread Panu Matilainen
There are no offsets stored, so I don't think there is. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9271000

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
Is there any way to pad (for alignment) between file data segments in the v6 payload? IIUC, the rpm header carries file data payload offset and length information, so perhaps sparse / zero ranges in between might be possible?

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread Panu Matilainen
> The rpm payload format isn't modified, although there's a slight "bending" of the cpio/newc spec to use the filename field for padding. zstandard frames making up the compressed rpm payload explicitly carry both compressed and uncompressed lengths, to allow detection of

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
Makes sense, thanks for the clarification. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9270632

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread Panu Matilainen
The compressed and uncompressed digests are *alternatives*, both cannot be valid at the same time. Rpm calculates both but uses the one that matches (if any). This is to allow freely changing between uncompressed and compressed format of the same content, which "obviously" must be a legit case.
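One way to picture the "alternative digests" rule described above is the following Python sketch. It is a simplified model, not rpm's implementation: the function name `match_payload_digest`, the choice of SHA-256, and `zlib` standing in for the real payload compressor are all assumptions made for the example.

```python
import hashlib
import zlib


def match_payload_digest(stored_payload, compressed_digest, uncompressed_digest):
    """Return which header digest the stored payload matches, or None.

    Hypothetical helper: the payload is hashed exactly as stored and the
    result is compared with both alternatives carried in the header.
    """
    actual = hashlib.sha256(stored_payload).hexdigest()
    if actual == compressed_digest:
        return "compressed"
    if actual == uncompressed_digest:
        return "uncompressed"
    return None


# One set of file data, two valid on-disk representations.
raw = b"cpio archive bytes" * 64
packed = zlib.compress(raw)  # zlib is only a stand-in for the real compressor
digest_packed = hashlib.sha256(packed).hexdigest()
digest_raw = hashlib.sha256(raw).hexdigest()
```

Under this model a package keeps verifying after its payload is switched between the compressed and uncompressed form, because whichever form is stored matches one of the two header digests.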

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
One other thing I noticed is that the rpm header carries the digest of the compressed payload in addition to the uncompressed payload digest. Verification of the compressed payload alongside `BTRFS_IOC_ENCODED_WRITE` is relatively straightforward, but verifying the uncompressed payload would

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
If what you mean is that the payload shouldn't be used before it's been checked against a digest in the cryptographically verified header, then yes. The question is what qualifies as "used". The payload data needs to go somewhere while we're calculating the digest, in which case we could

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread Demi Marie Obenour
Do you plan on doing streaming cryptographic verification? See .

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
I hacked together a proof of concept implementation which uses `BTRFS_IOC_ENCODED_WRITE` to write a zstd-compressed cpio payload directly to disk as-is, from a carefully aligned rpm. Compressed extents are then reflinked to the installation path. presentation:

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-02-21 Thread Panu Matilainen
> To add to this: Plugins should not get access to content that has not been verified yet.
Regardless of the technical requirements, this is a fine example of what a sane API looks like: it's not about just making X somehow possible, but building the right usage into the API. Often easier

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-02-20 Thread Richard Phibel
Thanks a lot for these comments. I think I have a better idea of what I need to do. I will work on a new implementation. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-5055827

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-02-20 Thread Demi Marie Obenour
To add to this: Plugins should not get access to content that has not been verified yet. That means creating a new method of cryptographic verification, one that allows streaming verification of the data.
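A minimal Python sketch of what streaming verification could look like, assuming a list of per-block digests is available from an already verified header. Nothing here is an existing rpm interface: `verified_blocks`, `BlockVerificationError`, the 4096-byte block size and the use of SHA-256 are all hypothetical choices for illustration.

```python
import hashlib
import io


class BlockVerificationError(Exception):
    """Raised when a block does not match its signed digest."""


def verified_blocks(stream, block_digests, block_size=4096):
    """Yield each block only after its digest has been checked.

    Hypothetical helper: block_digests is assumed to come from a signed,
    already verified header, one SHA-256 hex digest per block.
    """
    for index, expected in enumerate(block_digests):
        block = stream.read(block_size)
        if hashlib.sha256(block).hexdigest() != expected:
            raise BlockVerificationError(f"block {index} failed verification")
        yield block
    if stream.read(1):
        raise BlockVerificationError("unexpected data after the last signed block")


# Example inputs: 10240 bytes split into 4096-byte blocks (the last one is short).
data = bytes(range(256)) * 40
digests = [
    hashlib.sha256(data[i:i + 4096]).hexdigest()
    for i in range(0, len(data), 4096)
]
```

A consumer such as a plugin would iterate over `verified_blocks(...)` and therefore never see a block before it has been checked, which is the property asked for above.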

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-02-19 Thread Panu Matilainen
Okay so, trying to give a better idea of what I'm after: The fact that RPMRC_PLUGIN_CONTENTS is still needed is a strong signal that it's not quite right. The way I'm imagining this is that you have a probe function for the content handler, that gets called for each and every package, and for

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-01-23 Thread Richard Phibel
@pmatilai Hello, I was wondering if you had a chance to have a look at my proposal? Please let me know if this is unclear or if you need more details. -- Reply to this email directly or view it on GitHub: https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-4762243

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2023-01-09 Thread Richard Phibel
Hello @pmatilai, I work at Meta with @chantra and @malmond77 . I am implementing the new API for the RPM CoW plugin based on your comment. This is what I have implemented so far: I defined 2 new fields of type `rpmPlugin` (with associated getters and setters) in `rpmte` structure: -

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2022-12-07 Thread David Disseldorp
> The idea of aligning cpio metadata is very interesting. I can see how it'd help initramfs building speed tremendously.
>
> As I understand it, RPM is pretty different: the main difference is that we're trying (fairly hard) not to change the normal format of rpm as found on mirrors for

Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2022-12-06 Thread David Disseldorp
Thanks for working on this - the new CoW extent based approach looks exciting. I have a few comments / questions on the initial proposal...
> Files are converted (“transcoded”) locally during download using /usr/bin/rpm2extents (part of rpm codebase). The format is not intended to be