Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
> Thing is, since the payload is typically compressed, offsets are useless 
> because they'd just point to somewhere in the middle of a compression stream, 
> you can't jump to it without reading the whole thing anyhow.

zstd frame headers can carry both compressed and uncompressed sizes, which 
makes it a little easier to seek around within the compressed payload.
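
To make the size fields concrete, here is a minimal Python sketch that reads the optional `Frame_Content_Size` field (the uncompressed size) from a zstd frame header, following the header layout in RFC 8878. The function name `frame_content_size` is made up for illustration and is not part of rpm or libzstd. Finding where the compressed frame ends is not shown.

```python
import struct

ZSTD_MAGIC = 0xFD2FB528  # stored little-endian as 28 B5 2F FD


def frame_content_size(buf):
    """Return the uncompressed size declared in a zstd frame header, or None."""
    magic, desc = struct.unpack_from("<IB", buf, 0)
    if magic != ZSTD_MAGIC:
        raise ValueError("not a zstd frame")
    fcs_flag = desc >> 6               # Frame_Content_Size_flag (bits 7-6)
    single_segment = (desc >> 5) & 1   # Single_Segment_flag (bit 5)
    dict_id_len = (0, 1, 2, 4)[desc & 3]
    # Flag 0 means "no size field" unless Single_Segment is set (then 1 byte).
    fcs_len = (single_segment, 2, 4, 8)[fcs_flag]
    if fcs_len == 0:
        return None
    # Window_Descriptor (1 byte) is only present when Single_Segment is unset.
    pos = 5 + (0 if single_segment else 1) + dict_id_len
    value = int.from_bytes(buf[pos:pos + fcs_len], "little")
    # The 2-byte encoding stores the size minus 256.
    return value + 256 if fcs_len == 2 else value
```

A `None` result means the frame does not declare its uncompressed size, in which case the size could only be learned by decompressing.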

> If the files in the payload were individually compressed, it'd seem quite 
> reasonable to have offsets stored. Of course that would likely lose 
> something in the compression ratio.

That's mostly what I did in my rpm-rs based poc - files aren't individually 
compressed, but the start and end of individual file data segments correspond 
to zstd frame boundaries, so individual files can be extracted or written via 
`BTRFS_IOC_ENCODED_WRITE`/reflinked without much effort.


> > The rpm payload format isn't modified, although there's a slight "bending" 
> > of the cpio/newc spec to use the filename field for padding. zstandard 
> > frames making up the compressed rpm payload explicitly carry both 
> > compressed and uncompressed lengths, to allow detection of 
> > filesystem-supported I/O sizes.
> 
> Note that the cpio/newc format is on its way out, you don't want to design 
> too much around that. In rpm v4 it's used for packages where all contained 
> files are below 4GB, but v6 will always use the newer rpm-specific format 
> which doesn't carry any file metadata in the payload: 
> https://rpm-software-management.github.io/rpm/manual/format_v6.html#payload

Returning to this, I suppose I could still pad `07070X` payloads relatively 
easily by injecting individual "padding" files into the archive. This would 
still be compatible with V4, although it might be nice to reserve a special 
*padding* path in the V6 spec which can be skipped over (where V4 would still 
extract / install it). Is the V6 draft still open for these kinds of proposals?
cc: @lnussel

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9281616
You are receiving this because you are subscribed to this thread.

Message ID: 
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
@DemiMarie I wonder whether packaging fsverity (Merkle tree) metadata into the 
rpm header would be an option for performant block-based hashing. It'd also 
bloat the rpm header, but may have the benefit of allowing the metadata to be 
reused for post-installation integrity and authenticity protection (assuming 
the files remain read-only and reside on ext4/btrfs).
For my specific `BTRFS_IOC_ENCODED_WRITE` use case, I didn't intend to 
implement it as a plug-in, but as part of core. The implementation doesn't 
require modifications to the existing on-disk format, and the only FS specific 
behaviour is the actual encoded write ioctl calls.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9280921


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
I think as long as the zstd frame boundaries align with individual file data 
segments then it should still work fine with `BTRFS_IOC_ENCODED_WRITE` and 
reflink, although we'd need to write out individual files immediately rather 
than staging the complete cpio archive.
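
As a rough illustration of what "frame boundaries align with file data segments" could mean when building a package, the Python sketch below plans one frame per chunk so that no frame spans two files. The 128 KiB cap is the unencoded-length limit of `BTRFS_IOC_ENCODED_WRITE` noted elsewhere in this thread; `plan_frames` and its tuple layout are assumptions for illustration, not the PoC's actual code.

```python
MAX_UNENCODED = 128 * 1024  # per-ioctl unencoded length limit from the thread


def plan_frames(file_sizes, max_len=MAX_UNENCODED):
    """Return (file_index, offset, length) tuples, one per zstd frame.

    Every frame covers data from a single file only, so each file starts
    and ends on a frame boundary.
    """
    plan = []
    for index, size in enumerate(file_sizes):
        offset = 0
        while offset < size:
            length = min(max_len, size - offset)
            plan.append((index, offset, length))
            offset += length
    return plan
```

Empty files produce no frames in this sketch; they would be created from header metadata alone.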

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9275970


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-30 Thread David Disseldorp
Is there any way to pad (for alignment) between file data segments in the v6 
payload? IIUC, the rpm header carries file data payload offset and length 
information, so perhaps sparse / zero ranges in between might be possible?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9270904


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
Makes sense, thanks for the clarification.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9270632


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
One other thing I noticed is that the rpm header carries the digest of the 
compressed payload in addition to the uncompressed payload digest. Verification 
of the compressed payload alongside `BTRFS_IOC_ENCODED_WRITE` is relatively 
straightforward, but verifying the uncompressed payload would require 
decompression, which we're obviously trying to avoid with this optimization.
I think it's safe to only perform compressed payload verification - zstd frames 
can carry an optional xxHash64 hash, which should be suitable for correctness 
checking if desired.
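
For reference, a small Python sketch of where that optional checksum lives, per RFC 8878: bit 2 of the frame header descriptor signals its presence, and the last four bytes of the frame hold the low 32 bits of the XXH64 hash of the decompressed content. The helper name `content_checksum` is hypothetical, and the sketch assumes `frame` holds exactly one complete frame.

```python
ZSTD_MAGIC_BYTES = b"\x28\xb5\x2f\xfd"


def content_checksum(frame):
    """Return the stored 32-bit content checksum of one zstd frame, or None."""
    if frame[:4] != ZSTD_MAGIC_BYTES:
        raise ValueError("not a zstd frame")
    if not frame[4] & 0x04:  # Content_Checksum_flag (bit 2) is unset
        return None
    # The checksum trails the last block, stored little-endian.
    return int.from_bytes(frame[-4:], "little")
```

Since the hash covers the decompressed data, it can only be compared after decompression, for example by the filesystem on a later read.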

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9270483


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
If what you mean is that the payload shouldn't be used before it's been checked 
against a digest in the cryptographically verified header, then yes. The 
question is what qualifies as "used".
The payload data needs to go somewhere while we're calculating the digest, in 
which case we could "stream" it to Btrfs in its compressed, yet-to-be-verified 
form via `BTRFS_IOC_ENCODED_WRITE`. However, doing so would allow any 
concurrent (pre-verification) reads to effectively fuzz the btrfs/zstd 
decompression code path.
One protection mechanism might be to write out the payload, still compressed, 
as regular extents before attempting to *switch* those regular extents to 
compressed extents on btrfs via some new `copy_file_range()` flag. The 
regular->compressed extent switch would only be attempted following successful 
payload verification.
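
The ordering proposed here (stage first, verify, only then switch) can be sketched in Python as below. Everything in it is an assumption for illustration: SHA-256 stands in for whatever digest the rpm header names, and the `promote` callback stands in for the regular->compressed extent switch, which does not exist as a kernel interface today.

```python
import hashlib
import os
import tempfile


def stage_and_verify(chunks, expected_sha256, promote):
    """Stage payload chunks in a scratch file and promote them only if verified.

    `promote(path)` is a hypothetical hook for the extent switch; it is
    never called when the digest does not match.
    """
    digest = hashlib.sha256()
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as staged:
            for chunk in chunks:
                digest.update(chunk)  # hash while the data is being written
                staged.write(chunk)
        if digest.hexdigest() != expected_sha256:
            return False
        promote(path)
        return True
    finally:
        os.unlink(path)  # the staging file is always cleaned up
```

Until `promote` runs, readers only ever see opaque bytes in a regular file, so nothing unverified reaches the decompression path.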
I'm very much open to any other suggestions :-)

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9270320


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2024-04-29 Thread David Disseldorp
I hacked together a proof of concept implementation which uses 
`BTRFS_IOC_ENCODED_WRITE` to write a zstd-compressed cpio payload directly to 
disk as-is, from a carefully aligned rpm. Compressed extents are then reflinked 
to the installation path.
presentation: 
https://www.youtube.com/live/qNGw8IKaqc0?si=vrLkk8Bi9Odfqm4Z=10325
rpm-rs based source: 
https://github.com/ddiss/rpm/tree/poc-btrfs-zstd-encoded-io-reflink-extract

The rpm payload format isn't modified, although there's a slight "bending" of 
the cpio/newc spec to use the filename field for padding. zstandard frames 
making up the compressed rpm payload explicitly carry both compressed and 
uncompressed lengths, to allow detection of filesystem-supported I/O sizes.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-9269013


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2022-12-07 Thread David Disseldorp
> The idea of aligning cpio metadata is very interesting. I can see how it'd 
> help initramfs building speed tremendously.
> 
> As I understand it, RPM is pretty different: the main difference is that 
> we're trying (fairly hard) not to change the normal format of rpm as found on 
> mirrors for now. There are some very interesting ideas on how to change the 
> upstream format, but in doing so, we'd render all existing servers unable to 
> read the format.

To clarify, aligning cpio data segments for *newly built* rpms shouldn't 
necessarily require any change in format. They'd continue to function the same 
as earlier cpio payload rpms, albeit with some extra zero-padding.

> If we could tolerate the breakage: I'd love to experiment with 
> `BTRFS_IOC_ENCODED_WRITE` which would reduce writes down and eliminate 
> explicit decompression. For clients or filesystems without CoW support: RPM 
> could decompress and write the normal file. I was hoping encoded writes would 
> eliminate the complex path with curl -> librepo -> rpm2extents. I'm not sure 
> you could get data from the network and write encoded data to disk in one 
> pass like we're doing now. Do you have any ideas on how to resolve that 
> challenge?

I'm not too familiar with the rpm on-disk format, but I'd hoped that 
`BTRFS_IOC_ENCODED_WRITE` could be used without a change to the format, by 
having the rpm header parsed during download to determine whether the 
compressed payload could be written as-is. With a cpio payload it'd then be a 
matter of copy_file_range()ing the (optimally aligned) compressed file data 
segments into the destination during installation.

`BTRFS_IOC_ENCODED_WRITE` appears very restrictive at this stage though:
- it requires `CAP_SYS_ADMIN`, so probably isn't a viable option for 
containers, etc.
- ioctl calls need to specify both unencoded and encoded offset+length, meaning 
that we'd still need to parse rpm payload compression metadata
- the ioctl unencoded length can't exceed 128 KiB
- for zstd encoded I/Os, the ioctl data must represent "as a single zstd frame 
with the windowLog compression parameter set to no more than 17"
  - On openSUSE Tumbleweed I see some rpms currently using zstd compression 
level 19. IIUC, Fedora uses the same zstd level
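
Two of the restrictions above (the 128 KiB unencoded length and the windowLog limit of 17) can be checked from the frame header alone. The Python sketch below does that using the `Window_Descriptor` encoding from RFC 8878. The function name `fits_encoded_write` is an assumption, and treating a single-segment frame's window as equal to its unencoded length is a simplification.

```python
import struct

ZSTD_MAGIC = 0xFD2FB528
MAX_WINDOW_LOG = 17
MAX_UNENCODED_LEN = 128 * 1024


def fits_encoded_write(frame, unencoded_len):
    """Check a zstd frame against the size and windowLog limits listed above."""
    magic, desc = struct.unpack_from("<IB", frame, 0)
    if magic != ZSTD_MAGIC:
        raise ValueError("not a zstd frame")
    if unencoded_len > MAX_UNENCODED_LEN:
        return False
    if desc & 0x20:
        # Single_Segment: no Window_Descriptor, the window is the content itself.
        window_size = unencoded_len
    else:
        descriptor = frame[5]
        base = 1 << (10 + (descriptor >> 3))      # exponent in the top 5 bits
        window_size = base + (base // 8) * (descriptor & 7)  # 3-bit mantissa
    return window_size <= 1 << MAX_WINDOW_LOG
```

What matters in this check is the window each frame declares rather than the compression level as such, although high levels tend to select larger windows unless the encoder is told otherwise.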

> Adding cpio metadata, along with a "null" compression type could help 
> eliminate the change in `fsm.c` on how the payload is iterated. Note that 
> `rpm2extents` does not (and cannot) touch headers without invalidating 
> signatures, so the change in compression type is inferred and handled in the 
> plugin.
> 
> Lastly, there's another optimization that would be lost in adopting cpio 
> formatting: content de-duplication. I'm not sure how important this is tho in 
> the big picture, so it might be a worthwhile tradeoff.

Indeed. FWIW, I think your extent based approach offers a lot of worthwhile 
benefits, but just wanted to point out that something similarly CoW friendly 
(although less efficient) is possible without necessarily requiring invasive 
changes :-)

> Thanks for the feedback! Matthew.

Thanks for the response!


-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-4337009


Re: [Rpm-maint] [rpm-software-management/rpm] API improvement to accommodate for RPM CoW (PR#1470) (Discussion #2057)

2022-12-06 Thread David Disseldorp
Thanks for working on this - the new CoW extent based approach looks exciting. 
I have a few comments / questions on the initial proposal...

> Files are converted (“transcoded”) locally during download using 
> /usr/bin/rpm2extents (part of rpm codebase). The format is not intended to be 
> “portable” - i.e. copying the files from the cache is not supported.
>
> Regular RPMs use a compressed .cpio based payload. In contrast, extent based 
> RPMs contain uncompressed data aligned to the fundamental page size of the 
> architecture, e.g. 4KiB on x86_64. This alignment is required for 
> FICLONERANGE to work. Only files are represented in the payload, other 
> directory entries like symlinks, device nodes etc are constructed entirely 
> from rpm header information. Files are referenced by their digest, so 
> identical files are de-duplicated.

I just wanted to highlight that, although not optimal, cpio archives can still 
be used alongside reflinks if the `newc` spec is *bent* a little to provide 
file data-segment alignment. The kernel initramfs archive format is quite 
static, so for [dracut-cpio](https://github.com/dracutdevs/dracut/pull/1531) 
(`dracut --enhanced-cpio`) reflinks I added functionality to zero-pad file 
names so that subsequent file data segments are block aligned. The kernel and GNU 
cpio extractors handle this fine, with the caveat that padding can't exceed 
`PATH_MAX`.
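
The padding arithmetic can be sketched in Python as follows. This is an illustration of the idea, not the dracut-cpio code: the helper names, the 4096-byte alignment default and the zeroed metadata fields are assumptions. A `newc` entry is a 110-byte ASCII header, then the NUL-terminated name, then the file data, so extra NULs after the name push the data to the next block boundary.

```python
HDR_LEN = 110    # "070701" magic plus 13 fields of 8 hex digits each
PATH_MAX = 4096


def padded_namesize(entry_off, name, align=4096):
    """Return a namesize that makes the file data start on an `align` boundary."""
    base = len(name) + 1                     # name plus its NUL terminator
    data_off = entry_off + HDR_LEN + base
    namesize = base + (-data_off % align)    # extra NULs up to the boundary
    if namesize > PATH_MAX:
        raise ValueError("padding would exceed PATH_MAX")
    return namesize


def newc_entry(entry_off, name, data, align=4096, mode=0o100644):
    """Build one newc entry whose data segment is block aligned."""
    namesize = padded_namesize(entry_off, name, align)
    # ino, mode, uid, gid, nlink, mtime, filesize, devmajor, devminor,
    # rdevmajor, rdevminor, namesize, check
    fields = [0, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, namesize, 0]
    header = b"070701" + b"".join(b"%08X" % f for f in fields)
    name_field = name + b"\0" * (namesize - len(name))
    tail = b"\0" * (-len(data) % 4)          # newc pads data to 4 bytes
    return header + name_field + data + tail
```

For example, `newc_entry(0, b"usr/bin/foo", b"hello")` places `hello` at archive offset 4096. The `ValueError` branch corresponds to the `PATH_MAX` caveat: some entry offsets cannot be fixed up with name padding alone.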

On a separate note, Btrfs recently gained support for writing compressed 
extents directly to disk via the new `BTRFS_IOC_ENCODED_WRITE` ioctl. IIUC, 
Fedora also recently changed to using zstd by default for Btrfs root and rpms. 
Have you considered using `BTRFS_IOC_ENCODED_WRITE` functionality instead of 
performing decompression during download?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/discussions/2057#discussioncomment-4327609