Daniel Kahn Gillmor wrote:
> Would it be possible to generate the GPT GUID based on a digest of the
> contents of the ISO itself?
> That would give you identical GUIDs for identical ISOs, and distinct
> GUIDs for ISOs that vary in any way
The general use case of mkisofs is to produce the ISO as a sequential stream
with no need to revisit it.
So we could only use the first 528 bytes of the ISO before we have to
determine the GUID. At byte 528 begins the CRC32 of the GPT header block,
which already depends on the disk GUID.
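To illustrate where the number 528 comes from, here is a small Python sketch. The field offsets are those of the GPT header as i know them from the UEFI specs; the header built here is deliberately incomplete (all other fields stay zero) and the function name is made up for this example.

```python
import struct
import uuid
import zlib

SECTOR = 512             # the GPT header block sits at LBA 1 (512-byte blocks)
HEADER_CRC_OFFSET = 16   # HeaderCRC32 field inside the header block
DISK_GUID_OFFSET = 56    # DiskGUID field inside the header block
HEADER_SIZE = 92         # number of header bytes covered by the CRC

# Absolute byte address of the header CRC in the image: 512 + 16
CRC_ADDRESS = SECTOR + HEADER_CRC_OFFSET


def toy_header_crc(disk_guid):
    """CRC32 of a mostly empty GPT header that carries the given disk GUID.

    Illustrative only: LBAs and partition array fields are left at zero.
    """
    header = bytearray(HEADER_SIZE)
    header[0:8] = b"EFI PART"
    struct.pack_into("<I", header, 8, 0x00010000)    # revision 1.0
    struct.pack_into("<I", header, 12, HEADER_SIZE)
    # The CRC field itself (bytes 16..19) is zero while the CRC is computed.
    # UEFI stores the first three GUID components byte-swapped: bytes_le.
    header[DISK_GUID_OFFSET:DISK_GUID_OFFSET + 16] = disk_guid.bytes_le
    return zlib.crc32(bytes(header)) & 0xFFFFFFFF


print(CRC_ADDRESS)
print(hex(toy_header_crc(uuid.UUID(int=1))))
print(hex(toy_header_crc(uuid.UUID(int=2))))
```

The two CRC values differ, which is the point: the bytes at 528 cannot be written before the disk GUID is known.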
In the worst case there is only the entropy of the ISO image size counted
in blocks of 2048 bytes (although published as a count of 512-byte blocks
in the partition table of the MBR). It is usually in the range of a few
hundred thousand and often aligned to full megabytes.
So this entropy is too small to be used for GUIDs by default.
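A rough estimate of how little that is, as a Python sketch. The size range of 100,000 to 500,000 blocks is only an assumption to stand for "a few hundred thousand", and the function name is made up. For comparison, a random GUID according to RFC 4122 has 122 freely chosen bits.

```python
import math

BLOCK_SIZE = 2048
MIB_IN_BLOCKS = (1024 * 1024) // BLOCK_SIZE   # 512 blocks per MiB


def size_entropy_bits(min_blocks, max_blocks, alignment_blocks=1):
    """Upper bound on the entropy of the image size, in bits.

    Assumes every permissible size in the range is equally likely,
    which is optimistic for real ISO images.
    """
    candidates = (max_blocks - min_blocks) // alignment_blocks + 1
    return math.log2(candidates)


# Assumed range, see above.
unaligned = size_entropy_bits(100_000, 500_000)
aligned = size_entropy_bits(100_000, 500_000, MIB_IN_BLOCKS)

print("unaligned sizes: %.1f bits" % unaligned)
print("MiB-aligned sizes: %.1f bits" % aligned)
print("random RFC 4122 GUID: 122 bits")
```

Under these assumptions the size yields fewer than 19 bits, and fewer than 10 bits if the images are aligned to full megabytes.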
We would need an extra option for xorriso anyway, by which the user
can tell that reproducibility matters more than GUID quality.
The new option --gpt_disk_guid offers the opportunity to automatically
use the modification time which GRUB deems sufficient to identify the
device where it was booted from.
If the poor modification-time GUIDs are considered insufficient, then I
advise generating a dedicated GUID for the reproducible ISO by e.g.
and submitting it to the xorriso runs via option --gpt_disk_guid.
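One possible way to follow this advice, sketched in Python. The idea is to create the GUID once, record it in the build recipe, and hand the recorded text to every run. Both function names are made up for this example, and it is an assumption here that --gpt_disk_guid accepts the GUID in RFC 4122 text form; please check the man page of your xorriso version.

```python
import uuid


def new_guid_string():
    """Return a fresh random (RFC 4122 version 4) GUID as text."""
    return str(uuid.uuid4())


def gpt_disk_guid_args(guid_text):
    """Build the option pair for a xorriso run from a recorded GUID.

    uuid.UUID() raises ValueError if the recorded text is malformed,
    so a typo in the build recipe is caught before the ISO is produced.
    """
    parsed = uuid.UUID(guid_text)
    return ["--gpt_disk_guid", str(parsed)]


# Run this once and store the printed value next to the build scripts.
recorded = new_guid_string()
print(recorded)

# Every reproducible build then appends these two words to its
# xorriso -as mkisofs arguments.
print(gpt_disk_guid_args(recorded))
```

The GUID is then as good as a random one, and the ISO stays reproducible as long as the recorded value is part of the build input.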
> I don't understand well enough how GPT
> interacts with ISOs to be able to sketch out the details, but if there
> is a way to look at
Debian's ISOs actually hardly need their GPT because it is not announced
properly by a Protective MBR. (A feature of mjg's EFI isohybrid layout.)
grub-mkrescue ISOs really need their GPT for booting via EFI from USB stick.
This layout was prescribed by Vladimir Serbinenko.
The main risk I can imagine is that at boot or mount time another device
with the same GUIDs is present. Whether this confuses boot firmware, boot
loaders, or mount drivers is not known to me.
The specifications of GUID, in UEFI or RFC 4122, clearly state that they
are meant to be globally unique. The question is how small the world may be
in which this uniqueness has to be upheld.
The poor GUIDs derived from modification-time look like
A random one produced by xorriso from /dev/urandom looks like
> (what hash function to use? it probably doesn't even need
> cryptographically secure
Well, it just should not waste the little entropy we have.
We can choose from the UEFI specs, which prescribe a deterministic GUID
from MAC address and fine-grained clock, or RFC 4122, which
offers a choice between the UEFI one, pseudo-random, or crypto-grade
hashing of user input.
I understand the cryptographic demand is only to obscure the user
input, in case it tells too much about the local machine.
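The three kinds of GUID mentioned above can be tried out with Python's uuid module, which implements the RFC 4122 variants. This is only a sketch for comparison: the function name and the example.org URL are made up, and nothing here claims to be what xorriso does internally.

```python
import uuid


def name_based_guid(user_input):
    """RFC 4122 version 5: SHA-1 hash of a namespace plus user input.

    Same input, same GUID. So this variant is reproducible, and the
    hash hides the input text from readers of the GUID.
    """
    return uuid.uuid5(uuid.NAMESPACE_URL, user_input)


# Version 1: node id (normally the MAC address) plus a 100 ns clock.
time_based = uuid.uuid1()

# Version 4: (pseudo-)random bits, different on every call.
random_based = uuid.uuid4()

# Version 5: hypothetical identifier of an ISO release as user input.
hashed = name_based_guid("https://example.org/images/demo-1.0-amd64.iso")

for label, guid in (("time", time_based),
                    ("random", random_based),
                    ("hash", hashed)):
    print("%-6s version %d  %s" % (label, guid.version, guid))
```

Only the hashed variant yields the same GUID on every run. Its uniqueness is only as good as the uniqueness of the user input.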
Have a nice day :)
Reproducible-builds mailing list