Hello,
For those not at Guix Days:
We have split into groups discussing various topics. Each group is
collecting notes on its discussion. I am starting this thread as a
place for these notes, to be distributed as necessary.
To kick things off, I've attached my notes on the discussion of
"distributed substitutes", which we clarified referred to
participatory/peer-to-peer substitutes. I tried to group things
conceptually based on where conversation ended up, but "conclusions"
per se are all under "Next Steps" and "Open Questions".
Thanks,
Juli
#+title: Participatory (p2p) Substitutes
* Angles
** Building
** Delivering
* Why
** substitute servers are slow
** resources
*** compute
**** speed
**** cost
*** storage
** resilience
These problems grow rapidly as the number of users, packages, or both increases.
* Problems
Source code is easier because the hash of the source is known in advance, so a
download can be verified cryptographically. By contrast, verifying a binary
cryptographically requires compiling it yourself, so we need to trust whoever
supplies binary substitutes.
** Trust
Someone needs to supply the hash. Currently, this is the central Guix build
farm.
** Content-addressed downloads
Need architecture for distributed (network topology) delivery. Can already
content-address sources and binaries; just need a trusted hash. That is, the
same problem applies to source and substitutes.
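
To make the "just need a trusted hash" point concrete, here is a minimal Python sketch of checking untrusted content against a hash obtained from a trusted party. It assumes a hex-encoded SHA-256 digest; the function names are hypothetical, and this is not how Guix itself encodes or signs substitute metadata.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a file-like object in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def matches_trusted_hash(stream, trusted_hex):
    """Return True only if the content hashes to the digest we already trust.

    Assumption: trusted_hex is a hex-encoded SHA-256 digest that was
    obtained from a party we trust; the stream itself may come from anyone.
    """
    return sha256_of_stream(stream) == trusted_hex.lower()


# The content can be served by any peer; only the hash has to be trusted.
payload = b"example substitute content"
trusted = hashlib.sha256(payload).hexdigest()
print(matches_trusted_hash(io.BytesIO(payload), trusted))      # True
print(matches_trusted_hash(io.BytesIO(b"tampered"), trusted))  # False
```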
** Nar files
Potentially inefficient?
** Obligations on users
Users may be expected to contribute bandwidth, and potentially build time, back
to the network.
** Privacy
What if there is private info in ~/gnu/store~, eg because Guix Home manages
dotfiles?
** Granularity
1. Different people have different security/privacy models.
2. People may want to use different transport mechanisms
* Solutions
We quickly shifted to envisioning an opt-in network of distributors, eg run as
a Guix System service. The problems above are addressed in turn below:
** Trust
1. a server/user you choose to trust gives you a hash; you can get this
   substitute from any server and hash it yourself.
   - need to trust central server
     + can talk to operator
2. apply ~guix challenge~ somehow
3. distribute trust over multiple nodes, eg strongly trust a few nodes, weakly
   trust more, test hashes against each other
   - could incorporate this into existing substitute certification infra
   - existing research in eg Tor exit node trust
4. zero-knowledge proof
   - expensive
   - more variables = more expensive
   - thus, likely not feasible
The conversation tended towards consensus-based trust (trusting a hash if a
plurality of trusted nodes agree on it) combined with a "watchdog" application
of ~guix challenge~.
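
A rough Python sketch of the consensus idea, under assumptions that were not settled in the discussion: each trusted node carries a numeric weight (strongly trusted nodes weigh more), a hash is accepted only if it holds a strict plurality of the weight and reaches a quorum, and nodes that disagree with the accepted hash are flagged as candidates for a ~guix challenge~ style check. The function names, weights and quorum value are all hypothetical.

```python
from collections import defaultdict


def consensus_hash(reports, weights, quorum=2.0):
    """Pick the hash backed by a strict plurality of trust weight.

    reports: node name -> hash that node reported for one store item.
    weights: node name -> trust weight (hypothetical scale).
    Returns None when there is no report, a tie, or too little weight.
    """
    totals = defaultdict(float)
    for node, digest in reports.items():
        totals[digest] += weights.get(node, 0.0)  # unknown nodes carry no weight
    if not totals:
        return None
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    best_hash, best_weight = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == best_weight
    if tied or best_weight < quorum:
        return None
    return best_hash


def dissenting_nodes(reports, accepted):
    """Nodes whose report differs from the accepted hash (watchdog candidates)."""
    return sorted(node for node, digest in reports.items() if digest != accepted)


weights = {"farm-a": 2.0, "farm-b": 2.0, "peer-c": 0.5}
reports = {"farm-a": "h1", "farm-b": "h1", "peer-c": "h2"}
accepted = consensus_hash(reports, weights)
print(accepted)                              # h1
print(dissenting_nodes(reports, accepted))   # ['peer-c']
```

Returning None on a tie or a missed quorum is a deliberately conservative choice: the caller would then fall back to building locally or asking more nodes.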
** Content-addressed downloads
1. BitTorrent
   - definitely tackles bandwidth usage
   - tends towards "supernodes" which advertise lots of smaller nodes
     + could run this on Guix infra
2. IPFS
3. (bespoke) OCapN/Spritely
   - could facilitate granular control of access
   - Spritely envisions distributed storage over ERIS, which is encrypted and
     complicates this space
4. ~guix publish~
** Nar files
** Obligations on users
1. have ~guix publish~ already
** Privacy
1. do not advertise hashes, only respond to requests for specific hashes
   - there is an attack on this (Tahoe-LAFS encountered this?)
2. only advertise specific substitutes eg what's in the core Guix channel
   - could be used to triangulate what software someone uses by watching what
     they request
     + already the case if monitoring requests to central substitute server
     + could download and distribute software you don't use
3. may not solve all privacy issues, but must communicate privacy concerns to
   users (ie informed consent)
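
The first two options can be sketched together in Python: a node that never enumerates its store and only answers requests for hashes on an explicit allowlist. The class and method names are hypothetical, and this illustrates the policy only; it does not address the attack mentioned above or side channels such as response timing.

```python
class SubstituteResponder:
    """Serve store items only on request, and only if explicitly shareable."""

    def __init__(self, store, shareable):
        self._store = dict(store)              # hash -> content held locally
        self._shareable = frozenset(shareable) # hashes the user opted in to share

    def advertise(self):
        """Never enumerate holdings; peers must already know what to ask for."""
        return []

    def request(self, digest):
        """Return content for an allowlisted hash, else None.

        "Not held" and "held but private" yield the same answer, so a
        response alone does not reveal which case applies.
        """
        if digest not in self._shareable:
            return None
        return self._store.get(digest)


node = SubstituteResponder(
    store={"h-hello": b"hello package", "h-dotfiles": b"private config"},
    shareable={"h-hello"},
)
print(node.request("h-hello"))     # b'hello package'
print(node.request("h-dotfiles"))  # None
print(node.request("h-unknown"))   # None
```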
** Granularity
*** Privacy
1. opt-in to share specific nars or equiv (see above)
*** Delivery
1. provide abstract interface to a network
* Next steps
We already have content-addressed distribution.
1. more central substitute servers and mirrors around the world
2. abstract API for decentralized substitute delivery
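
As one possible shape for such an abstract API, here is a small Python sketch: a transport interface with a single fetch operation, plus a helper that tries several transports in order and accepts the first result whose hash matches. All names are hypothetical, a hex-encoded SHA-256 digest is assumed as the content address, and the in-memory transport merely stands in for real backends such as BitTorrent, IPFS or ~guix publish~.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Iterable, Mapping, Optional


class SubstituteTransport(ABC):
    """Hypothetical interface that each delivery mechanism would implement."""

    @abstractmethod
    def fetch(self, digest: str) -> Optional[bytes]:
        """Return the content addressed by digest, or None if unavailable."""


class InMemoryTransport(SubstituteTransport):
    """Stand-in backend for illustration; a real one would talk to a network."""

    def __init__(self, blobs: Mapping[str, bytes]):
        self._blobs = dict(blobs)

    def fetch(self, digest: str) -> Optional[bytes]:
        return self._blobs.get(digest)


def fetch_substitute(digest: str,
                     transports: Iterable[SubstituteTransport]) -> Optional[bytes]:
    """Try each transport in order; accept only content matching the digest."""
    for transport in transports:
        data = transport.fetch(digest)
        if data is not None and hashlib.sha256(data).hexdigest() == digest:
            return data
    return None  # caller can fall back to a central server or a local build


content = b"some nar contents"
address = hashlib.sha256(content).hexdigest()
lying_peer = InMemoryTransport({address: b"corrupted"})
honest_peer = InMemoryTransport({address: content})
print(fetch_substitute(address, [lying_peer, honest_peer]) == content)  # True
```

Because verification happens in the helper rather than in each backend, a transport does not need to be trusted; only the hash does.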
* Open questions
1. trust mechanism
2. exact delivery mechanism
3. who does the work