August 24, 2023 3:57 PM, "Martin Baulig" <mar...@baulig.is> wrote:

> Hello,
> 
> About 2–3 months ago, I got an initial prototype of Bacula working on GNU 
> Guix. I had the Bacula
> Director, two separate Storage Daemons and the Baculum web interface running 
> in a GNU Guix VM on my
> Synology NAS.

I had to look it up. Apparently Bacula is a tool for backing up computers on a 
network.  Sounds cool!
https://en.wikipedia.org/wiki/Bacula

> At some point, I would really love to upstream these changes, but it's quite 
> a complex
> configuration - and I also had to do quite a few refactorings and clean-ups 
> for this to pass my
> personal quality standards.
> 
> One issue I had to deal with is that Bacula heavily relies upon clear-text 
> passwords in its various
> configuration files. To communicate between its different components, it uses 
> TLS with Client
> Certificates in addition to passwords. So in addition to writing clear-text 
> passwords into various
> configuration files, the X509 private keys, DH parameters, etc. also need to 
> be installed into
> appropriate directories.
> 
> I came up with quite an elegant solution for this problem - and introduced 
> three new services and
> an extension.
> 
> * My "guix secrets" tool provides a command-line interface to maintain a 
> "secrets database"
> (/etc/guix/secrets.db) that's only accessible to root. It can contain simple 
> passwords, arbitrary
> text (like for instance X509 certificates in PEM format) and binary data.

I know Guix has been wanting to figure out how to support services that need 
passwords in their configuration
files.  This sounds like it could work!

> * The problem with the standard activation service is that it runs early in 
> the boot process and
> all activation actions are run in a seemingly random order; there isn't a way 
> to declare any real
> dependencies. Any failures could possibly prevent the system from fully 
> booting up.
> 
> I created a new "activation-tree-service-type" - currently experimental and a 
> bit in a refactoring
> stage. It creates a separate one-shot Shepherd service for each activation 
> action, and you can
> declare dependencies between them.
> 
> Since it's using normal Shepherd services underneath the hood, you could for 
> instance depend on
> user-homes and the network being up, so you could SSH in and use GNU Emacs to 
> fix any issues.
> 
> And any arbitrary Shepherd service could also depend on some of these actions 
> - such as for
> instance the various Bacula services.
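
The dependency idea here can be illustrated with a small Python sketch. This is only my reading of the description, not your implementation, and Shepherd resolves requirements itself; the action names below are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical activation actions, each mapped to the actions it depends on.
actions = {
    "networking": set(),
    "create-service-accounts": set(),
    "mount-share": {"networking"},
    "install-secrets": {"create-service-accounts"},
    "bacula-director": {"install-secrets", "mount-share"},
}


def activation_order(deps):
    """Return one valid order in which every action follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(deps).static_order())


order = activation_order(actions)
```

With one-shot services the service manager does this ordering for you; the sketch only shows why declared dependencies remove the "seemingly random" ordering problem.
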
> 
> * Then I created "service-accounts-service-type" that extends the standard 
> account creation with
> the ability to also create home directories, run and PID directories and the 
> log-file. It's mostly
> used under the hood.
> 
> * Finally, "secrets-service-type" depends on all of the above to do its work.
> 
> It takes a template file - which is typically interned in the store - 
> containing special "tokens"
> that tell it which keys to look up from the secrets database.
> 
> It uses the above mentioned service-accounts-service-type to specify where 
> the substituted
> configuration file should be installed, ensuring that the directory has been 
> set up with
> appropriate permissions.
> 
> And then it substitutes the special tokens from the template file with the 
> actual secrets. For
> instance "@password:foo@" would be substituted with a password entry called 
> "foo". For arbitrary
> text or binary data, the template would contain something like "@blob:data@" 
> - this will be
> substituted with the full path name of a file where the actual data will be 
> written to.
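
If I follow the token scheme correctly, the substitution step could be sketched roughly like this in Python. This is a guess at the mechanics, not your actual code: the function name, the dictionary standing in for the secrets database, and the blob directory are all assumptions, and passwords are assumed to be stored as text.

```python
import os
import re

# Matches tokens such as @password:foo@ or @blob:data@.
TOKEN = re.compile(r"@(password|blob):([A-Za-z0-9_.-]+)@")


def render_template(template, secrets, blob_dir):
    """Substitute secret tokens in `template`.

    @password:NAME@ becomes the secret text itself; @blob:NAME@ becomes the
    full path of a file (created in `blob_dir`) that holds the secret data.
    """
    def substitute(match):
        kind, name = match.group(1), match.group(2)
        value = secrets[name]  # raises KeyError for an unknown secret
        if kind == "password":
            return value
        path = os.path.join(blob_dir, name)
        data = value if isinstance(value, bytes) else value.encode()
        # Create the file readable and writable by its owner only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return path

    return TOKEN.sub(substitute, template)
```

For example, a template line like `Password = "@password:foo@"` would come out with the stored password in place of the token, while `@blob:data@` would be replaced by the path of the file that was written.
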
> 
> * * * *
> 
> All of the above was mostly working in early August, just one problem 
> remained:
> 
> I do not want to store any of the actual data inside the VM, but rather use a 
> folder on the NAS
> itself. Even the PostgreSQL database lives on an NFS-mounted volume. The 
> problem is quite simply
> that Synology's Virtual Machine Manager software does not provide any way of 
> exporting or importing
> volumes. You cannot even move them between VMs. And I really don't want to 
> tie my data to the
> lifecycle of the VM.
> 
> Using traditional NFS (either version 2 or 3) worked perfectly fine and since 
> this is a very
> locked-down environment, encrypting the NFS traffic really isn't needed. 
> Like, an attacker that
> got access to either the NAS or the VM running inside it would already have 
> all the data anyway.
> 
> However, I wanted to give it a try regardless and see whether I could get 
> SSSD working with GNU
> Guix.
> 
> And this is where the nightmares began!
> 
> Firstly, I had to make a few changes to GNU Guix itself, most of which I'd 
> like to upstream. The
> code is in my public GitLab repo, but it's a bit of a mess right now, and 
> I'll need at least a day
> or two to clean it up. But I also ran across a couple of questions and issues.
> 
> * GNU Guix is currently using nfs-utils 2.4.3, whereas 2.6.3 is currently the 
> latest version. We
> don't need to upgrade, but I would like to backport one change, affecting a 
> single function. This
> is needed for idmap-daemon to work with arbitrary plugins.
> 
> Back in nfs-utils 2.4.3, the plugin search path was hard-coded - and since 
> that hard-coded path
> will be inside the store, other packages can't add anything to it.
> 
> In later versions, this was changed to attempt to load the plugin from the 
> library search path
> first, prior to falling back to the hard-coded default.
> 
> * Once nfs-utils is patched, rpc.idmapd then needs to be started with 
> LD_LIBRARY_PATH set to the
> plugin directories - similar to how it's done with nscd.
> 
> I added a few new fields to idmap-service-type and nfs-service-type for this.
> 
> It also looks like you can't instantiate idmap-service-type without 
> nfs-service-type due to what
> seems to be a bug.
> 
> It's currently using
>> (extend (lambda (config values) (first values)))
> which fails if there isn't any previous value. Replacing that with 
> last-extension-or-cfg (from
> "(gnu home services xdg)") fixes that issue.
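
To spell out the failure mode for other readers, here is a Python analogue of the two behaviours. The function names are mine; the real code is the Scheme `extend` procedure quoted above.

```python
def first_extension(config, extensions):
    """Analogue of (first values): fails when no service extends this one."""
    return extensions[0]  # IndexError on an empty list


def last_extension_or_config(config, extensions):
    """Use the last extension if there is one, else keep the configuration."""
    return extensions[-1] if extensions else config
```

With no extensions at all, the first version raises an error while the second simply returns the original configuration, which is the case where idmap-service-type is instantiated without nfs-service-type. Note that the replacement also picks the last extension rather than the first when several are present.
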
> 
> * For the sssd package, this is currently built without nfsidmap support and 
> has its sysconfdir
> set to /etc.
> 
> Was there a particular reason for this? I suppose nfsidmap support was 
> disabled because it
> previously did not work?
> 
> As for its sysconfdir - there isn't really anything confidential in the 
> sssd.conf file, so I would
> rather have that interned in the store if possible. This requires a little 
> patch to sssd, though,
> to disable its permission checks on the config file.
> 
> * For the realmd package - it currently does not compile on GNU Guix master. 
> All that's needed is a
> small fix to the configure script. GNU Guix master uses a newer version of 
> the GNU C Library (glibc) - there is no
> "__res_querydomain" in -lresolv anymore, that's now called "res_querydomain" 
> and is in glibc.
> 
> * To make realmd actually work, it needs a configuration file.
> 
> Could we possibly either move it from (gnu packages admin) into (gnu packages 
> sssd), or add a
> "realmd-sssd" package with a standard configuration file? A very simple 
> config file will work fine,
> but it needs to contain the store paths of adcli, sssd and sss_cache.
> 
> These are the parts that I got working so far. You can join the domain, 
> acquire Kerberos tickets,
> mount the network share - and access is handled by the server according to 
> the current user's
> Kerberos credentials. You also don't need to copy around any keytabs or 
> anything for that, as would
> be required with Samba. This is just really cool.
> 
> However, here's where the problems start:
> 
> * I couldn't figure out how to use gssproxy - setting that environment 
> variable doesn't seem to be
> doing anything, I ran the various daemons with strace and nothing was ever 
> attempting to use the
> proxy. Then, I looked at the mit-krb5 source code as well as the nfs-utils 
> and gss-daemon source
> code and couldn't find any reference to that environment variable either.
> 
> Is it possible that Fedora / Red Hat is using some custom patches in their 
> distribution?
> 
> * I finally worked around that by installing client keytabs for my service 
> principals, using my
> secrets service.
> 
> Works great for local accounts, but using domain accounts gave me quite a bit 
> of a headache!
> 
> Let's say "storage" is a domain account. I can do "getent passwd storage" and 
> it works. I can do
> "chown storage foo" on a local file system as root and then "ls -l foo" 
> shows me the correct
> owner.
> 
> On the mounted network share, root is mapped to the machine credential, so I 
> have to create and
> chown things on the server. After a bit of starting / restarting nscd, sssd 
> and gss-daemon, file
> permissions will also show up correctly in "ls -l".
> 
> I can also do "su storage" as root and that works (after I create the home 
> directory); "su -s
> /bin/sh storage -c id" works fine.
> 
> * In guile, I can also do (getent "storage") and that works.
> 
> However, it fails when I put that inside a G-Exp - to run it as part of a 
> one-shot Shepherd
> service. I can open a pipe to "su -s /bin/sh storage -c 
> /gnu/store/...-coreutils-../bin/id" and
> that works.
> 
> One would assume that (getent) won't work inside a G-Exp because it doesn't 
> have access to NSCD /
> SSSD.
> 
> But why can I (invoke) "su" inside that same G-Exp and it works fine?
> 
> My gut feeling tells me that this "su pipe" thing might not be the most 
> reliable thing to depend
> on.
> 
> The reason I need the domain account's UID is to put the Kerberos client 
> keytab into
> "/var/krb5/user/<UID>/client.keytab". Maybe there's a way to use the username 
> instead? I ran an
> "strace" on the gss-daemon and it currently only looks in that <UID> 
> directory.
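
For what it's worth, the name-to-UID step on its own looks like this in Python. It is only a sketch: `client_keytab_path` is a made-up helper, and `pwd.getpwnam` goes through the system's normal account lookup, so whether it sees domain accounts from inside a Shepherd service is exactly the open question you describe.

```python
import pwd


def client_keytab_path(username, lookup=pwd.getpwnam):
    """Build the per-UID client keytab path for `username`.

    `lookup` defaults to the system account database; it raises KeyError
    when the account cannot be resolved.
    """
    uid = lookup(username).pw_uid
    return f"/var/krb5/user/{uid}/client.keytab"
```

The `lookup` parameter is only there so the lookup can be swapped out, for example with something that parses the output of the "su ... id" pipe.
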
> 
> * PostgreSQL - ... yeah, here it gets interesting!
> 
> The first question here is which user account to use - and whether to create 
> a local or domain
> account.
> 
> It seems like using a local "postgres" account might be the most robust thing 
> to do. Any access to
> the mounted network share will be mapped to whichever Kerberos principal I 
> place in the
> "client.keytab".
> 
> Either way, the local "root" user will not have any access to the data 
> directory - and the local
> "postgres" user will only have access to it once SSSD is up and running and 
> it's mounted.
> 
> I have an "activation-tree-service-type" action to mount the share once SSSD 
> is ready and that
> seems to be working fine on system boot.
> 
> However, for PostgreSQL, I'd probably have to provide my own service that 
> uses the same activation
> logic - not create the data directory at all, create the local state and pid 
> directory and log-file
> once we have the user's UID (which is trivial for a local "postgres" account, 
> but more complicated
> for domain accounts).
> 
> * Finally, each of Bacula's service accounts then also needs client keytabs 
> installed and started
> in the correct order.
> 
> * * * *
> 
> Here, I start to wonder whether it's even worth the hassle. To summarize, to 
> use Kerberized NFSv4,
> all of the following is needed:
> 
> * Some patches to GNU Guix (most of which can probably be upstreamed 
> regardless).
> * Complicated activation actions, to put client keytabs in the correct 
> places, with the correct
> permissions.
> * Strict, particular order in which services need to be started up on system 
> boot.
> * Manually creating directories on the server with the right owner and 
> permissions.
> * Manually running "samba-tool domain exportkeytab 
> --principal=<service-user>" for each service
> user, copying them over and adding them to "guix secrets".
> * There will be quite a few as I have set up Bacula with strict privilege 
> separation, even using
> different Storage Daemons for different backups, each running as a distinct 
> user account.
> * Custom PostgreSQL service.
> 
> Whereas with just using unencrypted NFSv3, I can:
> 
> * Use GNU Guix master as-is.
> * Have my activation-tree-service-type create all the service accounts, their 
> directories and
> everything with appropriate permissions.
> * Only run "guix secrets" locally, without the need to SSH into the server 
> and run stuff as root
> there.
> * Have much simpler activation logic.
> 
> Bacula is something that I would really like to get running and most of my 
> work so far has been to
> make that happen in a clean and stable manner.
> 
> However, I am strongly leaning towards declaring the entire SSSD endeavor a 
> failed experiment and
> not pursuing it any further.
> 
> In case there is any interest on your part, then I'd gladly polish up my 
> Guix changes and submit
> them as a series of patches. I was actually planning to have that done by the 
> end of this week, but
> then SSSD took far more time than I had anticipated.
> 
> Has anybody else had similar experiences, and what would you 
> recommend?
> 
> I'm about to head out for a longer weekend, going on a bit of a road trip to 
> visit some friends, so
> this is a great point for me to take a break and then come back fresh next week.
> 
> Looking forward to hearing back from you and have a wonderful weekend,
> 
> Martin Baulig

Congrats Martin!  This whole email looks awesome!
