Re: [qubes-devel] GSoC Student Introduction 2017

Jean-Philippe Ouellet Sun, 26 Mar 2017 20:08:48 -0700

I didn't have time to write a short email... so I wrote a long one ;)

I think it is important to keep in mind the value we are adding to
Qubes by implementing a log subsystem:
1. integrity guarantee by immediate sending to and storage in separate VM
  - inability to modify previous logs
  - inability to fake timestamps
2. persistence of logs from ephemeral VMs (such as DispVMs)
3. allowing retention policies on a per-origin-VM basis
4. safe viewing of logs with arbitrarily-complex log-viewing/analysis tools

#3 is important to ensure one VM does not DoS available log storage to
hide another VM's compromise. We need per-origin quotas, and user
alerting when they are reached.
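
As a rough illustration of the per-origin quota idea, the receiving
side could check the size of an origin's log directory before accepting
more data. This is only a sketch: the function names, the
one-directory-per-origin layout, and the KiB units are my assumptions,
not an existing Qubes interface.

```shell
#!/bin/sh
# Hypothetical per-origin quota check (names and units are assumptions).

# Print the total size, in KiB, of everything stored under directory $1.
origin_usage_kib() {
    du -sk "$1" | cut -f1
}

# Succeed (exit status 0) only while the origin's log directory is
# still below its quota.
#   $1 = log directory for one origin VM
#   $2 = quota in KiB
under_quota() {
    [ "$(origin_usage_kib "$1")" -lt "$2" ]
}
```

A receiving service could run something like
under_quota "$HOME/QubesLogs/$QREXEC_REMOTE_DOMAIN" 102400 before
appending, and alert the user instead of writing when the check fails.
The 102400 figure is just a placeholder.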

#4 can be implemented by only ever handling the log contents in
DispVMs. I imagine some GUI in the LogVM which allows one to view a
list of all logs collected (time, origin, size, etc.) and easily open
them with a suitable program in a DispVM, or manage them (sort,
delete, copy to a different VM to share or archive it, etc.).

Note that these features are inherently log-format-agnostic. They are
most definitely still useful for unstructured text logs. It is also
not difficult to imagine someone wanting the same features for other
non-journald audit formats, such as Windows' EVTX format, or FreeBSD's
/ Mac OS X's OpenBSM format. With a well-designed format-agnostic
log-collection subsystem, supporting these would likely be as simple
as adding a mapping for "open this log type with this log-viewing
application" and perhaps some trivial format-specific timestamp
injection / substitution happening in the LogVM as logs are received.

Yes, adding timestamps does mean some *simple* log parsing /
processing happening in the LogVM. If it cannot be done simply in an
obviously-correct manner (such as prepending a timestamp to the
beginning of each line), then IMO it should either not be done at all,
or should be done in a per-origin DispVM.

On Sun, Mar 26, 2017 at 5:06 PM, Alisa Matsak <[email protected]> wrote:
> You're absolutely right, currently all the Linux VMs do have journald (as a
> part of systemd).

Journald may serve as a convenient building block in that there may be
a substantial ecosystem which facilitates easy integration, and it
makes sense to want to take advantage of that, but not at the cost of
excluding systems without it from also being able to take advantage of
the Qubes-encouraged logging facility.

I believe it is important to also support accepting plain unstructured
text logs, and I strongly encourage you to do so. Perhaps via
separate qubes.logs.JournaldExport and qubes.logs.PlainTextLines (or
similar) services.

Qubes is not really a Linux distribution, and systemd (while perhaps a
good initial target) is not ubiquitous among all guest systems we are
interested in, so IMO we should avoid designing systems that
inherently rely on functionality exclusive to it where possible.

>> If using this format, I'd use some simplified version - filter out
>> unneeded fields (most of them?) when sending, and synthesize those
>> required after receiving entry in LogVM. And of course reject entries
>> not conforming to this simplified specification.
>
> If we don't use the full version of journald export format, I can't see the
> point of using it at all. Don't get me wrong, I only suggested to use this
> format for full compatibility with journald (which we would receive in
> LogVM). I thought it would be very useful because of the already implemented
> log rotation, very handy search in journal entries and the security feature
> of journald (which lets us know when a journal entry has been tampered
> with).

I do not think the hash chain is a meaningful security feature in this
context. If a VM is compromised, the adversary who wishes to modify a
log could still do so and recalculate all hashes afterwards (and
accordingly modify state of the process creating more logs to make
future logs appear fine too). The security comes from having the logs
stored in a separate VM, to which the only interface is sending logs,
and in which we avoid any complex parsing or analysis so that the
log-collection VM itself does not get compromised.

In this manner, we can rely on the isolation between Qubes domains to
guarantee authenticity and integrity without increasing the complexity
of the log format or requiring special tools to view them (which was
one of the major complaints against systemd-journald in its early
days).

If you want log hash chains to be secure, you need the hashes to be
regularly incorporated into the hash chains of other machines (see
sections 4.4 & 4.5 of Schneier & Kelsey's 1999 paper [1] on this),
which... if we need cross-domain communication to guarantee resistance
to undetected retroactive modification in the first place, we may as
well just send the logs themselves over an existing secure channel
(vchan/qrexec) to a secure destination (the LogVM) immediately as
they're created and store them there.

[1]: https://www.schneier.com/academic/paperfiles/paper-auditlogs.pdf

> But if we want to change structure of journald entries and leave only
> the "necessary" fields, we should be ready to lose all the advantages of
> using journald for logs storage mentioned above.

> I also don't think that synthesised entries can make friend with journald,

I don't understand what you mean by this. Care to elaborate?

> this approach prevents us
> from using all the cool features of journald, and at the same time makes us
> process the transmitted data twice.
>
>> To be honest, I think the "short" format (`journalctl --output short`),
>> with timestamp and hostname stripped off is enough. So, basically just
>> MESSAGE field from "export" format.
>
> Okay, we can use the "short" format in its pure form, if I understood you
> correctly.

Only MESSAGE is definitely too minimal IMO. It is extremely useful to
filter by what the message came from, for which _PID, _UID, _COMM,
_CMDLINE, etc. fields are commonly used (even if just by simple grep
of standard text format).
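
To make the filtering point concrete, here is a small sketch that pulls
out the MESSAGE of every entry from a given _COMM. It assumes the
textual form of the journald export format (one FIELD=value per line,
entries separated by a blank line) and does not handle binary-encoded
fields. The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the MESSAGE of each entry whose _COMM is $1.
# Reads export-format text on stdin; binary-encoded fields are ignored.
messages_from_comm() {
    awk -v comm="$1" '
        # Paragraph mode: each blank-line-separated entry is one record,
        # and each line of the entry is one field.
        BEGIN { RS = ""; FS = "\n" }
        {
            hit = 0; msg = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "_COMM=" comm) hit = 1
                if (index($i, "MESSAGE=") == 1) msg = substr($i, 9)
            }
            if (hit) print msg
        }
    '
}
```

Since it has to be run on untrusted log contents, something like this
belongs in a DispVM rather than in the LogVM itself.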

The fact that these can be provided in a somewhat-harder-to-spoof
manner than traditional syslog is nice, but does not somehow make them
trustworthy. However, just because they are not trustworthy does not
mean they are not still useful.

> Let's take the output of this command, put it in some file and send it to
> LogVM via qrexec. In this case we don't even need a proxy-server on the
> ordinary VM's side. Of course, we would track which entries have already been
> sent, and the next time we would send only the newly generated ones (there
> should also be some sort of timer for establishing connections). On the side
> of LogVM we would use a simple proxy-server, which would be responsible for
> listening for connections from different VMs and receiving information from
> them. We can keep received logs from different VMs in different directories
> and rotate them independently (like delete older files, archive not very
> recent ones and don't touch very recent logs). In this case, we can only
> search logs as text files (grep), and we don't get the security features of
> journald (I'm talking about sequential hashing)

If by "proxy server" you mean anything involving a network stack... I
really doubt you need that. The receiving-side code shouldn't need any
sockets at all. The simplest case (plain text format) would be a qrexec
service consisting of something like:
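    #!/bin/sh
    set -e
    # qrexec sets QREXEC_REMOTE_DOMAIN to the name of the calling VM.
    d=$HOME/QubesLogs/$QREXEC_REMOTE_DOMAIN
    mkdir -p "$d"
    # Prefix each received line with the LogVM's own time, and append it
    # to a new file named after the time this connection started.
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        printf "%s\t%s\n" "$(date)" "$line"
    done >> "$d/$(date +%s).log"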

You can almost kind of think of qrexec services as CGI scripts. The
listening and multiplexing and such normally handled by the webserver
is handled by qrexec-agent.

Regards,
Jean-Philippe
