Re: [qubes-devel] GSoC Student Introduction 2017

Marek Marczykowski-Górecki Thu, 23 Mar 2017 16:31:29 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Thu, Mar 23, 2017 at 02:28:59PM -0700, Alisa Matsak wrote:
> On Sunday, 12 March 2017 22:20:33 UTC+3, Jean-Philippe Ouellet wrote:
> 
> > > Please feel free to ask any questions you may have! 
> >
> 
> Hi again!
> 
> Since last time I've studied some material related to the LogVM task. Now I
> want to take you up on your offer to answer questions. :)
> 
> I discovered that the majority of the required functionality has already
> been implemented as part of journald (which is part of the systemd
> project). Journald saves all the logs it knows about (such as kernel
> messages generated with printk(), userspace messages generated with
> syslog(3), userspace entries using the native API, coredumps via
> /proc/sys/kernel/core_pattern and more; I took it from here: [1]). It
> also takes care of security (undetected manipulation is impossible
> because each entry cryptographically verifies all previous ones) and
> journal file rotation for more efficient disk usage. At the same time it
> provides tools for comfortably viewing logs and even searching them.
> Since all our VMs run Fedora, we can use all these features to our
> advantage. (All of this sounds like a journald advertisement. xD)


Not all VMs run Fedora: there are also Debian-based ones, and some
people use Arch too. Not to mention Windows, but I think we can ignore
it for now ;)
Currently all the Linux VMs do have journald, though (hmm, not sure
about Arch?). Using journald as a log collecting tool looks like a good
idea, but mostly because it is already there, not because we depend on
some specific feature of it. The question is how exactly (including the
data format) to transfer entries from one journald instance to another.

Also, if we use journald for managing log storage, we need to make sure
it's reasonably configured. For example, a VM must not be able to
produce so many log entries that log rotation removes very recent logs
(possibly including evidence of a compromise).

Even worse: removing not-so-old entries related to other VMs. Example
attack scenario:
1. The 'work' VM is successfully attacked, but the LogVM holds evidence
of the attack
2. The compromised VM starts a new DispVM
3. That DispVM produces a lot of rubbish log entries, causing all the
recent logs to be rotated and removed
4. The compromised 'work' VM also cleans its local logs (if any)

Now the only thing you have in the LogVM is some garbage sent by a
random DispVM, and you don't even know which VM started that DispVM
(because those log entries were rotated too). While you may suspect that
it isn't only the DispVM that got compromised, you have no idea which VM
it is or what exactly happened.

Probably some rate-limiting (maybe connected with alerting) should solve
this problem, but we need to think about such scenarios.
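
To make the rate-limiting idea concrete, here is a minimal sketch of a
per-VM token bucket that the receiving side could apply before storing
an entry. The class names, the default rate and burst values, and the
drop counter used as an alerting hook are all assumptions made for
illustration, not an existing Qubes or journald interface.

```python
import time


class TokenBucket:
    """Allow `rate` entries per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst      # start with a full bucket
        self.last = None         # time of the previous call, if any

    def allow(self, now):
        if self.last is not None:
            # Refill in proportion to elapsed time, capped at the burst size.
            elapsed = max(0.0, now - self.last)
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerVMLimiter:
    """One bucket per source VM, so a noisy DispVM cannot use up the
    budget of other VMs. The VM name must come from a trusted source
    (the qrexec layer), never from the log entry itself."""

    def __init__(self, rate=50.0, burst=200.0):   # hypothetical defaults
        self.rate = rate
        self.burst = burst
        self.buckets = {}
        self.dropped = {}        # per-VM drop counters, usable for alerting

    def accept(self, source_vm, now=None):
        if now is None:
            now = time.monotonic()
        bucket = self.buckets.setdefault(
            source_vm, TokenBucket(self.rate, self.burst))
        if bucket.allow(now):
            return True
        # Record the drop instead of discarding silently.
        self.dropped[source_vm] = self.dropped.get(source_vm, 0) + 1
        return False
```

A non-zero drop counter for a VM is itself a useful signal: in the
scenario above, the flood from the DispVM would be recorded rather than
displacing the older entries.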

> Journald developers advise against modifying journal files because of the
> basic principles of journald's implementation. They describe its on-disk
> format and note that it is "not what you want to use as base of your
> project". I think that we can parse the journal export format (the
> reasoning for why this is necessary is below) to delete meta-information,
> but I'm afraid journald won't work with our modified file later. So an
> attempt to write a tool for processing such files amounts to reinventing
> the wheel (or reimplementing journald).

Using the full journald export format isn't a good idea, at least for
these reasons:
 - many fields should be out of the sending VMs' control - for example
   hostname and timestamp, but probably more
 - many fields are unnecessary (for example all __*), so let's keep the
   attack surface as small as possible; even Lennart Poettering can't
   write bug-free code ;)
 - for the same reason, I'd filter out binary entries (replace
   non-printable characters with a dot, underscore or something like
   that) - even if journald itself handles them well, some log-viewing
   software may not; even the simple 'less' command throws a bunch of
   parsers at its input...
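
The last point could look roughly like the sketch below. It assumes the
strictest possible policy (keep only printable ASCII, replace every
other byte with an underscore, and cap the length); the function name
and the 4096-byte limit are hypothetical. Whether non-ASCII text should
be allowed through is a separate decision.

```python
def sanitize_message(raw: bytes, replacement: bytes = b"_",
                     max_len: int = 4096) -> str:
    """Return `raw` with every byte outside printable ASCII replaced.

    Control characters (including ESC and newline) and all bytes >= 0x7f
    are replaced, so a viewer never sees escape sequences or binary data.
    """
    out = bytearray()
    for byte in raw[:max_len]:           # truncate over-long messages
        if 0x20 <= byte <= 0x7E:         # space through tilde
            out.append(byte)
        else:
            out += replacement
    return out.decode("ascii")
```

For example, a message carrying a terminal escape sequence comes out as
plain text: the ESC byte and the trailing newline both become "_".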

If using this format, I'd use some simplified version: filter out
unneeded fields (most of them?) when sending, and synthesize the
required ones after receiving an entry in the LogVM. And of course
reject entries not conforming to this simplified specification.
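
As an illustration of the sending-side filtering, here is a sketch that
parses a single entry in the journal export format (text fields as
NAME=value, binary fields as NAME, newline, 64-bit little-endian size,
data, newline) and keeps only an allow-list of fields. The allow-list
itself and the function names are assumptions; the input is one entry
without the blank line that separates entries. Repeated fields are
collapsed to the last value, which is a simplification.

```python
import struct

# Hypothetical allow-list; everything else (all __* and _* fields,
# hostname, timestamps, ...) is dropped before sending.
ALLOWED_FIELDS = {b"MESSAGE", b"PRIORITY", b"SYSLOG_IDENTIFIER"}


def parse_export_entry(data: bytes) -> dict:
    """Parse one export-format entry into a {name: value} dict of bytes."""
    fields = {}
    pos = 0
    while pos < len(data):
        nl = data.find(b"\n", pos)
        if nl == -1:
            raise ValueError("unterminated field")
        line = data[pos:nl]
        if not line:
            raise ValueError("empty field name")
        if b"=" in line:
            # Text form: NAME=value\n
            name, value = line.split(b"=", 1)
            pos = nl + 1
        else:
            # Binary form: NAME\n <8-byte LE size> <data> \n
            name = line
            if nl + 9 > len(data):
                raise ValueError("truncated size field")
            (size,) = struct.unpack_from("<Q", data, nl + 1)
            start = nl + 9
            end = start + size
            if end >= len(data) or data[end:end + 1] != b"\n":
                raise ValueError("truncated binary field")
            value = data[start:end]
            pos = end + 1
        fields[name] = value
    return fields


def filter_entry(data: bytes) -> dict:
    """Keep only the allow-listed fields of one entry."""
    fields = parse_export_entry(data)
    return {k: v for k, v in fields.items() if k in ALLOWED_FIELDS}
```

The receiving side should not rely on this filtering having happened,
since the sender may be compromised; it only reduces what leaves the VM.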

To be honest, I think the "short" format (`journalctl --output short`),
with the timestamp and hostname stripped off, is enough. So, basically
just the MESSAGE field from the "export" format.
If that means we need to synthesize all the other fields (which I
doubt), so be it.

> My idea for the project is the following. Among other functionality, 
> journald contains functions for sending and receiving journal messages over 
> the network. For our goals we need its systemd-journal-remote [2]

Looks like this tool can accept input not only from the network, but
also from a local socket :)

>  and 
> systemd-journal-upload [3]. For transmitting entries journald uses a 
> special format [4]. The problem is that those tools only support 
> transmitting logs in HTTP/HTTPS over TCP/IP

The receiving part supports a local socket, without HTTP(S) wrapping
(see the --listen-raw option). But the sending part indeed looks like
it supports only HTTP(S).

> , while we only support VMs communicating via 
> qrexec. I think a simple proxy server (maybe even a self-written one) 
> would solve the problem. Journald on each VM would send its logs to the 
> proxy (which runs in the background on the same VM) and the proxy, in 
> turn, would open a qrexec connection to pass them to the LogVM. The LogVM 
> here would be an ordinary VM running Fedora. There would also be a proxy 
> server running in the background on the LogVM. It would receive data via 
> qrexec and make it look to journald on the LogVM as if it had been 
> received over TCP/IP. So this approach can be suitable for collecting 
> logs from other VMs.

Such proxies on both sides seem like a reasonable solution. Keep in
mind that the proxy on the receiving side has a very important job:
making sure the entries conform to the required format, whatever that
format turns out to be. The simpler the format, the simpler the tool.
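
A rough sketch of that receiving-side job, assuming the simplest
possible wire format (one printable-ASCII message per line) and
assuming the LogVM synthesizes the hostname and timestamp itself. The
function name, the limits, the VM-name pattern and the exact set of
synthesized fields are hypothetical; which fields
systemd-journal-remote really needs in its input would have to be
checked.

```python
import re

# Hypothetical limits for the simplified format.
MESSAGE_RE = re.compile(rb"[\x20-\x7e]{0,4096}")
VM_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{0,30}")


def build_entry(source_vm: str, line: bytes, now_usec: int) -> bytes:
    """Validate one received line and wrap it in an export-format entry.

    `source_vm` must be supplied by the qrexec layer (in a qrexec
    service it would presumably be taken from the QREXEC_REMOTE_DOMAIN
    environment variable), and `now_usec` by the LogVM's own clock;
    neither is ever read from the sender's data.
    """
    if not VM_NAME_RE.fullmatch(source_vm):
        raise ValueError("unexpected VM name")
    # Reject (rather than repair) anything outside the simplified
    # format: an embedded newline could otherwise inject extra fields.
    if not MESSAGE_RE.fullmatch(line):
        raise ValueError("entry does not conform to the format")
    return (
        b"__REALTIME_TIMESTAMP=%d\n" % now_usec
        + b"_HOSTNAME=" + source_vm.encode("ascii") + b"\n"
        + b"MESSAGE=" + line + b"\n"
        + b"\n"                     # blank line terminates the entry
    )
```

Because the message is restricted to a single line of printable
characters, the text form of the export format is always sufficient and
the binary field form never has to be generated.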

> To make the process easier to understand, I attach a diagram of it [5]. 
> Hope it will be useful.
> 
> Please let me know what you think about this idea. Is it suitable for 
> this project? Can I write a proposal based on it?
> 
> Best wishes, Alisa.
> 
> [1] https://goo.gl/BaCCko
> [2] 
> https://www.freedesktop.org/software/systemd/man/systemd-journal-remote.html
> [3] 
> https://www.freedesktop.org/software/systemd/man/systemd-journal-upload.html
> [4] https://www.freedesktop.org/wiki/Software/systemd/export/
> [5] https://goo.gl/8euAAM
> 


- -- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJY1FqrAAoJENuP0xzK19csuccH/1UkNtYO340BtK0anqIDxWpu
Dpz5BrvEVdGvHNa6WPJWRo3nz3yLsWYXNZ40O/J/dyFEQBSQQA2UGfcd9IAF2VYZ
OGQmyd0rGlzzI/DVh//yxxtDKXU2MZphPusHBD+pK/b2PVi4vrCH+oe5gKBQgpN1
Lqo0K7WR8VCEdll1N53NNvmiejNgYONA+p3ZbYUsUIcc+s9DELP75MC73TtVM/IB
c8UQO7bhgieVzZeAa6sFoFqj/qGf1BMUpfAwmZI9DwLEasstrOaMsdC99wOD1lc/
eYGxNfRRg5gL4VbLq/JrQvYwlh09fj5D1FKYOfw5dHZcHNaIocOsD1TVjGKHQyA=
=uE+K
-----END PGP SIGNATURE-----
