On Thu, Apr 27, 2023 at 11:20 PM Alexander Gordeev
<alexander.gord...@opensynergy.com> wrote:
>
> On 26.04.23 07:52, Alexandre Courbot wrote:
> > On Mon, Apr 24, 2023 at 4:52 PM Alexander Gordeev
> > <alexander.gord...@opensynergy.com> wrote:
> >>
> >> On 21.04.23 18:01, Alexander Gordeev wrote:
> >>
> >>> On 21.04.23 06:02, Alexandre Courbot wrote:
> >>>
> >>>> * I am still not convinced that V4L2 is lacking from a security
> >>>> perspective. It would take just one valid example to change my mind
> >>>> (and no, the way the queues are named is not valid). And btw, if it
> >>>> really introduces security issues, then this makes it invalid for
> >>>> inclusion in virtio entirely, just not OpSy's hypervisor.
> >>>
> >>>
> >>> I'd like to start with this and then answer everything else later.
> >>>
> >>> Let's compare VIRTIO_VIDEO_CMD_RESOURCE_QUEUE with
> >>> VIDIOC_QBUF+VIDIOC_DQBUF. Including the parameters, of course. First,
> >>> let's compare the word count to get a very rough estimate of complexity.
> >>> I counted 585 words for VIRTIO_VIDEO_CMD_RESOURCE_QUEUE, including the
> >>> parameters. VIDIOC_QBUF+VIDIOC_DQBUF are defined together and take 1206
> >>> words, they both use struct v4l2_buffer as a parameter. The struct takes
> >>> 2716 words to be described. So the whole thing takes 3922 words. This is
> >>> 6.7 times more, than VIRTIO_VIDEO_CMD_RESOURCE_QUEUE. If we check the
> >>> definitions of the structs, it is also very obvious, that V4L2 UAPI is
> >>> almost like an order of magnitude more complex.
> >>
> >>
> >> I think, it is best to add all the steps necessary to reproduce my 
> >> calculations just in case.
> >>
> >> VIRTIO_VIDEO_CMD_RESOURCE_QUEUE is doing essentially the same thing as 
> >> VIDIOC_QBUF+VIDIOC_DQBUF, so we're comparing apples to apples (if we don't 
> >> forget to compare their parameters too).
> >>
> >> To get the word count for the VIRTIO_VIDEO_CMD_RESOURCE_QUEUE I opened the 
> >> rendered PDF of video section only from the first email in this thread. 
> >> Here is the link: 
> >> https://drive.google.com/file/d/1Sm6LSqvKqQiwYmDE9BXZ0po3XTKnKYlD/view?usp=sharing
> >>  . Then I scrolled to page 11 and copied everything related a text file. 
> >> This is around two pages in the PDF. Then I removed page numbers from the 
> >> copied text and used 'wc -w' to count words.
> >>
> >> To get the word count for VIDIOC_QBUF+VIDIOC_DQBUF I opened this link: 
> >> https://docs.kernel.org/userspace-api/media/v4l/vidioc-qbuf.html . Then I 
> >> selected all the text except table of contents and did followed the same 
> >> procedure.
> >>
> >> To get the word count for struct v4l2_buffer and other types, that are 
> >> referenced from it, I opened this link: 
> >> https://docs.kernel.org/userspace-api/media/v4l/buffer.html#struct-v4l2-buffer
> >>  . Then I selected all the text except the table of contents and the text 
> >> above struct v4l2_buffer definition. The rest is the same.
> >>
> >> Also it's quite obvious if you look at them how much bigger struct 
> >> v4l2_buffer (including the referenced types) is compared to struct 
> >> virtio_video_resource_queue.
> >
> > You are comparing not the complexity of the structures but the
> > verbosity of their documentation, which are written in a different
> > style, format, and by different people.
>
> I agree to some extent. At least this benchmark is simple and it
> provokes to actually go and look at the definitions, which IMO should be
> enough to already see the difference. What could be a better benchmark?
> Maybe counting the number of various fields and flags and enum cases,
> that one has to read through?

I give you another point of comparison literally 5 lines down my email.

>
> > And the V4L2 page also
> > contains the description of memory types, which is part of another
> > section in the virtio-video spec.
>
> You mean only enum v4l2_memory? Or anything else too?
>
> > There is no way to draw a meaningful
> > conclusion from this.
> >
> > If you want to compare, do it with how the structures are actually
> > used. Here is how you would queue an input buffer with virtio-video:
> >
> >    struct virtio_video_resource_queue queue_buf = {
> >        .cmd_type = VIRTIO_VIDEO_CMD_RESOURCE_QUEUE,
> >        .stream_id = 42,
> >        .queue_type = VIRTIO_VIDEO_QUEUE_TYPE_INPUT,
> >        .resource_id = 1,
> >        .timestamp = 0x10,
> >        .data_sizes = {
> >          [0] = 0x1000,
> >        },
> >    };
> >
> > Now the same with virtio-v4l2:
> >
> >    struct virtio_v4l2_queue_buf queue_buf = {
> >        .cmd = VIRTIO_V4L2_CMD_IOCTL,
> >        .code = VIDIOC_QBUF,
> >        .session_id = 42,
> >        .buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
> >        .buffer.index = 1,
> >        .buffer.timestamp.tv_usec = 0x10,
> >        .buffer.memory = V4L2_MEMORY_MMAP,
> >        .planes = {
> >          [0] = { .bytesused = 0x1000 },
> >        }
> >    };
> >
> > In both cases, you pass a structure with some members set, and the
> > rest to 0. The host receives basically the same thing - it's the same
> > data! The only difference is how it is laid out.
>
> How do I know this from the text? How do I verify, that this is correct?
> Can I be sure this code is going to work or not between any device and
> any driver?
>
> With virtio-video you can just go to the spec/header file, take the
> struct definition and simply set all the fields.
>
> With V4L2 UAPI I don't see any other way except going through the whole
> buffer.html file, filtering potentially irrelevant stuff, then doing
> some trials, and maybe looking into the device or driver code. Maybe
> also asking you for an advice?

The relevant fields for each decoding operation are clearly detailed
in 
https://www.kernel.org/doc/html/v5.18/userspace-api/media/v4l/dev-decoder.html,
which I already linked to multiple times.

> I think it is worth trying to imagine you're a newcomer in the project.
> Or a security auditor.
>
> So I think we can't draw conclusions about the spec quality from these
> code samples.
>
> > Also as mentioned by Bart, the apparent simplicity of
> > VIRTIO_VIDEO_CMD_RESOURCE_QUEUE, which does not require a dequeue
> > operation as the dequeue is sent with its response, is actually a
> > fallacy: that design choice makes the specification simpler, but at
> > the cost of more complexity on the device side and the potential to
> > starve on command descriptors. By contrast, adopting the V4L2 model
> > resulted in simpler code on both sides and no possibility to deadlock.
> > That point could be addressed by revising the virtio-video spec, but
> > then you get even closer to V4L2.
>
> Hmm, I understand, thanks. Well, I can fix this in the spec. I have some
> ideas. I think I'd better describe them in an answer to Bart's email.
>
> >> Do we agree now, that V4L2 UAPI is not only marginally more complex?
> >
> > No, and I rest my case on the code samples above.
> >
> >>>
> >>>
> >>> Also please read:
> >>>
> >>> https://medium.com/starting-up-security/evidence-of-absence-8148958da092
> >>>
> >>
> >> This reference probably needs a clarification. You argued, that V4L2 has a 
> >> good track record so far. Here is the quote:
> >>
> >>> FWIW V4L2 has been in use for a looong time (including in Chromebooks
> >>> that do secure playback) and I am not aware of fundamental security
> >>> issues with it.
> >>
> >> But absence of found major security issues doesn't tell us much about the 
> >> number of not found ones. Absence of evidence is not an evidence of 
> >> absence. At the link above a former Director of Security at Facebook 
> >> shares his thoughts about what could be a good evidence of absence of 
> >> major security problems.
> >
> > You are just FUDing now.
>
> These are well established security concepts AFAIK. Are you gaslighting?
>
> > The exact same argument could be made about
> > virtio-video. If you want to borrow from the security world, I can ask
> > you how going with virtio-video when V4L2 exists is different from
> > rolling your own crypto.
>
> It is very different because obviously neither of them is a crypto.
>
> >> So a bug bounty program with high premiums that covers V4L2 would be a 
> >> better argument in favor of *already written code* in my opinion. Not for 
> >> new code. Also probably it is also an argument in favor of the spec, that 
> >> is the V4L2 UAPI. Like that it is polished enough. Not so sure about that 
> >> though.
> >>
> >> There actually are several bug bounty programs, that cover the kernel. 
> >> These are Google's kctf, ZDI's pwn2own, and zerodium AFAIK. However the 
> >> premiums are not even close to the ones mentioned in my reference. Anyway 
> >> this means, that using *the existing V4L2 code in the kernel* is probably 
> >> OK. But this creates some limitations if we want the actual code to still 
> >> be covered with these bug bounties, right? This means, that the host OS 
> >> has to be Linux and the actual hardware has to be exposed through a stable 
> >> V4L2 driver, that is in mainline for some time, and there has to be no or 
> >> little processing on top. For us this is not possible unfortunately. In 
> >> the end both things could be secure:
> >>
> >> 1. V4L2 pass through can be secure because of the bug bounty programs and 
> >> a lot of attention to the kernel in general.
> >> 2. For the new code this doesn't work, so the spec should be as simple and 
> >> device-centric as possible. Because, all other things being equal, there 
> >> are fewer errors in simpler programs. So defining a subset of V4L2 UAPI 
> >> including the data types looks like a good idea to me. The stateful 
> >> decoder interface, that you point to, does not define a subset in the data 
> >> types.
> >
> > Wouldn't you have exactly the same problem by using a new guest-host
> > protocol?
>
> I think this is far-fetched. Especially if our new protocol is
> intentionally kept close to the reference protocol with a few things
> made better and simpler. That would be my preferred route.
>
> > Are you going to start a bounty program for virtio-video to
> > get an assurance that it is secure?
>
> I'm not sure it is possible because the device is proprietary. At least
> the driver is going to be covered hopefully. So for the device we're
> going to rely on simplicity, tests and audits. But who knows. I'm not
> the one to make this decision.
>
> >> This is basically my reasoning.
> >>
> >> Also these two specs don't need to compete with each other. They have 
> >> different limitations and they are for different audiences. If you check 
> >> the XKCD's comic, it is about competing standards.
> >
> > They allow exactly the same thing (virtualization of video
> > decoding/encoding) and there isn't any use-case of virtio-video that
> > could not be covered equally well by V4L2. Reinventing a new video
> > specification is pointless and will lead to unneeded fragmentation.
>
> I don't agree with these statements.
>
> > We should also expect reticence from the Linux community to upstream
> > two virtual video drivers that basically do the same thing.
>
> This is hypothetical.
>
> --
> Alexander Gordeev
> Senior Software Engineer
>
> OpenSynergy GmbH
> Rotherstr. 20, 10245 Berlin
>
> Phone: +49 30 60 98 54 0 - 88
> Fax: +49 (30) 60 98 54 0 - 99
> EMail: alexander.gord...@opensynergy.com
>
> www.opensynergy.com
>
> Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B
> Geschäftsführer/Managing Director: Régis Adjamah
>
> Please mind our privacy 
> notice<https://www.opensynergy.com/datenschutzerklaerung/privacy-notice-for-business-partners-pursuant-to-article-13-of-the-general-data-protection-regulation-gdpr/>
>  pursuant to Art. 13 GDPR. // Unsere Hinweise zum Datenschutz gem. Art. 13 
> DSGVO finden Sie 
> hier.<https://www.opensynergy.com/de/datenschutzerklaerung/datenschutzhinweise-fuer-geschaeftspartner-gem-art-13-dsgvo/>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org

Reply via email to