On 09/20/2016 04:03 AM, Alex Williamson wrote:
> On Tue, 20 Sep 2016 00:43:15 +0530
> Kirti Wankhede <kwankh...@nvidia.com> wrote:
>> On 9/20/2016 12:06 AM, Alex Williamson wrote:
>>> On Mon, 19 Sep 2016 23:52:36 +0530
>>> Kirti Wankhede <kwankh...@nvidia.com> wrote:
>>>> On 8/26/2016 7:43 PM, Kirti Wankhede wrote:  
>>>>> On 8/25/2016 2:52 PM, Dong Jia wrote:    
>>>>>> On Thu, 25 Aug 2016 09:23:53 +0530    
>>>>>>> +
>>>>>>> +static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
>>>>>>> +                             size_t count, loff_t *ppos)
>>>>>>> +{
>>>>>>> +       struct vfio_mdev *vmdev = device_data;
>>>>>>> +       struct mdev_device *mdev = vmdev->mdev;
>>>>>>> +       struct parent_device *parent = mdev->parent;
>>>>>>> +       unsigned int done = 0;
>>>>>>> +       int ret;
>>>>>>> +
>>>>>>> +       if (!parent->ops->read)
>>>>>>> +               return -EINVAL;
>>>>>>> +
>>>>>>> +       while (count) {    
>>>>>> Here, I have to say sorry to you guys for that I didn't notice the
>>>>>> bad impact of this change to my patches during the v6 discussion.
>>>>>> For vfio-ccw, I introduced an I/O region to input/output I/O
>>>>>> instruction parameters and results for Qemu. The @count for this
>>>>>> data is currently 140. So supporting arbitrary lengths in one shot
>>>>>> here, and also in vfio_mdev_write, seems like the better option for
>>>>>> this case.
>>>>>> I believe that if the pci drivers want to iterate in a 4 bytes step, you
>>>>>> can do that in the parent read/write callbacks instead.
>>>>>> What do you think?
>>>>> I would like to know Alex's thought on this. He raised concern with this
>>>>> approach in v6 reviews:
>>>>> "But I think this is exploitable, it lets the user make the kernel
>>>>> allocate an arbitrarily sized buffer."
>>>> Read/write callbacks are for the slow path, i.e. emulation of MMIO
>>>> regions, which are mainly device registers. I do feel they shouldn't
>>>> support arbitrary lengths.
>>>> Alex, I would like to know your thoughts.  
>>> The exploit was that the mdev layer allocated a buffer and copied the
>>> entire user buffer into kernel space before passing it to the vendor
>>> driver.  The solution is to pass the __user *buf to the vendor driver
>>> and let them sanitize and split it however makes sense for their
>>> device.  We shouldn't be assuming naturally aligned, up to dword
>>> accesses in the generic mdev layers.  Those sorts of semantics are
>>> defined by the device type.  This is another case where making
>>> the mdev layer as thin as possible is actually the best thing to
>>> do to make it really device type agnostic.  Thanks,
>> Alex,
>> These were the comments on v6 patch:
>>>>> Do we really need to support arbitrary lengths in one shot?  Seems
>>>>> like we could just use a 4 or 8 byte variable on the stack and
>>>>> iterate until done.
>>>> We just want to pass the arguments to the vendor driver as-is here.
>>>> The vendor driver can take care of that.  
>>> But I think this is exploitable, it lets the user make the kernel
>>> allocate an arbitrarily sized buffer.  
>> As per the above discussion, in the v7 version this module doesn't
>> allocate memory from the heap.
>> If a vendor driver allocates arbitrary memory in kernel space through
>> the mdev module interface, wouldn't that be an exploit?
> Yep, my 4-8 byte chunking idea was too PCI specific.  If a vendor
> driver introduces an exploit, that's a bug in the vendor driver.  I'm
> not sure if we can or should attempt to guard against that.  Ultimately
> the vendor driver is either open source and we can inspect it for such
> exploits or it's closed source, taints the kernel, and we hope for the
> best.  It might make a good unit test to perform substantially sized
> reads/writes to the mdev device.

Can't agree more! :-)

> Perhaps the only sanity test we can
> make in the core is to verify the access doesn't exceed the size of
> the region as advertised by the vendor driver.  Thanks,

Even performing a lightweight sanity check would require vfio-mdev to be
able to decode ppos into a particular region, which means information
about all the regions would have to be stored in the framework. I guess
that is not your preferred way :)
