Hi

On Fri, Jun 20, 2014 at 4:49 PM, Benjamin Tissoires
<benjamin.tissoi...@gmail.com> wrote:
> Hi David,
>
> On Fri, Jun 20, 2014 at 10:05 AM, David Herrmann <dh.herrm...@gmail.com> 
> wrote:
>> When we introduced the slotted MT ABS extensions, we didn't take care to
>> make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
>> syncs its current state via EVIOCGABS. It has to call this ioctl for each
>> and every ABS code separately. Besides being horribly slow, this series
>> of ioctl-calls is not atomic. The kernel might queue new ABS events while
>> the client fetches the data.
>>
>> Now for normal ABS codes this is negligible as ABS values provide absolute
>> data. That is, there is usually no need to resync ABS codes as we don't
>> need previous values to interpret the next ABS code. Furthermore, most ABS
>> codes are also sent pretty frequently so a refresh is usually useless.
>>
>> However, with the introduction of slotted ABS axes we added a relative
>> component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
>> while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
>> client will think any read ABS-event is for the retrieved SLOT, however,
>> this is not true as all events until the next ABS_MT_SLOT event are for
>> the previously active slot:
>>
>>     Kernel queue is: { ABS_DROPPED,
>>                        ABS_MT_POSITION_X(slot: 0),
>>                        ABS_MT_SLOT(slot: 1),
>>                        ABS_MT_POSITION_X(slot: 1) }
>>     Client reads ABS_DROPPED from queue.
>>     Client syncs all ABS values:
>>       As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
>>       view of the kernel.
>>     Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
>>     slot 0, as the slot-value is not explicit.
>>
>> This is just a simple example how the relative information provided by the
>> ABS_MT_SLOT axis can be problematic to clients.
>>
>> Now there are many ways to fix this:
>>  * Make ABS_MT_SLOT a per-evdev-client attribute. On each
>>    EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
>>    => Ugly and overkill
>>  * Flush all ABS events when clients read ABS_MT_SLOT.
>>    => Ugly hack and client might loose important ABS_MT_* events
>>  * Provide atomic EVIOCGABS API.
>>    => Sounds good!
>>
>> This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
>> fetches ABS values, rather than the whole "struct input_absinfo" set.
>> However, the new ioctl can fetch a range of ABS axes atomically and will
>> flush matching events from the client's receive queue. Moreover, you can
>> fetch all axes for *all* slots with a single call.
>>
>> This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
>> receive a consistent view of the whole ABS state, while the kernel flushes
>> the receive-buffer for a consistent view.
>> While most clients probably only need
>> EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
>> allows to receive an arbitrary range of axes.
>>
>> Signed-off-by: David Herrmann <dh.herrm...@gmail.com>
>> ---
>
> Thanks for this. This sounds very promising. I did not went into the
> tiny details of the code, but I already have some comments. See below.

Thanks for the comments!

>>  drivers/input/evdev.c      | 180 
>> ++++++++++++++++++++++++++++++++++++++++++++-
>>  include/uapi/linux/input.h |  42 +++++++++++
>>  2 files changed, 218 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
>> index 6386882..7a25a7a 100644
>> --- a/drivers/input/evdev.c
>> +++ b/drivers/input/evdev.c
>> @@ -175,8 +175,9 @@ static bool __evdev_is_filtered(struct evdev_client 
>> *client,
>>         return mask && !test_bit(code, mask);
>>  }
>>
>> -/* flush queued events of type @type, caller must hold client->buffer_lock 
>> */
>> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int 
>> type)
>> +/* flush queued, matching events, caller must hold client->buffer_lock */
>> +static void __evdev_flush_queue(struct evdev_client *client, unsigned int 
>> type,
>> +                               unsigned int code_first, unsigned int 
>> code_last)
>>  {
>>         unsigned int i, head, num;
>>         unsigned int mask = client->bufsize - 1;
>> @@ -195,7 +196,9 @@ static void __evdev_flush_queue(struct evdev_client 
>> *client, unsigned int type)
>>                 ev = &client->buffer[i];
>>                 is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
>>
>> -               if (ev->type == type) {
>> +               if (ev->type == type &&
>> +                   ev->code >= code_first &&
>> +                   ev->code <= code_last) {
>>                         /* drop matched entry */
>>                         continue;
>>                 } else if (is_report && !num) {
>> @@ -786,6 +789,172 @@ static int handle_eviocgbit(struct input_dev *dev,
>>         return bits_to_user(bits, len, size, p, compat_mode);
>>  }
>>
>> +static inline void free_absrange(s32 **pages, size_t page_cnt)
>> +{
>> +       if (page_cnt > 1) {
>> +               while (page_cnt > 0) {
>> +                       if (!pages[--page_cnt])
>> +                               break;
>> +                       __free_page(virt_to_page(pages[page_cnt]));
>> +               }
>> +               kfree(pages);
>> +       } else if (page_cnt == 1) {
>> +               kfree(pages);
>> +       }
>> +}
>> +
>> +static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
>> +                               size_t i_code, size_t j_slot)
>> +{
>> +       size_t idx, off;
>> +
>> +       idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
>> +       off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
>> +
>> +       if (page_cnt == 1)
>> +               return &((s32*)pages)[off];
>> +       else
>> +               return &pages[idx][off];
>> +}
>> +
>> +static inline ssize_t fetch_absrange(struct evdev_client *client,
>> +                                    struct input_dev *dev, size_t start,
>> +                                    size_t count, size_t slots, s32 ***out)
>> +{
>> +       size_t size, page_cnt, i, j;
>> +       unsigned long flags;
>> +       s32 **pages;
>> +
>> +       /*
>> +        * Fetch data atomically from the device and flush buffers. We need 
>> to
>> +        * allocate a temporary buffer as copy_to_user() is not allowed while
>> +        * holding spinlocks. However, to-be-copied data might be huge and
>> +        * high-order allocations should be avoided. Therefore, do the
>> +        * page-allocation manually.
>> +        */
>> +
>> +       BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
>> +
>> +       size = sizeof(s32) * count * slots;
>> +       page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
>> +       if (page_cnt < 1) {
>> +               return 0;
>> +       } else if (page_cnt == 1) {
>> +               pages = kzalloc(size, GFP_TEMPORARY);
>> +               if (!pages)
>> +                       return -ENOMEM;
>> +       } else {
>> +               pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
>> +               if (!pages)
>> +                       return -ENOMEM;
>> +
>> +               for (i = 0; i < page_cnt; ++i) {
>> +                       pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
>> +                       if (!pages[i]) {
>> +                               free_absrange(pages, page_cnt);
>> +                               return -ENOMEM;
>> +                       }
>> +               }
>> +       }
>> +
>> +       spin_lock_irqsave(&dev->event_lock, flags);
>> +       spin_lock(&client->buffer_lock);
>> +
>> +       for (i = 0; i < count; ++i) {
>> +               __u16 code;
>> +               bool is_mt;
>> +
>> +               code = start + i;
>> +               is_mt = input_is_mt_value(code);
>> +               if (is_mt && !dev->mt)
>> +                       continue;
>> +
>> +               for (j = 0; j < slots; ++j) {
>> +                       __s32 v;
>> +
>> +                       if (is_mt)
>> +                               v = input_mt_get_value(&dev->mt->slots[j],
>> +                                                      code);
>> +                       else
>> +                               v = dev->absinfo[code].value;
>> +
>> +                       *absrange_ptr(pages, page_cnt, slots, i, j) = v;
>> +
>> +                       if (!is_mt)
>> +                               break;
>> +               }
>> +       }
>> +
>> +       spin_unlock(&client->buffer_lock);
>> +       __evdev_flush_queue(client, EV_ABS, start, start + count - 1);
>> +       spin_unlock_irqrestore(&dev->event_lock, flags);
>> +
>> +       *out = pages;
>> +       return page_cnt;
>> +}
>> +
>> +static int evdev_handle_get_absrange(struct evdev_client *client,
>> +                                    struct input_dev *dev,
>> +                                    struct input_absrange __user *p)
>> +{
>> +       size_t slots, code, count, i, j;
>> +       struct input_absrange absbuf;
>> +       s32 **vals = NULL;
>> +       ssize_t val_cnt;
>> +       s32 __user *b;
>> +       int retval;
>> +
>> +       if (!dev->absinfo)
>> +               return -EINVAL;
>> +       if (copy_from_user(&absbuf, p, sizeof(absbuf)))
>> +               return -EFAULT;
>> +
>> +       slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, 
>> absbuf.slots);
>> +       code = min_t(size_t, absbuf.code, ABS_CNT);
>> +       count = min_t(size_t, absbuf.count, ABS_CNT);
>> +
>> +       /* first fetch data atomically from device */
>> +
>> +       if (code + count > ABS_CNT)
>> +               count = ABS_CNT - code;
>> +
>> +       if (!slots || !count) {
>> +               val_cnt = 0;
>> +       } else {
>> +               val_cnt = fetch_absrange(client, dev, code, count,
>> +                                        slots, &vals);
>> +               if (val_cnt < 0)
>> +                       return val_cnt;
>> +       }
>> +
>> +       /* now copy data to user-space */
>> +
>> +       b = (void __user*)(unsigned long)absbuf.buffer;
>
> What if the user space buffer is not long enough, or if the pointer is
> invalid? Is there any means from the kernel to guarantee that we are
> not writing in a restricted memory area or in a buffer not owned by
> the process?

The buffer is given as pointer in absbuf.buffer, the size of the
buffer is given as absbuf.slots * absbuf.count. This is the same as
for the read() syscall, where we get a pointer plus the size.
I use put_user() to write data into that buffer, so in case it's not
valid user-memory, this will fail with -EFAULT.

This is no different from other calls that return data, apart from
splitting the size into two ints "slots" and "count". So I cannot
follow what you mean? Note that "put_user()" is equivalent to
"copy_to_user()", maybe you missed that part?

>> +       for (i = 0; i < absbuf.count; ++i) {
>> +               for (j = 0; j < absbuf.slots; ++j, ++b) {
>> +                       s32 v;
>> +
>> +                       if (i >= count || j >= slots)
>> +                               v = 0;
>> +                       else
>> +                               v = *absrange_ptr(vals, val_cnt, slots, i, 
>> j);
>> +
>> +                       if (put_user(v, b)) {
>> +                               retval = -EFAULT;
>> +                               goto out;
>> +                       }
>> +               }
>> +       }
>> +
>
> Shouldn't we also call free_absrange(vals, val_cnt); before returning?

I do. There's a fall-through to "out:", so we always free the buffer.
Otherwise, I would have called it "error:" :)

>> +       retval = 0;
>
> Not sure it matters a lot, but returning the size of what has been
> written is more common. This would make sense if the buffer is not
> long enough or if it is too big.

This would always be "absbuf.slots * absbuf.count". I don't think
there's much gain in returning a constant size, is there?

Thanks!
David

>
>> +
>> +out:
>> +       free_absrange(vals, val_cnt);
>> +       if (retval < 0)
>> +               evdev_queue_syn_dropped(client);
>> +       return retval;
>> +}
>> +
>>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>>  {
>>         struct input_keymap_entry ke = {
>> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client 
>> *client,
>>
>>         spin_unlock(&dev->event_lock);
>>
>> -       __evdev_flush_queue(client, type);
>> +       __evdev_flush_queue(client, type, 0, UINT_MAX);
>>
>>         spin_unlock_irq(&client->buffer_lock);
>>
>> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned 
>> int cmd,
>>                 else
>>                         return evdev_revoke(evdev, client, file);
>>
>> +       case EVIOCGABSRANGE:
>> +               return evdev_handle_get_absrange(client, dev, p);
>> +
>>         case EVIOCGMASK:
>>                 if (copy_from_user(&mask, p, sizeof(mask)))
>>                         return -EFAULT;
>> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
>> index f6ace0e..32a6443 100644
>> --- a/include/uapi/linux/input.h
>> +++ b/include/uapi/linux/input.h
>> @@ -210,6 +210,48 @@ struct input_mask {
>>   */
>>  #define EVIOCSMASK             _IOW('E', 0x93, struct input_mask)      /* 
>> Set event-masks */
>>
>> +struct input_absrange {
>> +       __u16 slots;
>> +       __u16 code;
>> +       __u32 count;
>> +       __u64 buffer;
>> +};
>> +
>> +/**
>> + * EVIOCGABSRANGE - Fetch range of ABS values
>> + *
>> + * This fetches the current values of a range of ABS codes atomically. The 
>> range
>> + * of codes to fetch and the buffer-types are passed as "struct 
>> input_absrange",
>> + * which has the following fields:
>> + *      slots: Number of MT slots to fetch data for.
>> + *       code: First ABS axis to query.
>> + *      count: Number of ABS axes to query starting at @code.
>> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
>> + *             values. This buffer must be an array of __s32 with at least
>> + *             (@slots * @code) elements. The buffer is interpreted as two
>> + *             dimensional __s32 array, declared as: __s32[slots][codes]
>> + *
>> + * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
>> + * atomically regarding any concurrent buffer modifications. Furthermore, 
>> any
>> + * pending events for codes that were retrived via this call are flushed 
>> from
>> + * the client's receive buffer. But unlike EVIOCGABS, this ioctl only 
>> returns
>> + * the current value of an axis, rather than the whole "struct 
>> input_absinfo"
>> + * set. All fields of "struct input_absinfo" except for the value are 
>> constant,
>> + * though.
>> + *
>> + * The kernel's current view of the ABS axes is copied into the provided 
>> buffer.
>> + * If an ABS axis is not enabled on the device, its value will be zero. 
>> Also, if
>> + * an axis is not a slotted MT-axis, values for all but the first slot will 
>> be
>> + * 0. If @slots is greater than the actual number of slots provided by the
>> + * device, values for all slots higher than that will be 0.
>> + *
>> + * This call may fail with -EINVAL if the kernel doesn't support this call 
>> or
>> + * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM 
>> if the
>> + * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if 
>> the
>> + * passed pointer was invalid.
>> + */
>> +#define EVIOCGABSRANGE         _IOR('E', 0x94, struct input_absrange)
>> +
>>  #define EVIOCSCLOCKID          _IOW('E', 0xa0, int)                    /* 
>> Set clockid to be used for timestamps */
>>
>>  /*
>> --
>> 2.0.0
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to