Am 05.03.2018 um 23:33 schrieb Felix Kuehling:
On 2018-03-05 12:50 PM, Christian König wrote:
The easiest way to work around this to to add a separate IOCTL which
waits for VM updates to complete. Should be trivial to use
vm->last_update for this.
For the cost of an extra system call. That means every GPU mapping
operation now requires two system calls. With Spectre workarounds, that
can be a lot more expensive.
There is still very little overhead associated with that.

And additional to this as long you actually need to wait it doesn't
matter if you wait directly in one system call or if you split it into
two. The time from the initial call to the end of the operation stays
the same.
That's if you have waiting. For CPU page table updates, there is no
waiting, so the extra system call is wasted.

Another alternative would be to make the ioctl fail with -EINTR instead.
That way the Thunk would be in charge of handling the restart, and we
wouldn't have to pretend that no state was changed by the first call.

You can then also pipeline multiple mappings and wait for all of them
to finish in the end.
Our Thunk mapping function already bundles all the mappings it can do at
once into a single ioctl.

Please don't tell me you make multiple mappings in a single IOCTL, cause that would most likely be a show stopper as well.

See operations like this need to be transactional, e.g. either they complete or they return an error and don't do anything.

When you make multiple changes at the same time you either need to make sure that you fulfill all prerequisites before making the first change or you need to be able to rollback changes when a problem happens.

For an example how to do this see amdgpu_vm_bo_clear_mappings():
        /* Allocate all the needed memory */
        before = kzalloc(sizeof(*before), GFP_KERNEL);
        if (!before)
                return -ENOMEM;
        INIT_LIST_HEAD(&before->list);

        after = kzalloc(sizeof(*after), GFP_KERNEL);
        if (!after) {
                kfree(before);
                return -ENOMEM;
        }
        INIT_LIST_HEAD(&after->list);

Here we allocate memory for two nodes before actually making any changes, just to free it up at the end of the function when it isn't used:
        /* Insert partial mapping before the range */
        if (!list_empty(&before->list)) {
                amdgpu_vm_it_insert(before, &vm->va);
                if (before->flags & AMDGPU_PTE_PRT)
                        amdgpu_vm_prt_get(adev);
        } else {
                kfree(before);
        }

        /* Insert partial mapping after the range */
        if (!list_empty(&after->list)) {
                amdgpu_vm_it_insert(after, &vm->va);
                if (after->flags & AMDGPU_PTE_PRT)
                        amdgpu_vm_prt_get(adev);
        } else {
                kfree(after);
        }

Same is true for locking up handles etc..

The Thunk API must guarantee that memory can
be accessed when it returns. So we have no further room for improving
the pipelining without changing the Thunk API semantics.

Well then change the Thunk API.

Regards,
Christian.


Regards,
   Felix

Regards,
Christian.

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to