[PULL] drm-xe-next

2024-04-23 Thread Thomas Hellstrom
Hi, Dave, Sima

The main -next 6.10 pull request for the xe driver. I scanned through the
patches and tried to provide a somewhat condensed log below.

Nothing spectacular in the uAPI changes. Among other things, some flags
are reinstated now that we have a published UMD for them. Unfortunately
part of the underlying implementation somehow got lost in a backmerge,
but there is a patch pending to reinstate it. I will send another pull
request this week, or if you want I can resend this one with that patch
included once it passes review and CI.

Some hiccups, unfortunately, in that we carry a couple of i915 patches.
One was mistakenly committed to drm-xe-next, but Rodrigo later acked
keeping it in drm-xe-next to simplify handling. There is also one that
was part of the PM rework, and a fix for that patch.

Thanks,
Thomas

drm-xe-next-2024-04-23:
UAPI Changes:
- Remove unused flags (Francois Dugast)
- Extend uAPI to query HuC micro-controller firmware version (Francois Dugast)
- drm/xe/uapi: Define topology types as indexes rather than masks
  (Francois Dugast)
- drm/xe/uapi: Restore flags VM_BIND_FLAG_READONLY and VM_BIND_FLAG_IMMEDIATE
  (Francois Dugast)
- devcoredump updates. Some touching the output format.
  (José Roberto de Souza, Matthew Brost)
- drm/xe/hwmon: Add infra to support card power and energy attributes
- Improve LRC, HWSP and HWCTX error capture. (Maarten Lankhorst)
- drm/xe/uapi: Add IP version and stepping to GT list query (Matt Roper)
- Invalidate userptr VMA on page pin fault (Matthew Brost)
- Improve xe_bo_move tracepoint (Priyanka Dandamudi)
- Align fence output format in ftrace log

Cross-driver Changes:
- drm/i915/hwmon: Get rid of devm (Ashutosh Dixit)
  (Acked-by: Rodrigo Vivi )
- drm/i915/display: convert inner wakeref get towards get_if_in_use
  (SOB Rodrigo Vivi)
- drm/i915: Convert intel_runtime_pm_get_noresume towards raw wakeref
  (Committer, SOB Jani Nikula)

Driver Changes:
- Fix for unneeded CCS metadata allocation (Akshata Jahagirdar)
- Fix multicast support for Xe_LP platforms (Andrzej Hajda)
- A couple of build fixes (Arnd Bergmann)
- Fix register definition (Ashutosh Dixit)
- Add BMG mocs table (Balasubramani Vivekanandan)
- Replace sprintf() across driver (Bommu Krishnaiah)
- Add an xe2 workaround (Bommu Krishnaiah)
- Makefile fix (Dafna Hirschfeld)
- force_wake_get error value check (Daniele Ceraolo Spurio)
- Handle GSCCS ER interrupt (Daniele Ceraolo Spurio)
- GSC Workaround (Daniele Ceraolo Spurio)
- Build error fix (Dawei Li)
- drm/xe/gt: Add L3 bank mask to GT topology (Francois Dugast)
- Implement xe2- and GuC workarounds (Gustavo Sousa, Haridhar Kalvala,
  Himal Prasad Ghimiray, John Harrison, Matt Roper, Radhakrishna Sripada,
  Vinay Belgaumkar, Badal Nilawar)
- xe2hpg compression (Himal Ghimiray Prasad)
- Error code cleanups and fixes (Himal Prasad Ghimiray)
- struct xe_device cleanup (Jani Nikula)
- Avoid validating bos when only requesting an exec dma-fence
  (José Roberto de Souza)
- Remove debug message from migrate_clear (José Roberto de Souza)
- Nuke EXEC_QUEUE_FLAG_PERSISTENT leftover internal flag (José Roberto de Souza)
- Mark dpt and related vma as uncached (Juha-Pekka Heikkila)
- Hwmon updates (Karthik Poosa)
- Kconfig fix when ACPI_WMI is selected (Lu Yao)
- Update intel_uncore_read*() return types (Luca Coelho)
- Mocs updates (Lucas De Marchi, Matt Roper)
- Drop dynamic load-balancing workaround (Lucas De Marchi)
- Fix a PVC workaround (Lucas De Marchi)
- Group live kunit tests into a single module (Lucas De Marchi)
- Various code cleanups (Lucas De Marchi)
- Fix a ggtt init error path and move ggtt invalidate out of ggtt lock
  (Maarten Lankhorst)
- Fix a bo leak (Maarten Lankhorst)
- Add LRC parsing for more GPU instructions (Matt Roper)
- Add various definitions for hardware and IP (Matt Roper)
- Define all possible engines in media IP descriptors (Matt Roper)
- Various cleanups, asserts and code fixes (Matthew Auld)
- Various cleanups and code fixes (Matthew Brost)
- Increase VM_BIND number of per-ioctl Ops (Matthew Brost, Paulo Zanoni)
- Don't support execlists in xe_gt_tlb_invalidation layer (Matthew Brost)
- Handle timing out of already signaled jobs gracefully (Matthew Brost)
- Pipeline evict / restore of pinned BOs during suspend / resume (Matthew Brost)
- Do not grab forcewakes when issuing GGTT TLB invalidation via GuC
  (Matthew Brost)
- Drop ggtt invalidate from display code (Matthew Brost)
- drm/xe: Add XE_BO_GGTT_INVALIDATE flag (Matthew Brost)
- Add debug messages for MMU notifier and VMA invalidate (Matthew Brost)
- Use ordered wq for preempt fence waiting (Matthew Brost)
- Initial development for SR-IOV support including some refactoring
  (Michal Wajdeczko)
- Various GuC- and GT- related cleanups and fixes (Michal Wajdeczko)
- Move userptr over to start using hmm_range_fault (Oak Zeng)
- Add new PCI IDs to DG2 platform (Ravi Kumar Vodapalli)
- Pcode - and VRAM initialization check update (Riana 

[PULL] drm-xe-fixes

2024-03-07 Thread Thomas Hellstrom
Hi Dave, Sima

A single error path fix for 6.8 final (-rc8).

Thanks,
Thomas

drm-xe-fixes-2024-03-07:
Driver Changes:
- An error path fix.

The following changes since commit 90d35da658da8cff0d4ecbb5113f5fac9d00eb72:

  Linux 6.8-rc7 (2024-03-03 13:02:52 -0800)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-03-07

for you to fetch changes up to a4e7596e209783a7be2727d6b947cbd863c2bbcb:

  drm/xe: Return immediately on tile_init failure (2024-03-07 09:13:38 +0100)


Driver Changes:
- An error path fix.


Rodrigo Vivi (1):
  drm/xe: Return immediately on tile_init failure

 drivers/gpu/drm/xe/xe_tile.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)


[PULL] drm-xe-fixes

2024-02-29 Thread Thomas Hellstrom
Dave, Sima

The xe fixes for -rc7. It's mostly uapi sanitizing and future-proofing,
and a couple of driver fixes.

drm-xe-fixes-2024-02-29:
UAPI Changes:
- A couple of tracepoint updates from Priyanka and Lucas.
- Make sure BINDs are completed before accepting UNBINDs on LR vms.
- Don't arbitrarily restrict max number of batched binds.
- Add uapi for dumpable bos (agreed on IRC).
- Remove unused uapi flags and a leftover comment.

Driver Changes:
- A couple of fixes related to the execlist backend.
- A 32-bit fix.

/Thomas


The following changes since commit 6650d23f3e20ca00482a71a4ef900f0ea776fb15:

  drm/xe: Fix modpost warning on xe_mocs kunit module (2024-02-21 11:06:52 
+0100)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-02-29

for you to fetch changes up to 8188cae3cc3d8018ec97ca9ab8caa3acc69a056d:

  drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace (2024-02-29 
12:32:15 +0100)


UAPI Changes:
- A couple of tracepoint updates from Priyanka and Lucas.
- Make sure BINDs are completed before accepting UNBINDs on LR vms.
- Don't arbitrarily restrict max number of batched binds.
- Add uapi for dumpable bos (agreed on IRC).
- Remove unused uapi flags and a leftover comment.

Driver Changes:
- A couple of fixes related to the execlist backend.
- A 32-bit fix.

Arnd Bergmann (1):
  drm/xe/mmio: fix build warning for BAR resize on 32-bit

Francois Dugast (1):
  drm/xe/uapi: Remove unused flags

José Roberto de Souza (1):
  drm/xe/uapi: Remove DRM_XE_VM_BIND_FLAG_ASYNC comment left over

Lucas De Marchi (1):
  drm/xe: Use pointers in trace events

Maarten Lankhorst (1):
  drm/xe: Add uapi for dumpable bos

Matthew Brost (3):
  drm/xe: Fix execlist splat
  drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer
  drm/xe: Use vmalloc for array of bind allocation in bind IOCTL

Mika Kuoppala (2):
  drm/xe: Expose user fence from xe_sync_entry
  drm/xe: Deny unbinds if uapi ufence pending

Paulo Zanoni (1):
  drm/xe: get rid of MAX_BINDS

Priyanka Dandamudi (2):
  drm/xe/xe_bo_move: Enhance xe_bo_move trace
  drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace

 drivers/gpu/drm/xe/xe_bo.c  | 11 +++-
 drivers/gpu/drm/xe/xe_bo.h  |  1 +
 drivers/gpu/drm/xe/xe_drm_client.c  | 12 +---
 drivers/gpu/drm/xe/xe_exec_queue.c  | 88 +
 drivers/gpu/drm/xe/xe_exec_queue_types.h| 10 
 drivers/gpu/drm/xe/xe_execlist.c|  2 +-
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 12 
 drivers/gpu/drm/xe/xe_lrc.c | 10 +---
 drivers/gpu/drm/xe/xe_mmio.c|  2 +-
 drivers/gpu/drm/xe/xe_sync.c| 58 +++
 drivers/gpu/drm/xe/xe_sync.h|  4 ++
 drivers/gpu/drm/xe/xe_sync_types.h  |  2 +-
 drivers/gpu/drm/xe/xe_trace.h   | 59 +--
 drivers/gpu/drm/xe/xe_vm.c  | 80 ++
 drivers/gpu/drm/xe/xe_vm_types.h| 11 ++--
 include/uapi/drm/xe_drm.h   | 21 +--
 16 files changed, 187 insertions(+), 196 deletions(-)


[PULL] drm-xe-fixes

2024-02-22 Thread Thomas Hellstrom
Hi, Dave Sima

The Xe pull request for 6.8-rc6
The uAPI fixes / adjustments we've been discussing
are starting to appear, and I will hopefully have the rest
for next week's PR. In addition two driver fixes.

drm-xe-fixes-2024-02-22:
UAPI Changes:
- Remove support for persistent exec_queues
- Drop a redundant sysfs newline printout

Cross-subsystem Changes:

Core Changes:

Driver Changes:
- A three-patch fix for a VM_BIND rebind optimization path
- Fix a modpost warning on an xe KUNIT module

/Thomas

The following changes since commit b401b621758e46812da61fa58a67c3fd8d91de0d:

  Linux 6.8-rc5 (2024-02-18 12:56:25 -0800)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-02-22

for you to fetch changes up to 6650d23f3e20ca00482a71a4ef900f0ea776fb15:

  drm/xe: Fix modpost warning on xe_mocs kunit module (2024-02-21 11:06:52 
+0100)


UAPI Changes:
- Remove support for persistent exec_queues
- Drop a redundant sysfs newline printout

Cross-subsystem Changes:

Core Changes:

Driver Changes:
- A three-patch fix for a VM_BIND rebind optimization path
- Fix a modpost warning on an xe KUNIT module


Ashutosh Dixit (2):
  drm/xe/xe_gt_idle: Drop redundant newline in name
  drm/xe: Fix modpost warning on xe_mocs kunit module

Matthew Brost (3):
  drm/xe: Fix xe_vma_set_pte_size
  drm/xe: Add XE_VMA_PTE_64K VMA flag
  drm/xe: Return 2MB page size for compact 64k PTEs

Thomas Hellström (1):
  drm/xe/uapi: Remove support for persistent exec_queues

 drivers/gpu/drm/xe/tests/xe_mocs_test.c  |  1 +
 drivers/gpu/drm/xe/xe_device.c   | 39 
 drivers/gpu/drm/xe/xe_device.h   |  4 
 drivers/gpu/drm/xe/xe_device_types.h |  8 ---
 drivers/gpu/drm/xe/xe_exec_queue.c   | 33 ---
 drivers/gpu/drm/xe/xe_exec_queue_types.h | 10 
 drivers/gpu/drm/xe/xe_execlist.c |  2 --
 drivers/gpu/drm/xe/xe_gt_idle.c  |  4 ++--
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 --
 drivers/gpu/drm/xe/xe_pt.c   | 11 ++---
 drivers/gpu/drm/xe/xe_vm.c   | 14 
 drivers/gpu/drm/xe/xe_vm_types.h |  2 ++
 include/uapi/drm/xe_drm.h|  1 -
 13 files changed, 28 insertions(+), 103 deletions(-)


[PULL] drm-xe-fixes

2024-02-15 Thread Thomas Hellstrom
Hi Dave, Sima!

The xe fixes pull request for -rc5.
drm-xe-fixes-2024-02-15:
Driver Changes:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about an out-of-bounds index
- Fix a clang-16 compilation warning

Thanks,
Thomas

The following changes since commit bf4c27b8267d7848bb81fd41e6aa07aa662f07fb:

  drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR (2024-02-08 09:51:19 +0100)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-02-15

for you to fetch changes up to 455dae7549aed709707feda5d6b3e085b37d33f7:

  drm/xe: avoid function cast warnings (2024-02-15 09:53:38 +0100)


Driver Changes:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about an out-of-bounds index
- Fix a clang-16 compilation warning


Arnd Bergmann (1):
  drm/xe: avoid function cast warnings

Matthew Auld (1):
  drm/xe/display: fix i915_gem_object_is_shmem() wrapper

Thomas Hellström (2):
  drm/xe/vm: Avoid reserving zero fences
  drm/xe/pt: Allow for stricter type- and range checking

 .../xe/compat-i915-headers/gem/i915_gem_object.h   |  2 +-
 drivers/gpu/drm/xe/xe_pt.c | 39 ++
 drivers/gpu/drm/xe/xe_pt_walk.c|  2 +-
 drivers/gpu/drm/xe/xe_pt_walk.h| 19 ++-
 drivers/gpu/drm/xe/xe_range_fence.c|  7 +++-
 drivers/gpu/drm/xe/xe_vm.c | 13 ++--
 6 files changed, 46 insertions(+), 36 deletions(-)


[PULL] drm-xe-fixes

2024-02-08 Thread Thomas Hellstrom
Dave, Sima

The drm-xe-fixes pull for -rc4.

Thanks,
Thomas

drm-xe-fixes-2024-02-08:
Driver Changes:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer
  in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
The following changes since commit 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478:

  Linux 6.8-rc3 (2024-02-04 12:20:36 +)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-02-08

for you to fetch changes up to bf4c27b8267d7848bb81fd41e6aa07aa662f07fb:

  drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR (2024-02-08 09:51:19 +0100)


Driver Changes:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer
  in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics


Arnd Bergmann (1):
  drm/xe: circumvent bogus stringop-overflow warning

Matthew Auld (1):
  drm/xe/vm: don't ignore error when in_kthread

Matthew Brost (6):
  drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
  drm/xe: Take a reference in xe_exec_queue_last_fence_get()
  drm/xe: Pick correct userptr VMA to repin on REMAP op failure
  drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
  drm/xe: Assume large page size if VMA not yet bound
  drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR

Xiaoming Wang (1):
  drm/xe/display: Fix memleak in display initialization

 drivers/gpu/drm/xe/xe_display.c  |  6 
 drivers/gpu/drm/xe/xe_exec_queue.c   |  8 +++--
 drivers/gpu/drm/xe/xe_gt.c   |  5 ++-
 drivers/gpu/drm/xe/xe_gt_pagefault.c |  2 +-
 drivers/gpu/drm/xe/xe_migrate.c  | 28 
 drivers/gpu/drm/xe/xe_sched_job.c|  1 -
 drivers/gpu/drm/xe/xe_sync.c |  2 --
 drivers/gpu/drm/xe/xe_vm.c   | 62 ++--
 drivers/gpu/drm/xe/xe_vm_types.h |  8 -
 9 files changed, 57 insertions(+), 65 deletions(-)


[PULL] drm-xe-fixes

2024-02-01 Thread Thomas Hellstrom
Hi Dave and Sima,

The xe fixes for 6.8-rc2.

drm-xe-fixes-2024-02-01:
UAPI Changes:
- Only allow a single user-fence per exec / bind.
  The reason for this clarification fix is a limitation in the implementation
  which can be lifted moving forward, if needed.

Driver Changes:
- A crash fix
- A fix for an assert due to missing mem_access ref
- Some sparse warning fixes
- Two fixes for compilation failures on various odd
  combinations of gcc / arch pointed out on LKML.
- Fix a fragile partial allocation pointed out on LKML.

Cross-driver Change:
- A sysfs ABI documentation warning fix
  This also touches i915 and is acked by i915 maintainers.

Thanks,
Thomas

The following changes since commit 9e3a13f3eef6b14a26cc2660ca2f43f0e46b4318:

  drm/xe: Remove PVC from xe_wa kunit tests (2024-01-24 11:13:55 +0100)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-02-01

for you to fetch changes up to 5f16ee27cd5abd5166e28b2311ac693c204063ff:

  drm/hwmon: Fix abi doc warnings (2024-02-01 12:04:52 +0100)


UAPI Changes:
- Only allow a single user-fence per exec / bind.
  The reason for this clarification fix is a limitation in the implementation
  which can be lifted moving forward, if needed.

Driver Changes:
- A crash fix
- A fix for an assert due to missing mem_access ref
- Only allow a single user-fence per exec / bind.
- Some sparse warning fixes
- Two fixes for compilation failures on various odd
  combinations of gcc / arch pointed out on LKML.
- Fix a fragile partial allocation pointed out on LKML.

Cross-driver Change:
- A sysfs ABI documentation warning fix
  This also touches i915 and is acked by i915 maintainers.


Badal Nilawar (1):
  drm/hwmon: Fix abi doc warnings

José Roberto de Souza (1):
  drm/xe: Fix crash in trace_dma_fence_init()

Matt Roper (1):
  drm/xe: Grab mem_access when disabling C6 on skip_guc_pc platforms

Matthew Brost (3):
  drm/xe: Only allow 1 ufence per exec / bind IOCTL
  drm/xe: Use LRC prefix rather than CTX prefix in lrc desc defines
  drm/xe: Make all GuC ABI shift values unsigned

Thomas Hellström (3):
  drm/xe: Annotate mcr_[un]lock()
  drm/xe: Don't use __user error pointers
  drm/xe/vm: Subclass userptr vmas

 .../ABI/testing/sysfs-driver-intel-i915-hwmon  |  14 +-
 .../ABI/testing/sysfs-driver-intel-xe-hwmon|  14 +-
 drivers/gpu/drm/xe/abi/guc_actions_abi.h   |   4 +-
 drivers/gpu/drm/xe/abi/guc_actions_slpc_abi.h  |   4 +-
 drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h |   8 +-
 drivers/gpu/drm/xe/abi/guc_klvs_abi.h  |   6 +-
 drivers/gpu/drm/xe/abi/guc_messages_abi.h  |  20 +--
 drivers/gpu/drm/xe/xe_exec.c   |  10 +-
 drivers/gpu/drm/xe/xe_gt_mcr.c |   4 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c   |  11 +-
 drivers/gpu/drm/xe/xe_guc_pc.c |   2 +
 drivers/gpu/drm/xe/xe_hw_fence.c   |   6 +-
 drivers/gpu/drm/xe/xe_lrc.c|  14 +-
 drivers/gpu/drm/xe/xe_pt.c |  32 ++--
 drivers/gpu/drm/xe/xe_query.c  |  50 +++
 drivers/gpu/drm/xe/xe_sync.h   |   5 +
 drivers/gpu/drm/xe/xe_vm.c | 165 -
 drivers/gpu/drm/xe/xe_vm.h |  16 +-
 drivers/gpu/drm/xe/xe_vm_types.h   |  16 +-
 19 files changed, 234 insertions(+), 167 deletions(-)


[PULL] drm-xe-fixes

2024-01-25 Thread Thomas Hellstrom
Hi, Dave, Sima

The Xe fixes PR for 6.8-rc2.

Thanks, Thomas.

The following changes since commit 6613476e225e090cc9aad49be7fa504e290dd33d:

  Linux 6.8-rc1 (2024-01-21 14:11:32 -0800)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-01-25

for you to fetch changes up to 9e3a13f3eef6b14a26cc2660ca2f43f0e46b4318:

  drm/xe: Remove PVC from xe_wa kunit tests (2024-01-24 11:13:55 +0100)


Driver Changes:
- Make an ops struct static
- Fix an implicit 0 to NULL conversion
- A couple of 32-bit fixes
- A migration coherency fix for Lunar Lake.
- An error path vm id leak fix
- Remove PVC references in kunit tests


Himal Prasad Ghimiray (1):
  drm/xe/xe2: Use XE_CACHE_WB pat index

Lucas De Marchi (4):
  drm/xe: Use _ULL for u64 division
  drm/xe/mmio: Cast to u64 when printing
  drm/xe/display: Avoid calling readq()
  drm/xe: Remove PVC from xe_wa kunit tests

Moti Haimovski (1):
  drm/xe/vm: bugfix in xe_vm_create_ioctl

Thomas Hellström (2):
  drm/xe/dmabuf: Make xe_dmabuf_ops static
  drm/xe: Use a NULL pointer instead of 0.

 .../xe/compat-i915-headers/gem/i915_gem_object.h   | 11 +--
 drivers/gpu/drm/xe/tests/xe_wa_test.c  |  3 ---
 drivers/gpu/drm/xe/xe_device.c |  2 +-
 drivers/gpu/drm/xe/xe_dma_buf.c|  2 +-
 drivers/gpu/drm/xe/xe_hwmon.c  |  2 +-
 drivers/gpu/drm/xe/xe_migrate.c| 14 ++---
 drivers/gpu/drm/xe/xe_mmio.c   |  4 ++--
 drivers/gpu/drm/xe/xe_vm.c | 23 +-
 8 files changed, 31 insertions(+), 30 deletions(-)


[git-pull] vmwgfx-next

2020-03-16 Thread Thomas Hellstrom (VMware)
Dave, Daniel,

The first vmwgfx-next pull for 5.7. Roland Scheidegger will follow up with
a larger pull request for functionality needed for GL4 support.

- Disable DMA when using SEV encryption
- An -RT fix
- Code cleanups

Thanks,
Thomas

The following changes since commit d3bd37f587b4438d47751d0f1d5aaae3d39bd416:

  Merge v5.6-rc5 into drm-next (2020-03-11 07:27:21 +1000)

are available in the Git repository at:

  git://people.freedesktop.org/~thomash/linux vmwgfx-next

for you to fetch changes up to 6b656755428dc0c96d21d7af697dc2a10c7ff175:

  drm/vmwgfx: Replace zero-length array with flexible-array member (2020-03-16 
10:42:01 +0100)


Gustavo A. R. Silva (1):
  drm/vmwgfx: Replace zero-length array with flexible-array member

Sebastian Andrzej Siewior (2):
  drm/vmwgfx: Drop preempt_disable() in vmw_fifo_ping_host()
  drm/vmwgfx: Remove a few unused functions

Thomas Hellstrom (2):
  drm/vmwgfx: Fix the refuse_dma mode when using guest-backed objects
  drm/vmwgfx: Refuse DMA operation when SEV encryption is active

 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c |  3 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c| 11 +++-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h| 28 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c   |  2 -
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c| 81 --
 drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c| 31 
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c   |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c|  2 +-
 9 files changed, 14 insertions(+), 148 deletions(-)


Re: [PATCH v4 5/9] drm/ttm, drm/vmwgfx: Support huge TTM pagefaults

2020-03-01 Thread Thomas Hellstrom
On Sun, 2020-03-01 at 21:49 +0800, Hillf Danton wrote:
> On Thu, 20 Feb 2020 13:27:15 +0100 Thomas Hellstrom wrote:
> > +
> > +static vm_fault_t ttm_bo_vm_huge_fault(struct vm_fault *vmf,
> > +  enum page_entry_size pe_size)
> > +{
> > +   struct vm_area_struct *vma = vmf->vma;
> > +   pgprot_t prot;
> > +   struct ttm_buffer_object *bo = vma->vm_private_data;
> > +   vm_fault_t ret;
> > +   pgoff_t fault_page_size = 0;
> > +   bool write = vmf->flags & FAULT_FLAG_WRITE;
> > +
> > +   switch (pe_size) {
> > +   case PE_SIZE_PMD:
> > +   fault_page_size = HPAGE_PMD_SIZE >> PAGE_SHIFT;
> > +   break;
> > +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> > +   case PE_SIZE_PUD:
> > +   fault_page_size = HPAGE_PUD_SIZE >> PAGE_SHIFT;
> > +   break;
> > +#endif
> > +   default:
> > +   WARN_ON_ONCE(1);
> > +   return VM_FAULT_FALLBACK;
> > +   }
> > +
> > +   /* Fallback on write dirty-tracking or COW */
> > +   if (write && ttm_pgprot_is_wrprotecting(vma->vm_page_prot))
> > +   return VM_FAULT_FALLBACK;
> > +
> > +   ret = ttm_bo_vm_reserve(bo, vmf);
> > +   if (ret)
> > +   return ret;
> > +
> > +   prot = vm_get_page_prot(vma->vm_flags);
> > +   ret = ttm_bo_vm_fault_reserved(vmf, prot, 1, fault_page_size);
> > +   if (ret == VM_FAULT_RETRY && !(vmf->flags &
> > FAULT_FLAG_RETRY_NOWAIT))
> > +   return ret;
> 
> Seems it does not make much sense to check VM_FAULT_RETRY and return
> as
> at least resv lock is left behind without care.

With this particular flag combination, both the mm_sem and the dma_resv
lock have already been released by TTM.

It's a special case allowing for drivers to release the mmap_sem when
waiting for IO.

That should probably be documented better in TTM.
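
As a minimal sketch (hypothetical driver code, not the actual TTM
implementation, and assuming the fault_page_size parameter added by this
series), that special case means a driver fault handler looks roughly
like this:

#include <drm/ttm/ttm_bo_api.h>

static vm_fault_t my_driver_vm_fault(struct vm_fault *vmf)
{
	struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
	pgprot_t prot = vm_get_page_prot(vmf->vma->vm_flags);
	vm_fault_t ret;

	ret = ttm_bo_vm_reserve(bo, vmf);
	if (ret)
		return ret;

	/* 1, 1: no extra prefaulting, ordinary page size in this sketch */
	ret = ttm_bo_vm_fault_reserved(vmf, prot, 1, 1);
	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
		/* mmap_sem and the bo's dma_resv were already dropped by TTM */
		return ret;
	}

	/* In all other cases the bo is still reserved and must be unlocked. */
	dma_resv_unlock(bo->base.resv);
	return ret;
}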

/Thomas



Re: [PATCH 1/2] drm/vmwgfx: Drop preempt_disable() in vmw_fifo_ping_host()

2020-02-26 Thread Thomas Hellstrom
On Mon, 2020-02-24 at 15:07 +0100, Sebastian Andrzej Siewior wrote:
> vmw_fifo_ping_host() disables preemption around a test and a register
> write via vmw_write(). The write function acquires a spinlock_t typed
> lock which is not allowed in a preempt_disable()ed section on
> PREEMPT_RT. This has been reported in the bugzilla.
> 
> It has been explained by Thomas Hellstrom that this
> preempt_disable()ed
> section is not required for correctness.
> 
> Remove the preempt_disable() section.
> 

Hi, Sebastian,

I suppose there isn't something like a preempt_disable_unless_RT()
macro?

If not,
Reviewed-by: Thomas Hellstrom 

I'll include it in the next vmwgfx-next pull request.

Thanks,
Thomas



Re: [PATCH v3 20/22] drm/vmwgfx: Convert to CRTC VBLANK callbacks

2020-01-20 Thread Thomas Hellstrom
On 1/20/20 9:23 AM, Thomas Zimmermann wrote:
> VBLANK callbacks in struct drm_driver are deprecated in favor of
> their equivalents in struct drm_crtc_funcs. Convert vmwgfx over.
>
> v2:
>   * remove accidental whitespace fixes
>
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  | 3 ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  | 6 +++---
>  drivers/gpu/drm/vmwgfx/vmwgfx_kms.c  | 6 +++---
>  drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c  | 3 +++
>  drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 3 +++
>  drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 3 +++
>  6 files changed, 15 insertions(+), 9 deletions(-)
>
Acked-by: Thomas Hellstrom 




Re: [PATCH v4 1/2] mm: Add a vmf_insert_mixed_prot() function

2019-12-12 Thread Thomas Hellstrom
On 12/12/19 9:48 AM, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom 
>
> The TTM module today uses a hack to be able to set a different page
> protection than struct vm_area_struct::vm_page_prot. To be able to do
> this properly, add the needed vm functionality as vmf_insert_mixed_prot().
>
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: "Matthew Wilcox (Oracle)" 
> Cc: "Kirill A. Shutemov" 
> Cc: Ralph Campbell 
> Cc: "Jérôme Glisse" 
> Cc: "Christian König" 
> Signed-off-by: Thomas Hellstrom 
> Acked-by: Christian König 
> ---
>  include/linux/mm.h   |  2 ++
>  include/linux/mm_types.h |  7 ++-
>  mm/memory.c  | 43 
>  3 files changed, 47 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index cc292273e6ba..29575d3c1e47 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2548,6 +2548,8 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct 
> *vma, unsigned long addr,
>   unsigned long pfn, pgprot_t pgprot);
>  vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
>   pfn_t pfn);
> +vm_fault_t vmf_insert_mixed_prot(struct vm_area_struct *vma, unsigned long 
> addr,
> + pfn_t pfn, pgprot_t pgprot);
>  vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma,
>   unsigned long addr, pfn_t pfn);
>  int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned 
> long len);
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index fa795284..ac96afdbb4bc 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -307,7 +307,12 @@ struct vm_area_struct {
>   /* Second cache line starts here. */
>  
>   struct mm_struct *vm_mm;/* The address space we belong to. */
> - pgprot_t vm_page_prot;  /* Access permissions of this VMA. */
> +
> + /*
> +  * Access permissions of this VMA.
> +  * See vmf_insert_mixed() for discussion.

Typo. It will of course be vmf_insert_mixed_prot() in the final version.




Re: [PATCH] drm/vmwgfx: prevent memory leak in vmw_context_define

2019-12-11 Thread Thomas Hellstrom
On 9/25/19 6:46 AM, Navid Emamdoost wrote:
> In vmw_context_define if vmw_context_init fails the allocated resource
> should be unreferenced. The goto label was fixed.
>
> Signed-off-by: Navid Emamdoost 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_context.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_context.c
> index a56c9d802382..ac42f8a6acf0 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_context.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_context.c
> @@ -773,7 +773,7 @@ static int vmw_context_define(struct drm_device *dev, 
> void *data,
>  
>   ret = vmw_context_init(dev_priv, res, vmw_user_context_free, dx);
>   if (unlikely(ret != 0))
> - goto out_unlock;
> + goto out_err;
>  
>   tmp = vmw_resource_reference(&ctx->res);
>   ret = ttm_base_object_init(tfile, &ctx->base, false, VMW_RES_CONTEXT,

This patch doesn't look correct. vmw_context_init should free up all
resources if failing.

Thanks,

Thomas




Re: [PATCH] drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add

2019-12-11 Thread Thomas Hellstrom
On 9/25/19 6:38 AM, Navid Emamdoost wrote:
> In vmw_cmdbuf_res_add if drm_ht_insert_item fails the allocated memory
> for cres should be released.
>
> Signed-off-by: Navid Emamdoost 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
> index 4ac55fc2bf97..44d858ce4ce7 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
> @@ -209,8 +209,10 @@ int vmw_cmdbuf_res_add(struct vmw_cmdbuf_res_manager 
> *man,
>  
>   cres->hash.key = user_key | (res_type << 24);
>   ret = drm_ht_insert_item(&man->resources, &cres->hash);
> - if (unlikely(ret != 0))
> + if (unlikely(ret != 0)) {
> + kfree(cres);
>   goto out_invalid_key;
> + }
>  
>   cres->state = VMW_CMDBUF_RES_ADD;
>   cres->res = vmw_resource_reference(res);

Reviewed-by: Thomas Hellstrom 

Will be part of next vmwgfx-next pull.

Thanks,

Thomas




Re: [PATCH v2] drm/vmwgfx: Call vmw_driver_{load,unload}() before registering device

2019-12-10 Thread Thomas Hellstrom
On 12/10/19 1:43 PM, Thomas Zimmermann wrote:
> The load/unload callbacks in struct drm_driver are deprecated. Remove
> them and call functions explicitly.
>
> v2:
>   * remove accidental whitespace fix
>
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Thomas Hellstrom 

I'll take this through vmwgfx-next unless told otherwise.

Thanks,

Thomas




Re: [PATCH] drm/vmwgfx: Call vmw_driver_{load,unload}() before registering device

2019-12-10 Thread Thomas Hellstrom
On 12/9/19 12:06 PM, Thomas Zimmermann wrote:
> The load/unload callbacks in struct drm_driver are deprecated. Remove
> them and call functions explicitly.
>
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 44 +
>  1 file changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index e962048f65d2..f34f1eb57cfa 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -28,10 +28,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -1211,8 +1211,10 @@ static void vmw_remove(struct pci_dev *pdev)
>  {
>   struct drm_device *dev = pci_get_drvdata(pdev);
>
> + drm_dev_unregister(dev);
> + vmw_driver_unload(dev);
> + drm_dev_put(dev);
>   pci_disable_device(pdev);
> - drm_put_dev(dev);
>  }
>
>  static int vmwgfx_pm_notifier(struct notifier_block *nb, unsigned long val,
> @@ -1329,7 +1331,7 @@ static int vmw_pm_freeze(struct device *kdev)
>
>   vmw_fence_fifo_down(dev_priv->fman);
>   __vmw_svga_disable(dev_priv);
> -
> +

Unrelated whitespace-fixup.

Otherwise looks good, but still conflicts in the above hunk when I try
to apply it. Could be some TAB-mangling on the way perhaps.

Could you remove that hunk and resend?

Thanks,

Thomas




Re: [PATCH] drm/vmwgfx: Call vmw_driver_{load,unload}() before registering device

2019-12-10 Thread Thomas Hellstrom
Hi,

On 12/9/19 12:06 PM, Thomas Zimmermann wrote:
> The load/unload callbacks in struct drm_driver are deprecated. Remove
> them and call functions explicitly.
>
> Signed-off-by: Thomas Zimmermann 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 44 +++

Hmm, which tree is this diff against? I get

Applying: drm/vmwgfx: Call vmw_driver_{load, unload}() before
registering device
Using index info to reconstruct a base tree...
M  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
error: patch failed: drivers/gpu/drm/vmwgfx/vmwgfx_drv.c:1329
error: drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
Patch failed at 0001 drm/vmwgfx: Call vmw_driver_{load, unload}() before
registering device

On both drm-misc-next and linus' master?


Thanks,

Thomas




Re: [PATCH] drm/vmwgfx: Replace deprecated PTR_RET

2019-12-09 Thread Thomas Hellstrom
On Sun, 2019-12-08 at 11:53 +0100, Lukas Bulwahn wrote:
> Commit 508108ea2747 ("drm/vmwgfx: Don't refcount command-buffer
> managed
> resource lookups during command buffer validation") slips in use of
> deprecated PTR_RET. Use PTR_ERR_OR_ZERO instead.
> 
> As the PTR_ERR_OR_ZERO is a bit longer than PTR_RET, we introduce
> local variable ret for proper indentation and line-length limits.
> 
> Signed-off-by: Lukas Bulwahn 
> ---
> applies cleanly on current master (9455d25f4e3b) and next-20191207
> compile-tested on x86_64_defconfig + DRM_VMWGFX=y
> 
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index 934ad7c0c342..73489a45decb 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -2377,9 +2377,12 @@ static int
> vmw_cmd_dx_clear_rendertarget_view(struct vmw_private *dev_priv,
>  {
>   VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXClearRenderTargetView) =
>   container_of(header, typeof(*cmd), header);
> + struct vmw_resource *ret;
>  
> - return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_rt,
> -cmd-
> >body.renderTargetViewId));
> + ret = vmw_view_id_val_add(sw_context, vmw_view_rt,
> +   cmd->body.renderTargetViewId);
> +
> + return PTR_ERR_OR_ZERO(ret);
>  }
>  
>  /**
> @@ -2396,9 +2399,12 @@ static int
> vmw_cmd_dx_clear_depthstencil_view(struct vmw_private *dev_priv,
>  {
>   VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXClearDepthStencilView) =
>   container_of(header, typeof(*cmd), header);
> + struct vmw_resource *ret;
> +
> + ret = vmw_view_id_val_add(sw_context, vmw_view_ds,
> +   cmd->body.depthStencilViewId);
>  
> - return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_ds,
> -cmd-
> >body.depthStencilViewId));
> + return PTR_ERR_OR_ZERO(ret);
>  }
>  
>  static int vmw_cmd_dx_view_define(struct vmw_private *dev_priv,
> @@ -2741,9 +2747,12 @@ static int vmw_cmd_dx_genmips(struct
> vmw_private *dev_priv,
>  {
>   VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXGenMips) =
>   container_of(header, typeof(*cmd), header);
> + struct vmw_resource *ret;
> +
> + ret = vmw_view_id_val_add(sw_context, vmw_view_sr,
> +   cmd->body.shaderResourceViewId);
>  
> - return PTR_RET(vmw_view_id_val_add(sw_context, vmw_view_sr,
> -cmd-
> >body.shaderResourceViewId));
> + return PTR_ERR_OR_ZERO(ret);
>  }
>  
>  /**

Reviewed-by: Thomas Hellstrom 

I will include this in vmwgfx-next.
Thanks,
Thomas


Re: [PATCH v3 2/2] mm, drm/ttm: Fix vm page protection handling

2019-12-06 Thread Thomas Hellstrom
Hi Michal,

On Fri, 2019-12-06 at 11:30 +0100, Michal Hocko wrote:
> On Fri 06-12-19 09:24:26, Thomas Hellström (VMware) wrote:
> [...]
> > @@ -283,11 +282,26 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct
> > vm_fault *vmf,
> > pfn = page_to_pfn(page);
> > }
> >  
> > +   /*
> > +* Note that the value of @prot at this point may
> > differ from
> > +* the value of @vma->vm_page_prot in the caching- and
> > +* encryption bits. This is because the exact location
> > of the
> > +* data may not be known at mmap() time and may also
> > change
> > +* at arbitrary times while the data is mmap'ed.
> > +* This is ok as long as @vma->vm_page_prot is not used
> > by
> > +* the core vm to set caching- and encryption bits.
> > +* This is ensured by core vm using pte_modify() to
> > modify
> > +* page table entry protection bits (that function
> > preserves
> > +* old caching- and encryption bits), and the @fault
> > +* callback being the only function that creates new
> > +* page table entries.
> > +*/
> 
> While this is a very valuable piece of information I believe we need
> to
> document this in the generic code where everybody will find it.
> vmf_insert_mixed_prot sounds like a good place to me. So being
> explicit
> about VM_MIXEDMAP. Also a reference from vm_page_prot to this
> function
> would be really helpeful.
> 
> Thanks!
> 

Just to make sure I understand correctly. You'd prefer this (or
similar) text to be present at the vmf_insert_mixed_prot() and
vmf_insert_pfn_prot() definitions for MIXEDMAP and PFNMAP respectively,
and a pointer from vm_page_prot to that text. Is that correct?

Thanks,
Thomas



Re: locking refcounting for ttm_bo_kmap/dma_buf_vmap

2019-11-20 Thread Thomas Hellstrom
On 11/20/19 1:24 PM, Christian König wrote:
> Am 20.11.19 um 13:19 schrieb Daniel Vetter:
>> On Wed, Nov 20, 2019 at 1:09 PM Daniel Vetter  wrote:
>>> On Wed, Nov 20, 2019 at 1:02 PM Christian König
>>>  wrote:
> What am I missing?
 The assumption is that when you want to create a vmap of a DMA-buf
 buffer the buffer needs to be pinned somehow.

 E.g. without dynamic dma-buf handling you would need to have an active
 attachment. With dynamic handling the requirements could be lowered to
 using the pin()/unpin() callbacks.
>>> Yeah right now everyone seems to have an attachment, and that's how we
>>> get away with all this. But the interface isn't supposed to work like
>>> that, dma_buf_vmap/unmap is supposed to be a stand-alone thing (you
>>> can call it directly on the struct dma_buf, no need for an
>>> attachment). Also I don't think non-dynamic drivers should ever call
>>> pin/unpin, not their job, holding onto a mapping should be enough to
>>> get things pinned.
>>>
 You also can't lock/unlock inside your vmap callback because you don't
 have any guarantee that the pointer stays valid as soon as your drop
 your lock.
>>> Well that's why I asked where the pin/unpin pair is. If you lock,
>>> then you do know that the pointer will stay around. But looks like the
>>> original patch from Dave for ttm based drivers played fast here
>>> with what should be done.
>>>
 BTW: What is vmap() still used for?
>>> udl, bunch of other things (e.g. bunch of tiny drivers). Not much, but
>>> not stuff we can drop.
>> If we're unlucky we'll actually have a problem with these now. For
>> some of these the attached device is not dma-capable, so dma_map_sg
>> will go boom. With the cached mapping logic we now have this might go
>> sideways for dynamic exporters. Did you test your dynamic dma-buf
>> support for amdgpu with udl?
> Short answer no, not at all. Long one: What the heck is udl? And how is 
> it not dma-capable?
>
>> Worst case we need to get rid of the fake
>> attachment, fix the vmap locking/pinning, and maybe some more
>> headaches to sort this out.
> Well of hand we could require that vmap will also pin a DMA-buf and 
> start fixing amgpu/nouveau/radeon/qxl.

Perhaps with dynamic dma-bufs it might be possible to do something
similar to vmwgfx (and recently other?) fbdev:

The map is cached, but may be invalidated as soon as we release dma_resv
/ unpin. (move_notify() unmaps if needed).

So each time it's needed we make sure we're locked / pinned and then
call a map_validate() function. Typically the map is still around. If it
isn't, the map_validate() function sets it up again.

Saves a bunch of vmap() calls or the need for persistent pinning for
performance reasons.
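
Roughly, the idea as a sketch with hypothetical names (cached_vmap_bo,
cached_vmap_validate() and my_driver_vmap() are not existing TTM or
vmwgfx API, just an illustration of the cached-map scheme above):

#include <drm/ttm/ttm_bo_api.h>

struct cached_vmap_bo {
	struct ttm_buffer_object base;
	void *cached_vaddr;	/* cleared by move_notify() when invalidated */
};

static void *my_driver_vmap(struct cached_vmap_bo *bo);	/* vmap helper */

/*
 * Caller must hold the bo reserved (dma_resv) and/or pinned, so the
 * mapping cannot be invalidated while it is being used.
 */
static void *cached_vmap_validate(struct cached_vmap_bo *bo)
{
	if (!bo->cached_vaddr)
		bo->cached_vaddr = my_driver_vmap(bo);	/* set it up again */

	return bo->cached_vaddr;	/* usually still valid: no new vmap() */
}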

/Thomas




>
> Christian.
>


Re: [PATCH 10/15] drm/vmwgfx: Delete mmaping functions

2019-11-18 Thread Thomas Hellstrom
On Mon, 2019-11-18 at 11:35 +0100, Daniel Vetter wrote:
> No need for stubs, dma-buf.c takes care of that.
> 
> Aside, not having a ->release callback smelled like refcounting leak
> somewhere. It will also score you a WARN_ON backtrace in dma-buf.c on
> every export. But then I found that ttm_device_object_init overwrites
> it. Overwriting const memory is not going to go down well in recent
> kernels.

It's actually taking a non-const copy and updating it. Not that that's
much better, but at least it won't crash due to writing to wp memory.
I'll add a backlog item to revisit this.
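
For reference, a sketch of the pattern referred to here (hypothetical
names, not the verbatim TTM code): the const template ops are copied
into a per-device struct, and only that writable copy is updated.

#include <linux/dma-buf.h>

struct my_prime_dev {
	struct dma_buf_ops ops;	/* writable per-device copy of the template */
	void (*orig_release)(struct dma_buf *dma_buf);
};

static void my_prime_dmabuf_release(struct dma_buf *dma_buf)
{
	/* driver-specific cleanup here, then chain to the saved orig_release */
}

static void my_prime_dev_init(struct my_prime_dev *tdev,
			      const struct dma_buf_ops *tmpl)
{
	tdev->ops = *tmpl;			/* take a non-const copy ... */
	tdev->orig_release = tdev->ops.release;
	tdev->ops.release = my_prime_dmabuf_release;	/* ... and update it */
}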

> 
> One more aside: The (un)map_dma_buf can't ever be called because
> ->attach rejects everything. Might want to drop a BUG_ON(1) in there.
> Same for ->detach.

And this.

> 
> Signed-off-by: Daniel Vetter 
> Cc: VMware Graphics 
> Cc: Thomas Hellstrom 
> ---
> 


Reviewed-by: Thomas Hellstrom 

Will you be taking this through drm-misc?

Thanks,
Thomas


Re: [PATCH 3/5] drm/vmwgfx: drop DRM_AUTH for render ioctls

2019-11-12 Thread Thomas Hellstrom
On 11/1/19 2:05 PM, Emil Velikov wrote:
> From: Emil Velikov 
>
> With earlier commit 9c84aeba67cc ("drm/vmwgfx: Kill unneeded legacy
> security features") we removed the no longer applicable validation, as
> we now have isolation of primary clients from different master realms.
>
> As of last commit, we're explicitly checking for authentication in the
> only render ioctls which care about one.
>
> With those in place, the DRM_AUTH token serves no real purpose. Let's
> drop it.
>
> Cc: VMware Graphics 
> Cc: Thomas Hellstrom 
> Signed-off-by: Emil Velikov 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 81a95651643f..253fae160175 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -165,9 +165,9 @@
>  
>  static const struct drm_ioctl_desc vmw_ioctls[] = {
>   VMW_IOCTL_DEF(VMW_GET_PARAM, vmw_getparam_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_ALLOC_DMABUF, vmw_bo_alloc_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_UNREF_DMABUF, vmw_bo_unref_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_CURSOR_BYPASS,
> @@ -182,16 +182,16 @@ static const struct drm_ioctl_desc vmw_ioctls[] = {
> DRM_MASTER),
>  
>   VMW_IOCTL_DEF(VMW_CREATE_CONTEXT, vmw_context_define_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_UNREF_CONTEXT, vmw_context_destroy_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_CREATE_SURFACE, vmw_surface_define_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_UNREF_SURFACE, vmw_surface_destroy_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_REF_SURFACE, vmw_surface_reference_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> - VMW_IOCTL_DEF(VMW_EXECBUF, vmw_execbuf_ioctl, DRM_AUTH |
> +   DRM_RENDER_ALLOW),
> + VMW_IOCTL_DEF(VMW_EXECBUF, vmw_execbuf_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_FENCE_WAIT, vmw_fence_obj_wait_ioctl,
> DRM_RENDER_ALLOW),
> @@ -201,9 +201,9 @@ static const struct drm_ioctl_desc vmw_ioctls[] = {
>   VMW_IOCTL_DEF(VMW_FENCE_UNREF, vmw_fence_obj_unref_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_FENCE_EVENT, vmw_fence_event_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_GET_3D_CAP, vmw_get_cap_3d_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>  
>   /* these allow direct access to the framebuffers mark as master only */
>   VMW_IOCTL_DEF(VMW_PRESENT, vmw_present_ioctl,
> @@ -221,28 +221,28 @@ static const struct drm_ioctl_desc vmw_ioctls[] = {
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_CREATE_SHADER,
> vmw_shader_define_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_UNREF_SHADER,
> vmw_shader_destroy_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_GB_SURFACE_CREATE,
> vmw_gb_surface_define_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_GB_SURFACE_REF,
> vmw_gb_surface_reference_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_SYNCCPU,
> vmw_user_bo_synccpu_ioctl,
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_CREATE_EXTENDED_CONTEXT,
> vmw_extended_context_define_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_GB_SURFACE_CREATE_EXT,
> vmw_gb_surface_define_ext_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_GB_SURFACE_REF_EXT,
> vmw_gb_surface_reference_ext_ioctl,
> -   DRM_AUTH | DRM_RENDER_ALLOW),
> +   DRM_RENDER_ALLOW),

Re: [PATCH 2/5] drm/vmwgfx: check master authentication in surface_ref ioctls

2019-11-12 Thread Thomas Hellstrom
On 11/1/19 2:05 PM, Emil Velikov wrote:
> From: Emil Velikov 
>
> With later commit we'll rework DRM authentication handling. Namely
> DRM_AUTH will not be a requirement for DRM_RENDER_ALLOW ioctls.
>
> Since vmwgfx does isolation for primary clients in different master
> realms, the DRM_AUTH can be dropped.
>
> The only place where authentication matters, is surface_reference ioctls
> whenever a legacy (non-prime) handle is used. For those ioctls we call
> vmw_surface_handle_reference(), where we explicitly check if the client
> is both a) master and b) unauthenticated - bailing out as result.
>
> Otherwise the usual isolation path kicks in and we're all good.
>
> v2: Reword commit message, since the isolation work has landed.
>
> Cc: VMware Graphics 
> Cc: Thomas Hellstrom 
> Signed-off-by: Emil Velikov 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> index 1f989f3605c8..596e5c1bc2c1 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> @@ -936,6 +936,13 @@ vmw_surface_handle_reference(struct vmw_private 
> *dev_priv,
>   user_srf = container_of(base, struct vmw_user_surface,
>   prime.base);
>  
> + /* Error out if we are unauthenticated master */

Shouldn't this be "Error out if we are unauthenticated primary" ?

Otherwise

Reviewed-by: Thomas Hellstrom 


> + if (drm_is_primary_client(file_priv) &&
> + !file_priv->authenticated) {
> + ret = -EACCES;
> + goto out_bad_resource;
> + }
> +
>   /*
>* Make sure the surface creator has the same
>* authenticating master, or is already registered with us.



Re: [PATCH 1/5] drm/vmwgfx: move the require_exist handling together

2019-11-12 Thread Thomas Hellstrom
On 11/1/19 2:05 PM, Emil Velikov wrote:
> From: Emil Velikov 
>
> Move the render_client hunk for require_exist alongside the rest.
> Keeping all the reasons why an existing object is needed, in a single
> place makes it easier to follow.
>
> Cc: VMware Graphics 
> Cc: Thomas Hellstrom 
> Signed-off-by: Emil Velikov 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c 
> b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> index 29d8794f0421..1f989f3605c8 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
> @@ -909,16 +909,12 @@ vmw_surface_handle_reference(struct vmw_private 
> *dev_priv,
>   uint32_t handle;
>   struct ttm_base_object *base;
>   int ret;
> - bool require_exist = false;
>  
>   if (handle_type == DRM_VMW_HANDLE_PRIME) {
>   ret = ttm_prime_fd_to_handle(tfile, u_handle, &handle);
>   if (unlikely(ret != 0))
>   return ret;
>   } else {
> - if (unlikely(drm_is_render_client(file_priv)))
> - require_exist = true;
> -
>   handle = u_handle;
>   }
>  
> @@ -935,6 +931,8 @@ vmw_surface_handle_reference(struct vmw_private *dev_priv,
>   }
>  
>   if (handle_type != DRM_VMW_HANDLE_PRIME) {
> + bool require_exist = false;
> +
>   user_srf = container_of(base, struct vmw_user_surface,
>   prime.base);
>  
> @@ -946,6 +944,9 @@ vmw_surface_handle_reference(struct vmw_private *dev_priv,
>   user_srf->master != file_priv->master)
>   require_exist = true;
>  
> + if (unlikely(drm_is_render_client(file_priv)))
> + require_exist = true;
> +
>   ret = ttm_ref_object_add(tfile, base, TTM_REF_USAGE, NULL,
>require_exist);
>   if (unlikely(ret != 0)) {

Reviewed-by: Thomas Hellstrom 



Re: [PATCH 1/5] drm/vmwgfx: move the require_exist handling together

2019-11-08 Thread Thomas Hellstrom
Hi, Emil!

On 11/8/19 2:14 PM, Emil Velikov wrote:
> On Fri, 1 Nov 2019 at 13:05, Emil Velikov  wrote:
>> From: Emil Velikov 
>>
>> Move the render_client hunk for require_exist alongside the rest.
>> Keeping all the reasons why an existing object is needed, in a single
>> place makes it easier to follow.
>>
>> Cc: VMware Graphics 
>> Cc: Thomas Hellstrom 
>> Signed-off-by: Emil Velikov 
>> ---
>>  drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 9 +
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c 
>> b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
>> index 29d8794f0421..1f989f3605c8 100644
>> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
>> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
>> @@ -909,16 +909,12 @@ vmw_surface_handle_reference(struct vmw_private 
>> *dev_priv,
>> uint32_t handle;
>> struct ttm_base_object *base;
>> int ret;
>> -   bool require_exist = false;
>>
>> if (handle_type == DRM_VMW_HANDLE_PRIME) {
>> ret = ttm_prime_fd_to_handle(tfile, u_handle, &handle);
>> if (unlikely(ret != 0))
>> return ret;
>> } else {
>> -   if (unlikely(drm_is_render_client(file_priv)))
>> -   require_exist = true;
>> -
>> handle = u_handle;
>> }
>>
>> @@ -935,6 +931,8 @@ vmw_surface_handle_reference(struct vmw_private 
>> *dev_priv,
>> }
>>
>> if (handle_type != DRM_VMW_HANDLE_PRIME) {
>> +   bool require_exist = false;
>> +
>> user_srf = container_of(base, struct vmw_user_surface,
>> prime.base);
>>
>> @@ -946,6 +944,9 @@ vmw_surface_handle_reference(struct vmw_private 
>> *dev_priv,
>> user_srf->master != file_priv->master)
>> require_exist = true;
>>
>> +   if (unlikely(drm_is_render_client(file_priv)))
>> +   require_exist = true;
>> +
>> ret = ttm_ref_object_add(tfile, base, TTM_REF_USAGE, NULL,
>>  require_exist);
>> if (unlikely(ret != 0)) {
>> --
> Thomas, VMware devs, humble poke?
> Any comments and review would be appreciated.
>
> Thanks
> Emil
>
Sorry, I'll look at this early Monday.

Thanks,

Thomas



Re: [PATCH 1/3] dma_resv: prime lockdep annotations

2019-10-21 Thread Thomas Hellstrom
On 10/21/19 4:50 PM, Daniel Vetter wrote:
> Full audit of everyone:
>
> - i915, radeon, amdgpu should be clean per their maintainers.
>
> - vram helpers should be fine, they don't do command submission, so
>   really no business holding struct_mutex while doing copy_*_user. But
>   I haven't checked them all.
>
> - panfrost seems to dma_resv_lock only in panfrost_job_push, which
>   looks clean.
>
> - v3d holds dma_resv locks in the tail of its v3d_submit_cl_ioctl(),
>   copying from/to userspace happens all in v3d_lookup_bos which is
>   outside of the critical section.
>
> - vmwgfx has a bunch of ioctls that do their own copy_*_user:
>   - vmw_execbuf_process: First this does some copies in
> vmw_execbuf_cmdbuf() and also in the vmw_execbuf_process() itself.
> Then comes the usual ttm reserve/validate sequence, then actual
> submission/fencing, then unreserving, and finally some more
> copy_to_user in vmw_execbuf_copy_fence_user. Glossing over tons of
> details, but looks all safe.
>   - vmw_fence_event_ioctl: No ttm_reserve/dma_resv_lock anywhere to be
> seen, seems to only create a fence and copy it out.
>   - a pile of smaller ioctl in vmwgfx_ioctl.c, no reservations to be
> found there.
>   Summary: vmwgfx seems to be fine too.
>
> - virtio: There's virtio_gpu_execbuffer_ioctl, which does all the
>   copying from userspace before even looking up objects through their
>   handles, so safe. Plus the getparam/getcaps ioctl, also both safe.
>
> - qxl only has qxl_execbuffer_ioctl, which calls into
>   qxl_process_single_command. There's a lovely comment before the
>   __copy_from_user_inatomic that the slowpath should be copied from
>   i915, but I guess that never happened. Try not to be unlucky and get
>   your CS data evicted between when it's written and the kernel tries
>   to read it. The only other copy_from_user is for relocs, but those
>   are done before qxl_release_reserve_list(), which seems to be the
>   only thing reserving buffers (in the ttm/dma_resv sense) in that
>   code. So looks safe.
>
> - A debugfs file in nouveau_debugfs_pstate_set() and the usif ioctl in
>   usif_ioctl() look safe. nouveau_gem_ioctl_pushbuf() otoh breaks this
>   everywhere and needs to be fixed up.
>
> v2: Thomas pointed at that vmwgfx calls dma_resv_init while it holds a
> dma_resv lock of a different object already. Christian mentioned that
> ttm core does this too for ghost objects. intel-gfx-ci highlighted
> that i915 has similar issues.
>
> Unfortunately we can't do this in the usual module init functions,
> because kernel threads don't have an ->mm - we have to wait around for
> some user thread to do this.
>
> Solution is to spawn a worker (but only once). It's horrible, but it
> works.
>
> v3: We can allocate mm! (Chris). Horrible worker hack out, clean
> initcall solution in.
>
> v4: Annotate with __init (Rob Herring)
>
> Cc: Rob Herring 
> Cc: Alex Deucher 
> Cc: Christian König 
> Cc: Chris Wilson 
> Cc: Thomas Zimmermann 
> Cc: Rob Herring 
> Cc: Tomeu Vizoso 
> Cc: Eric Anholt 
> Cc: Dave Airlie 
> Cc: Gerd Hoffmann 
> Cc: Ben Skeggs 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Reviewed-by: Christian König 
> Reviewed-by: Chris Wilson 
> Tested-by: Chris Wilson 
> Signed-off-by: Daniel Vetter 

Including the vmwgfx audit,

Reviewed-by: Thomas Hellstrom 

Thanks,

Thomas




Re: [PATCH] drm/ttm: move cpu_writers handling into vmwgfx v2

2019-10-01 Thread Thomas Hellstrom
Hi, Christian,

On 9/30/19 6:34 PM, Christian König wrote:
> From: Christian König 
>
> This feature is only used by vmwgfx and superflous for everybody else.
>
> v2: use vmw_buffer_object instead of vmw_user_bo.
>
> Signed-off-by: Christian König 
> ---

I just sent out a patch based on this that is slightly reworked on the
vmwgfx side and that fixes a couple of checkpatch warnings.  TTM changes
should be the same. Added myself as Co-developed-by:

Hope this is ok. If you want to merge it through your tree I'm fine with
that.

/Thomas



Re: Re-review? WAS [PATCH 2/7] drm/ttm: Remove explicit typecasts of vm_private_data

2019-09-25 Thread Thomas Hellstrom
On 9/25/19 2:30 PM, Christian König wrote:
> Hi Thomas,
>
> this one and patch #3 are still Reviewed-by: Christian König 
> 
>
> Any objections that I cherry pick them over into our branch? Upstreaming 
> that stuff got delayed quite a bit and I want to base a cleanup on this.
>
> Thanks,
> Christian.

Sure, please do. We can sort out any merge problems later. Let me
quickly fix the commit message of patch #3 since it doesn't reflect that
we export more helpers, and I'll send out these two as a separate patchset.

Thanks,
Thomas




>
> Am 18.09.19 um 15:20 schrieb Thomas Hellstrom:
>> Hi, Christian!
>>
>> Since I introduced this patch and changed the TTM VM helper patch
>> enough to motivate removing your R-B, I wonder whether you could do a
>> quick review on these two and if OK also ack merging through vmwgfx?
>>
>> Thanks,
>> Thomas
>>
>>
>> On Wed, 2019-09-18 at 14:59 +0200, Thomas Hellström (VMware) wrote:
>>> From: Thomas Hellstrom 
>>>
>>> The explicit typecasts are meaningless, so remove them.
>>>
>>> Suggested-by: Matthew Wilcox 
>>> Signed-off-by: Thomas Hellstrom 
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 8 +++-
>>>   1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> index 76eedb963693..8963546bf245 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>> @@ -109,8 +109,7 @@ static unsigned long ttm_bo_io_mem_pfn(struct
>>> ttm_buffer_object *bo,
>>>   static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>>>   {
>>> struct vm_area_struct *vma = vmf->vma;
>>> -   struct ttm_buffer_object *bo = (struct ttm_buffer_object *)
>>> -   vma->vm_private_data;
>>> +   struct ttm_buffer_object *bo = vma->vm_private_data;
>>> struct ttm_bo_device *bdev = bo->bdev;
>>> unsigned long page_offset;
>>> unsigned long page_last;
>>> @@ -302,8 +301,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault
>>> *vmf)
>>>   
>>>   static void ttm_bo_vm_open(struct vm_area_struct *vma)
>>>   {
>>> -   struct ttm_buffer_object *bo =
>>> -   (struct ttm_buffer_object *)vma->vm_private_data;
>>> +   struct ttm_buffer_object *bo = vma->vm_private_data;
>>>   
>>> WARN_ON(bo->bdev->dev_mapping != vma->vm_file->f_mapping);
>>>   
>>> @@ -312,7 +310,7 @@ static void ttm_bo_vm_open(struct vm_area_struct
>>> *vma)
>>>   
>>>   static void ttm_bo_vm_close(struct vm_area_struct *vma)
>>>   {
>>> -   struct ttm_buffer_object *bo = (struct ttm_buffer_object *)vma->vm_private_data;
>>> +   struct ttm_buffer_object *bo = vma->vm_private_data;
>>>   
>>> ttm_bo_put(bo);
>>> vma->vm_private_data = NULL;
>


Re-review? WAS [PATCH 2/7] drm/ttm: Remove explicit typecasts of vm_private_data

2019-09-18 Thread Thomas Hellstrom
Hi, Christian!

Since I introduced this patch and changed the TTM VM helper patch
enough to motivate removing your R-B, I wonder whether you could do a
quick review on these two and if OK also ack merging through vmwgfx?

Thanks,
Thomas


On Wed, 2019-09-18 at 14:59 +0200, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom 
> 
> The explicit typecasts are meaningless, so remove them.
> 
> Suggested-by: Matthew Wilcox 
> Signed-off-by: Thomas Hellstrom 
> ---
>  drivers/gpu/drm/ttm/ttm_bo_vm.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 76eedb963693..8963546bf245 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -109,8 +109,7 @@ static unsigned long ttm_bo_io_mem_pfn(struct
> ttm_buffer_object *bo,
>  static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
>  {
>   struct vm_area_struct *vma = vmf->vma;
> - struct ttm_buffer_object *bo = (struct ttm_buffer_object *)
> - vma->vm_private_data;
> + struct ttm_buffer_object *bo = vma->vm_private_data;
>   struct ttm_bo_device *bdev = bo->bdev;
>   unsigned long page_offset;
>   unsigned long page_last;
> @@ -302,8 +301,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault
> *vmf)
>  
>  static void ttm_bo_vm_open(struct vm_area_struct *vma)
>  {
> - struct ttm_buffer_object *bo =
> - (struct ttm_buffer_object *)vma->vm_private_data;
> + struct ttm_buffer_object *bo = vma->vm_private_data;
>  
>   WARN_ON(bo->bdev->dev_mapping != vma->vm_file->f_mapping);
>  
> @@ -312,7 +310,7 @@ static void ttm_bo_vm_open(struct vm_area_struct
> *vma)
>  
>  static void ttm_bo_vm_close(struct vm_area_struct *vma)
>  {
> - struct ttm_buffer_object *bo = (struct ttm_buffer_object *)vma->vm_private_data;
> + struct ttm_buffer_object *bo = vma->vm_private_data;
>  
>   ttm_bo_put(bo);
>   vma->vm_private_data = NULL;

Re: [PATCH] drm/vmwgfx: Fix double free in vmw_recv_msg()

2019-09-05 Thread Thomas Hellstrom
On Thu, 2019-08-15 at 09:38 +0100, Colin Ian King wrote:
> On 15/08/2019 09:30, Dan Carpenter wrote:
> > We recently added a kfree() after the end of the loop:
> > 
> > if (retries == RETRIES) {
> > kfree(reply);
> > return -EINVAL;
> > }
> > 
> > There are two problems.  First the test is wrong and because
> > retries
> > equals RETRIES if we succeed on the last iteration through the
> > loop.
> > Second if we fail on the last iteration through the loop then the
> > kfree
> > is a double free.
> > 
> > When you're reading this code, please note the break statement at
> > the
> > end of the while loop.  This patch changes the loop so that if it's
> > not
> > successful then "reply" is NULL and we can test for that afterward.
> > 
> > Fixes: 6b7c3b86f0b6 ("drm/vmwgfx: fix memory leak when too many
> > retries have occurred")
> > Signed-off-by: Dan Carpenter 
> > ---
> >  drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 8 +++-
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
> > index 59e9d05ab928..0af048d1a815 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
> > @@ -353,7 +353,7 @@ static int vmw_recv_msg(struct rpc_channel
> > *channel, void **msg,
> >  !!(HIGH_WORD(ecx) &
> > MESSAGE_STATUS_HB));
> > if ((HIGH_WORD(ebx) & MESSAGE_STATUS_SUCCESS) == 0) {
> > kfree(reply);
> > -
> > +   reply = NULL;
> > if ((HIGH_WORD(ebx) & MESSAGE_STATUS_CPT) != 0)
> > {
> > /* A checkpoint occurred. Retry. */
> > continue;
> > @@ -377,7 +377,7 @@ static int vmw_recv_msg(struct rpc_channel
> > *channel, void **msg,
> >  
> > if ((HIGH_WORD(ecx) & MESSAGE_STATUS_SUCCESS) == 0) {
> > kfree(reply);
> > -
> > +   reply = NULL;
> > if ((HIGH_WORD(ecx) & MESSAGE_STATUS_CPT) != 0)
> > {
> > /* A checkpoint occurred. Retry. */
> > continue;
> > @@ -389,10 +389,8 @@ static int vmw_recv_msg(struct rpc_channel
> > *channel, void **msg,
> > break;
> > }
> >  
> > -   if (retries == RETRIES) {
> > -   kfree(reply);
> > +   if (!reply)
> > return -EINVAL;
> > -   }
> >  
> > *msg_len = reply_len;
> > *msg = reply;
> > 
> 
> Dan, Thanks for fixing up my mistake.

Thanks, Dan. Sorry for the late reply. 
Reviewed-by: Thomas Hellström 
Will push this to fixes.

/Thomas
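
For readers skimming the archive, the shape of the fix can be illustrated
with a small stand-alone sketch (user-space C with hypothetical helper
names, not the vmwgfx code): the pointer is cleared right after each free
so nothing stale survives the loop, and success is signalled by a non-NULL
reply rather than by the retry counter.

#include <stdlib.h>
#include <stddef.h>

#define RETRIES 3

/* Hypothetical stand-ins for the hypervisor message exchange. */
static void *alloc_and_recv(size_t *len) { *len = 0; return NULL; }
static int reply_complete(const void *reply) { return reply != NULL; }

static int recv_msg(void **msg, size_t *msg_len)
{
	void *reply = NULL;
	size_t reply_len = 0;
	int retries;

	for (retries = 0; retries < RETRIES; retries++) {
		reply = alloc_and_recv(&reply_len);
		if (reply && !reply_complete(reply)) {
			/* A checkpoint occurred: drop the partial reply and
			 * clear the pointer so it cannot be freed again. */
			free(reply);
			reply = NULL;
			continue;
		}
		break;
	}

	/* Test the pointer, not the counter: a retry counter cannot tell
	 * success on the final attempt apart from failure. */
	if (!reply)
		return -1;

	*msg = reply;
	*msg_len = reply_len;
	return 0;
}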


Re: [PATCH 7/8] drm/vmwgfx: switch to own vma manager

2019-09-05 Thread Thomas Hellstrom
On Thu, 2019-09-05 at 09:05 +0200, Gerd Hoffmann wrote:
> Add struct drm_vma_offset_manager to vma_private, initialize it and
> pass it to ttm_bo_device_init().
> 
> With this in place the last user of ttm's embedded vma offset manager
> is gone and we can remove it (in a separate patch).
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 1 +
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 6 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> 

Reviewed-by: Thomas Hellström 

I assume this will be merged through drm-misc?

/Thomas
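
To make the pattern of this series concrete, a rough sketch of what the
driver side looks like with its own offset manager. Names and the exact
post-series ttm_bo_device_init() signature are assumptions, not copied
from the patch:

/* Sketch only: the driver owns the drm_vma_offset_manager and hands it
 * to TTM instead of using the one embedded in ttm_bo_device. */
struct example_private {
	struct ttm_bo_device bdev;
	struct drm_vma_offset_manager vma_manager;
};

static int example_ttm_init(struct example_private *priv,
			    struct drm_device *dev,
			    struct ttm_bo_driver *driver)
{
	drm_vma_offset_manager_init(&priv->vma_manager,
				    DRM_FILE_PAGE_OFFSET_START,
				    DRM_FILE_PAGE_OFFSET_SIZE);

	/* Assumed signature after this series: TTM takes the external
	 * manager as an extra argument. */
	return ttm_bo_device_init(&priv->bdev, driver,
				  dev->anon_inode->i_mapping,
				  &priv->vma_manager,
				  false /* need_dma32 */);
}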


[tip: x86/vmware] drm/vmwgfx: Update the backdoor call with support for new instructions

2019-08-29 Thread tip-bot2 for Thomas Hellstrom
The following commit has been merged into the x86/vmware branch of tip:

Commit-ID: 6abe3778cf5abd59b23b9037796f3eab8b7f1d98
Gitweb:
https://git.kernel.org/tip/6abe3778cf5abd59b23b9037796f3eab8b7f1d98
Author:Thomas Hellstrom 
AuthorDate:Wed, 28 Aug 2019 10:03:52 +02:00
Committer: Borislav Petkov 
CommitterDate: Wed, 28 Aug 2019 13:36:46 +02:00

drm/vmwgfx: Update the backdoor call with support for new instructions

Use the definition provided by include/asm/vmware.h

Signed-off-by: Thomas Hellstrom 
Signed-off-by: Borislav Petkov 
Reviewed-by: Doug Covelli 
Acked-by: Dave Airlie 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: pv-driv...@vmware.com
Cc: Thomas Gleixner 
Cc: VMware Graphics 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190828080353.12658-4-thomas...@shipmail.org
---
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 21 -
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.h | 35 ++--
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
index 59e9d05..b1df3e3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
@@ -46,8 +46,6 @@
 #define RETRIES 3
 
 #define VMW_HYPERVISOR_MAGIC0x564D5868
-#define VMW_HYPERVISOR_PORT 0x5658
-#define VMW_HYPERVISOR_HB_PORT  0x5659
 
 #define VMW_PORT_CMD_MSG30
 #define VMW_PORT_CMD_HB_MSG 0
@@ -93,7 +91,7 @@ static int vmw_open_channel(struct rpc_channel *channel, 
unsigned int protocol)
 
VMW_PORT(VMW_PORT_CMD_OPEN_CHANNEL,
(protocol | GUESTMSG_FLAG_COOKIE), si, di,
-   VMW_HYPERVISOR_PORT,
+   0,
VMW_HYPERVISOR_MAGIC,
eax, ebx, ecx, edx, si, di);
 
@@ -126,7 +124,7 @@ static int vmw_close_channel(struct rpc_channel *channel)
 
VMW_PORT(VMW_PORT_CMD_CLOSE_CHANNEL,
0, si, di,
-   (VMW_HYPERVISOR_PORT | (channel->channel_id << 16)),
+   channel->channel_id << 16,
VMW_HYPERVISOR_MAGIC,
eax, ebx, ecx, edx, si, di);
 
@@ -160,7 +158,8 @@ static unsigned long vmw_port_hb_out(struct rpc_channel 
*channel,
VMW_PORT_HB_OUT(
(MESSAGE_STATUS_SUCCESS << 16) | VMW_PORT_CMD_HB_MSG,
msg_len, si, di,
-   VMW_HYPERVISOR_HB_PORT | (channel->channel_id << 16),
+   VMWARE_HYPERVISOR_HB | (channel->channel_id << 16) |
+   VMWARE_HYPERVISOR_OUT,
VMW_HYPERVISOR_MAGIC, bp,
eax, ebx, ecx, edx, si, di);
 
@@ -181,7 +180,7 @@ static unsigned long vmw_port_hb_out(struct rpc_channel 
*channel,
 
VMW_PORT(VMW_PORT_CMD_MSG | (MSG_TYPE_SENDPAYLOAD << 16),
 word, si, di,
-VMW_HYPERVISOR_PORT | (channel->channel_id << 16),
+channel->channel_id << 16,
 VMW_HYPERVISOR_MAGIC,
 eax, ebx, ecx, edx, si, di);
}
@@ -213,7 +212,7 @@ static unsigned long vmw_port_hb_in(struct rpc_channel 
*channel, char *reply,
VMW_PORT_HB_IN(
(MESSAGE_STATUS_SUCCESS << 16) | VMW_PORT_CMD_HB_MSG,
reply_len, si, di,
-   VMW_HYPERVISOR_HB_PORT | (channel->channel_id << 16),
+   VMWARE_HYPERVISOR_HB | (channel->channel_id << 16),
VMW_HYPERVISOR_MAGIC, bp,
eax, ebx, ecx, edx, si, di);
 
@@ -230,7 +229,7 @@ static unsigned long vmw_port_hb_in(struct rpc_channel 
*channel, char *reply,
 
VMW_PORT(VMW_PORT_CMD_MSG | (MSG_TYPE_RECVPAYLOAD << 16),
 MESSAGE_STATUS_SUCCESS, si, di,
-VMW_HYPERVISOR_PORT | (channel->channel_id << 16),
+channel->channel_id << 16,
 VMW_HYPERVISOR_MAGIC,
 eax, ebx, ecx, edx, si, di);
 
@@ -269,7 +268,7 @@ static int vmw_send_msg(struct rpc_channel *channel, const 
char *msg)
 
VMW_PORT(VMW_PORT_CMD_SENDSIZE,
msg_len, si, di,
-   VMW_HYPERVISOR_PORT | (channel->channel_id << 16),
+   channel->channel_id << 16,
VMW_HYPERVISOR_MAGIC,
eax, ebx, ecx, edx, si, di);
 
@@ -327,7 +326,7 @@ static int vmw_recv_msg(struct rpc_channel *channel, void 
**msg,
 
VMW_PORT(VMW_PORT_CMD_RECVSIZE,
0, si, di,
-   (VMW_HYPERVISOR_PORT | (channel->channel_id << 16)),
+

Re: [PATCH v4 12/17] drm/vmwgfx: switch driver from bo->resv to bo->base.resv

2019-08-03 Thread Thomas Hellstrom

Still don't like this very much, but anyway

Acked-by: Thomas Hellstrom 

On 8/2/19 7:22 AM, Gerd Hoffmann wrote:

Signed-off-by: Gerd Hoffmann 
---
  drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 4 ++--
  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c   | 8 
  drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c  | 4 ++--
  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 6 +++---
  4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
index fc6673cde289..917eeb793585 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -459,9 +459,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
  
  	/* Buffer objects need to be either pinned or reserved: */

if (!(dst->mem.placement & TTM_PL_FLAG_NO_EVICT))
-   lockdep_assert_held(&dst->resv->lock.base);
+   lockdep_assert_held(&dst->base.resv->lock.base);
if (!(src->mem.placement & TTM_PL_FLAG_NO_EVICT))
-   lockdep_assert_held(&src->resv->lock.base);
+   lockdep_assert_held(&src->base.resv->lock.base);
  
  	if (dst->ttm->state == tt_unpopulated) {

ret = dst->ttm->bdev->driver->ttm_tt_populate(dst->ttm, &ctx);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 0d9478d2e700..4a38ab0733c4 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -342,7 +342,7 @@ void vmw_bo_pin_reserved(struct vmw_buffer_object *vbo, 
bool pin)
uint32_t old_mem_type = bo->mem.mem_type;
int ret;
  
-	lockdep_assert_held(&bo->resv->lock.base);

+   lockdep_assert_held(&bo->base.resv->lock.base);
  
  	if (pin) {

if (vbo->pin_count++ > 0)
@@ -690,7 +690,7 @@ static int vmw_user_bo_synccpu_grab(struct 
vmw_user_buffer_object *user_bo,
long lret;
  
  		lret = reservation_object_wait_timeout_rcu

-   (bo->resv, true, true,
+   (bo->base.resv, true, true,
 nonblock ? 0 : MAX_SCHEDULE_TIMEOUT);
if (!lret)
return -EBUSY;
@@ -1007,10 +1007,10 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
  
  	if (fence == NULL) {

vmw_execbuf_fence_commands(NULL, dev_priv, &fence, NULL);
-   reservation_object_add_excl_fence(bo->resv, &fence->base);
+   reservation_object_add_excl_fence(bo->base.resv, &fence->base);
dma_fence_put(&fence->base);
} else
-   reservation_object_add_excl_fence(bo->resv, &fence->base);
+   reservation_object_add_excl_fence(bo->base.resv, &fence->base);
  }
  
  
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c

index b4f6e1217c9d..e142714f132c 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cotable.c
@@ -169,7 +169,7 @@ static int vmw_cotable_unscrub(struct vmw_resource *res)
} *cmd;
  
  	WARN_ON_ONCE(bo->mem.mem_type != VMW_PL_MOB);

-   lockdep_assert_held(&bo->resv->lock.base);
+   lockdep_assert_held(&bo->base.resv->lock.base);
  
  	cmd = VMW_FIFO_RESERVE(dev_priv, sizeof(*cmd));

if (!cmd)
@@ -311,7 +311,7 @@ static int vmw_cotable_unbind(struct vmw_resource *res,
return 0;
  
  	WARN_ON_ONCE(bo->mem.mem_type != VMW_PL_MOB);

-   lockdep_assert_held(&bo->resv->lock.base);
+   lockdep_assert_held(&bo->base.resv->lock.base);
  
  	mutex_lock(&dev_priv->binding_mutex);

if (!vcotbl->scrubbed)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index 1d38a8b2f2ec..ccd7f758bf8c 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -402,14 +402,14 @@ void vmw_resource_unreserve(struct vmw_resource *res,
  
  	if (switch_backup && new_backup != res->backup) {

if (res->backup) {
-   lockdep_assert_held(&res->backup->base.resv->lock.base);
+   lockdep_assert_held(&res->backup->base.base.resv->lock.base);
list_del_init(&res->mob_head);
vmw_bo_unreference(&res->backup);
}
  
  		if (new_backup) {

res->backup = vmw_bo_reference(new_backup);
-   lockdep_assert_held(&new_backup->base.resv->lock.base);
+   lockdep_assert_held(&new_backup->base.base.resv->lock.base);
list_add_tail(&res->mob_head, &new_backup->res_list);
} else {
res->backup = NULL;
@@ -691,7 +691,7 @@ void vmw_resource_unbind_list(struct vmw_buffer_object *vbo)
.num_shared = 0
};
  
-	lockdep_assert_held(&vbo->base.resv->lock.base);

+   lockdep_assert_held(&vbo->base.base.resv->lock.base);
list_for_each_entry_safe(res, next, &vbo->res_list, mob_head) {
if (!res->func->unbind)
continue;





Re: [PATCH v2 00/18] drm/ttm: make ttm bo a gem bo subclass

2019-06-22 Thread Thomas Hellstrom

Hi, Daniel,

On 6/22/19 11:18 AM, Daniel Vetter wrote:

Hi Thomas,

On Sat, Jun 22, 2019 at 12:52 AM Thomas Hellstrom  wrote:

On 6/21/19 5:57 PM, Daniel Vetter wrote:

On Fri, Jun 21, 2019 at 05:12:19PM +0200, Thomas Hellström (VMware) wrote:

On 6/21/19 1:57 PM, Gerd Hoffmann wrote:

Aargh. Please don't do this. Multiple reasons:

1) I think it's bad to dump all buffer object functionality we can possibly
think of in a single struct and force that on all (well at least most)
users. It's better to isolate functionality in structs, have utility
functions for those and let the drivers derive their buffer objects from
whatever functionality they actually need.
2) vmwgfx is not using gem and we don't want to carry that extra payload in
the buffer object.
3) TTM historically hasn't been using the various drm layers except for
later when common helpers have been used, (like the vma manager and the
cache utilities). It's desirable to keep that layer distinction. (which is
really what I'm saying in 1.)

Now if more and more functionality that originated in TTM is moving into GEM
we need to find a better way to do that without duplicating functionality. I
suggest adding pointers in the TTM structs and defaulting those pointers to
the member in the TTM struct. Optionally to the member in the GEM struct.
If we need to migrate those members out of the TTM struct, vmwgfx would have
to provide them in its own buffer class.

NAK from the vmwgfx side.

It's 59 DRIVER_GEM vs 1 which is not. I think the verdict is clear what
the reasonable thing to do is here, and this will allow us to
substantially improve code and concept sharing across drm drivers.

10 years ago it was indeed not clear whether everyone doing the same is a
bright idea, but that's no more. If you want I guess you can keep a
private copy of ttm in vmwgfx, but not sure that's really worth it
long-term.
-Daniel

It's not a question about whether GEM or TTM, or even the number of
drivers using one or the other. (GEM would actually be a good choice for
the latter vmwgfx device versions). But this is going against all recent
effort to make different parts of drm functionality self-contained.

Just stop and think for a while what would happen if someone would
suggest doing the opposite: making a gem object a derived class of a TTM
object, arguing that a lot of GEM drivers are using TTM as a backend.
There would probably be a lot of people claiming "we don't want to
unnecessarily carry that stuff". That's because that would also be a poor
design.

That case is a bit a different case. We have
- 5 ttm+gem drivers, recently refactored into vram helpers (but still
ttm underneath)
- 5 ttm+gem drivers, using ttm directly
- 1 ttm driver, no gem
- 48 other gem drivers with no vram support
- 1 gem driver which will gain vram support shortly, with or without
ttm still not clear

11:48 is not even close to 59:1 imo. And I think even if Thomas
Zimmermann and others get really busy porting old discrete fbdev
drivers to kms, that ratio won't change much since we're also gaining
new soc drm drivers at a steady rate.


Yeah, my point was not really to suggest that we do this, but rather
that people would rightfully get upset because the struct contains 
unused stuff.


Also a trap we might end up with in the future is that with the design 
suggested in this patch series is that people start assuming that the 
embedded gem object is actually initialized and working, which could 
lead to pretty severe problems for vmwgfx...




Also I wouldn't mind if we e.g. stuff a struct list_head lru; into
drm_gem_buffer_object, that's probably useful for many cases (not the
pure display drivers, but they tend to have so few bo it really wont
matter even if we add a few kb of cruft).


What I'm suggesting is, build that improved code and concept sharing around

struct gem_ttm_object {
 struct gem_object;
 struct ttm_object;
};

I guess technically this would work too. Bit more churn (maybe
substantially more, I haven't looked tbh) to roll this out for all the
ttm drivers using gem.


And let's work together to eliminate what's duplicated.

How would you share the bo.resv pointer with the above design? With
embedding ttm can use the gem one, and we drop a bunch of code (and
for all the ttm+gem drivers, one pointer they don't need twice). With
the side-by-side, which is the design all gem+ttm drivers used the
past few years, we still need that duplication. Same for the vma node
thing, which is also duplicated.


To bemore precise I'd probably define a

struct drm_bo_common {
    struct reservation_object r;
    struct drm_vma_node v;
};

Embed it in a struct drm_gem_object (and in a struct 
vmwgfx_buffer_object) and then have a pointer to a struct drm_bo_common 
in the struct ttm_buffer_object. That's a single pointer overhead for 
everything we want to move.
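
Spelled out a bit more (purely schematic; placeholder types stand in for
the real kernel structs and the example_* names are made up):

struct reservation_object { int placeholder; };
struct drm_vma_node { int placeholder; };

struct drm_bo_common {
	struct reservation_object resv;
	struct drm_vma_node vma_node;
};

/* A GEM object (and a vmwgfx buffer object) would embed the common part
 * directly, so there is no duplication and no extra allocation. */
struct example_gem_object {
	struct drm_bo_common common;
	/* ... other GEM members ... */
};

/* TTM would only carry a pointer, defaulted to the embedder's copy. */
struct example_ttm_buffer_object {
	struct drm_bo_common *common;
	/* ... other TTM members ... */
};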


As TTM-specific code disappears, so will the number of members in struct 
drm_bo_common. Meanwhile, w

Re: [PATCH v2 00/18] drm/ttm: make ttm bo a gem bo subclass

2019-06-21 Thread Thomas Hellstrom

Hi, Daniel,

On 6/21/19 5:57 PM, Daniel Vetter wrote:

On Fri, Jun 21, 2019 at 05:12:19PM +0200, Thomas Hellström (VMware) wrote:


On 6/21/19 1:57 PM, Gerd Hoffmann wrote:

Aargh. Please don't do this. Multiple reasons:

1) I think it's bad to dump all buffer object functionality we can possibly
think of in a single struct and force that on all (well at least most)
users. It's better to isolate functionality in structs, have utility
functions for those and let the drivers derive their buffer objects from
whatever functionality they actually need.
2) vmwgfx is not using gem and we don't want to carry that extra payload in
the buffer object.
3) TTM historically hasn't been using the various drm layers except for
later when common helpers have been used, (like the vma manager and the
cache utilities). It's desirable to keep that layer distinction. (which is
really what I'm saying in 1.)

Now if more and more functionality that originated in TTM is moving into GEM
we need to find a better way to do that without duplicating functionality. I
suggest adding pointers in the TTM structs and defaulting those pointers to
the member in the TTM struct. Optionally to the member in the GEM struct.
If we need to migrate those members out of the TTM struct, vmwgfx would have
to provide them in its own buffer class.

NAK from the vmwgfx side.

It's 59 DRIVER_GEM vs 1 which is not. I think the verdict is clear what
the reasonable thing to do is here, and this will allow us to
substantially improve code and concept sharing across drm drivers.

10 years ago it was indeed not clear whether everyone doing the same is a
bright idea, but that's no more. If you want I guess you can keep a
private copy of ttm in vmwgfx, but not sure that's really worth it
long-term.
-Daniel


It's not a question about whether GEM or TTM, or even the number of 
drivers using one or the other. (GEM would actually be a good choice for 
the latter vmwgfx device versions). But this is going against all recent 
effort to make different parts of drm functionality self-contained.


Just stop and think for a while what would happen if someone would 
suggest doing the opposite: making a gem object a derived class of a TTM 
object, arguing that a lot of GEM drivers are using TTM as a backend. 
There would probably be a lot of people claiming "we don't want to 
unnecessarily carry that stuff". That's because that would also be a poor 
design.


What I'm suggesting is, build that improved code and concept sharing around

struct gem_ttm_object {
   struct gem_object;
   struct ttm_object;
};

And let's work together to eliminate what's duplicated.

The vmwgfx driver is doing what it does mostly because all buffer 
objects do not need to be user-space visible, and do not need to be 
mapped by user-space. And there are other types of objects that DO need 
to be user-space visible, and that do need to be shared by processes. 
Hence user-space visibility is something that should be abstracted and 
made available to those objects. Not lumped together with all other 
potential buffer object functionality.


/Thomas





Re: [PATCH 1/4] drm/vmwgfx: Assign eviction priorities to resources

2019-06-18 Thread Thomas Hellstrom

On 6/18/19 3:27 PM, Daniel Vetter wrote:

On Tue, Jun 18, 2019 at 03:08:01PM +0200, Thomas Hellstrom wrote:

On 6/18/19 2:19 PM, Daniel Vetter wrote:

On Tue, Jun 18, 2019 at 11:54:08AM +0100, Emil Velikov wrote:

Hi Thomas,

On 2019/06/18, Thomas Hellström (VMware) wrote:

From: Thomas Hellstrom 

TTM provides a means to assign eviction priorities to buffer objects. This
means that all buffer objects with a lower priority will be evicted first
on memory pressure.
Use this to make sure surfaces and in particular non-dirty surfaces are
evicted first. Evicting in particular shaders, cotables and contexts implies
a significant performance hit on vmwgfx, so make sure these resources are
evicted last.
Some buffer objects are sub-allocated in user-space which means we can have
many resources attached to a single buffer object or resource. In that case
the buffer object is given the highest priority of the attached resources.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 

Fwiw patches 1-3 are:
Reviewed-by: Emil Velikov 

Patch 4 is:
Acked-by: Emil Velikov 

Huge thanks for sorting this out.

Oh, does this mean we can remove the varios master* callbacks from
drm_driver now? Iirc vmwgfx was the only user, and those callbacks seem
very tempting to various folks for implementing questionable driver hacks
... Happy to type the patches, but maybe simpler if you do that since all
this gets merged through the vmwgfx tree.

Cheers, Daniel

In case someone follows this, I'll paste in the commit message of 4/4 which
is the relevant one here..

8<

At one point, the GPU command verifier and user-space handle manager
couldn't properly protect GPU clients from accessing each other's data.
Instead there was an elaborate mechanism to make sure only the active
master's primary clients could render. The other clients were either
put to sleep or even killed (if the master had exited). VRAM was
evicted on master switch. With the advent of render-node functionality,
we relaxed the VRAM eviction, but the other mechanisms stayed in place.

Now that the GPU  command verifier and ttm object manager properly
isolates primary clients from different master realms we can remove the
master switch related code and drop those legacy features.

8<---

I think we can at least take a look. I'm out on a fairly long vacation soon
so in any case it won't be before August or so.

Ah don't worry, if this all lands in the 5.3 merge window I can take a
look in a few weeks.


One use we still have for master_set() is that if a master is switched away,
and then the mode list changes, and then the master is switched back, it
will typically not remember to act on the sysfs event received while
switched out, and come back in an incorrect mode. Since mode-list changes
happen quite frequently with virtual display adapters that's bad.

But perhaps we can consider moving that to core, if that's what needed to
get rid of the master switch callbacks.

Hm, this sounds a bit like papering over userspace bugs, at least if
you're referring to drm_sysfs_hotplug_event(). Userspace is supposed to
either keep listening or to re-acquire all the kms output state and do the
hotplug processing in one go when becoming active again.

Ofc it exists, so we can't just remove it. I wouldn't want to make this
part of the uapi though, feels like duct-taping around sloppy userspace.
Maybe we could work on a gradual plan to deprecate this, with limiting it
only to older vmwgfx versions as a start?


Sounds ok with me. First I guess I need to figure out what compositors / 
user-space drivers actually suffer from this. If there are many, it 
might be a pain trying to fix them all.


Thanks,

Thomas




These kinds of tiny but important differences in how drivers implement kms
is why I'd much, much prefer it's not even possible to do stuff like this.

Thanks, Daniel




Re: [PATCH 1/4] drm/vmwgfx: Assign eviction priorities to resources

2019-06-18 Thread Thomas Hellstrom

On 6/18/19 12:54 PM, Emil Velikov wrote:

Hi Thomas,

On 2019/06/18, Thomas Hellström (VMware) wrote:

From: Thomas Hellstrom 

TTM provides a means to assign eviction priorities to buffer objects. This
means that all buffer objects with a lower priority will be evicted first
on memory pressure.
Use this to make sure surfaces and in particular non-dirty surfaces are
evicted first. Evicting in particular shaders, cotables and contexts implies
a significant performance hit on vmwgfx, so make sure these resources are
evicted last.
Some buffer objects are sub-allocated in user-space which means we can have
many resources attached to a single buffer object or resource. In that case
the buffer object is given the highest priority of the attached resources.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 

Fwiw patches 1-3 are:
Reviewed-by: Emil Velikov 

Patch 4 is:
Acked-by: Emil Velikov 

Huge thanks for sorting this out.
Emil


Thanks for reviewing, Emil.

/Thomas
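
As background for the commit message quoted above, a schematic (hypothetical
names, not the vmwgfx code) of how per-resource priorities would propagate
to the backing buffer object:

/*
 * Schematic only. TTM keeps one LRU list per priority and walks them from
 * lowest to highest under memory pressure, so a lower value means
 * "evicted earlier".
 */
enum example_prio {
	EXAMPLE_PRIO_SURFACE_CLEAN = 0,	/* cheapest to throw away */
	EXAMPLE_PRIO_SURFACE_DIRTY = 1,
	EXAMPLE_PRIO_SHADER        = 2,
	EXAMPLE_PRIO_CONTEXT       = 3,	/* most expensive to re-create */
};

/* A buffer object backing several sub-allocated resources inherits the
 * highest priority among them, so it is only evicted once nothing more
 * costly shares its backing store. */
static unsigned int example_bo_priority(const unsigned int *res_prio,
					unsigned int num_res)
{
	unsigned int i, prio = 0;

	for (i = 0; i < num_res; i++)
		if (res_prio[i] > prio)
			prio = res_prio[i];
	return prio;
}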



Re: [PATCH 1/4] drm/vmwgfx: Assign eviction priorities to resources

2019-06-18 Thread Thomas Hellstrom

On 6/18/19 2:19 PM, Daniel Vetter wrote:

On Tue, Jun 18, 2019 at 11:54:08AM +0100, Emil Velikov wrote:

Hi Thomas,

On 2019/06/18, Thomas Hellström (VMware) wrote:

From: Thomas Hellstrom 

TTM provides a means to assign eviction priorities to buffer objects. This
means that all buffer objects with a lower priority will be evicted first
on memory pressure.
Use this to make sure surfaces and in particular non-dirty surfaces are
evicted first. Evicting in particular shaders, cotables and contexts implies
a significant performance hit on vmwgfx, so make sure these resources are
evicted last.
Some buffer objects are sub-allocated in user-space which means we can have
many resources attached to a single buffer object or resource. In that case
the buffer object is given the highest priority of the attached resources.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 

Fwiw patches 1-3 are:
Reviewed-by: Emil Velikov 

Patch 4 is:
Acked-by: Emil Velikov 

Huge thanks for sorting this out.

Oh, does this mean we can remove the varios master* callbacks from
drm_driver now? Iirc vmwgfx was the only user, and those callbacks seem
very tempting to various folks for implementing questionable driver hacks
... Happy to type the patches, but maybe simpler if you do that since all
this gets merged through the vmwgfx tree.

Cheers, Daniel


In case someone follows this, I'll paste in the commit message of 4/4 
which is the relevant one here..


8<

At one point, the GPU command verifier and user-space handle manager
couldn't properly protect GPU clients from accessing each other's data.
Instead there was an elaborate mechanism to make sure only the active
master's primary clients could render. The other clients were either
put to sleep or even killed (if the master had exited). VRAM was
evicted on master switch. With the advent of render-node functionality,
we relaxed the VRAM eviction, but the other mechanisms stayed in place.

Now that the GPU  command verifier and ttm object manager properly
isolates primary clients from different master realms we can remove the
master switch related code and drop those legacy features.

8<---

I think we can at least take a look. I'm out on a fairly long vacation 
soon so in any case it won't be before August or so.


One use we still have for master_set() is that if a master is switched 
away, and then the mode list changes, and then the master is switched 
back, it will typically not remember to act on the sysfs event received 
while switched out, and come back in an incorrect mode. Since mode-list 
changes happen quite frequently with virtual display adapters that's bad.


But perhaps we can consider moving that to core, if that's what needed 
to get rid of the master switch callbacks.


/Thomas




Re: [PATCH] drm/ttm: move cpu_writers handling into vmwgfx

2019-06-14 Thread Thomas Hellstrom
Hi, Christian,

On Fri, 2019-06-14 at 14:58 +0200, Christian König wrote:
> Thomas just a gentle ping on this.
> 
> It's not that my live depends on this, but it would still be a nice
> to 
> have cleanup.
> 
> Thanks,
> Christian.
> 

I thought I had answered this, but I can't find it in my outgoing
folder. Sorry about that.

In principle I'm fine with it, but the vmwgfx part needs some changes:
1) We need to operate on struct vmwgfx_buffer_object rather than struct
vmwgfx_user_buffer_object. Not all buffer objects are user buffer
objects...

2) Need to look at moving the list verification, or at least its calls,
into the vmwgfx_validate.c code.

I hopefully can have a quick look at this next week.

/Thomas
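
For point 1) above, the rough direction would be something like the
following sketch (field and function names are assumptions, not the final
patch): keep the cpu_writers count on the generic buffer object and check
it at validation time instead of inside TTM.

#include <linux/atomic.h>
#include <linux/errno.h>

/* Sketch only; the vmw_* names here are hypothetical. */
struct vmw_buffer_object_sketch {
	/* ... other members ... */
	atomic_t cpu_writers;	/* held while user-space has a synccpu write grab */
};

static int vmw_validation_check_cpu_writers(struct vmw_buffer_object_sketch *vbo)
{
	/* Refuse GPU validation while a CPU writer holds the buffer. */
	if (atomic_read(&vbo->cpu_writers) > 0)
		return -EBUSY;
	return 0;
}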




> Am 07.06.19 um 16:47 schrieb Christian König:
> > This feature is only used by vmwgfx and superfluous for everybody
> > else.
> > 
> > Signed-off-by: Christian König 
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c | 27 --
> >   drivers/gpu/drm/ttm/ttm_bo_util.c|  1 -
> >   drivers/gpu/drm/ttm/ttm_execbuf_util.c   |  7 +
> >   drivers/gpu/drm/vmwgfx/vmwgfx_bo.c   | 35
> > 
> >   drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  |  2 ++
> >   drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c  |  8 ++
> >   drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |  4 +++
> >   include/drm/ttm/ttm_bo_api.h | 31 -
> > 
> >   8 files changed, 45 insertions(+), 70 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > b/drivers/gpu/drm/ttm/ttm_bo.c
> > index c7de667d482a..4ec055ffd6a7 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -153,7 +153,6 @@ static void ttm_bo_release_list(struct kref
> > *list_kref)
> >   
> > BUG_ON(kref_read(&bo->list_kref));
> > BUG_ON(kref_read(&bo->kref));
> > -   BUG_ON(atomic_read(&bo->cpu_writers));
> > BUG_ON(bo->mem.mm_node != NULL);
> > BUG_ON(!list_empty(&bo->lru));
> > BUG_ON(!list_empty(&bo->ddestroy));
> > @@ -1308,7 +1307,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device
> > *bdev,
> >   
> > kref_init(&bo->kref);
> > kref_init(&bo->list_kref);
> > -   atomic_set(&bo->cpu_writers, 0);
> > INIT_LIST_HEAD(&bo->lru);
> > INIT_LIST_HEAD(&bo->ddestroy);
> > INIT_LIST_HEAD(&bo->swap);
> > @@ -1814,31 +1812,6 @@ int ttm_bo_wait(struct ttm_buffer_object
> > *bo,
> >   }
> >   EXPORT_SYMBOL(ttm_bo_wait);
> >   
> > -int ttm_bo_synccpu_write_grab(struct ttm_buffer_object *bo, bool
> > no_wait)
> > -{
> > -   int ret = 0;
> > -
> > -   /*
> > -* Using ttm_bo_reserve makes sure the lru lists are updated.
> > -*/
> > -
> > -   ret = ttm_bo_reserve(bo, true, no_wait, NULL);
> > -   if (unlikely(ret != 0))
> > -   return ret;
> > -   ret = ttm_bo_wait(bo, true, no_wait);
> > -   if (likely(ret == 0))
> > -   atomic_inc(&bo->cpu_writers);
> > -   ttm_bo_unreserve(bo);
> > -   return ret;
> > -}
> > -EXPORT_SYMBOL(ttm_bo_synccpu_write_grab);
> > -
> > -void ttm_bo_synccpu_write_release(struct ttm_buffer_object *bo)
> > -{
> > -   atomic_dec(&bo->cpu_writers);
> > -}
> > -EXPORT_SYMBOL(ttm_bo_synccpu_write_release);
> > -
> >   /**
> >* A buffer object shrink method that tries to swap out the first
> >* buffer object on the bo_global::swap_lru list.
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > index 895d77d799e4..6f43f1f0de7c 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > @@ -511,7 +511,6 @@ static int ttm_buffer_object_transfer(struct
> > ttm_buffer_object *bo,
> > mutex_init(&fbo->base.wu_mutex);
> > fbo->base.moving = NULL;
> > drm_vma_node_reset(&fbo->base.vma_node);
> > -   atomic_set(&fbo->base.cpu_writers, 0);
> >   
> > kref_init(&fbo->base.list_kref);
> > kref_init(&fbo->base.kref);
> > diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > index 957ec375a4ba..80fa52b36d5c 100644
> > --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > @@ -113,12 +113,7 @@ int ttm_eu_reserve_buffers(struct
> > ww_acquire_ctx *ticket,
> > struct ttm_buffer_object *bo = entry->bo;
> >   
> > ret = __ttm_bo_reserve(bo, intr, (ticket == NULL),
> > ticket);
> > -   if (!ret && unlikely(atomic_read(&bo->cpu_writers) >
> > 0)) {
> > -   reservation_object_unlock(bo->resv);
> > -
> > -   ret = -EBUSY;
> > -
> > -   } else if (ret == -EALREADY && dups) {
> > +   if (ret == -EALREADY && dups) {
> > struct ttm_validate_buffer *safe = entry;
> > entry = list_prev_entry(entry, head);
> > list_del(&safe->head);
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > index 5d5c2bce01f3..457861c5047f 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > @@ -565,7 

Re: [PATCH v5 5/9] drm/ttm: TTM fault handler helpers

2019-06-13 Thread Thomas Hellstrom
Hi!

On Thu, 2019-06-13 at 12:25 +0800, Hillf Danton wrote:
> Hello Thomas
> 
> On Wed, 12 Jun 2019 08:42:39 +0200 Thomas Hellstrom wrote:
> > From: Thomas Hellstrom 
> > 
> > With the vmwgfx dirty tracking, the default TTM fault handler is
> > not
> > completely sufficient (vmwgfx need to modify the vma->vm_flags
> > member,
> > and also needs to restrict the number of prefaults).
> > 
> > We also want to replicate the new ttm_bo_vm_reserve() functionality
> > 
> > So start turning the TTM vm code into helpers:
> > ttm_bo_vm_fault_reserved()
> > and ttm_bo_vm_reserve(), and provide a default TTM fault handler
> > for other
> > drivers to use.
> > 
> > Cc: "Christian König" 
> > 
> > Signed-off-by: Thomas Hellstrom 
> > Reviewed-by: "Christian König"  #v1
> > ---
> >  drivers/gpu/drm/ttm/ttm_bo_vm.c | 175 +++-
> > 
> >  include/drm/ttm/ttm_bo_api.h|  10 ++
> >  2 files changed, 113 insertions(+), 72 deletions(-)
> > 
> > 

...


> > -   /*
> > -* Work around locking order reversal in fault / nopfn
> > -* between mmap_sem and bo_reserve: Perform a trylock operation
> > -* for reserve, and if it fails, retry the fault after waiting
> > -* for the buffer to become unreserved.
> > -*/
> Should this comment be kept, since the trylock is still there?

Yes, I'll re-add that. It was removed in an early version of the patch
when I actually removed the trylock as well, but I changed my mind on
that.

> 
> > if (unlikely(!reservation_object_trylock(bo->resv))) {
> > if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> > if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
> > @@ -151,14 +148,55 @@ static vm_fault_t ttm_bo_vm_fault(struct
> > vm_fault *vmf)
> > return VM_FAULT_NOPAGE;
> > }
> >  
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(ttm_bo_vm_reserve);

...


> > 
> > -   if (unlikely(err != 0)) {
> > -   ret = VM_FAULT_SIGBUS;
> > -   goto out_io_unlock;
> > -   }
> > +   if (unlikely(err != 0))
> > +   return VM_FAULT_SIGBUS;
> >  
> Is it likely a typo to skip the io_unlock?
> 
> --
> Hillf


Yes. Good catch. That io_unlock should definitely remain.

I'll respin and resend to dri-devel and lkml only.

Thanks,
Thomas


Re: [PATCH] drm/ttm: move cpu_writers handling into vmwgfx

2019-06-12 Thread Thomas Hellstrom
Hi, Christian,

This looks OK, although there are a couple of minor alterations needed
in the vmwgfx driver:

- We should operate on vmw_buffer_objects rather than on
user_buffer_objects.
- vmw_user_bo_verify_synccpu should move to the validate code.

I can take care of that if it's ok with you.

Thanks,
Thomas


On Fri, 2019-06-07 at 16:47 +0200, Christian König wrote:
> This feature is only used by vmwgfx and superfluous for everybody
> else.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 27 --
>  drivers/gpu/drm/ttm/ttm_bo_util.c|  1 -
>  drivers/gpu/drm/ttm/ttm_execbuf_util.c   |  7 +
>  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c   | 35 
> 
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  |  2 ++
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c  |  8 ++
>  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |  4 +++
>  include/drm/ttm/ttm_bo_api.h | 31 -
>  8 files changed, 45 insertions(+), 70 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> b/drivers/gpu/drm/ttm/ttm_bo.c
> index c7de667d482a..4ec055ffd6a7 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -153,7 +153,6 @@ static void ttm_bo_release_list(struct kref
> *list_kref)
>  
>   BUG_ON(kref_read(&bo->list_kref));
>   BUG_ON(kref_read(&bo->kref));
> - BUG_ON(atomic_read(&bo->cpu_writers));
>   BUG_ON(bo->mem.mm_node != NULL);
>   BUG_ON(!list_empty(&bo->lru));
>   BUG_ON(!list_empty(&bo->ddestroy));
> @@ -1308,7 +1307,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device
> *bdev,
>  
>   kref_init(&bo->kref);
>   kref_init(&bo->list_kref);
> - atomic_set(&bo->cpu_writers, 0);
>   INIT_LIST_HEAD(&bo->lru);
>   INIT_LIST_HEAD(&bo->ddestroy);
>   INIT_LIST_HEAD(&bo->swap);
> @@ -1814,31 +1812,6 @@ int ttm_bo_wait(struct ttm_buffer_object *bo,
>  }
>  EXPORT_SYMBOL(ttm_bo_wait);
>  
> -int ttm_bo_synccpu_write_grab(struct ttm_buffer_object *bo, bool
> no_wait)
> -{
> - int ret = 0;
> -
> - /*
> -  * Using ttm_bo_reserve makes sure the lru lists are updated.
> -  */
> -
> - ret = ttm_bo_reserve(bo, true, no_wait, NULL);
> - if (unlikely(ret != 0))
> - return ret;
> - ret = ttm_bo_wait(bo, true, no_wait);
> - if (likely(ret == 0))
> - atomic_inc(&bo->cpu_writers);
> - ttm_bo_unreserve(bo);
> - return ret;
> -}
> -EXPORT_SYMBOL(ttm_bo_synccpu_write_grab);
> -
> -void ttm_bo_synccpu_write_release(struct ttm_buffer_object *bo)
> -{
> - atomic_dec(&bo->cpu_writers);
> -}
> -EXPORT_SYMBOL(ttm_bo_synccpu_write_release);
> -
>  /**
>   * A buffer object shrink method that tries to swap out the first
>   * buffer object on the bo_global::swap_lru list.
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
> b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 895d77d799e4..6f43f1f0de7c 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -511,7 +511,6 @@ static int ttm_buffer_object_transfer(struct
> ttm_buffer_object *bo,
>   mutex_init(&fbo->base.wu_mutex);
>   fbo->base.moving = NULL;
>   drm_vma_node_reset(&fbo->base.vma_node);
> - atomic_set(&fbo->base.cpu_writers, 0);
>  
>   kref_init(&fbo->base.list_kref);
>   kref_init(&fbo->base.kref);
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index 957ec375a4ba..80fa52b36d5c 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -113,12 +113,7 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx
> *ticket,
>   struct ttm_buffer_object *bo = entry->bo;
>  
>   ret = __ttm_bo_reserve(bo, intr, (ticket == NULL),
> ticket);
> - if (!ret && unlikely(atomic_read(&bo->cpu_writers) >
> 0)) {
> - reservation_object_unlock(bo->resv);
> -
> - ret = -EBUSY;
> -
> - } else if (ret == -EALREADY && dups) {
> + if (ret == -EALREADY && dups) {
>   struct ttm_validate_buffer *safe = entry;
>   entry = list_prev_entry(entry, head);
>   list_del(&safe->head);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> index 5d5c2bce01f3..457861c5047f 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> @@ -565,7 +565,7 @@ static void vmw_user_bo_ref_obj_release(struct
> ttm_base_object *base,
>  
>   switch (ref_type) {
>   case TTM_REF_SYNCCPU_WRITE:
> - ttm_bo_synccpu_write_release(&user_bo->vbo.base);
> + atomic_dec(&user_bo->vbo.cpu_writers);
>   break;
>   default:
>   WARN_ONCE(true, "Undefined buffer object reference
> release.\n");
> @@ -681,12 +681,12 @@ static int vmw_user_bo_synccpu_grab(struct
> vmw_user_buffer_object *user_bo,
>   struct ttm_object_file *tfile,
>  

[PATCH v3] drm/vmwgfx: fix a warning due to missing dma_parms

2019-06-10 Thread Thomas Hellstrom
From: Qian Cai 

Booting up with DMA_API_DEBUG_SG=y generates a warning because the driver
forgot to set dma_parms appropriately. Set it just after vmw_dma_masks()
in vmw_driver_load().

DMA-API: vmwgfx :00:0f.0: mapping sg segment longer than device
claims to support [len=2097152] [max=65536]
WARNING: CPU: 2 PID: 261 at kernel/dma/debug.c:1232
debug_dma_map_sg+0x360/0x480
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 04/13/2018
RIP: 0010:debug_dma_map_sg+0x360/0x480
Call Trace:
 vmw_ttm_map_dma+0x3b1/0x5b0 [vmwgfx]
 vmw_bo_map_dma+0x25/0x30 [vmwgfx]
 vmw_otables_setup+0x2a8/0x750 [vmwgfx]
 vmw_request_device_late+0x78/0xc0 [vmwgfx]
 vmw_request_device+0xee/0x4e0 [vmwgfx]
 vmw_driver_load.cold+0x757/0xd84 [vmwgfx]
 drm_dev_register+0x1ff/0x340 [drm]
 drm_get_pci_dev+0x110/0x290 [drm]
 vmw_probe+0x15/0x20 [vmwgfx]
 local_pci_probe+0x7a/0xc0
 pci_device_probe+0x1b9/0x290
 really_probe+0x1b5/0x630
 driver_probe_device+0xa3/0x1a0
 device_driver_attach+0x94/0xa0
 __driver_attach+0xdd/0x1c0
 bus_for_each_dev+0xfe/0x150
 driver_attach+0x2d/0x40
 bus_add_driver+0x290/0x350
 driver_register+0xdc/0x1d0
 __pci_register_driver+0xda/0xf0
 vmwgfx_init+0x34/0x1000 [vmwgfx]
 do_one_initcall+0xe5/0x40a
 do_init_module+0x10f/0x3a0
 load_module+0x16a5/0x1a40
 __se_sys_finit_module+0x183/0x1c0
 __x64_sys_finit_module+0x43/0x50
 do_syscall_64+0xc8/0x606
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: fb1d9738ca05 ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU")
Co-developed-by: Thomas Hellstrom 
Signed-off-by: Qian Cai 
Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 6d417e29bcec..447b49d6ade1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -745,6 +745,9 @@ static int vmw_driver_load(struct drm_device *dev, unsigned 
long chipset)
if (unlikely(ret != 0))
goto out_err0;
 
+   dma_set_max_seg_size(dev->dev, min_t(unsigned int, U32_MAX & PAGE_MASK,
+SCATTERLIST_MAX_SEGMENT));
+
if (dev_priv->capabilities & SVGA_CAP_GMR2) {
DRM_INFO("Max GMR ids is %u\n",
 (unsigned)dev_priv->max_gmr_ids);
-- 
2.21.0


Re: [RESEND PATCH v2] drm/vmwgfx: fix a warning due to missing dma_parms

2019-06-04 Thread Thomas Hellstrom
Reviewed-by: Thomas Hellstrom 

I'll just need to give this some more testing before queueing it on
vmwgfx-fixes.

Thanks,
Thomas


On Mon, 2019-06-03 at 16:44 -0400, Qian Cai wrote:
> Booting up with DMA_API_DEBUG_SG=y generates a warning because the
> driver
> forgot to set dma_parms appropriately. Set it just after
> vmw_dma_masks()
> in vmw_driver_load().
> 
> DMA-API: vmwgfx :00:0f.0: mapping sg segment longer than device
> claims to support [len=2097152] [max=65536]
> WARNING: CPU: 2 PID: 261 at kernel/dma/debug.c:1232
> debug_dma_map_sg+0x360/0x480
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
> Reference Platform, BIOS 6.00 04/13/2018
> RIP: 0010:debug_dma_map_sg+0x360/0x480
> Call Trace:
>  vmw_ttm_map_dma+0x3b1/0x5b0 [vmwgfx]
>  vmw_bo_map_dma+0x25/0x30 [vmwgfx]
>  vmw_otables_setup+0x2a8/0x750 [vmwgfx]
>  vmw_request_device_late+0x78/0xc0 [vmwgfx]
>  vmw_request_device+0xee/0x4e0 [vmwgfx]
>  vmw_driver_load.cold+0x757/0xd84 [vmwgfx]
>  drm_dev_register+0x1ff/0x340 [drm]
>  drm_get_pci_dev+0x110/0x290 [drm]
>  vmw_probe+0x15/0x20 [vmwgfx]
>  local_pci_probe+0x7a/0xc0
>  pci_device_probe+0x1b9/0x290
>  really_probe+0x1b5/0x630
>  driver_probe_device+0xa3/0x1a0
>  device_driver_attach+0x94/0xa0
>  __driver_attach+0xdd/0x1c0
>  bus_for_each_dev+0xfe/0x150
>  driver_attach+0x2d/0x40
>  bus_add_driver+0x290/0x350
>  driver_register+0xdc/0x1d0
>  __pci_register_driver+0xda/0xf0
>  vmwgfx_init+0x34/0x1000 [vmwgfx]
>  do_one_initcall+0xe5/0x40a
>  do_init_module+0x10f/0x3a0
>  load_module+0x16a5/0x1a40
>  __se_sys_finit_module+0x183/0x1c0
>  __x64_sys_finit_module+0x43/0x50
>  do_syscall_64+0xc8/0x606
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Fixes: fb1d9738ca05 ("drm/vmwgfx: Add DRM driver for VMware Virtual
> GPU")
> Suggested-by: Thomas Hellstrom 
> Signed-off-by: Qian Cai 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 4ff11a0077e1..5f690429eb89 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -747,6 +747,8 @@ static int vmw_driver_load(struct drm_device
> *dev, unsigned long chipset)
>   if (unlikely(ret != 0))
>   goto out_err0;
>  
> + dma_set_max_seg_size(dev->dev, U32_MAX);
> +
>   if (dev_priv->capabilities & SVGA_CAP_GMR2) {
>   DRM_INFO("Max GMR ids is %u\n",
>(unsigned)dev_priv->max_gmr_ids);

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-29 Thread Thomas Hellstrom
On Wed, 2019-05-29 at 09:50 +0200, Christian König wrote:
> Am 28.05.19 um 19:23 schrieb Lendacky, Thomas:
> > On 5/28/19 12:05 PM, Thomas Hellstrom wrote:
> > > On 5/28/19 7:00 PM, Lendacky, Thomas wrote:
> > > > On 5/28/19 11:32 AM, Koenig, Christian wrote:
> > > > > Am 28.05.19 um 18:27 schrieb Thomas Hellstrom:
> > > > > > On Tue, 2019-05-28 at 15:50 +, Lendacky, Thomas wrote:
> > > > > > > On 5/28/19 10:17 AM, Koenig, Christian wrote:
> > > > > > > > Hi Thomas,
> > > > > > > > 
> > > > > > > > Am 28.05.19 um 17:11 schrieb Thomas Hellstrom:
> > > > > > > > > Hi, Tom,
> > > > > > > > > 
> > > > > > > > > Thanks for the reply. The question is not graphics
> > > > > > > > > specific, but
> > > > > > > > > lies
> > > > > > > > > in your answer further below:
> > > > > > > > > 
> > > > > > > > > On 5/28/19 4:48 PM, Lendacky, Thomas wrote:
> > > > > > > > > > On 5/28/19 2:31 AM, Thomas Hellstrom wrote:
> > > > > > > > > > [SNIP]
> > > > > > > > > > As for kernel vmaps and user-maps, those pages will
> > > > > > > > > > be marked
> > > > > > > > > > encrypted
> > > > > > > > > > (unless explicitly made un-encrypted by calling
> > > > > > > > > > set_memory_decrypted()).
> > > > > > > > > > But, if you are copying to/from those areas into
> > > > > > > > > > the un-
> > > > > > > > > > encrypted DMA
> > > > > > > > > > area then everything will be ok.
> > > > > > > > > The question is regarding the above paragraph.
> > > > > > > > > 
> > > > > > > > > AFAICT,  set_memory_decrypted() only changes the
> > > > > > > > > fixed kernel map
> > > > > > > > > PTEs.
> > > > > > > > > But when setting up other aliased PTEs to the exact
> > > > > > > > > same
> > > > > > > > > decrypted
> > > > > > > > > pages, for example using dma_mmap_coherent(),
> > > > > > > > > kmap_atomic_prot(),
> > > > > > > > > vmap() etc. What code is responsible for clearing the
> > > > > > > > > encrypted
> > > > > > > > > flag
> > > > > > > > > on those PTEs? Is there something in the x86 platform
> > > > > > > > > code doing
> > > > > > > > > that?
> > > > > > > > Tom actually explained this:
> > > > > > > > > The encryption bit is bit-47 of a physical address.
> > > > > > > > In other words set_memory_decrypted() changes the
> > > > > > > > physical address
> > > > > > > > in
> > > > > > > > the PTEs of the kernel mapping and all other use cases
> > > > > > > > just copy
> > > > > > > > that
> > > > > > > > from there.
> > > > > > > Except I don't think the PTE attributes are copied from
> > > > > > > the kernel
> > > > > > > mapping
> > > > > > +1!
> > > > > > 
> > > > > > > in some cases. For example, dma_mmap_coherent() will
> > > > > > > create the same
> > > > > > > vm_page_prot value regardless of whether or not the
> > > > > > > underlying memory
> > > > > > > is
> > > > > > > encrypted or not. But kmap_atomic_prot() will return the
> > > > > > > kernel
> > > > > > > virtual
> > > > > > > address of the page, so that would be fine.
> > > > > > Yes, on 64-bit systems. On 32-bit systems (do they exist
> > > > > > with SEV?),
> > > > > > they don't.
> > > > > I don't think so, but feel free to prove me wrong Tom.
> > > > SEV is 64-bit only.
> > > And I just noticed that kmap_atomic_prot() indeed returns the
> > > kernel

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-28 Thread Thomas Hellstrom

On 5/28/19 7:00 PM, Lendacky, Thomas wrote:

On 5/28/19 11:32 AM, Koenig, Christian wrote:

Am 28.05.19 um 18:27 schrieb Thomas Hellstrom:

On Tue, 2019-05-28 at 15:50 +, Lendacky, Thomas wrote:

On 5/28/19 10:17 AM, Koenig, Christian wrote:

Hi Thomas,

Am 28.05.19 um 17:11 schrieb Thomas Hellstrom:

Hi, Tom,

Thanks for the reply. The question is not graphics specific, but
lies
in your answer further below:

On 5/28/19 4:48 PM, Lendacky, Thomas wrote:

On 5/28/19 2:31 AM, Thomas Hellstrom wrote:
[SNIP]
As for kernel vmaps and user-maps, those pages will be marked
encrypted
(unless explicitly made un-encrypted by calling
set_memory_decrypted()).
But, if you are copying to/from those areas into the un-
encrypted DMA
area then everything will be ok.

The question is regarding the above paragraph.

AFAICT,  set_memory_decrypted() only changes the fixed kernel map
PTEs.
But when setting up other aliased PTEs to the exact same
decrypted
pages, for example using dma_mmap_coherent(),
kmap_atomic_prot(),
vmap() etc. What code is responsible for clearing the encrypted
flag
on those PTEs? Is there something in the x86 platform code doing
that?

Tom actually explained this:

The encryption bit is bit-47 of a physical address.

In other words set_memory_decrypted() changes the physical address
in
the PTEs of the kernel mapping and all other use cases just copy
that
from there.

Except I don't think the PTE attributes are copied from the kernel
mapping

+1!


in some cases. For example, dma_mmap_coherent() will create the same
vm_page_prot value regardless of whether or not the underlying memory
is
encrypted or not. But kmap_atomic_prot() will return the kernel
virtual
address of the page, so that would be fine.

Yes, on 64-bit systems. On 32-bit systems (do they exist with SEV?),
they don't.

I don't think so, but feel free to prove me wrong Tom.

SEV is 64-bit only.


And I just noticed that kmap_atomic_prot() indeed returns the kernel map 
also for 32-bit lowmem.





And similarly TTM user-space mappings and vmap() don't copy from the
kernel map either, so I think we actually do need to modify the page-
prot like done in the patch.

Well the problem is that this won't have any effect.

As Tom explained encryption is not implemented as a page protection bit,
but rather as part of the physical address of the part.

This is where things get interesting.  Even though the encryption bit is
part of the physical address (e.g. under SME the device could/would use an
address with the encryption bit set), it is implemented as part of the PTE
attributes. So, for example, using _PAGE_ENC when building a pgprot value
would produce an entry with the encryption bit set.

And the thing to watch out for is using two virtual addresses that point
to the same physical address (page) in DRAM but one has the encryption bit
set and one doesn't. The hardware does not enforce coherency between an
encrypted and un-encrypted mapping of the same physical address (page).
See section 7.10.6 of the AMD64 Architecture Programmer's Manual Volume 2.


Indeed. And I'm pretty sure the kernel map PTE and a TTM / vmap PTE 
pointing to the same decrypted page differ in the encryption bit (47) 
setting.


But on the hypervisor that would sort of work, because from what I 
understand with SEV we select between the guest key and the hypervisor 
key with that bit. On the hypervisor both keys are the same? On a guest 
it would probably break.


/Thomas



Thanks,
Tom


I have no idea how that is actually handled thought,
Christian.


/Thomas


This is an area that needs looking into to be sure it is working
properly
with SME and SEV.

Thanks,
Tom


That's rather nifty, because this way everybody will either use or
not
use encryption on the page.

Christian.


Thanks,
Thomas



Things get fuzzy for me when it comes to the GPU access of the
memory
and what and how it is accessed.

Thanks,
Tom




Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-28 Thread Thomas Hellstrom
On Tue, 2019-05-28 at 15:50 +, Lendacky, Thomas wrote:
> On 5/28/19 10:17 AM, Koenig, Christian wrote:
> > Hi Thomas,
> > 
> > Am 28.05.19 um 17:11 schrieb Thomas Hellstrom:
> > > Hi, Tom,
> > > 
> > > Thanks for the reply. The question is not graphics specific, but
> > > lies 
> > > in your answer further below:
> > > 
> > > On 5/28/19 4:48 PM, Lendacky, Thomas wrote:
> > > > On 5/28/19 2:31 AM, Thomas Hellstrom wrote:
> > > > [SNIP]
> > > > As for kernel vmaps and user-maps, those pages will be marked
> > > > encrypted
> > > > (unless explicitly made un-encrypted by calling
> > > > set_memory_decrypted()).
> > > > But, if you are copying to/from those areas into the un-
> > > > encrypted DMA
> > > > area then everything will be ok.
> > > 
> > > The question is regarding the above paragraph.
> > > 
> > > AFAICT,  set_memory_decrypted() only changes the fixed kernel map
> > > PTEs.
> > > But when setting up other aliased PTEs to the exact same
> > > decrypted 
> > > pages, for example using dma_mmap_coherent(),
> > > kmap_atomic_prot(), 
> > > vmap() etc. What code is responsible for clearing the encrypted
> > > flag 
> > > on those PTEs? Is there something in the x86 platform code doing
> > > that?
> > 
> > Tom actually explained this:
> > > The encryption bit is bit-47 of a physical address.
> > 
> > In other words set_memory_decrypted() changes the physical address
> > in 
> > the PTEs of the kernel mapping and all other use cases just copy
> > that 
> > from there.
> 
> Except I don't think the PTE attributes are copied from the kernel
> mapping

+1!

> in some cases. For example, dma_mmap_coherent() will create the same
> vm_page_prot value regardless of whether or not the underlying memory
> is
> encrypted or not. But kmap_atomic_prot() will return the kernel
> virtual
> address of the page, so that would be fine.

Yes, on 64-bit systems. On 32-bit systems (do they exist with SEV?),
they don't. 

And similarly TTM user-space mappings and vmap() doesn't copy from the
kernel map either,  so I think we actually do need to modify the page-
prot like done in the patch.

/Thomas

> 
> This is an area that needs looking into to be sure it is working
> properly
> with SME and SEV.
> 
> Thanks,
> Tom
> 
> > That's rather nifty, because this way everybody will either use or
> > not 
> > use encryption on the page.
> > 
> > Christian.
> > 
> > > Thanks,
> > > Thomas
> > > 
> > > 
> > > > Things get fuzzy for me when it comes to the GPU access of the
> > > > memory
> > > > and what and how it is accessed.
> > > > 
> > > > Thanks,
> > > > Tom

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-28 Thread Thomas Hellstrom

Hi, Tom,

Thanks for the reply. The question is not graphics specific, but lies in 
your answer further below:


On 5/28/19 4:48 PM, Lendacky, Thomas wrote:

On 5/28/19 2:31 AM, Thomas Hellstrom wrote:

Hi, Tom,

Could you shed some light on this?

I don't have a lot of GPU knowledge, so let me start with an overview of
how everything should work and see if that answers the questions being
asked.

First, SME:
The encryption bit is bit-47 of a physical address. So, if a device does
not support at least 48-bit DMA, it will have to use the SWIOTLB and
bounce buffer the data. This is handled automatically if the driver is
using the Linux DMA-api as all of SWIOTLB has been marked un-encrypted.
Data is bounced between the un-encrypted SWIOTLB and the (presumably)
encrypted area of the driver.

For SEV:
The encryption bit position is the same as SME. However, with SEV all
DMA must use an un-encrypted area so all DMA goes through SWIOTLB. Just
like SME, this is handled automatically if the driver is using the Linux
DMA-api as all of SWIOTLB has been marked un-encrypted. And just like SME,
data is bounced between the un-encrypted SWIOTLB and the (presumably)
encrypted area of the driver.

There is an optimization for dma_alloc_coherent() where the pages are
allocated and marked un-encrypted, thus avoiding the bouncing (see file
kernel/dma/direct.c, dma_direct_alloc_pages()).
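
(As a hedged illustration of that optimization, not code from any of the
drivers discussed here: a driver that allocates its DMA buffer with
dma_alloc_coherent() gets memory the DMA layer has already marked
un-encrypted when SEV is active, so no bouncing is needed for that buffer.)

	#include <linux/dma-mapping.h>
	#include <linux/gfp.h>

	/* Sketch: coherent DMA memory, already decrypted under SEV. */
	static void *alloc_shared_dma_buffer(struct device *dev, size_t size,
					     dma_addr_t *dma_handle)
	{
		return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);
	}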

As for kernel vmaps and user-maps, those pages will be marked encrypted
(unless explicitly made un-encrypted by calling set_memory_decrypted()).
But, if you are copying to/from those areas into the un-encrypted DMA
area then everything will be ok.


The question is regarding the above paragraph.

AFAICT, set_memory_decrypted() only changes the fixed kernel map PTEs.
But when setting up other aliased PTEs to the exact same decrypted
pages, for example using dma_mmap_coherent(), kmap_atomic_prot(), vmap()
etc., what code is responsible for clearing the encrypted flag on those
PTEs? Is there something in the x86 platform code doing that?
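
(A hedged sketch of the user-space mapping case asked about above; the
helper name is made up, but the point is that nothing propagates the
cleared encryption bit from the kernel linear map, so a driver's mmap
path would have to adjust vm_page_prot itself:)

	#include <linux/mm.h>
	#include <linux/mem_encrypt.h>

	/* Sketch: user mapping of pages made un-encrypted elsewhere. */
	static int example_mmap_decrypted(struct vm_area_struct *vma)
	{
		vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
		/* ...then insert the PFNs as usual, e.g. with remap_pfn_range(). */
		return 0;
	}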


Thanks,
Thomas




Things get fuzzy for me when it comes to the GPU access of the memory
and what and how it is accessed.

Thanks,
Tom


Thanks,
Thomas


On 5/24/19 5:08 PM, Alex Deucher wrote:

+ Tom

He's been looking into SEV as well.

On Fri, May 24, 2019 at 8:30 AM Thomas Hellstrom 
wrote:

On 5/24/19 2:03 PM, Koenig, Christian wrote:

Am 24.05.19 um 12:37 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 12:18 PM, Koenig, Christian wrote:

Am 24.05.19 um 11:55 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 11:11 AM, Thomas Hellstrom wrote:

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we
might
want to be able to switch normal (encrypted) memory to decrypted in
exactly
the same way as we handle caching states, and that would require
additional
memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled.
Set up
the page protection accordingly. Drivers must detect SEV enabled
and
switch
to the dma page pool.

This patch has not yet been tested. As a follow-up, we might
want to
cache decrypted pages in the dma page pool regardless of their
caching
state.

This patch is unnecessary, SEV support already works fine with at
least
amdgpu and I would expect that it also works with other drivers as
well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

  drm: fallback to dma_alloc_coherent when memory
encryption is
active

  We can't just map any randome page we get when memory
encryption is
  active.

  Signed-off-by: Christian König 
  Acked-by: Alex Deucher 
  Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.

Yes, I noticed that. Although I fail to see where we automagically
clear the PTE encrypted bit when mapping coherent memory? For the
linear kernel map, that's done within dma_alloc_coherent() but for
kernel vmaps and user-space maps? Is that done automatically by
the x86 platform layer?

Yes, I think so. Haven't looked too closely at this either.

This sounds a bit odd. If that were the case, the natural place would be
the PAT tracking code, but it only handles caching flags AFAICT. Not
encryption flags.

But when you tested AMD with SEV, was that running as hypervisor rather
than a guest, or did you run an SEV guest with PCI passthrough to the
AMD device?

Yeah, well the problem is we never tested this ourself :)


/Thomas


And, as a follow up question, why do we need dma_alloc_coherent() when
using SME? I thought the hardware performs the decryption when DMA-ing
to / from an encrypted pag

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-28 Thread Thomas Hellstrom

Hi, Tom,

Could you shed some light on this?

Thanks,
Thomas


On 5/24/19 5:08 PM, Alex Deucher wrote:

+ Tom

He's been looking into SEV as well.

On Fri, May 24, 2019 at 8:30 AM Thomas Hellstrom  wrote:

On 5/24/19 2:03 PM, Koenig, Christian wrote:

Am 24.05.19 um 12:37 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 12:18 PM, Koenig, Christian wrote:

Am 24.05.19 um 11:55 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 11:11 AM, Thomas Hellstrom wrote:

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we
might
want to be able to switch normal (encrypted) memory to decrypted in
exactly
the same way as we handle caching states, and that would require
additional
memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled.
Set up
the page protection accordingly. Drivers must detect SEV enabled and
switch
to the dma page pool.

This patch has not yet been tested. As a follow-up, we might want to
cache decrypted pages in the dma page pool regardless of their
caching
state.

This patch is unnecessary, SEV support already works fine with at
least
amdgpu and I would expect that it also works with other drivers as
well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

 drm: fallback to dma_alloc_coherent when memory encryption is
active

 We can't just map any randome page we get when memory
encryption is
 active.

 Signed-off-by: Christian König 
 Acked-by: Alex Deucher 
 Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.

Yes, I noticed that. Although I fail to see where we automagically
clear the PTE encrypted bit when mapping coherent memory? For the
linear kernel map, that's done within dma_alloc_coherent() but for
kernel vmaps and user-space maps? Is that done automatically by
the x86 platform layer?

Yes, I think so. Haven't looked too closely at this either.

This sounds a bit odd. If that were the case, the natural place would be
the PAT tracking code, but it only handles caching flags AFAICT. Not
encryption flags.

But when you tested AMD with SEV, was that running as hypervisor rather
than a guest, or did you run an SEV guest with PCI passthrough to the
AMD device?

Yeah, well the problem is we never tested this ourself :)


/Thomas


And, as a follow up question, why do we need dma_alloc_coherent() when
using SME? I thought the hardware performs the decryption when DMA-ing
to / from an encrypted page with SME, but not with SEV?

I think the issue was that the DMA API would try to use a bounce buffer
in this case.

SEV forces SWIOTLB bouncing on, but not SME. So it should probably be
possible to avoid dma_alloc_coherent() in the SME case.

In this case I don't have an explanation for this.

For the background, what happened is that we got reports that SEV/SME
doesn't work with amdgpu. So we told the people to try using the
dma_alloc_coherent() path and that worked fine. Because of this we came
up with the patch I noted earlier.

I can confirm that it indeed works now for a couple of users, but we
still don't have a test system for this in our team.

Christian.

OK, understood,

But unless there is some strange magic going on (which there might be,
of course), I do think the patch I sent is correct, and the reason that
SEV works is that the AMD card is used by the hypervisor and not the
guest, and TTM is actually incorrectly creating conflicting maps and
treating the coherent memory as encrypted. But since the memory is only
accessed through encrypted PTEs, the hardware does the right thing,
using the hypervisor key for decryption.

But that's only a guess, and this is not super-urgent. I will be able to
follow up if / when we bring vmwgfx up for SEV.

/Thomas


/Thomas



Christian.


Thanks, Thomas





Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-27 Thread Thomas Hellstrom

On 5/27/19 5:27 PM, Emil Velikov wrote:

On 2019/05/27, Thomas Hellstrom wrote:

On 5/27/19 2:35 PM, Emil Velikov wrote:

Hi Thomas,

On 2019/05/27, Thomas Hellstrom wrote:


I think we might be talking past each other, let's take a step back:

   - as of previous patch, all of vmwgfx ioctls size is consistently
handled by the core

I don't think I follow you here, AFAICT patch 3/5 only affects and
relaxes the execbuf checking (and in fact a little more than I would
like)?


Precisely, it makes execbuf ioctl behave like all other ioctls - both
vmwgfx and rest of drm.

But we're still enforcing a non-relaxed size check for the other vmwgfx
private ioctls, right? Which is relaxed, together with the directions, in
this commit?


Regardless of the patch, all !execbuf vmwgfx ioctls use the related size
checking from core drm.


Well it does, but since we (before this patch) enforce ioctl->cmd == 
cmd, we also enforce
_IOC_SIZE(ioctl->cmd) == _IOC_SIZE(cmd), which makes the core check 
pointless, or am I missing something?
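
(For reference, a hedged sketch of the kind of strict per-ioctl check being
discussed here; this is not the literal vmwgfx code, just the shape of it:)

	#include <linux/types.h>
	#include <drm/drm_ioctl.h>

	/*
	 * Sketch: require that the encoding user space passed matches the
	 * one the driver was built with. Matching ioctl->cmd implies the
	 * direction and _IOC_SIZE() match as well.
	 */
	static bool strict_ioctl_encoding_ok(unsigned int cmd,
					     const struct drm_ioctl_desc *ioctl)
	{
		return ioctl->cmd == cmd;
	}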





(Not that it matters much to the discussion, though).


Agreed.




...

Can you provide a concrete example, please?

OK, let's say you were developing fence wait functionality, like the
vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait
never timed out as it should. The reason turned out to be that signals were
restarting the waits with the original timeout. So you change the ioctl from
W to RW and add a kernel-computed time to the argument. Everything is fine,
except that you forget to change this in a user-space application somewhere.

So now what happens is that the user-space bug can live on undetected as
in 1), and that means you can never go back and implement a stricter check
because that would completely break old user-space.
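
(A hedged, purely hypothetical illustration of the W -> RW change described
above; the struct and ioctl names are made up, not the real vmwgfx uAPI:)

	#include <linux/ioctl.h>
	#include <linux/types.h>

	struct example_fence_wait_arg {
		__u64 cookie;		/* now updated by the kernel on signal restart */
		__u64 timeout_us;
		__u32 handle;
		__u32 pad;
	};

	/* Before: the kernel only reads the argument. */
	#define EXAMPLE_FENCE_WAIT_V1 _IOW('e', 0x01, struct example_fence_wait_arg)
	/* After: the kernel also writes back the cookie, so the direction changes. */
	#define EXAMPLE_FENCE_WAIT_V2 _IOWR('e', 0x01, struct example_fence_wait_arg)

	/*
	 * An application still built against the old header keeps issuing
	 * EXAMPLE_FENCE_WAIT_V1; a strict encoding check catches that
	 * immediately, a relaxed one lets the bug live on.
	 */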


If I understand you correctly, the W -> RW change is unnecessary. Yet
the only negative effect that I can see is the copy_to_user() overhead.

The copy should be negligible, yet it "feels" silly.

Is there anything more serious that I've missed?


Well the point in this case is that the write was necessary, but the
code would work sort of OK anyway. It updated a kernel "cookie" to make
sure the timeout would be correct even with the next call repetition.
Now if an old header was floating around, there might be clients using
it. And with the current core checks that typically wouldn't get
noticed. With the check we'd immediately notice and abort. It feels a
little like moving from ANSI C to K&R :-)





Having a closer look - vmwgfx (et al) seems to stand out, such that it
does not provide a UABI define including the encoding. Hence it sort of
duplicates that in userspace, by using the explicit drmCommand*

Guess I could follow the suggestion in vmwgfx_drv.c move the defines to
UABI, sync header and update mesa/xf86-video-vmwgfx.

What do you think - yes, or please don't?


Please hold on for a while, and I'll discuss it internally.




The current code will trap (and has historically trapped) code like this.
That's mainly why I'm reluctant to give it up, but I guess it can be
conditionally compiled in for debug purposes.


This piece here is the holy grail. I'll go further and suggest:

  - add a strict encoding and size check, behind a config toggle
  - make it a core drm thing and drop the custom vmwgfx one

Will keep it disabled by default - but will clearly document, in Kconfig and
the docs, that devs should toggle it to catch bugs.


Sounds good, but IIRC the reason why I kept it only for driver-private
ioctls is that there were errors with the drm ioctls. But that was a
long time ago, so I might remember incorrectly, or user-space has been fixed.





2) Catch a lot of fuzzer combinations and error out early instead of
forwarding them to the ioctl function where they may cause harm.


Struggling to see why this is a problem? At some point the fuzzer will
get past this first line of defence, so we want to make sure the rest of
the ioctl is robust.



I think the new user-space vs old kernel can be handled nicely in user-
space with feature flags or API versions. That's the way we've handled
them up to now?


How is a feature flag going to help if the encoding changes from _IOW
to _IOWR?

Ah, you're referring to old user-space new kernel? Yes, I was probably
reading a bit too fast. Sorry about that.

So we're basically landing in a tradeoff between trapping problems like
the above, and hassle-free ioctl argument definition changes.

OK, so I'm ok with that as long as there is a way we can compile in strict
checking, which will likely have to be a vmwgfx-specific wrapper.


Ack, I'll proceed with the debug toggle suggestion.


Great.




Thank you for the insightful input.
Emil


Thanks,

Thomas



Re: [PATCH 13/13] drm: allow render capable master with DRM_AUTH ioctls

2019-05-27 Thread Thomas Hellstrom

On 5/27/19 3:16 PM, Daniel Vetter wrote:

On Mon, May 27, 2019 at 02:39:18PM +0200, Thomas Hellstrom wrote:

On 5/27/19 10:17 AM, Emil Velikov wrote:

From: Emil Velikov 

There are cases (in mesa and applications) where one would open the
primary node without properly authenticating the client.

Sometimes we don't check if the authentication succeeds, but there's
also cases we simply forget to do it.

The former was a case for Mesa where it did not check the return
value of drmGetMagic() [1]. That was fixed recently, although there's
the question of older drivers or other apps that exhibit this behaviour.

While omitting the call results in issues as seen in [2] and [3].

In the libva case, libva itself doesn't authenticate the DRM client and
the vaGetDisplayDRM documentation doesn't mention if the app should
either.

As of today, the official vainfo utility doesn't authenticate.

To work around issues like these, some users resort to running their apps
under sudo, which admittedly isn't always a good idea.

Since any DRIVER_RENDER driver has sufficient isolation between clients,
we can use that, for unauthenticated [primary node] ioctls that require
DRM_AUTH. But only if the respective ioctl is tagged as DRM_RENDER_ALLOW.

v2:
- Rework/simplify if check (Daniel V)
- Add examples to commit messages, elaborate. (Daniel V)

v3:
- Use single unlikely (Daniel V)

v4:
- Patch was reverted because it broke AMDGPU, apply again. The AMDGPU
issue is fixed with earlier patch.

[1] 
https://gitlab.freedesktop.org/mesa/mesa/blob/2bc1f5c2e70fe3b4d41f060af9859bc2a94c5b62/src/egl/drivers/dri2/platform_wayland.c#L1136
[2] https://lists.freedesktop.org/archives/libva/2016-July/004185.html
[3] https://gitlab.freedesktop.org/mesa/kmscube/issues/1
Testcase: igt/core_unauth_vs_render
Cc: intel-...@lists.freedesktop.org
Signed-off-by: Emil Velikov 
Reviewed-by: Daniel Vetter 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20190114085408.15933-2-emil.l.veli...@gmail.com
---
   drivers/gpu/drm/drm_ioctl.c | 20 
   1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 9841c0076f02..b64b022a2b29 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -511,6 +511,13 @@ int drm_version(struct drm_device *dev, void *data,
return err;
   }
+static inline bool
+drm_render_driver_and_ioctl(const struct drm_device *dev, u32 flags)
+{
+   return drm_core_check_feature(dev, DRIVER_RENDER) &&
+   (flags & DRM_RENDER_ALLOW);
+}
+
   /**
* drm_ioctl_permit - Check ioctl permissions against caller
*
@@ -525,14 +532,19 @@ int drm_version(struct drm_device *dev, void *data,
*/
   int drm_ioctl_permit(u32 flags, struct drm_file *file_priv)
   {
+   const struct drm_device *dev = file_priv->minor->dev;
+
/* ROOT_ONLY is only for CAP_SYS_ADMIN */
if (unlikely((flags & DRM_ROOT_ONLY) && !capable(CAP_SYS_ADMIN)))
return -EACCES;
-   /* AUTH is only for authenticated or render client */
-   if (unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) &&
-!file_priv->authenticated))
-   return -EACCES;
+   /* AUTH is only for master ... */
+   if (unlikely((flags & DRM_AUTH) && drm_is_primary_client(file_priv))) {
+   /* authenticated ones, or render capable on DRM_RENDER_ALLOW. */
+   if (!file_priv->authenticated &&
+   !drm_render_driver_and_ioctl(dev, flags))
+   return -EACCES;
+   }

This breaks vmwgfx primary client authentication in the surface_reference
ioctl, which takes different paths in case of render clients and primary
clients, but adding an auth check in the primary path in the vmwgfx code
should fix this.

Hm yeah we need to adjust that ... otoh kinda not sure why this is gated
on authentication status, and not on "am I master or not" status. At least
from a very cursory read ...
-Daniel


The code snippet in question is:


        if (drm_is_primary_client(file_priv) &&
            user_srf->master != file_priv->master) {
            DRM_ERROR("Trying to reference surface outside of"
                  " master domain.\n");
            ret = -EACCES;
            goto out_bad_resource;
        }


In GEM terms this means a client can't open a surface that hasn't been
flinked by a client in the same master realm: you can't read from
resources belonging to another X server's clients.
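
(A hedged sketch of the fix hinted at above - keep rejecting unauthenticated
primary clients in this path even when core drm lets DRM_RENDER_ALLOW ioctls
through on the primary node; not the actual patch:)

	if (drm_is_primary_client(file_priv) &&
	    (!file_priv->authenticated ||
	     user_srf->master != file_priv->master)) {
		DRM_ERROR("Trying to reference surface outside of"
			  " master domain.\n");
		ret = -EACCES;
		goto out_bad_resource;
	}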


/Thomas






/Thomas



/* MASTER is only for master or control clients */
if (unlikely((flags & DRM_MASTER) &&



Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-27 Thread Thomas Hellstrom

On 5/27/19 2:35 PM, Emil Velikov wrote:

Hi Thomas,

On 2019/05/27, Thomas Hellstrom wrote:


I think we might be talking past each other, let's take a step back:

  - as of previous patch, all of vmwgfx ioctls size is consistently
handled by the core

I don't think I follow you here, AFAICT patch 3/5 only affects and
relaxes the execbuf checking (and in fact a little more than I would
like)?


Precisely, it makes execbuf ioctl behave like all other ioctls - both
vmwgfx and rest of drm.


But we're still enforcing a non-relaxed size check for the other vmwgfx 
private ioctls, right? Which is relaxed, together with the directions, 
in this commit?


(Not that it matters much to the discussion, though).




  - handling of feature flags, as always, is the responsibility of the
driver itself
  - with this patch, ioctl direction is also handled by core.

Here core ensures we only copy in/out as much data as the kernel
implementation can handle.


Let's consider the following real world example - msm and virtio_gpu.

An in field of an _IOW ioctl becomes in/out, aka an _IOWR ioctl.
  - we add a flag to annotate/request the out direction; as always, invalid
flags return -EINVAL
  - we change the ioctl encoding

As currently handled by core DRM, old kernel/new userspace and
vice-versa works just fine. Sadly, vmwgfx will error out, while it
could
be avoided.

IMO basically we have a tradeoff between strict checking in this case,
and new user-space vs old kernel "hassle-free" transition in the
relaxed case.


Precisely. If I read the code correctly, ATM new userspace will fail
against old kernels, unless userspace writes two versions of the ioctl -
one with each encoding.


As said above, I'll gladly adjust core and/or others, if this relaxed
approach causes an issue somewhere. A specific use-case, real or
hypothetical will be appreciated.

To me there are two important reasons to keep the strict approach.

1) Avoid user-space mistakes early in the development cycle. We can't
distinguish between buggy user-space and "new" user-space. This is
important because of [a]) below.


Can you provide a concrete example, please?


OK, let's say you were developing fence wait functionality, like the
vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the
wait never timed out as it should. The reason turned out to be that
signals were restarting the waits with the original timeout. So you
change the ioctl from W to RW and add a kernel-computed time to the
argument. Everything is fine, except that you forget to change this in a
user-space application somewhere.


So now what happens is that the user-space bug can live on undetected
as in 1), and that means you can never go back and implement a stricter
check because that would completely break old user-space.


The current code will trap (and has historically trapped) code like 
this. That's mainly why I'm reluctant to give it up, but I guess it can 
be conditionally compiled in for debug purposes.





2) Catch a lot of fuzzer combinations and error out early instead of
forwarding them to the ioctl function where they may cause harm.


Struggling to see why this is a problem? At some point the fuzzer will
get past this first line of defence, so we want to make sure the rest of
the ioctl is robust.



I think the new user-space vs old kernel can be handled nicely in user-
space with feature flags or API versions. That's the way we've handled
them up to now?


How is a feature flag going to help if the encoding changes from _IOW
to _IOWR?


Ah, you're referring to old user-space new kernel? Yes, I was probably 
reading a bit too fast. Sorry about that.


So we're basically landing in a tradeoff between trapping problems like
the above, and hassle-free ioctl argument definition changes.


OK, so I'm ok with that as long as there is a way we can compile in
strict checking, which will likely have to be a vmwgfx-specific wrapper.


/Thomas





Thanks
Emil

Re: [PATCH 13/13] drm: allow render capable master with DRM_AUTH ioctls

2019-05-27 Thread Thomas Hellstrom

On 5/27/19 10:17 AM, Emil Velikov wrote:

From: Emil Velikov 

There are cases (in mesa and applications) where one would open the
primary node without properly authenticating the client.

Sometimes we don't check if the authentication succeeds, but there's
also cases we simply forget to do it.

The former was a case for Mesa where it did not check the return
value of drmGetMagic() [1]. That was fixed recently, although there's
the question of older drivers or other apps that exhibit this behaviour.

While omitting the call results in issues as seen in [2] and [3].

In the libva case, libva itself doesn't authenticate the DRM client and
the vaGetDisplayDRM documentation doesn't mention if the app should
either.

As of today, the official vainfo utility doesn't authenticate.

To work around issues like these, some users resort to running their apps
under sudo, which admittedly isn't always a good idea.

Since any DRIVER_RENDER driver has sufficient isolation between clients,
we can use that, for unauthenticated [primary node] ioctls that require
DRM_AUTH. But only if the respective ioctl is tagged as DRM_RENDER_ALLOW.

v2:
- Rework/simplify if check (Daniel V)
- Add examples to commit messages, elaborate. (Daniel V)

v3:
- Use single unlikely (Daniel V)

v4:
- Patch was reverted because it broke AMDGPU, apply again. The AMDGPU
issue is fixed with earlier patch.

[1] 
https://gitlab.freedesktop.org/mesa/mesa/blob/2bc1f5c2e70fe3b4d41f060af9859bc2a94c5b62/src/egl/drivers/dri2/platform_wayland.c#L1136
[2] https://lists.freedesktop.org/archives/libva/2016-July/004185.html
[3] https://gitlab.freedesktop.org/mesa/kmscube/issues/1
Testcase: igt/core_unauth_vs_render
Cc: intel-...@lists.freedesktop.org
Signed-off-by: Emil Velikov 
Reviewed-by: Daniel Vetter 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20190114085408.15933-2-emil.l.veli...@gmail.com
---
  drivers/gpu/drm/drm_ioctl.c | 20 
  1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 9841c0076f02..b64b022a2b29 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -511,6 +511,13 @@ int drm_version(struct drm_device *dev, void *data,
return err;
  }
  
+static inline bool

+drm_render_driver_and_ioctl(const struct drm_device *dev, u32 flags)
+{
+   return drm_core_check_feature(dev, DRIVER_RENDER) &&
+   (flags & DRM_RENDER_ALLOW);
+}
+
  /**
   * drm_ioctl_permit - Check ioctl permissions against caller
   *
@@ -525,14 +532,19 @@ int drm_version(struct drm_device *dev, void *data,
   */
  int drm_ioctl_permit(u32 flags, struct drm_file *file_priv)
  {
+   const struct drm_device *dev = file_priv->minor->dev;
+
/* ROOT_ONLY is only for CAP_SYS_ADMIN */
if (unlikely((flags & DRM_ROOT_ONLY) && !capable(CAP_SYS_ADMIN)))
return -EACCES;
  
-	/* AUTH is only for authenticated or render client */

-   if (unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) &&
-!file_priv->authenticated))
-   return -EACCES;
+   /* AUTH is only for master ... */
+   if (unlikely((flags & DRM_AUTH) && drm_is_primary_client(file_priv))) {
+   /* authenticated ones, or render capable on DRM_RENDER_ALLOW. */
+   if (!file_priv->authenticated &&
+   !drm_render_driver_and_ioctl(dev, flags))
+   return -EACCES;
+   }


This breaks vmwgfx primary client authentication in the 
surface_reference ioctl, which takes different paths in case of render 
clients and primary clients, but adding an auth check in the primary 
path in the vmwgfx code should fix this.


/Thomas


  
  	/* MASTER is only for master or control clients */

if (unlikely((flags & DRM_MASTER) &&




Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-27 Thread Thomas Hellstrom
Hi, Emil,

On Mon, 2019-05-27 at 10:08 +0100, Emil Velikov wrote:
> On 2019/05/25, Thomas Hellstrom wrote:
> > On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote:
> > > Hi, Emil
> > > 
> > > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:
> > > > On 2019/05/24, Thomas Hellstrom wrote:
> > > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> > > > > > On 2019/05/23, Thomas Hellstrom wrote:
> > > > > > > Hi, Emil,
> > > > > > > 
> > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > > > > > > From: Emil Velikov 
> > > > > > > > 
> > > > > > > > Drop the custom ioctl io encoding check - core drm does
> > > > > > > > it
> > > > > > > > for
> > > > > > > > us.
> > > > > > > 
> > > > > > > I fail to see where the core does this, or do I miss
> > > > > > > something?
> > > > > > 
> > > > > > drm_ioctl() allows for the encoding to be changed and
> > > > > > attributes
> > > > > > that
> > > > > > only the
> > > > > > appropriate size is copied in/out of the kernel.
> > > > > > 
> > > > > > Technically the function is more relaxed relative to the
> > > > > > vmwgfx
> > > > > > check, yet
> > > > > > seems perfectly reasonable.
> > > > > > 
> > > > > > Is there any corner-case that isn't but should be handled
> > > > > > in
> > > > > > drm_ioctl()?
> > > > > 
> > > > > I'd like to turn the question around and ask whether there's
> > > > > a
> > > > > reason
> > > > > we should relax the vmwgfx test? In the past it has trapped
> > > > > quite
> > > > > a
> > > > > few
> > > > > user-space errors.
> > > > > 
> > > > The way I see it either:
> > > >  - the check, as-is, is unnessesary, or
> > > >  - it is needed, and we should do something equivalent for all
> > > > of
> > > > DRM
> > > > 
> > > > We had a very long brainstorming session with a colleague and
> > > > we
> > > > could not see
> > > > any cases where this would cause a problem. If you recall
> > > > anything
> > > > concrete
> > > > please let me know - I would be more than happy to take a
> > > > closer
> > > > look.
> > > 
> > > If you have a good reason to drop an ioctl sanity check, I'd be
> > > perfectly happy to do it. To me, a good reason even includes "I
> > > have
> > > a
> > > non-open-source customer having problems with this check" because
> > > of
> > > reason etc. etc. as long as I have a way to evaluate those
> > > reasons
> > > and
> > > determine if they're valid or not. "No other drm driver nor the
> > > core
> > > is
> > > doing this" is NOT a valid reason to me. In particular if the
> > > check
> > > is
> > > not affecting performance. So unless you provide additional
> > > reasons
> > > to
> > > drop this check, it's a solid NAK from my side.
> > 
> > To clarify my point of view a bit, this check is useful to early
> > catch
> > userspace using incorrect flags and sizes, which otherwise might
> > make
> > it out to distros and after that, introducing a check like this
> > would
> > be impossible, since it might break old user-space. For the same
> > reason
> > it would probably be very difficult to introduce it in core drm. 
> > 
> I think we might be talking past each other, let's take a step back:
> 
>  - as of previous patch, all of vmwgfx ioctls size is consistently
> handled by the core

I don't think I follow you here, AFAICT patch 3/5 only affects and
relaxes the execbuf checking (and in fact a little more than I would
like)?

>  - handling of featue flags, as always, is responsibility of the
> driver
> ifself
>  - with this patch, ioctl direction is also handled by core.
> 
> Here core ensures we only copy in/out as much data as the kernel
> implementation can handle.
> 
> 
> Let's consider the following real world example - msm and virtio_gpu.
>

Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-25 Thread Thomas Hellstrom
On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote:
> Hi, Emil
> 
> On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:
> > On 2019/05/24, Thomas Hellstrom wrote:
> > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> > > > On 2019/05/23, Thomas Hellstrom wrote:
> > > > > Hi, Emil,
> > > > > 
> > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > > > > From: Emil Velikov 
> > > > > > 
> > > > > > Drop the custom ioctl io encoding check - core drm does it
> > > > > > for
> > > > > > us.
> > > > > 
> > > > > I fail to see where the core does this, or do I miss
> > > > > something?
> > > > 
> > > > drm_ioctl() allows for the encoding to be changed and
> > > > attributes
> > > > that
> > > > only the
> > > > appropriate size is copied in/out of the kernel.
> > > > 
> > > > Technically the function is more relaxed relative to the vmwgfx
> > > > check, yet
> > > > seems perfectly reasonable.
> > > > 
> > > > Is there any corner-case that isn't but should be handled in
> > > > drm_ioctl()?
> > > 
> > > I'd like to turn the question around and ask whether there's a
> > > reason
> > > we should relax the vmwgfx test? In the past it has trapped quite
> > > a
> > > few
> > > user-space errors.
> > > 
> > The way I see it either:
> >  - the check, as-is, is unnessesary, or
> >  - it is needed, and we should do something equivalent for all of
> > DRM
> > 
> > We had a very long brainstorming session with a colleague and we
> > could not see
> > any cases where this would cause a problem. If you recall anything
> > concrete
> > please let me know - I would be more than happy to take a closer
> > look.
> 
> If you have a good reason to drop an ioctl sanity check, I'd be
> perfectly happy to do it. To me, a good reason even includes "I have
> a
> non-open-source customer having problems with this check" because of
> reason etc. etc. as long as I have a way to evaluate those reasons
> and
> determine if they're valid or not. "No other drm driver nor the core
> is
> doing this" is NOT a valid reason to me. In particular if the check
> is
> not affecting performance. So unless you provide additional reasons
> to
> drop this check, it's a solid NAK from my side.

To clarify my point of view a bit, this check is useful for catching
userspace using incorrect flags and sizes early; otherwise such bugs
might make it out to distros, and after that, introducing a check like
this would be impossible, since it might break old user-space. For the
same reason it would probably be very difficult to introduce it in core drm.

Thanks,
Thomas



> 
> Thanks,
> Thomas
> 
> 
> > Thanks
> > Emil

Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-24 Thread Thomas Hellstrom
Hi, Emil

On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:
> On 2019/05/24, Thomas Hellstrom wrote:
> > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> > > On 2019/05/23, Thomas Hellstrom wrote:
> > > > Hi, Emil,
> > > > 
> > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > > > From: Emil Velikov 
> > > > > 
> > > > > Drop the custom ioctl io encoding check - core drm does it
> > > > > for
> > > > > us.
> > > > 
> > > > I fail to see where the core does this, or do I miss something?
> > > 
> > > drm_ioctl() allows for the encoding to be changed and attributes
> > > that
> > > only the
> > > appropriate size is copied in/out of the kernel.
> > > 
> > > Technically the function is more relaxed relative to the vmwgfx
> > > check, yet
> > > seems perfectly reasonable.
> > > 
> > > Is there any corner-case that isn't but should be handled in
> > > drm_ioctl()?
> > 
> > I'd like to turn the question around and ask whether there's a
> > reason
> > we should relax the vmwgfx test? In the past it has trapped quite a
> > few
> > user-space errors.
> > 
> The way I see it either:
>  - the check, as-is, is unnessesary, or
>  - it is needed, and we should do something equivalent for all of DRM
> 
> We had a very long brainstorming session with a colleague and we
> could not see
> any cases where this would cause a problem. If you recall anything
> concrete
> please let me know - I would be more than happy to take a closer
> look.

If you have a good reason to drop an ioctl sanity check, I'd be
perfectly happy to do it. To me, a good reason even includes "I have a
non-open-source customer having problems with this check" because of
reason etc. etc. as long as I have a way to evaluate those reasons and
determine if they're valid or not. "No other drm driver nor the core is
doing this" is NOT a valid reason to me. In particular if the check is
not affecting performance. So unless you provide additional reasons to
drop this check, it's a solid NAK from my side.

Thanks,
Thomas


> 
> Thanks
> Emil

Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-24 Thread Thomas Hellstrom
On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> On 2019/05/23, Thomas Hellstrom wrote:
> > Hi, Emil,
> > 
> > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > From: Emil Velikov 
> > > 
> > > Drop the custom ioctl io encoding check - core drm does it for
> > > us.
> > 
> > I fail to see where the core does this, or do I miss something?
> 
> drm_ioctl() allows for the encoding to be changed and attributes that
> only the
> appropriate size is copied in/out of the kernel.
> 
> Technically the function is more relaxed relative to the vmwgfx
> check, yet
> seems perfectly reasonable.
> 
> Is there any corner-case that isn't but should be handled in
> drm_ioctl()?

I'd like to turn the question around and ask whether there's a reason
we should relax the vmwgfx test? In the past it has trapped quite a few
user-space errors.

Thanks,
Thomas



> 
> -Emil

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-24 Thread Thomas Hellstrom

On 5/24/19 2:03 PM, Koenig, Christian wrote:

Am 24.05.19 um 12:37 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 12:18 PM, Koenig, Christian wrote:

Am 24.05.19 um 11:55 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 11:11 AM, Thomas Hellstrom wrote:

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we
might
want to be able to switch normal (encrypted) memory to decrypted in
exactly
the same way as we handle caching states, and that would require
additional
memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled.
Set up
the page protection accordingly. Drivers must detect SEV enabled and
switch
to the dma page pool.

This patch has not yet been tested. As a follow-up, we might want to
cache decrypted pages in the dma page pool regardless of their
caching
state.

This patch is unnecessary, SEV support already works fine with at
least
amdgpu and I would expect that it also works with other drivers as
well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

    drm: fallback to dma_alloc_coherent when memory encryption is
active

    We can't just map any randome page we get when memory
encryption is
    active.

    Signed-off-by: Christian König 
    Acked-by: Alex Deucher 
    Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.

Yes, I noticed that. Although I fail to see where we automagically
clear the PTE encrypted bit when mapping coherent memory? For the
linear kernel map, that's done within dma_alloc_coherent() but for
kernel vmaps and user-space maps? Is that done automatically by
the x86 platform layer?

Yes, I think so. Haven't looked too closely at this either.

This sounds a bit odd. If that were the case, the natural place would be
the PAT tracking code, but it only handles caching flags AFAICT. Not
encryption flags.

But when you tested AMD with SEV, was that running as hypervisor rather
than a guest, or did you run an SEV guest with PCI passthrough to the
AMD device?

Yeah, well the problem is we never tested this ourself :)


/Thomas


And, as a follow up question, why do we need dma_alloc_coherent() when
using SME? I thought the hardware performs the decryption when DMA-ing
to / from an encrypted page with SME, but not with SEV?

I think the issue was that the DMA API would try to use a bounce buffer
in this case.

SEV forces SWIOTLB bouncing on, but not SME. So it should probably be
possible to avoid dma_alloc_coherent() in the SME case.
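
(A hedged sketch of that distinction, using the x86 helpers as they existed
around this time; treat the exact function name as an assumption:)

	#include <linux/mem_encrypt.h>
	#include <linux/types.h>

	/*
	 * Sketch: only force the coherent/DMA page pool when running as an
	 * SEV guest; with bare-metal SME the device can DMA to encrypted
	 * pages (given a sufficiently wide DMA mask), so the normal pool
	 * could be kept.
	 */
	static bool need_decrypted_dma_pool(void)
	{
		return sev_active();
	}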

In this case I don't have an explanation for this.

For the background, what happened is that we got reports that SEV/SME
doesn't work with amdgpu. So we told the people to try using the
dma_alloc_coherent() path and that worked fine. Because of this we came
up with the patch I noted earlier.

I can confirm that it indeed works now for a couple of users, but we
still don't have a test system for this in our team.

Christian.


OK, understood,

But unless there is some strange magic going on (which there might be,
of course), I do think the patch I sent is correct, and the reason that
SEV works is that the AMD card is used by the hypervisor and not the
guest, and TTM is actually incorrectly creating conflicting maps and
treating the coherent memory as encrypted. But since the memory is only
accessed through encrypted PTEs, the hardware does the right thing,
using the hypervisor key for decryption.


But that's only a guess, and this is not super-urgent. I will be able to 
follow up if / when we bring vmwgfx up for SEV.


/Thomas


/Thomas



Christian.


Thanks, Thomas






Re: [PATCH] drm/vmwgfx: fix a warning due to missing dma_parms

2019-05-24 Thread Thomas Hellstrom
On Fri, 2019-05-24 at 08:19 +0200, Christoph Hellwig wrote:
> On Thu, May 23, 2019 at 10:37:19PM -0400, Qian Cai wrote:
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > index bf6c3500d363..5c567b81174f 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > @@ -747,6 +747,13 @@ static int vmw_driver_load(struct drm_device
> > *dev, unsigned long chipset)
> > if (unlikely(ret != 0))
> > goto out_err0;
> >  
> > +   dev->dev->dma_parms =  kzalloc(sizeof(*dev->dev->dma_parms),
> > +  GFP_KERNEL);
> > +   if (!dev->dev->dma_parms)
> > +   goto out_err0;
> 
> What bus does this device come from?  I though vmgfx was a
> (virtualized)
> PCI device, in which case this should be provided by the PCI core.
> Or are we calling DMA mapping routines on arbitrary other struct
> device,
> in which case that is the real bug and we should switch the PCI
> device
> instead.

It's a PCI device. The struct device * used in dma_map_sg() is the same
as the _dev->dev handed to the probe() callback. But at probe time,
the struct device::dma_parms is non-NULL, at least on my system so
there shouldn't really be a need to kzalloc() it.

> 
> > +   dma_set_max_seg_size(dev->dev, *dev->dev->dma_mask);

The max is U32_MAX.
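
(A hedged sketch of the simpler alternative implied here - the segment size
parameter is an unsigned int, so just advertise the maximum rather than
deriving it from the DMA mask:)

	/* In the driver load path, assuming dma_parms is already set up by the PCI core: */
	dma_set_max_seg_size(dev->dev, U32_MAX);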

/Thomas


> 
> That looks odd.  If you want to support an unlimited segment size
> just pass UINT_MAX here.

Re: [PATCH 3/5] drm/vmwgfx: use core drm to extend/check vmw_execbuf_ioctl

2019-05-24 Thread Thomas Hellstrom

On 5/24/19 12:53 PM, Emil Velikov wrote:

On 2019/05/24, Daniel Vetter wrote:

On Fri, May 24, 2019 at 8:05 AM Thomas Hellstrom  wrote:

On Wed, 2019-05-22 at 21:09 +0200, Daniel Vetter wrote:

On Wed, May 22, 2019 at 9:01 PM Thomas Hellstrom <
thellst...@vmware.com> wrote:

Hi, Emil,

On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:

From: Emil Velikov 

Currently vmw_execbuf_ioctl() open-codes the permission checking,
size
extending and copying that is already done in core drm.

Kill all the duplication, adding a few comments for clarity.

Ah, there is core functionality for this now.

What worries me though with the core approach is that the sizes are not
capped by the size of the kernel argument definition, which means
malicious user-space is able to force kmallocs() of the maximum ioctl
size. Should probably be fixed before pushing this.

Hm I always worked under the assumption that kmalloc and friends
should be userspace hardened. Otherwise stuff like kmalloc_array
doesn't make any sense, everyone just feeds it unchecked input and
expects that helper to handle overflows.

If we assume kmalloc isn't hardened against that, then we have a much
bigger problem than just vmwgfx ioctls ...

After checking the drm_ioctl code I realize that what I thought was new
behaviour actually has been around for a couple of years, so
fixing isn't really tied to this patch series...

What caused me to react was that previously we used to have this

e4fda9f264e1 ("drm: Perform ioctl command validation on the stored
kernel values")

and we seem to have lost that now, if not for the io flags then at
least for the size part. For the size of the ioctl arguments, I think
in general if the kernel only touches a subset of the user-space
specified size I see no reason why we should malloc / copy more than
that?

I guess we could optimize that, but we'd probably still need to zero
clear the added size for forward compat with newer userspace. Iirc
we've had some issues in this area.
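
(A hedged, simplified sketch of the core behaviour being discussed - not the
literal drm_ioctl() source - showing the copy-in capped at what user space
provided, with the remainder of the kernel-side struct zero-filled:)

	u32 in_size = min(usize, ksize);	/* user-provided vs kernel struct size */

	if (copy_from_user(kdata, (void __user *)arg, in_size))
		return -EFAULT;
	if (ksize > in_size)
		memset(kdata + in_size, 0, ksize - in_size);	/* forward compat */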


Now, given the fact that the maximum ioctl argument size is quite
limited, that might not be a big problem or a problem at all. Otherwise
it would be pretty easy for a malicious process to allocate most or all
of a system's resident memory?

The biggest you can allocate from kmalloc is limited by the largest
contiguous chunk alloc_pages gives you, which is limited by MAX_ORDER
from the page buddy allocator. You need lots of process to be able to
exhaust memory like that (and like I said, the entire kernel would be
broken if we'd consider this a security issue). If you want to make
sure that a process group can't exhaust memory this way then you need
to set appropriate cgroups limits.

I do agree with all the sentiments that drm_ioctl() could use some extra
optimisation and hardening. At the same time I would remind that the
code has been used as-is by vmwgfx and other drivers for years.

In other words: let's keep that work as orthogonal series.

What do you guys think?


I agree. Then I only had a concern with one of the patches.

/Thomas



Emil

Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-24 Thread Thomas Hellstrom

On 5/24/19 12:18 PM, Koenig, Christian wrote:

Am 24.05.19 um 11:55 schrieb Thomas Hellstrom:

[CAUTION: External Email]

On 5/24/19 11:11 AM, Thomas Hellstrom wrote:

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we
might
want to be able to switch normal (encrypted) memory to decrypted in
exactly
the same way as we handle caching states, and that would require
additional
memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled.
Set up
the page protection accordingly. Drivers must detect SEV enabled and
switch
to the dma page pool.

This patch has not yet been tested. As a follow-up, we might want to
cache decrypted pages in the dma page pool regardless of their caching
state.

This patch is unnecessary, SEV support already works fine with at least
amdgpu and I would expect that it also works with other drivers as
well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

   drm: fallback to dma_alloc_coherent when memory encryption is
active

   We can't just map any randome page we get when memory
encryption is
   active.

   Signed-off-by: Christian König 
   Acked-by: Alex Deucher 
   Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.

Yes, I noticed that. Although I fail to see where we automagically
clear the PTE encrypted bit when mapping coherent memory? For the
linear kernel map, that's done within dma_alloc_coherent() but for
kernel vmaps and user-space maps? Is that done automatically by
the x86 platform layer?

Yes, I think so. Haven't looked too closely at this either.


This sounds a bit odd. If that were the case, the natural place would be 
the PAT tracking code, but it only handles caching flags AFAICT. Not 
encryption flags.


But when you tested AMD with SEV, was that running as hypervisor rather 
than a guest, or did you run an SEV guest with PCI passthrough to the 
AMD device?





/Thomas


And, as a follow up question, why do we need dma_alloc_coherent() when
using SME? I thought the hardware performs the decryption when DMA-ing
to / from an encrypted page with SME, but not with SEV?

I think the issue was that the DMA API would try to use a bounce buffer
in this case.


SEV forces SWIOTLB bouncing on, but not SME. So it should probably be 
possible to avoid dma_alloc_coherent() in the SME case.


/Thomas




Christian.


Thanks, Thomas






Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-24 Thread Thomas Hellstrom

On 5/24/19 11:11 AM, Thomas Hellstrom wrote:

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we 
might
want to be able to switch normal (encrypted) memory to decrypted in 
exactly
the same way as we handle caching states, and that would require 
additional

memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled. 
Set up
the page protection accordingly. Drivers must detect SEV enabled and 
switch

to the dma page pool.

This patch has not yet been tested. As a follow-up, we might want to
cache decrypted pages in the dma page pool regardless of their caching
state.

This patch is unnecessary, SEV support already works fine with at least
amdgpu and I would expect that it also works with other drivers as well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

      drm: fallback to dma_alloc_coherent when memory encryption is 
active


      We can't just map any randome page we get when memory 
encryption is

      active.

      Signed-off-by: Christian König 
      Acked-by: Alex Deucher 
      Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.


Yes, I noticed that. Although I fail to see where we automagically 
clear the PTE encrypted bit when mapping coherent memory? For the 
linear kernel map, that's done within dma_alloc_coherent() but for 
kernel vmaps and user-space maps? Is that done automatically by 
the x86 platform layer?


/Thomas

And, as a follow up question, why do we need dma_alloc_coherent() when 
using SME? I thought the hardware performs the decryption when DMA-ing 
to / from an encrypted page with SME, but not with SEV?


Thanks, Thomas




Re: [RFC PATCH] drm/ttm, drm/vmwgfx: Have TTM support AMD SEV encryption

2019-05-24 Thread Thomas Hellstrom

Hi, Christian,

On 5/24/19 10:37 AM, Koenig, Christian wrote:

Am 24.05.19 um 10:11 schrieb Thomas Hellström (VMware):

[CAUTION: External Email]

From: Thomas Hellstrom 

With SEV encryption, all DMA memory must be marked decrypted
(AKA "shared") for devices to be able to read it. In the future we might
want to be able to switch normal (encrypted) memory to decrypted in exactly
the same way as we handle caching states, and that would require additional
memory pools. But for now, rely on memory allocated with
dma_alloc_coherent() which is already decrypted with SEV enabled. Set up
the page protection accordingly. Drivers must detect SEV enabled and switch
to the dma page pool.

This patch has not yet been tested. As a follow-up, we might want to
cache decrypted pages in the dma page pool regardless of their caching
state.

This patch is unnecessary, SEV support already works fine with at least
amdgpu and I would expect that it also works with other drivers as well.

Also see this patch:

commit 64e1f830ea5b3516a4256ed1c504a265d7f2a65c
Author: Christian König 
Date:   Wed Mar 13 10:11:19 2019 +0100

      drm: fallback to dma_alloc_coherent when memory encryption is active

      We can't just map any randome page we get when memory encryption is
      active.

      Signed-off-by: Christian König 
      Acked-by: Alex Deucher 
      Link: https://patchwork.kernel.org/patch/10850833/

Regards,
Christian.


Yes, I noticed that. Although I fail to see where we automagically clear 
the PTE encrypted bit when mapping coherent memory? For the linear 
kernel map, that's done within dma_alloc_coherent() but for kernel vmaps 
and user-space maps? Is that done automatically by the x86 platform 
layer?


/Thomas





Cc: Christian König 
Signed-off-by: Thomas Hellstrom 
---
   drivers/gpu/drm/ttm/ttm_bo_util.c| 17 +
   drivers/gpu/drm/ttm/ttm_bo_vm.c  |  6 --
   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |  3 +++
   drivers/gpu/drm/vmwgfx/vmwgfx_blit.c |  6 --
   include/drm/ttm/ttm_bo_driver.h  |  8 +---
   include/drm/ttm/ttm_tt.h |  1 +
   6 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 895d77d799e4..1d6643bd0b01 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -419,11 +419,13 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
  page = i * dir + add;
  if (old_iomap == NULL) {
  pgprot_t prot = ttm_io_prot(old_mem->placement,
+   ttm->page_flags,
  PAGE_KERNEL);
  ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
 prot);
  } else if (new_iomap == NULL) {
  pgprot_t prot = ttm_io_prot(new_mem->placement,
+   ttm->page_flags,
  PAGE_KERNEL);
  ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
 prot);
@@ -526,11 +528,11 @@ static int ttm_buffer_object_transfer(struct 
ttm_buffer_object *bo,
  return 0;
   }

-pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
+pgprot_t ttm_io_prot(u32 caching_flags, u32 tt_page_flags, pgprot_t tmp)
   {
  /* Cached mappings need no adjustment */
  if (caching_flags & TTM_PL_FLAG_CACHED)
-   return tmp;
+   goto check_encryption;

   #if defined(__i386__) || defined(__x86_64__)
  if (caching_flags & TTM_PL_FLAG_WC)
@@ -548,6 +550,11 @@ pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
   #if defined(__sparc__) || defined(__mips__)
  tmp = pgprot_noncached(tmp);
   #endif
+
+check_encryption:
+   if (tt_page_flags & TTM_PAGE_FLAG_DECRYPTED)
+   tmp = pgprot_decrypted(tmp);
+
  return tmp;
   }
   EXPORT_SYMBOL(ttm_io_prot);
@@ -594,7 +601,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
  if (ret)
  return ret;

-   if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED)) {
+   if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED) &&
+   !(ttm->page_flags & TTM_PAGE_FLAG_DECRYPTED)) {
  /*
   * We're mapping a single page, and the desired
   * page protection is consistent with the bo.
@@ -608,7 +616,8 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
   * We need to use vmap to get the desired page protection
   * or to make the buffer object look contiguous.
   */
-   prot = t

Re: [PATCH 3/5] drm/vmwgfx: use core drm to extend/check vmw_execbuf_ioctl

2019-05-24 Thread Thomas Hellstrom
On Wed, 2019-05-22 at 21:09 +0200, Daniel Vetter wrote:
> On Wed, May 22, 2019 at 9:01 PM Thomas Hellstrom <
> thellst...@vmware.com> wrote:
> > Hi, Emil,
> > 
> > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > From: Emil Velikov 
> > > 
> > > Currently vmw_execbuf_ioctl() open-codes the permission checking,
> > > size
> > > extending and copying that is already done in core drm.
> > > 
> > > Kill all the duplication, adding a few comments for clarity.
> > 
> > Ah, there is core functionality for this now.
> > 
> > What worries me though with the core approach is that the sizes are
> > not capped by the size of the kernel argument definition, which makes
> > it possible for malicious user-space to force kmallocs() of the
> > maximum ioctl size. Should probably be fixed before pushing this.
> 
> Hm I always worked under the assumption that kmalloc and friends
> should be userspace hardened. Otherwise stuff like kmalloc_array
> doesn't make any sense, everyone just feeds it unchecked input and
> expects that helper to handle overflows.
> 
> If we assume kmalloc isn't hardened against that, then we have a much
> bigger problem than just vmwgfx ioctls ...

After checking the drm_ioctl code I realize that what I thought was new
behaviour actually has been around for a couple of years, so
fixing isn't really tied to this patch series...

What caused me to react was that previously we used to have this

e4fda9f264e1 ("drm: Perform ioctl command validation on the stored
kernel values")

and we seem to have lost that now, if not for the io flags then at
least for the size part. For the size of the ioctl arguments, I think in
general that if the kernel only touches a subset of the user-space
specified size, there is no reason why we should malloc / copy more than
that.

Now, given the fact that the maximum ioctl argument size is quite
limited, that might not be a big problem or a problem at all. Otherwise
it would be pretty easy for a malicious process to allocate most or all
of a system's resident memory?
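
A rough sketch (not from the thread) of the kind of cap being argued
for: clamp what the core copies and allocates to the size the
kernel-side ioctl descriptor declares. "cmd", "arg" and "ioctl" stand in
for locals that drm_ioctl() already has; this is not the actual
drm_ioctl() code:

        unsigned int in_size = _IOC_SIZE(cmd);          /* user-controlled */
        unsigned int ksize = _IOC_SIZE(ioctl->cmd);     /* kernel definition */
        void *kdata;

        if (in_size > ksize)
                in_size = ksize;  /* never copy more than the kernel looks at */

        kdata = kzalloc(ksize, GFP_KERNEL);  /* zero the tail the user didn't supply */
        if (!kdata)
                return -ENOMEM;

        if (copy_from_user(kdata, (void __user *)arg, in_size)) {
                kfree(kdata);
                return -EFAULT;
        }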

/Thomas







> -Daniel
> 
> > 
> > > Cc: "VMware Graphics" 
> > > Cc: Thomas Hellstrom 
> > > Cc: Daniel Vetter 
> > > Signed-off-by: Emil Velikov 
> > > ---
> > > Thomas, VMware team,
> > > 
> > > Please give this some testing on your end. I've only tested it
> > > against
> > > mesa-master.
> > 
> > I'll review tomorrow and do some testing. Need to see if I can dig
> > up
> > user-space apps with version 0...
> > 
> > Thanks,
> > 
> > Thomas
> > 
> > > Thanks
> > > Emil
> > > ---
> > >  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 12 +-
> > >  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  4 +-
> > >  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 52 +
> > > 
> > > 
> > >  3 files changed, 23 insertions(+), 45 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > > b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > > index d3f108f7e52d..2cb6ae219e43 100644
> > > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > > @@ -186,7 +186,7 @@ static const struct drm_ioctl_desc
> > > vmw_ioctls[] =
> > > {
> > > DRM_RENDER_ALLOW),
> > >   VMW_IOCTL_DEF(VMW_REF_SURFACE, vmw_surface_reference_ioctl,
> > > DRM_AUTH | DRM_RENDER_ALLOW),
> > > - VMW_IOCTL_DEF(VMW_EXECBUF, NULL, DRM_AUTH |
> > > + VMW_IOCTL_DEF(VMW_EXECBUF, vmw_execbuf_ioctl, DRM_AUTH |
> > > DRM_RENDER_ALLOW),
> > >   VMW_IOCTL_DEF(VMW_FENCE_WAIT, vmw_fence_obj_wait_ioctl,
> > > DRM_RENDER_ALLOW),
> > > @@ -1140,15 +1140,7 @@ static long vmw_generic_ioctl(struct file
> > > *filp, unsigned int cmd,
> > >   _ioctls[nr - DRM_COMMAND_BASE];
> > > 
> > >   if (nr == DRM_COMMAND_BASE + DRM_VMW_EXECBUF) {
> > > - ret = (long) drm_ioctl_permit(ioctl->flags,
> > > file_priv);
> > > - if (unlikely(ret != 0))
> > > - return ret;
> > > -
> > > - if (unlikely((cmd & (IOC_IN | IOC_OUT)) !=
> > > IOC_IN))
> > > - goto out_io_encoding;
> > > -
> > > - return (long) vmw_execbuf_ioctl(dev, arg,
> 

Re: [PATCH 3/5] drm/vmwgfx: use core drm to extend/check vmw_execbuf_ioctl

2019-05-23 Thread Thomas Hellstrom
On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Currently vmw_execbuf_ioctl() open-codes the permission checking,
> size
> extending and copying that is already done in core drm.
> 
> Kill all the duplication, adding a few comments for clarity.
> 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Cc: Daniel Vetter 
> Signed-off-by: Emil Velikov 

Tested using piglit quick using execbuf versions 1 and 2.

Tested-by: Thomas Hellstrom 
Reviewed-by: Thomas Hellstrom 


> ---
> Thomas, VMware team,
> 
> Please give this some testing on your end. I've only tested it
> against
> mesa-master.
> 
> Thanks
> Emil
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 12 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  4 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 52 +
> 
>  3 files changed, 23 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index d3f108f7e52d..2cb6ae219e43 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -186,7 +186,7 @@ static const struct drm_ioctl_desc vmw_ioctls[] =
> {
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_REF_SURFACE, vmw_surface_reference_ioctl,
> DRM_AUTH | DRM_RENDER_ALLOW),
> - VMW_IOCTL_DEF(VMW_EXECBUF, NULL, DRM_AUTH |
> + VMW_IOCTL_DEF(VMW_EXECBUF, vmw_execbuf_ioctl, DRM_AUTH |
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_FENCE_WAIT, vmw_fence_obj_wait_ioctl,
> DRM_RENDER_ALLOW),
> @@ -1140,15 +1140,7 @@ static long vmw_generic_ioctl(struct file
> *filp, unsigned int cmd,
>   _ioctls[nr - DRM_COMMAND_BASE];
>  
>   if (nr == DRM_COMMAND_BASE + DRM_VMW_EXECBUF) {
> - ret = (long) drm_ioctl_permit(ioctl->flags,
> file_priv);
> - if (unlikely(ret != 0))
> - return ret;
> -
> - if (unlikely((cmd & (IOC_IN | IOC_OUT)) !=
> IOC_IN))
> - goto out_io_encoding;
> -
> - return (long) vmw_execbuf_ioctl(dev, arg,
> file_priv,
> - _IOC_SIZE(cmd))
> ;
> + return ioctl_func(filp, cmd, arg);
>   } else if (nr == DRM_COMMAND_BASE +
> DRM_VMW_UPDATE_LAYOUT) {
>   if (!drm_is_current_master(file_priv) &&
>   !capable(CAP_SYS_ADMIN))
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> index 9be2176cc260..f5bfac85f793 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> @@ -910,8 +910,8 @@ static inline struct page *vmw_piter_page(struct
> vmw_piter *viter)
>   * Command submission - vmwgfx_execbuf.c
>   */
>  
> -extern int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long
> data,
> -  struct drm_file *file_priv, size_t size);
> +extern int vmw_execbuf_ioctl(struct drm_device *dev, void *data,
> +  struct drm_file *file_priv);
>  extern int vmw_execbuf_process(struct drm_file *file_priv,
>  struct vmw_private *dev_priv,
>  void __user *user_commands,
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index 2ff7ba04d8c8..767e2b99618d 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -3977,54 +3977,40 @@ void vmw_execbuf_release_pinned_bo(struct
> vmw_private *dev_priv)
>   mutex_unlock(_priv->cmdbuf_mutex);
>  }
>  
> -int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long data,
> -   struct drm_file *file_priv, size_t size)
> +int vmw_execbuf_ioctl(struct drm_device *dev, void *data,
> +   struct drm_file *file_priv)
>  {
>   struct vmw_private *dev_priv = vmw_priv(dev);
> - struct drm_vmw_execbuf_arg arg;
> + struct drm_vmw_execbuf_arg *arg = data;
>   int ret;
> - static const size_t copy_offset[] = {
> - offsetof(struct drm_vmw_execbuf_arg, context_handle),
> - sizeof(struct drm_vmw_execbuf_arg)};
>   struct dma_fence *in_fence = NULL;
>  
> - if (unlikely(size < copy_offset[0])) {
> - VMW_DEBUG_USER("Invalid command size, ioctl %d\n",
> -DRM_VMW_EXECBUF);
> - return -EINVAL;
> - }
> -
> - 

Re: [PATCH 2/5] drm/vmgfx: kill off unused init_mutex

2019-05-23 Thread Thomas Hellstrom
On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> According to the docs - prevents firstopen/lastclose races. Yet never
> used in practice.
> 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Cc: Daniel Vetter 
> Signed-off-by: Emil Velikov 

Reviewed-by: Thomas Hellstrom 

> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 1 -
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 5 -
>  2 files changed, 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index a38f06909fb6..d3f108f7e52d 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -664,7 +664,6 @@ static int vmw_driver_load(struct drm_device
> *dev, unsigned long chipset)
>   INIT_LIST_HEAD(_priv->res_lru[i]);
>   }
>  
> - mutex_init(_priv->init_mutex);
>   init_waitqueue_head(_priv->fence_queue);
>   init_waitqueue_head(_priv->fifo_queue);
>   dev_priv->fence_queue_waiters = 0;
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> index 96983c47fb40..9be2176cc260 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> @@ -484,11 +484,6 @@ struct vmw_private {
>  
>   spinlock_t resource_lock;
>   struct idr res_idr[vmw_res_max];
> - /*
> -  * Block lastclose from racing with firstopen.
> -  */
> -
> - struct mutex init_mutex;
>  
>   /*
>* A resource manager for kernel-only surfaces and

Re: [PATCH 1/5] vmwgfx: drop empty lastclose stub

2019-05-23 Thread Thomas Hellstrom
On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Core DRM is safe when the callback is NULL.
> 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Cc: Daniel Vetter 
> Signed-off-by: Emil Velikov 

Reviewed-by: Thomas Hellstrom 


> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index be25ce9440ad..a38f06909fb6 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -1200,10 +1200,6 @@ static long vmw_compat_ioctl(struct file
> *filp, unsigned int cmd,
>  }
>  #endif
>  
> -static void vmw_lastclose(struct drm_device *dev)
> -{
> -}
> -
>  static void vmw_master_init(struct vmw_master *vmaster)
>  {
>   ttm_lock_init(>lock);
> @@ -1568,7 +1564,6 @@ static struct drm_driver driver = {
>   DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER | DRIVER_ATOMIC,
>   .load = vmw_driver_load,
>   .unload = vmw_driver_unload,
> - .lastclose = vmw_lastclose,
>   .get_vblank_counter = vmw_get_vblank_counter,
>   .enable_vblank = vmw_enable_vblank,
>   .disable_vblank = vmw_disable_vblank,

Re: [PATCH 4/5] drm/vmwgfx: remove custom ioctl io encoding check

2019-05-23 Thread Thomas Hellstrom
Hi, Emil,

On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Drop the custom ioctl io encoding check - core drm does it for us.

I fail to see where the core does this, or am I missing something?
Thanks,
Thomas


> 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Cc: Daniel Vetter 
> Signed-off-by: Emil Velikov 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index 2cb6ae219e43..f65542639b55 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -1147,9 +1147,6 @@ static long vmw_generic_ioctl(struct file
> *filp, unsigned int cmd,
>   return -EACCES;
>   }
>  
> - if (unlikely(ioctl->cmd != cmd))
> - goto out_io_encoding;
> -
>   flags = ioctl->flags;
>   } else if (!drm_ioctl_flags(nr, ))
>   return -EINVAL;
> @@ -1169,12 +1166,6 @@ static long vmw_generic_ioctl(struct file
> *filp, unsigned int cmd,
>   ttm_read_unlock(>lock);
>  
>   return ret;
> -
> -out_io_encoding:
> - DRM_ERROR("Invalid command format, ioctl %d\n",
> -   nr - DRM_COMMAND_BASE);
> -
> - return -EINVAL;
>  }
>  
>  static long vmw_unlocked_ioctl(struct file *filp, unsigned int cmd,

Re: [PATCH 3/5] drm/vmwgfx: use core drm to extend/check vmw_execbuf_ioctl

2019-05-22 Thread Thomas Hellstrom
Hi, Emil,

On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> Currently vmw_execbuf_ioctl() open-codes the permission checking,
> size
> extending and copying that is already done in core drm.
> 
> Kill all the duplication, adding a few comments for clarity.

Ah, there is core functionality for this now.

What worries me though with the core approach is that the sizes are not
capped by the size of the kernel argument definition, which makes it
possible for malicious user-space to force kmallocs() of the maximum
ioctl size. Should probably be fixed before pushing this.


> 
> Cc: "VMware Graphics" 
> Cc: Thomas Hellstrom 
> Cc: Daniel Vetter 
> Signed-off-by: Emil Velikov 
> ---
> Thomas, VMware team,
> 
> Please give this some testing on your end. I've only tested it
> against
> mesa-master.

I'll review tomorrow and do some testing. Need to see if I can dig up
user-space apps with version 0...

Thanks,

Thomas

> 
> Thanks
> Emil
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 12 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  4 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 52 +
> 
>  3 files changed, 23 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> index d3f108f7e52d..2cb6ae219e43 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> @@ -186,7 +186,7 @@ static const struct drm_ioctl_desc vmw_ioctls[] =
> {
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_REF_SURFACE, vmw_surface_reference_ioctl,
> DRM_AUTH | DRM_RENDER_ALLOW),
> - VMW_IOCTL_DEF(VMW_EXECBUF, NULL, DRM_AUTH |
> + VMW_IOCTL_DEF(VMW_EXECBUF, vmw_execbuf_ioctl, DRM_AUTH |
> DRM_RENDER_ALLOW),
>   VMW_IOCTL_DEF(VMW_FENCE_WAIT, vmw_fence_obj_wait_ioctl,
> DRM_RENDER_ALLOW),
> @@ -1140,15 +1140,7 @@ static long vmw_generic_ioctl(struct file
> *filp, unsigned int cmd,
>   _ioctls[nr - DRM_COMMAND_BASE];
>  
>   if (nr == DRM_COMMAND_BASE + DRM_VMW_EXECBUF) {
> - ret = (long) drm_ioctl_permit(ioctl->flags,
> file_priv);
> - if (unlikely(ret != 0))
> - return ret;
> -
> - if (unlikely((cmd & (IOC_IN | IOC_OUT)) !=
> IOC_IN))
> - goto out_io_encoding;
> -
> - return (long) vmw_execbuf_ioctl(dev, arg,
> file_priv,
> - _IOC_SIZE(cmd))
> ;
> + return ioctl_func(filp, cmd, arg);
>   } else if (nr == DRM_COMMAND_BASE +
> DRM_VMW_UPDATE_LAYOUT) {
>   if (!drm_is_current_master(file_priv) &&
>   !capable(CAP_SYS_ADMIN))
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> index 9be2176cc260..f5bfac85f793 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
> @@ -910,8 +910,8 @@ static inline struct page *vmw_piter_page(struct
> vmw_piter *viter)
>   * Command submission - vmwgfx_execbuf.c
>   */
>  
> -extern int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long
> data,
> -  struct drm_file *file_priv, size_t size);
> +extern int vmw_execbuf_ioctl(struct drm_device *dev, void *data,
> +  struct drm_file *file_priv);
>  extern int vmw_execbuf_process(struct drm_file *file_priv,
>  struct vmw_private *dev_priv,
>  void __user *user_commands,
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index 2ff7ba04d8c8..767e2b99618d 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -3977,54 +3977,40 @@ void vmw_execbuf_release_pinned_bo(struct
> vmw_private *dev_priv)
>   mutex_unlock(_priv->cmdbuf_mutex);
>  }
>  
> -int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long data,
> -   struct drm_file *file_priv, size_t size)
> +int vmw_execbuf_ioctl(struct drm_device *dev, void *data,
> +   struct drm_file *file_priv)
>  {
>   struct vmw_private *dev_priv = vmw_priv(dev);
> - struct drm_vmw_execbuf_arg arg;
> + struct drm_vmw_execbuf_arg *arg = data;
>   int ret;
> - static const size_t copy_offset[] = {
> - offsetof(struct drm_vmw_execbuf_arg, context_handle),
> - sizeof(stru

[git pull] vmwgfx-fixes-5.2

2019-05-22 Thread Thomas Hellstrom (VMware)
Dave, Daniel

A set of misc fixes for various issues that have surfaced recently.
All Cc'd stable except the dma iterator fix which shouldn't really cause
any real issues on older kernels.

The following changes since commit a188339ca5a396acc588e5851ed7e19f66b0ebd9:

  Linux 5.2-rc1 (2019-05-19 15:47:09 -0700)

are available in the Git repository at:

  git://people.freedesktop.org/~thomash/linux vmwgfx-fixes-5.2

for you to fetch changes up to 5ed7f4b5eca11c3c69e7c8b53e4321812bc1ee1e:

  drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an 
invalid read (2019-05-21 10:23:10 +0200)


Murray McAllister (2):
  drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()
  drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an 
invalid read

Thomas Hellstrom (4):
  drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set
  drm/vmwgfx: Fix user space handle equal to zero
  drm/vmwgfx: Fix compat mode shader operation
  drm/vmwgfx: Use the dma scatter-gather iterator to get dma addresses

 drivers/gpu/drm/vmwgfx/ttm_object.c|  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c|  8 +++-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h|  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c| 20 +++-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 27 +++
 5 files changed, 35 insertions(+), 24 deletions(-)

[PATCH 4/6] drm/vmwgfx: Use the dma scatter-gather iterator to get dma addresses

2019-05-21 Thread Thomas Hellstrom
Use struct sg_dma_page_iter in favour of struct sg_page_iter, which fairly
recently was declared useless for obtaining dma addresses.

With a struct sg_dma_page_iter we can't call sg_page_iter_page(), so
when the page is needed, use the same page lookup mechanism as for the
non-sg dma modes instead of going through the sg iterator.
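
For reference (not part of the patch), the DMA-only iteration pattern
struct sg_dma_page_iter is meant for: DMA addresses only, never
struct page through this iterator. The helper name below is illustrative:

#include <linux/scatterlist.h>

static void program_dma_addrs(struct sg_table *sgt)
{
        struct sg_dma_page_iter dma_iter;

        /* Walk the mapped (dma) side of the table only. */
        for_each_sg_dma_page(sgt->sgl, &dma_iter, sgt->nents, 0) {
                dma_addr_t addr = sg_page_iter_dma_address(&dma_iter);

                /* hand 'addr' to the device, e.g. write it into a GPU
                 * page table */
                (void)addr;
        }
}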

Note, the fixes tag doesn't really point to a commit introducing a
failure / regression, but rather to a commit that implemented a simple
workaround for this problem.

Cc: Jason Gunthorpe 
Fixes: d901b2760dc6 ("lib/scatterlist: Provide a DMA page iterator")
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Jason Gunthorpe 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h|  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 27 ++
 2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 96983c47fb40..366dcfc1f9bb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -296,7 +296,7 @@ struct vmw_sg_table {
 struct vmw_piter {
struct page **pages;
const dma_addr_t *addrs;
-   struct sg_page_iter iter;
+   struct sg_dma_page_iter iter;
unsigned long i;
unsigned long num_pages;
bool (*next)(struct vmw_piter *);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
index a3357ff7540d..a6ea75b58a83 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
@@ -266,7 +266,9 @@ static bool __vmw_piter_non_sg_next(struct vmw_piter *viter)
 
 static bool __vmw_piter_sg_next(struct vmw_piter *viter)
 {
-   return __sg_page_iter_next(>iter);
+   bool ret = __vmw_piter_non_sg_next(viter);
+
+   return __sg_page_iter_dma_next(>iter) && ret;
 }
 
 
@@ -284,12 +286,6 @@ static struct page *__vmw_piter_non_sg_page(struct 
vmw_piter *viter)
return viter->pages[viter->i];
 }
 
-static struct page *__vmw_piter_sg_page(struct vmw_piter *viter)
-{
-   return sg_page_iter_page(>iter);
-}
-
-
 /**
  * Helper functions to return the DMA address of the current page.
  *
@@ -311,13 +307,7 @@ static dma_addr_t __vmw_piter_dma_addr(struct vmw_piter 
*viter)
 
 static dma_addr_t __vmw_piter_sg_addr(struct vmw_piter *viter)
 {
-   /*
-* FIXME: This driver wrongly mixes DMA and CPU SG list iteration and
-* needs revision. See
-* https://lore.kernel.org/lkml/20190104223531.ga1...@ziepe.ca/
-*/
-   return sg_page_iter_dma_address(
-   container_of(>iter, struct sg_dma_page_iter, base));
+   return sg_page_iter_dma_address(>iter);
 }
 
 
@@ -336,26 +326,23 @@ void vmw_piter_start(struct vmw_piter *viter, const 
struct vmw_sg_table *vsgt,
 {
viter->i = p_offset - 1;
viter->num_pages = vsgt->num_pages;
+   viter->page = &__vmw_piter_non_sg_page;
+   viter->pages = vsgt->pages;
switch (vsgt->mode) {
case vmw_dma_phys:
viter->next = &__vmw_piter_non_sg_next;
viter->dma_address = &__vmw_piter_phys_addr;
-   viter->page = &__vmw_piter_non_sg_page;
-   viter->pages = vsgt->pages;
break;
case vmw_dma_alloc_coherent:
viter->next = &__vmw_piter_non_sg_next;
viter->dma_address = &__vmw_piter_dma_addr;
-   viter->page = &__vmw_piter_non_sg_page;
viter->addrs = vsgt->addrs;
-   viter->pages = vsgt->pages;
break;
case vmw_dma_map_populate:
case vmw_dma_map_bind:
viter->next = &__vmw_piter_sg_next;
viter->dma_address = &__vmw_piter_sg_addr;
-   viter->page = &__vmw_piter_sg_page;
-   __sg_page_iter_start(>iter, vsgt->sgt->sgl,
+   __sg_page_iter_start(>iter.base, vsgt->sgt->sgl,
 vsgt->sgt->orig_nents, p_offset);
break;
default:
-- 
2.20.1


[PATCH 5/6] drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()

2019-05-21 Thread Thomas Hellstrom
From: Murray McAllister 

If SVGA_3D_CMD_DX_DEFINE_RENDERTARGET_VIEW is called with a surface
ID of SVGA3D_INVALID_ID, the srf struct will remain NULL after
vmw_cmd_res_check(), leading to a null pointer dereference in
vmw_view_add().

Cc: 
Fixes: d80efd5cb3de ("drm/vmwgfx: Initial DX support")
Signed-off-by: Murray McAllister 
Reviewed-by: Thomas Hellstrom 
Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index 315f9efce765..b4c7553d2814 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2427,6 +2427,10 @@ static int vmw_cmd_dx_view_define(struct vmw_private 
*dev_priv,
return -EINVAL;
 
cmd = container_of(header, typeof(*cmd), header);
+   if (unlikely(cmd->sid == SVGA3D_INVALID_ID)) {
+   VMW_DEBUG_USER("Invalid surface id.\n");
+   return -EINVAL;
+   }
ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface,
VMW_RES_DIRTY_NONE, user_surface_converter,
>sid, );
-- 
2.20.1


[PATCH 3/6] drm/vmwgfx: Fix compat mode shader operation

2019-05-21 Thread Thomas Hellstrom
In compat mode, we allowed host-backed user-space with guest-backed
kernel / device. In this mode, set shader commands were broken since
no relocations were emitted. Fix this.

Cc: 
Fixes: e8c66efbfe3a ("drm/vmwgfx: Make user resource lookups reference-free 
during validation")
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index 2ff7ba04d8c8..315f9efce765 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2010,6 +2010,11 @@ static int vmw_cmd_set_shader(struct vmw_private 
*dev_priv,
return 0;
 
if (cmd->body.shid != SVGA3D_INVALID_ID) {
+   /*
+* This is the compat shader path - Per device guest-backed
+* shaders, but user-space thinks it's per context host-
+* backed shaders.
+*/
res = vmw_shader_lookup(vmw_context_res_man(ctx),
cmd->body.shid, cmd->body.type);
if (!IS_ERR(res)) {
@@ -2017,6 +2022,14 @@ static int vmw_cmd_set_shader(struct vmw_private 
*dev_priv,
VMW_RES_DIRTY_NONE);
if (unlikely(ret != 0))
return ret;
+
+   ret = vmw_resource_relocation_add
+   (sw_context, res,
+vmw_ptr_diff(sw_context->buf_start,
+ >body.shid),
+vmw_res_rel_normal);
+   if (unlikely(ret != 0))
+   return ret;
}
}
 
-- 
2.20.1


[PATCH 6/6] drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read

2019-05-21 Thread Thomas Hellstrom
From: Murray McAllister 

If SVGA_3D_CMD_DX_SET_SHADER is called with a shader ID
of SVGA3D_INVALID_ID, and a shader type of
SVGA3D_SHADERTYPE_INVALID, the calculated binding.shader_slot
will be 4294967295, leading to an out-of-bounds read in vmw_binding_loc()
when the offset is calculated.
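
For reference, the arithmetic behind the 4294967295 value, assuming
SVGA3D_SHADERTYPE_INVALID == 0 and SVGA3D_SHADERTYPE_MIN == 1 (an
assumption about the svga3d header values); the slot computation is
roughly:

        /* shader_slot is unsigned; with body.type == SVGA3D_SHADERTYPE_INVALID
         * the subtraction wraps: 0u - 1u == 4294967295.
         */
        binding.shader_slot = cmd->body.type - SVGA3D_SHADERTYPE_MIN;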

Cc: 
Fixes: d80efd5cb3de ("drm/vmwgfx: Initial DX support")
Signed-off-by: Murray McAllister 
Reviewed-by: Thomas Hellstrom 
Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index b4c7553d2814..33533d126277 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2206,7 +2206,8 @@ static int vmw_cmd_dx_set_shader(struct vmw_private 
*dev_priv,
 
cmd = container_of(header, typeof(*cmd), header);
 
-   if (cmd->body.type >= SVGA3D_SHADERTYPE_DX10_MAX) {
+   if (cmd->body.type >= SVGA3D_SHADERTYPE_DX10_MAX ||
+   cmd->body.type < SVGA3D_SHADERTYPE_MIN) {
VMW_DEBUG_USER("Illegal shader type %u.\n",
   (unsigned int) cmd->body.type);
return -EINVAL;
-- 
2.20.1


[PATCH 2/6] drm/vmwgfx: Fix user space handle equal to zero

2019-05-21 Thread Thomas Hellstrom
User-space handles equal to zero are interpreted as uninitialized or
illegal by some drm systems (most notably kms). This means that a
dumb buffer or surface with a zero user-space handle can never be
used as a kms frame-buffer.

Cc: 
Fixes: c7eae62666ad ("drm/vmwgfx: Make the object handles idr-generated")
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
 drivers/gpu/drm/vmwgfx/ttm_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vmwgfx/ttm_object.c 
b/drivers/gpu/drm/vmwgfx/ttm_object.c
index 36990b80e790..16077785ad47 100644
--- a/drivers/gpu/drm/vmwgfx/ttm_object.c
+++ b/drivers/gpu/drm/vmwgfx/ttm_object.c
@@ -174,7 +174,7 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
kref_init(>refcount);
idr_preload(GFP_KERNEL);
spin_lock(>object_lock);
-   ret = idr_alloc(>idr, base, 0, 0, GFP_NOWAIT);
+   ret = idr_alloc(>idr, base, 1, 0, GFP_NOWAIT);
spin_unlock(>object_lock);
idr_preload_end();
if (ret < 0)
-- 
2.20.1


[PATCH 1/6] drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set

2019-05-21 Thread Thomas Hellstrom
This may confuse user-space clients like plymouth that open a drm
file descriptor as a result of a hotplug event and then generate a
new event...

Cc: 
Fixes: 5ea1734827bb ("drm/vmwgfx: Send a hotplug event at master_set")
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index bf6c3500d363..4ff11a0077e1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1239,7 +1239,13 @@ static int vmw_master_set(struct drm_device *dev,
}
 
dev_priv->active_master = vmaster;
-   drm_sysfs_hotplug_event(dev);
+
+   /*
+* Inform a new master that the layout may have changed while
+* it was gone.
+*/
+   if (!from_open)
+   drm_sysfs_hotplug_event(dev);
 
return 0;
 }
-- 
2.20.1


Re: [PATCH] drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read

2019-05-20 Thread Thomas Hellstrom
Thanks, Murray,

I'll include in the next vmwgfx-fixes pull request.

On Mon, 2019-05-20 at 21:57 +1200, Murray McAllister wrote:
> If SVGA_3D_CMD_DX_SET_SHADER is called with a shader ID
> of SVGA3D_INVALID_ID, and a shader type of
> SVGA3D_SHADERTYPE_INVALID, the calculated binding.shader_slot
> will be 4294967295, leading to an out-of-bounds read in
> vmw_binding_loc()
> when the offset is calculated.
> 
> Signed-off-by: Murray McAllister 
> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index 2ff7ba04d8c8..9aeb5448cfc1 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -2193,7 +2193,8 @@ static int vmw_cmd_dx_set_shader(struct
> vmw_private *dev_priv,
>  
>   cmd = container_of(header, typeof(*cmd), header);
>  
> - if (cmd->body.type >= SVGA3D_SHADERTYPE_DX10_MAX) {
> + if (cmd->body.type >= SVGA3D_SHADERTYPE_DX10_MAX ||
> + cmd->body.type < SVGA3D_SHADERTYPE_MIN) {
>   VMW_DEBUG_USER("Illegal shader type %u.\n",
>  (unsigned int) cmd->body.type);
>   return -EINVAL;

[PATCH v2] drm/ttm, drm/vmwgfx: Use a configuration option for the TTM dma page pool

2019-05-16 Thread Thomas Hellstrom
Drivers like vmwgfx may want to test whether the dma page pool is present
or not. Since it's activated by default by TTM if compiled-in, define a
hidden configuration option that the driver can test for.

Cc: Christian König 
Signed-off-by: Thomas Hellstrom 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/Kconfig  | 7 +++
 drivers/gpu/drm/ttm/Makefile | 4 ++--
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 3 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  | 3 +--
 include/drm/ttm/ttm_page_alloc.h | 2 +-
 5 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 2267e84d5cb4..be66027f7dbe 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -160,6 +160,13 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
 
+config DRM_TTM_DMA_PAGE_POOL
+   bool
+   depends on DRM_TTM && (SWIOTLB || INTEL_IOMMU)
+   default y
+   help
+ Choose this if you need the TTM dma page pool
+
 config DRM_GEM_CMA_HELPER
bool
depends on DRM
diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
index 01fc670ce7a2..caea2a099496 100644
--- a/drivers/gpu/drm/ttm/Makefile
+++ b/drivers/gpu/drm/ttm/Makefile
@@ -4,8 +4,8 @@
 
 ttm-y := ttm_memory.o ttm_tt.o ttm_bo.o \
ttm_bo_util.o ttm_bo_vm.o ttm_module.o \
-   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o \
-   ttm_page_alloc_dma.o
+   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o
 ttm-$(CONFIG_AGP) += ttm_agp_backend.o
+ttm-$(CONFIG_DRM_TTM_DMA_PAGE_POOL) += ttm_page_alloc_dma.o
 
 obj-$(CONFIG_DRM_TTM) += ttm.o
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index d594f7520b7b..98d100fd1599 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -33,7 +33,6 @@
  *   when freed).
  */
 
-#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
 #define pr_fmt(fmt) "[TTM] " fmt
 
 #include 
@@ -1234,5 +1233,3 @@ int ttm_dma_page_alloc_debugfs(struct seq_file *m, void 
*data)
return 0;
 }
 EXPORT_SYMBOL_GPL(ttm_dma_page_alloc_debugfs);
-
-#endif
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index d59c474be38e..850efa196d72 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -572,8 +572,7 @@ static int vmw_dma_select_mode(struct vmw_private *dev_priv)
else
dev_priv->map_mode = vmw_dma_map_populate;
 
-   /* No TTM coherent page pool? FIXME: Ask TTM instead! */
-if (!(IS_ENABLED(CONFIG_SWIOTLB) || IS_ENABLED(CONFIG_INTEL_IOMMU)) &&
+   if (!IS_ENABLED(CONFIG_DRM_TTM_DMA_PAGE_POOL) &&
(dev_priv->map_mode == vmw_dma_alloc_coherent))
return -EINVAL;
 
diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h
index 4d9b019d253c..a6b6ef5f9bf4 100644
--- a/include/drm/ttm/ttm_page_alloc.h
+++ b/include/drm/ttm/ttm_page_alloc.h
@@ -74,7 +74,7 @@ void ttm_unmap_and_unpopulate_pages(struct device *dev, 
struct ttm_dma_tt *tt);
  */
 int ttm_page_alloc_debugfs(struct seq_file *m, void *data);
 
-#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
+#if defined(CONFIG_DRM_TTM_DMA_PAGE_POOL)
 /**
  * Initialize pool allocator.
  */
-- 
2.20.1


Re: [PATCH] drm/ttm, drm/vmwgfx: Use a configuration option for the TTM dma page pool

2019-05-16 Thread Thomas Hellstrom
On Thu, 2019-05-16 at 12:05 +0200, Christian König wrote:
> Am 16.05.19 um 11:23 schrieb Thomas Hellstrom:
> > Drivers like vmwgfx may want to test whether the dma page pool is
> > present
> > or not. Since it's activated by default by TTM if compiled-in,
> > define a
> > hidden configuration option that the driver can test for.
> > 
> > Cc: Christian König 
> > Signed-off-by: Thomas Hellstrom 
> 
> There are at least also occasions of this in radeon and amdgpu, but 
> those can be cleaned up later on.
> 
> Reviewed-by: Christian König  for now.
> 
> Which tree should we use for merging?
> 
> Thanks,
> Christian.

We can take it through an AMD tree if it's OK with you. Then it would
be easier to add similar changes to the AMD drivers.

I'll send out v2 with some whitespace cleanup, a config help text and
R-b next.

Thanks,
Thomas





> 
> > ---
> >   drivers/gpu/drm/Kconfig  | 5 +
> >   drivers/gpu/drm/ttm/Makefile | 4 ++--
> >   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 3 ---
> >   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  | 3 +--
> >   include/drm/ttm/ttm_page_alloc.h | 2 +-
> >   5 files changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 2267e84d5cb4..f733a9273b3f 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -160,6 +160,11 @@ config DRM_TTM
> >   GPU memory types. Will be enabled automatically if a device
> > driver
> >   uses it.
> >   
> > +config DRM_TTM_DMA_PAGE_POOL
> > +bool
> > +   depends on DRM_TTM && (SWIOTLB || INTEL_IOMMU)
> > +   default y
> > +
> >   config DRM_GEM_CMA_HELPER
> > bool
> > depends on DRM
> > diff --git a/drivers/gpu/drm/ttm/Makefile
> > b/drivers/gpu/drm/ttm/Makefile
> > index 01fc670ce7a2..caea2a099496 100644
> > --- a/drivers/gpu/drm/ttm/Makefile
> > +++ b/drivers/gpu/drm/ttm/Makefile
> > @@ -4,8 +4,8 @@
> >   
> >   ttm-y := ttm_memory.o ttm_tt.o ttm_bo.o \
> > ttm_bo_util.o ttm_bo_vm.o ttm_module.o \
> > -   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o \
> > -   ttm_page_alloc_dma.o
> > +   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o
> >   ttm-$(CONFIG_AGP) += ttm_agp_backend.o
> > +ttm-$(CONFIG_DRM_TTM_DMA_PAGE_POOL) += ttm_page_alloc_dma.o
> >   
> >   obj-$(CONFIG_DRM_TTM) += ttm.o
> > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > index d594f7520b7b..98d100fd1599 100644
> > --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> > @@ -33,7 +33,6 @@
> >*   when freed).
> >*/
> >   
> > -#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
> >   #define pr_fmt(fmt) "[TTM] " fmt
> >   
> >   #include 
> > @@ -1234,5 +1233,3 @@ int ttm_dma_page_alloc_debugfs(struct
> > seq_file *m, void *data)
> > return 0;
> >   }
> >   EXPORT_SYMBOL_GPL(ttm_dma_page_alloc_debugfs);
> > -
> > -#endif
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > index d59c474be38e..bc259d4df1cb 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
> > @@ -572,8 +572,7 @@ static int vmw_dma_select_mode(struct
> > vmw_private *dev_priv)
> > else
> > dev_priv->map_mode = vmw_dma_map_populate;
> >   
> > -   /* No TTM coherent page pool? FIXME: Ask TTM instead! */
> > -if (!(IS_ENABLED(CONFIG_SWIOTLB) ||
> > IS_ENABLED(CONFIG_INTEL_IOMMU)) &&
> > +if (!IS_ENABLED(CONFIG_DRM_TTM_DMA_PAGE_POOL) &&
> > (dev_priv->map_mode == vmw_dma_alloc_coherent))
> > return -EINVAL;
> >   
> > diff --git a/include/drm/ttm/ttm_page_alloc.h
> > b/include/drm/ttm/ttm_page_alloc.h
> > index 4d9b019d253c..a6b6ef5f9bf4 100644
> > --- a/include/drm/ttm/ttm_page_alloc.h
> > +++ b/include/drm/ttm/ttm_page_alloc.h
> > @@ -74,7 +74,7 @@ void ttm_unmap_and_unpopulate_pages(struct device
> > *dev, struct ttm_dma_tt *tt);
> >*/
> >   int ttm_page_alloc_debugfs(struct seq_file *m, void *data);
> >   
> > -#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
> > +#if defined(CONFIG_DRM_TTM_DMA_PAGE_POOL)
> >   /**
> >* Initialize pool allocator.
> >*/

[PATCH] drm/ttm, drm/vmwgfx: Use a configuration option for the TTM dma page pool

2019-05-16 Thread Thomas Hellstrom
Drivers like vmwgfx may want to test whether the dma page pool is present
or not. Since it's activated by default by TTM if compiled-in, define a
hidden configuration option that the driver can test for.

Cc: Christian König 
Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/Kconfig  | 5 +
 drivers/gpu/drm/ttm/Makefile | 4 ++--
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 3 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  | 3 +--
 include/drm/ttm/ttm_page_alloc.h | 2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 2267e84d5cb4..f733a9273b3f 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -160,6 +160,11 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
 
+config DRM_TTM_DMA_PAGE_POOL
+bool
+   depends on DRM_TTM && (SWIOTLB || INTEL_IOMMU)
+   default y
+
 config DRM_GEM_CMA_HELPER
bool
depends on DRM
diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
index 01fc670ce7a2..caea2a099496 100644
--- a/drivers/gpu/drm/ttm/Makefile
+++ b/drivers/gpu/drm/ttm/Makefile
@@ -4,8 +4,8 @@
 
 ttm-y := ttm_memory.o ttm_tt.o ttm_bo.o \
ttm_bo_util.o ttm_bo_vm.o ttm_module.o \
-   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o \
-   ttm_page_alloc_dma.o
+   ttm_execbuf_util.o ttm_page_alloc.o ttm_bo_manager.o
 ttm-$(CONFIG_AGP) += ttm_agp_backend.o
+ttm-$(CONFIG_DRM_TTM_DMA_PAGE_POOL) += ttm_page_alloc_dma.o
 
 obj-$(CONFIG_DRM_TTM) += ttm.o
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index d594f7520b7b..98d100fd1599 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -33,7 +33,6 @@
  *   when freed).
  */
 
-#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
 #define pr_fmt(fmt) "[TTM] " fmt
 
 #include 
@@ -1234,5 +1233,3 @@ int ttm_dma_page_alloc_debugfs(struct seq_file *m, void 
*data)
return 0;
 }
 EXPORT_SYMBOL_GPL(ttm_dma_page_alloc_debugfs);
-
-#endif
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index d59c474be38e..bc259d4df1cb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -572,8 +572,7 @@ static int vmw_dma_select_mode(struct vmw_private *dev_priv)
else
dev_priv->map_mode = vmw_dma_map_populate;
 
-   /* No TTM coherent page pool? FIXME: Ask TTM instead! */
-if (!(IS_ENABLED(CONFIG_SWIOTLB) || IS_ENABLED(CONFIG_INTEL_IOMMU)) &&
+if (!IS_ENABLED(CONFIG_DRM_TTM_DMA_PAGE_POOL) &&
(dev_priv->map_mode == vmw_dma_alloc_coherent))
return -EINVAL;
 
diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h
index 4d9b019d253c..a6b6ef5f9bf4 100644
--- a/include/drm/ttm/ttm_page_alloc.h
+++ b/include/drm/ttm/ttm_page_alloc.h
@@ -74,7 +74,7 @@ void ttm_unmap_and_unpopulate_pages(struct device *dev, 
struct ttm_dma_tt *tt);
  */
 int ttm_page_alloc_debugfs(struct seq_file *m, void *data);
 
-#if defined(CONFIG_SWIOTLB) || defined(CONFIG_INTEL_IOMMU)
+#if defined(CONFIG_DRM_TTM_DMA_PAGE_POOL)
 /**
  * Initialize pool allocator.
  */
-- 
2.20.1


Re: [PATCH] drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()

2019-05-13 Thread Thomas Hellstrom
On Sat, 2019-05-11 at 18:01 +1200, Murray McAllister wrote:
> If SVGA_3D_CMD_DX_DEFINE_RENDERTARGET_VIEW is called with a surface
> ID of SVGA3D_INVALID_ID, the srf struct will remain NULL after
> vmw_cmd_res_check(), leading to a null pointer dereference in
> vmw_view_add().
> 
> Signed-off-by: Murray McAllister 

Thanks, I'll add this to the next -fixes pull.
Thomas


> ---
>  drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> index 2ff7ba04d8c8..447afd086206 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
> @@ -2414,6 +2414,10 @@ static int vmw_cmd_dx_view_define(struct
> vmw_private *dev_priv,
>   return -EINVAL;
>  
>   cmd = container_of(header, typeof(*cmd), header);
> + if (unlikely(cmd->sid == SVGA3D_INVALID_ID)) {
> + DRM_ERROR("Invalid surface id.\n");
> + return -EINVAL;
> + }
>   ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface,
>   VMW_RES_DIRTY_NONE,
> user_surface_converter,
>   >sid, );

Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6

2019-05-08 Thread Thomas Hellstrom

On 5/7/19 1:42 PM, Koenig, Christian wrote:

Am 07.05.19 um 13:37 schrieb Thomas Hellstrom:


On 5/7/19 1:24 PM, Christian König wrote:

Am 07.05.19 um 13:22 schrieb zhoucm1:


On 2019年05月07日 19:13, Koenig, Christian wrote:

Am 07.05.19 um 13:08 schrieb zhoucm1:

On 2019年05月07日 18:53, Koenig, Christian wrote:

Am 07.05.19 um 11:36 schrieb Chunming Zhou:

A heavy gpu job could occupy memory for a long time, which leads other
users to fail to get memory.

basically pick up Christian idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO
we need memory for (or rather the ww_mutex of its reservation
object) has a ticket assigned.
3. If we have a ticket we grab a reference to the first BO on the
LRU, drop the LRU lock and try to grab the reservation lock with
the
ticket.
4. If getting the reservation lock with the ticket succeeded we
check if the BO is still the first one on the LRU in question (the
BO could have moved).
5. If the BO is still the first one on the LRU in question we
try to
evict it as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return
-EBUSY.

v2: fix some minor check
v3: address Christian v2 comments.
v4: fix some missing
v5: handle first_bo unlock and bo_get/put
v6: abstract unified iterate function, and handle all possible
usecase not only pinned bo.

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou 
---
     drivers/gpu/drm/ttm/ttm_bo.c | 113
++-
     1 file changed, 97 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..bbf1d14d00a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
     static bool ttm_bo_evict_swapout_allowable(struct
ttm_buffer_object *bo,
-    struct ttm_operation_ctx *ctx, bool *locked)
+    struct ttm_operation_ctx *ctx, bool *locked, bool
*busy)
     {
     bool ret = false;
    *locked = false;
+    if (busy)
+    *busy = false;
     if (bo->resv == ctx->resv) {
reservation_object_assert_held(bo->resv);
     if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,45 @@ static bool
ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
     } else {
     *locked = reservation_object_trylock(bo->resv);
     ret = *locked;
+    if (!ret && busy)
+    *busy = true;
     }
    return ret;
     }
     -static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-   uint32_t mem_type,
-   const struct ttm_place *place,
-   struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+ struct ttm_mem_type_manager *man,
+ const struct ttm_place *place,
+ struct ttm_operation_ctx *ctx,
+ struct ttm_buffer_object **first_bo,
+ bool *locked)
     {
-    struct ttm_bo_global *glob = bdev->glob;
-    struct ttm_mem_type_manager *man = >man[mem_type];
     struct ttm_buffer_object *bo = NULL;
-    bool locked = false;
-    unsigned i;
-    int ret;
+    int i;
     -    spin_lock(>lru_lock);
+    if (first_bo)
+    *first_bo = NULL;
     for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
     list_for_each_entry(bo, >lru[i], lru) {
-    if (!ttm_bo_evict_swapout_allowable(bo, ctx, ))
+    bool busy = false;
+    if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+    )) {

A newline between declaration and code please.


+    if (first_bo && !(*first_bo) && busy) {
+    ttm_bo_get(bo);
+    *first_bo = bo;
+    }
     continue;
+    }
    if (place &&
!bdev->driver->eviction_valuable(bo,
   place)) {
-    if (locked)
+    if (*locked)
reservation_object_unlock(bo->resv);
     continue;
     }
+
     break;
     }
     @@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct
ttm_bo_device *bdev,
     bo = NULL;
     }
     +    return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+   uint32_t mem_type,
+   const struct ttm_place *place,
+   struct ttm_operation_ctx *ctx)
+{
+    struct ttm_bo_global *glob = bdev->glob;
+    struct ttm_mem_type_manager *man = >man[mem_type];
+    struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
+    bool locked = false;
+    int ret;

Re: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v6

2019-05-07 Thread Thomas Hellstrom

On 5/7/19 1:24 PM, Christian König wrote:

Am 07.05.19 um 13:22 schrieb zhoucm1:



On 2019年05月07日 19:13, Koenig, Christian wrote:

Am 07.05.19 um 13:08 schrieb zhoucm1:


On 2019年05月07日 18:53, Koenig, Christian wrote:

Am 07.05.19 um 11:36 schrieb Chunming Zhou:

A heavy gpu job could occupy memory for a long time, which leads other
users to fail to get memory.

basically pick up Christian idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM check if the BO
we need memory for (or rather the ww_mutex of its reservation
object) has a ticket assigned.
3. If we have a ticket we grab a reference to the first BO on the
LRU, drop the LRU lock and try to grab the reservation lock with the
ticket.
4. If getting the reservation lock with the ticket succeeded we
check if the BO is still the first one on the LRU in question (the
BO could have moved).
5. If the BO is still the first one on the LRU in question we try to
evict it as we would evict any other BO.
6. If any of the "If's" above fail we just back off and return 
-EBUSY.


v2: fix some minor check
v3: address Christian v2 comments.
v4: fix some missing
v5: handle first_bo unlock and bo_get/put
v6: abstract unified iterate function, and handle all possible
usecase not only pinned bo.

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou 
---
    drivers/gpu/drm/ttm/ttm_bo.c | 113
++-
    1 file changed, 97 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..bbf1d14d00a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
 * b. Otherwise, trylock it.
 */
    static bool ttm_bo_evict_swapout_allowable(struct
ttm_buffer_object *bo,
-    struct ttm_operation_ctx *ctx, bool *locked)
+    struct ttm_operation_ctx *ctx, bool *locked, bool 
*busy)

    {
    bool ret = false;
       *locked = false;
+    if (busy)
+    *busy = false;
    if (bo->resv == ctx->resv) {
    reservation_object_assert_held(bo->resv);
    if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,45 @@ static bool
ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
    } else {
    *locked = reservation_object_trylock(bo->resv);
    ret = *locked;
+    if (!ret && busy)
+    *busy = true;
    }
       return ret;
    }
    -static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-   uint32_t mem_type,
-   const struct ttm_place *place,
-   struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+ struct ttm_mem_type_manager *man,
+ const struct ttm_place *place,
+ struct ttm_operation_ctx *ctx,
+ struct ttm_buffer_object **first_bo,
+ bool *locked)
    {
-    struct ttm_bo_global *glob = bdev->glob;
-    struct ttm_mem_type_manager *man = >man[mem_type];
    struct ttm_buffer_object *bo = NULL;
-    bool locked = false;
-    unsigned i;
-    int ret;
+    int i;
    -    spin_lock(>lru_lock);
+    if (first_bo)
+    *first_bo = NULL;
    for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
    list_for_each_entry(bo, >lru[i], lru) {
-    if (!ttm_bo_evict_swapout_allowable(bo, ctx, ))
+    bool busy = false;
+    if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+    )) {

A newline between declaration and code please.


+    if (first_bo && !(*first_bo) && busy) {
+    ttm_bo_get(bo);
+    *first_bo = bo;
+    }
    continue;
+    }
       if (place && !bdev->driver->eviction_valuable(bo,
  place)) {
-    if (locked)
+    if (*locked)
reservation_object_unlock(bo->resv);
    continue;
    }
+
    break;
    }
    @@ -818,9 +830,66 @@ static int ttm_mem_evict_first(struct
ttm_bo_device *bdev,
    bo = NULL;
    }
    +    return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+   uint32_t mem_type,
+   const struct ttm_place *place,
+   struct ttm_operation_ctx *ctx)
+{
+    struct ttm_bo_global *glob = bdev->glob;
+    struct ttm_mem_type_manager *man = >man[mem_type];
+    struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
+    bool locked = false;
+    int ret;
+
+    spin_lock(>lru_lock);
+    bo = ttm_mem_find_evitable_bo(bdev, man, place, ctx, _bo,
+  );
    if (!bo) {
+    struct ttm_operation_ctx busy_ctx;
+
    spin_unlock(>lru_lock);
-    return 

[PATCH 3/9] mm: Add write-protect and clean utilities for address space ranges v3

2019-04-27 Thread Thomas Hellstrom
Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.
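
A hypothetical driver-side usage sketch (the function, the start / end
initialization and the bitmap indexing are assumptions, not part of this
patch; see the kerneldoc in as_dirty_helpers.c for the exact semantics):

#include <linux/mm.h>

static void track_dirty_pages(struct address_space *mapping,
                              pgoff_t first, pgoff_t nr,
                              unsigned long *bitmap)
{
        pgoff_t start = nr, end = 0;

        /* a) Write-protect the range so the next CPU write faults via
         *    page_mkwrite() / pfn_mkwrite().
         */
        apply_as_wrprotect(mapping, first, nr);

        /* ... CPU writes fault and are noted by the driver ... */

        /* b) Clean the ptes using the hardware dirty bits, collecting
         *    them into 'bitmap' (bit 0 == page 'first'); start/end are
         *    updated to bracket the dirty span.
         */
        apply_as_clean(mapping, first, nr, first, bitmap, &start, &end);
}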

The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.

Notable changes since RFC:
- Added comments to help avoid the usage of these function for VMAs
  it's not intended for. We also do advisory checks on the vm_flags and
  warn on illegal usage.
- Perform the pte modifications the same way softdirty does.
- Add mmu_notifier range invalidation calls.
- Add a config option so that this code is not unconditionally included.
- Tell the mmu_gather code about pending tlb flushes.

Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Minchan Kim 
Cc: Michal Hocko 
Cc: Huang Ying 
Cc: Souptick Joarder 
Cc: "Jérôme Glisse" 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Ralph Campbell  #v1
---
v2: Fix formatting and typos.
Change file-name of the added file, and don't compile it unless
configured to do so.
v3: Adapt to new arguments to ptep_modify_prot_[start|commit]
---
 MAINTAINERS   |   1 +
 include/linux/mm.h|   9 +-
 mm/Kconfig|   3 +
 mm/Makefile   |   1 +
 mm/as_dirty_helpers.c | 298 ++
 5 files changed, 311 insertions(+), 1 deletion(-)
 create mode 100644 mm/as_dirty_helpers.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e233b3c48546..dd647a68580f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5100,6 +5100,7 @@ T:git git://people.freedesktop.org/~thomash/linux
 S: Supported
 F: drivers/gpu/drm/vmwgfx/
 F: include/uapi/drm/vmwgfx_drm.h
+F: mm/as_dirty_helpers.c
 
 DRM DRIVERS
 M: David Airlie 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 34338ee70317..e446af9732f6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2610,7 +2610,14 @@ struct pfn_range_apply {
 };
 extern int apply_to_pfn_range(struct pfn_range_apply *closure,
  unsigned long address, unsigned long size);
-
+unsigned long apply_as_wrprotect(struct address_space *mapping,
+pgoff_t first_index, pgoff_t nr);
+unsigned long apply_as_clean(struct address_space *mapping,
+pgoff_t first_index, pgoff_t nr,
+pgoff_t bitmap_pgoff,
+unsigned long *bitmap,
+pgoff_t *start,
+pgoff_t *end);
 #ifdef CONFIG_PAGE_POISONING
 extern bool page_poisoning_enabled(void);
 extern void kernel_poison_pages(struct page *page, int numpages, int enable);
diff --git a/mm/Kconfig b/mm/Kconfig
index 25c71eb8a7db..80e41cdbb4ae 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -758,4 +758,7 @@ config GUP_BENCHMARK
 config ARCH_HAS_PTE_SPECIAL
bool
 
+config AS_DIRTY_HELPERS
+bool
+
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index d210cc9d6f80..4bf396ba3a00 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -99,3 +99,4 @@ obj-$(CONFIG_HARDENED_USERCOPY) += usercopy.o
 obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o
 obj-$(CONFIG_HMM) += hmm.o
 obj-$(CONFIG_MEMFD_CREATE) += memfd.o
+obj-$(CONFIG_AS_DIRTY_HELPERS) += as_dirty_helpers.o
diff --git a/mm/as_dirty_helpers.c b/mm/as_dirty_helpers.c
new file mode 100644
index ..88a1ac0d5da9
--- /dev/null
+++ b/mm/as_dirty_helpers.c
@@ -0,0 +1,298 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**
+ * struct apply_as - Closure structure for apply_as_range
+ * @base: struct pfn_range_apply we derive from
+ * @start: Address of first modified pte
+ * @end: Address of last modified pte + 1
+ * @total: Total number of modified ptes
+ * @vma: Pointer to the struct vm_area_struct we're currently operating on
+ */
+struct apply_as {
+   struct pfn_range_apply base;
+   unsigned long start;
+   unsigned long end;
+   unsigned long total;
+   struct vm_area_struct *vma;
+};
+
+/**
+ * apply_pt_wrprotect - Leaf pte callback to write-protect a pte
+ * @pte: Pointer to the pte
+ * @token: Page table token, see apply_to_pfn_range()
+ * @addr: The virtual page address
+ * @closure: Pointer to a struct pfn_range_apply embedded in a
+ * struct apply_as
+ *
+ * The function 

Re: [PATCH] Revert "drm/qxl: drop prime import/export callbacks"

2019-04-26 Thread Thomas Hellstrom

On 4/26/19 4:21 PM, Daniel Vetter wrote:

On Fri, Apr 26, 2019 at 7:33 AM Gerd Hoffmann  wrote:

This reverts commit f4c34b1e2a37d5676180901fa6ff188bcb6371f8.

Similar to commit a0cecc23cfcb Revert "drm/virtio: drop prime
import/export callbacks".  We have to do the same with qxl,
for the same reasons (it breaks DRI3).

Drop the WARN_ON_ONCE().

Fixes: f4c34b1e2a37d5676180901fa6ff188bcb6371f8
Signed-off-by: Gerd Hoffmann 

Maybe we need some helpers for virtual drivers which only allow
self-reimport and nothing else at all? I think there's qxl, virgl,
vmwgfx and maybe also vbox one who could use this ... Just a quick
idea.
-Daniel


I think vmwgfx could, in theory, support the full range of operations,
at least for reasonably recent device versions. However, it wouldn't be 
terribly efficient since the exported dma-buf sglist would basically be 
a bounce-buffer.


/Thomas



---
  drivers/gpu/drm/qxl/qxl_drv.c   |  4 
  drivers/gpu/drm/qxl/qxl_prime.c | 12 
  2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 578d867a81d5..f33e349c4ec5 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -255,10 +255,14 @@ static struct drm_driver qxl_driver = {
  #if defined(CONFIG_DEBUG_FS)
 .debugfs_init = qxl_debugfs_init,
  #endif
+   .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
+   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 .gem_prime_export = drm_gem_prime_export,
 .gem_prime_import = drm_gem_prime_import,
 .gem_prime_pin = qxl_gem_prime_pin,
 .gem_prime_unpin = qxl_gem_prime_unpin,
+   .gem_prime_get_sg_table = qxl_gem_prime_get_sg_table,
+   .gem_prime_import_sg_table = qxl_gem_prime_import_sg_table,
 .gem_prime_vmap = qxl_gem_prime_vmap,
 .gem_prime_vunmap = qxl_gem_prime_vunmap,
 .gem_prime_mmap = qxl_gem_prime_mmap,
diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
index 8b448eca1cd9..114653b471c6 100644
--- a/drivers/gpu/drm/qxl/qxl_prime.c
+++ b/drivers/gpu/drm/qxl/qxl_prime.c
@@ -42,6 +42,18 @@ void qxl_gem_prime_unpin(struct drm_gem_object *obj)
 qxl_bo_unpin(bo);
  }

+struct sg_table *qxl_gem_prime_get_sg_table(struct drm_gem_object *obj)
+{
+   return ERR_PTR(-ENOSYS);
+}
+
+struct drm_gem_object *qxl_gem_prime_import_sg_table(
+   struct drm_device *dev, struct dma_buf_attachment *attach,
+   struct sg_table *table)
+{
+   return ERR_PTR(-ENOSYS);
+}
+
  void *qxl_gem_prime_vmap(struct drm_gem_object *obj)
  {
 struct qxl_bo *bo = gem_to_qxl_bo(obj);
--
2.18.1






[git pull] vmwgfx-fixes-5.1

2019-04-25 Thread Thomas Hellstrom
Dave, Daniel

A single fix for a layer violation requested by Christoph.

The following changes since commit c2d311553855395764e2e5bf401d987ba65c2056:

  drm/vmwgfx: Don't double-free the mode stored in par->set_mode (2019-03-20 
07:57:01 +0100)

are available in the Git repository at:

  git://people.freedesktop.org/~thomash/linux vmwgfx-fixes-5.1

for you to fetch changes up to 81103355b1e23345dbcdeccad59962a424da4a34:

  drm/vmwgfx: Fix dma API layer violation (2019-04-25 09:05:03 +0200)

----
Thomas Hellstrom (1):
  drm/vmwgfx: Fix dma API layer violation

 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 33 +
 1 file changed, 5 insertions(+), 28 deletions(-)

Re: [PATCH 4/9] drm/ttm: Allow the driver to provide the ttm struct vm_operations_struct

2019-04-25 Thread Thomas Hellstrom
Hi, Christian,

On Wed, 2019-04-24 at 16:20 +0200, Thomas Hellström wrote:
> On Wed, 2019-04-24 at 14:10 +, Koenig, Christian wrote:
> > Am 24.04.19 um 14:00 schrieb Thomas Hellstrom:
> > > Add a pointer to the struct vm_operations_struct in the
> > > bo_device,
> > > and
> > > assign that pointer to the default value currently used.
> > > 
> > > The driver can then optionally modify that pointer and the new
> > > value
> > > can be used for each new vma created.
> > > 
> > > Cc: "Christian König" 
> > > 
> > > Signed-off-by: Thomas Hellstrom 
> > > Reviewed-by: Christian König 
> > 
> > Going to pick those two TTM patches up for amd-staging-drm-next.
> 
> Will you be relying on either patch for related work? Otherwise it
> would be simpler for us to use vmwgfx-next for the whole series,
> targeting 5.3.
> 
> Thomas

Is this OK with you?

Thanks,
Thomas



> 
> > Christian.
> > 
> > > ---
> > >   drivers/gpu/drm/ttm/ttm_bo.c| 1 +
> > >   drivers/gpu/drm/ttm/ttm_bo_vm.c | 6 +++---
> > >   include/drm/ttm/ttm_bo_driver.h | 6 ++
> > >   3 files changed, 10 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > b/drivers/gpu/drm/ttm/ttm_bo.c
> > > index 3f56647cdb35..1c85bec00472 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > @@ -1656,6 +1656,7 @@ int ttm_bo_device_init(struct ttm_bo_device
> > > *bdev,
> > >   mutex_lock(&ttm_global_mutex);
> > >   list_add_tail(&bdev->device_list, &glob->device_list);
> > >   mutex_unlock(&ttm_global_mutex);
> > > + bdev->vm_ops = &ttm_bo_vm_ops;
> > >   
> > >   return 0;
> > >   out_no_sys:
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > index e86a29a1e51f..bfb25b81fed7 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > @@ -395,7 +395,7 @@ static int ttm_bo_vm_access(struct
> > > vm_area_struct *vma, unsigned long addr,
> > >   return ret;
> > >   }
> > >   
> > > -static const struct vm_operations_struct ttm_bo_vm_ops = {
> > > +const struct vm_operations_struct ttm_bo_vm_ops = {
> > >   .fault = ttm_bo_vm_fault,
> > >   .open = ttm_bo_vm_open,
> > >   .close = ttm_bo_vm_close,
> > > @@ -445,7 +445,7 @@ int ttm_bo_mmap(struct file *filp, struct
> > > vm_area_struct *vma,
> > >   if (unlikely(ret != 0))
> > >   goto out_unref;
> > >   
> > > - vma->vm_ops = &ttm_bo_vm_ops;
> > > + vma->vm_ops = bdev->vm_ops;
> > >   
> > >   /*
> > >* Note: We're transferring the bo reference to
> > > @@ -477,7 +477,7 @@ int ttm_fbdev_mmap(struct vm_area_struct
> > > *vma,
> > > struct ttm_buffer_object *bo)
> > >   
> > >   ttm_bo_get(bo);
> > >   
> > > - vma->vm_ops = &ttm_bo_vm_ops;
> > > + vma->vm_ops = bo->bdev->vm_ops;
> > >   vma->vm_private_data = bo;
> > >   vma->vm_flags |= VM_MIXEDMAP;
> > >   vma->vm_flags |= VM_IO | VM_DONTEXPAND;
> > > diff --git a/include/drm/ttm/ttm_bo_driver.h
> > > b/include/drm/ttm/ttm_bo_driver.h
> > > index cbf3180cb612..cfeaff5d9706 100644
> > > --- a/include/drm/ttm/ttm_bo_driver.h
> > > +++ b/include/drm/ttm/ttm_bo_driver.h
> > > @@ -443,6 +443,9 @@ extern struct ttm_bo_global {
> > >* @driver: Pointer to a struct ttm_bo_driver struct setup by
> > > the
> > > driver.
> > >* @man: An array of mem_type_managers.
> > >* @vma_manager: Address space manager
> > > + * @vm_ops: Pointer to the struct vm_operations_struct used for
> > > this
> > > + * device's VM operations. The driver may override this before
> > > the
> > > first
> > > + * mmap() call.
> > >* lru_lock: Spinlock that protects the buffer+device lru lists
> > > and
> > >* ddestroy lists.
> > >* @dev_mapping: A pointer to the struct address_space
> > > representing the
> > > @@ -461,6 +464,7 @@ struct ttm_bo_device {
> > >   struct ttm_bo_global *glob;
> > >   struct ttm_bo_driver *driver;
> > >   struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES];
> > > + const struct vm_operations_struct *vm_ops;
> > >   
> > >   /*
> > >* Protected by internal locks.
> > > @@ -489,6 +493,8 @@ struct ttm_bo_device {
> > >   bool no_retry;
> > >   };
> > >   
> > > +extern const struct vm_operations_struct ttm_bo_vm_ops;
> > > +
> > >   /**
> > >* struct ttm_lru_bulk_move_pos
> > >*

Re: [PATCH 4/9] drm/ttm: Allow the driver to provide the ttm struct vm_operations_struct

2019-04-24 Thread Thomas Hellstrom
On Wed, 2019-04-24 at 14:10 +, Koenig, Christian wrote:
> Am 24.04.19 um 14:00 schrieb Thomas Hellstrom:
> > Add a pointer to the struct vm_operations_struct in the bo_device,
> > and
> > assign that pointer to the default value currently used.
> > 
> > The driver can then optionally modify that pointer and the new
> > value
> > can be used for each new vma created.
> > 
> > Cc: "Christian König" 
> > 
> > Signed-off-by: Thomas Hellstrom 
> > Reviewed-by: Christian König 
> 
> Going to pick those two TTM patches up for amd-staging-drm-next.

Will you be relying on either patch for related work? Otherwise it
would be simpler for us to use vmwgfx-next for the whole series,
targeting 5.3.

Thomas

> 
> Christian.
> 
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c| 1 +
> >   drivers/gpu/drm/ttm/ttm_bo_vm.c | 6 +++---
> >   include/drm/ttm/ttm_bo_driver.h | 6 ++
> >   3 files changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > b/drivers/gpu/drm/ttm/ttm_bo.c
> > index 3f56647cdb35..1c85bec00472 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -1656,6 +1656,7 @@ int ttm_bo_device_init(struct ttm_bo_device
> > *bdev,
> > mutex_lock(&ttm_global_mutex);
> > list_add_tail(&bdev->device_list, &glob->device_list);
> > mutex_unlock(&ttm_global_mutex);
> > +   bdev->vm_ops = &ttm_bo_vm_ops;
> >   
> > return 0;
> >   out_no_sys:
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > index e86a29a1e51f..bfb25b81fed7 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > @@ -395,7 +395,7 @@ static int ttm_bo_vm_access(struct
> > vm_area_struct *vma, unsigned long addr,
> > return ret;
> >   }
> >   
> > -static const struct vm_operations_struct ttm_bo_vm_ops = {
> > +const struct vm_operations_struct ttm_bo_vm_ops = {
> > .fault = ttm_bo_vm_fault,
> > .open = ttm_bo_vm_open,
> > .close = ttm_bo_vm_close,
> > @@ -445,7 +445,7 @@ int ttm_bo_mmap(struct file *filp, struct
> > vm_area_struct *vma,
> > if (unlikely(ret != 0))
> > goto out_unref;
> >   
> > -   vma->vm_ops = &ttm_bo_vm_ops;
> > +   vma->vm_ops = bdev->vm_ops;
> >   
> > /*
> >  * Note: We're transferring the bo reference to
> > @@ -477,7 +477,7 @@ int ttm_fbdev_mmap(struct vm_area_struct *vma,
> > struct ttm_buffer_object *bo)
> >   
> > ttm_bo_get(bo);
> >   
> > -   vma->vm_ops = &ttm_bo_vm_ops;
> > +   vma->vm_ops = bo->bdev->vm_ops;
> > vma->vm_private_data = bo;
> > vma->vm_flags |= VM_MIXEDMAP;
> > vma->vm_flags |= VM_IO | VM_DONTEXPAND;
> > diff --git a/include/drm/ttm/ttm_bo_driver.h
> > b/include/drm/ttm/ttm_bo_driver.h
> > index cbf3180cb612..cfeaff5d9706 100644
> > --- a/include/drm/ttm/ttm_bo_driver.h
> > +++ b/include/drm/ttm/ttm_bo_driver.h
> > @@ -443,6 +443,9 @@ extern struct ttm_bo_global {
> >* @driver: Pointer to a struct ttm_bo_driver struct setup by the
> > driver.
> >* @man: An array of mem_type_managers.
> >* @vma_manager: Address space manager
> > + * @vm_ops: Pointer to the struct vm_operations_struct used for
> > this
> > + * device's VM operations. The driver may override this before the
> > first
> > + * mmap() call.
> >* lru_lock: Spinlock that protects the buffer+device lru lists
> > and
> >* ddestroy lists.
> >* @dev_mapping: A pointer to the struct address_space
> > representing the
> > @@ -461,6 +464,7 @@ struct ttm_bo_device {
> > struct ttm_bo_global *glob;
> > struct ttm_bo_driver *driver;
> > struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES];
> > +   const struct vm_operations_struct *vm_ops;
> >   
> > /*
> >  * Protected by internal locks.
> > @@ -489,6 +493,8 @@ struct ttm_bo_device {
> > bool no_retry;
> >   };
> >   
> > +extern const struct vm_operations_struct ttm_bo_vm_ops;
> > +
> >   /**
> >* struct ttm_lru_bulk_move_pos
> >*

[PATCH 9/9] drm/vmwgfx: Add surface dirty-tracking callbacks v2

2019-04-24 Thread Thomas Hellstrom
Add the callbacks necessary to implement emulated coherent memory for
surfaces. Add a flag to the gb_surface_create ioctl to indicate that
surface memory should be coherent.
Also bump the drm minor version to signal the availability of coherent
surfaces.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
v2: Fix a couple of typos.
---
 .../device_include/svga3d_surfacedefs.h   | 209 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h   |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c   | 390 +-
 include/uapi/drm/vmwgfx_drm.h |   4 +-
 4 files changed, 600 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/device_include/svga3d_surfacedefs.h 
b/drivers/gpu/drm/vmwgfx/device_include/svga3d_surfacedefs.h
index f2bfd3d80598..c4243e76a249 100644
--- a/drivers/gpu/drm/vmwgfx/device_include/svga3d_surfacedefs.h
+++ b/drivers/gpu/drm/vmwgfx/device_include/svga3d_surfacedefs.h
@@ -1280,7 +1280,6 @@ svga3dsurface_get_pixel_offset(SVGA3dSurfaceFormat format,
return offset;
 }
 
-
 static inline u32
 svga3dsurface_get_image_offset(SVGA3dSurfaceFormat format,
   surf_size_struct baseLevelSize,
@@ -1375,4 +1374,212 @@ 
svga3dsurface_is_screen_target_format(SVGA3dSurfaceFormat format)
return svga3dsurface_is_dx_screen_target_format(format);
 }
 
+/**
+ * struct svga3dsurface_mip - Mipmap level information
+ * @bytes: Bytes required in the backing store of this mipmap level.
+ * @img_stride: Byte stride per image.
+ * @row_stride: Byte stride per block row.
+ * @size: The size of the mipmap.
+ */
+struct svga3dsurface_mip {
+   size_t bytes;
+   size_t img_stride;
+   size_t row_stride;
+   struct drm_vmw_size size;
+
+};
+
+/**
+ * struct svga3dsurface_cache - Cached surface information
+ * @desc: Pointer to the surface descriptor
+ * @mip: Array of mipmap level information. Valid size is @num_mip_levels.
+ * @mip_chain_bytes: Bytes required in the backing store for the whole chain
+ * of mip levels.
+ * @num_mip_levels: Valid size of the @mip array. Number of mipmap levels in
+ * a chain.
+ * @num_layers: Number of slices in an array texture or number of faces in
+ * a cubemap texture.
+ */
+struct svga3dsurface_cache {
+   const struct svga3d_surface_desc *desc;
+   struct svga3dsurface_mip mip[DRM_VMW_MAX_MIP_LEVELS];
+   size_t mip_chain_bytes;
+   u32 num_mip_levels;
+   u32 num_layers;
+};
+
+/**
+ * struct svga3dsurface_loc - Surface location
+ * @sub_resource: Surface subresource. Defined as layer * num_mip_levels +
+ * mip_level.
+ * @x: X coordinate.
+ * @y: Y coordinate.
+ * @z: Z coordinate.
+ */
+struct svga3dsurface_loc {
+   u32 sub_resource;
+   u32 x, y, z;
+};
+
+/**
+ * svga3dsurface_subres - Compute the subresource from layer and mipmap.
+ * @cache: Surface layout data.
+ * @mip_level: The mipmap level.
+ * @layer: The surface layer (face or array slice).
+ *
+ * Return: The subresource.
+ */
+static inline u32 svga3dsurface_subres(const struct svga3dsurface_cache *cache,
+  u32 mip_level, u32 layer)
+{
+   return cache->num_mip_levels * layer + mip_level;
+}
+
+/**
+ * svga3dsurface_setup_cache - Build a surface cache entry
+ * @size: The surface base level dimensions.
+ * @format: The surface format.
+ * @num_mip_levels: Number of mipmap levels.
+ * @num_layers: Number of layers.
+ * @cache: Pointer to a struct svga3dsurface_cache object to be filled in.
+ */
+static inline void svga3dsurface_setup_cache(const struct drm_vmw_size *size,
+SVGA3dSurfaceFormat format,
+u32 num_mip_levels,
+u32 num_layers,
+u32 num_samples,
+struct svga3dsurface_cache *cache)
+{
+   const struct svga3d_surface_desc *desc;
+   u32 i;
+
+   memset(cache, 0, sizeof(*cache));
+   cache->desc = desc = svga3dsurface_get_desc(format);
+   cache->num_mip_levels = num_mip_levels;
+   cache->num_layers = num_layers;
+   for (i = 0; i < cache->num_mip_levels; i++) {
+   struct svga3dsurface_mip *mip = &cache->mip[i];
+
+   mip->size = svga3dsurface_get_mip_size(*size, i);
+   mip->bytes = svga3dsurface_get_image_buffer_size
+   (desc, &mip->size, 0) * num_samples;
+   mip->row_stride =
+   __KERNEL_DIV_ROUND_UP(mip->size.width,
+ desc->block_size.width) *
+   desc->bytes_per_block * num_samples;
+   mip->img_stride =
+   __KERNEL_DIV_ROUND_UP(mip->size.height,
+ desc->block_size.height) *
+   mip->row

[PATCH 6/9] drm/vmwgfx: Implement an infrastructure for write-coherent resources v2

2019-04-24 Thread Thomas Hellstrom
This infrastructure will, for coherent resources, make sure that
from the user-space point of view, data written by the CPU is immediately
and automatically available to the GPU at resource validation time.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
v2: Minor documentation- and typo fixes
---
 drivers/gpu/drm/vmwgfx/Kconfig|   1 +
 drivers/gpu/drm/vmwgfx/Makefile   |   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c|   5 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   |   5 +
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h   |  26 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c   |   1 -
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c| 409 ++
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c  |  57 +++
 drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h |  11 +
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c|  71 +++
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.h|  16 +-
 11 files changed, 584 insertions(+), 20 deletions(-)
 create mode 100644 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c

diff --git a/drivers/gpu/drm/vmwgfx/Kconfig b/drivers/gpu/drm/vmwgfx/Kconfig
index 6b28a326f8bb..d5fd81a521f6 100644
--- a/drivers/gpu/drm/vmwgfx/Kconfig
+++ b/drivers/gpu/drm/vmwgfx/Kconfig
@@ -8,6 +8,7 @@ config DRM_VMWGFX
select FB_CFB_IMAGEBLIT
select DRM_TTM
select FB
+   select AS_DIRTY_HELPERS
# Only needed for the transitional use of drm_crtc_init - can be removed
# again once vmwgfx sets up the primary plane itself.
select DRM_KMS_HELPER
diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile
index 8841bd30e1e5..c877a21a0739 100644
--- a/drivers/gpu/drm/vmwgfx/Makefile
+++ b/drivers/gpu/drm/vmwgfx/Makefile
@@ -8,7 +8,7 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o 
vmwgfx_drv.o \
vmwgfx_cmdbuf_res.o vmwgfx_cmdbuf.o vmwgfx_stdu.o \
vmwgfx_cotable.o vmwgfx_so.o vmwgfx_binding.o vmwgfx_msg.o \
vmwgfx_simple_resource.o vmwgfx_va.o vmwgfx_blit.o \
-   vmwgfx_validation.o \
+   vmwgfx_validation.o vmwgfx_page_dirty.o \
ttm_object.o ttm_lock.o
 
 obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index c0829d50eecc..90ca866640fe 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -463,6 +463,7 @@ void vmw_bo_bo_free(struct ttm_buffer_object *bo)
 {
struct vmw_buffer_object *vmw_bo = vmw_buffer_object(bo);
 
+   WARN_ON(vmw_bo->dirty);
vmw_bo_unmap(vmw_bo);
kfree(vmw_bo);
 }
@@ -476,8 +477,10 @@ void vmw_bo_bo_free(struct ttm_buffer_object *bo)
 static void vmw_user_bo_destroy(struct ttm_buffer_object *bo)
 {
struct vmw_user_buffer_object *vmw_user_bo = vmw_user_buffer_object(bo);
+   struct vmw_buffer_object *vbo = &vmw_user_bo->vbo;
 
-   vmw_bo_unmap(&vmw_user_bo->vbo);
+   WARN_ON(vbo->dirty);
+   vmw_bo_unmap(vbo);
ttm_prime_object_kfree(vmw_user_bo, prime);
 }
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 6165fe2c4504..74e94138877e 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -857,6 +857,11 @@ static int vmw_driver_load(struct drm_device *dev, 
unsigned long chipset)
DRM_ERROR("Failed initializing TTM buffer object driver.\n");
goto out_no_bdev;
}
+   dev_priv->vm_ops = *dev_priv->bdev.vm_ops;
+   dev_priv->vm_ops.fault = vmw_bo_vm_fault;
+   dev_priv->vm_ops.pfn_mkwrite = vmw_bo_vm_mkwrite;
+   dev_priv->vm_ops.page_mkwrite = vmw_bo_vm_mkwrite;
+   dev_priv->bdev.vm_ops = &dev_priv->vm_ops;
 
/*
 * Enable VRAM, but initially don't use it until SVGA is enabled and
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index bd6919b90519..f05fce52fbb4 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -95,6 +95,7 @@ struct vmw_fpriv {
  * @dx_query_ctx: DX context if this buffer object is used as a DX query MOB
  * @map: Kmap object for semi-persistent mappings
  * @res_prios: Eviction priority counts for attached resources
+ * @dirty: structure for user-space dirty-tracking
  */
 struct vmw_buffer_object {
struct ttm_buffer_object base;
@@ -105,6 +106,7 @@ struct vmw_buffer_object {
/* Protected by reservation */
struct ttm_bo_kmap_obj map;
u32 res_prios[TTM_MAX_BO_PRIORITY];
+   struct vmw_bo_dirty *dirty;
 };
 
 /**
@@ -135,7 +137,8 @@ struct vmw_res_func;
  * @res_dirty: Resource contains data not yet in the backup buffer. Protected
  * by resource reserved.
  * @backup_dirty: Backup buffer contains data not yet in the HW resource.
- * Protecte by resource reserved.
+ * Protected by resource reserved.
+ * @coherent: Emulate coh

[PATCH 4/9] drm/ttm: Allow the driver to provide the ttm struct vm_operations_struct

2019-04-24 Thread Thomas Hellstrom
Add a pointer to the struct vm_operations_struct in the bo_device, and
assign that pointer to the default value currently used.

The driver can then optionally modify that pointer and the new value
can be used for each new vma created.
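
A minimal sketch of how a driver is expected to use this (the my_*()
callbacks are placeholders; the concrete user is the vmwgfx dirty-tracking
code later in this series):

#include <drm/ttm/ttm_bo_driver.h>

static vm_fault_t my_bo_vm_fault(struct vm_fault *vmf);
static vm_fault_t my_bo_vm_mkwrite(struct vm_fault *vmf);

/* Driver-private, writable copy of the vm_operations_struct. */
static struct vm_operations_struct my_vm_ops;

static void my_setup_vm_ops(struct ttm_bo_device *bdev)
{
        /* Start from the defaults installed by ttm_bo_device_init(). */
        my_vm_ops = *bdev->vm_ops;
        /* Override only what the driver needs. */
        my_vm_ops.fault = my_bo_vm_fault;
        my_vm_ops.page_mkwrite = my_bo_vm_mkwrite;
        my_vm_ops.pfn_mkwrite = my_bo_vm_mkwrite;
        /* Every vma created by a later mmap() picks this up. */
        bdev->vm_ops = &my_vm_ops;
}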

Cc: "Christian König" 

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c| 1 +
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 6 +++---
 include/drm/ttm/ttm_bo_driver.h | 6 ++
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 3f56647cdb35..1c85bec00472 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1656,6 +1656,7 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
mutex_lock(&ttm_global_mutex);
list_add_tail(&bdev->device_list, &glob->device_list);
mutex_unlock(&ttm_global_mutex);
+   bdev->vm_ops = &ttm_bo_vm_ops;
 
return 0;
 out_no_sys:
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index e86a29a1e51f..bfb25b81fed7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -395,7 +395,7 @@ static int ttm_bo_vm_access(struct vm_area_struct *vma, 
unsigned long addr,
return ret;
 }
 
-static const struct vm_operations_struct ttm_bo_vm_ops = {
+const struct vm_operations_struct ttm_bo_vm_ops = {
.fault = ttm_bo_vm_fault,
.open = ttm_bo_vm_open,
.close = ttm_bo_vm_close,
@@ -445,7 +445,7 @@ int ttm_bo_mmap(struct file *filp, struct vm_area_struct 
*vma,
if (unlikely(ret != 0))
goto out_unref;
 
-   vma->vm_ops = &ttm_bo_vm_ops;
+   vma->vm_ops = bdev->vm_ops;
 
/*
 * Note: We're transferring the bo reference to
@@ -477,7 +477,7 @@ int ttm_fbdev_mmap(struct vm_area_struct *vma, struct 
ttm_buffer_object *bo)
 
ttm_bo_get(bo);
 
-   vma->vm_ops = &ttm_bo_vm_ops;
+   vma->vm_ops = bo->bdev->vm_ops;
vma->vm_private_data = bo;
vma->vm_flags |= VM_MIXEDMAP;
vma->vm_flags |= VM_IO | VM_DONTEXPAND;
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index cbf3180cb612..cfeaff5d9706 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -443,6 +443,9 @@ extern struct ttm_bo_global {
  * @driver: Pointer to a struct ttm_bo_driver struct setup by the driver.
  * @man: An array of mem_type_managers.
  * @vma_manager: Address space manager
+ * @vm_ops: Pointer to the struct vm_operations_struct used for this
+ * device's VM operations. The driver may override this before the first
+ * mmap() call.
  * lru_lock: Spinlock that protects the buffer+device lru lists and
  * ddestroy lists.
  * @dev_mapping: A pointer to the struct address_space representing the
@@ -461,6 +464,7 @@ struct ttm_bo_device {
struct ttm_bo_global *glob;
struct ttm_bo_driver *driver;
struct ttm_mem_type_manager man[TTM_NUM_MEM_TYPES];
+   const struct vm_operations_struct *vm_ops;
 
/*
 * Protected by internal locks.
@@ -489,6 +493,8 @@ struct ttm_bo_device {
bool no_retry;
 };
 
+extern const struct vm_operations_struct ttm_bo_vm_ops;
+
 /**
  * struct ttm_lru_bulk_move_pos
  *
-- 
2.20.1


[PATCH 0/9] Emulated coherent graphics memory v2

2019-04-24 Thread Thomas Hellstrom
Graphics APIs like OpenGL 4.4 and Vulkan require the graphics driver
to provide coherent graphics memory, meaning that the GPU sees any
content written to the coherent memory on the next GPU operation that
touches that memory, and the CPU sees any content written by the GPU
to that memory immediately after any fence object trailing the GPU
operation has signaled.

Paravirtual drivers that otherwise require explicit synchronization
need to do this by hooking up dirty tracking to pagefault handlers
and buffer object validation. This is a first attempt to do that for
the vmwgfx driver.

The mm patches have been out for RFC. I think I have addressed all the
feedback I got, except a possible softdirty breakage. But although the
dirty-tracking and softdirty may write-protect PTEs both care about,
that shouldn't really cause any operation interference. In particular
since we use the hardware dirty PTE bits and softdirty uses other PTE bits.

For the TTM changes they are hopefully in line with the long-term
strategy of making helpers out of what's left of TTM.

The code has been tested and exercised by a tailored version of mesa
where we disable all explicit synchronization and assume graphics memory
is coherent. The performance loss varies of course; a typical number is
around 5%.

Any feedback greatly appreciated.

Changes v1-v2:
- Addressed a number of typos and formatting issues.
- Added a usage warning for apply_to_pfn_range() and apply_to_page_range()
- Re-evaluated the decision to use apply_to_pfn_range() rather than
  modifying the pagewalk.c. It still looks like generically handling the
  transparent huge page cases requires the mmap_sem to be held at least
  in read mode, so sticking with apply_to_pfn_range() for now.
- The TTM page-fault helper vma copy argument was scratched in favour of
  a pageprot_t argument.
  
Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Minchan Kim 
Cc: Michal Hocko 
Cc: Huang Ying 
Cc: Souptick Joarder 
Cc: "Jérôme Glisse" 
Cc: "Christian König" 
Cc: linux...@kvack.org

[PATCH 7/9] drm/vmwgfx: Use an RBtree instead of linked list for MOB resources

2019-04-24 Thread Thomas Hellstrom
With emulated coherent memory we need to be able to quickly look up
a resource from the MOB offset. Instead of traversing a linked list with
O(n) worst case, use an RBtree with O(log n) worst case complexity.
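
For reference, a point lookup over the new tree would look roughly as
follows; the field names match the patch, but the helper itself is
illustrative and uses vmwgfx-internal types:

static struct vmw_resource *
my_find_res_at_offset(struct vmw_buffer_object *vbo, unsigned long offset)
{
        struct rb_node *cur = vbo->res_tree.rb_node;

        while (cur) {
                struct vmw_resource *res =
                        container_of(cur, struct vmw_resource, mob_node);

                if (offset < res->backup_offset)
                        cur = cur->rb_left;
                else if (offset >= res->backup_offset + res->backup_size)
                        cur = cur->rb_right;
                else
                        return res;     /* offset lies within this resource */
        }

        return NULL;
}

The range lookup added for dirty tracking later in the series keeps walking
left instead of returning on the first hit, so that the overlapping resource
with the lowest starting offset is found.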

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c   |  5 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h  | 10 +++
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 33 +---
 3 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 90ca866640fe..e8bc7a7ac031 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -464,6 +464,7 @@ void vmw_bo_bo_free(struct ttm_buffer_object *bo)
struct vmw_buffer_object *vmw_bo = vmw_buffer_object(bo);
 
WARN_ON(vmw_bo->dirty);
+   WARN_ON(!RB_EMPTY_ROOT(&vmw_bo->res_tree));
vmw_bo_unmap(vmw_bo);
kfree(vmw_bo);
 }
@@ -480,6 +481,7 @@ static void vmw_user_bo_destroy(struct ttm_buffer_object 
*bo)
struct vmw_buffer_object *vbo = _user_bo->vbo;
 
WARN_ON(vbo->dirty);
+   WARN_ON(!RB_EMPTY_ROOT(&vbo->res_tree));
vmw_bo_unmap(vbo);
ttm_prime_object_kfree(vmw_user_bo, prime);
 }
@@ -515,8 +517,7 @@ int vmw_bo_init(struct vmw_private *dev_priv,
memset(vmw_bo, 0, sizeof(*vmw_bo));
BUILD_BUG_ON(TTM_MAX_BO_PRIORITY <= 3);
vmw_bo->base.priority = 3;
-
-   INIT_LIST_HEAD(&vmw_bo->res_list);
+   vmw_bo->res_tree = RB_ROOT;
 
ret = ttm_bo_init(bdev, &vmw_bo->base, size,
  ttm_bo_type_device, placement,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index f05fce52fbb4..81ebcd668038 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -90,7 +90,7 @@ struct vmw_fpriv {
 /**
  * struct vmw_buffer_object - TTM buffer object with vmwgfx additions
  * @base: The TTM buffer object
- * @res_list: List of resources using this buffer object as a backing MOB
+ * @res_tree: RB tree of resources using this buffer object as a backing MOB
  * @pin_count: pin depth
  * @dx_query_ctx: DX context if this buffer object is used as a DX query MOB
  * @map: Kmap object for semi-persistent mappings
@@ -99,7 +99,7 @@ struct vmw_fpriv {
  */
 struct vmw_buffer_object {
struct ttm_buffer_object base;
-   struct list_head res_list;
+   struct rb_root res_tree;
s32 pin_count;
/* Not ref-counted.  Protected by binding_mutex */
struct vmw_resource *dx_query_ctx;
@@ -147,8 +147,8 @@ struct vmw_res_func;
  * pin-count greater than zero. It is not on the resource LRU lists and its
  * backup buffer is pinned. Hence it can't be evicted.
  * @func: Method vtable for this resource. Immutable.
+ * @mob_node: Node for the MOB backup rbtree. Protected by @backup reserved.
  * @lru_head: List head for the LRU list. Protected by 
@dev_priv::resource_lock.
- * @mob_head: List head for the MOB backup list. Protected by @backup reserved.
  * @binding_head: List head for the context binding list. Protected by
  * the @dev_priv::binding_mutex
  * @res_free: The resource destructor.
@@ -169,8 +169,8 @@ struct vmw_resource {
unsigned long backup_offset;
unsigned long pin_count;
const struct vmw_res_func *func;
+   struct rb_node mob_node;
struct list_head lru_head;
-   struct list_head mob_head;
struct list_head binding_head;
struct vmw_resource_dirty *dirty;
void (*res_free) (struct vmw_resource *res);
@@ -743,7 +743,7 @@ void vmw_resource_dirty_update(struct vmw_resource *res, 
pgoff_t start,
  */
 static inline bool vmw_resource_mob_attached(const struct vmw_resource *res)
 {
-   return !list_empty(&res->mob_head);
+   return !RB_EMPTY_NODE(&res->mob_node);
 }
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index d35f4bd32cd9..ff9fe5650468 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -41,11 +41,24 @@
 void vmw_resource_mob_attach(struct vmw_resource *res)
 {
struct vmw_buffer_object *backup = res->backup;
struct rb_node **new = &backup->res_tree.rb_node, *parent = NULL;
 
lockdep_assert_held(&backup->base.resv->lock.base);
res->used_prio = (res->res_dirty) ? res->func->dirty_prio :
res->func->prio;
-   list_add_tail(&res->mob_head, &backup->res_list);
+
+   while (*new) {
+   struct vmw_resource *this =
+   container_of(*new, struct vmw_resource, mob_node);
+
+   parent = *new;
+   new = (res->backup_offset < this->backup_offset) ?
+   &((*new)->rb_left) : &((*new)->rb_right);
+   }
+
+   rb_link_node(&res->mob_node, 

[PATCH 5/9] drm/ttm: TTM fault handler helpers v2

2019-04-24 Thread Thomas Hellstrom
With the vmwgfx dirty tracking, the default TTM fault handler is not
completely sufficient (vmwgfx needs to modify the vma->vm_flags member,
and also needs to restrict the number of prefaults).

We also want to replicate the new ttm_bo_vm_reserve() functionality.

So start turning the TTM vm code into helpers: ttm_bo_vm_fault_reserved()
and ttm_bo_vm_reserve(), and provide a default TTM fault handler for other
drivers to use.
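
The resulting driver-side shape is roughly the following simplified sketch
(the vmwgfx fault handler added later in the series follows this pattern;
the prefault count and callback name are placeholders):

#include <drm/ttm/ttm_bo_api.h>

static vm_fault_t my_bo_vm_fault(struct vm_fault *vmf)
{
        struct vm_area_struct *vma = vmf->vma;
        struct ttm_buffer_object *bo = vma->vm_private_data;
        vm_fault_t ret;

        /* May drop the mmap_sem and return VM_FAULT_RETRY / VM_FAULT_NOPAGE. */
        ret = ttm_bo_vm_reserve(bo, vmf);
        if (ret)
                return ret;

        /* Driver-specific work (dirty tracking, restricted prefault) here. */

        ret = ttm_bo_vm_fault_reserved(vmf, vma->vm_page_prot,
                                       16 /* prefault pages, driver's choice */);
        if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
                return ret;

        reservation_object_unlock(bo->resv);
        return ret;
}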

Cc: "Christian König" 

Signed-off-by: Thomas Hellstrom 
Reviewed-by: "Christian König"  #v1
---
v2: Remove some unnecessary code pointed out in review comments
Make ttm_bo_vm_fault_reserved() take a pgprot_t as an argument
instead of a struct vm_area_struct pointer.
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 175 +++-
 include/drm/ttm/ttm_bo_api.h|  10 ++
 2 files changed, 113 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index bfb25b81fed7..d15f222dc081 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -42,8 +42,6 @@
 #include 
 #include 
 
-#define TTM_BO_VM_NUM_PREFAULT 16
-
 static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
struct vm_fault *vmf)
 {
@@ -106,31 +104,30 @@ static unsigned long ttm_bo_io_mem_pfn(struct 
ttm_buffer_object *bo,
+ page_offset;
 }
 
-static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
+/**
+ * ttm_bo_vm_reserve - Reserve a buffer object in a retryable vm callback
+ * @bo: The buffer object
+ * @vmf: The fault structure handed to the callback
+ *
+ * vm callbacks like fault() and *_mkwrite() allow for the mm_sem to be dropped
+ * during long waits, and after the wait the callback will be restarted. This
+ * is to allow other threads using the same virtual memory space concurrent
+ * access to map(), unmap() completely unrelated buffer objects. TTM buffer
+ * object reservations sometimes wait for GPU and should therefore be
+ * considered long waits. This function reserves the buffer object 
interruptibly
+ * taking this into account. Starvation is avoided by the vm system not
+ * allowing too many repeated restarts.
+ * This function is intended to be used in customized fault() and _mkwrite()
+ * handlers.
+ *
+ * Return:
+ *0 on success and the bo was reserved.
+ *VM_FAULT_RETRY if blocking wait.
+ *VM_FAULT_NOPAGE if blocking wait and retrying was not allowed.
+ */
+vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
+struct vm_fault *vmf)
 {
-   struct vm_area_struct *vma = vmf->vma;
-   struct ttm_buffer_object *bo = (struct ttm_buffer_object *)
-   vma->vm_private_data;
-   struct ttm_bo_device *bdev = bo->bdev;
-   unsigned long page_offset;
-   unsigned long page_last;
-   unsigned long pfn;
-   struct ttm_tt *ttm = NULL;
-   struct page *page;
-   int err;
-   int i;
-   vm_fault_t ret = VM_FAULT_NOPAGE;
-   unsigned long address = vmf->address;
-   struct ttm_mem_type_manager *man =
-   &bdev->man[bo->mem.mem_type];
-   struct vm_area_struct cvma;
-
-   /*
-* Work around locking order reversal in fault / nopfn
-* between mmap_sem and bo_reserve: Perform a trylock operation
-* for reserve, and if it fails, retry the fault after waiting
-* for the buffer to become unreserved.
-*/
if (unlikely(!reservation_object_trylock(bo->resv))) {
if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
@@ -151,14 +148,55 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
return VM_FAULT_NOPAGE;
}
 
+   return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vm_reserve);
+
+/**
+ * ttm_bo_vm_fault_reserved - TTM fault helper
+ * @vmf: The struct vm_fault given as argument to the fault callback
+ * @prot: The page protection to be used for this memory area.
+ * @num_prefault: Maximum number of prefault pages. The caller may want to
+ * specify this based on madvice settings and the size of the GPU object
+ * backed by the memory.
+ *
+ * This function inserts one or more page table entries pointing to the
+ * memory backing the buffer object, and then returns a return code
+ * instructing the caller to retry the page access.
+ *
+ * Return:
+ *   VM_FAULT_NOPAGE on success or pending signal
+ *   VM_FAULT_SIGBUS on unspecified error
+ *   VM_FAULT_OOM on out-of-memory
+ *   VM_FAULT_RETRY if retryable wait
+ */
+vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
+   pgprot_t prot,
+   pgoff_t num_prefault)
+{
+   struct vm_area_struct *vma = vmf->vma;
+   struct vm_area_struct cvma = *vma;
+   struct ttm_buffer_object *bo = (struct ttm_buffer_object *)
+   

[PATCH 8/9] drm/vmwgfx: Implement an infrastructure for read-coherent resources v2

2019-04-24 Thread Thomas Hellstrom
Similar to write-coherent resources, make sure that from the user-space
point of view, GPU-rendered content is automatically available for
reading by the CPU.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Deepak Rawat 
---
v2: Comment- and formatting fixes.
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h   |   7 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c|  73 -
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c  | 103 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource_priv.h |   2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c|   3 +-
 5 files changed, 177 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 81ebcd668038..ab3670a06108 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -690,7 +690,8 @@ extern void vmw_resource_unreference(struct vmw_resource 
**p_res);
 extern struct vmw_resource *vmw_resource_reference(struct vmw_resource *res);
 extern struct vmw_resource *
 vmw_resource_reference_unless_doomed(struct vmw_resource *res);
-extern int vmw_resource_validate(struct vmw_resource *res, bool intr);
+extern int vmw_resource_validate(struct vmw_resource *res, bool intr,
+bool dirtying);
 extern int vmw_resource_reserve(struct vmw_resource *res, bool interruptible,
bool no_backup);
 extern bool vmw_resource_needs_backup(const struct vmw_resource *res);
@@ -734,6 +735,8 @@ void vmw_resource_mob_attach(struct vmw_resource *res);
 void vmw_resource_mob_detach(struct vmw_resource *res);
 void vmw_resource_dirty_update(struct vmw_resource *res, pgoff_t start,
   pgoff_t end);
+int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
+   pgoff_t end, pgoff_t *num_prefault);
 
 /**
  * vmw_resource_mob_attached - Whether a resource currently has a mob attached
@@ -1428,6 +1431,8 @@ int vmw_bo_dirty_add(struct vmw_buffer_object *vbo);
 void vmw_bo_dirty_transfer_to_res(struct vmw_resource *res);
 void vmw_bo_dirty_clear_res(struct vmw_resource *res);
 void vmw_bo_dirty_release(struct vmw_buffer_object *vbo);
+void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo,
+   pgoff_t start, pgoff_t end);
 vm_fault_t vmw_bo_vm_fault(struct vm_fault *vmf);
 vm_fault_t vmw_bo_vm_mkwrite(struct vm_fault *vmf);
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
index 8d154f90bdc0..730c51e397dd 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c
@@ -153,7 +153,6 @@ static void vmw_bo_dirty_scan_mkwrite(struct 
vmw_buffer_object *vbo)
}
 }
 
-
 /**
  * vmw_bo_dirty_scan - Scan for dirty pages and add them to the dirty
  * tracking structure
@@ -171,6 +170,51 @@ void vmw_bo_dirty_scan(struct vmw_buffer_object *vbo)
vmw_bo_dirty_scan_mkwrite(vbo);
 }
 
+/**
+ * vmw_bo_dirty_pre_unmap - write-protect and pick up dirty pages before
+ * an unmap_mapping_range operation.
+ * @vbo: The buffer object,
+ * @start: First page of the range within the buffer object.
+ * @end: Last page of the range within the buffer object + 1.
+ *
+ * If we're using the _PAGETABLE scan method, we may leak dirty pages
+ * when calling unmap_mapping_range(). This function makes sure we pick
+ * up all dirty pages.
+ */
+static void vmw_bo_dirty_pre_unmap(struct vmw_buffer_object *vbo,
+  pgoff_t start, pgoff_t end)
+{
+   struct vmw_bo_dirty *dirty = vbo->dirty;
+   unsigned long offset = drm_vma_node_start(&vbo->base.vma_node);
+   struct address_space *mapping = vbo->base.bdev->dev_mapping;
+
+   if (dirty->method != VMW_BO_DIRTY_PAGETABLE || start >= end)
+   return;
+
+   apply_as_wrprotect(mapping, start + offset, end - start);
+   apply_as_clean(mapping, start + offset, end - start, offset,
+  &dirty->bitmap[0], &dirty->start, &dirty->end);
+}
+
+/**
+ * vmw_bo_dirty_unmap - Clear all ptes pointing to a range within a bo
+ * @vbo: The buffer object,
+ * @start: First page of the range within the buffer object.
+ * @end: Last page of the range within the buffer object + 1.
+ *
+ * This is similar to ttm_bo_unmap_virtual_locked() except it takes a subrange.
+ */
+void vmw_bo_dirty_unmap(struct vmw_buffer_object *vbo,
+   pgoff_t start, pgoff_t end)
+{
+   unsigned long offset = drm_vma_node_start(&vbo->base.vma_node);
+   struct address_space *mapping = vbo->base.bdev->dev_mapping;
+
+   vmw_bo_dirty_pre_unmap(vbo, start, end);
+   unmap_shared_mapping_range(mapping, (offset + start) << PAGE_SHIFT,
+  (loff_t) (end - start) << PAGE_SHIFT);
+}
+
 /**
  * vmw_bo_dirty_add - Add a dirty-tracking user to a buffer object
  * @vbo: The buffer object
@@ -389,21 +433,40 @@

[PATCH 3/9] mm: Add write-protect and clean utilities for address space ranges v2

2019-04-24 Thread Thomas Hellstrom
Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.

The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.

Notable changes since RFC:
- Added comments to help avoid the usage of these functions for VMAs
  they're not intended for. We also do advisory checks on the vm_flags and
  warn on illegal usage.
- Perform the pte modifications the same way softdirty does.
- Add mmu_notifier range invalidation calls.
- Add a config option so that this code is not unconditionally included.
- Tell the mmu_gather code about pending tlb flushes.

Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Minchan Kim 
Cc: Michal Hocko 
Cc: Huang Ying 
Cc: Souptick Joarder 
Cc: "Jérôme Glisse" 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Ralph Campbell  #v1
---
v2: Fix formatting and typos.
Change file-name of the added file, and don't compile it unless
configured to do so.
---
 MAINTAINERS   |   1 +
 include/linux/mm.h|   9 +-
 mm/Kconfig|   3 +
 mm/Makefile   |   1 +
 mm/as_dirty_helpers.c | 297 ++
 5 files changed, 310 insertions(+), 1 deletion(-)
 create mode 100644 mm/as_dirty_helpers.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 35e6357f9d30..015e1e758bf6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4971,6 +4971,7 @@ T:git git://people.freedesktop.org/~thomash/linux
 S: Supported
 F: drivers/gpu/drm/vmwgfx/
 F: include/uapi/drm/vmwgfx_drm.h
+F: mm/as_dirty_helpers.c
 
 DRM DRIVERS
 M: David Airlie 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b7dd4ddd6efb..62f24dd0bfa0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2642,7 +2642,14 @@ struct pfn_range_apply {
 };
 extern int apply_to_pfn_range(struct pfn_range_apply *closure,
  unsigned long address, unsigned long size);
-
+unsigned long apply_as_wrprotect(struct address_space *mapping,
+pgoff_t first_index, pgoff_t nr);
+unsigned long apply_as_clean(struct address_space *mapping,
+pgoff_t first_index, pgoff_t nr,
+pgoff_t bitmap_pgoff,
+unsigned long *bitmap,
+pgoff_t *start,
+pgoff_t *end);
 #ifdef CONFIG_PAGE_POISONING
 extern bool page_poisoning_enabled(void);
 extern void kernel_poison_pages(struct page *page, int numpages, int enable);
diff --git a/mm/Kconfig b/mm/Kconfig
index 25c71eb8a7db..80e41cdbb4ae 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -758,4 +758,7 @@ config GUP_BENCHMARK
 config ARCH_HAS_PTE_SPECIAL
bool
 
+config AS_DIRTY_HELPERS
+bool
+
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index d210cc9d6f80..4bf396ba3a00 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -99,3 +99,4 @@ obj-$(CONFIG_HARDENED_USERCOPY) += usercopy.o
 obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o
 obj-$(CONFIG_HMM) += hmm.o
 obj-$(CONFIG_MEMFD_CREATE) += memfd.o
+obj-$(CONFIG_AS_DIRTY_HELPERS) += as_dirty_helpers.o
diff --git a/mm/as_dirty_helpers.c b/mm/as_dirty_helpers.c
new file mode 100644
index ..26984841d18f
--- /dev/null
+++ b/mm/as_dirty_helpers.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**
+ * struct apply_as - Closure structure for apply_as_range
+ * @base: struct pfn_range_apply we derive from
+ * @start: Address of first modified pte
+ * @end: Address of last modified pte + 1
+ * @total: Total number of modified ptes
+ * @vma: Pointer to the struct vm_area_struct we're currently operating on
+ */
+struct apply_as {
+   struct pfn_range_apply base;
+   unsigned long start;
+   unsigned long end;
+   unsigned long total;
+   const struct vm_area_struct *vma;
+};
+
+/**
+ * apply_pt_wrprotect - Leaf pte callback to write-protect a pte
+ * @pte: Pointer to the pte
+ * @token: Page table token, see apply_to_pfn_range()
+ * @addr: The virtual page address
+ * @closure: Pointer to a struct pfn_range_apply embedded in a
+ * struct apply_as
+ *
+ * The function write-protects a pte and records the range in
+ * virtual addr

[PATCH 2/9] mm: Add an apply_to_pfn_range interface v2

2019-04-24 Thread Thomas Hellstrom
This is basically apply_to_page_range with added functionality:
Allocating missing parts of the page table becomes optional, which
means that the function can be guaranteed not to error if allocation
is disabled. Also passing of the closure struct and callback function
becomes different and more in line with how things are done elsewhere.

Finally, we keep apply_to_page_range as a wrapper around apply_to_pfn_range.

The reason for not using the page-walk code is that we want to perform
the page-walk on vmas pointing to an address space without requiring the
mmap_sem to be held rather than on vmas belonging to a process with the
mmap_sem held.
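
Sketch of the calling convention (the counting callback is made up for
illustration; since the function is not exported, a real caller would live
in core mm or built-in code):

#include <linux/mm.h>

struct my_closure {
        struct pfn_range_apply base;
        unsigned long present;
};

static int my_count_present(pte_t *pte, pgtable_t token, unsigned long addr,
                            struct pfn_range_apply *pfn_apply)
{
        struct my_closure *mc = container_of(pfn_apply, struct my_closure, base);

        if (pte_present(*pte))
                mc->present++;
        return 0;               /* non-zero would abort the walk */
}

static unsigned long my_count_range(struct mm_struct *mm,
                                    unsigned long addr, unsigned long size)
{
        struct my_closure mc = {
                .base = {
                        .mm = mm,
                        .ptefn = my_count_present,
                        .alloc = 0,     /* don't allocate missing levels */
                },
        };

        apply_to_pfn_range(&mc.base, addr, size);
        return mc.present;
}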

Notable changes since RFC:
Don't export apply_to_pfn_range.

Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Minchan Kim 
Cc: Michal Hocko 
Cc: Huang Ying 
Cc: Souptick Joarder 
Cc: "Jérôme Glisse" 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Ralph Campbell  #v1
---
v2: Clearly warn people from using apply_to_pfn_range and
apply_to_page_range unless they know what they are doing.
---
 include/linux/mm.h |  10 
 mm/memory.c| 135 ++---
 2 files changed, 113 insertions(+), 32 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..b7dd4ddd6efb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2632,6 +2632,16 @@ typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, 
unsigned long addr,
 extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
   unsigned long size, pte_fn_t fn, void *data);
 
+struct pfn_range_apply;
+typedef int (*pter_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
+struct pfn_range_apply *closure);
+struct pfn_range_apply {
+   struct mm_struct *mm;
+   pter_fn_t ptefn;
+   unsigned int alloc;
+};
+extern int apply_to_pfn_range(struct pfn_range_apply *closure,
+ unsigned long address, unsigned long size);
 
 #ifdef CONFIG_PAGE_POISONING
 extern bool page_poisoning_enabled(void);
diff --git a/mm/memory.c b/mm/memory.c
index 9580d894f963..0a86ee527ffa 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1938,18 +1938,17 @@ int vm_iomap_memory(struct vm_area_struct *vma, 
phys_addr_t start, unsigned long
 }
 EXPORT_SYMBOL(vm_iomap_memory);
 
-static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
-unsigned long addr, unsigned long end,
-pte_fn_t fn, void *data)
+static int apply_to_pte_range(struct pfn_range_apply *closure, pmd_t *pmd,
+ unsigned long addr, unsigned long end)
 {
pte_t *pte;
int err;
pgtable_t token;
spinlock_t *uninitialized_var(ptl);
 
-   pte = (mm == &init_mm) ?
+   pte = (closure->mm == _mm) ?
pte_alloc_kernel(pmd, addr) :
-   pte_alloc_map_lock(mm, pmd, addr, );
+   pte_alloc_map_lock(closure->mm, pmd, addr, &ptl);
if (!pte)
return -ENOMEM;
 
@@ -1960,86 +1959,109 @@ static int apply_to_pte_range(struct mm_struct *mm, 
pmd_t *pmd,
token = pmd_pgtable(*pmd);
 
do {
-   err = fn(pte++, token, addr, data);
+   err = closure->ptefn(pte++, token, addr, closure);
if (err)
break;
} while (addr += PAGE_SIZE, addr != end);
 
arch_leave_lazy_mmu_mode();
 
-   if (mm != &init_mm)
+   if (closure->mm != &init_mm)
pte_unmap_unlock(pte-1, ptl);
return err;
 }
 
-static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
-unsigned long addr, unsigned long end,
-pte_fn_t fn, void *data)
+static int apply_to_pmd_range(struct pfn_range_apply *closure, pud_t *pud,
+ unsigned long addr, unsigned long end)
 {
pmd_t *pmd;
unsigned long next;
-   int err;
+   int err = 0;
 
BUG_ON(pud_huge(*pud));
 
-   pmd = pmd_alloc(mm, pud, addr);
+   pmd = pmd_alloc(closure->mm, pud, addr);
if (!pmd)
return -ENOMEM;
+
do {
next = pmd_addr_end(addr, end);
-   err = apply_to_pte_range(mm, pmd, addr, next, fn, data);
+   if (!closure->alloc && pmd_none_or_clear_bad(pmd))
+   continue;
+   err = apply_to_pte_range(closure, pmd, addr, next);
if (err)
break;
} while (pmd++, addr = next, addr != end);
return err;
 }
 
-static int apply_to_pud_range(struct mm_struct *mm, p4d_t *p4d,
-unsigned long addr, unsigned long end,
-pte_fn_t fn, void *

[PATCH 1/9] mm: Allow the [page|pfn]_mkwrite callbacks to drop the mmap_sem v2

2019-04-24 Thread Thomas Hellstrom
Driver fault callbacks are allowed to drop the mmap_sem when expecting
long hardware waits to avoid blocking other mm users. Allow the mkwrite
callbacks to do the same by returning early on VM_FAULT_RETRY.

In particular we want to be able to drop the mmap_sem when waiting for
a reservation object lock on a GPU buffer object. These locks may be
held while waiting for the GPU.
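
On the driver side this enables handlers of the following shape
(hypothetical example; ttm_bo_vm_reserve() is the helper introduced in the
TTM patch of this series):

#include <drm/ttm/ttm_bo_api.h>

static vm_fault_t my_bo_vm_mkwrite(struct vm_fault *vmf)
{
        struct ttm_buffer_object *bo = vmf->vma->vm_private_data;
        vm_fault_t ret;

        /*
         * With this change the core mm returns early on VM_FAULT_RETRY,
         * so the reserve helper may drop the mmap_sem while waiting for
         * the reservation lock instead of blocking the whole mm.
         */
        ret = ttm_bo_vm_reserve(bo, vmf);
        if (ret)
                return ret;

        /* Mark the faulting page/range dirty in the driver's tracking here. */

        reservation_object_unlock(bo->resv);
        return 0;
}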

Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Rik van Riel 
Cc: Minchan Kim 
Cc: Michal Hocko 
Cc: Huang Ying 
Cc: Souptick Joarder 
Cc: "Jérôme Glisse" 
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Ralph Campbell 
---
v2: Make the order error codes we check for consistent with
the order used in the rest of the file.
---
 mm/memory.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index e11ca9dd823f..9580d894f963 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2144,7 +2144,7 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
ret = vmf->vma->vm_ops->page_mkwrite(vmf);
/* Restore original flags so that caller is not surprised */
vmf->flags = old_flags;
-   if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
+   if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
return ret;
if (unlikely(!(ret & VM_FAULT_LOCKED))) {
lock_page(page);
@@ -2419,7 +2419,7 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
vmf->flags |= FAULT_FLAG_MKWRITE;
ret = vma->vm_ops->pfn_mkwrite(vmf);
-   if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))
+   if (ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))
return ret;
return finish_mkwrite_fault(vmf);
}
@@ -2440,7 +2440,8 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
tmp = do_page_mkwrite(vmf);
if (unlikely(!tmp || (tmp &
- (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
+ (VM_FAULT_ERROR | VM_FAULT_NOPAGE |
+  VM_FAULT_RETRY)))) {
put_page(vmf->page);
return tmp;
}
@@ -3494,7 +3495,8 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
unlock_page(vmf->page);
tmp = do_page_mkwrite(vmf);
if (unlikely(!tmp ||
-   (tmp & (VM_FAULT_ERROR | VM_FAULT_NOPAGE {
-   (tmp & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
+   (tmp & (VM_FAULT_ERROR | VM_FAULT_NOPAGE |
+   VM_FAULT_RETRY)))) {
return tmp;
}
-- 
2.20.1


Re: [PATCH 8/9] drm/vmwgfx: Implement an infrastructure for read-coherent resources

2019-04-24 Thread Thomas Hellstrom
On Mon, 2019-04-22 at 20:12 +, Deepak Singh Rawat wrote:
> Minor nits below, otherwise
> 
> Reviewed-by: Deepak Rawat 
> 
> On Fri, 2019-04-12 at 09:04 -0700, Thomas Hellstrom wrote:
> > Similar to write-coherent resources, make sure that from the user-
> > space
> > point of view, GPU rendered contents is automatically available for
> > reading by the CPU.
> > 
> > Signed-off-by: Thomas Hellstrom 
> > ---
> > 
> > +   while (cur) {
> > +   struct vmw_resource *cur_res =
> > +   container_of(cur, struct vmw_resource,
> > mob_node);
> > +
> > +   if (cur_res->backup_offset >= res_end) {
> > +   cur = cur->rb_left;
> > +   } else if (cur_res->backup_offset + cur_res-
> > > backup_size <=
> > +  res_start) {
> > +   cur = cur->rb_right;
> > +   } else {
> > +   found = cur_res;
> 
> I didn't looked into how RB tree works but do you need to break the
> loop when resource is found?


No, here we will continue looking for a resource with an even lower
starting offset. I'll add a comment about that.

Thanks,
Thomas

