Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-13 Thread Christian König

Am 13.10.2016 um 02:20 schrieb Lorenzo Stoakes:

This patch series adjusts functions in the get_user_pages* family such that
desired FOLL_* flags are passed as an argument rather than implied by flags.

The purpose of this change is to make the use of FOLL_FORCE explicit so it is
easier to grep for and clearer to callers that this flag is being used. The use
of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the
VMA whose pages we are reading from/writing to, which can result in surprising
behaviour.

The patch series came out of the discussion around commit 38e0885, which
addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE
set but having been overridden by FOLL_FORCE. do_numa_page() was run on the
assumption the page _must_ be one marked for NUMA node migration as an actual
PROT_NONE page would have been dealt with prior to this code path, however
FOLL_FORCE introduced a situation where this assumption did not hold.

See https://marc.info/?l=linux-mm=147585445805166 for the patch proposal.

Lorenzo Stoakes (10):
   mm: remove write/force parameters from __get_user_pages_locked()
   mm: remove write/force parameters from __get_user_pages_unlocked()
   mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
   mm: replace get_user_pages_locked() write/force parameters with gup_flags
   mm: replace get_vaddr_frames() write/force parameters with gup_flags
   mm: replace get_user_pages() write/force parameters with gup_flags
   mm: replace get_user_pages_remote() write/force parameters with gup_flags
   mm: replace __access_remote_vm() write parameter with gup_flags
   mm: replace access_remote_vm() write parameter with gup_flags
   mm: replace access_process_vm() write parameter with gup_flags


Patch number 6 in this series (which touches drivers I co-maintain) is 
Acked-by: Christian König <christian.koe...@amd.com>.


In general looks like a very nice cleanup to me, but I'm not enlightened 
enough to full judge.


Regards,
Christian.



  arch/alpha/kernel/ptrace.c |  9 ++--
  arch/blackfin/kernel/ptrace.c  |  5 ++-
  arch/cris/arch-v32/drivers/cryptocop.c |  4 +-
  arch/cris/arch-v32/kernel/ptrace.c |  4 +-
  arch/ia64/kernel/err_inject.c  |  2 +-
  arch/ia64/kernel/ptrace.c  | 14 +++---
  arch/m32r/kernel/ptrace.c  | 15 ---
  arch/mips/kernel/ptrace32.c|  5 ++-
  arch/mips/mm/gup.c |  2 +-
  arch/powerpc/kernel/ptrace32.c |  5 ++-
  arch/s390/mm/gup.c |  3 +-
  arch/score/kernel/ptrace.c | 10 +++--
  arch/sh/mm/gup.c   |  3 +-
  arch/sparc/kernel/ptrace_64.c  | 24 +++
  arch/sparc/mm/gup.c|  3 +-
  arch/x86/kernel/step.c |  3 +-
  arch/x86/mm/gup.c  |  2 +-
  arch/x86/mm/mpx.c  |  5 +--
  arch/x86/um/ptrace_32.c|  3 +-
  arch/x86/um/ptrace_64.c|  3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  7 ++-
  drivers/gpu/drm/etnaviv/etnaviv_gem.c  |  7 ++-
  drivers/gpu/drm/exynos/exynos_drm_g2d.c|  3 +-
  drivers/gpu/drm/i915/i915_gem_userptr.c|  6 ++-
  drivers/gpu/drm/radeon/radeon_ttm.c|  3 +-
  drivers/gpu/drm/via/via_dmablit.c  |  4 +-
  drivers/infiniband/core/umem.c |  6 ++-
  drivers/infiniband/core/umem_odp.c |  7 ++-
  drivers/infiniband/hw/mthca/mthca_memfree.c|  2 +-
  drivers/infiniband/hw/qib/qib_user_pages.c |  3 +-
  drivers/infiniband/hw/usnic/usnic_uiom.c   |  5 ++-
  drivers/media/pci/ivtv/ivtv-udma.c |  4 +-
  drivers/media/pci/ivtv/ivtv-yuv.c  |  5 ++-
  drivers/media/platform/omap/omap_vout.c|  2 +-
  drivers/media/v4l2-core/videobuf-dma-sg.c  |  7 ++-
  drivers/media/v4l2-core/videobuf2-memops.c |  6 ++-
  drivers/misc/mic/scif/scif_rma.c   |  3 +-
  drivers/misc/sgi-gru/grufault.c|  2 +-
  drivers/platform/goldfish/goldfish_pipe.c  |  3 +-
  drivers/rapidio/devices/rio_mport_cdev.c   |  3 +-
  drivers/scsi/st.c  |  5 +--
  .../interface/vchiq_arm/vchiq_2835_arm.c   |  3 +-
  .../vc04_services/interface/vchiq_arm/vchiq_arm.c  |  3 +-
  drivers/video/fbdev/pvr2fb.c   |  4 +-
  drivers/virt/fsl_hypervisor.c  |  4 +-
  fs/exec.c  |  9 +++-
  fs/proc/base.c | 19 +---
  include/linu

Re: [patch net-next 0/3] net/sched: Improve getting objects by indexes

2017-08-16 Thread Christian König

Am 16.08.2017 um 04:12 schrieb Chris Mi:

Using current TC code, it is very slow to insert a lot of rules.

In order to improve the rules update rate in TC,
we introduced the following two changes:
 1) changed cls_flower to use IDR to manage the filters.
 2) changed all act_xxx modules to use IDR instead of
a small hash table

But IDR has a limitation that it uses int. TC handle uses u32.
To make sure there is no regression, we also changed IDR to use
unsigned long. All clients of IDR are changed to use new IDR API.


WOW, wait a second. The idr change is touching a lot of drivers and to 
be honest doesn't looks correct at all.


Just look at the first chunk of your modification:

@@ -998,8 +999,9 @@ int bsg_register_queue(struct request_queue *q, struct 
device *parent,
  
  	mutex_lock(_mutex);
  
-	ret = idr_alloc(_minor_idr, bcd, 0, BSG_MAX_DEVS, GFP_KERNEL);

-   if (ret < 0) {
+   ret = idr_alloc(_minor_idr, bcd, _index, 0, BSG_MAX_DEVS,
+   GFP_KERNEL);
+   if (ret) {
if (ret == -ENOSPC) {
printk(KERN_ERR "bsg: too many bsg devices\n");
ret = -EINVAL;
The condition "if (ret)" will now always be true after the first 
allocation and so we always run into the error handling after that.


I've never read the bsg code before, but that's certainly not correct. 
And that incorrect pattern repeats over and over again in this code.


Apart from that why the heck do you want to allocate more than 1<<31 
handles?


Regards,
Christian.



Chris Mi (3):
   idr: Use unsigned long instead of int
   net/sched: Change cls_flower to use IDR
   net/sched: Change act_api and act_xxx modules to use IDR

  block/bsg.c |   8 +-
  block/genhd.c   |  12 +-
  drivers/atm/nicstar.c   |  11 +-
  drivers/block/drbd/drbd_main.c  |  31 +--
  drivers/block/drbd/drbd_nl.c|  22 ++-
  drivers/block/drbd/drbd_proc.c  |   3 +-
  drivers/block/drbd/drbd_receiver.c  |  15 +-
  drivers/block/drbd/drbd_state.c |  34 ++--
  drivers/block/drbd/drbd_worker.c|   6 +-
  drivers/block/loop.c|  17 +-
  drivers/block/nbd.c |  20 +-
  drivers/block/zram/zram_drv.c   |   9 +-
  drivers/char/tpm/tpm-chip.c |  10 +-
  drivers/char/tpm/tpm.h  |   2 +-
  drivers/dca/dca-sysfs.c |   9 +-
  drivers/firewire/core-cdev.c|  18 +-
  drivers/firewire/core-device.c  |  15 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c |   8 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c |   9 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |   2 +-
  drivers/gpu/drm/drm_auth.c  |   9 +-
  drivers/gpu/drm/drm_connector.c |  10 +-
  drivers/gpu/drm/drm_context.c   |  20 +-
  drivers/gpu/drm/drm_dp_aux_dev.c|  11 +-
  drivers/gpu/drm/drm_drv.c   |   6 +-
  drivers/gpu/drm/drm_gem.c   |  19 +-
  drivers/gpu/drm/drm_info.c  |   2 +-
  drivers/gpu/drm/drm_mode_object.c   |  11 +-
  drivers/gpu/drm/drm_syncobj.c   |  18 +-
  drivers/gpu/drm/exynos/exynos_drm_ipp.c |  25 ++-
  drivers/gpu/drm/i915/gvt/display.c  |   2 +-
  drivers/gpu/drm/i915/gvt/kvmgt.c|   2 +-
  drivers/gpu/drm/i915/gvt/vgpu.c |   9 +-
  drivers/gpu/drm/i915/i915_debugfs.c |   6 +-
  drivers/gpu/drm/i915/i915_gem_context.c |   9 +-
  drivers/gpu/drm/qxl/qxl_cmd.c   |   8 +-
  drivers/gpu/drm/qxl/qxl_release.c   |  14 +-
  drivers/gpu/drm/sis/sis_mm.c|   8 +-
  drivers/gpu/drm/tegra/drm.c |  10 +-
  drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c|   3 +-
  drivers/gpu/drm/vgem/vgem_fence.c   |  12 +-
  drivers/gpu/drm/via/via_mm.c|   8 +-
  drivers/gpu/drm/virtio/virtgpu_kms.c|   5 +-
  drivers/gpu/drm/virtio/virtgpu_vq.c |   5 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c|   9 +-
  drivers/i2c/i2c-core-base.c |  19 +-
  drivers/infiniband/core/cm.c|   8 +-
  drivers/infiniband/core/cma.c   |  12 +-
  drivers/infiniband/core/rdma_core.c |   9 +-
  drivers/infiniband/core/sa_query.c  |  23 +--
  drivers/infiniband/core/ucm.c   |   7 +-
  drivers/infiniband/core/ucma.c  |  14 +-
  drivers/infiniband/hw/cxgb3/iwch.c  |   4 +-
  drivers/infiniband/hw/cxgb3/iwch.h  |   4 +-
  

Re: consolidate swiotlb dma_map implementations

2018-01-10 Thread Christian König

Acked-by: Christian König <christian.koe...@amd.com> for the whole series.

Regards,
Christian.

Am 10.01.2018 um 09:09 schrieb Christoph Hellwig:

A lot of architectures have essentially identical dma_map_ops
implementations to use swiotlb.  This series adds new generic
swiotlb_alloc/free helpers that take the attrs argument exposed
in dma_map_ops, and which do an enhanced direct allocation
modelled after x86 and reused from the dma-direct code, and
then switches most architectures over to it.  The only exceptions
are mips, which requires additional cache flushing which will
need a new abstraction, and x86 itself which will be handled in
a later series with other x86 dma mapping changes.

To support the generic code a few architectures that currently
use ZONE_DMA/GFP_DMA for <= 32-bit allocations are switched to
implement ZONE_DMA32 instead.

This series is based on the previously sent series to consolidate
the direct dma mapping implementation.  A git tree with this
series as well as the prerequisites is available here:

git://git.infradead.org/users/hch/misc.git swiotlb

Gitweb:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb




Re: consolidate swiotlb dma_map implementations

2018-01-16 Thread Christian König

Am 16.01.2018 um 09:28 schrieb Christoph Hellwig:

On Tue, Jan 16, 2018 at 09:22:52AM +0100, Christian König wrote:

Hi Konrad,

can you send the first patch to Linus for inclusion in 4.15 if you haven't
already done so?

It's in the 4.16 queue with a cc to stable.  I guess we're ok with
a duplicate commit if we have to.


Yeah, while it's only a false positive warning it would be really nice 
to have in 4.15.


It affects all drivers using TTM in the system and not just the two I'm 
the maintainer of.


Regards,
Christian.


Re: consolidate swiotlb dma_map implementations

2018-01-16 Thread Christian König

Hi Konrad,

can you send the first patch to Linus for inclusion in 4.15 if you 
haven't already done so?


I'm still getting reports from people complaining about the error message.

Thanks,
Christian.

Am 16.01.2018 um 08:53 schrieb Christoph Hellwig:

I've pulled this into the dma-mapping for-next tree, including the
missing free_pages noted.  I'd be fine to rebase another day or two
for additional reviews or important fixes.




Re: [PATCH V1 10/10] drm: Remove drm specific kmap_atomic code

2020-05-01 Thread Christian König

Am 30.04.20 um 22:38 schrieb ira.we...@intel.com:

From: Ira Weiny 

kmap_atomic_prot() is now exported by all architectures.  Use this
function rather than open coding a driver specific kmap_atomic.

Signed-off-by: Ira Weiny 


Ah, yes looking into this once more this was on my TODO list for quite a 
while as well.


Patch is Reviewed-by: Christian König , feel 
free to push it upstream through whatever channel you like or ping me if 
I should pick it up into drm-misc-next.


Regards,
Christian.


---
  drivers/gpu/drm/ttm/ttm_bo_util.c| 56 ++--
  drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 16 
  include/drm/ttm/ttm_bo_api.h |  4 --
  3 files changed, 12 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 52d2b71f1588..f09b096ba4fd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -257,54 +257,6 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned 
long page)
return 0;
  }
  
-#ifdef CONFIG_X86

-#define __ttm_kmap_atomic_prot(__page, __prot) kmap_atomic_prot(__page, __prot)
-#define __ttm_kunmap_atomic(__addr) kunmap_atomic(__addr)
-#else
-#define __ttm_kmap_atomic_prot(__page, __prot) vmap(&__page, 1, 0,  __prot)
-#define __ttm_kunmap_atomic(__addr) vunmap(__addr)
-#endif
-
-
-/**
- * ttm_kmap_atomic_prot - Efficient kernel map of a single page with
- * specified page protection.
- *
- * @page: The page to map.
- * @prot: The page protection.
- *
- * This function maps a TTM page using the kmap_atomic api if available,
- * otherwise falls back to vmap. The user must make sure that the
- * specified page does not have an aliased mapping with a different caching
- * policy unless the architecture explicitly allows it. Also mapping and
- * unmapping using this api must be correctly nested. Unmapping should
- * occur in the reverse order of mapping.
- */
-void *ttm_kmap_atomic_prot(struct page *page, pgprot_t prot)
-{
-   if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
-   return kmap_atomic(page);
-   else
-   return __ttm_kmap_atomic_prot(page, prot);
-}
-EXPORT_SYMBOL(ttm_kmap_atomic_prot);
-
-/**
- * ttm_kunmap_atomic_prot - Unmap a page that was mapped using
- * ttm_kmap_atomic_prot.
- *
- * @addr: The virtual address from the map.
- * @prot: The page protection.
- */
-void ttm_kunmap_atomic_prot(void *addr, pgprot_t prot)
-{
-   if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
-   kunmap_atomic(addr);
-   else
-   __ttm_kunmap_atomic(addr);
-}
-EXPORT_SYMBOL(ttm_kunmap_atomic_prot);
-
  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
unsigned long page,
pgprot_t prot)
@@ -316,13 +268,13 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void 
*src,
return -ENOMEM;
  
  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));

-   dst = ttm_kmap_atomic_prot(d, prot);
+   dst = kmap_atomic_prot(d, prot);
if (!dst)
return -ENOMEM;
  
  	memcpy_fromio(dst, src, PAGE_SIZE);
  
-	ttm_kunmap_atomic_prot(dst, prot);

+   kunmap_atomic(dst);
  
  	return 0;

  }
@@ -338,13 +290,13 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void 
*dst,
return -ENOMEM;
  
  	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));

-   src = ttm_kmap_atomic_prot(s, prot);
+   src = kmap_atomic_prot(s, prot);
if (!src)
return -ENOMEM;
  
  	memcpy_toio(dst, src, PAGE_SIZE);
  
-	ttm_kunmap_atomic_prot(src, prot);

+   kunmap_atomic(src);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
index bb46ca0c458f..94d456a1d1a9 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -374,12 +374,12 @@ static int vmw_bo_cpu_blit_line(struct 
vmw_bo_blit_line_data *d,
copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
  
  		if (unmap_src) {

-   ttm_kunmap_atomic_prot(d->src_addr, d->src_prot);
+   kunmap_atomic(d->src_addr);
d->src_addr = NULL;
}
  
  		if (unmap_dst) {

-   ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot);
+   kunmap_atomic(d->dst_addr);
d->dst_addr = NULL;
}
  
@@ -388,8 +388,8 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d,

return -EINVAL;
  
  			d->dst_addr =

-   ttm_kmap_atomic_prot(d->dst_pages[dst_page],
-d->dst_prot);
+   kmap_atomic_prot(d->dst_pages[dst_page],
+   

Re: [PATCH v1 12/15] powerpc/uaccess: Refactor get/put_user() and __get/put_user()

2021-03-08 Thread Christian König
The radeon warning is trivial to fix, going to send out a patch in a few 
moments.


Regards,
Christian.

Am 08.03.21 um 13:14 schrieb Christophe Leroy:

+Evgeniy for W1 Dallas
+Alex & Christian for RADEON

Le 07/03/2021 à 11:23, kernel test robot a écrit :

Hi Christophe,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on v5.12-rc2 next-20210305]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: 
https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-Cleanup-of-uaccess-h/20210226-015715
base: 
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next

config: powerpc-randconfig-s031-20210307 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce:
 wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross 
-O ~/bin/make.cross

 chmod +x ~/bin/make.cross
 # apt-get install sparse
 # sparse version: v0.6.3-245-gacc5c298-dirty
 # 
https://github.com/0day-ci/linux/commit/449bdbf978936e67e4919be8be0eec3e490a65e2

 git remote add linux-review https://github.com/0day-ci/linux
 git fetch --no-tags linux-review 
Christophe-Leroy/powerpc-Cleanup-of-uaccess-h/20210226-015715

 git checkout 449bdbf978936e67e4919be8be0eec3e490a65e2
 # save the attached .config to linux build tree
 COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 
make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc


If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 



The mentioned patch is not the source of the problem, it only allows 
to spot it.


Christophe




"sparse warnings: (new ones prefixed by >>)"
drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: sparse: incorrect 
type in initializer (different address spaces) @@ expected char 
[noderef] __user *_pu_addr @@ got char *buf @@
    drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: expected char 
[noderef] __user *_pu_addr

    drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: got char *buf
drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: sparse: incorrect 
type in initializer (different address spaces) @@ expected char 
const [noderef] __user *_gu_addr @@ got char const *buf @@
    drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: expected char 
const [noderef] __user *_gu_addr
    drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: got char const 
*buf

--
    drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast 
removes address space '__user' of expression
    drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast 
removes address space '__user' of expression
drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: 
incorrect type in initializer (different address spaces) @@ 
expected unsigned int [noderef] __user *_pu_addr @@ got 
unsigned int [usertype] * @@
    drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: expected 
unsigned int [noderef] __user *_pu_addr
    drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: got 
unsigned int [usertype] *
    drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast 
removes address space '__user' of expression


vim +342 drivers/w1/slaves/w1_ds28e04.c

fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21  338
fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21  339  static ssize_t 
crccheck_show(struct device *dev, struct device_attribute *attr,
fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21 340   
char *buf)

fbf7f7b4e2ae40 Markus Franke  2012-05-26  341  {
fbf7f7b4e2ae40 Markus Franke  2012-05-26 @342  if 
(put_user(w1_enable_crccheck + 0x30, buf))

fbf7f7b4e2ae40 Markus Franke  2012-05-26  343 return -EFAULT;
fbf7f7b4e2ae40 Markus Franke  2012-05-26  344
fbf7f7b4e2ae40 Markus Franke  2012-05-26  345  return 
sizeof(w1_enable_crccheck);

fbf7f7b4e2ae40 Markus Franke  2012-05-26  346  }
fbf7f7b4e2ae40 Markus Franke  2012-05-26  347
fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21  348  static ssize_t 
crccheck_store(struct device *dev, struct device_attribute *attr,
fbf7f7b4e2ae40 Markus Franke  2012-05-26 349    
const char *buf, size_t count)

fbf7f7b4e2ae40 Markus Franke  2012-05-26  350  {
fbf7f7b4e2ae40 Markus Franke  2012-05-26  351  char val;
fbf7f7b4e2ae40 Markus Franke  2012-05-26  352
fbf7f7b4e2ae40 Markus Franke  2012-05-26  353  if (count != 1 
|| !buf)

fbf7f7b4e2ae40 Markus Franke  2012-05-26  354 return -EINVAL;
fbf7f7b4e2ae40 Markus Franke  2012-05-26  355
fbf7f7b4e2ae40 Markus Franke  2012-05-26 @356  if 
(get_user(val, buf))

fbf7f7b4e2ae40 Markus Franke  2012-05-26  357 return -EFAULT;
fbf7f7b4e2ae40 Markus Franke  2012-05-26  358
fbf7f7b4e2ae40 Markus Franke  

Re: [RFC v2 2/2] drm/amd/display: Use PPC FPU functions

2021-07-21 Thread Christian König

Am 21.07.21 um 06:48 schrieb Anson Jacob:

Use kernel_fpu_begin & kernel_fpu_end for PPC

Depends on "ppc/fpu: Add generic FPU api similar to x86"

v2:
- Got rid of macro switch for PPC as header file with same
   name as x86 is added by previous patch in the series

Signed-off-by: Anson Jacob 
CC: Christoph Hellwig 
CC: Rodrigo Siqueira 
CC: Harry Wentland 
CC: Christian König 


Looks good in general, but question is what about other architectures 
like ARM?


Regards,
Christian.


---
  drivers/gpu/drm/amd/display/dc/os_types.h | 29 ---
  1 file changed, 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h 
b/drivers/gpu/drm/amd/display/dc/os_types.h
index 126c2f3a4dd3..47ef434f93d8 100644
--- a/drivers/gpu/drm/amd/display/dc/os_types.h
+++ b/drivers/gpu/drm/amd/display/dc/os_types.h
@@ -51,38 +51,9 @@
  #define dm_error(fmt, ...) DRM_ERROR(fmt, ##__VA_ARGS__)
  
  #if defined(CONFIG_DRM_AMD_DC_DCN)

-#if defined(CONFIG_X86)
  #include 
  #define DC_FP_START() kernel_fpu_begin()
  #define DC_FP_END() kernel_fpu_end()
-#elif defined(CONFIG_PPC64)
-#include 
-#include 
-#define DC_FP_START() { \
-   if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
-   preempt_disable(); \
-   enable_kernel_vsx(); \
-   } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
-   preempt_disable(); \
-   enable_kernel_altivec(); \
-   } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
-   preempt_disable(); \
-   enable_kernel_fp(); \
-   } \
-}
-#define DC_FP_END() { \
-   if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
-   disable_kernel_vsx(); \
-   preempt_enable(); \
-   } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
-   disable_kernel_altivec(); \
-   preempt_enable(); \
-   } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
-   disable_kernel_fp(); \
-   preempt_enable(); \
-   } \
-}
-#endif
  #endif
  
  /*




Re: [RFC v2 2/2] drm/amd/display: Use PPC FPU functions

2021-07-21 Thread Christian König

Am 21.07.21 um 08:51 schrieb Christoph Hellwig:

On Wed, Jul 21, 2021 at 08:29:43AM +0200, Christian K??nig wrote:

Looks good in general, but question is what about other architectures like
ARM?

DRM_AMD_DC_DCN currently requires X86 || PPC64.


And exactly that's the problem I'm noting here. At least officially AMD 
claims that we support ARM and some very brave still use the hardware 
together with MIPS as well.



Maybe a good think would be to add a new KERNEL_FPU_API Kconfig symbol,
selected by x86 and powerpc (I think ppc32 should be fine too now) so
that we get these arch dependencies out of the driver.


Good idea.

Christian.


Re: [RFC 0/2] Add generic FPU api similar to x86

2021-07-20 Thread Christian König
While you already CCed a bunch of people stuff like that needs to go to 
the appropriate mailing list and not just amd-gfx.


Especially LKML so that other core devs can take a look as well.

Regards,
Christian.

Am 19.07.21 um 21:52 schrieb Anson Jacob:

This is an attempt to have generic FPU enable/disable
calls similar to x86.
So that we can simplify gpu/drm/amd/display/dc/os_types.h

Also adds FPU correctness logic seen in x86.

Anson Jacob (2):
   ppc/fpu: Add generic FPU api similar to x86
   drm/amd/display: Use PPC FPU functions

  arch/powerpc/include/asm/switch_to.h  |  29 ++---
  arch/powerpc/kernel/process.c | 130 ++
  drivers/gpu/drm/amd/display/dc/os_types.h |  28 +
  3 files changed, 139 insertions(+), 48 deletions(-)





Re: [PATCH 00/11] Implement generic prot_guest_has() helper function

2021-07-28 Thread Christian König

Am 28.07.21 um 00:26 schrieb Tom Lendacky:

This patch series provides a generic helper function, prot_guest_has(),
to replace the sme_active(), sev_active(), sev_es_active() and
mem_encrypt_active() functions.

It is expected that as new protected virtualization technologies are
added to the kernel, they can all be covered by a single function call
instead of a collection of specific function calls all called from the
same locations.

The powerpc and s390 patches have been compile tested only. Can the
folks copied on this series verify that nothing breaks for them.


As GPU driver dev I'm only one end user of this, but at least from the 
high level point of view that makes totally sense to me.


Feel free to add an Acked-by: Christian König .

We could run that through the AMD GPU unit tests, but I fear we actually 
don't test on a system with SEV/SME active.


Going to raise that on our team call today.

Regards,
Christian.



Cc: Andi Kleen 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Baoquan He 
Cc: Benjamin Herrenschmidt 
Cc: Borislav Petkov 
Cc: Christian Borntraeger 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: Dave Young 
Cc: David Airlie 
Cc: Heiko Carstens 
Cc: Ingo Molnar 
Cc: Joerg Roedel 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: Vasily Gorbik 
Cc: VMware Graphics 
Cc: Will Deacon 

---

Patches based on:
   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
   commit 79e920060fa7 ("Merge branch 'WIP/fixes'")

Tom Lendacky (11):
   mm: Introduce a function to check for virtualization protection
 features
   x86/sev: Add an x86 version of prot_guest_has()
   powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
   x86/sme: Replace occurrences of sme_active() with prot_guest_has()
   x86/sev: Replace occurrences of sev_active() with prot_guest_has()
   x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
   treewide: Replace the use of mem_encrypt_active() with
 prot_guest_has()
   mm: Remove the now unused mem_encrypt_active() function
   x86/sev: Remove the now unused mem_encrypt_active() function
   powerpc/pseries/svm: Remove the now unused mem_encrypt_active()
 function
   s390/mm: Remove the now unused mem_encrypt_active() function

  arch/Kconfig   |  3 ++
  arch/powerpc/include/asm/mem_encrypt.h |  5 --
  arch/powerpc/include/asm/protected_guest.h | 30 +++
  arch/powerpc/platforms/pseries/Kconfig |  1 +
  arch/s390/include/asm/mem_encrypt.h|  2 -
  arch/x86/Kconfig   |  1 +
  arch/x86/include/asm/kexec.h   |  2 +-
  arch/x86/include/asm/mem_encrypt.h | 13 +
  arch/x86/include/asm/protected_guest.h | 27 ++
  arch/x86/kernel/crash_dump_64.c|  4 +-
  arch/x86/kernel/head64.c   |  4 +-
  arch/x86/kernel/kvm.c  |  3 +-
  arch/x86/kernel/kvmclock.c |  4 +-
  arch/x86/kernel/machine_kexec_64.c | 19 +++
  arch/x86/kernel/pci-swiotlb.c  |  9 ++--
  arch/x86/kernel/relocate_kernel_64.S   |  2 +-
  arch/x86/kernel/sev.c  |  6 +--
  arch/x86/kvm/svm/svm.c |  3 +-
  arch/x86/mm/ioremap.c  | 16 +++---
  arch/x86/mm/mem_encrypt.c  | 60 +++---
  arch/x86/mm/mem_encrypt_identity.c |  3 +-
  arch/x86/mm/pat/set_memory.c   |  3 +-
  arch/x86/platform/efi/efi_64.c |  9 ++--
  arch/x86/realmode/init.c   |  8 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +-
  drivers/gpu/drm/drm_cache.c|  4 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c|  4 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_msg.c|  6 +--
  drivers/iommu/amd/init.c   |  7 +--
  drivers/iommu/amd/iommu.c  |  3 +-
  drivers/iommu/amd/iommu_v2.c   |  3 +-
  drivers/iommu/iommu.c  |  3 +-
  fs/proc/vmcore.c   |  6 +--
  include/linux/mem_encrypt.h|  4 --
  include/linux/protected_guest.h| 37 +
  kernel/dma/swiotlb.c   |  4 +-
  36 files changed, 218 insertions(+), 104 deletions(-)
  create mode 100644 arch/powerpc/include/asm/protected_guest.h
  create mode 100644 arch/x86/include/asm/protected_guest.h
  create mode 100644 include/linux/protected_guest.h





Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Christian König

Am 28.02.22 um 12:08 schrieb Jakob Koschel:

If the list does not contain the expected element, the value of
list_for_each_entry() iterator will not point to a valid structure.
To avoid type confusion in such case, the list iterator
scope will be limited to list_for_each_entry() loop.


We explicitly have the list_entry_is_head() macro to test after a loop 
if the element pointer points to the head of the list instead of a valid 
list entry.


So at least from my side I absolutely don't think that this is a good idea.


In preparation to limiting scope of a list iterator to the list traversal
loop, use a dedicated pointer to point to the found element.
Determining if an element was found is then simply checking if
the pointer is != NULL.


Since when do we actually want to do this?

Take this code here as an example:

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 48afe96ae0f0..6c916416decc 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -450,7 +450,8 @@ static void sgx_mmu_notifier_release(struct mmu_notifier 
*mn,
 struct mm_struct *mm)
  {
struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, 
mmu_notifier);
-   struct sgx_encl_mm *tmp = NULL;
+   struct sgx_encl_mm *found_encl_mm = NULL;
+   struct sgx_encl_mm *tmp;

/*
 * The enclave itself can remove encl_mm.  Note, objects can't be moved
@@ -460,12 +461,13 @@ static void sgx_mmu_notifier_release(struct mmu_notifier 
*mn,
list_for_each_entry(tmp, _mm->encl->mm_list, list) {
if (tmp == encl_mm) {
list_del_rcu(_mm->list);
+   found_encl_mm = tmp;
break;
}
}
spin_unlock(_mm->encl->mm_lock);

-   if (tmp == encl_mm) {
+   if (found_encl_mm) {
synchronize_srcu(_mm->encl->srcu);
mmu_notifier_put(mn);
}


I don't think that using the extra variable makes the code in any way 
more reliable or easier to read.


Regards,
Christian.


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Christian König

Am 28.02.22 um 20:56 schrieb Linus Torvalds:

On Mon, Feb 28, 2022 at 4:19 AM Christian König
 wrote:

I don't think that using the extra variable makes the code in any way
more reliable or easier to read.

So I think the next step is to do the attached patch (which requires
that "-std=gnu11" that was discussed in the original thread).

That will guarantee that the 'pos' parameter of list_for_each_entry()
is only updated INSIDE the for_each_list_entry() loop, and can never
point to the (wrongly typed) head entry.

And I would actually hope that it should actually cause compiler
warnings about possibly uninitialized variables if people then use the
'pos' pointer outside the loop. Except

  (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for
inexplicable reasons - possibly because it already expected this
behavior

  (b) when I remove that NULL initializer, I still don't get a warning,
because we've disabled -Wno-maybe-uninitialized since it results in so
many false positives.

Oh well.

Anyway, give this patch a look, and at least if it's expanded to do
"(pos) = NULL" in the entry statement for the for-loop, it will avoid
the HEAD type confusion that Jakob is working on. And I think in a
cleaner way than the horrid games he plays.

(But it won't avoid possible CPU speculation of such type confusion.
That, in my opinion, is a completely different issue)


Yes, completely agree.


I do wish we could actually poison the 'pos' value after the loop
somehow - but clearly the "might be uninitialized" I was hoping for
isn't the way to do it.

Anybody have any ideas?


I think we should look at the use cases why code is touching (pos) after 
the loop.


Just from skimming over the patches to change this and experience with 
the drivers/subsystems I help to maintain I think the primary pattern 
looks something like this:


list_for_each_entry(entry, head, member) {
    if (some_condition_checking(entry))
        break;
}
do_something_with(entry);

So the solution should probably not be to change all those use cases to 
use more temporary variables, but rather to add a list_find_entry(..., 
condition) macro and consistently use that one instead.


Regards,
Christian.



 Linus




Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Christian König

Am 28.02.22 um 22:13 schrieb James Bottomley:

On Mon, 2022-02-28 at 21:56 +0100, Christian König wrote:

Am 28.02.22 um 21:42 schrieb James Bottomley:

On Mon, 2022-02-28 at 21:07 +0100, Christian König wrote:

Am 28.02.22 um 20:56 schrieb Linus Torvalds:

On Mon, Feb 28, 2022 at 4:19 AM Christian König
 wrote:
[SNIP]
Anybody have any ideas?

I think we should look at the use cases why code is touching
(pos)
after the loop.

Just from skimming over the patches to change this and experience
with the drivers/subsystems I help to maintain I think the
primary pattern looks something like this:

list_for_each_entry(entry, head, member) {
   if (some_condition_checking(entry))
   break;
}
do_something_with(entry);

Actually, we usually have a check to see if the loop found
anything, but in that case it should something like

if (list_entry_is_head(entry, head, member)) {
  return with error;
}
do_somethin_with(entry);

Suffice?  The list_entry_is_head() macro is designed to cope with
the bogus entry on head problem.

That will work and is also what people already do.

The key problem is that we let people do the same thing over and
over again with slightly different implementations.

Out in the wild I've seen at least using a separate variable, using
a bool to indicate that something was found and just assuming that
the list has an entry.

The last case is bogus and basically what can break badly.

Yes, I understand that.  I'm saying we should replace that bogus checks
of entry->something against some_value loop termination condition with
the list_entry_is_head() macro.  That should be a one line and fairly
mechanical change rather than the explosion of code changes we seem to
have in the patch series.


Yes, exactly that's my thinking as well.

Christian.



James






Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread Christian König




Am 28.02.22 um 21:42 schrieb James Bottomley:

On Mon, 2022-02-28 at 21:07 +0100, Christian König wrote:

Am 28.02.22 um 20:56 schrieb Linus Torvalds:

On Mon, Feb 28, 2022 at 4:19 AM Christian König
 wrote:
[SNIP]
Anybody have any ideas?

I think we should look at the use cases why code is touching (pos)
after the loop.

Just from skimming over the patches to change this and experience
with the drivers/subsystems I help to maintain I think the primary
pattern looks something like this:

list_for_each_entry(entry, head, member) {
  if (some_condition_checking(entry))
  break;
}
do_something_with(entry);


Actually, we usually have a check to see if the loop found anything,
but in that case it should something like

if (list_entry_is_head(entry, head, member)) {
 return with error;
}
do_somethin_with(entry);

Suffice?  The list_entry_is_head() macro is designed to cope with the
bogus entry on head problem.


That will work and is also what people already do.

The key problem is that we let people do the same thing over and over 
again with slightly different implementations.


Out in the wild I've seen at least using a separate variable, using a 
bool to indicate that something was found and just assuming that the 
list has an entry.


The last case is bogus and basically what can break badly.

If we would have an unified macro which search for an entry combined 
with automated reporting on patches to use that macro I think the 
potential to introduce such issues will already go down massively 
without auditing tons of existing code.


Regards,
Christian.



James






Re: [PATCH v4 00/15] Unified cross-architecture kernel-mode FPU API

2024-04-03 Thread Christian König
I only skimmed over the platform patches and spend only a few minutes on 
the amdgpu stuff.


From what I've seen this series seems to make perfect sense to me, I 
just can't fully judge everything.


So feel free to add Acked-by: Christian König  
but I strongly suggest that Harry and Rodrigo take a look as well.


Regards,
Christian.

Am 29.03.24 um 08:18 schrieb Samuel Holland:

This series unifies the kernel-mode FPU API across several architectures
by wrapping the existing functions (where needed) in consistently-named
functions placed in a consistent header location, with mostly the same
semantics: they can be called from preemptible or non-preemptible task
context, and are not assumed to be reentrant. Architectures are also
expected to provide CFLAGS adjustments for compiling FPU-dependent code.
For the moment, SIMD/vector units are out of scope for this common API.

This allows us to remove the ifdeffery and duplicated Makefile logic at
each FPU user. It then implements the common API on RISC-V, and converts
a couple of users to the new API: the AMDGPU DRM driver, and the FPU
self test.

The underlying goal of this series is to allow using newer AMD GPUs
(e.g. Navi) on RISC-V boards such as SiFive's HiFive Unmatched. Those
GPUs need CONFIG_DRM_AMD_DC_FP to initialize, which requires kernel-mode
FPU support.

Previous versions:
v3: 
https://lore.kernel.org/linux-kernel/20240327200157.1097089-1-samuel.holl...@sifive.com/
v2: 
https://lore.kernel.org/linux-kernel/20231228014220.3562640-1-samuel.holl...@sifive.com/
v1: 
https://lore.kernel.org/linux-kernel/20231208055501.2916202-1-samuel.holl...@sifive.com/
v0: 
https://lore.kernel.org/linux-kernel/20231122030621.3759313-1-samuel.holl...@sifive.com/

Changes in v4:
  - Add missed CFLAGS changes for recov_neon_inner.c
(fixes arm build failures)
  - Fix x86 include guard issue (fixes x86 build failures)

Changes in v3:
  - Rebase on v6.9-rc1
  - Limit riscv ARCH_HAS_KERNEL_FPU_SUPPORT to 64BIT

Changes in v2:
  - Add documentation explaining the built-time and runtime APIs
  - Add a linux/fpu.h header for generic isolation enforcement
  - Remove file name from header comment
  - Clean up arch/arm64/lib/Makefile, like for arch/arm
  - Remove RISC-V architecture-specific preprocessor check
  - Split altivec removal to a separate patch
  - Use linux/fpu.h instead of asm/fpu.h in consumers
  - Declare test_fpu() in a header

Michael Ellerman (1):
   drm/amd/display: Only use hard-float, not altivec on powerpc

Samuel Holland (14):
   arch: Add ARCH_HAS_KERNEL_FPU_SUPPORT
   ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
   ARM: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
   arm64: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
   arm64: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
   lib/raid6: Use CC_FLAGS_FPU for NEON CFLAGS
   LoongArch: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
   powerpc: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
   x86/fpu: Fix asm/fpu/types.h include guard
   x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
   riscv: Add support for kernel-mode FPU
   drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT
   selftests/fpu: Move FP code to a separate translation unit
   selftests/fpu: Allow building on other architectures

  Documentation/core-api/floating-point.rst | 78 +++
  Documentation/core-api/index.rst  |  1 +
  Makefile  |  5 ++
  arch/Kconfig  |  6 ++
  arch/arm/Kconfig  |  1 +
  arch/arm/Makefile |  7 ++
  arch/arm/include/asm/fpu.h| 15 
  arch/arm/lib/Makefile |  3 +-
  arch/arm64/Kconfig|  1 +
  arch/arm64/Makefile   |  9 ++-
  arch/arm64/include/asm/fpu.h  | 15 
  arch/arm64/lib/Makefile   |  6 +-
  arch/loongarch/Kconfig|  1 +
  arch/loongarch/Makefile   |  5 +-
  arch/loongarch/include/asm/fpu.h  |  1 +
  arch/powerpc/Kconfig  |  1 +
  arch/powerpc/Makefile |  5 +-
  arch/powerpc/include/asm/fpu.h| 28 +++
  arch/riscv/Kconfig|  1 +
  arch/riscv/Makefile   |  3 +
  arch/riscv/include/asm/fpu.h  | 16 
  arch/riscv/kernel/Makefile|  1 +
  arch/riscv/kernel/kernel_mode_fpu.c   | 28 +++
  arch/x86/Kconfig  |  1 +
  arch/x86/Makefile | 20 +
  arch/x86/include/asm/fpu.h| 13 
  arch/x86/include/asm/fpu/types.h  |  6 +-
  drivers/gpu/drm/amd/display/Kconfig   |  2 +-
  .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 35 +
  drivers/gpu/drm/amd/display/dc/dml/Makefile   | 36 +
  drivers/gpu/drm/amd/display/dc/dml2/Makefile  | 36