[PATCH v17 3/5] firmware: xilinx: Add RPU configuration APIs

2020-10-01 Thread Ben Levinsky
This patch adds APIs to configure the RPU and its
processor-specific memory.

That is, add an API to query the run-time mode of the RPU, either split
or lockstep, as well as an API to set this mode. In addition, add APIs
to configure the RPUs' tightly coupled memory (TCM).

Signed-off-by: Ben Levinsky 
---
v3:
- add xilinx-related platform mgmt fn's instead of wrapping around
  function pointer in xilinx eemi ops struct
v4:
- add default values for enums
v9:
- update commit message
- for zynqmp_pm_set_tcm_config and zynqmp_pm_get_rpu_mode update docs for
  expected output and arguments, as well as removing unused args
- remove unused fn zynqmp_pm_get_node_status
v11:
- update usage of zynqmp_pm_get_rpu_mode to return rpu mode in enum
- update zynqmp_pm_set_tcm_config and zynqmp_pm_set_rpu_mode arguments to 
remove unused args
v12:
- in drivers/firmware/zynqmp.c, update zynqmp_pm_set_rpu_mode so rpu_mode
  is only set if no error
- update args for zynqmp_pm_set_rpu_mode and zynqmp_pm_set_tcm_config to
  reflect what is expected in the function and the usage in
  zynqmp_r5_remoteproc accordingly
- zynqmp_pm_get_rpu_mode argument rpu_mode is
  only set if no error
---
 drivers/firmware/xilinx/zynqmp.c | 61 
 include/linux/firmware/xlnx-zynqmp.h | 18 
 2 files changed, 79 insertions(+)

diff --git a/drivers/firmware/xilinx/zynqmp.c b/drivers/firmware/xilinx/zynqmp.c
index a966ee956573..b390a00338d0 100644
--- a/drivers/firmware/xilinx/zynqmp.c
+++ b/drivers/firmware/xilinx/zynqmp.c
@@ -846,6 +846,67 @@ int zynqmp_pm_release_node(const u32 node)
 }
 EXPORT_SYMBOL_GPL(zynqmp_pm_release_node);
 
+/**
+ * zynqmp_pm_get_rpu_mode() - Get RPU mode
+ * @node_id:   Node ID of the device
+ * @rpu_mode:  Returned by reference;
+ * either split or lockstep
+ *
+ * Return: 0 on success or error+reason.
+ * On success, rpu_mode is set
+ * to the current RPU mode.
+ */
+int zynqmp_pm_get_rpu_mode(u32 node_id, enum rpu_oper_mode *rpu_mode)
+{
+   u32 ret_payload[PAYLOAD_ARG_CNT];
+   int ret;
+
+   ret = zynqmp_pm_invoke_fn(PM_IOCTL, node_id,
+ IOCTL_GET_RPU_OPER_MODE, 0, 0, ret_payload);
+
+   /* only set rpu_mode if no error */
+   if (ret == XST_PM_SUCCESS)
+   *rpu_mode = ret_payload[0];
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_get_rpu_mode);
+
+/**
+ * zynqmp_pm_set_rpu_mode() - Set RPU mode
+ * @node_id:   Node ID of the device
+ * @rpu_mode:  Requested RPU operation mode; either split or lockstep
+ *
+ * This function is used to set RPU mode to split or
+ * lockstep
+ *
+ * Return: Returns status, either success or error+reason
+ */
+int zynqmp_pm_set_rpu_mode(u32 node_id, enum rpu_oper_mode rpu_mode)
+{
+   return zynqmp_pm_invoke_fn(PM_IOCTL, node_id,
+  IOCTL_SET_RPU_OPER_MODE, (u32)rpu_mode,
+  0, NULL);
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_set_rpu_mode);
+
+/**
+ * zynqmp_pm_set_tcm_config() - configure TCM
+ * @node_id:   Node ID of the device
+ * @tcm_mode:  either PM_RPU_TCM_COMB or PM_RPU_TCM_SPLIT
+ *
+ * This function is used to set the TCM configuration to combined or split
+ *
+ * Return: 0 on success, else failure
+ */
+int zynqmp_pm_set_tcm_config(u32 node_id, enum rpu_tcm_comb tcm_mode)
+{
+   return zynqmp_pm_invoke_fn(PM_IOCTL, node_id,
+  IOCTL_TCM_COMB_CONFIG, (u32)tcm_mode, 0,
+  NULL);
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_set_tcm_config);
+
 /**
  * zynqmp_pm_force_pwrdwn - PM call to request for another PU or subsystem to
  * be powered down forcefully
diff --git a/include/linux/firmware/xlnx-zynqmp.h b/include/linux/firmware/xlnx-zynqmp.h
index 6241c5ac51b3..79aa2fcbcd54 100644
--- a/include/linux/firmware/xlnx-zynqmp.h
+++ b/include/linux/firmware/xlnx-zynqmp.h
@@ -385,6 +385,9 @@ int zynqmp_pm_request_wake(const u32 node,
   const bool set_addr,
   const u64 address,
   const enum zynqmp_pm_request_ack ack);
+int zynqmp_pm_get_rpu_mode(u32 node_id, enum rpu_oper_mode *rpu_mode);
+int zynqmp_pm_set_rpu_mode(u32 node_id, enum rpu_oper_mode rpu_mode);
+int zynqmp_pm_set_tcm_config(u32 node_id, enum rpu_tcm_comb tcm_mode);
 #else
 static inline struct zynqmp_eemi_ops *zynqmp_pm_get_eemi_ops(void)
 {
@@ -549,6 +552,21 @@ static inline int zynqmp_pm_request_wake(const u32 node,
 {
return -ENODEV;
 }
+
+static inline int zynqmp_pm_get_rpu_mode(u32 node_id, enum rpu_oper_mode *rpu_mode)
+{
+   return -ENODEV;
+}
+
+static inline int zynqmp_pm_set_rpu_mode(u32 node_id, enum rpu_oper_mode rpu_mode)
+{
+   return -ENODEV;
+}
+
+static inline int zynqmp_pm_set_tcm_config(u32 node_id, enum rpu_tcm_comb tcm_mode)
+{
+   return -ENODEV;
+}
 #endif
 
 #endif /* __FIRMWARE_ZYNQMP_H__ */
-- 
2.17.1
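
For context, a minimal sketch of how a caller such as the zynqmp_r5
remoteproc driver mentioned in the changelog might use these APIs. The
function name and the lockstep choice are hypothetical; the PM_RPU_*
enum values are assumed from the definitions added earlier in this
series in xlnx-zynqmp.h:

    #include <linux/firmware/xlnx-zynqmp.h>

    static int example_configure_lockstep(u32 node_id)
    {
            enum rpu_oper_mode mode;
            int ret;

            ret = zynqmp_pm_get_rpu_mode(node_id, &mode);
            if (ret)
                    return ret;

            if (mode != PM_RPU_MODE_LOCKSTEP) {
                    ret = zynqmp_pm_set_rpu_mode(node_id,
                                                 PM_RPU_MODE_LOCKSTEP);
                    if (ret)
                            return ret;
            }

            /* Lockstep typically pairs with the combined TCM layout */
            return zynqmp_pm_set_tcm_config(node_id, PM_RPU_TCM_COMB);
    }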



Re: [PATCH v7 05/13] PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()

2020-10-01 Thread Sean V Kelley

On 1 Oct 2020, at 2:06, Jonathan Cameron wrote:


On Wed, 30 Sep 2020 14:58:12 -0700
Sean V Kelley  wrote:


From: Sean V Kelley 

The term "dev" is being applied to root ports, switch
upstream ports, switch downstream ports, and the upstream
ports on endpoints. While endpoint upstream ports don't have
subordinate buses, a generic term such as "bridge" may be used


This sentence is a bit confusing.  The bit before the comma
seems only slightly connected. Perhaps 2 sentences?


I agree.  Will reword.



for something with a subordinate bus. The current conditional
logic in pcie_do_recovery() would also benefit from some
simplification with use of pci_upstream_bridge() in place of
dev->bus->self. Reverse the pcie_do_recovery() conditional logic
and replace use of "dev" with "bridge" for greater clarity.

Suggested-by: Bjorn Helgaas 
Signed-off-by: Sean V Kelley 


Acked-by: Jonathan Cameron 


Thanks,

Sean




---
 drivers/pci/pcie/err.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 950612342f1c..c6922c099c76 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -152,16 +152,22 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,

 {
pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
struct pci_bus *bus;
+   struct pci_dev *bridge;
+   int type;

/*
-	 * Error recovery runs on all subordinates of the first downstream port.
-	 * If the downstream port detected the error, it is cleared at the end.
+	 * Error recovery runs on all subordinates of the first downstream
+	 * bridge. If the downstream bridge detected the error, it is
+	 * cleared at the end.
 */
-   if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
- pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
-   dev = dev->bus->self;
-   bus = dev->subordinate;
-
+   type = pci_pcie_type(dev);
+   if (type == PCI_EXP_TYPE_ROOT_PORT ||
+   type == PCI_EXP_TYPE_DOWNSTREAM)
+   bridge = dev;
+   else
+   bridge = pci_upstream_bridge(dev);
+
+   bus = bridge->subordinate;
pci_dbg(dev, "broadcast error_detected message\n");
if (state == pci_channel_io_frozen) {
	pci_walk_bus(bus, report_frozen_detected, &status);

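For reference, pci_upstream_bridge() used above is a thin wrapper
around dev->bus->self (roughly as defined in include/linux/pci.h),
which is why the substitution is behavior-preserving whenever an
upstream bridge exists:

    static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
    {
            if (pci_is_root_bus(dev->bus))
                    return NULL;    /* no upstream bridge to return */

            return dev->bus->self;
    }
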

[PATCH] vhost-vdpa: fix page pinning leakage in error path

2020-10-01 Thread Si-Wei Liu
Pinned pages are not properly accounted, particularly when a
mapping error occurs on IOTLB update. Clean up dangling
pinned pages for the error path. The inflight pinned
pages, specifically for a memory region that strides across
multiple chunks, would need more than one free page for
bookkeeping and accounting. For simplicity, pin the pages
for all memory in the IOVA range in one go rather than
making multiple pin_user_pages calls to cover the entire
region. This way it's easier to track and account for the
pages already mapped, particularly for clean-up in the
error path.

Fixes: 20453a45fb06 ("vhost: introduce vDPA-based backend")
Signed-off-by: Si-Wei Liu 
---
 drivers/vhost/vdpa.c | 121 +++
 1 file changed, 73 insertions(+), 48 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 796fe97..abc4aa2 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -565,6 +565,8 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
  perm_to_iommu_flags(perm));
}
 
+   if (r)
+   vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1);
return r;
 }
 
@@ -592,21 +594,19 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
	struct vhost_dev *dev = &v->vdev;
struct vhost_iotlb *iotlb = dev->iotlb;
struct page **page_list;
-   unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
+   struct vm_area_struct **vmas;
unsigned int gup_flags = FOLL_LONGTERM;
-   unsigned long npages, cur_base, map_pfn, last_pfn = 0;
-   unsigned long locked, lock_limit, pinned, i;
+   unsigned long map_pfn, last_pfn = 0;
+   unsigned long npages, lock_limit;
+   unsigned long i, nmap = 0;
u64 iova = msg->iova;
+   long pinned;
int ret = 0;
 
if (vhost_iotlb_itree_first(iotlb, msg->iova,
msg->iova + msg->size - 1))
return -EEXIST;
 
-   page_list = (struct page **) __get_free_page(GFP_KERNEL);
-   if (!page_list)
-   return -ENOMEM;
-
if (msg->perm & VHOST_ACCESS_WO)
gup_flags |= FOLL_WRITE;
 
@@ -614,61 +614,86 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
if (!npages)
return -EINVAL;
 
+   page_list = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
+   vmas = kvmalloc_array(npages, sizeof(struct vm_area_struct *),
+ GFP_KERNEL);
+   if (!page_list || !vmas) {
+   ret = -ENOMEM;
+   goto free;
+   }
+
mmap_read_lock(dev->mm);
 
-   locked = atomic64_add_return(npages, &dev->mm->pinned_vm);
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-
-   if (locked > lock_limit) {
+   if (npages + atomic64_read(&dev->mm->pinned_vm) > lock_limit) {
ret = -ENOMEM;
-   goto out;
+   goto unlock;
}
 
-   cur_base = msg->uaddr & PAGE_MASK;
-   iova &= PAGE_MASK;
+   pinned = pin_user_pages(msg->uaddr & PAGE_MASK, npages, gup_flags,
+   page_list, vmas);
+   if (npages != pinned) {
+   if (pinned < 0) {
+   ret = pinned;
+   } else {
+   unpin_user_pages(page_list, pinned);
+   ret = -ENOMEM;
+   }
+   goto unlock;
+   }
 
-   while (npages) {
-   pinned = min_t(unsigned long, npages, list_size);
-   ret = pin_user_pages(cur_base, pinned,
-gup_flags, page_list, NULL);
-   if (ret != pinned)
-   goto out;
-
-   if (!last_pfn)
-   map_pfn = page_to_pfn(page_list[0]);
-
-   for (i = 0; i < ret; i++) {
-   unsigned long this_pfn = page_to_pfn(page_list[i]);
-   u64 csize;
-
-   if (last_pfn && (this_pfn != last_pfn + 1)) {
-   /* Pin a contiguous chunk of memory */
-   csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
-   if (vhost_vdpa_map(v, iova, csize,
-  map_pfn << PAGE_SHIFT,
-  msg->perm))
-   goto out;
-   map_pfn = this_pfn;
-   iova += csize;
+   iova &= PAGE_MASK;
+   map_pfn = page_to_pfn(page_list[0]);
+
+   /* One more iteration to avoid extra vdpa_map() call out of loop. */
+   for (i = 0; i <= npages; i++) {
+   unsigned long this_pfn;
+   u64 csize;
+
+   /* The last chunk may have no valid PFN next to it */
+   this_pfn = i < npages ? 

[PATCH 5/7] Traffic control using high-resolution timer issue

2020-10-01 Thread Erez Geva
  - Add a new schedule function for the Qdisc watchdog.
    The function avoids reprogramming if the watchdog already expires
    before the new expiry.

  - Use the new schedule function in ETF.

  - Add the ETF range value to the kernel configuration,
    as the value is characteristic of the hardware.

Signed-off-by: Erez Geva 
---
 include/net/pkt_sched.h |  2 ++
 net/sched/Kconfig   |  8 
 net/sched/sch_api.c | 33 +
 net/sched/sch_etf.c | 10 --
 4 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index ac8c890a2657..4306c2773778 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -78,6 +78,8 @@ void qdisc_watchdog_init(struct qdisc_watchdog *wd, struct Qdisc *qdisc);
 
 void qdisc_watchdog_schedule_range_ns(struct qdisc_watchdog *wd, u64 expires,
  u64 delta_ns);
+void qdisc_watchdog_schedule_soon_ns(struct qdisc_watchdog *wd, u64 expires,
+u64 delta_ns);
 
 static inline void qdisc_watchdog_schedule_ns(struct qdisc_watchdog *wd,
  u64 expires)
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a3b37d88800e..0f5261ee9e1b 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -195,6 +195,14 @@ config NET_SCH_ETF
  To compile this code as a module, choose M here: the
  module will be called sch_etf.
 
+config NET_SCH_ETF_TIMER_RANGE
+   int "ETF Watchdog time range delta in nanoseconds"
+   depends on NET_SCH_ETF
+   default 5000
+   help
+ Specify the time range delta for the ETF watchdog.
+ Default is 5 microseconds.
+
 config NET_SCH_TAPRIO
tristate "Time Aware Priority (taprio) Scheduler"
help
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index ebf59ca1faab..80bd09555f5e 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -645,6 +645,39 @@ void qdisc_watchdog_schedule_range_ns(struct qdisc_watchdog *wd, u64 expires,
 }
 EXPORT_SYMBOL(qdisc_watchdog_schedule_range_ns);
 
+void qdisc_watchdog_schedule_soon_ns(struct qdisc_watchdog *wd, u64 expires,
+u64 delta_ns)
+{
+   if (test_bit(__QDISC_STATE_DEACTIVATED,
+    &qdisc_root_sleeping(wd->qdisc)->state))
+   return;
+
+   if (wd->last_expires == expires)
+   return;
+
+   /**
+* If expires is in [0, now + delta_ns],
+* do not program it.
+*/
+   if (expires <= ktime_to_ns(hrtimer_cb_get_time(&wd->timer)) + delta_ns)
+   return;
+
+   /**
+* If timer is already set in [0, expires + delta_ns],
+* do not reprogram it.
+*/
+   if (hrtimer_is_queued(>timer) &&
+   wd->last_expires <= expires + delta_ns)
+   return;
+
+   wd->last_expires = expires;
+   hrtimer_start_range_ns(&wd->timer,
+  ns_to_ktime(expires),
+  delta_ns,
+  HRTIMER_MODE_ABS_PINNED);
+}
+EXPORT_SYMBOL(qdisc_watchdog_schedule_soon_ns);
+
 void qdisc_watchdog_cancel(struct qdisc_watchdog *wd)
 {
	hrtimer_cancel(&wd->timer);
diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index c48f91075b5c..48b2868c4672 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -20,6 +20,11 @@
 #include 
 #include 
 
+#ifdef CONFIG_NET_SCH_ETF_TIMER_RANGE
+#define NET_SCH_ETF_TIMER_RANGE CONFIG_NET_SCH_ETF_TIMER_RANGE
+#else
+#define NET_SCH_ETF_TIMER_RANGE (5 * NSEC_PER_USEC)
+#endif
 #define DEADLINE_MODE_IS_ON(x) ((x)->flags & TC_ETF_DEADLINE_MODE_ON)
 #define OFFLOAD_IS_ON(x) ((x)->flags & TC_ETF_OFFLOAD_ON)
 #define SKIP_SOCK_CHECK_IS_SET(x) ((x)->flags & TC_ETF_SKIP_SOCK_CHECK)
@@ -128,8 +133,9 @@ static void reset_watchdog(struct Qdisc *sch)
return;
}
 
-   next = ktime_sub_ns(skb->tstamp, q->delta);
-   qdisc_watchdog_schedule_ns(&q->watchdog, ktime_to_ns(next));
+   next = ktime_sub_ns(skb->tstamp, q->delta + NET_SCH_ETF_TIMER_RANGE);
+   qdisc_watchdog_schedule_soon_ns(&q->watchdog, ktime_to_ns(next),
+   NET_SCH_ETF_TIMER_RANGE);
 }
 
 static void report_sock_error(struct sk_buff *skb, u32 err, u8 code)
-- 
2.20.1
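
To make the new function's skip conditions concrete, a worked example
with hypothetical values and the default 5000 ns range:

    /* Assume hrtimer_cb_get_time() == 1000000 ns, delta_ns == 5000: */

    expires = 1003000;  /* <= now + delta_ns (1005000): already due,
                         * nothing to program                        */

    expires = 1010000;  /* timer queued with last_expires = 1012000:
                         * 1012000 <= expires + delta_ns (1015000),
                         * close enough, skip the reprogram          */

    expires = 1010000;  /* timer queued with last_expires = 1020000:
                         * outside the range, reprogram so the
                         * watchdog fires by 1010000 + delta_ns      */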



[PATCH 1/7] POSIX clock ID check function

2020-10-01 Thread Erez Geva
Add a function to check whether a clock ID refers to
a file descriptor of a POSIX dynamic clock.

Signed-off-by: Erez Geva 
---
 include/linux/posix-timers.h | 5 +
 kernel/time/posix-timers.c   | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 896c16d2c5fb..7cb551bbb763 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -57,6 +57,11 @@ static inline int clockid_to_fd(const clockid_t clk)
return ~(clk >> 3);
 }
 
+static inline bool is_clockid_fd_clock(const clockid_t clk)
+{
+   return (clk < 0) && ((clk & CLOCKFD_MASK) == CLOCKFD);
+}
+
 #ifdef CONFIG_POSIX_TIMERS
 
 /**
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index bf540f5a4115..806465233303 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1400,7 +1400,7 @@ static const struct k_clock *clockid_to_kclock(const clockid_t id)
clockid_t idx = id;
 
if (id < 0) {
-   return (id & CLOCKFD_MASK) == CLOCKFD ?
+   return is_clockid_fd_clock(id) ?
			&clock_posix_dynamic : &clock_posix_cpu;
}
 
-- 
2.20.1
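
For background, a dynamic clock ID packs the file descriptor of an
opened clock device (e.g. /dev/ptp0) into a negative clockid_t;
clockid_to_fd() above is the inverse of this packing. A small
userspace sketch following the convention used by the PTP test tools
(the CLOCKFD and FD_TO_CLOCKID definitions here mirror the kernel's
constants and are not added by this patch):

    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>

    #define CLOCKFD 3	/* mirrors the kernel constant */
    #define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD)

    int main(void)
    {
            int fd = open("/dev/ptp0", O_RDWR);
            clockid_t clkid;

            if (fd < 0)
                    return 1;

            clkid = FD_TO_CLOCKID(fd);
            /* clkid is negative and (clkid & CLOCKFD_MASK) == CLOCKFD,
             * so the kernel's is_clockid_fd_clock(clkid) returns true.
             */
            printf("clkid = %d\n", (int)clkid);
            return 0;
    }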



[PATCH 0/7] TC-ETF support PTP clocks series

2020-10-01 Thread Erez Geva
Add support for using a PTP clock with the
 Traffic control Earliest TxTime First (ETF) Qdisc.

Why do we need ETF to use a PTP clock?
Currently, ETF requires synchronizing the system clock
 to the PTP Hardware Clock (PHC) we want to send through.
But there are cases where we cannot synchronize the system clock with
 the desired NIC PHC.
1. We use several NICs with several PTP domains in which our device
is not allowed to be PTP master.
   And we are not allowed to synchronize these PTP domains.
2. We are using another clock source which we need for our system.
   Yet our device is not allowed to be PTP master.
Regardless of the exact topology, as the Linux tradition is to allow
 the user the freedom to choose, we propose a patch that will allow
 the user to configure TC-ETF with a PTP clock as well as
 with the system clock.
* NOTE: we do encourage users to synchronize the system clock with
  a PTP clock, as the ETF watchdog uses the system clock.
 Synchronizing the system clock with a PTP clock will probably
  reduce the frequency difference between the PHC and the system clock.
 As a consequence, the user might be able to reduce the ETF delta time
  and the packet latency across the network.

Following the decision to derive a dynamic clock ID from the file
 descriptor of an opened PTP clock device file,
 we propose a simple way to use the dynamic clock ID with the ETF Qdisc.
We will submit a patch to the "tc" tool from the iproute2 project
 once this patch is accepted.

The patches contain:
1. Add a function to verify that a dynamic clock ID is derived
from a file descriptor.
   The function follows the clock ID convention for dynamic clock IDs
derived from file descriptors.
   The function will be used in the second patch.

2. A function to get the main system oscillator calibration state.

3. Functions to get and put the POSIX clock reference of
a PTP Hardware Clock (PHC).
   The get function uses a dynamic clock ID created by application space.
   The purpose is that a module can hold a POSIX clock reference after the
configuration application has closed the PTP clock device file,
even though the dynamic clock ID can no longer be used.
   The POSIX clock references are used by TC-ETF.

4. A fix of the range check in qdisc_watchdog_schedule_range_ns().

5. During testing of ETF, we noticed an issue with the high-resolution timer
the ETF Qdisc watchdog uses.
   The timer was set for a sleep of 300 nanoseconds but
ended up sleeping for 3 milliseconds.
   The problem happens when the timer is already active and
the current expiry is earlier than the new expiry.
   So, we add a new TC schedule function that does not reprogram the timer
under these conditions.
   The use of the function makes sense as the Qdisc watchdog does act as
a watchdog.
   The Qdisc watchdog can expire earlier.
   However, if the watchdog is late, packets are dropped.

6. Add a kernel configuration option for the TC-ETF watchdog range.
   As the range is characteristic of the hardware,
   that seems to be the proper way.

7. Add support for using PHC clock with TC-ETF.

Erez Geva (7):
  POSIX clock ID check function
  Function to retrieve main clock state
  Functions to fetch POSIX dynamic clock object
  Fix qdisc_watchdog_schedule_range_ns range check
  Traffic control using high-resolution timer issue
  TC-ETF code improvements
  TC-ETF support PTP clocks

 include/linux/posix-clock.h |  39 +
 include/linux/posix-timers.h|   5 ++
 include/linux/timex.h   |   1 +
 include/net/pkt_sched.h |   2 +
 include/uapi/linux/net_tstamp.h |   5 ++
 kernel/time/posix-clock.c   |  76 
 kernel/time/posix-timers.c  |   2 +-
 kernel/time/timekeeping.c   |   9 ++
 net/sched/Kconfig   |   8 ++
 net/sched/sch_api.c |  36 +++-
 net/sched/sch_etf.c | 148 +---
 11 files changed, 298 insertions(+), 33 deletions(-)


base-commit: a1b8638ba1320e6684aa98233c15255eb803fac7
-- 
2.20.1



Re: [PATCH v7 00/13] Add RCEC handling to PCI/AER

2020-10-01 Thread Sean V Kelley

On 1 Oct 2020, at 3:16, Jonathan Cameron wrote:


On Wed, 30 Sep 2020 14:58:07 -0700
Sean V Kelley  wrote:


From: Sean V Kelley 

Changes since v6 [1]:

- Remove unused includes in rcec.c.
- Add local variable for dev->rcec_ea.
- If no valid capability version then just fill in nextbusn = 0xff.
- Leave a blank line after pci_rcec_init(dev).
- Reword commit w/ "Attempt to do a function level reset for an RCiEP
  on fatal error."
- Change "An RCiEP found on bus in range" -> "An RCiEP found on a
  different bus in range".
- Remove special check on capability version if you fill in nextbusn = 0xff.
- Remove blank lines in pcie_link_rcec header.
- Fix indentation in aer.c.
(Jonathan Cameron)

- Relocate enabling of PME for RCECs to later RCEC handling additions to PME.
- Rename rcec_ext to rcec_ea.
- Remove rcec_cap as its use can be handled with rcec_ea.
- Add a forward declaration for struct rcec_ea.
- Rename pci_bridge_walk() to pci_walk_bridge() to match consistency
  with other usage.
- Separate changes to "reset_subordinate_devices" because it doesn't
  change the interface.
- Separate the use of "type", the rename of "dev" to "bridge", the
  inversion of the condition, and the use of pci_upstream_bridge()
  instead of dev->bus->self.
- Separate the conditional check (TYPE_ROOT_PORT and TYPE_DOWNSTREAM)
  for AER resets.
- Consider embedding the RCiEP's parent RCEC in the rcec_ea struct.
  However, the issue here is that we don't normally allocate the
  rcec_ea struct for RCiEPs, and the linkage of rcec_ea->rcec is not
  necessarily more clear.
- Provide more comment on the non-native case for clarity.
(Bjorn Helgaas)

[1] https://lore.kernel.org/linux-pci/20200922213859.108826-1-seanvk@oregontracks.org/


Root Complex Event Collectors (RCEC) provide support for terminating
error and PME messages from Root Complex Integrated Endpoints (RCiEPs).
An RCEC resides on a Bus in the Root Complex. Multiple RCECs can in
fact reside on a single bus. An RCEC will explicitly declare supported
RCiEPs through the Root Complex Endpoint Association Extended Capability.

(See PCIe 5.0-1, sections 1.3.2.3 (RCiEP), and 7.9.10 (RCEC Ext. Cap.))

The kernel lacks handling for these RCECs and the error messages
received from their respective associated RCiEPs. More recently, a new
CPU interconnect, Compute eXpress Link (CXL), depends on RCEC
capabilities for purposes of error messaging from CXL 1.1 supported
RCiEP devices.

DocLink: https://www.computeexpresslink.org/

This use case is not limited to CXL. Existing hardware today includes
support for RCECs, such as the Denverton microserver product
family. Future hardware will be forthcoming.

(See Intel Document, Order number: 33061-003US)

So services such as AER or PME could be associated with an RCEC driver.
In the case of CXL, if an RCiEP (i.e., CXL 1.1 device) is associated
with a platform's RCEC it shall signal PME and AER error conditions
through that RCEC.

Towards the above use cases, add the missing RCEC class and extend the
PCIe Root Port and service drivers to allow association of RCiEPs to
their respective parent RCEC and facilitate handling of terminating
error and PME messages.


I took a look at the combined result of the series as well as the
individual patches I've acked.  All looks good to me.

Also ran a quick batch of tests with the non-native / no visible RCEC
case and that's working as expected.  Feels a bit odd to give a
tested-by for the case that touches only a tiny corner of the code,
but if you want to include it...

Tested-by: Jonathan Cameron  #non-native/no RCEC


Much appreciated Jonathan.

Thanks,

Sean




Thanks,

Jonathan




Jonathan Cameron (1):
  PCI/AER: Extend AER error handling to RCECs

Qiuxu Zhuo (5):
  PCI/RCEC: Add RCEC class code and extended capability
  PCI/RCEC: Bind RCEC devices to the Root Port driver
  PCI/AER: Apply function level reset to RCiEP on fatal error
  PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR
  PCI/AER: Add RCEC AER error injection support

Sean V Kelley (7):
  PCI/RCEC: Cache RCEC capabilities in pci_init_capabilities()
  PCI/ERR: Rename reset_link() to reset_subordinate_device()
  PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
  PCI/ERR: Limit AER resets in pcie_do_recovery()
  PCI/RCEC: Add pcie_link_rcec() to associate RCiEPs
  PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
  PCI/PME: Add pcie_walk_rcec() to RCEC PME handling

 drivers/pci/pci.h   |  25 -
 drivers/pci/pcie/Makefile   |   2 +-
 drivers/pci/pcie/aer.c  |  36 --
 drivers/pci/pcie/aer_inject.c   |   5 +-
 drivers/pci/pcie/err.c  | 109 +++
 drivers/pci/pcie/pme.c  |  15 ++-
 drivers/pci/pcie/portdrv_core.c |   8 +-
 drivers/pci/pcie/portdrv_pci.c  |   8 +-
 drivers/pci/pcie/rcec.c | 187 
 drivers/pci/probe.c |   2 +
 include/linux/pci.h |   5 +
 include/linux/pci_ids.h |   1 +
 

[PATCH 2/7] Function to retrieve main clock state

2020-10-01 Thread Erez Geva
Add a kernel function to retrieve the main clock oscillator state.
As calibration is done by a user-space daemon,
the kernel access function permits reading only.

Signed-off-by: Erez Geva 
---
 include/linux/timex.h | 1 +
 kernel/time/timekeeping.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/timex.h b/include/linux/timex.h
index ce0859763670..03bc63bf3073 100644
--- a/include/linux/timex.h
+++ b/include/linux/timex.h
 extern unsigned long tick_nsec;	/* SHIFTED_HZ period (nsec) */
 
 extern int do_adjtimex(struct __kernel_timex *);
 extern int do_clock_adjtime(const clockid_t which_clock, struct __kernel_timex * ktx);
+extern int adjtimex(struct __kernel_timex *txc);
 
 extern void hardpps(const struct timespec64 *, const struct timespec64 *);
 
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4c47f388a83f..2248fa257ff8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2372,6 +2372,15 @@ int do_adjtimex(struct __kernel_timex *txc)
return ret;
 }
 
+int adjtimex(struct __kernel_timex *txc)
+{
+   if (txc->modes != 0)
+   return -EINVAL;
+
+   return do_adjtimex(txc);
+}
+EXPORT_SYMBOL_GPL(adjtimex);
+
 #ifdef CONFIG_NTP_PPS
 /**
  * hardpps() - Accessor function to NTP __hardpps function
-- 
2.20.1
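
A minimal sketch of an in-kernel caller of the new accessor (the
function name is hypothetical); because .modes is zero, the call is a
pure query and cannot modify the clock:

    #include <linux/timex.h>

    static int example_read_clock_state(void)
    {
            struct __kernel_timex txc = {};     /* .modes == 0: read only */
            int state;

            state = adjtimex(&txc);
            if (state < 0)
                    return state;   /* e.g. -EINVAL if modes were set */

            /* state is TIME_OK, TIME_ERROR, ...; txc.freq, txc.offset,
             * etc. now reflect the calibration applied by the
             * user-space daemon.
             */
            return state;
    }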



Re: [PATCH v3 10/13] ASoC: tegra: Add audio graph based card driver

2020-10-01 Thread Dmitry Osipenko
01.10.2020 23:57, Dmitry Osipenko wrote:
> 01.10.2020 20:33, Sameer Pujar wrote:
>> +/* Setup PLL clock as per the given sample rate */
>> +static int tegra_audio_graph_update_pll(struct snd_pcm_substream *substream,
>> +struct snd_pcm_hw_params *params)
>> +{
>> +struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream);
>> +struct asoc_simple_priv *priv = snd_soc_card_get_drvdata(rtd->card);
>> +struct device *dev = rtd->card->dev;
>> +struct tegra_audio_graph_data *graph_data =
>> +(struct tegra_audio_graph_data *)priv->data;
>> +struct tegra_audio_chip_data *chip_data =
>> +(struct tegra_audio_chip_data *)of_device_get_match_data(dev);
> 
> void* doesn't need casting
> 

There are several similar places in the code. Not a big deal, but this
makes code less readable than it could be.


[PATCH] vdpa/mlx5: should keep avail_index despite device status

2020-10-01 Thread Si-Wei Liu
A VM with mlx5 vDPA has below warnings while being reset:

vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)

We should allow userspace emulating the virtio device to be
able to get the vq's avail_index, regardless of vDPA device
status. Save the index that was last seen when the virtq was
stopped, so that userspace doesn't complain.

Signed-off-by: Si-Wei Liu 
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 70676a6..74264e59 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1133,15 +1133,17 @@ static void suspend_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *m
if (!mvq->initialized)
return;
 
-   if (query_virtqueue(ndev, mvq, &attr)) {
-   mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
-   return;
-   }
if (mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY)
return;
 
if (modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_SUSPEND))
	mlx5_vdpa_warn(&ndev->mvdev, "modify to suspend failed\n");
+
+   if (query_virtqueue(ndev, mvq, &attr)) {
+   mlx5_vdpa_warn(&ndev->mvdev, "failed to query virtqueue\n");
+   return;
+   }
+   mvq->avail_idx = attr.available_index;
 }
 
 static void suspend_vqs(struct mlx5_vdpa_net *ndev)
@@ -1411,8 +1413,14 @@ static int mlx5_vdpa_get_vq_state(struct vdpa_device *vdev, u16 idx, struct vdpa
struct mlx5_virtq_attr attr;
int err;
 
-   if (!mvq->initialized)
-   return -EAGAIN;
+   /* If the virtq object was destroyed, use the value saved at
+* the last minute of suspend_vq. This caters for userspace
+* that cares about emulating the index after vq is stopped.
+*/
+   if (!mvq->initialized) {
+   state->avail_index = mvq->avail_idx;
+   return 0;
+   }
 
	err = query_virtqueue(ndev, mvq, &attr);
if (err) {
-- 
1.8.3.1



[PATCH 4/7] Fix qdisc_watchdog_schedule_range_ns range check

2020-10-01 Thread Erez Geva
   - All parameters are unsigned.

   - If 'expires' is bigger than 'last_expires' then the left expression
     overflows.

   - It is better to use addition and check both ends of the range.

Signed-off-by: Erez Geva 
---
 net/sched/sch_api.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2a76a2f5ed88..ebf59ca1faab 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -632,7 +632,8 @@ void qdisc_watchdog_schedule_range_ns(struct qdisc_watchdog *wd, u64 expires,
/* If timer is already set in [expires, expires + delta_ns],
 * do not reprogram it.
 */
-   if (wd->last_expires - expires <= delta_ns)
+   if (wd->last_expires >= expires &&
+   wd->last_expires <= expires + delta_ns)
return;
}
 
-- 
2.20.1
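
To illustrate with hypothetical values why the two-sided check is
preferable, consider the u64 arithmetic:

    u64 last_expires = 1000;  /* timer already queued for t = 1000 */
    u64 expires      = 2000;  /* new request for t = 2000          */
    u64 delta_ns     = 500;

    /* Old check: last_expires - expires underflows to
     * 0xfffffffffffffc18 (2^64 - 1000), so the "<= delta_ns" test
     * only works by relying on unsigned wraparound.  The new check
     * states the intended range directly: 1000 >= 2000 is false, so
     * the timer is reprogrammed, with both ends of
     * [expires, expires + delta_ns] checked explicitly.
     */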



Re: [GIT PULL]: soundwire updates for v5.10-rc1

2020-10-01 Thread Greg KH
On Thu, Oct 01, 2020 at 11:26:32AM +0530, Vinod Koul wrote:
> Hi Greg,
> 
> Please pull to receive updates for soundwire subsystem.
> 
> The following changes since commit 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5:
> 
>   Linux 5.9-rc1 (2020-08-16 13:04:57 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire.git 
> tags/soundwire-5.10-rc1

Pulled and pushed out, thanks.

greg k-h


[PATCH v2 8/9] x86: Convert mmu context ia32_compat into a proper flags field

2020-10-01 Thread Gabriel Krisman Bertazi
The ia32_compat attribute is a weird thing.  It mirrors TIF_IA32 and
TIF_X32 and is used only in two very unrelated places: (1) to decide if
the vsyscall page is accessible, and (2) for uprobes to find whether the
patched instruction is 32 or 64 bit.  In preparation to remove the TI
flags, we want new values for ia32_compat, but given its odd semantics,
I'd rather make it a real flags field that configures these specific
behaviours.  So, set_personality_x64 can ask for the vsyscall page,
which is not available in x32/ia32 and set_personality_ia32 can
configure the uprobe code as needed.

uprobe cannot rely on other methods like user_64bit_mode() to decide how
to patch, so it needs some specific flag like this.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/entry/vsyscall/vsyscall_64.c |  2 +-
 arch/x86/include/asm/mmu.h|  6 --
 arch/x86/include/asm/mmu_context.h|  2 +-
 arch/x86/kernel/process_64.c  | 17 +++--
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 44c33103a955..20abc396dbe0 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -316,7 +316,7 @@ static struct vm_area_struct gate_vma __ro_after_init = {
 struct vm_area_struct *get_gate_vma(struct mm_struct *mm)
 {
 #ifdef CONFIG_COMPAT
-   if (!mm || mm->context.ia32_compat)
+   if (!mm || !(mm->context.flags & MM_CONTEXT_GATE_PAGE))
return NULL;
 #endif
if (vsyscall_mode == NONE)
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 9257667d13c5..76ab742a0e39 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -7,6 +7,9 @@
 #include 
 #include 
 
+#define MM_CONTEXT_UPROBE_IA32 1 /* Uprobes on this MM assume 32-bit code */
+#define MM_CONTEXT_GATE_PAGE   2 /* Whether MM has gate page */
+
 /*
  * x86 has arch-specific MMU state beyond what lives in mm_struct.
  */
@@ -33,8 +36,7 @@ typedef struct {
 #endif
 
 #ifdef CONFIG_X86_64
-   /* True if mm supports a task running in 32 bit compatibility mode. */
-   unsigned short ia32_compat;
+   unsigned short flags;
 #endif
 
struct mutex lock;
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index d98016b83755..054a79157323 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -177,7 +177,7 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
 static inline bool is_64bit_mm(struct mm_struct *mm)
 {
return  !IS_ENABLED(CONFIG_IA32_EMULATION) ||
-   !(mm->context.ia32_compat == TIF_IA32);
+   !(mm->context.flags & MM_CONTEXT_UPROBE_IA32);
 }
 #else
 static inline bool is_64bit_mm(struct mm_struct *mm)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 56e882c339e6..3226ceed409c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -650,10 +650,8 @@ void set_personality_64bit(void)
/* Pretend that this comes from a 64bit execve */
task_pt_regs(current)->orig_ax = __NR_execve;
current_thread_info()->status &= ~TS_COMPAT;
-
-   /* Ensure the corresponding mm is not marked. */
if (current->mm)
-   current->mm->context.ia32_compat = 0;
+   current->mm->context.flags = MM_CONTEXT_GATE_PAGE;
 
/* TBD: overwrites user setup. Should have two bits.
   But 64bit processes have always behaved this way,
@@ -668,7 +666,8 @@ static void __set_personality_x32(void)
clear_thread_flag(TIF_IA32);
set_thread_flag(TIF_X32);
if (current->mm)
-   current->mm->context.ia32_compat = TIF_X32;
+   current->mm->context.flags = 0;
+
current->personality &= ~READ_IMPLIES_EXEC;
/*
 * in_32bit_syscall() uses the presence of the x32 syscall bit
@@ -688,8 +687,14 @@ static void __set_personality_ia32(void)
 #ifdef CONFIG_IA32_EMULATION
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
-   if (current->mm)
-   current->mm->context.ia32_compat = TIF_IA32;
+   if (current->mm) {
+   /*
+* uprobes applied to this MM need to know this and
+* cannot use user_64bit_mode() at that time.
+*/
+   current->mm->context.flags = MM_CONTEXT_UPROBE_IA32;
+   }
+
current->personality |= force_personality32;
/* Prepare the first "return" to user space */
task_pt_regs(current)->orig_ax = __NR_ia32_execve;
-- 
2.28.0



Re: [PATCH net-next 02/16] devlink: Add reload action option to devlink reload command

2020-10-01 Thread Jakub Kicinski
On Thu,  1 Oct 2020 16:59:05 +0300 Moshe Shemesh wrote:
> Add devlink reload action to allow the user to request a specific reload
> action. The action parameter is optional, if not specified then devlink
> driver re-init action is used (backward compatible).
> Note that when required to do firmware activation some drivers may need
> to reload the driver. On the other hand some drivers may need to reset
> the firmware to reinitialize the driver entities. Therefore, the devlink
> reload command returns the actions which were actually performed.
> Reload actions supported are:
> driver_reinit: driver entities re-initialization, applying devlink-param
>and devlink-resource values.
> fw_activate: firmware activate.
> 
> command examples:
> $devlink dev reload pci/:82:00.0 action driver_reinit
> reload_actions_performed:
>   driver_reinit
> 
> $devlink dev reload pci/:82:00.0 action fw_activate
> reload_actions_performed:
>   driver_reinit fw_activate
> 
> Signed-off-by: Moshe Shemesh 

Reviewed-by: Jakub Kicinski 


[PATCH v2 6/9] x86: elf: Use e_machine to select setup_additional_pages for x32

2020-10-01 Thread Gabriel Krisman Bertazi
Since TIF_X32 is going away, avoid using it to find the ELF type when
choosing which additional pages to set up.

According to SysV AMD64 ABI Draft, an AMD64 ELF object using ILP32 must
have ELFCLASS32 with (E_MACHINE == EM_X86_64), so use that ELF field to
differentiate an x32 object from an IA32 object when executing
start_thread in compat mode.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/entry/vdso/vma.c  | 21 -
 arch/x86/include/asm/elf.h | 11 ---
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 9185cb1d13b9..7a3cda8294a3 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -412,22 +412,25 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 }
 
 #ifdef CONFIG_COMPAT
-int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
-  int uses_interp)
+int compat_arch_setup_additional_pages_ia32(struct linux_binprm *bprm,
+   int uses_interp)
 {
-#ifdef CONFIG_X86_X32_ABI
-   if (test_thread_flag(TIF_X32)) {
-   if (!vdso64_enabled)
-   return 0;
-   return map_vdso_randomized(_image_x32);
-   }
-#endif
 #ifdef CONFIG_IA32_EMULATION
return load_vdso32();
 #else
return 0;
 #endif
 }
+
+int compat_arch_setup_additional_pages_x32(struct linux_binprm *bprm,
+  int uses_interp)
+{
+#ifdef CONFIG_X86_X32_ABI
+   if (vdso64_enabled)
+   return map_vdso_randomized(_image_x32);
+#endif
+   return 0;
+}
 #endif
 #else
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 33c1c9be2e07..4d91f5b1079f 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -388,9 +388,14 @@ struct linux_binprm;
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
 extern int arch_setup_additional_pages(struct linux_binprm *bprm,
   int uses_interp);
-extern int compat_arch_setup_additional_pages(struct linux_binprm *bprm,
- int uses_interp);
-#define compat_arch_setup_additional_pages compat_arch_setup_additional_pages
+extern int compat_arch_setup_additional_pages_ia32(struct linux_binprm *bprm,
+  int uses_interp);
+extern int compat_arch_setup_additional_pages_x32(struct linux_binprm *bprm,
+ int uses_interp);
+
+#define compat_arch_setup_additional_pages \
+   ((elf_ex->e_machine == EM_X86_64) ? \
+    compat_arch_setup_additional_pages_x32 : \
+    compat_arch_setup_additional_pages_ia32)
 
 /* Do not change the values. See get_align_mask() */
 enum align_flags {
-- 
2.28.0



[PATCH v2 9/9] x86: Reclaim TIF_IA32 and TIF_X32

2020-10-01 Thread Gabriel Krisman Bertazi
Now that these flags are no longer used, reclaim those TI bits.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/include/asm/thread_info.h | 4 
 arch/x86/kernel/process_64.c   | 6 --
 2 files changed, 10 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 267701ae3d86..6888aa39c4d6 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -91,7 +91,6 @@ struct thread_info {
 #define TIF_NEED_FPU_LOAD  14  /* load FPU on return to userspace */
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC  16  /* TSC is not accessible in userland */
-#define TIF_IA32   17  /* IA32 compatibility process */
 #define TIF_SLD			18	/* Restore split lock detection on context switch */
 #define TIF_MEMDIE 20  /* is terminating due to OOM killer */
 #define TIF_POLLING_NRFLAG	21	/* idle is polling for TIF_NEED_RESCHED */
@@ -101,7 +100,6 @@ struct thread_info {
 #define TIF_LAZY_MMU_UPDATES   27  /* task is updating the mmu lazily */
 #define TIF_SYSCALL_TRACEPOINT 28  /* syscall tracepoint instrumentation */
 #define TIF_ADDR32 29  /* 32-bit address space on 64 bits */
-#define TIF_X32			30	/* 32-bit native x86-64 binary */
 #define TIF_FSCHECK31  /* Check FS is USER_DS on return */
 
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
@@ -121,7 +119,6 @@ struct thread_info {
 #define _TIF_NEED_FPU_LOAD (1 << TIF_NEED_FPU_LOAD)
 #define _TIF_NOCPUID   (1 << TIF_NOCPUID)
 #define _TIF_NOTSC (1 << TIF_NOTSC)
-#define _TIF_IA32  (1 << TIF_IA32)
 #define _TIF_SLD   (1 << TIF_SLD)
 #define _TIF_POLLING_NRFLAG(1 << TIF_POLLING_NRFLAG)
 #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
@@ -130,7 +127,6 @@ struct thread_info {
 #define _TIF_LAZY_MMU_UPDATES  (1 << TIF_LAZY_MMU_UPDATES)
 #define _TIF_SYSCALL_TRACEPOINT(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_ADDR32(1 << TIF_ADDR32)
-#define _TIF_X32   (1 << TIF_X32)
 #define _TIF_FSCHECK   (1 << TIF_FSCHECK)
 
 /* flags to check in __switch_to() */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3226ceed409c..b557312aa9cb 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -644,9 +644,7 @@ void set_personality_64bit(void)
/* inherit personality from parent */
 
/* Make sure to be in 64bit mode */
-   clear_thread_flag(TIF_IA32);
clear_thread_flag(TIF_ADDR32);
-   clear_thread_flag(TIF_X32);
/* Pretend that this comes from a 64bit execve */
task_pt_regs(current)->orig_ax = __NR_execve;
current_thread_info()->status &= ~TS_COMPAT;
@@ -663,8 +661,6 @@ void set_personality_64bit(void)
 static void __set_personality_x32(void)
 {
 #ifdef CONFIG_X86_X32
-   clear_thread_flag(TIF_IA32);
-   set_thread_flag(TIF_X32);
if (current->mm)
current->mm->context.flags = 0;
 
@@ -685,8 +681,6 @@ static void __set_personality_x32(void)
 static void __set_personality_ia32(void)
 {
 #ifdef CONFIG_IA32_EMULATION
-   set_thread_flag(TIF_IA32);
-   clear_thread_flag(TIF_X32);
if (current->mm) {
/*
 * uprobes applied to this MM need to know this and
-- 
2.28.0



[PATCH v2 7/9] x86: Use current USER_CS to setup correct context on vmx entry

2020-10-01 Thread Gabriel Krisman Bertazi
vmx_prepare_switch_to_guest shouldn't use is_64bit_mm, which has a
very specific use in uprobes.  Use the user_64bit_mode helper instead.
This reduces the usage of is_64bit_mm, which is awkward, since it relies
on the personality at load time, which is fine for uprobes, but doesn't
seem fine here.

I tested this by running VMs with 64 and 32 bits payloads from 64/32
programs.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/kvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7b2a068f08c1..b5aafd9e5f5d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1172,7 +1172,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
savesegment(es, host_state->es_sel);
 
gs_base = cpu_kernelmode_gs_base(cpu);
-   if (likely(is_64bit_mm(current->mm))) {
+   if (likely(user_64bit_mode(current_pt_regs()))) {
current_save_fsgs();
fs_sel = current->thread.fsindex;
gs_sel = current->thread.gsindex;
-- 
2.28.0



[PATCH v2 5/9] x86: elf: Use e_machine to select start_thread for x32

2020-10-01 Thread Gabriel Krisman Bertazi
Since TIF_X32 is going away, avoid using it to find the ELF type on
compat_start_thread.

According to SysV AMD64 ABI Draft, an AMD64 ELF object using ILP32 must
have ELFCLASS32 with (E_MACHINE == EM_X86_64), so use that ELF field to
differentiate an x32 object from an IA32 object when executing
start_thread in compat mode.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/include/asm/elf.h   | 11 +--
 arch/x86/kernel/process_64.c | 11 +++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 9220efc65d78..33c1c9be2e07 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -186,8 +186,15 @@ static inline void elf_common_init(struct thread_struct *t,
 #define	COMPAT_ELF_PLAT_INIT(regs, load_addr)	\
elf_common_init(>thread, regs, __USER_DS)
 
-void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp);
-#define compat_start_thread compat_start_thread
+void compat_start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
+void compat_start_thread_x32(struct pt_regs *regs, u32 new_ip, u32 new_sp);
+#define compat_start_thread(regs, new_ip, new_sp)  \
+do {   \
+   if (elf_ex->e_machine == EM_X86_64) \
+   compat_start_thread_x32(regs, new_ip, new_sp);  \
+   else\
+   compat_start_thread_ia32(regs, new_ip, new_sp); \
+} while (0)
 
 void set_personality_ia32(bool);
 #define COMPAT_SET_PERSONALITY(ex) \
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9afefe325acb..56e882c339e6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -511,12 +511,15 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
 EXPORT_SYMBOL_GPL(start_thread);
 
 #ifdef CONFIG_COMPAT
-void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp)
+void compat_start_thread_ia32(struct pt_regs *regs, u32 new_ip, u32 new_sp)
 {
start_thread_common(regs, new_ip, new_sp,
-   test_thread_flag(TIF_X32)
-   ? __USER_CS : __USER32_CS,
-   __USER_DS, __USER_DS);
+   __USER32_CS, __USER_DS, __USER_DS);
+}
+void compat_start_thread_x32(struct pt_regs *regs, u32 new_ip, u32 new_sp)
+{
+   start_thread_common(regs, new_ip, new_sp,
+   __USER_CS, __USER_DS, __USER_DS);
 }
 #endif
 
-- 
2.28.0



[PATCH v2 2/9] x86: Simplify compat syscall userspace allocation

2020-10-01 Thread Gabriel Krisman Bertazi
When allocating user memory space for a compat system call, don't
consider whether the originating code is IA32 or X32, just allocate from
a safe region for both, beyond the redzone.  This should be safe for
IA32, and has the benefit of avoiding TIF_IA32, which we want to drop.

Suggested-by: Andy Lutomirski 
Cc: Christoph Hellwig 
Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/include/asm/compat.h | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index d4edf281fff4..a4b5126dff4e 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -179,14 +179,13 @@ typedef struct user_regs_struct compat_elf_gregset_t;
 
 static inline void __user *arch_compat_alloc_user_space(long len)
 {
-   compat_uptr_t sp;
-
-   if (test_thread_flag(TIF_IA32)) {
-   sp = task_pt_regs(current)->sp;
-   } else {
-   /* -128 for the x32 ABI redzone */
-   sp = task_pt_regs(current)->sp - 128;
-   }
+   compat_uptr_t sp = task_pt_regs(current)->sp;
+
+   /*
+* -128 for the x32 ABI redzone.  For IA32, it is not strictly
+* necessary, but not harmful.
+*/
+   sp -= 128;
 
return (void __user *)round_down(sp - len, 16);
 }
-- 
2.28.0
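
A worked example of the resulting allocation with a hypothetical
user stack pointer:

    /* sp                  = 0x7fff1000
     * sp - 128            = 0x7fff0f80  (below the x32 ABI redzone)
     * len                 = 40
     * sp - 128 - len      = 0x7fff0f58
     * round_down(.., 16)  = 0x7fff0f50  <- pointer returned to caller
     */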



[PATCH v2 1/9] x86: events: Avoid TIF_IA32 when checking 64bit mode

2020-10-01 Thread Gabriel Krisman Bertazi
In preparation to remove TIF_IA32, stop using it in perf events code.

Tested by running perf on 32-bit, 64-bit and x32 applications.

Suggested-by: Andy Lutomirski 
Signed-off-by: Gabriel Krisman Bertazi 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/events/core.c  | 2 +-
 arch/x86/events/intel/ds.c  | 2 +-
 arch/x86/events/intel/lbr.c | 2 +-
 arch/x86/kernel/perf_regs.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 1cbf57dc2ac8..4fe82d9d959b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2499,7 +2499,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *ent
struct stack_frame_ia32 frame;
const struct stack_frame_ia32 __user *fp;
 
-   if (!test_thread_flag(TIF_IA32))
+   if (user_64bit_mode(regs))
return 0;
 
cs_base = get_segment_base(regs->cs);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 86848c57b55e..94bd0d3acd15 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1261,7 +1261,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
old_to = to;
 
 #ifdef CONFIG_X86_64
-   is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
+   is_64bit = kernel_ip(to) || any_64bit_mode(regs);
 #endif
insn_init(, kaddr, size, is_64bit);
insn_get_length();
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 8961653c5dd2..1aadb253d296 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1221,7 +1221,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
 * on 64-bit systems running 32-bit apps
 */
 #ifdef CONFIG_X86_64
-   is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
+   is64 = kernel_ip((unsigned long)addr) || any_64bit_mode(current_pt_regs());
 #endif
insn_init(, addr, bytes_read, is64);
insn_get_opcode();
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index bb7e1132290b..9332c49a64a8 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -123,7 +123,7 @@ int perf_reg_validate(u64 mask)
 
 u64 perf_reg_abi(struct task_struct *task)
 {
-   if (test_tsk_thread_flag(task, TIF_IA32))
+   if (!user_64bit_mode(task_pt_regs(task)))
return PERF_SAMPLE_REGS_ABI_32;
else
return PERF_SAMPLE_REGS_ABI_64;
-- 
2.28.0



[PATCH v2 4/9] x86: elf: Use e_machine to choose DLINFO in compat

2020-10-01 Thread Gabriel Krisman Bertazi
Since TIF_X32 is going away, avoid using it to find the ELF type on
ARCH_DLINFO.

According to SysV AMD64 ABI Draft, an AMD64 ELF object using ILP32 must
have ELFCLASS32 with (E_MACHINE == EM_X86_64), so use that ELF field to
differentiate an x32 object from an IA32 object when loading ARCH_DLINFO
in compat mode.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/include/asm/elf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b9a5d488f1a5..9220efc65d78 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -361,7 +361,7 @@ do {	\
 #define AT_SYSINFO 32
 
 #define COMPAT_ARCH_DLINFO \
-if (test_thread_flag(TIF_X32)) \
+if (exec->e_machine == EM_X86_64)  \
ARCH_DLINFO_X32;\
 else   \
ARCH_DLINFO_IA32
-- 
2.28.0



[PATCH v2 3/9] x86: oprofile: Avoid TIF_IA32 when checking 64bit mode

2020-10-01 Thread Gabriel Krisman Bertazi
In preparation to remove TIF_IA32, stop using it in oprofile code.

Signed-off-by: Gabriel Krisman Bertazi 
---
 arch/x86/oprofile/backtrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index a2488b6e27d6..1d8391fcca68 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -49,7 +49,7 @@ x86_backtrace_32(struct pt_regs * const regs, unsigned int depth)
struct stack_frame_ia32 *head;
 
/* User process is IA32 */
-   if (!current || !test_thread_flag(TIF_IA32))
+   if (!current || user_64bit_mode(regs))
return 0;
 
head = (struct stack_frame_ia32 *) regs->bp;
-- 
2.28.0



[PATCH v2 0/9] Reclaim TIF_IA32 and TIF_X32

2020-10-01 Thread Gabriel Krisman Bertazi
TIF_IA32 and TIF_X32 are not strictly necessary and they are only set at
task creation time, which doesn't fit with processes that transition
between 64/32 bits.  In addition, other reasons to drop these flags are
that we are running out of TI flags for x86 and it is generally a good
idea to reduce architecture-specific TI flags, before moving the generic
ones to common code.

Many of the ideas for this patchset came from Andy Lutomirski (Thank
you!)

The only difference of v2 from v1 is the addition of the final 3 patches
that solve the last 3 cases of TIF_IA32 and TIF_X32 usage, and actually
remove the TI flags.

The testing for this patchset was done by exercising the code
paths of each case where the flags were used with x32, ia32 and x64
applications. x86 selftests showed no regressions.


Gabriel Krisman Bertazi (9):
  x86: events: Avoid TIF_IA32 when checking 64bit mode
  x86: Simplify compat syscall userspace allocation
  x86: oprofile: Avoid TIF_IA32 when checking 64bit mode
  x86: elf: Use e_machine to choose DLINFO in compat
  x86: elf: Use e_machine to select start_thread for x32
  x86: elf: Use e_machine to select setup_additional_pages for x32
  x86: Use current USER_CS to setup correct context on vmx entry
  x86: Convert mmu context ia32_compat into a proper flags field
  x86: Reclaim TIF_IA32 and TIF_X32

 arch/x86/entry/vdso/vma.c | 21 ++---
 arch/x86/entry/vsyscall/vsyscall_64.c |  2 +-
 arch/x86/events/core.c|  2 +-
 arch/x86/events/intel/ds.c|  2 +-
 arch/x86/events/intel/lbr.c   |  2 +-
 arch/x86/include/asm/compat.h | 15 ++--
 arch/x86/include/asm/elf.h| 24 ++-
 arch/x86/include/asm/mmu.h|  6 +++--
 arch/x86/include/asm/mmu_context.h|  2 +-
 arch/x86/include/asm/thread_info.h|  4 
 arch/x86/kernel/perf_regs.c   |  2 +-
 arch/x86/kernel/process_64.c  | 34 ++-
 arch/x86/kvm/vmx/vmx.c|  2 +-
 arch/x86/oprofile/backtrace.c |  2 +-
 14 files changed, 67 insertions(+), 53 deletions(-)

-- 
2.28.0



Re: How should we handle illegal task FPU state?

2020-10-01 Thread Sean Christopherson
On Thu, Oct 01, 2020 at 01:32:04PM -0700, Yu, Yu-cheng wrote:
> On 10/1/2020 10:43 AM, Andy Lutomirski wrote:
> >The question is: what do we do about it?  We have two basic choices, I think.
> >
> >a) Decide that the saved FPU for a task *must* be valid at all times.
> >If there's a failure to restore state, kill the task.
> >
> >b) Improve our failed restoration handling and maybe even
> >intentionally make it possible to create illegal state to allow
> >testing.
> >
> >(a) sounds like a nice concept, but I'm not convinced it's practical.
> >For example, I'm not even convinced that the set of valid SSP values
> >is documented.

Eh, crappy SDM writing isn't a good reason to make our lives harder.  The
SSP MSRs are canonical MSRs and follow the same rules as the SYSCALL,
FS/GS BASE, etc... MSRs.  I'll file an SDM bug.

> >So maybe (b) is the right choice.  Getting a good implementation might
> >be tricky.  Right now, we restore FPU too late in
> >arch_exit_to_user_mode_prepare(), and that function isn't allowed to
> >fail or to send signals.  We could kill the task on failure, and I
> >suppose we could consider queueing a signal, sending IPI-to-self, and
> >returning with TIF_NEED_FPU_LOAD still set and bogus state.  Or we
> >could rework the exit-to-usermode code to allow failure.  All of this
> >becomes utterly gross for the return-from-NMI path, although I guess
> >we don't restore FPU regs in that path regardless.  Or we can
> >do_exit() and just bail outright.
> >
> >I think it would be polite to at least allow core dumping a bogus FPU
> >state, and notifying ptrace() might be nice.  And, if the bogus part
> >of the FPU state is non-supervisor, we could plausibly deliver a
> >signal, but this is (as above) potentially quite difficult.
> >
> >(As an aside, our current handling of signal delivery failure sucks.
> >We should *at least* log something useful.)
> >
> >
> >Regardless of how we decide to handle this, I do think we need to do
> >*something* before applying the CET patches.
> >
> 
> Before supervisor states are introduced, XRSTOR* fails because one of the
> following: memory operand is invalid, xstate_header is wrong, or
> fxregs_state->mxcsr is wrong.  So the code in ex_handler_fprestore() was
> good.
> 
> When supervisor states are introduced for CET and PASID, XRSTORS can fail
> for only one additional reason: if it effects a WRMSR of invalid values.
> 
> If the kernel writes to the MSRs directly, there is wrmsr_safe().  If the
> kernel writes to MSRs' xstates, it can check the values first.  So this
> might not need a generalized handling (but I would not oppose it). Maybe we
> can add a config debug option to check if any writes to those MSR xstates
> are checked before being written (and print out warnings when not)?

That's not really checking the values first though, e.g. if the WRMSR succeeds,
which is the common case, but a later WRMSR fails, then you have to back out
the first MSR.  Even if all goes well, each WRMSR is 125+ cycles, which means
that loading state would get very painful and would defeat the entire reason
for shoving CET into XSAVE state.

Having a try-catch variant at the lowest level, i.e. propagating errors to the
the caller, and building on that sounds appealing.  E.g. KVM could use the
try-catch to test that incoming XSAVE state is valid when userspace is stuffing
guest state instead of manually validating every piece.  Validating CET and
PASID won't be too painful, but there might be a breaking point if the current
trend of shoving everything into XSAVE continues.

One thought for a lowish effort approach to pave the way for CET would be to
try XRSTORS multiple times in switch_fpu_return().  If the first try fails,
then WARN, init non-supervisor state and try a second time, and if _that_ fails
then kill the task.  I.e. do the minimum effort to play nice with bad FPU
state, but don't let anything "accidentally" turn off CET.
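
Roughly, that lowish-effort approach could look like the below (a sketch
only: xrstor_safe() and fpstate_init_user_only() are placeholder names for
an XRSTORS wrapper that reports failure and a user-state-only init helper):

	void switch_fpu_return(void)
	{
		struct fpu *fpu = &current->thread.fpu;

		if (!test_thread_flag(TIF_NEED_FPU_LOAD))
			return;

		if (!xrstor_safe(&fpu->state.xsave, xfeatures_mask_all))
			return;				/* common case */

		/* Bad FPU state: WARN, init non-supervisor state, retry. */
		WARN_ON_FPU(1);
		fpstate_init_user_only(&fpu->state);
		if (!xrstor_safe(&fpu->state.xsave, xfeatures_mask_all))
			return;

		/* Supervisor state (e.g. CET) is also bad: kill the task. */
		force_sig(SIGKILL);
	}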


Re: [PATCH V2 1/3] efi: Support for MOK variable config table

2020-10-01 Thread Ard Biesheuvel
On Thu, 1 Oct 2020 at 19:44, Nathan Chancellor  wrote:
>
> On Fri, Sep 04, 2020 at 09:31:05PM -0400, Lenny Szubowicz wrote:
> > Because of system-specific EFI firmware limitations, EFI volatile
> > variables may not be capable of holding the required contents of
> > the Machine Owner Key (MOK) certificate store when the certificate
> > list grows above some size. Therefore, an EFI boot loader may pass
> > the MOK certs via a EFI configuration table created specifically for
> > this purpose to avoid this firmware limitation.
> >
> > An EFI configuration table is a much more primitive mechanism
> > compared to EFI variables and is well suited for one-way passage
> > of static information from a pre-OS environment to the kernel.
> >
> > This patch adds initial kernel support to recognize, parse,
> > and validate the EFI MOK configuration table, where named
> > entries contain the same data that would otherwise be provided
> > in similarly named EFI variables.
> >
> > Additionally, this patch creates a sysfs binary file for each
> > EFI MOK configuration table entry found. These files are read-only
> > to root and are provided for use by user space utilities such as
> > mokutil.
> >
> > A subsequent patch will load MOK certs into the trusted platform
> > key ring using this infrastructure.
> >
> > Signed-off-by: Lenny Szubowicz 
>
> I have not seen this reported yet but this breaks arm allyesconfig and
> allmodconfig when CPU_LITTLE_ENDIAN is force selected (because CONFIG_EFI
> will actually be enabled):
>
> $ cat le.config
> CONFIG_CPU_BIG_ENDIAN=n
>
> $ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- 
> KCONFIG_ALLCONFIG=le.config allyesconfig drivers/firmware/efi/mokvar-table.o
> drivers/firmware/efi/mokvar-table.c: In function 'efi_mokvar_table_init':
> drivers/firmware/efi/mokvar-table.c:139:5: error: implicit declaration of 
> function 'early_memunmap' [-Werror=implicit-function-declaration]
>   139 | early_memunmap(va, map_size);
>   | ^~
> drivers/firmware/efi/mokvar-table.c:148:9: error: implicit declaration of 
> function 'early_memremap' [-Werror=implicit-function-declaration]
>   148 |va = early_memremap(efi.mokvar_table, map_size);
>   | ^~
> drivers/firmware/efi/mokvar-table.c:148:7: warning: assignment to 'void *' 
> from 'int' makes pointer from integer without a cast [-Wint-conversion]
>   148 |va = early_memremap(efi.mokvar_table, map_size);
>   |   ^
> cc1: some warnings being treated as errors
> make[4]: *** [scripts/Makefile.build:283: 
> drivers/firmware/efi/mokvar-table.o] Error 1
>
> Cheers,
> Nathan

Hi Nathan,

Does adding

#include <asm/early_ioremap.h>

to drivers/firmware/efi/mokvar-table.c fix the issue?


Re: [PATCH v3 10/13] ASoC: tegra: Add audio graph based card driver

2020-10-01 Thread Dmitry Osipenko
01.10.2020 20:33, Sameer Pujar пишет:
> +/* Setup PLL clock as per the given sample rate */
> +static int tegra_audio_graph_update_pll(struct snd_pcm_substream *substream,
> + struct snd_pcm_hw_params *params)
> +{
> + struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream);
> + struct asoc_simple_priv *priv = snd_soc_card_get_drvdata(rtd->card);
> + struct device *dev = rtd->card->dev;
> + struct tegra_audio_graph_data *graph_data =
> + (struct tegra_audio_graph_data *)priv->data;
> + struct tegra_audio_chip_data *chip_data =
> + (struct tegra_audio_chip_data *)of_device_get_match_data(dev);

void* doesn't need casting
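
I.e. (illustrative only; note that of_device_get_match_data() returns
const void *, so the second target wants a const-qualified pointer rather
than a cast):

	struct tegra_audio_graph_data *graph_data = priv->data;
	const struct tegra_audio_chip_data *chip_data =
		of_device_get_match_data(dev);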


Re: [PATCH net-next 01/16] devlink: Change devlink_reload_supported() param type

2020-10-01 Thread Jakub Kicinski
On Thu,  1 Oct 2020 16:59:04 +0300 Moshe Shemesh wrote:
> Change devlink_reload_supported() function to get devlink_ops pointer
> param instead of devlink pointer param.
> This change will be used in the next patch to check if devlink reload is
> supported before devlink instance is allocated.
> 
> Signed-off-by: Moshe Shemesh 

Reviewed-by: Jakub Kicinski 


RE: [PATCH v2] PCI: hv: Fix hibernation in case interrupts are not re-created

2020-10-01 Thread Dexuan Cui
> From: Lorenzo Pieralisi 
> Sent: Thursday, October 1, 2020 3:13 AM
> > ...
> > I mean this is a Hyper-V specific problem, so IMO we should fix the
> > pci-hyperv driver rather than change the PCI device drivers, which 
> > work perfectly on a physical machine and on other hypervisors. 
> > Also it can be difficult or impossible to ask the authors of the 
> > aforementioned PCI device drivers to destroy and re-create 
> > MSI/MSI-X across hibernation, especially for the out-of-tree driver(s).
> 
> Good, so why did you mention PCI drivers in the commit log if they
> are not related to the problem you are fixing ?

I mentioned the names of the PCI device drivers because IMO people
> want to know how the issue can be reproduced (i.e. which PCI devices
are affected and which are not), so they know how to test this patch.

I'll remove the names of the unaffected PCI device drivers from the 
commit log, and only keep the name of the Nvidia GPU drivers (which
are so far the only drivers I have identified that are affected, when
Linux VM runs on Hyper-V and hibernates).
 
> > > Regardless, this commit log does not provide the information that
> > > it should.
> >
> > Hi Lozenzo, I'm happy to add more info. Can you please let me know
> > what extra info I should provide?
> 
> s/Lozenzo/Lorenzo
Sorry! Will fix the typo.
 
> The info you describe properly below, namely what the _actual_ problem
> is.

I will send v3 with the below info.
 
> > Here if hv_irq_unmask does not call pci_msi_unmask_irq(), the
> > desc->masked remains "true", so later after hibernation, the MSI
> > interrupt line always remains masked, which is incorrect.
> >
> > Here the silent failure of hv_irq_unmask() does not matter since all the
> > non-boot CPUs are being offlined (meaning all the devices have been
> > frozen). Note: the correct affinity info is still updated into the
> > irqdata data structure in migrate_one_irq() -> irq_do_set_affinity() ->
> > hv_set_affinity(), so when the VM resumes, hv_pci_resume() ->
> > hv_pci_restore_msi_state() is able to correctly restore the irqs with
> > the correct affinity.
> >
> > I hope the explanation can help clarify things. I understand this is
> > not as natural as the case where Linux runs on a physical machine, but
> > due to the unique PCI pass-through implementation of Hyper-V, IMO this
> > is the only viable fix for the problem here. BTW, this patch is only
> > confined to the pci-hyperv driver and I believe it cannot cause any
> > regression.
> 
> Understood, write this in the commit log and I won't nag you any further.

Ok. I treat it as an opportunity to practise and improve my writing :-)
 
> Side note: this issue is there because the hypcall failure triggers
> an early return from hv_irq_unmask(). 

Yes.

> Is that early return really correct ? 

Good question. IMO it's incorrect, because hv_irq_unmask() is called 
when the interrupt is activated for the first time, and when the interrupt's
affinity is being changed. In these cases, we may as well call
pci_msi_unmask_irq() unconditionally, even if the hypercall fails.

BTW, AFAIK, in practice the hypercall only fails in 2 cases:
1. The device is removed when Linux VM has not finished the device's
initialization.
2. In hibernation, the device has been disabled while the generic
hibernation code tries to migrate the interrupt, as I explained.

In the 2 cases, the hypercall returns the same error code
HV_STATUS_INVALID_PARAMETER(0x5).

> Another possibility is just logging the error and let
> hv_irq_unmask() continue and call pci_msi_unmask_irq() in the exit
> path.

This is a good idea. I'll make this change in v3.
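
For instance, the tail of hv_irq_unmask() could be structured roughly like
this (a sketch, not the actual v3 patch; locals abbreviated):

	res = hv_do_hypercall(HVCALL_RETARGET_INTERRUPT, params, NULL);
	if (res)
		dev_err(&hbus->hdev->device,
			"%s() failed: %#llx\n", __func__, res);
	/* no early return: fall through and unmask regardless */

	spin_unlock_irqrestore(&hbus->retarget_msi_interrupt_lock, flags);
	pci_msi_unmask_irq(data);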
 
> Is there a hypcall return value that you can use to detect fatal vs
> non-fatal (ie hibernation) hypcall failures ?

Unfortunately, IMO there is not. The spec (v6.0b)'s section 10.5.4 (page 106)
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
does define some return values, but IMO they're not applicable here.

> I was confused by reading the patch since it seemed that you call
> pci_msi_unmask_irq() _only_ while hibernating, which was certainly
> a bug.
> 
> Thank you for explaining.
> 
> Lorenzo

Thanks for reviewing! I'll post v3. Looking forward to your new comments!

Thanks,
-- Dexuan


Re: [PATCH v4 1/4] mm: memcontrol: use helpers to access page's memcg data

2020-10-01 Thread Roman Gushchin
On Thu, Oct 01, 2020 at 02:59:50PM -0400, Johannes Weiner wrote:
> On Thu, Oct 01, 2020 at 11:27:39AM -0700, Roman Gushchin wrote:
> > On Thu, Oct 01, 2020 at 09:46:38AM -0400, Johannes Weiner wrote:
> > > On Wed, Sep 30, 2020 at 05:27:07PM -0700, Roman Gushchin wrote:
> > > > +/*
> > > > + * set_page_memcg - associate a page with a memory cgroup
> > > > + * @page: a pointer to the page struct
> > > > + * @memcg: a pointer to the memory cgroup
> > > > + *
> > > > + * Associates a page with a memory cgroup.
> > > > + */
> > > > +static inline void set_page_memcg(struct page *page, struct mem_cgroup 
> > > > *memcg)
> > > > +{
> > > > +   VM_BUG_ON_PAGE(PageSlab(page), page);
> > > > +
> > > > +   /*
> > > > +* Please, refer to page_memcg()'s description for the page and 
> > > > memcg
> > > > +* binding stability requirements.
> > > > +*/
> > > > +   page->memcg_data = (unsigned long)memcg;
> > > > +}
> > > 
> > > Please delete and inline this as per previous feedback, thanks.
> > 
> > Why it's better?
> > It's ok for set_page_memcg(), but obviously worse for set_page_objcgs():
> > it was nice to have all bit magic in one place, in few helper functions.
> > And now it spills into several places. What's the win?
> 
> set_page_objcgs() is a worthwhile abstraction because it includes the
> synchronization primitives that make it safe to use wrt
> page_objcgs(). They encapsulate the cmpxchg and the READ_ONCE().
> 
> set_page_memcg() doesn't do any synchronization and relies fully on
> the contextual locking. The name implies that it includes things to
> make it safe wrt page_memcg(), which isn't true at all. It's a long
> and misleading name for '='.
> 
> Btw, I really don't mind having this discussion, but please don't send
> revisions that silently ignore feedback you don't agree with.

I'm not ignoring it: I thought you were looking to remove the clear_page_*
functions, but it wasn't clear you wanted to eliminate the set_page_memcg()
function too. When you're asking for "style" changes, please provide some
rationale; it's far less obvious than you think what exactly you don't like
in the proposed version.

Thanks.
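
For reference, the contrast being discussed looks roughly like this
(simplified from the series; MEMCG_DATA_OBJCGS is the low-bit flag the
series uses to tag an objcgs pointer):

	static inline void set_page_memcg(struct page *page,
					  struct mem_cgroup *memcg)
	{
		page->memcg_data = (unsigned long)memcg;	/* a bare '=' */
	}

	static inline bool set_page_objcgs(struct page *page,
					   struct obj_cgroup **objcgs)
	{
		/* encapsulates the synchronization that pairs with page_objcgs() */
		return !cmpxchg(&page->memcg_data, 0,
				(unsigned long)objcgs | MEMCG_DATA_OBJCGS);
	}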


Re: [PATCH v4 3/9] lib: zstd: Upgrade to latest upstream zstd version 1.4.6

2020-10-01 Thread kernel test robot
Hi Nick,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on f2fs/dev-test linus/master v5.9-rc7 next-20201001]
[cannot apply to cryptodev/master crypto/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-6/20200930-145157
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
config: i386-randconfig-c001-20200930 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

echo
echo "coccinelle warnings: (new ones prefixed by >>)"
echo
>> lib/zstd/compress/zstd_compress.c:3248:24-25: Unneeded semicolon

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH] lib: zstd: fix semicolon.cocci warnings

2020-10-01 Thread kernel test robot
From: kernel test robot 

lib/zstd/compress/zstd_compress.c:3248:24-25: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Nick Terrell 
Signed-off-by: kernel test robot 
---

url:
https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-6/20200930-145157
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next

 zstd_compress.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/zstd/compress/zstd_compress.c
+++ b/lib/zstd/compress/zstd_compress.c
@@ -3245,7 +3245,7 @@ size_t ZSTD_compress(void* dst, size_t d
 ZSTD_CCtx* cctx = ZSTD_createCCtx();
 RETURN_ERROR_IF(!cctx, memory_allocation, "ZSTD_createCCtx failed");
 result = ZSTD_compressCCtx(cctx, dst, dstCapacity, src, srcSize, 
compressionLevel);
-ZSTD_freeCCtx(cctx);;
+ZSTD_freeCCtx(cctx);
 return result;
 }
 


Re: [PATCH V3 1/8] sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output

2020-10-01 Thread Greg Kroah-Hartman
On Wed, Sep 30, 2020 at 09:17:03PM -0700, Kees Cook wrote:
> On Wed, Sep 30, 2020 at 01:57:40PM +0200, Greg Kroah-Hartman wrote:
> > Kees, and Rafael, I don't know if you saw this proposal from Joe for
> > sysfs files, questions below:
> 
> I'm a fan. I think the use of sprintf() in sysfs might have been one of
> my earliest complaints about unsafe code patterns in the kernel. ;)

Ok, great.

> > > +/**
> > > + *   sysfs_emit - scnprintf equivalent, aware of PAGE_SIZE buffer.
> > > + *   @buf:   start of PAGE_SIZE buffer.
> > > + *   @fmt:   format
> > > + *   @...:   optional arguments to @format
> > > + *
> > > + *
> > > + * Returns number of characters written to @buf.
> > > + */
> > > +int sysfs_emit(char *buf, const char *fmt, ...)
> > > +{
> > > + va_list args;
> > > + int len;
> > > +
> > > + if (WARN(!buf || offset_in_page(buf),
> > > +  "invalid sysfs_emit: buf:%p\n", buf))
> 
> I don't want the %p here, but otherwise, sure. I'd also make it a _ONCE
> variant:
> 
>   if (WARN_ONCE(!buf || offset_in_page(buf),
>"invalid sysfs_emit: offset_in_page(buf):%zd\n",
> buf ? offset_in_page(buf) : 0))

As Joe points out, _ONCE doesn't work because this happens from all
sysfs files, not just one.

thanks,

greg k-h
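
For context, a typical show() routine using the helper would be
(illustrative; foo_device is a made-up driver type):

	static ssize_t model_show(struct device *dev,
				  struct device_attribute *attr, char *buf)
	{
		struct foo_device *foo = dev_get_drvdata(dev);

		/* buf is the PAGE_SIZE buffer sysfs passes in; sysfs_emit()
		 * sanity-checks exactly that assumption. */
		return sysfs_emit(buf, "%s\n", foo->model);
	}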


Re: [PATCH 1/6] powerpc/time: Rename mftbl() to mftb()

2020-10-01 Thread Segher Boessenkool
On Thu, Oct 01, 2020 at 12:42:39PM +, Christophe Leroy wrote:
> On PPC64, we have mftb().
> On PPC32, we have mftbl() and an #define mftb() mftbl().
> 
> mftb() and mftbl() are equivalent, their purpose is to read the
> content of SPRN_TRBL, as returned by 'mftb' simplified instruction.
> 
> binutils seems to define 'mftbl' instruction as an equivalent
> of 'mftb'.
> 
> However in both 32 bits and 64 bits documentation, only 'mftb' is
> defined, and when performing a disassembly with objdump, the displayed
> instruction is 'mftb'
> 
> No need to have two ways to do the same thing with different
> names, rename mftbl() to have only mftb().

There are mttbl and mttbu insns (and no mttb insn); they write a 32-bit
half for the time base.  There is an mftb, and an mftbu.  mftbu reads
the upper half, while mftb reads the *whole* register.  SPR 269 is the
TBU register, while SPR 268 is called both TB and TBL.  Yes, it is
confusing :-)

The "mftb" name is much clearer than "mftbl" (on 64-bit), because it
reads the whole 64-bit register.  On 32-bit mftbl is clearer (but not
defined in the architecture, not officially an insn or even an extended
mnemonic).


Segher
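
For background, this is why 32-bit code reads the full 64-bit time base
with the well-known retry loop (a sketch of the usual idiom):

	static inline u64 get_tb32(void)
	{
		unsigned int hi, lo, chk;

		do {
			asm volatile("mftbu %0; mftb %1; mftbu %2"
				     : "=r" (hi), "=r" (lo), "=r" (chk));
		} while (hi != chk);	/* TBL wrapped between reads; retry */

		return ((u64)hi << 32) | lo;
	}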


Re: [PATCH 1/1] drm/amdgpu: fix NULL pointer dereference for Renoir

2020-10-01 Thread Alex Deucher
On Thu, Oct 1, 2020 at 4:33 PM Dirk Gouders  wrote:
>
> Dirk Gouders  writes:
>
> > Commit c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
> > introduced a NULL pointer dereference when booting with
> > amdgpu.discovery=0, because it removed the call of vega10_reg_base_init()
> > for that case.
> >
> > Fix this by calling that function if amdgpu_discovery == 0 in addition to
> > the case that amdgpu_discovery_reg_base_init() failed.
> >
> > Fixes: c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
> > Signed-off-by: Dirk Gouders 
> > Cc: Hawking Zhang 
> > Cc: Evan Quan 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> > b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 84d811b6e48b..f8cb62b326d6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -694,12 +694,12 @@ static void soc15_reg_base_init(struct amdgpu_device 
> > *adev)
> >* it doesn't support SRIOV. */
> >   if (amdgpu_discovery) {
> >   r = amdgpu_discovery_reg_base_init(adev);
> > - if (r) {
> > - DRM_WARN("failed to init reg base from ip 
> > discovery table, "
> > -  "fallback to legacy init method\n");
> > - vega10_reg_base_init(adev);
> > - }
> > + if (r == 0)
> > +   break;
>
> Grrr, wrong indentation here.
> But I will wait for your review before v1.

Fixed up locally and applied.  Thanks!

Alex


>
> Dirk
>
>
> > + DRM_WARN("failed to init reg base from ip discovery 
> > table, "
> > +  "fallback to legacy init method\n");
> >   }
> > + vega10_reg_base_init(adev);
> >   break;
> >   case CHIP_VEGA20:
> >   vega20_reg_base_init(adev);


Re: [PATCH v3 05/18] dmaengine: idxd: add IMS support in base driver

2020-10-01 Thread Dave Jiang




On 9/30/2020 11:47 AM, Thomas Gleixner wrote:

On Tue, Sep 15 2020 at 16:28, Dave Jiang wrote:

  struct idxd_device {
@@ -170,6 +171,7 @@ struct idxd_device {
  
  	int num_groups;
  
+	u32 ims_offset;

u32 msix_perm_offset;
u32 wqcfg_offset;
u32 grpcfg_offset;
@@ -177,6 +179,7 @@ struct idxd_device {
  
  	u64 max_xfer_bytes;

u32 max_batch_size;
+   int ims_size;
int max_groups;
int max_engines;
int max_tokens;
@@ -196,6 +199,7 @@ struct idxd_device {
struct work_struct work;
  
  	int *int_handles;

+   struct sbitmap ims_sbmap;


This bitmap is needed for what?


Nothing anymore. I forgot to remove. All this is handled by MSI core now with 
code from you.





--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -231,10 +231,51 @@ static void idxd_read_table_offsets(struct idxd_device 
*idxd)
idxd->msix_perm_offset = offsets.msix_perm * 0x100;
dev_dbg(dev, "IDXD MSIX Permission Offset: %#x\n",
idxd->msix_perm_offset);
+   idxd->ims_offset = offsets.ims * 0x100;


Magic constant pulled out of thin air. #define 


Will fix




+   dev_dbg(dev, "IDXD IMS Offset: %#x\n", idxd->ims_offset);
idxd->perfmon_offset = offsets.perfmon * 0x100;
dev_dbg(dev, "IDXD Perfmon Offset: %#x\n", idxd->perfmon_offset);
  }
  
+#define PCI_DEVSEC_CAP		0x23

+#define SIOVDVSEC1(offset) ((offset) + 0x4)
+#define SIOVDVSEC2(offset) ((offset) + 0x8)
+#define DVSECID			0x5
+#define SIOVCAP(offset)		((offset) + 0x14)
+
+static void idxd_check_siov(struct idxd_device *idxd)
+{
+   struct pci_dev *pdev = idxd->pdev;
+   struct device *dev = &pdev->dev;
+   int dvsec;
+   u16 val16;
+   u32 val32;
+
+   dvsec = pci_find_ext_capability(pdev, PCI_DEVSEC_CAP);
+   pci_read_config_word(pdev, SIOVDVSEC1(dvsec), &val16);
+   if (val16 != PCI_VENDOR_ID_INTEL) {
+   dev_dbg(&pdev->dev, "DVSEC vendor id is not Intel\n");
+   return;
+   }
+
+   pci_read_config_word(pdev, SIOVDVSEC2(dvsec), &val16);
+   if (val16 != DVSECID) {
+   dev_dbg(&pdev->dev, "DVSEC ID is not SIOV\n");
+   return;
+   }
+
+   pci_read_config_dword(pdev, SIOVCAP(dvsec), &val32);
+   if ((val32 & 0x1) && idxd->hw.gen_cap.max_ims_mult) {
+   idxd->ims_size = idxd->hw.gen_cap.max_ims_mult * 256ULL;
+   dev_dbg(dev, "IMS size: %u\n", idxd->ims_size);
+   set_bit(IDXD_FLAG_SIOV_SUPPORTED, &idxd->flags);
+   dev_dbg(&pdev->dev, "IMS supported for device\n");
+   return;
+   }
+
+   dev_dbg(>dev, "SIOV unsupported for device\n");


It's really hard to find the code inside all of this dev_dbg()
noise. But why is this capability check done in this driver? Is this
capability stuff really IDXD specific or is the next device which
supports this going to copy and pasta the above?


Will look into moving this into a common detection function for all similar 
devices. This should be common for all Intel devices that support SIOV.
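
Something along these lines, perhaps (hypothetical helper, not an existing
kernel API; a real version would iterate in case of multiple DVSEC
instances):

	static bool pci_dvsec_is_intel_siov(struct pci_dev *pdev)
	{
		u16 vendor, id;
		int pos;

		pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_DVSEC);
		if (!pos)
			return false;

		pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vendor);
		pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2, &id);

		return vendor == PCI_VENDOR_ID_INTEL && id == 0x5; /* SIOV */
	}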





  static void idxd_read_caps(struct idxd_device *idxd)
  {
	struct device *dev = &idxd->pdev->dev;
@@ -253,6 +294,7 @@ static void idxd_read_caps(struct idxd_device *idxd)
dev_dbg(dev, "max xfer size: %llu bytes\n", idxd->max_xfer_bytes);
idxd->max_batch_size = 1U << idxd->hw.gen_cap.max_batch_shift;
dev_dbg(dev, "max batch size: %u\n", idxd->max_batch_size);
+   idxd_check_siov(idxd);
if (idxd->hw.gen_cap.config_en)
	set_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags);
  
@@ -347,9 +389,19 @@ static int idxd_probe(struct idxd_device *idxd)
  
  	idxd->major = idxd_cdev_get_major(idxd);
  
+	if (idxd->ims_size) {

+   rc = sbitmap_init_node(&idxd->ims_sbmap, idxd->ims_size, -1,
+  GFP_KERNEL, dev_to_node(dev));
+   if (rc < 0)
+   goto sbitmap_fail;
+   }


Ah, here the bitmap is allocated, but it's still completely unclear what
it is used for.


Need to remove.



The subject line is misleading as hell. This does not add support, it's
doing some magic capability checks and allocates stuff which nobody
knows what it is used for.


With the unneeded code removed and the SIOV detection code moved to a
common implementation, it should be clearer.




Thanks,

 tglx




[RFC PATCH 18/22] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits

2020-10-01 Thread Chang S. Bae
Intel's Advanced Matrix Extension (AMX) is a new 64-bit extended feature
consisting of two-dimensional registers and an accelerator unit. The first
implementation of the latter is the tile matrix multiply unit (TMUL). TMUL
performs SIMD dot-products on four bytes (INT8) or two bfloat16
floating-point (BF16) elements.

Here we add AMX to the kernel/user ABI, by enumerating the capability.
E.g., /proc/cpuinfo: amx_tile, amx_bf16, amx_int8

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/cpufeatures.h | 3 +++
 arch/x86/kernel/cpu/cpuid-deps.c   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 7d7fe1d82966..79ad9bb1c01c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -371,6 +371,9 @@
 #define X86_FEATURE_SERIALIZE  (18*32+14) /* SERIALIZE instruction */
 #define X86_FEATURE_PCONFIG(18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_ARCH_LBR   (18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_AMX_BF16   (18*32+22) /* AMX BF16 Support */
+#define X86_FEATURE_AMX_TILE   (18*32+24) /* AMX tile Support */
+#define X86_FEATURE_AMX_INT8   (18*32+25) /* AMX INT8 Support */
 #define X86_FEATURE_SPEC_CTRL  (18*32+26) /* "" Speculation Control 
(IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP(18*32+27) /* "" Single Thread 
Indirect Branch Predictors */
 #define X86_FEATURE_FLUSH_L1D  (18*32+28) /* Flush L1D cache */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 3cbe24ca80ab..27e036c73f7e 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,9 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_MBM_TOTAL,X86_FEATURE_CQM_LLC   },
{ X86_FEATURE_CQM_MBM_LOCAL,X86_FEATURE_CQM_LLC   },
{ X86_FEATURE_AVX512_BF16,  X86_FEATURE_AVX512VL  },
+   { X86_FEATURE_AMX_TILE, X86_FEATURE_XSAVE },
+   { X86_FEATURE_AMX_INT8, X86_FEATURE_AMX_TILE  },
+   { X86_FEATURE_AMX_BF16, X86_FEATURE_AMX_TILE  },
{}
 };
 
-- 
2.17.1



[RFC PATCH 03/22] x86/fpu/xstate: Modify address finder prototypes to access all the possible areas

2020-10-01 Thread Chang S. Bae
The xstate infrastructure is not flexible to support dynamic areas in
task->fpu. Change the prototype of some address finding functions to access
task->fpu directly. Make changes for both outer and inner helpers:
get_xsave_addr() and __raw_xsave_addr().

No functional change.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h |  2 +-
 arch/x86/include/asm/fpu/xstate.h   |  2 +-
 arch/x86/include/asm/pgtable.h  |  2 +-
 arch/x86/kernel/cpu/common.c|  2 +-
 arch/x86/kernel/fpu/xstate.c| 43 -
 arch/x86/kvm/x86.c  | 26 +++--
 arch/x86/mm/pkeys.c |  2 +-
 7 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h 
b/arch/x86/include/asm/fpu/internal.h
index c404fedf1a75..baca80e877a6 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -578,7 +578,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
 * return to userland e.g. for a copy_to_user() operation.
 */
if (current->mm) {
-   pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
+   pk = get_xsave_addr(new_fpu, XFEATURE_PKRU);
if (pk)
pkru_val = pk->pkru;
}
diff --git a/arch/x86/include/asm/fpu/xstate.h 
b/arch/x86/include/asm/fpu/xstate.h
index a315b055212f..3fbf45727ad6 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -100,7 +100,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 u64 xstate_mask);
 
-void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
 const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b836138ce852..e24a8fb8f479 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,7 +142,7 @@ static inline void write_pkru(u32 pkru)
if (!boot_cpu_has(X86_FEATURE_OSPKE))
return;
 
-   pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+   pk = get_xsave_addr(&current->thread.fpu, XFEATURE_PKRU);
 
/*
 * The PKRU value in xstate needs to be in sync with the value that is
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index d0363e15ec2e..183ee7f77065 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -478,7 +478,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
return;
 
cr4_set_bits(X86_CR4_PKE);
-   pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
+   pk = get_xsave_addr(NULL, XFEATURE_PKRU);
if (pk)
pk->pkru = init_pkru_value;
/*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index e3a9bddc39d9..bab22766b79b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -891,15 +891,23 @@ void fpu__resume_cpu(void)
  * buffer the state is.  Callers should ensure that the buffer
  * is valid.
  */
-static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
+static void *__raw_xsave_addr(struct fpu *fpu, int xfeature_nr)
 {
+   void *xsave;
+
if (!xfeature_enabled(xfeature_nr)) {
WARN_ON_FPU(1);
return NULL;
}
 
-   return (void *)xsave + xstate_comp_offsets[xfeature_nr];
+   if (fpu)
+   xsave = &fpu->state.xsave;
+   else
+   xsave = &init_fpstate.xsave;
+
+   return xsave + xstate_comp_offsets[xfeature_nr];
 }
+
 /*
  * Given the xsave area and a state inside, this function returns the
  * address of the state.
@@ -911,15 +919,18 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, 
int xfeature_nr)
  * this will return NULL.
  *
  * Inputs:
- * xstate: the thread's storage area for all FPU data
+ * fpu: the thread's FPU data to access all the FPU state storages.
+ *  (If a null pointer is given, assume the init_fpstate.)
  * xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
  * XFEATURE_SSE, etc...)
  * Output:
  * address of the state in the xsave area, or NULL if the
  * field is not present in the xsave buffer.
  */
-void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
+void *get_xsave_addr(struct fpu *fpu, int xfeature_nr)
 {
+   struct xregs_state *xsave;
+
/*
 * Do we even *have* xsave state?
 */
@@ -932,6 +943,12 @@ void *get_xsave_addr(struct xregs_state *xsave, int 
xfeature_nr)
 */

[RFC PATCH 00/22] x86: Support Intel Advanced Matrix Extensions

2020-10-01 Thread Chang S. Bae
Intel Advanced Matrix Extensions (AMX)[1][2] will be shipping on servers
soon.  AMX consists of configurable TMM "TILE" registers plus new
accelerator instructions that operate on them.  TMUL (Tile matrix MULtiply)
is the first accelerator instruction set to use the new registers, and we
anticipate additional instructions in the future.

Neither AMX state nor TMUL instructions depend on AVX.  However, AMX and
AVX do share common challenges.  The TMM registers are 8KB today, and
architecturally as large as 64KB, which merits updates to hardware and
software state management.

Further, both technologies run faster when they are not simultaneously
running on SMT siblings, and both technologies' use of power and bandwidth
impacts the power and performance available to neighboring cores.  (This
impact has measurably improved in recent hardware.)

If the existing kernel approach for managing XSAVE state were employed to
handle AMX, 8KB of space would be added to every task, but possibly rarely
used.  So Linux support is optimized by using a new XSAVE feature: eXtended
Feature Disabling (XFD).  The kernel arms XFD to provide a #NM exception
upon a task's first access to TILE state. The kernel exception handler
installs the appropriate XSAVE context switch buffer, and the task behaves
as if the kernel had done that for all tasks.  Using XFD, AMX space is
allocated only when needed, eliminating the memory waste for unused state
components.

This series requires the new minimum sigaltstack support [4] and is based
on the mainline with dynamic supervisor state support [3]. The series is
composed of three parts:
* Patch 1-16: Foundation to support dynamic user state management, as
  preparatory for managing tile data state.
* Patch 17-21: Actual AMX enablement, including unit tests
* Patch 22: Introduce boot parameters

Thanks to Len Brown and Dave Hansen for help with the cover letter.

[1]: Intel Architecture Instruction Set Extension Programming Reference
 June 2020, 
https://software.intel.com/content/dam/develop/public/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
[2]: 
https://software.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/intrinsics/intrinsics-for-intel-advanced-matrix-extensions-intel-amx-instructions.html
[3]: 
https://lore.kernel.org/lkml/1593780569-62993-1-git-send-email-kan.li...@linux.intel.com/
[4]: 
https://lore.kernel.org/lkml/20200929205746.6763-1-chang.seok@intel.com/

Chang S. Bae (22):
  x86/fpu/xstate: Modify area init helper prototypes to access all the
possible areas
  x86/fpu/xstate: Modify xstate copy helper prototypes to access all the
possible areas
  x86/fpu/xstate: Modify address finder prototypes to access all the
possible areas
  x86/fpu/xstate: Modify save and restore helper prototypes to access
all the possible areas
  x86/fpu/xstate: Introduce a new variable for dynamic user states
  x86/fpu/xstate: Outline dynamic xstate area size in the task context
  x86/fpu/xstate: Introduce helpers to manage an xstate area dynamically
  x86/fpu/xstate: Define the scope of the initial xstate data
  x86/fpu/xstate: Introduce wrapper functions for organizing xstate area
access
  x86/fpu/xstate: Update xstate save function for supporting dynamic
user xstate
  x86/fpu/xstate: Update xstate area address finder for supporting
dynamic user xstate
  x86/fpu/xstate: Update xstate context copy function for supporting
dynamic area
  x86/fpu/xstate: Expand dynamic user state area on first use
  x86/fpu/xstate: Inherit dynamic user state when used in the parent
  x86/fpu/xstate: Support ptracer-induced xstate area expansion
  x86/fpu/xstate: Support dynamic user state in the signal handling path
  x86/fpu/xstate: Extend the table for mapping xstate components with
features
  x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature
bits
  x86/fpu/amx: Define AMX state components and have it used for
boot-time checks
  x86/fpu/amx: Enable the AMX feature in 64-bit mode
  selftest/x86/amx: Include test cases for the AMX state management
  x86/fpu/xstate: Introduce boot-parameters for control some state
component support

 .../admin-guide/kernel-parameters.txt |  15 +
 arch/x86/include/asm/cpufeatures.h|   4 +
 arch/x86/include/asm/fpu/internal.h   |  99 ++-
 arch/x86/include/asm/fpu/types.h  |  62 +-
 arch/x86/include/asm/fpu/xstate.h |  41 +-
 arch/x86/include/asm/msr-index.h  |   2 +
 arch/x86/include/asm/pgtable.h|   2 +-
 arch/x86/include/asm/processor.h  |  10 +-
 arch/x86/include/asm/trace/fpu.h  |   6 +-
 arch/x86/kernel/cpu/common.c  |   2 +-
 arch/x86/kernel/cpu/cpuid-deps.c  |   3 +
 arch/x86/kernel/fpu/core.c| 107 ++-
 arch/x86/kernel/fpu/init.c| 

[RFC PATCH 14/22] x86/fpu/xstate: Inherit dynamic user state when used in the parent

2020-10-01 Thread Chang S. Bae
When a new task is created, the kernel copies all the states from the
parent. If the parent already has any dynamic user state in use, the new
task has to expand its XSAVE buffer to save that state. Also, disable the
associated first-use fault.

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/fpu/core.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2e07bfcd54b3..239c7798bc01 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -225,6 +225,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct 
*src)
 {
	struct fpu *dst_fpu = &dst->thread.fpu;
	struct fpu *src_fpu = &src->thread.fpu;
+   unsigned int size;
 
dst_fpu->last_cpu = -1;
 
@@ -233,15 +234,26 @@ int fpu__copy(struct task_struct *dst, struct task_struct 
*src)
 
	WARN_ON_FPU(src_fpu != &current->thread.fpu);
 
-   /*
-* Don't let 'init optimized' areas of the XSAVE area
-* leak into the child task:
-*/
-   memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_default_size);
-
-   dst_fpu->state_mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
dst_fpu->state_ptr = NULL;
 
+   /* Inherit the dynamic area if the parent already has one. */
+   if (src_fpu->state_ptr) {
+   int ret;
+
+   dst_fpu->state_mask = 0;
+   ret = alloc_xstate_area(dst_fpu, src_fpu->state_mask, &size);
+   if (ret)
+   return ret;
+   } else {
+   dst_fpu->state_mask = src_fpu->state_mask & ~xfeatures_mask_user_dynamic;
+   size = fpu_kernel_xstate_default_size;
+   /*
+* Don't let 'init optimized' areas of the XSAVE area
+* leak into the child task:
+*/
+   memset(&dst_fpu->state.xsave, 0, size);
+   }
+
/*
 * If the FPU registers are not current just memcpy() the state.
 * Otherwise save current FPU registers directly into the child's FPU
@@ -252,7 +264,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct 
*src)
 */
fpregs_lock();
if (test_thread_flag(TIF_NEED_FPU_LOAD))
-   memcpy(__xstate(dst_fpu), __xstate(src_fpu), 
fpu_kernel_xstate_default_size);
+   memcpy(__xstate(dst_fpu), __xstate(src_fpu), size);
 
else if (!copy_fpregs_to_fpstate(dst_fpu))
copy_kernel_to_fpregs(dst_fpu);
-- 
2.17.1



[RFC PATCH 22/22] x86/fpu/xstate: Introduce boot-parameters for control some state component support

2020-10-01 Thread Chang S. Bae
"xstate.disable=0x6000" will disable AMX on a system that has AMX compiled
into XFEATURE_MASK_USER_SUPPORTED.

"xstate.enable=0x6000" will enable AMX on a system that does NOT have AMX
compiled into XFEATURE_MASK_USER_SUPPORTED (assuming the kernel is new
enough to support this feature).

While this cmdline is currently enabled only for AMX, it is intended to be
easily enabled to be useful for future XSAVE-enabled features.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 .../admin-guide/kernel-parameters.txt | 15 ++
 arch/x86/include/asm/fpu/types.h  |  6 +++
 arch/x86/kernel/fpu/init.c| 52 +--
 3 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index a1068742a6df..742167c6f789 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5838,6 +5838,21 @@
which allow the hypervisor to 'idle' the guest on lock
contention.
 
+   xstate.enable=  [X86-64]
+   xstate.disable= [X86-64]
+   The kernel is compiled with a default xstate bitmask --
+   enabling it to use the XSAVE hardware to efficiently
+   save and restore thread states on context switch.
+   xstate.enable allows adding to that default mask at
+   boot-time without recompiling the kernel just to support
+   the new thread state. (Note that the kernel will ignore
+   any bits in the mask that do not correspond to features
+   that are actually available in CPUID)  xstate.disable
+   allows clearing bits in the default mask, forcing the
+   kernel to forget that it supports the specified thread
state. When a bit is set in both, the kernel gives
xstate.disable priority.
+
xirc2ps_cs= [NET,PCMCIA]
Format:

,[,[,[,]]]
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 002248dba6dc..2a944e8903bb 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -148,6 +148,12 @@ enum xfeature {
 #define XFEATURE_MASK_XTILE(XFEATURE_MASK_XTILE_DATA \
 | XFEATURE_MASK_XTILE_CFG)
 
+#define XFEATURE_REGION_MASK(max_bit, min_bit) \
+   ((BIT_ULL((max_bit) - (min_bit) + 1) - 1) << (min_bit))
+
+#define XFEATURE_MASK_CONFIGURABLE \
+   XFEATURE_REGION_MASK(XFEATURE_XTILE_DATA, XFEATURE_XTILE_CFG)
+
 #define FIRST_EXTENDED_XFEATUREXFEATURE_YMM
 
 struct reg_128_bit {
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 8e2a77bc1782..a354286e7c90 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -227,13 +227,42 @@ static void __init 
fpu__init_system_xstate_size_legacy(void)
  * This must be called after fpu__init_parse_early_param() is called and
  * xfeatures_mask is enumerated.
  */
+
+static u64 xstate_enable;
+static u64 xstate_disable;
+
 u64 __init fpu__get_supported_xfeatures_mask(void)
 {
u64 mask = XFEATURE_MASK_USER_SUPPORTED | 
XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 
-   if (!IS_ENABLED(CONFIG_X86_64))
-   mask &= ~(XFEATURE_MASK_XTILE);
-
+   if (!IS_ENABLED(CONFIG_X86_64)) {
+   mask  &= ~(XFEATURE_MASK_XTILE);
+   } else if (xstate_enable || xstate_disable) {
+   u64 custom = mask;
+   u64 unknown;
+
+   custom |= xstate_enable;
+   custom &= ~xstate_disable;
+
+   unknown = custom & ~mask;
+   if (unknown) {
+   /*
+* User should fully understand the result of using 
undocumented
+* xstate component
+*/
+   pr_warn("x86/fpu: Attempt to enable unknown xstate 
features 0x%llx\n",
+   unknown);
+   WARN_ON_FPU(1);
+   }
+
+   if ((custom & XFEATURE_MASK_XTILE) != XFEATURE_MASK_XTILE) {
+   pr_warn("x86/fpu: Disable 0x%x components due to 
incorrect setup\n",
+   XFEATURE_MASK_XTILE);
+   custom &= ~(XFEATURE_MASK_XTILE);
+   }
+
+   mask = custom;
+   }
return mask;
 }
 
@@ -254,6 +283,7 @@ static void __init fpu__init_parse_early_param(void)
 {
char arg[32];
char *argptr = arg;
+   u64 mask;
int bit;
 
 #ifdef CONFIG_X86_32
@@ -283,6 +313,22 @@ static void __init 

[RFC PATCH 01/22] x86/fpu/xstate: Modify area init helper prototypes to access all the possible areas

2020-10-01 Thread Chang S. Bae
The xstate infrastructure is not flexible to support dynamic areas in
task->fpu. Change the fpstate_init() prototype to access task->fpu
directly. It treats a null pointer as indicating init_fpstate, as this
initial data does not belong to any task. For the compacted format,
fpstate_init_xstate() now accepts the state component bitmap to configure
XCOMP_BV.

No functional change.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h |  6 +++---
 arch/x86/kernel/fpu/core.c  | 14 +++---
 arch/x86/kernel/fpu/init.c  |  2 +-
 arch/x86/kernel/fpu/regset.c|  2 +-
 arch/x86/kernel/fpu/xstate.c|  3 +--
 arch/x86/kvm/x86.c  |  2 +-
 6 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h 
b/arch/x86/include/asm/fpu/internal.h
index 0a460f2a3f90..c404fedf1a75 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -79,20 +79,20 @@ static __always_inline __pure bool use_fxsr(void)
 
 extern union fpregs_state init_fpstate;
 
-extern void fpstate_init(union fpregs_state *state);
+extern void fpstate_init(struct fpu *fpu);
 #ifdef CONFIG_MATH_EMULATION
 extern void fpstate_init_soft(struct swregs_state *soft);
 #else
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-static inline void fpstate_init_xstate(struct xregs_state *xsave)
+static inline void fpstate_init_xstate(struct xregs_state *xsave, u64 
xcomp_mask)
 {
/*
 * XRSTORS requires these bits set in xcomp_bv, or it will
 * trigger #GP:
 */
-   xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
+   xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xcomp_mask;
 }
 
 static inline void fpstate_init_fxstate(struct fxregs_state *fx)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index eb86a2b831b1..41d926c76615 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -191,8 +191,16 @@ static inline void fpstate_init_fstate(struct fregs_state 
*fp)
fp->fos = 0xu;
 }
 
-void fpstate_init(union fpregs_state *state)
+/* If a null pointer is given, operate on the initial FPU state, init_fpstate. */
+void fpstate_init(struct fpu *fpu)
 {
+   union fpregs_state *state;
+
+   if (fpu)
+   state = &fpu->state;
+   else
+   state = &init_fpstate;
+
if (!static_cpu_has(X86_FEATURE_FPU)) {
		fpstate_init_soft(&state->soft);
return;
@@ -201,7 +209,7 @@ void fpstate_init(union fpregs_state *state)
memset(state, 0, fpu_kernel_xstate_size);
 
if (static_cpu_has(X86_FEATURE_XSAVES))
-   fpstate_init_xstate(&state->xsave);
+   fpstate_init_xstate(&state->xsave, xfeatures_mask_all);
if (static_cpu_has(X86_FEATURE_FXSR))
		fpstate_init_fxstate(&state->fxsave);
else
@@ -261,7 +269,7 @@ static void fpu__initialize(struct fpu *fpu)
WARN_ON_FPU(fpu != >thread.fpu);
 
set_thread_flag(TIF_NEED_FPU_LOAD);
-   fpstate_init(&fpu->state);
+   fpstate_init(fpu);
trace_x86_fpu_init_state(fpu);
 }
 
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 61ddc3a5e5c2..4e89a2698cfb 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -125,7 +125,7 @@ static void __init fpu__init_system_generic(void)
 * Set up the legacy init FPU context. (xstate init might overwrite this
 * with a more modern format, if the CPU supports it.)
 */
-   fpstate_init(&init_fpstate);
+   fpstate_init(NULL);
 
fpu__init_system_mxcsr();
 }
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index c413756ba89f..4c4d9059ff36 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -144,7 +144,7 @@ int xstateregs_set(struct task_struct *target, const struct 
user_regset *regset,
 * In case of failure, mark all states as init:
 */
if (ret)
-   fpstate_init(&fpu->state);
+   fpstate_init(fpu);
 
return ret;
 }
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 038e19c0019e..ee4946c60ab1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -454,8 +454,7 @@ static void __init setup_init_fpu_buf(void)
print_xstate_features();
 
if (boot_cpu_has(X86_FEATURE_XSAVES))
-   init_fpstate.xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
-xfeatures_mask_all;
+   fpstate_init_xstate(&init_fpstate.xsave, xfeatures_mask_all);
 
/*
 * Init all the features state with header.xfeatures being 0x0
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ce856e0ece84..9da8cb4b8589 

[RFC PATCH 15/22] x86/fpu/xstate: Support ptracer-induced xstate area expansion

2020-10-01 Thread Chang S. Bae
ptrace() may request an update to task->fpu that has not yet been
allocated. Detect this case and allocate task->fpu to support the request.
Also, disable the (now unnecessary) associated first-use fault.

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/fpu/regset.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 8d863240b9c6..6b9d0c0a266d 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -125,6 +125,35 @@ int xstateregs_set(struct task_struct *target, const 
struct user_regset *regset,
 
xsave = __xsave(fpu);
 
+   /*
+* When a ptracer attempts to write any state in task->fpu that is not
+* yet allocated, dynamically expand the xstate area pointed to by
+* fpu->state_ptr.
+*/
+   if (count > get_xstate_size(fpu->state_mask)) {
+   unsigned int offset, size;
+   struct xstate_header hdr;
+   u64 mask;
+
+   offset = offsetof(struct xregs_state, header);
+   size = sizeof(hdr);
+
+   /* Retrieve XSTATE_BV */
+   if (kbuf) {
+   memcpy(&hdr, kbuf + offset, size);
+   } else {
+   ret = __copy_from_user(&hdr, ubuf + offset, size);
+   if (ret)
+   return ret;
+   }
+
+   mask = hdr.xfeatures & xfeatures_mask_user_dynamic;
+   if (mask) {
+   ret = alloc_xstate_area(fpu, mask, NULL);
+   if (ret)
+   return ret;
+   }
+   }
+
fpu__prepare_write(fpu);
 
if (using_compacted_format()) {
-- 
2.17.1



[RFC PATCH 17/22] x86/fpu/xstate: Extend the table for mapping xstate components with features

2020-10-01 Thread Chang S. Bae
At compile-time xfeatures_mask_all includes all possible XCR0 features. At
run-time fpu__init_system_xstate() clears features in xfeatures_mask_all
that are not enabled in CPUID. It does this by looping through all possible
XCR0 features.

Update the code to handle the possibility that there will be gaps in the
XCR0 feature bit numbers.

No functional change until hardware arrives with bit-number gaps in XCR0.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/fpu/xstate.c | 39 ++--
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 13e8eff7a23b..eaada4a38153 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -41,17 +41,22 @@ static const char *xfeature_names[] =
"unknown xstate feature",
 };
 
-static short xsave_cpuid_features[] __initdata = {
-   X86_FEATURE_FPU,
-   X86_FEATURE_XMM,
-   X86_FEATURE_AVX,
-   X86_FEATURE_MPX,
-   X86_FEATURE_MPX,
-   X86_FEATURE_AVX512F,
-   X86_FEATURE_AVX512F,
-   X86_FEATURE_AVX512F,
-   X86_FEATURE_INTEL_PT,
-   X86_FEATURE_PKU,
+struct xfeature_capflag_info {
+   int xfeature_idx;
+   short cpu_cap;
+};
+
+static struct xfeature_capflag_info xfeature_capflags[] __initdata = {
+   { XFEATURE_FP,  X86_FEATURE_FPU },
+   { XFEATURE_SSE, X86_FEATURE_XMM },
+   { XFEATURE_YMM, X86_FEATURE_AVX },
+   { XFEATURE_BNDREGS, X86_FEATURE_MPX },
+   { XFEATURE_BNDCSR,  X86_FEATURE_MPX },
+   { XFEATURE_OPMASK,  X86_FEATURE_AVX512F },
+   { XFEATURE_ZMM_Hi256,   X86_FEATURE_AVX512F },
+   { XFEATURE_Hi16_ZMM,X86_FEATURE_AVX512F },
+   { XFEATURE_PT_UNIMPLEMENTED_SO_FAR, X86_FEATURE_INTEL_PT },
+   { XFEATURE_PKRU,X86_FEATURE_PKU },
 };
 
 /*
@@ -950,11 +955,15 @@ void __init fpu__init_system_xstate(void)
}
 
/*
-* Clear XSAVE features that are disabled in the normal CPUID.
+* Cross-check XSAVE feature with CPU capability flag. Clear the
+* mask bit for disabled features.
 */
-   for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
-   if (!boot_cpu_has(xsave_cpuid_features[i]))
-   xfeatures_mask_all &= ~BIT_ULL(i);
+   for (i = 0; i < ARRAY_SIZE(xfeature_capflags); i++) {
+   short cpu_cap = xfeature_capflags[i].cpu_cap;
+   int idx = xfeature_capflags[i].xfeature_idx;
+
+   if (!boot_cpu_has(cpu_cap))
+   xfeatures_mask_all &= ~BIT_ULL(idx);
}
 
xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
-- 
2.17.1



[RFC PATCH 11/22] x86/fpu/xstate: Update xstate area address finder for supporting dynamic user xstate

2020-10-01 Thread Chang S. Bae
__raw_xsave_addr() returns the requested component's pointer in an XSAVE
buffer, by simply looking up the offset table. The offset used to be fixed,
but, with dynamic user states, it becomes variable.

get_xstate_size() has a routine to find an offset at run-time. Refactor to
use it for the address finder.

No functional change until the kernel enables dynamic user xstates.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/fpu/xstate.c | 82 +++-
 1 file changed, 52 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d73ab3259896..556ae8593806 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -130,15 +130,50 @@ static bool xfeature_is_supervisor(int xfeature_nr)
return ecx & 1;
 }
 
+/*
+ * Available once those arrays for the offset, size, and alignment info are 
set up,
+ * by setup_xstate_features().
+ */
+static unsigned int __get_xstate_comp_offset(u64 mask, int feature_nr)
+{
+   u64 xmask = BIT_ULL(feature_nr + 1) - 1;
+   unsigned int next_offset, offset = 0;
+   int i;
+
+   if ((mask & xmask) == (xfeatures_mask_all & xmask))
+   return xstate_comp_offsets[feature_nr];
+
+   /*
+* Calculate the offset by walking each enabled state in turn, since
+* no precomputed offset exists for the xstate format of the given mask.
+*/
+
+   next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+
+   for (i = FIRST_EXTENDED_XFEATURE; i <= feature_nr; i++) {
+   if (!(mask & BIT_ULL(i)))
+   continue;
+
+   offset = xstate_aligns[i] ? ALIGN(next_offset, 64) : 
next_offset;
+   next_offset += xstate_sizes[i];
+   }
+
+   return offset;
+}
+
+static unsigned int get_xstate_comp_offset(struct fpu *fpu, int feature_nr)
+{
+   return __get_xstate_comp_offset(fpu->state_mask, feature_nr);
+}
+
 /*
  * Available once those arrays for the offset, size, and alignment info are 
set up,
  * by setup_xstate_features().
  */
 unsigned int get_xstate_size(u64 mask)
 {
-   unsigned int size;
-   u64 xmask;
-   int i, nr;
+   unsigned int offset;
+   int nr;
 
if (!mask)
return 0;
@@ -152,24 +187,8 @@ unsigned int get_xstate_size(u64 mask)
if (!using_compacted_format())
return xstate_offsets[nr] + xstate_sizes[nr];
 
-   xmask = BIT_ULL(nr + 1) - 1;
-
-   if (mask == (xmask & xfeatures_mask_all))
-   return xstate_comp_offsets[nr] + xstate_sizes[nr];
-
-   /*
-* Calculate the size by summing up each state together, since no known
-* size found with the xstate area format out of the given mask.
-*/
-   for (size = FXSAVE_SIZE + XSAVE_HDR_SIZE, i = FIRST_EXTENDED_XFEATURE; 
i <= nr; i++) {
-   if (!(mask & BIT_ULL(i)))
-   continue;
-
-   if (xstate_aligns[i])
-   size = ALIGN(size, 64);
-   size += xstate_sizes[i];
-   }
-   return size;
+   offset = __get_xstate_comp_offset(mask, nr);
+   return offset + xstate_sizes[nr];
 }
 
 /*
@@ -986,17 +1005,20 @@ static void *__raw_xsave_addr(struct fpu *fpu, int 
xfeature_nr)
 {
void *xsave;
 
-   if (!xfeature_enabled(xfeature_nr)) {
-   WARN_ON_FPU(1);
-   return NULL;
-   }
-
-   if (fpu)
-   xsave = __xsave(fpu);
-   else
+   if (!xfeature_enabled(xfeature_nr))
+   goto not_found;
+   else if (!fpu)
+   xsave = &init_fpstate.xsave;
+   else if (!(fpu->state_mask & BIT_ULL(xfeature_nr)))
+   goto not_found;
+   else
+   xsave = __xsave(fpu);
+
+   return (xsave + get_xstate_comp_offset(fpu, xfeature_nr));
 
-   return xsave + xstate_comp_offsets[xfeature_nr];
+not_found:
+   WARN_ON_FPU(1);
+   return NULL;
 }
 
 /*
-- 
2.17.1



[RFC PATCH 10/22] x86/fpu/xstate: Update xstate save function for supporting dynamic user xstate

2020-10-01 Thread Chang S. Bae
copy_xregs_to_kernel() used to save all user states into a buffer that was
always large enough. Once dynamic user state is enabled, which states get
saved becomes conditional.

fpu->state_mask can indicate which state components are reserved to be
saved in XSAVE buffer. Use it as XSAVE's instruction mask to select states.

KVM saves xstate in guest_fpu and user_fpu. With the change, the KVM code
needs to ensure a valid fpu->state_mask before XSAVE.

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h |  3 +--
 arch/x86/kernel/fpu/core.c  |  2 +-
 arch/x86/kvm/x86.c  | 11 ---
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h 
b/arch/x86/include/asm/fpu/internal.h
index 2dfb3b6f58fc..3b03ead87a46 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -331,9 +331,8 @@ static inline void copy_kernel_to_xregs_booting(struct 
xregs_state *xstate)
 /*
  * Save processor xstate to xsave area.
  */
-static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
+static inline void copy_xregs_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-   u64 mask = xfeatures_mask_all;
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index dca4961fcc36..ece6428ba85b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -99,7 +99,7 @@ int copy_fpregs_to_fpstate(struct fpu *fpu)
if (likely(use_xsave())) {
		struct xregs_state *xsave = &fpu->state.xsave;
 
-   copy_xregs_to_kernel(xsave);
+   copy_xregs_to_kernel(xsave, fpu->state_mask);
 
/*
 * AVX512 state is tracked here because its use is
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ecec6418ccca..a8b5f507083c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8842,15 +8842,20 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 
 static void kvm_save_current_fpu(struct fpu *fpu)
 {
+   struct fpu *src_fpu = &current->thread.fpu;
+
/*
 * If the target FPU state is not resident in the CPU registers, just
 * memcpy() from current, else save CPU state directly to the target.
 */
-   if (test_thread_flag(TIF_NEED_FPU_LOAD))
-   memcpy(&fpu->state, &current->thread.fpu.state,
+   if (test_thread_flag(TIF_NEED_FPU_LOAD)) {
+   memcpy(&fpu->state, &src_fpu->state,
   fpu_kernel_xstate_default_size);
-   else
+   } else {
+   if (fpu->state_mask != src_fpu->state_mask)
+   fpu->state_mask = src_fpu->state_mask;
copy_fpregs_to_fpstate(fpu);
+   }
 }
 
 /* Swap (qemu) user FPU context for the guest FPU context. */
-- 
2.17.1



[RFC PATCH 07/22] x86/fpu/xstate: Introduce helpers to manage an xstate area dynamically

2020-10-01 Thread Chang S. Bae
task->fpu has a buffer to keep the extended register states, but it is not
expandable at runtime. Introduce runtime methods and new fpu struct fields
to support the expansion.

fpu->state_mask indicates the saved states per task and fpu->state_ptr
points the dynamically allocated area.

alloc_xstate_area() uses vmalloc() for its scalability. However, it sets a
threshold (64KB) to watch for a potential need for an alternative
mechanism.

Also, introduce a new helper -- get_xstate_size() to calculate the area
size.
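
A rough caller-side sketch of the new helpers (illustrative only):

	struct fpu *fpu = &current->thread.fpu;
	u64 mask = fpu->state_mask | XFEATURE_MASK_XTILE_DATA;
	unsigned int size;

	/* get_xstate_size() computes the buffer size for a given mask */
	pr_debug("xstate area: %u bytes\n", get_xstate_size(mask));

	/* vmalloc()-backed; warns past the 64KB threshold */
	if (alloc_xstate_area(fpu, mask, &size))
		return -ENOMEM;

	/* ... fpu->state_ptr now points to the expanded area ... */

	free_xstate_area(fpu);	/* e.g. at task exit */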

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/types.h  |  29 +--
 arch/x86/include/asm/fpu/xstate.h |   3 +
 arch/x86/kernel/fpu/core.c|   3 +
 arch/x86/kernel/fpu/xstate.c  | 124 ++
 4 files changed, 154 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index c87364ea6446..4b7756644824 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -327,14 +327,33 @@ struct fpu {
 */
unsigned long   avx512_timestamp;
 
+   /*
+* @state_mask:
+*
+* The state component bitmap. It indicates the saved xstate in
+* either @state or @state_ptr. The map initially describes @state,
+* and then @state_ptr once that area is in use.
+*/
+   u64 state_mask;
+
+   /*
+* @state_ptr:
+*
+* Copy of all extended register states, in a dynamically-allocated
+* area, that we save and restore over context switches. When a task is
+* using extended features, the register state is always the most
+* current. This state copy is more recent than @state. If the task
+* context-switches away, the states get saved here, representing the xstate.
+*/
+   union fpregs_state  *state_ptr;
+
/*
 * @state:
 *
-* In-memory copy of all FPU registers that we save/restore
-* over context switches. If the task is using the FPU then
-* the registers in the FPU are more recent than this state
-* copy. If the task context-switches away then they get
-* saved here and represent the FPU state.
+* Copy of some extended register state that we save and restore
+* over context switches. If a task uses a dynamically-allocated
+* area, @state_ptr, then it has a more recent state copy than this.
+* This copy has the same semantics as described for @state_ptr.
 */
union fpregs_state  state;
/*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 9aad91c0725b..37728bfcb71e 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,6 +103,9 @@ extern void __init update_regset_xstate_info(unsigned int size,
 u64 xstate_mask);
 
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+int alloc_xstate_area(struct fpu *fpu, u64 mask, unsigned int *alloc_size);
+void free_xstate_area(struct fpu *fpu);
+
 const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 875620fdfe61..e25f7866800e 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -235,6 +235,9 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 */
memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_default_size);
 
+   dst_fpu->state_mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
+   dst_fpu->state_ptr = NULL;
+
/*
 * If the FPU registers are not current just memcpy() the state.
 * Otherwise save current FPU registers directly into the child's FPU
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 6e0d8a9699ed..af60332aafef 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -69,6 +70,7 @@ static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static bool xstate_aligns[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = false};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
@@ -128,6 +130,48 @@ static bool xfeature_is_supervisor(int xfeature_nr)
return ecx & 1;
 }

[RFC PATCH 21/22] selftest/x86/amx: Include test cases for the AMX state management

2020-10-01 Thread Chang S. Bae
This selftest exercises the kernel's ability to inherit and context-switch
AMX state, by verifying that tasks retain unique data when creating a child
process and between multiple threads.

Also, ptrace() is used to insert AMX state into existing threads -- both
before and after the existing thread has initialized its AMX state.

For the signal handling path, verify in the signal handler that the signal
frame either includes or excludes AMX data -- depending on whether the
signaled thread has initialized its AMX state.

Collect the test cases of validating those operations together, as they
share some common setup for the AMX state.

These test cases do not depend on AMX compiler support, as they use
user-space XSAVE directly to access AMX state (see the sketch below).
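
For reference, a minimal sketch of that approach: build the XSAVE
requested-feature bitmap (RFBM) for only the tile components and split it
into the EAX/EDX pair the instruction expects. It mirrors the selftest's
helpers below and assumes AMX-capable hardware with XSAVE-enabled tiles
and a 64-byte-aligned save area.

#include <stdint.h>

#define XFEATURE_XTILE_CFG  17
#define XFEATURE_XTILE_DATA 18

/* Save only the AMX tile components into a 64-byte-aligned area. */
static inline void xsave_tiles_only(void *area)
{
	uint64_t rfbm = (1ULL << XFEATURE_XTILE_CFG) | (1ULL << XFEATURE_XTILE_DATA);
	uint32_t lo = (uint32_t)rfbm;	      /* RFBM[31:0] goes in EAX */
	uint32_t hi = (uint32_t)(rfbm >> 32); /* RFBM[63:32] goes in EDX */

	/* same XSAVE opcode bytes as the selftest's __xsave() helper */
	asm volatile(".byte 0x48,0x0f,0xae,0x27"
		     : : "D" (area), "a" (lo), "d" (hi)
		     : "memory");
}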

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-kselft...@vger.kernel.org
---
 tools/testing/selftests/x86/Makefile |   2 +-
 tools/testing/selftests/x86/amx.c| 736 +++
 2 files changed, 737 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/amx.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index e0c52e5ab49e..6f6e6cabca69 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -17,7 +17,7 @@ TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
test_FCMOV test_FCOMI test_FISTTP \
vdso_restorer
-TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering
+TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering amx
 # Some selftests require 32bit support enabled also on 64bit systems
 TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall
 
diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c
new file mode 100644
index ..bf766b22cf77
--- /dev/null
+++ b/tools/testing/selftests/x86/amx.c
@@ -0,0 +1,736 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#ifndef __x86_64__
+# error This test is 64-bit only
+#endif
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+#define PAGE_SIZE  (1 << 12)
+
+#define NUM_TILES  8
+#define TILE_SIZE  1024
+#define XSAVE_SIZE ((NUM_TILES * TILE_SIZE) + PAGE_SIZE)
+
+struct xsave_data {
+   u8 area[XSAVE_SIZE];
+} __attribute__((aligned(64)));
+
+/* Tile configuration associated: */
+#define MAX_TILES  16
+#define RESERVED_BYTES 14
+
+struct tile_config {
+   u8  palette_id;
+   u8  start_row;
+   u8  reserved[RESERVED_BYTES];
+   u16 colsb[MAX_TILES];
+   u8  rows[MAX_TILES];
+};
+
+struct tile_data {
+   u8 data[NUM_TILES * TILE_SIZE];
+};
+
+static inline u64 __xgetbv(u32 index)
+{
+   u32 eax, edx;
+
+   asm volatile(".byte 0x0f,0x01,0xd0"
+: "=a" (eax), "=d" (edx)
+: "c" (index));
+   return eax + ((u64)edx << 32);
+}
+
+static inline void __cpuid(u32 *eax, u32 *ebx, u32 *ecx, u32 *edx)
+{
+   asm volatile("cpuid;"
+: "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
+: "0" (*eax), "2" (*ecx));
+}
+
+/* Load tile configuration */
+static inline void __ldtilecfg(void *cfg)
+{
+   asm volatile(".byte 0xc4,0xe2,0x78,0x49,0x00"
+: : "a"(cfg));
+}
+
+/* Load tile data to %tmm0 register only */
+static inline void __tileloadd(void *tile)
+{
+   asm volatile(".byte 0xc4,0xe2,0x7b,0x4b,0x04,0x10"
+: : "a"(tile), "d"(0));
+}
+
+/* Save extended states */
+static inline void __xsave(void *area, u32 lo, u32 hi)
+{
+   asm volatile(".byte 0x48,0x0f,0xae,0x27"
+: : "D" (area), "a" (lo), "d" (hi)
+: "memory");
+}
+
+/* Restore extended states */
+static inline void __xrstor(void *area, u32 lo, u32 hi)
+{
+   asm volatile(".byte 0x48,0x0f,0xae,0x2f"
+: : "D" (area), "a" (lo), "d" (hi));
+}
+
+/* Release tile states to init values */
+static inline void __tilerelease(void)
+{
+   asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0" ::);
+}
+
+/* Hardware info check: */
+
+static inline bool check_xsave_supports_xtile(void)
+{
+   u32 eax, ebx, ecx, edx;
+   bool available = false;
+
+#define XSAVE_CPUID0x1
+#define XSAVE_ECX_BIT  26
+#define XFEATURE_XTILE_CFG 17
+#define XFEATURE_XTILE_DATA18
+#define 

[RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use

2020-10-01 Thread Chang S. Bae
Intel's Extended Feature Disable (XFD) feature is an extension of the XSAVE
architecture. XFD allows the kernel to enable a feature state in XCR0 and
to receive a #NM trap when a task uses instructions accessing that state.
In this way, Linux can allocate the large task->fpu buffer only for tasks
that use it.

XFD introduces two MSRs: IA32_XFD to enable/disable the feature and
IA32_XFD_ERR to assist the #NM trap handler. Both use the same
state-component bitmap format, used by XCR0.

Use this hardware capability to find the right time to expand the xstate area.
Introduce two sets of helper functions for that:

1. The first set is primarily for interacting with the XFD hardware
   feature. Helpers for configuring disablement, e.g. in context switching,
   are:
xdisable_setbits()
xdisable_getbits()
xdisable_switch()

2. The second set is for managing the first-use status and handling #NM
   trap:
xfirstuse_enabled()
xfirstuse_not_detected()
xfirstuse_event_handler()

The #NM handler triggers the xstate area expansion so that the first-used
states can be saved.

No functional change until the kernel enables dynamic user states and XFD.
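
To make the enabled/detected conversion concrete, here is a small
stand-alone model of the IA32_XFD bookkeeping. It is plain user-space C
for illustration only: the masks are hypothetical and the MSR write is
replaced by a printf.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t first_use_enabled = 1ULL << 18;	/* e.g. XTILEDATA */
	uint64_t state_mask        = 0;			/* nothing used yet */

	/* bits already used among the first-use-enabled ones */
	uint64_t detected = state_mask & first_use_enabled;

	/* XFD disables exactly the enabled-but-not-yet-used bits */
	uint64_t xfd = first_use_enabled ^ detected;

	printf("IA32_XFD = %#llx\n", (unsigned long long)xfd);	/* trap on first use */

	/* after the #NM handler expands the buffer for component 18: */
	state_mask |= 1ULL << 18;
	detected = state_mask & first_use_enabled;
	xfd = first_use_enabled ^ detected;
	printf("IA32_XFD = %#llx\n", (unsigned long long)xfd);	/* 0: no more trapping */
	return 0;
}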

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/cpufeatures.h  |  1 +
 arch/x86/include/asm/fpu/internal.h | 53 -
 arch/x86/include/asm/msr-index.h|  2 ++
 arch/x86/kernel/fpu/core.c  | 37 
 arch/x86/kernel/fpu/xstate.c| 34 --
 arch/x86/kernel/process.c   |  5 +++
 arch/x86/kernel/process_32.c|  2 +-
 arch/x86/kernel/process_64.c|  2 +-
 arch/x86/kernel/traps.c |  3 ++
 9 files changed, 133 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 2901d5df4366..7d7fe1d82966 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -274,6 +274,7 @@
 #define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC instruction */
 #define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 instruction */
 #define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS instructions */
+#define X86_FEATURE_XFD (10*32+ 4) /* eXtended Feature Disabling */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3b03ead87a46..f5dbbaa060fb 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -572,11 +572,60 @@ static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  * Misc helper functions:
  */
 
+/* The first-use detection helpers: */
+
+static inline void xdisable_setbits(u64 value)
+{
+   wrmsrl_safe(MSR_IA32_XFD, value);
+}
+
+static inline u64 xdisable_getbits(void)
+{
+   u64 value;
+
+   rdmsrl_safe(MSR_IA32_XFD, &value);
+   return value;
+}
+
+static inline u64 xfirstuse_enabled(void)
+{
+   /* All the dynamic user components are first-use enabled. */
+   return xfeatures_mask_user_dynamic;
+}
+
+/*
+ * Convert fpu->state_mask to the xdisable configuration in MSR IA32_XFD.
+ * Only xdisable_setbits() uses this.
+ */
+static inline u64 xfirstuse_not_detected(struct fpu *fpu)
+{
+   u64 firstuse_bv = (fpu->state_mask & xfirstuse_enabled());
+
+   /*
+* If first-use is not detected, set the bit. If the detection is
+* not enabled, the bit is always zero in firstuse_bv. So, make the
+* following conversion:
+*/
+   return  (xfirstuse_enabled() ^ firstuse_bv);
+}
+
+/* Update MSR IA32_XFD based on fpu->state_mask */
+static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
+{
+   if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled())
+   return;
+
+   if (unlikely(prev->state_mask != next->state_mask))
+   xdisable_setbits(xfirstuse_not_detected(next));
+}
+
+bool xfirstuse_event_handler(struct fpu *fpu);
+
 /*
  * Load PKRU from the FPU context if available. Delay loading of the
  * complete FPU state until the return to userland.
  */
-static inline void switch_fpu_finish(struct fpu *new_fpu)
+static inline void switch_fpu_finish(struct fpu *old_fpu, struct fpu *new_fpu)
 {
u32 pkru_val = init_pkru_value;
struct pkru_state *pk;
@@ -586,6 +635,8 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
 
set_thread_flag(TIF_NEED_FPU_LOAD);
 
+   xdisable_switch(old_fpu, new_fpu);
+
if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
return;
 
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2859ee4f39a8..0ccbe8cc99ad 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -610,6 +610,8 @@
 #define 

Re: [PATCH v1 00/12] Intel FPGA Security Manager Class Driver

2020-10-01 Thread Russ Weight



On 9/5/20 7:13 AM, Wu, Hao wrote:
>> Subject: [PATCH v1 00/12] Intel FPGA Security Manager Class Driver
>>
>>
>> These patches depend on the patchset: "add regmap-spi-avmm & Intel
>> Max10 BMC chip support" which is currently under review.
>>
>>--
>>
>> This patchset introduces the Intel Security Manager class driver
>> for managing secure updates on Intel FPGA Cards. It also provides
>> the n3000bmc-secure mfd sub-driver for the MAX10 BMC for the n3000
>> Programmable Acceleration Cards (PAC). The n3000bmc-secure driver
>> is implemented using the Intel Security Manager class driver.
> So this patchset contains two parts
> (1) adding a new class driver for Intel FPGA secure update.
> (2) a new driver which uses (1) to implement secure update for n3000 PAC.
Yes - that is correct
>
> And only part (2) depends on "Intel MAX10 BMC chip support" patchset.
> (Maybe you can provide a link to that thread).
>
> Is my understanding correct? If yes, is it possible to reorder these patches?
> At least there is no dependency on the class driver patches, right?
Yes - I'm splitting the patch set, and I'll provide links for the dependencies
on the MAX10 BMC Secure Engine patch set.
>
>> The Intel Security Manager class driver provides a common API for
>> user-space tools to manage updates for Secure FPGA devices. Device
>> drivers that instantiate the Intel Security Manager class driver will
>> interact with the HW secure update engine in order to transfer
>> new FPGA and BMC images to FLASH so that they will be automatically
>> loaded when the FPGA card reboots.
>>
>> The API consists of sysfs nodes and supports the following functions:
>>
>> (1) Instantiate and monitor a secure update
>> (2) Display security information including: Root Entry Hashes (REH),
>> Cancelled Code Signing Keys (CSK), and flash update counts for
>> both BMC and FPGA images.
>>
>> Secure updates make use of the request_firmware framework, which
>> requires that image files are accessible under /lib/firmware. A request
>> for a secure update returns immediately, while the update itself
>> proceeds in the context of a kernel worker thread. Sysfs files provide
>> a means for monitoring the progress of a secure update and for
>> retrieving error information in the event of a failure.
> Maybe you can explain a little more on why we need to have this done
> via a class driver not just some internal code in max10 driver? This class
> driver will be reused in different cases? And why adding a new class
> driver not just reuse or extend fpga manager (existing fpga mgr is used
> to update fpga too).
Yes - I'll do that in the next patch set.
>
>> The n3000bmc-secure driver instantiates the Intel Security Manager
>> class driver and provides the callback functions required to support
>> secure updates on Intel n3000 PAC devices.
>>
>> Russ Weight (12):
>>   fpga: fpga security manager class driver
> Intel FPGA Security Manager?
Yes - I'll make that change
>
>>   fpga: create intel max10 bmc security engine
>>   fpga: expose max10 flash update counts in sysfs
>>   fpga: expose max10 canceled keys in sysfs
>>   fpga: enable secure updates
>>   fpga: add max10 secure update functions
>>   fpga: expose sec-mgr update status
>>   fpga: expose sec-mgr update errors
>>   fpga: expose sec-mgr update size
>>   fpga: enable sec-mgr update cancel
>>   fpga: expose hardware error info in sysfs
> For these patches, is it possible to have a better title for these patches.
> Then it will be easier to know which component this patch is going to modify.
> e.g. fpga: ifpga-sec-mgr: xx
Yes. Thanks for the comments.

- Russ
>
> Thanks
> Hao
>
>>   fpga: add max10 get_hw_errinfo callback func
>>
>>  .../ABI/testing/sysfs-class-ifpga-sec-mgr | 151 
>>  MAINTAINERS   |   8 +
>>  drivers/fpga/Kconfig  |  20 +
>>  drivers/fpga/Makefile |   6 +
>>  drivers/fpga/ifpga-sec-mgr.c  | 669 ++
>>  drivers/fpga/intel-m10-bmc-secure.c   | 557 +++
>>  include/linux/fpga/ifpga-sec-mgr.h| 201 ++
>>  include/linux/mfd/intel-m10-bmc.h | 116 +++
>>  8 files changed, 1728 insertions(+)
>>  create mode 100644 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
>>  create mode 100644 drivers/fpga/ifpga-sec-mgr.c
>>  create mode 100644 drivers/fpga/intel-m10-bmc-secure.c
>>  create mode 100644 include/linux/fpga/ifpga-sec-mgr.h
>>
>> --
>> 2.17.1



[RFC PATCH 09/22] x86/fpu/xstate: Introduce wrapper functions for organizing xstate area access

2020-10-01 Thread Chang S. Bae
task->fpu now has two possible xstate areas, fpu->state or fpu->state_ptr.
Instead of open-coding access to one of the two areas, rearrange the code
to use a new wrapper.

Some open-coded access (e.g., in KVM) is left unchanged, as it is not
going to use fpu->state_ptr at the moment.

No functional change until the kernel supports dynamic user states.
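
The pattern is easy to model outside the kernel. A minimal sketch, with
hypothetical types, of the accessor the patch introduces as __xstate():
prefer the dynamic area when present, otherwise fall back to the fixed
in-struct buffer.

#include <stddef.h>
#include <stdio.h>

union fpregs_state_sketch { unsigned char bytes[512]; };

struct fpu_sketch {
	union fpregs_state_sketch *state_ptr;	/* dynamic area, may be NULL */
	union fpregs_state_sketch  state;	/* fixed in-struct area */
};

/* mirrors __xstate(): callers never open-code the choice */
static union fpregs_state_sketch *xstate_of(struct fpu_sketch *fpu)
{
	return fpu->state_ptr ? fpu->state_ptr : &fpu->state;
}

int main(void)
{
	struct fpu_sketch f = { .state_ptr = NULL };

	printf("using %s area\n",
	       xstate_of(&f) == &f.state ? "fixed" : "dynamic");
	return 0;
}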

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h | 10 ++
 arch/x86/include/asm/fpu/xstate.h   | 10 ++
 arch/x86/include/asm/trace/fpu.h|  6 --
 arch/x86/kernel/fpu/core.c  | 27 ---
 arch/x86/kernel/fpu/regset.c| 28 +---
 arch/x86/kernel/fpu/signal.c| 23 +--
 arch/x86/kernel/fpu/xstate.c| 20 +++-
 7 files changed, 77 insertions(+), 47 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index d64c1083bd93..2dfb3b6f58fc 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -209,10 +209,12 @@ static inline int copy_user_to_fregs(struct fregs_state __user *fx)
 
 static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 {
+   union fpregs_state *xstate = __xstate(fpu);
+
if (IS_ENABLED(CONFIG_X86_32))
-   asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
+   asm volatile("fxsave %[fx]" : [fx] "=m" (xstate->fxsave));
else
-   asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
+   asm volatile("fxsaveq %[fx]" : [fx] "=m" (xstate->fxsave));
 }
 
 /* These macros all use (%edi)/(%rdi) as the single memory argument. */
@@ -410,7 +412,7 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
  */
 static inline int copy_kernel_to_xregs_err(struct fpu *fpu, u64 mask)
 {
-   struct xregs_state *xstate = &fpu->state.xsave;
+   struct xregs_state *xstate = __xsave(fpu);
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
@@ -439,7 +441,7 @@ static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask
 
 static inline void copy_kernel_to_fpregs(struct fpu *fpu)
 {
-   union fpregs_state *fpstate = &fpu->state;
+   union fpregs_state *fpstate = __xstate(fpu);
 
/*
 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 9de8b4c49855..b2125ec90cdb 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -102,6 +102,16 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 u64 xstate_mask);
 
+static inline union fpregs_state *__xstate(struct fpu *fpu)
+{
+   return (fpu->state_ptr) ? fpu->state_ptr : &fpu->state;
+}
+
+static inline struct xregs_state *__xsave(struct fpu *fpu)
+{
+   return &__xstate(fpu)->xsave;
+}
+
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
 unsigned int get_xstate_size(u64 mask);
 int alloc_xstate_area(struct fpu *fpu, u64 mask, unsigned int *alloc_size);
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 879b77792f94..9dcce5833bc6 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -22,8 +22,10 @@ DECLARE_EVENT_CLASS(x86_fpu,
__entry->fpu= fpu;
__entry->load_fpu   = test_thread_flag(TIF_NEED_FPU_LOAD);
if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
-   __entry->xfeatures = fpu->state.xsave.header.xfeatures;
-   __entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
+   struct xregs_state *xsave = __xsave(fpu);
+
+   __entry->xfeatures = xsave->header.xfeatures;
+   __entry->xcomp_bv  = xsave->header.xcomp_bv;
}
),
TP_printk("x86/fpu: %p load: %d xfeatures: %llx xcomp_bv: %llx",
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 33956ae3de2b..dca4961fcc36 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -94,14 +94,18 @@ EXPORT_SYMBOL(irq_fpu_usable);
  */
 int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
+   union fpregs_state *xstate = __xstate(fpu);
+
if (likely(use_xsave())) {
-   copy_xregs_to_kernel(&fpu->state.xsave);
+   struct xregs_state *xsave = &xstate->xsave;
+
+   copy_xregs_to_kernel(xsave);
 
/*
 * AVX512 state is tracked here because its use is
 * known to slow the max clock speed of the core.
 */
-   if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+  

[RFC PATCH 16/22] x86/fpu/xstate: Support dynamic user state in the signal handling path

2020-10-01 Thread Chang S. Bae
When entering a signal handler, the kernel saves the XSAVE area. The
dynamic user state is better saved only when in use; fpu->state_mask can
help to exclude unused states.

On return from the signal handler, XRSTOR re-initializes the excluded
state components.

No functional change until the kernel actually supports the dynamic user
states.
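
A minimal sketch of the mask handling on the save side (illustrative only;
state_mask stands in for current->thread.fpu.state_mask, and the split
matches how copy_xregs_to_user() feeds the mask to XSAVE):

#include <stdint.h>

/*
 * Components set in RFBM but cleared in the saved XSTATE_BV are
 * re-initialized by XRSTOR on sigreturn, so excluding unused states
 * here is safe.
 */
static void sigframe_save_mask(uint64_t state_mask,
			       uint32_t *lmask, uint32_t *hmask)
{
	*lmask = (uint32_t)state_mask;		/* RFBM[31:0]  -> EAX */
	*hmask = (uint32_t)(state_mask >> 32);	/* RFBM[63:32] -> EDX */
}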

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f5dbbaa060fb..fd044b31ce40 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -368,7 +368,7 @@ static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
  */
 static inline int copy_xregs_to_user(struct xregs_state __user *buf)
 {
-   u64 mask = xfeatures_mask_user();
+   u64 mask = current->thread.fpu.state_mask;
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
-- 
2.17.1



[RFC PATCH 19/22] x86/fpu/amx: Define AMX state components and have it used for boot-time checks

2020-10-01 Thread Chang S. Bae
Linux uses check_xstate_against_struct() to sanity-check the size of
XSTATE-enabled features. AMX is an XSAVE-enabled feature, and its size is
not hard-coded but discoverable at run-time via CPUID.

The AMX state is composed of state components 17 and 18, both of which are
user state components. The first component is the XTILECFG state of a
64-byte tile-related control register. State component 18, called
XTILEDATA, contains the actual tile data, and its size varies between
implementations. The architectural maximum, as defined in CPUID(0x1d,
1): EAX[15:0], is a byte less than 64KB. The first implementation supports
8KB.

Check the XTILEDATA state size dynamically. The feature introduces the new
tile register, TMM. Define one register struct only and read the number of
registers from CPUID. Cross-check the overall size with CPUID again.
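
For reference, the same enumeration can be done from user space. A small
sketch using GCC's <cpuid.h>: since XTILEDATA is state component 18,
CPUID(0xD, 18) reports its size and offset. The values will simply be zero
on hardware without AMX.

#include <cpuid.h>
#include <stdio.h>

#define XFEATURE_XTILE_DATA 18

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(0xD, XFEATURE_XTILE_DATA, &eax, &ebx, &ecx, &edx)) {
		puts("CPUID leaf 0xD not supported");
		return 1;
	}
	/* eax: XTILEDATA size in bytes; ebx: offset in the standard format */
	printf("XTILEDATA: %u bytes at offset %u\n", eax, ebx);
	return 0;
}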

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/types.h  | 27 +
 arch/x86/include/asm/fpu/xstate.h |  2 +
 arch/x86/kernel/fpu/xstate.c  | 65 +++
 3 files changed, 94 insertions(+)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 4b7756644824..002248dba6dc 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -120,6 +120,9 @@ enum xfeature {
XFEATURE_RSRVD_COMP_13,
XFEATURE_RSRVD_COMP_14,
XFEATURE_LBR,
+   XFEATURE_RSRVD_COMP_16,
+   XFEATURE_XTILE_CFG,
+   XFEATURE_XTILE_DATA,
 
XFEATURE_MAX,
 };
@@ -135,11 +138,15 @@ enum xfeature {
 #define XFEATURE_MASK_PT   (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
 #define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_LBR  (1 << XFEATURE_LBR)
+#define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG)
+#define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA)
 
 #define XFEATURE_MASK_FPSSE(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
 #define XFEATURE_MASK_AVX512   (XFEATURE_MASK_OPMASK \
 | XFEATURE_MASK_ZMM_Hi256 \
 | XFEATURE_MASK_Hi16_ZMM)
+#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILE_DATA \
+ | XFEATURE_MASK_XTILE_CFG)
 
 #define FIRST_EXTENDED_XFEATUREXFEATURE_YMM
 
@@ -152,6 +159,9 @@ struct reg_256_bit {
 struct reg_512_bit {
u8  regbytes[512/8];
 };
+struct reg_1024_byte {
+   u8  regbytes[1024];
+};
 
 /*
  * State component 2:
@@ -254,6 +264,23 @@ struct arch_lbr_state {
u64 ler_to;
u64 ler_info;
struct lbr_entryentries[];
+};
+
+/*
+ * State component 17: 64-byte tile configuration register.
+ */
+struct xtile_cfg {
+   u64 tcfg[8];
+} __packed;
+
+/*
+ * State component 18: 1KB tile data register.
+ * Each register represents 16 64-byte rows of the matrix
+ * data. But the number of registers depends on the actual
+ * implementation.
+ */
+struct xtile_data {
+   struct reg_1024_bytetmm;
 } __packed;
 
 struct xstate_header {
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index b2125ec90cdb..aadbcf893cc0 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -13,6 +13,8 @@
 
 #define XSTATE_CPUID   0x000d
 
+#define TILE_CPUID 0x001d
+
 #define FXSAVE_SIZE512
 
 #define XSAVE_HDR_SIZE 64
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index eaada4a38153..9d617d6506be 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -39,6 +39,15 @@ static const char *xfeature_names[] =
"Processor Trace (unused)"  ,
"Protection Keys User registers",
"unknown xstate feature",
+   "unknown xstate feature",
+   "unknown xstate feature",
+   "unknown xstate feature",
+   "unknown xstate feature",
+   "unknown xstate feature",
+   "unknown xstate feature",
+   "AMX Tile config"   ,
+   "AMX Tile data" ,
+   "unknown xstate feature",
 };
 
 struct xfeature_capflag_info {
@@ -57,6 +66,8 @@ static struct xfeature_capflag_info xfeature_capflags[] __initdata = {
{ XFEATURE_Hi16_ZMM,X86_FEATURE_AVX512F },
{ XFEATURE_PT_UNIMPLEMENTED_SO_FAR, X86_FEATURE_INTEL_PT },
{ XFEATURE_PKRU,X86_FEATURE_PKU },
+   { XFEATURE_XTILE_CFG,   X86_FEATURE_AMX_TILE },
+   { XFEATURE_XTILE_DATA,  X86_FEATURE_AMX_TILE }
 };
 
 /*
@@ -417,6 +428,8 @@ static void __init print_xstate_features(void)
print_xstate_feature(XFEATURE_MASK_ZMM_Hi256);

[RFC PATCH 20/22] x86/fpu/amx: Enable the AMX feature in 64-bit mode

2020-10-01 Thread Chang S. Bae
In 64-bit mode, include the AMX state components in
XFEATURE_MASK_USER_SUPPORTED.

The XFD feature will be used to dynamically allocate per-task XSAVE
buffer on first use.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/xstate.h | 3 ++-
 arch/x86/kernel/fpu/init.c| 8 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index aadbcf893cc0..872325768b13 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -34,7 +34,8 @@
  XFEATURE_MASK_Hi16_ZMM | \
  XFEATURE_MASK_PKRU | \
  XFEATURE_MASK_BNDREGS | \
- XFEATURE_MASK_BNDCSR)
+ XFEATURE_MASK_BNDCSR | \
+ XFEATURE_MASK_XTILE)
 
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index ee6499075a89..8e2a77bc1782 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -229,8 +229,12 @@ static void __init fpu__init_system_xstate_size_legacy(void)
  */
 u64 __init fpu__get_supported_xfeatures_mask(void)
 {
-   return XFEATURE_MASK_USER_SUPPORTED |
-  XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+   u64 mask = XFEATURE_MASK_USER_SUPPORTED | XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+
+   if (!IS_ENABLED(CONFIG_X86_64))
+   mask &= ~(XFEATURE_MASK_XTILE);
+
+   return mask;
 }
 
 /* Legacy code to initialize eager fpu mode. */
-- 
2.17.1



[RFC PATCH 02/22] x86/fpu/xstate: Modify xstate copy helper prototypes to access all the possible areas

2020-10-01 Thread Chang S. Bae
The xstate infrastructure is not flexible enough to support dynamic areas
in task->fpu. Make the xstate copy functions access task->fpu directly.

No functional change.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/xstate.h |  8 
 arch/x86/kernel/fpu/regset.c  |  6 +++---
 arch/x86/kernel/fpu/signal.c  | 17 -
 arch/x86/kernel/fpu/xstate.c  | 19 +++
 4 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 14ab815132d4..a315b055212f 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -105,10 +105,10 @@ const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_supervisor_to_kernel(struct xregs_state *xsave);
+void copy_xstate_to_kernel(struct membuf to, struct fpu *fpu);
+int copy_kernel_to_xstate(struct fpu *fpu, const void *kbuf);
+int copy_user_to_xstate(struct fpu *fpu, const void __user *ubuf);
+void copy_supervisor_to_kernel(struct fpu *fpu);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 4c4d9059ff36..5e13e58d11d4 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -85,7 +85,7 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
fpu__prepare_read(fpu);
 
if (using_compacted_format()) {
-   copy_xstate_to_kernel(to, xsave);
+   copy_xstate_to_kernel(to, fpu);
return 0;
} else {
fpstate_sanitize_xstate(fpu);
@@ -126,9 +126,9 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 
if (using_compacted_format()) {
if (kbuf)
-   ret = copy_kernel_to_xstate(xsave, kbuf);
+   ret = copy_kernel_to_xstate(fpu, kbuf);
else
-   ret = copy_user_to_xstate(xsave, ubuf);
+   ret = copy_user_to_xstate(fpu, ubuf);
} else {
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
if (!ret)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 9f009525f551..adbf63114bc2 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -212,11 +212,11 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 }
 
 static inline void
-sanitize_restored_user_xstate(union fpregs_state *state,
+sanitize_restored_user_xstate(struct fpu *fpu,
  struct user_i387_ia32_struct *ia32_env,
  u64 user_xfeatures, int fx_only)
 {
-   struct xregs_state *xsave = &state->xsave;
+   struct xregs_state *xsave = &fpu->state.xsave;
struct xstate_header *header = &xsave->header;
 
if (use_xsave()) {
@@ -253,7 +253,7 @@ sanitize_restored_user_xstate(union fpregs_state *state,
xsave->i387.mxcsr &= mxcsr_feature_mask;
 
if (ia32_env)
-   convert_to_fxsr(&state->fxsave, ia32_env);
+   convert_to_fxsr(&fpu->state.fxsave, ia32_env);
}
 }
 
@@ -396,7 +396,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 * current supervisor states first and invalidate the FPU regs.
 */
if (xfeatures_mask_supervisor())
-   copy_supervisor_to_kernel(&fpu->state.xsave);
+   copy_supervisor_to_kernel(fpu);
set_thread_flag(TIF_NEED_FPU_LOAD);
}
__fpu_invalidate_fpregs_state(fpu);
@@ -406,18 +406,18 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
if (using_compacted_format()) {
-   ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+   ret = copy_user_to_xstate(fpu, buf_fx);
} else {
ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
 
if (!ret && state_size > offsetof(struct xregs_state, header))
ret = validate_user_xstate_header(&fpu->state.xsave.header);
+
}
if (ret)
goto err_out;
 
-   sanitize_restored_user_xstate(&fpu->state, envp, 

[RFC PATCH 04/22] x86/fpu/xstate: Modify save and restore helper prototypes to access all the possible areas

2020-10-01 Thread Chang S. Bae
The xstate infrastructure is not flexible enough to support dynamic areas
in task->fpu. Make the xstate save and restore helpers access task->fpu
directly.

No functional change.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h | 9 ++---
 arch/x86/kernel/fpu/core.c  | 4 ++--
 arch/x86/kernel/fpu/signal.c| 3 +--
 arch/x86/kvm/x86.c  | 2 +-
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index baca80e877a6..6eec5209750f 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -396,8 +396,9 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
  * Restore xstate from kernel space xsave area, return an error code instead of
  * an exception.
  */
-static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+static inline int copy_kernel_to_xregs_err(struct fpu *fpu, u64 mask)
 {
+   struct xregs_state *xstate = &fpu->state.xsave;
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
@@ -424,8 +425,10 @@ static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask
}
 }
 
-static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
+static inline void copy_kernel_to_fpregs(struct fpu *fpu)
 {
+   union fpregs_state *fpstate = &fpu->state;
+
/*
 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
 * pending. Clear the x87 state here by setting it to fixed values.
@@ -510,7 +513,7 @@ static inline void __fpregs_load_activate(void)
return;
 
if (!fpregs_state_valid(fpu, cpu)) {
-   copy_kernel_to_fpregs(&fpu->state);
+   copy_kernel_to_fpregs(fpu);
fpregs_activate(fpu);
fpu->last_cpu = cpu;
}
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 41d926c76615..39ddb22c143b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -172,7 +172,7 @@ void fpu__save(struct fpu *fpu)
 
if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
if (!copy_fpregs_to_fpstate(fpu)) {
-   copy_kernel_to_fpregs(&fpu->state);
+   copy_kernel_to_fpregs(fpu);
}
}
 
@@ -248,7 +248,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
else if (!copy_fpregs_to_fpstate(dst_fpu))
-   copy_kernel_to_fpregs(&dst_fpu->state);
+   copy_kernel_to_fpregs(dst_fpu);
 
fpregs_unlock();
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index adbf63114bc2..6f3bcc7dab80 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -427,8 +427,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 * Restore previously saved supervisor xstates along with
 * copied-in user xstates.
 */
-   ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
-  user_xfeatures | xfeatures_mask_supervisor());
+   ret = copy_kernel_to_xregs_err(fpu, user_xfeatures | xfeatures_mask_supervisor());
 
} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c4b8d3705625..192d52ff5b8c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8877,7 +8877,7 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 
kvm_save_current_fpu(vcpu->arch.guest_fpu);
 
-   copy_kernel_to_fpregs(&vcpu->arch.user_fpu->state);
+   copy_kernel_to_fpregs(vcpu->arch.user_fpu);
 
fpregs_mark_activate();
fpregs_unlock();
-- 
2.17.1



[RFC PATCH 06/22] x86/fpu/xstate: Outline dynamic xstate area size in the task context

2020-10-01 Thread Chang S. Bae
The xstate area size in task->fpu is currently fixed at runtime. To
accommodate dynamic user states, introduce variables for representing the
maximum and default (as minimum) area sizes.

do_extra_xstate_size_checks() is ready to calculate both sizes, which can
be compared with CPUID. CPUID can immediately provide the maximum size. The
code needs to rewrite XCR0 to get the default size, which excludes the
dynamic parts. That is not always straightforward, especially when
inter-dependencies exist between state component bits. To keep it simple,
the code double-checks the maximum size only.

No functional change as long as the kernel does not support the dynamic
area.
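
A toy model of the two sizes, assuming a hypothetical component layout
(real offsets and sizes come from CPUID): the default size simply excludes
the dynamic components from the mask before finding the end of the last
enabled component.

#include <stdint.h>
#include <stdio.h>

struct comp { uint32_t offset, size; };

/* hypothetical layout; component 18 (XTILEDATA) is the dynamic one */
static const struct comp comps[19] = {
	[0]  = { 0,    160  },	/* x87       */
	[2]  = { 576,  256  },	/* AVX       */
	[18] = { 2816, 8192 },	/* tile data */
};

static uint32_t xstate_size(uint64_t mask)
{
	uint32_t end = 0;

	for (int i = 0; i < 19; i++) {
		if (!(mask & (1ULL << i)) || !comps[i].size)
			continue;
		if (comps[i].offset + comps[i].size > end)
			end = comps[i].offset + comps[i].size;
	}
	return end;
}

int main(void)
{
	uint64_t all = (1ULL << 0) | (1ULL << 2) | (1ULL << 18);
	uint64_t dynamic = 1ULL << 18;

	printf("max size:     %u\n", xstate_size(all));
	printf("default size: %u\n", xstate_size(all & ~dynamic));
	return 0;
}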

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: k...@vger.kernel.org
---
 arch/x86/include/asm/processor.h | 10 ++-
 arch/x86/kernel/fpu/core.c   |  6 ++--
 arch/x86/kernel/fpu/init.c   | 33 --
 arch/x86/kernel/fpu/signal.c |  2 +-
 arch/x86/kernel/fpu/xstate.c | 48 +---
 arch/x86/kernel/process.c|  6 
 arch/x86/kvm/x86.c   |  2 +-
 7 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 97143d87994c..f5f83aa1b90f 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -477,7 +477,8 @@ DECLARE_PER_CPU_ALIGNED(struct stack_canary, stack_canary);
 DECLARE_PER_CPU(struct irq_stack *, softirq_stack_ptr);
 #endif /* X86_64 */
 
-extern unsigned int fpu_kernel_xstate_size;
+extern unsigned int fpu_kernel_xstate_default_size;
+extern unsigned int fpu_kernel_xstate_max_size;
 extern unsigned int fpu_user_xstate_size;
 
 struct perf_event;
@@ -551,12 +552,7 @@ struct thread_struct {
 };
 
 /* Whitelist the FPU state from the task_struct for hardened usercopy. */
-static inline void arch_thread_struct_whitelist(unsigned long *offset,
-   unsigned long *size)
-{
-   *offset = offsetof(struct thread_struct, fpu.state);
-   *size = fpu_kernel_xstate_size;
-}
+extern void arch_thread_struct_whitelist(unsigned long *offset, unsigned long *size);
 
 /*
  * Thread-synchronous status.
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 39ddb22c143b..875620fdfe61 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -206,7 +206,7 @@ void fpstate_init(struct fpu *fpu)
return;
}
 
-   memset(state, 0, fpu_kernel_xstate_size);
+   memset(state, 0, fpu_kernel_xstate_default_size);
 
if (static_cpu_has(X86_FEATURE_XSAVES))
fpstate_init_xstate(>xsave, xfeatures_mask_all);
@@ -233,7 +233,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 * Don't let 'init optimized' areas of the XSAVE area
 * leak into the child task:
 */
-   memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
+   memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_default_size);
 
/*
 * If the FPU registers are not current just memcpy() the state.
@@ -245,7 +245,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 */
fpregs_lock();
if (test_thread_flag(TIF_NEED_FPU_LOAD))
-   memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
+   memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_default_size);
 
else if (!copy_fpregs_to_fpstate(dst_fpu))
copy_kernel_to_fpregs(dst_fpu);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 4e89a2698cfb..ee6499075a89 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -131,13 +131,17 @@ static void __init fpu__init_system_generic(void)
 }
 
 /*
- * Size of the FPU context state. All tasks in the system use the
- * same context size, regardless of what portion they use.
- * This is inherent to the XSAVE architecture which puts all state
- * components into a single, continuous memory block:
+ * Size of the maximum FPU context state. It is inherent to the XSAVE architecture
+ * which puts all state components into a single, continuous memory block:
  */
-unsigned int fpu_kernel_xstate_size;
-EXPORT_SYMBOL_GPL(fpu_kernel_xstate_size);
+unsigned int fpu_kernel_xstate_max_size;
+
+/*
+ * Size of the initial FPU context state. All tasks in the system use this context
+ * size by default.
+ */
+unsigned int fpu_kernel_xstate_default_size;
+EXPORT_SYMBOL_GPL(fpu_kernel_xstate_default_size);
 
 /* Get alignment of the TYPE. */
 #define TYPE_ALIGN(TYPE) offsetof(struct { char x; TYPE test; }, test)
@@ -167,9 +171,9 @@ static void __init fpu__init_task_struct_size(void)
 
/*
 * Add back the dynamically-calculated register state
-* size.
+* size by default.
 */
-   task_size += fpu_kernel_xstate_size;
+   

[RFC PATCH 12/22] x86/fpu/xstate: Update xstate context copy function for supporting dynamic area

2020-10-01 Thread Chang S. Bae
There are xstate context copy functions used in the ptrace() and signal
return paths. They let callers read (or write) xstate values in the
task->fpu buffer, or get initial values. With dynamic user states, a
component's position in the buffer may vary, and the initial value is not
always stored in init_fpstate.

Change the helpers to find a component's offset accordingly (via either a
lookup table or calculation).

When copying an initial value, explicitly check the init_fpstate coverage.
If not found, reset the memory in the destination. Otherwise, copy values
from init_fpstate.

No functional change until the kernel supports dynamic user states.
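
A compact sketch of that initial-value policy (user-space C with
hypothetical masks and buffers; in the kernel the source is init_fpstate
and the destination is the task's xstate buffer):

#include <stdint.h>
#include <string.h>

#define INIT_FPSTATE_MASK ((1ULL << 0) | (1ULL << 2))	/* static states only */

static unsigned char init_fpstate[4096];	/* stand-in for init_fpstate */

/* Write the initial value of component nr at dst + off. */
static void copy_initial_value(unsigned char *dst, int nr,
			       unsigned int off, unsigned int size)
{
	if (INIT_FPSTATE_MASK & (1ULL << nr))
		memcpy(dst + off, init_fpstate + off, size);	/* covered */
	else
		memset(dst + off, 0, size);	/* dynamic: init value is zeros */
}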

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/kernel/fpu/xstate.c | 55 +++-
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 556ae8593806..b9261ab4e5e2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -245,12 +245,14 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
if (!(xfeatures & XFEATURE_MASK_SSE))
memset(&xsave->xmm_space[0], 0, 256);
 
+   /* Make sure 'xfeatures' is a subset of fpu->state_mask */
+   xfeatures = ((xfeatures_mask_user() & fpu->state_mask) & ~xfeatures);
/*
 * First two features are FPU and SSE, which above we handled
 * in a special way already:
 */
feature_bit = 0x2;
-   xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;
+   xfeatures >>= 0x2;
 
/*
 * Update all the remaining memory layouts according to their
@@ -259,12 +261,15 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
 */
while (xfeatures) {
if (xfeatures & 0x1) {
-   int offset = xstate_comp_offsets[feature_bit];
-   int size = xstate_sizes[feature_bit];
-
-   memcpy((void *)xsave + offset,
-  (void *)&init_fpstate.xsave + offset,
-  size);
+   unsigned int offset = get_xstate_comp_offset(fpu, feature_bit);
+   unsigned int size = xstate_sizes[feature_bit];
+
+   if (get_init_fpstate_mask() & BIT_ULL(feature_bit))
+   memcpy((void *)xsave + offset,
+  (void *)&init_fpstate.xsave + offset,
+  size);
+   else
+   memset((void *)xsave + offset, 0, size);
}
 
xfeatures >>= 1;
@@ -1238,7 +1243,10 @@ static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
 {
if (*last >= offset)
return;
-   membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
+   if (offset <= get_init_fpstate_size())
+   membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
+   else
+   membuf_zero(to, offset - *last);
*last = offset;
 }
 
@@ -1246,7 +1254,10 @@ static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
  unsigned size, void *from)
 {
fill_gap(to, last, offset);
-   membuf_write(to, from, size);
+   if (from)
+   membuf_write(to, from, size);
+   else
+   membuf_zero(to, size);
*last = offset + size;
 }
 
@@ -1298,12 +1309,22 @@ void copy_xstate_to_kernel(struct membuf to, struct fpu *fpu)
  sizeof(header), &header);
 
for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
+   u64 mask = BIT_ULL(i);
+   void *src;
/*
-* Copy only in-use xstates:
+* Copy only in-use xstate at first. If the feature is enabled,
+* find the init value, whether stored in init_fpstate or simply
+* zeros, and then copy them.
 */
-   if ((header.xfeatures >> i) & 1) {
-   void *src = __raw_xsave_addr(fpu, i);
-
+   if (header.xfeatures & mask) {
+   src = __raw_xsave_addr(fpu, i);
+   copy_part(&to, &last, xstate_offsets[i],
+ xstate_sizes[i], src);
+   } else if (xfeatures_mask_user() & mask) {
+   if (get_init_fpstate_mask() & mask)
+   src = (void *)&init_fpstate.xsave + last;
+   else
+   src = NULL;
copy_part(&to, &last, xstate_offsets[i],
  xstate_sizes[i], src);
}
@@ -1337,6 +1358,9 @@ int copy_kernel_to_xstate(struct fpu *fpu, const void *kbuf)
if (hdr.xfeatures & mask) {
void *dst = 

[RFC PATCH 05/22] x86/fpu/xstate: Introduce a new variable for dynamic user states

2020-10-01 Thread Chang S. Bae
The kernel recently gained support for the dynamic supervisor states. That
approach does not save the register states at every context switch (even
when in use), but only when needed. It is not suitable for user states.

Introduce xfeatures_mask_user_dynamic to identify dynamic user states, and
rename the following, as they relate to the dynamic supervisor states:
xfeatures_mask_supervisor_dynamic()
XFEATURE_MASK_SUPERVISOR_DYNAMIC

No functional change.
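
A small sketch of how the masks are meant to relate after this patch (the
mask values are hypothetical; the invariants are the point):

#include <assert.h>
#include <stdint.h>

int main(void)
{
	/* hypothetical values for illustration */
	uint64_t user_supported     = 0x2e7;		/* legacy + AVX etc. */
	uint64_t supervisor_dynamic = 1ULL << 15;	/* LBR, per the patch */
	uint64_t xfeatures_mask_all = user_supported | supervisor_dynamic;

	uint64_t user = xfeatures_mask_all & user_supported;
	uint64_t user_dynamic = 0;	/* not enabled yet, as the patch notes */

	/* dynamic user states must stay a subset of the user states */
	assert((user_dynamic & ~user) == 0);
	/* the two dynamic sets are disjoint: one user, one supervisor */
	assert((user_dynamic & supervisor_dynamic) == 0);
	return 0;
}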

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/xstate.h | 12 +++-
 arch/x86/kernel/fpu/xstate.c  | 29 +++--
 2 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 3fbf45727ad6..9aad91c0725b 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -56,7 +56,7 @@
  * - Don't set the bit corresponding to the dynamic supervisor feature in
  *   IA32_XSS at run time, since it has been set at boot time.
  */
-#define XFEATURE_MASK_DYNAMIC (XFEATURE_MASK_LBR)
+#define XFEATURE_MASK_SUPERVISOR_DYNAMIC (XFEATURE_MASK_LBR)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
@@ -66,7 +66,7 @@
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
- XFEATURE_MASK_DYNAMIC | \
+ XFEATURE_MASK_SUPERVISOR_DYNAMIC | \
  XFEATURE_MASK_SUPERVISOR_UNSUPPORTED)
 
 #ifdef CONFIG_X86_64
@@ -87,14 +87,16 @@ static inline u64 xfeatures_mask_user(void)
return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_dynamic(void)
+static inline u64 xfeatures_mask_supervisor_dynamic(void)
 {
if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
-   return XFEATURE_MASK_DYNAMIC & ~XFEATURE_MASK_LBR;
+   return XFEATURE_MASK_SUPERVISOR_DYNAMIC & ~XFEATURE_MASK_LBR;
 
-   return XFEATURE_MASK_DYNAMIC;
+   return XFEATURE_MASK_SUPERVISOR_DYNAMIC;
 }
 
+extern u64 xfeatures_mask_user_dynamic;
+
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 
 extern void __init update_regset_xstate_info(unsigned int size,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index bab22766b79b..bf2b09bf9b38 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -59,6 +59,12 @@ static short xsave_cpuid_features[] __initdata = {
  */
 u64 xfeatures_mask_all __read_mostly;
 
+/*
+ * This represents user xstates, a subset of xfeatures_mask_all, saved in a
+ * dynamic kernel XSAVE buffer.
+ */
+u64 xfeatures_mask_user_dynamic __read_mostly;
+
 static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
@@ -235,7 +241,7 @@ void fpu__init_cpu_xstate(void)
 */
if (boot_cpu_has(X86_FEATURE_XSAVES)) {
wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() |
-xfeatures_mask_dynamic());
+xfeatures_mask_supervisor_dynamic());
}
 }
 
@@ -682,7 +688,7 @@ static unsigned int __init get_xsaves_size(void)
  */
 static unsigned int __init get_xsaves_size_no_dynamic(void)
 {
-   u64 mask = xfeatures_mask_dynamic();
+   u64 mask = xfeatures_mask_supervisor_dynamic();
unsigned int size;
 
if (!mask)
@@ -769,6 +775,7 @@ static int __init init_xstate_size(void)
 static void fpu__init_disable_system_xstate(void)
 {
xfeatures_mask_all = 0;
+   xfeatures_mask_user_dynamic = 0;
cr4_clear_bits(X86_CR4_OSXSAVE);
setup_clear_cpu_cap(X86_FEATURE_XSAVE);
 }
@@ -835,6 +842,8 @@ void __init fpu__init_system_xstate(void)
}
 
xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
+   /* Do not support the dynamically allocated area yet. */
+   xfeatures_mask_user_dynamic = 0;
 
/* Enable xstate instructions to be able to continue with initialization: */
fpu__init_cpu_xstate();
@@ -882,7 +891,7 @@ void fpu__resume_cpu(void)
 */
if (boot_cpu_has(X86_FEATURE_XSAVES)) {
wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
-xfeatures_mask_dynamic());
+xfeatures_mask_supervisor_dynamic());
}
 }
 
@@ -1316,8 +1325,8 @@ void copy_supervisor_to_kernel(struct fpu *fpu)
  * @mask: Represent the dynamic supervisor features saved into the xsave area
  *
  * Only the dynamic supervisor states sets in the mask are saved into the xsave
- * area (See the comment 

[RFC PATCH 08/22] x86/fpu/xstate: Define the scope of the initial xstate data

2020-10-01 Thread Chang S. Bae
init_fpstate covers all the component states. But it becomes less efficient
to do this as the state size grows while the initial data stays trivial.

Limit init_fpstate by clarifying its size and coverage, which are all but
the dynamic user states. The dynamic states are assumed to be large but to
have all-zero initial data.

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae 
Reviewed-by: Len Brown 
Cc: x...@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h | 18 +++---
 arch/x86/include/asm/fpu/xstate.h   |  1 +
 arch/x86/kernel/fpu/core.c  |  4 ++--
 arch/x86/kernel/fpu/xstate.c|  4 ++--
 4 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 6eec5209750f..d64c1083bd93 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -79,6 +79,18 @@ static __always_inline __pure bool use_fxsr(void)
 
 extern union fpregs_state init_fpstate;
 
+static inline u64 get_init_fpstate_mask(void)
+{
+   /* init_fpstate covers the same state components as fpu->state */
+   return (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
+}
+
+static inline unsigned int get_init_fpstate_size(void)
+{
+   /* the fpu->state size matches the init_fpstate size */
+   return fpu_kernel_xstate_default_size;
+}
+
 extern void fpstate_init(struct fpu *fpu);
 #ifdef CONFIG_MATH_EMULATION
 extern void fpstate_init_soft(struct swregs_state *soft);
@@ -268,12 +280,12 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 : "memory")
 
 /*
- * This function is called only during boot time when x86 caps are not set
- * up and alternative can not be used yet.
+ * Use this function to dump the initial state, only during boot time when x86
+ * caps are not set up and alternatives are not available yet.
  */
 static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
 {
-   u64 mask = xfeatures_mask_all;
+   u64 mask = get_init_fpstate_mask();
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 37728bfcb71e..9de8b4c49855 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,6 +103,7 @@ extern void __init update_regset_xstate_info(unsigned int size,
 u64 xstate_mask);
 
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+unsigned int get_xstate_size(u64 mask);
 int alloc_xstate_area(struct fpu *fpu, u64 mask, unsigned int *alloc_size);
 void free_xstate_area(struct fpu *fpu);
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e25f7866800e..33956ae3de2b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -206,10 +206,10 @@ void fpstate_init(struct fpu *fpu)
return;
}
 
-   memset(state, 0, fpu_kernel_xstate_default_size);
+   memset(state, 0, fpu ? get_xstate_size(fpu->state_mask) : get_init_fpstate_size());
 
if (static_cpu_has(X86_FEATURE_XSAVES))
-   fpstate_init_xstate(&state->xsave, xfeatures_mask_all);
+   fpstate_init_xstate(&state->xsave, fpu ? fpu->state_mask : get_init_fpstate_mask());
if (static_cpu_has(X86_FEATURE_FXSR))
fpstate_init_fxstate(&state->fxsave);
else
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index af60332aafef..2e190254d4aa 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -134,7 +134,7 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 * Available once those arrays for the offset, size, and alignment info are set up,
  * by setup_xstate_features().
  */
-static unsigned int get_xstate_size(u64 mask)
+unsigned int get_xstate_size(u64 mask)
 {
unsigned int size;
u64 xmask;
@@ -507,7 +507,7 @@ static void __init setup_init_fpu_buf(void)
print_xstate_features();
 
if (boot_cpu_has(X86_FEATURE_XSAVES))
-   fpstate_init_xstate(&init_fpstate.xsave, xfeatures_mask_all);
+   fpstate_init_xstate(&init_fpstate.xsave, get_init_fpstate_mask());
 
/*
 * Init all the features state with header.xfeatures being 0x0
-- 
2.17.1



Re: [PATCH v6 0/5] DVFS support for Venus

2020-10-01 Thread Doug Anderson
Hi,

On Wed, Sep 16, 2020 at 12:26 AM Stanimir Varbanov
 wrote:
>
> Hi,
>
> On 9/16/20 8:33 AM, Rajendra Nayak wrote:
> >
> > On 9/1/2020 7:50 PM, Rajendra Nayak wrote:
> >> Rob, can you pick PATCH 1 since its already reviewed by you.
> >> Stan, Patch 2 and 3 will need to be picked by you and they both have
> >> your ACKs
> >
> > Rob/Stan, any plans to get the patches merged for 5.10?
>
> 2/5 and 3/5 are queued up for v5.10 through media tree.

Normally I'd expect device tree bindings (patch #1) to go through the
same tree as the driver changes.  Does the media tree work
differently?  If you're expecting Rob Herring to land the device tree
binding change, is he aware?


-Doug


Re: Linux 5.9-rc7 / VmallocTotal wrongly reported

2020-10-01 Thread Roman Gushchin
On Thu, Oct 01, 2020 at 12:58:36PM -0700, Linus Torvalds wrote:
> On Thu, Oct 1, 2020 at 12:56 PM Roman Gushchin  wrote:
> >
> > Bastian, can you, please, share your config?
> 
> Bastian actually did that in the original email, but that was only
> sent to me and Andrew in private.
> 
> Here's that config replicated for your pleasure,

Thank you!

> 
> #
> # Processor type and features
> #
> # CONFIG_ZONE_DMA is not set
> # CONFIG_SMP is not set

Yes, here is the deal.

The SMP-version of __mod_node_page_state() converts a passed value from bytes
to pages, but the non-SMP doesn't.

Thanks!


--

From 3d0233b37340c78012b991d3570b92f91cf5ebd2 Mon Sep 17 00:00:00 2001
From: Roman Gushchin 
Date: Thu, 1 Oct 2020 13:07:49 -0700
Subject: [PATCH] mm: memcg/slab: fix slab statistics in !SMP configuration

Since ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items")
the write side of slab counters accepts a value in bytes and converts
it to pages. It happens in __mod_node_page_state().

However a non-SMP version of __mod_node_page_state() doesn't perform
this conversion. It leads to incorrect (unrealistically high) slab
counters values. Fix this by adding a similar conversion to the
non-SMP version of __mod_node_page_state().
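
The arithmetic at stake, in stand-alone form (a PAGE_SHIFT of 12 is
assumed here purely for illustration):

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1 << PAGE_SHIFT)

int main(void)
{
	int delta = 16 * PAGE_SIZE;	/* slab counters are fed in bytes */

	/* SMP path: bytes -> pages; the buggy !SMP path skipped the shift */
	printf("correct: %d pages, buggy: %d \"pages\"\n",
	       delta >> PAGE_SHIFT, delta);
	return 0;
}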

Signed-off-by: Roman Gushchin 
Reported-by: Bastian Bittorf 
Fixes: ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items")
---
 include/linux/vmstat.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index d5431c1bf6e5..322dcbfcc933 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -312,6 +312,11 @@ static inline void __mod_zone_page_state(struct zone *zone,
 static inline void __mod_node_page_state(struct pglist_data *pgdat,
enum node_stat_item item, int delta)
 {
+   if (vmstat_item_in_bytes(item)) {
+   VM_WARN_ON_ONCE(delta & (PAGE_SIZE - 1));
+   delta >>= PAGE_SHIFT;
+   }
+
node_page_state_add(delta, pgdat, item);
 }
 
-- 
2.26.2



Re: [PATCH 1/1] drm/amdgpu: fix NULL pointer dereference for Renoir

2020-10-01 Thread Dirk Gouders
Dirk Gouders  writes:

> Commit c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
> introduced a NULL pointer dereference when booting with
> amdgpu.discovery=0, because it removed the call of vega10_reg_base_init()
> for that case.
>
> Fix this by calling that funcion if amdgpu_discovery == 0 in addition to
> the case that amdgpu_discovery_reg_base_init() failed.
>
> Fixes: c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
> Signed-off-by: Dirk Gouders 
> Cc: Hawking Zhang 
> Cc: Evan Quan 
> ---
>  drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 84d811b6e48b..f8cb62b326d6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -694,12 +694,12 @@ static void soc15_reg_base_init(struct amdgpu_device *adev)
>* it doesn't support SRIOV. */
>   if (amdgpu_discovery) {
>   r = amdgpu_discovery_reg_base_init(adev);
> - if (r) {
> - DRM_WARN("failed to init reg base from ip 
> discovery table, "
> -  "fallback to legacy init method\n");
> - vega10_reg_base_init(adev);
> - }
> + if (r == 0)
> +   break;

Grrr, wrong indentation here.
But I will wait for your review before v1.

Dirk


> + DRM_WARN("failed to init reg base from ip discovery 
> table, "
> +  "fallback to legacy init method\n");
>   }
> + vega10_reg_base_init(adev);
>   break;
>   case CHIP_VEGA20:
>   vega20_reg_base_init(adev);


Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Dmitry Osipenko
01.10.2020 14:04, Nicolin Chen wrote:
> On Thu, Oct 01, 2020 at 12:23:16PM +0200, Thierry Reding wrote:
>> It looks to me like the only reason why you need this new global API is
>> because PCI devices may not have a device tree node with a phandle to
>> the IOMMU. However, SMMU support for PCI will only be enabled if the
>> root complex has an iommus property, right? In that case, can't we
>> simply do something like this:
>>
>>  if (dev_is_pci(dev))
>>  np = find_host_bridge(dev)->of_node;
>>  else
>>  np = dev->of_node;
> 
>>> I personally am not a fan of adding a path for PCI device either,
>>> since PCI/IOMMU cores could have taken care of it while the same
>>> path can't be used for other buses.
>>
>> There's already plenty of other drivers that do something similar to
>> this. Take a look at the arm-smmu driver, for example, which seems to be
>> doing exactly the same thing to finding the right device tree node to
>> look at (see dev_get_dev_node() in drivers/iommu/arm-smmu/arm-smmu.c).
> 
> Hmm..okay..that is quite convincing then...

Not very convincing to me. I don't see a "plenty of other drivers",
there is only one arm-smmu driver.

The dev_get_dev_node() is under CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS (!).
Guys, doesn't it look strange to you? :)

For the modern bindings, the arm-smmu driver does something similar to
what Nicolin's v3 is doing.

>>> If we can't come to an agreement on globalizing mc pointer, would
>>> it be possible to pass tegra_mc_driver through tegra_smmu_probe()
>>> so we can continue to use driver_find_device_by_fwnode() as v1?
>>>
>>> v1: https://lkml.org/lkml/2020/9/26/68
>>
>> tegra_smmu_probe() already takes a struct tegra_mc *. Did you mean
>> tegra_smmu_probe_device()? I don't think we can do that because it isn't
> 
> I was saying to have a global parent_driver pointer: similar to
> my v1, yet rather than "extern" the tegra_mc_driver, we pass it
> through tegra_smmu_probe() and store it in a static global value
> so as to call tegra_smmu_get_by_fwnode() in ->probe_device().
> 
> Though I agree that creating a global device pointer (mc) might
> be controversial, yet having a global parent_driver pointer may
> not be against the rule, considering that it is common in iommu
> drivers to call driver_find_device_by_fwnode in probe_device().

You don't need the global pointer if you have SMMU OF node.

You could also get driver pointer from mc->dev->driver.

But I don't think you need to do this at all. The probe_device() could
be invoked only for the tegra_smmu_ops and then seems you could use
dev_iommu_priv_set() in tegra_smmu_of_xlate(), like sun50i-iommu driver
does.
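
For reference, a rough sketch of that approach (untested; it assumes the
memory controller driver publishes its context via platform_set_drvdata()
and that the SMMU instance is reachable as mc->smmu):

static int tegra_smmu_of_xlate(struct device *dev,
			       struct of_phandle_args *args)
{
	struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
	struct tegra_mc *mc;

	if (!iommu_pdev)
		return -ENODEV;

	mc = platform_get_drvdata(iommu_pdev);
	put_device(&iommu_pdev->dev);
	if (!mc || !mc->smmu)
		return -EPROBE_DEFER;

	/* stash the SMMU instance so ->probe_device() can pick it up */
	dev_iommu_priv_set(dev, mc->smmu);

	return iommu_fwspec_add_ids(dev, args->args, 1);
}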


Re: How should we handle illegal task FPU state?

2020-10-01 Thread Yu, Yu-cheng

On 10/1/2020 10:43 AM, Andy Lutomirski wrote:

Our current handling of illegal task FPU state is currently rather
simplistic.  We basically ignore the issue with this extable code:

/*
  * Handler for when we fail to restore a task's FPU state.  We should never get
  * here because the FPU state of a task using the FPU (task->thread.fpu.state)
  * should always be valid.  However, past bugs have allowed userspace to set
  * reserved bits in the XSAVE area using PTRACE_SETREGSET or sys_rt_sigreturn().
  * These caused XRSTOR to fail when switching to the task, leaking the FPU
  * registers of the task previously executing on the CPU.  Mitigate this class
  * of vulnerability by restoring from the initial state (essentially, zeroing
  * out all the FPU registers) if we can't restore from the task's FPU state.
  */
__visible bool ex_handler_fprestore(const struct exception_table_entry *fixup,
				    struct pt_regs *regs, int trapnr,
				    unsigned long error_code,
				    unsigned long fault_addr)
{
 regs->ip = ex_fixup_addr(fixup);

 WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing
FPU registers.",
   (void *)instruction_pointer(regs));

 __copy_kernel_to_fpregs(_fpstate, -1);
 return true;
}
EXPORT_SYMBOL_GPL(ex_handler_fprestore);

In other words, we mostly pretend that illegal FPU state can't happen,
and, if it happens, we print a WARN and we blindly run the task with
the wrong state.  This is at least an improvement from the previous
code -- see

commit d5c8028b4788f62b31fb79a331b3ad3e041fa366
Author: Eric Biggers 
Date:   Sat Sep 23 15:00:09 2017 +0200

 x86/fpu: Reinitialize FPU registers if restoring FPU state fails

And we have some code that tries to sanitize user state to avoid this.

IMO this all made a little bit of sense when "FPU" meant literally FPU
or at least state that was more or less just user registers.  But now
we have this fancy "supervisor" state, and I don't think we should be
running user code in a context with potentially corrupted or even
potentially incorrectly re-initialized supervisor state.  This is an
issue for SHSTK -- if an attacker can find a straightforward way to
corrupt a target task's FPU state, then that task will run with CET
disabled.  Whoops!

The question is: what do we do about it?  We have two basic choices, I think.

a) Decide that the saved FPU for a task *must* be valid at all times.
If there's a failure to restore state, kill the task.

b) Improve our failed restoration handling and maybe even
intentionally make it possible to create illegal state to allow
testing.

(a) sounds like a nice concept, but I'm not convinced it's practical.
For example, I'm not even convinced that the set of valid SSP values
is documented.

So maybe (b) is the right choice.  Getting a good implementation might
be tricky.  Right now, we restore FPU too late in
arch_exit_to_user_mode_prepare(), and that function isn't allowed to
fail or to send signals.  We could kill the task on failure, and I
suppose we could consider queueing a signal, sending IPI-to-self, and
returning with TIF_NEED_FPU_LOAD still set and bogus state.  Or we
could rework the exit-to-usermode code to allow failure.  All of this
becomes utterly gross for the return-from-NMI path, although I guess
we don't restore FPU regs in that path regardless.  Or we can
do_exit() and just bail outright.

I think it would be polite to at least allow core dumping a bogus FPU
state, and notifying ptrace() might be nice.  And, if the bogus part
of the FPU state is non-supervisor, we could plausibly deliver a
signal, but this is (as above) potentially quite difficult.

(As an aside, our current handling of signal delivery failure sucks.
We should *at least* log something useful.)


Regardless of how we decide to handle this, I do think we need to do
*something* before applying the CET patches.



Before supervisor states are introduced, XRSTOR* fails because of one of
the following: the memory operand is invalid, the xstate_header is wrong,
or fxregs_state->mxcsr is wrong.  So the code in ex_handler_fprestore()
was good.


When supervisor states are introduced for CET and PASID, XRSTORS can 
fail for only one additional reason: if it effects a WRMSR of invalid 
values.


If the kernel writes to the MSRs directly, there is wrmsr_safe().  If
the kernel writes to MSRs' xstates, it can check the values first.  So
this might not need generalized handling (but I would not oppose it).
Maybe we can add a debug config option that verifies writes to those
MSR xstates are validated before being committed (and prints warnings
when not)?
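
(For illustration, a minimal sketch of the wrmsr_safe() idea; the MSR
number and the policy are placeholders here, not actual CET definitions:)

/* Probe a candidate value with wrmsr_safe() instead of letting a later
 * XRSTORS fault on it; the MSR number is caller-supplied. */
static int write_msr_checked(u32 msr, u64 val)
{
	if (wrmsr_safe(msr, (u32)val, (u32)(val >> 32))) {
		pr_warn_once("rejected invalid value 0x%llx for MSR 0x%x\n",
			     val, msr);
		return -EINVAL;
	}
	return 0;
}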


Thanks,
Yu-cheng


Re: [GIT PULL] arm64 fix for 5.9-rc8/final

2020-10-01 Thread pr-tracker-bot
The pull request you sent on Thu, 1 Oct 2020 18:35:07 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux tags/arm64-fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/eed2ef4403de3d8937ccb624e15d3c5004e7dda5

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH] locking/atomics: Check atomic-arch-fallback.h too

2020-10-01 Thread Paul Bolle
The sha1sum of include/linux/atomic-arch-fallback.h isn't checked by
check-atomics.sh. It's not clear why it's skipped, so let's check it too.

Signed-off-by: Paul Bolle 
---
It seems it never has been checked. So this does cast some doubt about
the usefulness of these tests. But I'm clueless about this atomic stuff
so what do I know?

 scripts/atomic/check-atomics.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/atomic/check-atomics.sh b/scripts/atomic/check-atomics.sh
index 8378c63a1e09..82748d42ecc5 100755
--- a/scripts/atomic/check-atomics.sh
+++ b/scripts/atomic/check-atomics.sh
@@ -16,6 +16,7 @@ fi
 cat <<EOF |
 asm-generic/atomic-instrumented.h
 asm-generic/atomic-long.h
+linux/atomic-arch-fallback.h
 linux/atomic-fallback.h
 EOF
 while read header; do

[PATCH] ARM: multi_v7_defconfig: ti: Enable networking options for nfs boot

2020-10-01 Thread Grygorii Strashko
Enable networking options required for NFS boot on TI platforms, which is
widely used for automated test systems.
- enable new TI CPSW switch driver and related NET_SWITCHDEV config
- enable TI DP83867 phy
- explicitly enable PTP clock support to ensure dependent networking
drivers will stay built-in

Signed-off-by: Grygorii Strashko 
---
 arch/arm/configs/multi_v7_defconfig | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index e9e76e32f10f..11b3184d8154 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -152,6 +152,7 @@ CONFIG_INET6_IPCOMP=m
 CONFIG_IPV6_MIP6=m
 CONFIG_IPV6_TUNNEL=m
 CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_NET_SWITCHDEV=y
 CONFIG_NET_DSA=m
 CONFIG_CAN=y
 CONFIG_CAN_AT91=m
@@ -268,9 +269,12 @@ CONFIG_SNI_AVE=y
 CONFIG_STMMAC_ETH=y
 CONFIG_DWMAC_DWC_QOS_ETH=y
 CONFIG_TI_CPSW=y
+CONFIG_TI_CPSW_SWITCHDEV=y
+CONFIG_TI_CPTS=y
 CONFIG_XILINX_EMACLITE=y
 CONFIG_BROADCOM_PHY=y
 CONFIG_ICPLUS_PHY=y
+CONFIG_DP83867_PHY=y
 CONFIG_MARVELL_PHY=y
 CONFIG_MICREL_PHY=y
 CONFIG_AT803X_PHY=y
@@ -434,6 +438,7 @@ CONFIG_SPI_TEGRA20_SLINK=y
 CONFIG_SPI_XILINX=y
 CONFIG_SPI_SPIDEV=y
 CONFIG_SPMI=y
+CONFIG_PTP_1588_CLOCK=y
 CONFIG_PINCTRL_AS3722=y
 CONFIG_PINCTRL_RZA2=y
 CONFIG_PINCTRL_STMFX=y
-- 
2.17.1



Re: [git pull] IOMMU Fixes for Linux v5.9-rc7

2020-10-01 Thread pr-tracker-bot
The pull request you sent on Thu, 1 Oct 2020 20:50:30 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.9-rc7

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/44b6e23be32be4470b1b8bf27380c2e9cca98e2b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [musl] [PATCH 1/1] uapi: Don't include <linux/sysinfo.h> in <linux/kernel.h>

2020-10-01 Thread Petr Vorel
Hi Rich,

> On Thu, Oct 01, 2020 at 09:52:31PM +0200, Petr Vorel wrote:
> > + update code where needed (include <linux/sysinfo.h> in code which
> > included <linux/kernel.h> only to get struct sysinfo or SI_LOAD_SHIFT).

> > The reason is to avoid indirect <linux/sysinfo.h> include when using
> > some network headers: <linux/netlink.h> or others [1] ->
> > <linux/kernel.h> -> <linux/sysinfo.h>.

> > This indirect include causes redefinition of struct sysinfo when
> > included both <sys/sysinfo.h> and some of the network headers:

> > In file included from 
> > x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5,
> >  from 
> > x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5,
> >  from ../include/tst_netlink.h:14,
> >  from tst_crypto.c:13:
> > x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: 
> > redefinition of ‘struct sysinfo’
> >  struct sysinfo {
> > ^~~
> > In file included from ../include/tst_safe_macros.h:15,
> >  from ../include/tst_test.h:93,
> >  from tst_crypto.c:11:
> > x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: 
> > originally defined here

> > [1] or , , , 
> > 

> > Suggested-by: Rich Felker 
> > Signed-off-by: Petr Vorel 
> > ---
> > Hi,

> > this looks to be long standing problem: python-psutil [2], iproute2 [3],
> > even for glibc in the past [4] and it tried to be solved before [5].

> > This will require glibc fix after:

> You can't do this; it breaks the existing contract with glibc. New
> kernel headers can't force a glibc upgrade.
Right, got that.

> You just have to get rid
> of use of <linux/sysinfo.h> elsewhere in the uapi headers. It was a
> mistake that <linux/sysinfo.h> was ever separated out of
> <linux/kernel.h> since it didn't (and couldn't) fix the contract that
> <linux/kernel.h> exposes struct sysinfo (and that it's misnamed). But
> it's no big deal. This can all be fixed without any breakage anywhere
> just by not using it.
Back to your original suggestion to move the alignment macros to a separate
header. I was trying to avoid it, as I wasn't sure whether introducing a
new header is acceptable, but we'll see.

> Rich

Kind regards,
Petr


Re: [musl] [PATCH 1/1] uapi: Don't include <linux/sysinfo.h> in <linux/kernel.h>

2020-10-01 Thread Rich Felker
On Thu, Oct 01, 2020 at 09:52:31PM +0200, Petr Vorel wrote:
> + update code where needed (include <linux/sysinfo.h> in code which
> included <linux/kernel.h> only to get struct sysinfo or SI_LOAD_SHIFT).
> 
> The reason is to avoid indirect <linux/sysinfo.h> include when using
> some network headers: <linux/netlink.h> or others [1] ->
> <linux/kernel.h> -> <linux/sysinfo.h>.
> 
> This indirect include causes redefinition of struct sysinfo when
> included both <sys/sysinfo.h> and some of the network headers:
> 
> In file included from 
> x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5,
>  from 
> x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5,
>  from ../include/tst_netlink.h:14,
>  from tst_crypto.c:13:
> x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: 
> redefinition of ‘struct sysinfo’
>  struct sysinfo {
> ^~~
> In file included from ../include/tst_safe_macros.h:15,
>  from ../include/tst_test.h:93,
>  from tst_crypto.c:11:
> x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: 
> originally defined here
> 
> [1] or , , , 
> 
> 
> Suggested-by: Rich Felker 
> Signed-off-by: Petr Vorel 
> ---
> Hi,
> 
> this looks to be long standing problem: python-psutil [2], iproute2 [3],
> even for glibc in the past [4] and it tried to be solved before [5].
> 
> This will require glibc fix after:

You can't do this; it breaks the existing contract with glibc. New
kernel headers can't force a glibc upgrade. You just have to get rid
of use of <linux/sysinfo.h> elsewhere in the uapi headers. It was a
mistake that <linux/sysinfo.h> was ever separated out of
<linux/kernel.h> since it didn't (and couldn't) fix the contract that
<linux/kernel.h> exposes struct sysinfo (and that it's misnamed). But
it's no big deal. This can all be fixed without any breakage anywhere
just by not using it.
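
(For completeness, a minimal reproducer of that contract -- it should
build against glibc, whose <sys/sysinfo.h> takes struct sysinfo from the
kernel header, but fails against musl exactly as in the report above:)

/* repro.c: struct sysinfo ends up defined twice under musl: once by
 * <sys/sysinfo.h> from libc, once by <linux/sysinfo.h> dragged in via
 * <linux/netlink.h> -> <linux/kernel.h>. */
#include <sys/sysinfo.h>
#include <linux/netlink.h>

int main(void)
{
	struct sysinfo si;
	return sysinfo(&si);
}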

Rich


Re: [PATCH v11 2/3] arch: Wire up trusted_for(2)

2020-10-01 Thread Mickaël Salaün


On 01/10/2020 21:33, Tycho Andersen wrote:
> On Thu, Oct 01, 2020 at 07:02:31PM +0200, Mickaël Salaün wrote:
>> --- a/include/uapi/asm-generic/unistd.h
>> +++ b/include/uapi/asm-generic/unistd.h
>> @@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
>>  __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
>>  #define __NR_faccessat2 439
>>  __SYSCALL(__NR_faccessat2, sys_faccessat2)
>> +#define __NR_trusted_for 443
>> +__SYSCALL(__NR_trusted_for, sys_trusted_for)
>>  
>>  #undef __NR_syscalls
>> -#define __NR_syscalls 440
>> +#define __NR_syscalls 444
> 
> Looks like a rebase problem here?

No, it is a synchronization with the -next tree (cf. changelog) as asked
(and acked for a previous version) by Arnd.


[PATCH v15 15/15] mtd: spi-nor: micron-st: allow using MT35XU512ABA in Octal DTR mode

2020-10-01 Thread Pratyush Yadav
Since this flash doesn't have a Profile 1.0 table, the Octal DTR
capabilities are enabled in the post SFDP fixup, along with the 8D-8D-8D
fast read settings.

Enable Octal DTR mode with 20 dummy cycles to allow running at the
maximum supported frequency of 200 MHz.

The flash supports the soft reset sequence. So, add the flag in the
flash's info.

Signed-off-by: Pratyush Yadav 
---
 drivers/mtd/spi-nor/micron-st.c | 100 +++-
 1 file changed, 99 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/spi-nor/micron-st.c b/drivers/mtd/spi-nor/micron-st.c
index ef3695080710..bf3c5110742c 100644
--- a/drivers/mtd/spi-nor/micron-st.c
+++ b/drivers/mtd/spi-nor/micron-st.c
@@ -8,10 +8,108 @@
 
 #include "core.h"
 
+#define SPINOR_OP_MT_DTR_RD	0xfd	/* Fast Read opcode in DTR mode */
+#define SPINOR_OP_MT_RD_ANY_REG	0x85	/* Read volatile register */
+#define SPINOR_OP_MT_WR_ANY_REG	0x81	/* Write volatile register */
+#define SPINOR_REG_MT_CFR0V	0x00	/* For setting octal DTR mode */
+#define SPINOR_REG_MT_CFR1V	0x01	/* For setting dummy cycles */
+#define SPINOR_MT_OCT_DTR	0xe7	/* Enable Octal DTR. */
+#define SPINOR_MT_EXSPI		0xff	/* Enable Extended SPI (default) */
+
+static int spi_nor_micron_octal_dtr_enable(struct spi_nor *nor, bool enable)
+{
+   struct spi_mem_op op;
+   u8 *buf = nor->bouncebuf;
+   int ret;
+
+   if (enable) {
+   /* Use 20 dummy cycles for memory array reads. */
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   return ret;
+
+   *buf = 20;
+   op = (struct spi_mem_op)
+   SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_MT_WR_ANY_REG, 1),
+  SPI_MEM_OP_ADDR(3, SPINOR_REG_MT_CFR1V, 1),
+  SPI_MEM_OP_NO_DUMMY,
+  SPI_MEM_OP_DATA_OUT(1, buf, 1));
+
+		ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret)
+   return ret;
+
+   ret = spi_nor_wait_till_ready(nor);
+   if (ret)
+   return ret;
+   }
+
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   return ret;
+
+   if (enable)
+   *buf = SPINOR_MT_OCT_DTR;
+   else
+   *buf = SPINOR_MT_EXSPI;
+
+   op = (struct spi_mem_op)
+   SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_MT_WR_ANY_REG, 1),
+  SPI_MEM_OP_ADDR(enable ? 3 : 4,
+  SPINOR_REG_MT_CFR0V, 1),
+  SPI_MEM_OP_NO_DUMMY,
+  SPI_MEM_OP_DATA_OUT(1, buf, 1));
+
+   if (!enable)
+		spi_nor_spimem_setup_op(nor, &op, SNOR_PROTO_8_8_8_DTR);
+
+	ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret)
+   return ret;
+
+   /* Give some time for the mode change to take place. */
+   usleep_range(1000, 1500);
+
+   return 0;
+}
+
+static void mt35xu512aba_default_init(struct spi_nor *nor)
+{
+   nor->params->octal_dtr_enable = spi_nor_micron_octal_dtr_enable;
+}
+
+static void mt35xu512aba_post_sfdp_fixup(struct spi_nor *nor)
+{
+   /* Set the Fast Read settings. */
+   nor->params->hwcaps.mask |= SNOR_HWCAPS_READ_8_8_8_DTR;
+	spi_nor_set_read_settings(&nor->params->reads[SNOR_CMD_READ_8_8_8_DTR],
+ 0, 20, SPINOR_OP_MT_DTR_RD,
+ SNOR_PROTO_8_8_8_DTR);
+
+   nor->cmd_ext_type = SPI_NOR_EXT_REPEAT;
+   nor->params->rdsr_dummy = 8;
+   nor->params->rdsr_addr_nbytes = 0;
+
+   /*
+* The BFPT quad enable field is set to a reserved value so the quad
+* enable function is ignored by spi_nor_parse_bfpt(). Make sure we
+* disable it.
+*/
+   nor->params->quad_enable = NULL;
+}
+
+static struct spi_nor_fixups mt35xu512aba_fixups = {
+   .default_init = mt35xu512aba_default_init,
+   .post_sfdp = mt35xu512aba_post_sfdp_fixup,
+};
+
 static const struct flash_info micron_parts[] = {
{ "mt35xu512aba", INFO(0x2c5b1a, 0, 128 * 1024, 512,
   SECT_4K | USE_FSR | SPI_NOR_OCTAL_READ |
-  SPI_NOR_4B_OPCODES) },
+  SPI_NOR_4B_OPCODES | SPI_NOR_OCTAL_DTR_READ |
+  SPI_NOR_OCTAL_DTR_PP |
+  SPI_NOR_IO_MODE_EN_VOLATILE)
+		.fixups = &mt35xu512aba_fixups },
{ "mt35xu02g", INFO(0x2c5b1c, 0, 128 * 1024, 2048,
SECT_4K | USE_FSR | SPI_NOR_OCTAL_READ |
SPI_NOR_4B_OPCODES) },
-- 
2.28.0



[PATCH v15 11/15] mtd: spi-nor: sfdp: detect Soft Reset sequence support from BFPT

2020-10-01 Thread Pratyush Yadav
A Soft Reset sequence will return the flash to Power-on-Reset (POR)
state. It consists of two commands: Soft Reset Enable and Soft Reset.
Find out if the sequence is supported from BFPT DWORD 16.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.h | 1 +
 drivers/mtd/spi-nor/sfdp.c | 4 
 drivers/mtd/spi-nor/sfdp.h | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index 105a4ddeb309..0a775a7b5606 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -27,6 +27,7 @@ enum spi_nor_option_flags {
SNOR_F_HAS_4BIT_BP  = BIT(12),
SNOR_F_HAS_SR_BP3_BIT6  = BIT(13),
SNOR_F_IO_MODE_EN_VOLATILE = BIT(14),
+   SNOR_F_SOFT_RESET   = BIT(15),
 };
 
 struct spi_nor_read_command {
diff --git a/drivers/mtd/spi-nor/sfdp.c b/drivers/mtd/spi-nor/sfdp.c
index 3efcba5e629a..22cb519efe3f 100644
--- a/drivers/mtd/spi-nor/sfdp.c
+++ b/drivers/mtd/spi-nor/sfdp.c
@@ -608,6 +608,10 @@ static int spi_nor_parse_bfpt(struct spi_nor *nor,
break;
}
 
+   /* Soft Reset support. */
+   if (bfpt.dwords[BFPT_DWORD(16)] & BFPT_DWORD16_SWRST_EN_RST)
+   nor->flags |= SNOR_F_SOFT_RESET;
+
/* Stop here if not JESD216 rev C or later. */
if (bfpt_header->length == BFPT_DWORD_MAX_JESD216B)
return spi_nor_post_bfpt_fixups(nor, bfpt_header, ,
diff --git a/drivers/mtd/spi-nor/sfdp.h b/drivers/mtd/spi-nor/sfdp.h
index 6d7243067252..89152ae1cf3e 100644
--- a/drivers/mtd/spi-nor/sfdp.h
+++ b/drivers/mtd/spi-nor/sfdp.h
@@ -90,6 +90,8 @@ struct sfdp_bfpt {
 #define BFPT_DWORD15_QER_SR2_BIT1_NO_RD	(0x4UL << 20)
 #define BFPT_DWORD15_QER_SR2_BIT1  (0x5UL << 20) /* Spansion */
 
+#define BFPT_DWORD16_SWRST_EN_RST  BIT(12)
+
 #define BFPT_DWORD18_CMD_EXT_MASK  GENMASK(30, 29)
 #define BFPT_DWORD18_CMD_EXT_REP   (0x0UL << 29) /* Repeat */
 #define BFPT_DWORD18_CMD_EXT_INV   (0x1UL << 29) /* Invert */
-- 
2.28.0



[PATCH v15 08/15] mtd: spi-nor: Introduce SNOR_F_IO_MODE_EN_VOLATILE

2020-10-01 Thread Pratyush Yadav
From: Tudor Ambarus 

We don't want to enter a stateful mode, where an X-X-X I/O mode
is entered by setting a non-volatile bit, because in case of a
reset or a crash, once in the non-volatile mode, we may not be able
to recover in bootloaders and we may break the SPI NOR boot.

Forbid by default the I/O modes that are set via a non-volatile bit.

SPI_NOR_IO_MODE_EN_VOLATILE should be set just for the flashes that
don't define the optional SFDP SCCR Map, so that we don't pollute the
flash info flags.

Signed-off-by: Tudor Ambarus 
Signed-off-by: Pratyush Yadav 
---
 drivers/mtd/spi-nor/core.c | 3 +++
 drivers/mtd/spi-nor/core.h | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index b83bf5ed2b2d..e91ddb409699 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -3387,6 +3387,9 @@ int spi_nor_scan(struct spi_nor *nor, const char *name,
if (info->flags & SPI_NOR_4B_OPCODES)
nor->flags |= SNOR_F_4B_OPCODES;
 
+   if (info->flags & SPI_NOR_IO_MODE_EN_VOLATILE)
+   nor->flags |= SNOR_F_IO_MODE_EN_VOLATILE;
+
ret = spi_nor_set_addr_width(nor);
if (ret)
return ret;
diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index 9a33c8d07335..eaece1123c0b 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -26,6 +26,7 @@ enum spi_nor_option_flags {
SNOR_F_HAS_SR_TB_BIT6   = BIT(11),
SNOR_F_HAS_4BIT_BP  = BIT(12),
SNOR_F_HAS_SR_BP3_BIT6  = BIT(13),
+   SNOR_F_IO_MODE_EN_VOLATILE = BIT(14),
 };
 
 struct spi_nor_read_command {
@@ -320,6 +321,11 @@ struct flash_info {
 */
 #define SPI_NOR_OCTAL_DTR_READ BIT(19) /* Flash supports octal DTR Read. */
+#define SPI_NOR_OCTAL_DTR_PP	BIT(20) /* Flash supports Octal DTR Page Program */
+#define SPI_NOR_IO_MODE_EN_VOLATILEBIT(21) /*
+* Flash enables the best
+* available I/O mode via a
+* volatile bit.
+*/
 
/* Part specific fixup hooks. */
const struct spi_nor_fixups *fixups;
-- 
2.28.0



[PATCH v15 12/15] mtd: spi-nor: core: perform a Soft Reset on shutdown

2020-10-01 Thread Pratyush Yadav
Perform a Soft Reset on shutdown on flashes that support it so that the
flash can be reset to its initial state and any configurations made by
spi-nor (given that they're only done in volatile registers) will be
reset. This will hand back the flash in pristine state for any further
operations on it.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c  | 45 +
 include/linux/mtd/spi-nor.h |  2 ++
 2 files changed, 47 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index cf6ada7c8a7b..feb4310ff6dc 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -40,6 +40,9 @@
 
 #define SPI_NOR_MAX_ADDR_WIDTH 4
 
+#define SPI_NOR_SRST_SLEEP_MIN 200
+#define SPI_NOR_SRST_SLEEP_MAX 400
+
 /**
  * spi_nor_get_cmd_ext() - Get the command opcode extension based on the
  *extension type.
@@ -3175,6 +3178,45 @@ static int spi_nor_init(struct spi_nor *nor)
return 0;
 }
 
+static void spi_nor_soft_reset(struct spi_nor *nor)
+{
+   struct spi_mem_op op;
+   int ret;
+
+   op = (struct spi_mem_op)SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_SRSTEN, 0),
+   SPI_MEM_OP_NO_DUMMY,
+   SPI_MEM_OP_NO_ADDR,
+   SPI_MEM_OP_NO_DATA);
+
+	spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
+
+	ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret) {
+   dev_warn(nor->dev, "Software reset failed: %d\n", ret);
+   return;
+   }
+
+   op = (struct spi_mem_op)SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_SRST, 0),
+   SPI_MEM_OP_NO_DUMMY,
+   SPI_MEM_OP_NO_ADDR,
+   SPI_MEM_OP_NO_DATA);
+
+	spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
+
+	ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret) {
+   dev_warn(nor->dev, "Software reset failed: %d\n", ret);
+   return;
+   }
+
+   /*
+* Software Reset is not instant, and the delay varies from flash to
+* flash. Looking at a few flashes, most range somewhere below 100
+* microseconds. So, sleep for a range of 200-400 us.
+*/
+   usleep_range(SPI_NOR_SRST_SLEEP_MIN, SPI_NOR_SRST_SLEEP_MAX);
+}
+
 /* mtd resume handler */
 static void spi_nor_resume(struct mtd_info *mtd)
 {
@@ -3194,6 +3236,9 @@ void spi_nor_restore(struct spi_nor *nor)
if (nor->addr_width == 4 && !(nor->flags & SNOR_F_4B_OPCODES) &&
nor->flags & SNOR_F_BROKEN_RESET)
nor->params->set_4byte_addr_mode(nor, false);
+
+   if (nor->flags & SNOR_F_SOFT_RESET)
+   spi_nor_soft_reset(nor);
 }
 EXPORT_SYMBOL_GPL(spi_nor_restore);
 
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index cd549042c53d..299685d15dc2 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -51,6 +51,8 @@
 #define SPINOR_OP_CLFSR		0x50	/* Clear flag status register */
 #define SPINOR_OP_RDEAR		0xc8	/* Read Extended Address Register */
 #define SPINOR_OP_WREAR		0xc5	/* Write Extended Address Register */
+#define SPINOR_OP_SRSTEN	0x66	/* Software Reset Enable */
+#define SPINOR_OP_SRST		0x99	/* Software Reset */
 
 /* 4-byte address opcodes - used on Spansion and some Macronix flashes. */
 #define SPINOR_OP_READ_4B  0x13/* Read data bytes (low frequency) */
-- 
2.28.0



[PATCH v15 05/15] mtd: spi-nor: sfdp: parse xSPI Profile 1.0 table

2020-10-01 Thread Pratyush Yadav
This table is indication that the flash is xSPI compliant and hence
supports octal DTR mode. Extract information like the fast read opcode,
dummy cycles, the number of dummy cycles needed for a Read Status
Register command, and the number of address bytes needed for a Read
Status Register command.

We don't know what speed the controller is running at. Find the fast
read dummy cycles for the fastest frequency the flash can run at to be
sure we are never short of dummy cycles. If nothing is available,
default to 20. Flashes that use a different value should update it in
their fixup hooks.
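
(A sketch of that selection order, using the DWORD field names this patch
introduces; the dwords[] indices assume a 0-based DWORD n -> dwords[n - 1]
mapping, and error handling is omitted:)

	/* Pick dummy cycles for the fastest advertised frequency; fall
	 * back to 20 if the table leaves them all zero. Flash-specific
	 * fixup hooks can override the result later. */
	dummy = FIELD_GET(PROFILE1_DWORD4_DUMMY_200MHZ, dwords[3]);
	if (!dummy)
		dummy = FIELD_GET(PROFILE1_DWORD5_DUMMY_166MHZ, dwords[4]);
	if (!dummy)
		dummy = FIELD_GET(PROFILE1_DWORD5_DUMMY_133MHZ, dwords[4]);
	if (!dummy)
		dummy = FIELD_GET(PROFILE1_DWORD5_DUMMY_100MHZ, dwords[4]);
	if (!dummy)
		dummy = 20;
	/* Round up to an even count: DTR transfers on both clock edges. */
	dummy = round_up(dummy, 2);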

Since we want to set read settings, expose spi_nor_set_read_settings()
in core.h.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c |  2 +-
 drivers/mtd/spi-nor/core.h | 10 +
 drivers/mtd/spi-nor/sfdp.c | 91 ++
 3 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 779e64974fea..ad280874a2e8 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -2332,7 +2332,7 @@ static int spi_nor_check(struct spi_nor *nor)
return 0;
 }
 
-static void
+void
 spi_nor_set_read_settings(struct spi_nor_read_command *read,
  u8 num_mode_clocks,
  u8 num_wait_states,
diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index 5d95b4183a33..9a33c8d07335 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -192,6 +192,9 @@ struct spi_nor_locking_ops {
  *
  * @size:  the flash memory density in bytes.
  * @page_size: the page size of the SPI NOR flash memory.
+ * @rdsr_dummy:		dummy cycles needed for Read Status Register command.
+ * @rdsr_addr_nbytes:	dummy address bytes needed for Read Status Register
+ *			command.
  * @hwcaps:describes the read and page program hardware
  * capabilities.
  * @reads: read capabilities ordered by priority: the higher index
@@ -214,6 +217,8 @@ struct spi_nor_locking_ops {
 struct spi_nor_flash_parameter {
u64 size;
u32 page_size;
+   u8  rdsr_dummy;
+   u8  rdsr_addr_nbytes;
 
struct spi_nor_hwcaps   hwcaps;
struct spi_nor_read_command reads[SNOR_CMD_READ_MAX];
@@ -425,6 +430,11 @@ ssize_t spi_nor_write_data(struct spi_nor *nor, loff_t to, 
size_t len,
 
 int spi_nor_hwcaps_read2cmd(u32 hwcaps);
 u8 spi_nor_convert_3to4_read(u8 opcode);
+void spi_nor_set_read_settings(struct spi_nor_read_command *read,
+  u8 num_mode_clocks,
+  u8 num_wait_states,
+  u8 opcode,
+  enum spi_nor_protocol proto);
 void spi_nor_set_pp_settings(struct spi_nor_pp_command *pp, u8 opcode,
 enum spi_nor_protocol proto);
 
diff --git a/drivers/mtd/spi-nor/sfdp.c b/drivers/mtd/spi-nor/sfdp.c
index c77655968f80..b2d097b44a55 100644
--- a/drivers/mtd/spi-nor/sfdp.c
+++ b/drivers/mtd/spi-nor/sfdp.c
@@ -4,6 +4,7 @@
  * Copyright (C) 2014, Freescale Semiconductor, Inc.
  */
 
+#include <linux/bitfield.h>
 #include <linux/slab.h>
 #include <linux/sort.h>
 #include <linux/mtd/spi-nor.h>
@@ -19,6 +20,7 @@
 #define SFDP_BFPT_ID   0xff00  /* Basic Flash Parameter Table */
 #define SFDP_SECTOR_MAP_ID 0xff81  /* Sector Map Table */
 #define SFDP_4BAIT_ID  0xff84  /* 4-byte Address Instruction Table */
+#define SFDP_PROFILE1_ID   0xff05  /* xSPI Profile 1.0 table. */
 
 #define SFDP_SIGNATURE 0x50444653U
 
@@ -1108,6 +1110,91 @@ static int spi_nor_parse_4bait(struct spi_nor *nor,
return ret;
 }
 
+#define PROFILE1_DWORD1_RDSR_ADDR_BYTES	BIT(29)
+#define PROFILE1_DWORD1_RDSR_DUMMY	BIT(28)
+#define PROFILE1_DWORD1_RD_FAST_CMD	GENMASK(15, 8)
+#define PROFILE1_DWORD4_DUMMY_200MHZ   GENMASK(11, 7)
+#define PROFILE1_DWORD5_DUMMY_166MHZ   GENMASK(31, 27)
+#define PROFILE1_DWORD5_DUMMY_133MHZ   GENMASK(21, 17)
+#define PROFILE1_DWORD5_DUMMY_100MHZ   GENMASK(11, 7)
+
+/**
+ * spi_nor_parse_profile1() - parse the xSPI Profile 1.0 table
+ * @nor:   pointer to a 'struct spi_nor'
+ * @profile1_header:   pointer to the 'struct sfdp_parameter_header' describing
+ * the Profile 1.0 Table length and version.
+ * @params:		pointer to the 'struct spi_nor_flash_parameter' to be filled.
+ *
+ * Return: 0 on success, -errno otherwise.
+ */
+static int spi_nor_parse_profile1(struct spi_nor *nor,
+				  const struct sfdp_parameter_header *profile1_header,
+ struct spi_nor_flash_parameter *params)
+{
+   u32 *dwords, addr;
+   size_t len;
+   int ret;
+   u8 dummy, 

[PATCH v15 10/15] mtd: spi-nor: core: enable octal DTR mode when possible

2020-10-01 Thread Pratyush Yadav
Allow flashes to specify a hook to enable octal DTR mode. Use this hook
whenever possible to get optimal transfer speeds.

Signed-off-by: Pratyush Yadav 
---
 drivers/mtd/spi-nor/core.c | 38 ++
 drivers/mtd/spi-nor/core.h |  2 ++
 2 files changed, 40 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index e91ddb409699..cf6ada7c8a7b 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -3068,6 +3068,38 @@ static int spi_nor_init_params(struct spi_nor *nor)
return 0;
 }
 
+/**
+ * spi_nor_octal_dtr_enable() - enable Octal DTR I/O if needed
+ * @nor:	pointer to a 'struct spi_nor'
+ * @enable:	whether to enable or disable Octal DTR
+ *
+ * Return: 0 on success, -errno otherwise.
+ */
+static int spi_nor_octal_dtr_enable(struct spi_nor *nor, bool enable)
+{
+   int ret;
+
+   if (!nor->params->octal_dtr_enable)
+   return 0;
+
+   if (!(nor->read_proto == SNOR_PROTO_8_8_8_DTR &&
+ nor->write_proto == SNOR_PROTO_8_8_8_DTR))
+   return 0;
+
+   if (!(nor->flags & SNOR_F_IO_MODE_EN_VOLATILE))
+   return 0;
+
+   ret = nor->params->octal_dtr_enable(nor, enable);
+   if (ret)
+   return ret;
+
+   if (enable)
+   nor->reg_proto = SNOR_PROTO_8_8_8_DTR;
+   else
+   nor->reg_proto = SNOR_PROTO_1_1_1;
+
+   return 0;
+}
+
 /**
  * spi_nor_quad_enable() - enable Quad I/O if needed.
  * @nor:pointer to a 'struct spi_nor'
@@ -3107,6 +3139,12 @@ static int spi_nor_init(struct spi_nor *nor)
 {
int err;
 
+   err = spi_nor_octal_dtr_enable(nor, true);
+   if (err) {
+   dev_dbg(nor->dev, "octal mode not supported\n");
+   return err;
+   }
+
err = spi_nor_quad_enable(nor);
if (err) {
dev_dbg(nor->dev, "quad mode not supported\n");
diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index eaece1123c0b..105a4ddeb309 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -204,6 +204,7 @@ struct spi_nor_locking_ops {
  *  higher index in the array, the higher priority.
  * @erase_map: the erase map parsed from the SFDP Sector Map Parameter
  *  Table.
+ * @octal_dtr_enable:  enables SPI NOR octal DTR mode.
  * @quad_enable:   enables SPI NOR quad mode.
  * @set_4byte_addr_mode: puts the SPI NOR in 4 byte addressing mode.
  * @convert_addr:  converts an absolute address into something the flash
@@ -227,6 +228,7 @@ struct spi_nor_flash_parameter {
 
struct spi_nor_erase_maperase_map;
 
+   int (*octal_dtr_enable)(struct spi_nor *nor, bool enable);
int (*quad_enable)(struct spi_nor *nor);
int (*set_4byte_addr_mode)(struct spi_nor *nor, bool enable);
u32 (*convert_addr)(struct spi_nor *nor, u32 addr);
-- 
2.28.0



[PATCH v15 09/15] mtd: spi-nor: Parse SFDP SCCR Map

2020-10-01 Thread Pratyush Yadav
From: Tudor Ambarus 

Parse just the 22nd dword and look for the 'DTR Octal Mode Enable
Volatile bit'.

SPI_NOR_IO_MODE_EN_VOLATILE should be set just for the flashes
that don't define the optional SFDP SCCR Map. For the others,
let the SFDP do its job and fill the SNOR_F_IO_MODE_EN_VOLATILE
flag. This way we avoid polluting the flash info flags when declaring
one.

Signed-off-by: Tudor Ambarus 
Signed-off-by: Pratyush Yadav 
---
 drivers/mtd/spi-nor/sfdp.c | 48 ++
 1 file changed, 48 insertions(+)

diff --git a/drivers/mtd/spi-nor/sfdp.c b/drivers/mtd/spi-nor/sfdp.c
index b2d097b44a55..3efcba5e629a 100644
--- a/drivers/mtd/spi-nor/sfdp.c
+++ b/drivers/mtd/spi-nor/sfdp.c
@@ -21,6 +21,10 @@
 #define SFDP_SECTOR_MAP_ID 0xff81  /* Sector Map Table */
 #define SFDP_4BAIT_ID  0xff84  /* 4-byte Address Instruction Table */
 #define SFDP_PROFILE1_ID   0xff05  /* xSPI Profile 1.0 table. */
+#define SFDP_SCCR_MAP_ID   0xff87  /*
+* Status, Control and Configuration
+* Register Map.
+*/
 
 #define SFDP_SIGNATURE 0x50444653U
 
@@ -1195,6 +1199,46 @@ static int spi_nor_parse_profile1(struct spi_nor *nor,
return ret;
 }
 
+#define SCCR_DWORD22_OCTAL_DTR_EN_VOLATILE BIT(31)
+
+/**
+ * spi_nor_parse_sccr() - Parse the Status, Control and Configuration Register
+ *			  Map.
+ * @nor:   pointer to a 'struct spi_nor'
+ * @sccr_header:   pointer to the 'struct sfdp_parameter_header' describing
+ * the SCCR Map table length and version.
+ * @params:		pointer to the 'struct spi_nor_flash_parameter' to be filled.
+ *
+ * Return: 0 on success, -errno otherwise.
+ */
+static int spi_nor_parse_sccr(struct spi_nor *nor,
+ const struct sfdp_parameter_header *sccr_header,
+ struct spi_nor_flash_parameter *params)
+{
+   u32 *dwords, addr;
+   size_t len;
+   int ret;
+
+   len = sccr_header->length * sizeof(*dwords);
+   dwords = kmalloc(len, GFP_KERNEL);
+   if (!dwords)
+   return -ENOMEM;
+
+   addr = SFDP_PARAM_HEADER_PTP(sccr_header);
+   ret = spi_nor_read_sfdp(nor, addr, len, dwords);
+   if (ret)
+   goto out;
+
+   le32_to_cpu_array(dwords, sccr_header->length);
+
+   if (FIELD_GET(SCCR_DWORD22_OCTAL_DTR_EN_VOLATILE, dwords[22]))
+   nor->flags |= SNOR_F_IO_MODE_EN_VOLATILE;
+
+out:
+   kfree(dwords);
+   return ret;
+}
+
 /**
  * spi_nor_parse_sfdp() - parse the Serial Flash Discoverable Parameters.
  * @nor:   pointer to a 'struct spi_nor'
@@ -1300,6 +1344,10 @@ int spi_nor_parse_sfdp(struct spi_nor *nor,
err = spi_nor_parse_profile1(nor, param_header, params);
break;
 
+   case SFDP_SCCR_MAP_ID:
+   err = spi_nor_parse_sccr(nor, param_header, params);
+   break;
+
default:
break;
}
-- 
2.28.0



[PATCH v15 14/15] mtd: spi-nor: spansion: add support for Cypress Semper flash

2020-10-01 Thread Pratyush Yadav
The Cypress Semper flash is an xSPI compliant octal DTR flash. Add
support for using it in octal DTR mode.

The flash by default boots in a hybrid sector mode. But the sector map
table on the part I had was programmed incorrectly and the SMPT values
on the flash don't match the public datasheet. Specifically, in some
places erase type 3 was used instead of 4. In addition, the region sizes
were incorrect in some places. So, for testing I set CFR3N[3] to enable
uniform sector sizes. Since the uniform sector mode bit is a
non-volatile bit, this series does not change it to avoid making any
permanent changes to the flash configuration. The correct data to
implement a fixup is not available right now and will be done in a
follow-up patch if needed.

Signed-off-by: Pratyush Yadav 
---
 drivers/mtd/spi-nor/spansion.c | 156 +
 1 file changed, 156 insertions(+)

diff --git a/drivers/mtd/spi-nor/spansion.c b/drivers/mtd/spi-nor/spansion.c
index 8429b4af999a..d146c30aab42 100644
--- a/drivers/mtd/spi-nor/spansion.c
+++ b/drivers/mtd/spi-nor/spansion.c
@@ -8,6 +8,157 @@
 
 #include "core.h"
 
+#define SPINOR_OP_RD_ANY_REG			0x65	/* Read any register */
+#define SPINOR_OP_WR_ANY_REG			0x71	/* Write any register */
+#define SPINOR_REG_CYPRESS_CFR2V		0x0083
+#define SPINOR_REG_CYPRESS_CFR2V_MEMLAT_11_24	0xb
+#define SPINOR_REG_CYPRESS_CFR3V		0x0084
+#define SPINOR_REG_CYPRESS_CFR3V_PGSZ		BIT(4) /* Page size. */
+#define SPINOR_REG_CYPRESS_CFR5V		0x0086
+#define SPINOR_REG_CYPRESS_CFR5V_OCT_DTR_EN	0x3
+#define SPINOR_REG_CYPRESS_CFR5V_OCT_DTR_DS	0
+#define SPINOR_OP_CYPRESS_RD_FAST		0xee
+
+/**
+ * spi_nor_cypress_octal_dtr_enable() - Enable octal DTR on Cypress flashes.
+ * @nor:   pointer to a 'struct spi_nor'
+ * @enable:  whether to enable or disable Octal DTR
+ *
+ * This also sets the memory access latency cycles to 24 to allow the flash to
+ * run at up to 200MHz.
+ *
+ * Return: 0 on success, -errno otherwise.
+ */
+static int spi_nor_cypress_octal_dtr_enable(struct spi_nor *nor, bool enable)
+{
+   struct spi_mem_op op;
+   u8 *buf = nor->bouncebuf;
+   int ret;
+
+   if (enable) {
+   /* Use 24 dummy cycles for memory array reads. */
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   return ret;
+
+   *buf = SPINOR_REG_CYPRESS_CFR2V_MEMLAT_11_24;
+   op = (struct spi_mem_op)
+   SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_WR_ANY_REG, 1),
+  SPI_MEM_OP_ADDR(3, SPINOR_REG_CYPRESS_CFR2V,
+  1),
+  SPI_MEM_OP_NO_DUMMY,
+  SPI_MEM_OP_DATA_OUT(1, buf, 1));
+
+		ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret)
+   return ret;
+
+   ret = spi_nor_wait_till_ready(nor);
+   if (ret)
+   return ret;
+
+   nor->read_dummy = 24;
+   }
+
+   /* Set/unset the octal and DTR enable bits. */
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   return ret;
+
+   if (enable)
+   *buf = SPINOR_REG_CYPRESS_CFR5V_OCT_DTR_EN;
+   else
+   *buf = SPINOR_REG_CYPRESS_CFR5V_OCT_DTR_DS;
+
+   op = (struct spi_mem_op)
+   SPI_MEM_OP(SPI_MEM_OP_CMD(SPINOR_OP_WR_ANY_REG, 1),
+  SPI_MEM_OP_ADDR(enable ? 3 : 4,
+  SPINOR_REG_CYPRESS_CFR5V,
+  1),
+  SPI_MEM_OP_NO_DUMMY,
+  SPI_MEM_OP_DATA_OUT(1, buf, 1));
+
+   if (!enable)
+		spi_nor_spimem_setup_op(nor, &op, SNOR_PROTO_8_8_8_DTR);
+
+	ret = spi_mem_exec_op(nor->spimem, &op);
+   if (ret)
+   return ret;
+
+   /* Give some time for the mode change to take place. */
+   usleep_range(1000, 1500);
+
+   return 0;
+}
+
+static void s28hs512t_default_init(struct spi_nor *nor)
+{
+   nor->params->octal_dtr_enable = spi_nor_cypress_octal_dtr_enable;
+}
+
+static void s28hs512t_post_sfdp_fixup(struct spi_nor *nor)
+{
+   /*
+* On older versions of the flash the xSPI Profile 1.0 table has the
+* 8D-8D-8D Fast Read opcode as 0x00. But it actually should be 0xEE.
+*/
+   if (nor->params->reads[SNOR_CMD_READ_8_8_8_DTR].opcode == 0)
+   nor->params->reads[SNOR_CMD_READ_8_8_8_DTR].opcode =
+   SPINOR_OP_CYPRESS_RD_FAST;
+
+   /* This flash is also missing the 4-byte Page Program opcode bit. */
+	spi_nor_set_pp_settings(&nor->params->page_programs[SNOR_CMD_PP],
+   SPINOR_OP_PP_4B, SNOR_PROTO_1_1_1);
+   /*
+

[PATCH v15 13/15] mtd: spi-nor: core: disable Octal DTR mode on suspend.

2020-10-01 Thread Pratyush Yadav
On resume, the init procedure will be run that will re-enable it.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index feb4310ff6dc..ff592468cc15 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -3217,6 +3217,20 @@ static void spi_nor_soft_reset(struct spi_nor *nor)
usleep_range(SPI_NOR_SRST_SLEEP_MIN, SPI_NOR_SRST_SLEEP_MAX);
 }
 
+/* mtd suspend handler */
+static int spi_nor_suspend(struct mtd_info *mtd)
+{
+   struct spi_nor *nor = mtd_to_spi_nor(mtd);
+   int ret;
+
+   /* Disable octal DTR mode if we enabled it. */
+   ret = spi_nor_octal_dtr_enable(nor, false);
+   if (ret)
+   dev_err(nor->dev, "suspend() failed\n");
+
+   return ret;
+}
+
 /* mtd resume handler */
 static void spi_nor_resume(struct mtd_info *mtd)
 {
@@ -3420,6 +3434,7 @@ int spi_nor_scan(struct spi_nor *nor, const char *name,
mtd->size = nor->params->size;
mtd->_erase = spi_nor_erase;
mtd->_read = spi_nor_read;
+   mtd->_suspend = spi_nor_suspend;
mtd->_resume = spi_nor_resume;
 
if (nor->params->locking_ops) {
-- 
2.28.0



[PATCH v15 03/15] mtd: spi-nor: add support for DTR protocol

2020-10-01 Thread Pratyush Yadav
Double Transfer Rate (DTR) is SPI protocol in which data is transferred
on each clock edge as opposed to on each clock cycle. Make
framework-level changes to allow supporting flashes in DTR mode.

Right now, mixed DTR modes are not supported. So, for example a mode
like 4S-4D-4D will not work. All phases need to be either DTR or STR.

The xSPI spec says that "The program commands provide SPI backward
compatible commands for programming data...". So 8D-8D-8D page program
opcodes are populated using 1S-1S-1S opcodes.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c  | 319 +++-
 drivers/mtd/spi-nor/core.h  |   7 +
 drivers/mtd/spi-nor/sfdp.c  |   9 +-
 include/linux/mtd/spi-nor.h |  51 --
 4 files changed, 290 insertions(+), 96 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 7a3bf460a2fa..779e64974fea 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -40,6 +40,78 @@
 
 #define SPI_NOR_MAX_ADDR_WIDTH 4
 
+/**
+ * spi_nor_get_cmd_ext() - Get the command opcode extension based on the
+ *			   extension type.
+ * @nor:	pointer to a 'struct spi_nor'
+ * @op:		pointer to the 'struct spi_mem_op' whose properties
+ *		need to be initialized.
+ *
+ * Right now, only "repeat" and "invert" are supported.
+ *
+ * Return: The opcode extension.
+ */
+static u8 spi_nor_get_cmd_ext(const struct spi_nor *nor,
+ const struct spi_mem_op *op)
+{
+   switch (nor->cmd_ext_type) {
+   case SPI_NOR_EXT_INVERT:
+   return ~op->cmd.opcode;
+
+   case SPI_NOR_EXT_REPEAT:
+   return op->cmd.opcode;
+
+   default:
+   dev_err(nor->dev, "Unknown command extension type\n");
+   return 0;
+   }
+}
+
+/**
+ * spi_nor_spimem_setup_op() - Set up common properties of a spi-mem op.
+ * @nor:   pointer to a 'struct spi_nor'
+ * @op:		pointer to the 'struct spi_mem_op' whose properties
+ *		need to be initialized.
+ * @proto: the protocol from which the properties need to be set.
+ */
+void spi_nor_spimem_setup_op(const struct spi_nor *nor,
+struct spi_mem_op *op,
+const enum spi_nor_protocol proto)
+{
+   u8 ext;
+
+   op->cmd.buswidth = spi_nor_get_protocol_inst_nbits(proto);
+
+   if (op->addr.nbytes)
+   op->addr.buswidth = spi_nor_get_protocol_addr_nbits(proto);
+
+   if (op->dummy.nbytes)
+   op->dummy.buswidth = spi_nor_get_protocol_addr_nbits(proto);
+
+   if (op->data.nbytes)
+   op->data.buswidth = spi_nor_get_protocol_data_nbits(proto);
+
+   if (spi_nor_protocol_is_dtr(proto)) {
+   /*
+* SPIMEM supports mixed DTR modes, but right now we can only
+* have all phases either DTR or STR. IOW, SPIMEM can have
+* something like 4S-4D-4D, but SPI NOR can't. So, set all 4
+* phases to either DTR or STR.
+*/
+   op->cmd.dtr = true;
+   op->addr.dtr = true;
+   op->dummy.dtr = true;
+   op->data.dtr = true;
+
+   /* 2 bytes per clock cycle in DTR mode. */
+   op->dummy.nbytes *= 2;
+
+   ext = spi_nor_get_cmd_ext(nor, op);
+   op->cmd.opcode = (op->cmd.opcode << 8) | ext;
+   op->cmd.nbytes = 2;
+   }
+}
+
 /**
  * spi_nor_spimem_bounce() - check if a bounce buffer is needed for the data
  *   transfer
@@ -85,17 +157,26 @@ static int spi_nor_spimem_exec_op(struct spi_nor *nor, struct spi_mem_op *op)
 static int spi_nor_controller_ops_read_reg(struct spi_nor *nor, u8 opcode,
   u8 *buf, size_t len)
 {
+   if (spi_nor_protocol_is_dtr(nor->reg_proto))
+   return -EOPNOTSUPP;
+
return nor->controller_ops->read_reg(nor, opcode, buf, len);
 }
 
 static int spi_nor_controller_ops_write_reg(struct spi_nor *nor, u8 opcode,
const u8 *buf, size_t len)
 {
+   if (spi_nor_protocol_is_dtr(nor->reg_proto))
+   return -EOPNOTSUPP;
+
return nor->controller_ops->write_reg(nor, opcode, buf, len);
 }
 
 static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
 {
+   if (spi_nor_protocol_is_dtr(nor->write_proto))
+   return -EOPNOTSUPP;
+
return nor->controller_ops->erase(nor, offs);
 }
 
@@ -113,22 +194,20 @@ static ssize_t spi_nor_spimem_read_data(struct spi_nor *nor, loff_t from,
size_t len, u8 *buf)
 {
struct spi_mem_op op =
-   SPI_MEM_OP(SPI_MEM_OP_CMD(nor->read_opcode, 1),
-  

[PATCH v15 00/15] mtd: spi-nor: add xSPI Octal DTR support

2020-10-01 Thread Pratyush Yadav
Hi,

This series adds support for Octal DTR flashes in the SPI NOR framework,
and then adds hooks for the Cypress Semper and Micron Xcella flashes to
allow running them in Octal DTR mode. This series assumes that the flash
is handed to the kernel in Legacy SPI mode.

Tested on Micron MT35X and S28HS flashes for Octal DTR. Tested on Micron
MT25Q, and Cypress S25FL for regressions. All flashes were tested by
running a read/write stress test on top of UBIFS. On the Cypress S28HS
flash 1-bit ECC had to be used to allow UBIFS to work since partial page
writes don't work with 2-bit ECC.

Changes in v15:
- Give precedence to addr_width found via SFDP over forcing it to 4 for
  8D. The standard knows better.

- Sleep for a range of 1000 to 1500 us instead of 400 to 600. The 400 to
  600 range was too close to the actual time it took to change to 8D
  mode (discovered by polling SR right after and observing that it
  froze the controller sometimes). Bump it to 1000 - 1500 to be safe.

- Do not initialize dummy to 0 in Profile 1.0 parsing.

- Drop the variable io_mode_en_volatile in spi_nor_parse_sccr(). Use the
  FIELD_GET expression in the if statement directly.

- Drop the debug message when setting dummy cycle configuration failed
  for S28HS flash.

- Move the patches that introduce SNOR_F_IO_MODE_EN_VOLATILE to before
  the one that introduces spi_nor_octal_dtr_enable(). This way we can
  reject flashes with non-volatile configuration from day 0.

Changes in v14:
- Rename spi_nor_{read,write}_reg() to
  spi_nor_controller_ops_{read,write}_reg().

- When spi_nor_spimem_setup_op() will be called after a spi_mem_op is
  initialized, set buswidth of all phases to 0. This will make the
  reader question where the buswidth is actually set and avoid any
  confusions.

- Only use address and dummy bytes from Profile 1.0 table for 8D-8D-8D
  Read SR/FSR instead of all DTR ones.

- Do not make spi_nor_default_setup_op() public. It is not used anymore
  in latest iterations so this is not needed.

- Only enable Octal DTR mode when the configuration to enable it is
  volatile.

- Do not prevent modes other than 8D-8D-8D from enabling 4-byte
  addressing. All other modes don't automatically imply 4-byte
  addressing.

- Add some blank lines in spi_nor_soft_reset().

- Drop the local variable 'dev' in spi_nor_suspend().

- Do not force 4-byte addressing on all DTR modes. Only force it for
  Octal DTR because only in that case 3-byte addresses are not allowed.

- Drop variable addr_width from spi_nor_micron_octal_dtr_enable() and
  spi_nor_cypress_octal_dtr_enable().

- Remove print from spi_nor_micron_octal_dtr_enable() and
  spi_nor_cypress_octal_dtr_enable() when enabling/disabling Octal DTR
  fails.

- Wait some time after enabling Octal DTR mode in
  spi_nor_micron_octal_dtr_enable() and
  spi_nor_cypress_octal_dtr_enable(). This makes sure the mode is
  enabled and the flash is usable.

- Fix alignment of .fixups in micron_parts and spansion_parts.

- s/BFPT_DWORD16_SOFT_RST/BFPT_DWORD16_SWRST_EN_RST

- Fix copy-paste leftover in spi_nor_parse_profile1() documentation.

- Do not assume a default dummy cycle value if none is found in Profile
  1.0 parsing. Leave it to 0 and let flashes fix it up via fixup hooks.

- Parse the SCCR table to discover if the Octal DTR enable bit is
  volatile or not.

- Drop variable addr_width in s28hs512t_post_bfpt_fixup() and move op
  initialization to the declaration to avoid a cast and use
  nor->bouncebuf directly instead of declaring a local variable.

Changes in v13:
- Do not use multiple assignments in spi_nor_spimem_setup_op().

- Use EOPNOTSUPP instead of ENOTSUPP.

- Fix unbalanced braces around else statements.

- Fix whitespace alignment for spi_nor_set_read_settings() prototype.

Changes in v12:
- Rebase on latest master.

- Set dummy and data nbytes to 1 in spi_nor_spimem_check_readop() before
  calling spi_nor_spimem_check_op() to make sure the buswidth for them
  is properly set up. Similarly, set data nbytes to 1 for
  spi_nor_spimem_check_pp(). This makes sure we don't send the wrong
  dummy and data buswidth to the controller's supports_op().

- Enable DQS for Micron MT35XU512ABA. No reason not to.

- Update doc comment for spi_nor_parse_profile1() and
  spi_nor_cypress_octal_dtr_enable() to add missing fields.

Changes in v11:
- Add helpers spi_nor_{read,write}_reg() to make it easier to reject DTR
  ops for them.

- Add helper for spi_nor_controller_ops_erase() to make it easier to
  reject DTR ops.

- s/spi-mem/SPIMEM/ and s/spi-nor/SPI NOR/

- Avoid enabling 4-byte addressing mode for all DTR ops instead of just
  Octal DTR ops. This is based on the assumption that DTR ops can only
  use 4-byte addressing.

- Use round_up() instead of defining ROUND_UP_TO().

- Rename 'table' to 'dwords' in xSPI Profile 1.0 parsing.

- Re-order Profile 1.0 related defines by DWORD order.

- Move rdsr parameter parsing to where opcode is parsed because it is
  from the same DWORD.


[PATCH v15 06/15] mtd: spi-nor: core: use dummy cycle and address width info from SFDP

2020-10-01 Thread Pratyush Yadav
The xSPI Profile 1.0 table specifies how many dummy cycles and address
bytes are needed for the Read Status Register command in octal DTR mode.
Use that information to send the correct Read SR command.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index ad280874a2e8..b5bb4d6cffc1 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -385,6 +385,11 @@ static int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
   SPI_MEM_OP_NO_DUMMY,
   SPI_MEM_OP_DATA_IN(1, sr, 0));
 
+   if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR) {
+   op.addr.nbytes = nor->params->rdsr_addr_nbytes;
+   op.dummy.nbytes = nor->params->rdsr_dummy;
+   }
+
	spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);

	ret = spi_mem_exec_op(nor->spimem, &op);
@@ -418,6 +423,11 @@ static int spi_nor_read_fsr(struct spi_nor *nor, u8 *fsr)
   SPI_MEM_OP_NO_DUMMY,
   SPI_MEM_OP_DATA_IN(1, fsr, 0));
 
+   if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR) {
+   op.addr.nbytes = nor->params->rdsr_addr_nbytes;
+   op.dummy.nbytes = nor->params->rdsr_dummy;
+   }
+
	spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);

	ret = spi_mem_exec_op(nor->spimem, &op);
-- 
2.28.0



[PATCH v15 01/15] mtd: spi-nor: core: use EOPNOTSUPP instead of ENOTSUPP

2020-10-01 Thread Pratyush Yadav
ENOTSUPP is not a SUSV4 error code. Using EOPNOTSUPP is preferred
in its stead.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 0369d98b2d12..4d0f8d165544 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -2281,7 +2281,7 @@ static int spi_nor_hwcaps_pp2cmd(u32 hwcaps)
  *@nor:	pointer to a 'struct spi_nor'
  *@op:		pointer to op template to be checked
  *
- * Returns 0 if operation is supported, -ENOTSUPP otherwise.
+ * Returns 0 if operation is supported, -EOPNOTSUPP otherwise.
  */
 static int spi_nor_spimem_check_op(struct spi_nor *nor,
   struct spi_mem_op *op)
@@ -2295,12 +2295,12 @@ static int spi_nor_spimem_check_op(struct spi_nor *nor,
op->addr.nbytes = 4;
if (!spi_mem_supports_op(nor->spimem, op)) {
if (nor->mtd.size > SZ_16M)
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
/* If flash size <= 16MB, 3 address bytes are sufficient */
op->addr.nbytes = 3;
if (!spi_mem_supports_op(nor->spimem, op))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
}
 
return 0;
@@ -2312,7 +2312,7 @@ static int spi_nor_spimem_check_op(struct spi_nor *nor,
  *@nor: pointer to a 'struct spi_nor'
  *@read:	pointer to op template to be checked
  *
- * Returns 0 if operation is supported, -ENOTSUPP otherwise.
+ * Returns 0 if operation is supported, -EOPNOTSUPP otherwise.
  */
 static int spi_nor_spimem_check_readop(struct spi_nor *nor,
   const struct spi_nor_read_command *read)
@@ -2338,7 +2338,7 @@ static int spi_nor_spimem_check_readop(struct spi_nor 
*nor,
  *@nor: pointer to a 'struct spi_nor'
  *@pp:  pointer to op template to be checked
  *
- * Returns 0 if operation is supported, -ENOTSUPP otherwise.
+ * Returns 0 if operation is supported, -EOPNOTSUPP otherwise.
  */
 static int spi_nor_spimem_check_pp(struct spi_nor *nor,
   const struct spi_nor_pp_command *pp)
-- 
2.28.0



[PATCH v15 07/15] mtd: spi-nor: core: do 2 byte reads for SR and FSR in DTR mode

2020-10-01 Thread Pratyush Yadav
Some controllers, like the Cadence QSPI controller, have trouble reading
only 1 byte in DTR mode. So, do 2 byte reads for SR and FSR commands in
DTR mode, and then discard the second byte.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index b5bb4d6cffc1..b83bf5ed2b2d 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -370,7 +370,7 @@ int spi_nor_write_disable(struct spi_nor *nor)
  * spi_nor_read_sr() - Read the Status Register.
  * @nor:   pointer to 'struct spi_nor'.
  * @sr:pointer to a DMA-able buffer where the value of the
- *  Status Register will be written.
+ *  Status Register will be written. Should be at least 2 bytes.
  *
  * Return: 0 on success, -errno otherwise.
  */
@@ -388,6 +388,11 @@ static int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR) {
op.addr.nbytes = nor->params->rdsr_addr_nbytes;
op.dummy.nbytes = nor->params->rdsr_dummy;
+   /*
+* We don't want to read only one byte in DTR mode. So,
+* read 2 and then discard the second byte.
+*/
+   op.data.nbytes = 2;
}
 
spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
@@ -408,7 +413,8 @@ static int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
  * spi_nor_read_fsr() - Read the Flag Status Register.
  * @nor:   pointer to 'struct spi_nor'
  * @fsr:   pointer to a DMA-able buffer where the value of the
- *  Flag Status Register will be written.
+ *  Flag Status Register will be written. Should be at least 2
+ *  bytes.
  *
  * Return: 0 on success, -errno otherwise.
  */
@@ -426,6 +432,11 @@ static int spi_nor_read_fsr(struct spi_nor *nor, u8 *fsr)
if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR) {
op.addr.nbytes = nor->params->rdsr_addr_nbytes;
op.dummy.nbytes = nor->params->rdsr_dummy;
+   /*
+* We don't want to read only one byte in DTR mode. So,
+* read 2 and then discard the second byte.
+*/
+   op.data.nbytes = 2;
}
 
spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
-- 
2.28.0



[PATCH v15 04/15] mtd: spi-nor: sfdp: get command opcode extension type from BFPT

2020-10-01 Thread Pratyush Yadav
Some devices in DTR mode expect an extra command byte called the
extension. The extension can either be same as the opcode, bitwise
inverse of the opcode, or another additional byte forming a 16-bit
opcode. Get the extension type from the BFPT. For now, only flashes with
"repeat" and "inverse" extensions are supported.

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/sfdp.c | 18 ++
 drivers/mtd/spi-nor/sfdp.h |  6 ++
 2 files changed, 24 insertions(+)

diff --git a/drivers/mtd/spi-nor/sfdp.c b/drivers/mtd/spi-nor/sfdp.c
index 21fa9ab78eae..c77655968f80 100644
--- a/drivers/mtd/spi-nor/sfdp.c
+++ b/drivers/mtd/spi-nor/sfdp.c
@@ -606,6 +606,24 @@ static int spi_nor_parse_bfpt(struct spi_nor *nor,
if (bfpt_header->length == BFPT_DWORD_MAX_JESD216B)
return spi_nor_post_bfpt_fixups(nor, bfpt_header, &bfpt,
				params);
+   /* 8D-8D-8D command extension. */
+   switch (bfpt.dwords[BFPT_DWORD(18)] & BFPT_DWORD18_CMD_EXT_MASK) {
+   case BFPT_DWORD18_CMD_EXT_REP:
+   nor->cmd_ext_type = SPI_NOR_EXT_REPEAT;
+   break;
+
+   case BFPT_DWORD18_CMD_EXT_INV:
+   nor->cmd_ext_type = SPI_NOR_EXT_INVERT;
+   break;
+
+   case BFPT_DWORD18_CMD_EXT_RES:
+   dev_dbg(nor->dev, "Reserved command extension used\n");
+   break;
+
+   case BFPT_DWORD18_CMD_EXT_16B:
+   dev_dbg(nor->dev, "16-bit opcodes not supported\n");
+   return -EOPNOTSUPP;
+   }
 
return spi_nor_post_bfpt_fixups(nor, bfpt_header, &bfpt, params);
 }
diff --git a/drivers/mtd/spi-nor/sfdp.h b/drivers/mtd/spi-nor/sfdp.h
index 7f9846b3a1ad..6d7243067252 100644
--- a/drivers/mtd/spi-nor/sfdp.h
+++ b/drivers/mtd/spi-nor/sfdp.h
@@ -90,6 +90,12 @@ struct sfdp_bfpt {
 #define BFPT_DWORD15_QER_SR2_BIT1_NO_RD(0x4UL << 20)
 #define BFPT_DWORD15_QER_SR2_BIT1  (0x5UL << 20) /* Spansion */
 
+#define BFPT_DWORD18_CMD_EXT_MASK  GENMASK(30, 29)
+#define BFPT_DWORD18_CMD_EXT_REP   (0x0UL << 29) /* Repeat */
+#define BFPT_DWORD18_CMD_EXT_INV   (0x1UL << 29) /* Invert */
+#define BFPT_DWORD18_CMD_EXT_RES   (0x2UL << 29) /* Reserved */
+#define BFPT_DWORD18_CMD_EXT_16B   (0x3UL << 29) /* 16-bit opcode */
+
 struct sfdp_parameter_header {
u8  id_lsb;
u8  minor;
-- 
2.28.0



[PATCH v15 02/15] mtd: spi-nor: add spi_nor_controller_ops_{read_reg,write_reg,erase}()

2020-10-01 Thread Pratyush Yadav
They are thin wrappers around
nor->controller_ops->{read_reg,write_reg,erase}(). In a future commit,
DTR support will be added. These ops cannot be supported by the
controller_ops hooks, and these helpers will make it easier to reject
those calls.
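
As a hedged sketch (an assumption about the follow-up DTR patch, not
part of this one), such a wrapper could later reject those calls along
these lines:

	static int spi_nor_controller_ops_read_reg(struct spi_nor *nor, u8 opcode,
						   u8 *buf, size_t len)
	{
		/* Assumed future check: controller_ops hooks can't do 8D-8D-8D. */
		if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR)
			return -EOPNOTSUPP;

		return nor->controller_ops->read_reg(nor, opcode, buf, len);
	}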

Signed-off-by: Pratyush Yadav 
Reviewed-by: Tudor Ambarus 
---
 drivers/mtd/spi-nor/core.c | 87 +++---
 1 file changed, 53 insertions(+), 34 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 4d0f8d165544..7a3bf460a2fa 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -82,6 +82,23 @@ static int spi_nor_spimem_exec_op(struct spi_nor *nor, struct spi_mem_op *op)
return spi_mem_exec_op(nor->spimem, op);
 }
 
+static int spi_nor_controller_ops_read_reg(struct spi_nor *nor, u8 opcode,
+  u8 *buf, size_t len)
+{
+   return nor->controller_ops->read_reg(nor, opcode, buf, len);
+}
+
+static int spi_nor_controller_ops_write_reg(struct spi_nor *nor, u8 opcode,
+   const u8 *buf, size_t len)
+{
+   return nor->controller_ops->write_reg(nor, opcode, buf, len);
+}
+
+static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
+{
+   return nor->controller_ops->erase(nor, offs);
+}
+
 /**
  * spi_nor_spimem_read_data() - read data from flash's memory region via
  *  spi-mem
@@ -229,8 +246,8 @@ int spi_nor_write_enable(struct spi_nor *nor)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->write_reg(nor, SPINOR_OP_WREN,
-NULL, 0);
+   ret = spi_nor_controller_ops_write_reg(nor, SPINOR_OP_WREN,
+  NULL, 0);
}
 
if (ret)
@@ -258,8 +275,8 @@ int spi_nor_write_disable(struct spi_nor *nor)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->write_reg(nor, SPINOR_OP_WRDI,
-NULL, 0);
+   ret = spi_nor_controller_ops_write_reg(nor, SPINOR_OP_WRDI,
+  NULL, 0);
}
 
if (ret)
@@ -289,8 +306,8 @@ static int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->read_reg(nor, SPINOR_OP_RDSR,
-   sr, 1);
+   ret = spi_nor_controller_ops_read_reg(nor, SPINOR_OP_RDSR, sr,
+ 1);
}
 
if (ret)
@@ -320,8 +337,8 @@ static int spi_nor_read_fsr(struct spi_nor *nor, u8 *fsr)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->read_reg(nor, SPINOR_OP_RDFSR,
-   fsr, 1);
+   ret = spi_nor_controller_ops_read_reg(nor, SPINOR_OP_RDFSR, fsr,
+ 1);
}
 
if (ret)
@@ -352,7 +369,8 @@ static int spi_nor_read_cr(struct spi_nor *nor, u8 *cr)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->read_reg(nor, SPINOR_OP_RDCR, cr, 1);
+   ret = spi_nor_controller_ops_read_reg(nor, SPINOR_OP_RDCR, cr,
+ 1);
}
 
if (ret)
@@ -385,10 +403,10 @@ int spi_nor_set_4byte_addr_mode(struct spi_nor *nor, bool enable)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->write_reg(nor,
-enable ? SPINOR_OP_EN4B :
- SPINOR_OP_EX4B,
-NULL, 0);
+   ret = spi_nor_controller_ops_write_reg(nor,
+  enable ? SPINOR_OP_EN4B :
+   SPINOR_OP_EX4B,
+  NULL, 0);
}
 
if (ret)
@@ -421,8 +439,8 @@ static int spansion_set_4byte_addr_mode(struct spi_nor *nor, bool enable)
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = nor->controller_ops->write_reg(nor, SPINOR_OP_BRWR,
-nor->bouncebuf, 1);
+   ret = spi_nor_controller_ops_write_reg(nor, SPINOR_OP_BRWR,
+  nor->bouncebuf, 1);
}
 
if (ret)
@@ -453,8 +471,8 @@ int spi_nor_write_ear(struct spi_nor *nor, u8 ear)
 
ret = 

[PATCH v3] arm64/mm: add fallback option to allocate virtually contiguous memory

2020-10-01 Thread Sudarshan Rajagopalan
When section mappings are enabled, we allocate vmemmap pages from physically
contiguous memory of size PMD_SIZE using vmemmap_alloc_block_buf(). Section
mappings are good to reduce TLB pressure. But when the system is highly
fragmented and memory blocks are being hot-added at runtime, it's possible
that such physically contiguous memory allocations can fail. Rather than
failing the memory hot-add procedure, add a fallback option to allocate
vmemmap pages from discontiguous pages using vmemmap_populate_basepages().

Signed-off-by: Sudarshan Rajagopalan 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Anshuman Khandual 
Cc: Mark Rutland 
Cc: Logan Gunthorpe 
Cc: David Hildenbrand 
Cc: Andrew Morton 
Cc: Steven Price 
---
 arch/arm64/mm/mmu.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62f..11f8639 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1121,8 +1121,15 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
void *p = NULL;
 
p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
-   if (!p)
-   return -ENOMEM;
+   if (!p) {
+   /*
+* fallback allocating with virtually
+* contiguous memory for this section
+*/
+   if (vmemmap_populate_basepages(addr, next, node, NULL))
+   return -ENOMEM;
+   continue;
+   }
 
pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
} else
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [RFC PATCH next-20200930] treewide: Convert macro and uses of __section(foo) to __section("foo")

2020-10-01 Thread Joe Perches
On Thu, 2020-10-01 at 14:39 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Oct 01, 2020 at 12:15:39PM +0200, Miguel Ojeda wrote:
> > > So it looks like the best option is to exclude these
> > > 2 files from conversion.
> > 
> > Agreed. Nevertheless, is there any reason arch/powerpc/* should not be
> > compiling cleanly with compiler.h? (CC'ing the rest of the PowerPC
> > reviewers and ML).
> 
> You need to #include compiler_types.h to get this #define?

Actually no, you need to add

#include <linux/compiler_types.h>

to both files and then it builds properly.

Ideally though nothing should include this file directly.

> (The twice-defined thing is a warning, not an error.  It should be fixed
> of course, but it is less important; although it may be pointing to a
> deeper problem.)
> 
> 
> Segher



[PATCH v3] arm64/mm: add fallback option to allocate virtually contiguous memory

2020-10-01 Thread Sudarshan Rajagopalan
V1: The initial patch aborted at the first instance of PMD_SIZE allocation
failure, unmapped all previously mapped sections using vmemmap_free, and
mapped the entire request with vmemmap_populate_basepages to allocate
virtually contiguous memory.
https://lkml.org/lkml/2020/9/10/66

V2: Allocates virtually contiguous memory only for sections that failed
PMD_SIZE allocation, and continues to allocate physically contiguous
memory for other sections.
https://lkml.org/lkml/2020/9/30/1489

V3: Addresses Anshuman's comment to allow fallback to altmap base pages
as well if and when required.

Sudarshan Rajagopalan (1):
  arm64/mm: add fallback option to allocate virtually contiguous memory

 arch/arm64/mm/mmu.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



WARNING in handle_exception_nmi

2020-10-01 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:    fb0155a0 Merge tag 'nfs-for-5.9-3' of git://git.linux-nfs...
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11a7329d90
kernel config:  https://syzkaller.appspot.com/x/.config?x=adebb40048274f92
dashboard link: https://syzkaller.appspot.com/bug?extid=4e78ae6b12b00b9d1042
compiler:   clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=173937ad90
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1041373d90

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+4e78ae6b12b00b9d1...@syzkaller.appspotmail.com

L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and 
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for 
details.
[ cut here ]
WARNING: CPU: 1 PID: 6854 at arch/x86/kvm/vmx/vmx.c:4809 
handle_exception_nmi+0x1051/0x12a0 arch/x86/kvm/vmx/vmx.c:4809
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 6854 Comm: syz-executor665 Not tainted 5.9.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1d6/0x29e lib/dump_stack.c:118
 panic+0x2c0/0x800 kernel/panic.c:231
 __warn+0x227/0x250 kernel/panic.c:600
 report_bug+0x1b1/0x2e0 lib/bug.c:198
 handle_bug+0x42/0x80 arch/x86/kernel/traps.c:234
 exc_invalid_op+0x16/0x40 arch/x86/kernel/traps.c:254
 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
RIP: 0010:handle_exception_nmi+0x1051/0x12a0 arch/x86/kvm/vmx/vmx.c:4809
Code: fd 98 00 e9 17 f1 ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c da f0 ff ff 
48 89 df e8 a9 fd 98 00 e9 cd f0 ff ff e8 1f 19 59 00 <0f> 0b e9 e0 f6 ff ff 89 
d1 80 e1 07 80 c1 03 38 c1 0f 8c f4 f1 ff
RSP: 0018:c9e979b0 EFLAGS: 00010293
RAX: 811be461 RBX: fff8 RCX: 888091f42200
RDX:  RSI:  RDI: 0001
RBP:  R08: 811bdb3a R09: ed1014faf071
R10: ed1014faf071 R11:  R12: 8880a7d78380
R13: 111014faf026 R14: 8880a7d78040 R15: 0002
 vcpu_enter_guest+0x6725/0x8a50 arch/x86/kvm/x86.c:8655
 vcpu_run+0x332/0xc00 arch/x86/kvm/x86.c:8720
 kvm_arch_vcpu_ioctl_run+0x451/0x8f0 arch/x86/kvm/x86.c:8937
 kvm_vcpu_ioctl+0x64f/0xa50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3230
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x443bb9
Code: e8 dc a3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 
db 00 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7fff4f9aff08 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX:  RCX: 00443bb9
RDX:  RSI: ae80 RDI: 0005
RBP: 006ce018 R08:  R09: 004002c8
R10: 0012 R11: 0246 R12: 00404120
R13: 004041b0 R14:  R15: 
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


Re: [PATCH 3/4] mmap locking API: Don't check locking if the mm isn't live yet

2020-10-01 Thread Jann Horn
On Thu, Oct 1, 2020 at 9:15 PM Jason Gunthorpe  wrote:
> On Thu, Oct 01, 2020 at 01:51:33AM +0200, Jann Horn wrote:
> > On Thu, Oct 1, 2020 at 1:26 AM Jason Gunthorpe  wrote:
> > > On Wed, Sep 30, 2020 at 10:14:57PM +0200, Jann Horn wrote:
> > > > On Wed, Sep 30, 2020 at 2:50 PM Jann Horn  wrote:
> > > > > On Wed, Sep 30, 2020 at 2:30 PM Jason Gunthorpe  wrote:
> > > > > > On Tue, Sep 29, 2020 at 06:20:00PM -0700, Jann Horn wrote:
> > > > > > > In preparation for adding a mmap_assert_locked() check in
> > > > > > > __get_user_pages(), teach the mmap_assert_*locked() helpers that 
> > > > > > > it's fine
> > > > > > > to operate on an mm without locking in the middle of execve() as 
> > > > > > > long as
> > > > > > > it hasn't been installed on a process yet.
> > > > > >
> > > > > > I'm happy to see lockdep being added here, but can you elaborate on
> > > > > > why add this mmap_locked_required instead of obtaining the lock in 
> > > > > > the
> > > > > > execv path?
> > > > >
> > > > > My thinking was: At that point, we're logically still in the
> > > > > single-owner initialization phase of the mm_struct. Almost any object
> > > > > has initialization and teardown steps that occur in a context where
> > > > > the object only has a single owner, and therefore no locking is
> > > > > required. It seems to me that adding locking in places like
> > > > > get_arg_page() would be confusing because it would suggest the
> > > > > existence of concurrency where there is no actual concurrency, and it
> > > > > might be annoying in terms of lockdep if someone tries to use
> > > > > something like get_arg_page() while holding the mmap_sem of the
> > > > > calling process. It would also mean that we'd be doing extra locking
> > > > > in normal kernel builds that isn't actually logically required.
> > > > >
> > > > > Hmm, on the other hand, dup_mmap() already locks the child mm (with
> > > > > mmap_write_lock_nested()), so I guess it wouldn't be too bad to also
> > > > > do it in get_arg_page() and tomoyo_dump_page(), with comments that
> > > > > note that we're doing this for lockdep consistency... I guess I can go
> > > > > change this in v2.
> > > >
> > > > Actually, I'm taking that back. There's an extra problem:
> > > > get_arg_page() accesses bprm->vma, which is set all the way back in
> > > > __bprm_mm_init(). We really shouldn't be pretending that we're
> > > > properly taking the mmap_sem when actually, we keep reusing a
> > > > vm_area_struct pointer.
> > >
> > > Any chance the mmap lock can just be held from mm_struct allocation
> > > till exec inserts it into the process?
> >
> > Hm... it should work if we define a lockdep subclass for this so that
> > lockdep is happy when we call get_user() on the old mm_struct while
> > holding that mmap lock.
>
> A subclass isn't right, it has to be a _nested annotation.
>
> nested locking is a pretty good reason to not be able to do this, this
> is something lockdep does struggle to model.

Did I get the terminology wrong? I thought they were the same. The
down_*_nested() APIs take an argument "subclass", with the default
subclass for the functions without "_nested" being 0.
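
(For reference, a minimal sketch of the annotation under discussion,
assuming the child-mm case from dup_mmap():

	/* Subclass 1 tells lockdep this lock nests under another mmap lock. */
	mmap_write_lock_nested(child_mm, SINGLE_DEPTH_NESTING);

where SINGLE_DEPTH_NESTING is the stock subclass value 1.)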

Anyway, I wrote a patch for this yesterday, I'll send it out later
today after testing that it still boots without lockdep warnings. Then
you can decide whether you prefer it to the current patch.


Re: [PATCH v3 04/18] dmaengine: idxd: add interrupt handle request support

2020-10-01 Thread Dave Jiang

On 9/30/2020 11:36 AM, Thomas Gleixner wrote:

On Tue, Sep 15 2020 at 16:28, Dave Jiang wrote:
  
+#define INT_HANDLE_IMS_TABLE	0x1

+int idxd_device_request_int_handle(struct idxd_device *idxd, int idx,
+  int *handle, enum idxd_interrupt_type irq_type)


New lines exist for a reason and this glued together define and function
definition is unreadable garbage.

Also is that magic bit a software flag or defined by hardware? If the
latter then you want to move it to the other hardware defines.


Will move this to hardware register header.



Thanks,

 tglx
  



Kernel 5.9-rc regression.

2020-10-01 Thread Zhou Yanjie

Hi Thomas and list,

There is a strange phenomenon in kernel 5.9-rc: when using kernel 5.9-rc
with Debian 10 and running htop, the memory footprint is displayed as
3.99T. When the actual memory footprint increases, the displayed value
decreases to 3.98T, 3.97T, etc. This has been confirmed on the X1000,
X1830, and JZ4780 (with SMP disabled); the phenomenon does not seem to
affect SMP processors. When the JZ4780 has SMP turned on, the memory
footprint is displayed normally.



The following is the relevant log:

  CPU[* 0.7%]   Tasks: 18, 4 thr; 1 running
  Mem[###*3.99T/120M]   Load average: 0.02 0.02 0.00
  Swp[   0K/768M]   Uptime: 00:02:07

  PID USER  PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+ Command
 1135 cu-neo 20   0  8872  3152  2724 R  1.3  2.6  0:00.06 htop
    1 root   20   0 16268  7940  6588 S  0.0  6.5  0:02.06 /sbin/init
  512 root   20   0 23932  6188  5408 S  0.0  5.0  0:00.30 
/lib/systemd/syst
  746 root   20   0 17124  3492  2956 S  0.0  2.8  0:00.17 
/lib/systemd/syst
  770 systemd-t  20   0 23524  5700  5040 S  0.0  4.6  0:00.02 
/lib/systemd/syst
  756 systemd-t  20   0 23524  5700  5040 S  0.0  4.6  0:00.22 
/lib/systemd/syst
  772 root   20   0  8832  2436  2252 S  0.0  2.0  0:00.01 
/usr/sbin/cron -f
  773 root   20   0 14224  5812  5152 S  0.0  4.7  0:00.14 
/lib/systemd/syst
  774 messagebu  20   0  7392  4056  3628 S  0.0  3.3  0:00.20 
/usr/bin/dbus-dae
  775 root   20   0 12256  5096  4644 S  0.0  4.2  0:00.06 
/sbin/wpa_supplic
  793 root   20   0 25152  3844  3112 S  0.0  3.1  0:00.01 
/usr/sbin/rsyslog
  794 root   20   0 25152  3844  3112 S  0.0  3.1  0:00.00 
/usr/sbin/rsyslog
  795 root   20   0 25152  3844  3112 S  0.0  3.1  0:00.02 
/usr/sbin/rsyslog
  776 root   20   0 25152  3844  3112 S  0.0  3.1  0:00.06 
/usr/sbin/rsyslog
  821 root   20   0  8576  5492  4632 S  0.0  4.5  0:00.03 
/sbin/dhclient -4
  868 root   20   0  4524  1856  1748 S  0.0  1.5  0:00.01 
/sbin/agetty -o -
  871 root   20   0 11524  3820  3348 S  0.0  3.1  0:00.17 
/bin/login -p --
F1Help  F2Setup F3SearchF4FilterF5Tree  F6SortByF7Nice -F8Nice +F9Kill  
F10Quit




Re: linux-next: Tree for Oct 1 (drivers/mfd/simple-mfd-i2c.o)

2020-10-01 Thread Randy Dunlap
On 10/1/20 4:39 AM, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20200930:
> 

on x86_64:

ld: drivers/mfd/simple-mfd-i2c.o: in function `simple_mfd_i2c_probe':
simple-mfd-i2c.c:(.text+0x48): undefined reference to `__devm_regmap_init_i2c'
ld: drivers/mfd/simple-mfd-i2c.o: in function `simple_mfd_i2c_driver_init':
simple-mfd-i2c.c:(.init.text+0x14): undefined reference to `i2c_register_driver'
ld: drivers/mfd/simple-mfd-i2c.o: in function `simple_mfd_i2c_driver_exit':
simple-mfd-i2c.c:(.exit.text+0xd): undefined reference to `i2c_del_driver'


CONFIG_I2C=m
CONFIG_MFD_SIMPLE_MFD_I2C=y
CONFIG_MFD_SL28CPLD=y


Is linux-next missing the source code for CONFIG_MFD_SL28CPLD?


The build error is caused by:

config MFD_SL28CPLD
tristate "Kontron sl28cpld Board Management Controller"
select MFD_SIMPLE_MFD_I2C

that "select" when "depends on I2C" is absent/missing.


-- 
~Randy
Reported-by: Randy Dunlap 


Re: [PATCH v3 devicetree 0/2] Add Seville Ethernet switch to T1040RDB

2020-10-01 Thread David Miller
From: Vladimir Oltean 
Date: Thu,  1 Oct 2020 16:20:11 +0300

> Seville is a DSA switch that is embedded inside the T1040 SoC, and
> supported by the mscc_seville DSA driver inside drivers/net/dsa/ocelot.
> 
> This series adds this switch to the SoC's dtsi files and to the T1040RDB
> board file.

I am assuming the devicetree folks will pick this series up.

Thanks.



Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible

2020-10-01 Thread Uladzislau Rezki
On Thu, Oct 01, 2020 at 11:02:20AM +0200, Michal Hocko wrote:
> On Wed 30-09-20 16:21:54, Paul E. McKenney wrote:
> > On Wed, Sep 30, 2020 at 10:41:39AM +0200, Michal Hocko wrote:
> > > On Tue 29-09-20 18:53:27, Paul E. McKenney wrote:
> [...]
> > > > No argument on it being confusing, and I hope that the added header
> > > > comment helps.  But specifically, can_sleep==true is a promise by the
> > > > caller to be schedulable and not to be holding any lock/mutex/whatever
> > > > that might possibly be acquired by the memory allocator or by anything
> > > > else that the memory allocator might invoke, to your point, including
> > > > for but one example the reclaim logic.
> > > > 
> > > > The only way that can_sleep==true is if this function was invoked due
> > > > to a call to single-argument kvfree_rcu(), which must be schedulable
> > > > because its fallback is to invoke synchronize_rcu().
> > > 
> > > OK. I have to say that it is still not clear to me whether this call
> > > path can be called from the memory reclaim context. If yes then you need
> > > __GFP_NOMEMALLOC as well.
> > 
> > Right now the restriction is that single-argument (AKA can_sleep==true)
> > kvfree_rcu() cannot be invoked from memory reclaim context.
> > 
> > But would adding __GFP_NOMEMALLOC to the can_sleep==true GFP_ flags
> > allow us to remove this restriction?  If so, I will queue a separate
> > patch making this change.  The improved ease of use would be well
> > worth it, if I understand correctly (ha!!!).
> 
> It would be quite daring to claim it will be ok but it will certainly be
> less problematic. Adding the flag will not hurt in any case. As this is
> a shared called that might be called from many contexts I think it will
> be safer to have it there. The justification is that it will prevent
> consumption of memory reserves from MEMALLOC contexts.
> 
> > 
> > > [...]
> > > 
> > > > > What is the point of calling kmalloc  for a PAGE_SIZE object? Wouldn't
> > > > > using the page allocator directly be better?
> > > > 
> > > > Well, you guys gave me considerable heat about abusing internal 
> > > > allocator
> > > > interfaces, and kmalloc() and kfree() seem to be about as non-internal
> > > > as you can get and still be invoking the allocator.  ;-)
> > > 
> > > alloc_pages resp. __get_free_pages is a normal page allocator interface
> > > to use for page size granular allocations. kmalloc is for more fine
> > > grained allocations.
> > 
> > OK, in the short term, both work, but I have queued a separate patch
> > making this change and recording the tradeoffs.  This is not yet a
> > promise to push this patch, but it is a promise not to lose this part
> > of the picture.  Please see below.
> 
> It doesn't matter all that much. Both allocators will work. It is just a
> matter of using optimal tool for the specific purose.
> 
> > You mentioned alloc_pages().  I reverted to __get_free_pages(), but
> > alloc_pages() of course looks nicer.  What are the tradeoffs between
> > __get_free_pages() and alloc_pages()?
> 
> alloc_pages will return struct page but you need a kernel pointer. That
> is what __get_free_pages will give you (or you can call page_address
> directly).
> 
> > Thanx, Paul
> > 
> > 
> > 
> > commit 490b638d7c241ac06cee168ccf8688bb8b872478
> > Author: Paul E. McKenney 
> > Date:   Wed Sep 30 16:16:39 2020 -0700
> > 
> > kvfree_rcu(): Switch from kmalloc/kfree to __get_free_page/free_page.
> > 
> > The advantages of using kmalloc() and kfree() are a possible small 
> > speedup
> > on CONFIG_SLAB=y systems, avoiding the allocation-side cast, and use of
> > more-familiar API members.  The advantages of using __get_free_page()
> > and free_page() are a possible reduction in fragmentation and direct
> > access to the buddy allocator.
> > 
> > To help settle the question as to which to use, this commit switches
> > from kmalloc() and kfree() to __get_free_page() and free_page().
> > 
> > Suggested-by: Michal Hocko 
> > Suggested-by: "Uladzislau Rezki (Sony)" 
> > Signed-off-by: Paul E. McKenney 
> 
> Yes, looks good to me. I am not entirely sure about the fragmentation
> argument. It really depends on the SL.B allocator internals. The same
> applies for the potential speed up. I would be even surprised if the
> SLAB was faster in average considering it has to use the page allocator
> as well. So to me the primary motivation would be "use the right tool
> for the purpose".
> 
As for the concern I raised about fragmentation, I was mostly thinking
about the fact that SLABs were not designed to do efficient allocations
for sizes that are >= PAGE_SIZE. But that depends on the three different
implementations; actually, it is also a good argument to switch to the
page allocator, i.e. to get rid of such a dependency.
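
As a minimal illustration of the two interfaces discussed above,
assuming order-0 allocations:

	struct page *page = alloc_pages(GFP_KERNEL, 0);
	void *va = page ? page_address(page) : NULL;	/* pointer via struct page */

	unsigned long addr = __get_free_pages(GFP_KERNEL, 0);	/* pointer directly */

	if (page)
		__free_pages(page, 0);
	if (addr)
		free_pages(addr, 0);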

Other side is, SLABs, at least SLAB and SLUB 

[PATCH 1/1] drm/amdgpu: fix NULL pointer dereference for Renoir

2020-10-01 Thread Dirk Gouders
Commit c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
introduced a NULL pointer dereference when booting with
amdgpu.discovery=0, because it removed the call of vega10_reg_base_init()
for that case.

Fix this by calling that function if amdgpu_discovery == 0, in addition to
the case that amdgpu_discovery_reg_base_init() failed.

Fixes: c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
Signed-off-by: Dirk Gouders 
Cc: Hawking Zhang 
Cc: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 84d811b6e48b..f8cb62b326d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -694,12 +694,12 @@ static void soc15_reg_base_init(struct amdgpu_device *adev)
 * it doesn't support SRIOV. */
if (amdgpu_discovery) {
r = amdgpu_discovery_reg_base_init(adev);
-   if (r) {
-   DRM_WARN("failed to init reg base from ip discovery table, "
-"fallback to legacy init method\n");
-   vega10_reg_base_init(adev);
-   }
+   if (r == 0)
+ break;
+   DRM_WARN("failed to init reg base from ip discovery table, "
+"fallback to legacy init method\n");
}
+   vega10_reg_base_init(adev);
break;
case CHIP_VEGA20:
vega20_reg_base_init(adev);
-- 
2.26.2



Re: [PATCH 1/4] of/fdt: Update zone_dma_bits when running in bcm2711

2020-10-01 Thread Rob Herring
On Thu, Oct 1, 2020 at 12:31 PM Nicolas Saenz Julienne
 wrote:
>
> On Thu, 2020-10-01 at 18:23 +0100, Catalin Marinas wrote:
> > On Thu, Oct 01, 2020 at 06:15:01PM +0100, Catalin Marinas wrote:
> > > Hi Nicolas,
> > >
> > > Thanks for putting this together.
> > >
> > > On Thu, Oct 01, 2020 at 06:17:37PM +0200, Nicolas Saenz Julienne wrote:
> > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > > > index 4602e467ca8b..cd0d115ef329 100644
> > > > --- a/drivers/of/fdt.c
> > > > +++ b/drivers/of/fdt.c
> > > > @@ -25,6 +25,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include <linux/dma-direct.h> /* for zone_dma_bits */
> > > >
>  #include <asm/setup.h>   /* for COMMAND_LINE_SIZE */
> > > >  #include 
> > > > @@ -1198,6 +1199,14 @@ void __init early_init_dt_scan_nodes(void)
> > > >   of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> > > >  }
> > > >
> > > > +void __init early_init_dt_update_zone_dma_bits(void)
> > > > +{
> > > > + unsigned long dt_root = of_get_flat_dt_root();
> > > > +
> > > > + if (of_flat_dt_is_compatible(dt_root, "brcm,bcm2711"))
> > > > + zone_dma_bits = 30;
> > > > +}
> > >
> > > I think we could keep this entirely in the arm64 setup_machine_fdt() and
> > > not pollute the core code with RPi4-specific code.
> >
> > Actually, even better, could we not move the check to
> > arm64_memblock_init() when we initialise zone_dma_bits?
>
> I did it this way as I vaguely remembered Rob saying he wanted to centralise
> all early boot fdt code in one place. But I'll be happy to move it there.

Right, unless zone_dma_bits is an arm64-only thing, this doesn't
really have anything arch specific.
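
A hedged sketch of that alternative, reusing the flat-DT helpers from
the patch above; the exact placement inside arm64_memblock_init() is
an assumption:

	/* Early in arm64's arm64_memblock_init(), before zone_dma_bits is used: */
	if (of_flat_dt_is_compatible(of_get_flat_dt_root(), "brcm,bcm2711"))
		zone_dma_bits = 30;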

Reviewed-by: Rob Herring 

Rob


[PATCH 0/1] drm/amdgpu: fix NULL pointer dereference for Renoir

2020-10-01 Thread Dirk Gouders
Alex Deucher  writes:

> On Wed, Sep 30, 2020 at 4:46 PM Dirk Gouders  wrote:
>>
>> Commit c1cf79ca5ced46 (drm/amdgpu: use IP discovery table for renoir)
>> introduced a NULL pointer dereference when booting with
>> amdgpu.discovery=0.
>>
>> For amdgpu.discovery=0 that commit effectively removed the call of
>> vega10_reg_base_init(adev), so I tested the correctness of the bisect
>> session by restoring that function call for amdgpu_discovery == 0 and with
>> that change, the NULL pointer dereference does not occur:
>>
>
> Can I add your Signed-off-by?

I did not expect the diff to be taken as a proposed patch, nor did I
claim that it shows the correct fix.

Anyway, I did my best to create a hopefully acceptable patch, with some
modification of the code that avoids an "else" and an identical function
call at two places in the code.

I tested that patch with amdgpu.discovery={0,1}, together with the patch
for the first issue you helped me with.  The result is no more call traces.

Thank you for your patient assistance with the two issues.

Dirk


> Thanks,
>
> Alex
>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> index 84d811b6e48b..2e93c5e1e7e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>> @@ -699,7 +699,8 @@ static void soc15_reg_base_init(struct amdgpu_device *adev)
>>  "fallback to legacy init method\n");
>> vega10_reg_base_init(adev);
>> }
>> -   }
>> +   } else
>> +   vega10_reg_base_init(adev);
>> break;
>> case CHIP_VEGA20:
>> vega20_reg_base_init(adev);
>>
>> Dirk
>> ___
>> amd-gfx mailing list
>> amd-...@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Dirk Gouders (1):
  drm/amdgpu: fix NULL pointer dereference for Renoir

 drivers/gpu/drm/amd/amdgpu/soc15.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

-- 
2.26.2


