Re: [PATCH v8 4/7] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-05-01 Thread Ian Rogers
On Wed, Apr 1, 2020 at 1:35 PM Kajol Jain  wrote:
>
> Patch enhances current metric infrastructure to handle "?" in the metric
> expression. The "?" can be use for parameters whose value not known while
> creating metric events and which can be replace later at runtime to
> the proper value. It also add flexibility to create multiple events out
> of single metric event added in json file.
>
> Patch adds function 'arch_get_runtimeparam' which is a arch specific
> function, returns the count of metric events need to be created.
> By default it return 1.

Sorry for the slow response, I was trying to understand this patch in
relation to the PMU aliases to see if there was an overlap - I'm still
not sure. This is now merged so I'm just commenting wrt possible
future cleanup. I defer to the maintainers on how this should be
organized. At the metric level, this problem reminds me of both
#smt_on and LLC_MISSES.PCIE_WRITE on cascade lake. #smt_on adds a
degree of CPU specific behavior to an expression.
LLC_MISSES.PCIE_WRITE uses .part0 ... part3 to combine separate but
related counters.
The symbols that the metrics parse are then passed to parse-event. You
don't change parse-event as metricgroup replaces the '?' with a read
value from /devices/hv_24x7/interface/sockets, actually 0 to that
value-1 are passed.

It seems unfortunate to overload the meaning of runtime with a value
read from /devices/hv_24x7/interface/sockets and plumbing this value
around is quite a bit of noise for everything but this use-case. I
kind of wish we could do something like:

for i in 0, read("/devices/hv_24x7/interface/sockets"):
  hv_24x7/pm_pb_cyc,chip=$i

in the metric code. I have some patches to send related to metric
groups and I think this will be an active area of development for a
while as I think there are some open questions on the organization of
the code.

Thanks,
Ian

> This infrastructure needed for hv_24x7 socket/chip level events.
> "hv_24x7" chip level events needs specific chip-id to which the
> data is requested. Function 'arch_get_runtimeparam' implemented
> in header.c which extract number of sockets from sysfs file
> "sockets" under "/sys/devices/hv_24x7/interface/".
>
> With this patch basically we are trying to create as many metric events
> as define by runtime_param.
>
> For that one loop is added in function 'metricgroup__add_metric',
> which create multiple events at run time depend on return value of
> 'arch_get_runtimeparam' and merge that event in 'group_list'.
>
> To achieve that we are actually passing this parameter value as part of
> `expr__find_other` function and changing "?" present in metric expression
> with this value.
>
> As in our json file, there gonna be single metric event, and out of
> which we are creating multiple events.
>
> To understand which data count belongs to which parameter value,
> we also printing param value in generic_metric function.
>
> For example,
> command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
>  1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 
> GHz  PowerBUS_Frequency_0
>  1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 
> GHz  PowerBUS_Frequency_1
>  2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 
> GHz  PowerBUS_Frequency_0
>  2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 
> GHz  PowerBUS_Frequency_1
>
> So, here _0 and _1 after PowerBUS_Frequency specify parameter value.
>
> Signed-off-by: Kajol Jain 
> ---
>  tools/perf/arch/powerpc/util/header.c |  8 
>  tools/perf/tests/expr.c   |  8 
>  tools/perf/util/expr.c| 11 ++-
>  tools/perf/util/expr.h|  5 +++--
>  tools/perf/util/expr.l| 27 +++---
>  tools/perf/util/metricgroup.c | 28 ---
>  tools/perf/util/metricgroup.h |  2 ++
>  tools/perf/util/stat-shadow.c | 17 ++--
>  8 files changed, 79 insertions(+), 27 deletions(-)
>
> diff --git a/tools/perf/arch/powerpc/util/header.c 
> b/tools/perf/arch/powerpc/util/header.c
> index 3b4cdfc5efd6..d4870074f14c 100644
> --- a/tools/perf/arch/powerpc/util/header.c
> +++ b/tools/perf/arch/powerpc/util/header.c
> @@ -7,6 +7,8 @@
>  #include 
>  #include 
>  #include "header.h"
> +#include "metricgroup.h"
> +#include 
>
>  #define mfspr(rn)   ({unsigned long rval; \
>  asm volatile("mfspr %0," __stringify(rn) \
> @@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
>
> return bufp;
>  }
> +
> +int arch_get_runtimeparam(void)
> +{
> +   int count;
> +   return sysfs__read_int("/devices/hv_24x7/interface/sockets", ) 
> < 0 ? 1 : count;
> +}
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index ea10fc4412c4..516504cf0ea5 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -10,7 +10,7 @@ static 

[PATCH v2] powerpc: Drop CONFIG_MTD_M25P80 in 85xx-hw.config

2020-05-01 Thread Bin Meng
From: Bin Meng 

Drop CONFIG_MTD_M25P80 that was removed in
commit b35b9a10362d ("mtd: spi-nor: Move m25p80 code in spi-nor.c")

Signed-off-by: Bin Meng 

---

Changes in v2:
- correct the typo (5xx => 85xx) in the commit title

 arch/powerpc/configs/85xx-hw.config | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/configs/85xx-hw.config 
b/arch/powerpc/configs/85xx-hw.config
index b507df6..524db76 100644
--- a/arch/powerpc/configs/85xx-hw.config
+++ b/arch/powerpc/configs/85xx-hw.config
@@ -67,7 +67,6 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_MTD_CFI_INTELEXT=y
 CONFIG_MTD_CFI=y
 CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_M25P80=y
 CONFIG_MTD_NAND_FSL_ELBC=y
 CONFIG_MTD_NAND_FSL_IFC=y
 CONFIG_MTD_RAW_NAND=y
-- 
2.7.4



[PATCH] powerpc: Drop CONFIG_MTD_M25P80 in 5xx-hw.config

2020-05-01 Thread Bin Meng
From: Bin Meng 

Drop CONFIG_MTD_M25P80 that was removed in
commit b35b9a10362d ("mtd: spi-nor: Move m25p80 code in spi-nor.c")

Signed-off-by: Bin Meng 
---

 arch/powerpc/configs/85xx-hw.config | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/configs/85xx-hw.config 
b/arch/powerpc/configs/85xx-hw.config
index b507df6..524db76 100644
--- a/arch/powerpc/configs/85xx-hw.config
+++ b/arch/powerpc/configs/85xx-hw.config
@@ -67,7 +67,6 @@ CONFIG_MTD_CFI_AMDSTD=y
 CONFIG_MTD_CFI_INTELEXT=y
 CONFIG_MTD_CFI=y
 CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_M25P80=y
 CONFIG_MTD_NAND_FSL_ELBC=y
 CONFIG_MTD_NAND_FSL_IFC=y
 CONFIG_MTD_RAW_NAND=y
-- 
2.7.4



Re: 5.7-rc interrupt_return Unrecoverable exception 380

2020-05-01 Thread Nicholas Piggin
Excerpts from Hugh Dickins's message of May 2, 2020 6:38 am:
> Hi Nick,
> 
> I've been getting an "Unrecoverable exception 380" after a few hours
> of load on the G5 (yes, that G5!) with 5.7-rc: when interrupt_return
> checks lazy_irq_pending, it crashes at check_preemption_disabled+0x24
> with CONFIG_DEBUG_PREEMPT=y.
> 
> check_preemption_disabled():
> lib/smp_processor_id.c:13
>0: 7c 08 02 a6 mflrr0
>4: fb e1 ff f8 std r31,-8(r1)
>8: fb 61 ff d8 std r27,-40(r1)
>c: fb 81 ff e0 std r28,-32(r1)
>   10: fb a1 ff e8 std r29,-24(r1)
>   14: fb c1 ff f0 std r30,-16(r1)
> get_current():
> arch/powerpc/include/asm/current.h:20
>   18: eb ed 01 88 ld  r31,392(r13)
> check_preemption_disabled():
> lib/smp_processor_id.c:13
>   1c: f8 01 00 10 std r0,16(r1)
>   20: f8 21 ff 61 stdur1,-160(r1)
> __read_once_size():
> include/linux/compiler.h:199
>   24: 81 3f 00 00 lwz r9,0(r31)
> check_preemption_disabled():
> lib/smp_processor_id.c:14
>   28: a3 cd 00 02 lhz r30,2(r13)
> 
> I don't read ppc assembly, and have not jotted down the registers,
> but hope you can make sense of it. I get around it with the patch
> below (just avoiding the debug), but have no idea whether it's a
> necessary fix or a hacky workaround.

Hi Hugh,

Thanks for the report, nice catch. Your fix is actually the correct one 
(well, we probably want a __lazy_irq_pending() variant which is to be 
used in these cases).

Problem is MSR[RI] is cleared here, ready to do the last few things for 
interrupt return where we're not allowed to take any other interrupts.

SLB interrupts can happen just about anywhere aside from kernel text, 
global variables, and stack. When that hits, it appears to be 
unrecoverable due to RI=0.

We could clear just MSR[EE] for asynchronous interrupts, then check 
lazy_irq_pending(), and then clear MSR[RI] ready to return, and the
SLB miss in the debug check would be fine. But that's two mtmsr 
instructions, which is slower. So we'll skip the check.

I tested hash, and preempt, possibly even preempt+hash, but clearly not 
preempt+preempt_debug+hash+slb thrashing!

Thanks,
Nick

> 
> Hugh
> 
> --- 5.7-rc3/arch/powerpc/include/asm/hw_irq.h 2020-04-12 16:24:29.802769727 
> -0700
> +++ linux/arch/powerpc/include/asm/hw_irq.h   2020-04-27 11:31:10.0 
> -0700
> @@ -252,7 +252,7 @@ static inline bool arch_irqs_disabled(vo
>  
>  static inline bool lazy_irq_pending(void)
>  {
> - return !!(get_paca()->irq_happened & ~PACA_IRQ_HARD_DIS);
> + return !!(local_paca->irq_happened & ~PACA_IRQ_HARD_DIS);
>  }
>  
>  /*
> 


Re: [PATCH 21/29] mm: remove the pgprot argument to __vmalloc

2020-05-01 Thread Andrew Morton
On Thu, 30 Apr 2020 22:38:10 -0400 John Dorminy  wrote:

> the change
> description refers to PROT_KERNEL, which is a symbol which does not
> appear to exist; perhaps PAGE_KERNEL was meant?

Yes, thanks, fixed.


Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 2:11 PM David Hildenbrand  wrote:
>
> On 01.05.20 22:12, Dan Williams wrote:
[..]
> >>> Consider the case of EFI Special Purpose (SP) Memory that is
> >>> marked EFI Conventional Memory with the SP attribute. In that case the
> >>> firmware memory map marked it as conventional RAM, but the kernel
> >>> optionally marks it as System RAM vs Soft Reserved. The 2008 patch
> >>> simply does not consider that case. I'm not sure strict textualism
> >>> works for coding decisions.
> >>
> >> I am no expert on that matter (esp EFI). But looking at the users of
> >> firmware_map_add_early(), the single user is in arch/x86/kernel/e820.c
> >> . So the single source of /sys/firmware/memmap is (besides hotplug) e820.
> >>
> >> "'e820_table_firmware': the original firmware version passed to us by
> >> the bootloader - not modified by the kernel. ... inform the user about
> >> the firmware's notion of memory layout via /sys/firmware/memmap"
> >> (arch/x86/kernel/e820.c)
> >>
> >> How is the EFI Special Purpose (SP) Memory represented in e820?
> >> /sys/firmware/memmap is really simple: just dump in e820. No policies IIUC.
> >
> > e820 now has a Soft Reserved translation for this which means "try to
> > reserve, but treat as System RAM is ok too". It seems generically
> > useful to me that the toggle for determining whether Soft Reserved or
> > System RAM shows up /sys/firmware/memmap is a determination that
> > policy can make. The kernel need not preemptively block it.
>
> So, I think I have to clarify something here. We do have two ways to kexec
>
> 1. kexec_load(): User space (kexec-tools) crafts the memmap (e.g., using
> /sys/firmware/memmap on x86-64) and selects memory where to place the
> kexec images (e.g., using /proc/iomem)
>
> 2. kexec_file_load(): The kernel reuses the (basically) raw firmware
> memmap and selects memory where to place kexec images.
>
> We are talking about changing 1, to behave like 2 in regards to
> dax/kmem. 2. does currently not add any hotplugged memory to the
> fixed-up e820, and it should be fixed regarding hotplugged DIMMs that
> would appear in e820 after a reboot.
>
> Now, all these policy discussions are nice and fun, but I don't really
> see a good reason to (ab)use /sys/firmware/memmap for that (e.g., parent
> properties). If you want to be able to make this configurable, then
> e.g., add a way to configure this in the kernel (for example along with
> kmem) to make 1. and 2. behave the same way. Otherwise, you really only
> can change 1.

That's clearer.

>
>
> Now, let's clarify what I want regarding virtio-mem:
>
> 1. kexec should not add virtio-mem memory to the initial firmware
>memmap. The driver has to be in charge as discussed.
> 2. kexec should not place kexec images onto virtio-mem memory. That
>would end badly.
> 3. kexec should still dump virtio-mem memory via kdump.

Ok, but then seems to say to me that dax/kmem is a different type of
(driver managed) than virtio-mem and it's confusing to try to apply
the same meaning. Why not just call your type for the distinct type it
is "System RAM (virtio-mem)" and let any other driver managed memory
follow the same "System RAM ($driver)" format if it wants?


Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 22:12, Dan Williams wrote:
> On Fri, May 1, 2020 at 12:18 PM David Hildenbrand  wrote:
>>
>> On 01.05.20 20:43, Dan Williams wrote:
>>> On Fri, May 1, 2020 at 11:14 AM David Hildenbrand  wrote:

 On 01.05.20 20:03, Dan Williams wrote:
> On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  
> wrote:
>>
>> On 01.05.20 19:45, David Hildenbrand wrote:
>>> On 01.05.20 19:39, Dan Williams wrote:
 On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  
 wrote:
>
> On 01.05.20 18:56, Dan Williams wrote:
>> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
>> wrote:
>>>
>>> On 01.05.20 00:24, Andrew Morton wrote:
 On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
  wrote:

>>
>> Why does the firmware map support hotplug entries?
>
> I assume:
>
> The firmware memmap was added primarily for x86-64 kexec (and 
> still, is
> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. 
> When DIMMs
> get hotplugged on real HW, they get added to e820. Same applies to
> memory added via HyperV balloon (unless memory is unplugged via
> ballooning and you reboot ... the the e820 is changed as well). I 
> assume
> we wanted to be able to reflect that, to make kexec look like a 
> real reboot.
>
> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>
>
> But I assume only Andrew can enlighten us.
>
> @Andrew, any guidance here? Should we really add all memory to the
> firmware memmap, even if this contradicts with the existing
> documentation? (especially, if the actual firmware memmap will 
> *not*
> contain that memory after a reboot)

 For some reason that patch is misattributed - it was authored by
 Shaohui Zheng , who hasn't been heard 
 from in
 a decade.  I looked through the email discussion from that time 
 and I'm
 not seeing anything useful.  But I wasn't able to locate Dave 
 Hansen's
 review comments.
>>>
>>> Okay, thanks for checking. I think the documentation from 2008 is 
>>> pretty
>>> clear what has to be done here. I will add some of these details to 
>>> the
>>> patch description.
>>>
>>> Also, now that I know that esp. kexec-tools already don't consider
>>> dax/kmem memory properly (memory will not get dumped via kdump) and
>>> won't really suffer from a name change in /proc/iomem, I will go 
>>> back to
>>> the MHP_DRIVER_MANAGED approach and
>>> 1. Don't create firmware memmap entries
>>> 2. Name the resource "System RAM (driver managed)"
>>> 3. Flag the resource via something like 
>>> IORESOURCE_MEM_DRIVER_MANAGED.
>>>
>>> This way, kernel users and user space can figure out that this 
>>> memory
>>> has different semantics and handle it accordingly - I think that was
>>> what Eric was asking for.
>>>
>>> Of course, open for suggestions.
>>
>> I'm still more of a fan of this being communicated by "System RAM"
>
> I was mentioning somewhere in this thread that "System RAM" inside a
> hierarchy (like dax/kmem) will already be basically ignored by
> kexec-tools. So, placing it inside a hierarchy already makes it look
> special already.
>
> But after all, as we have to change kexec-tools either way, we can
> directly go ahead and flag it properly as special (in case there will
> ever be other cases where we could no longer distinguish it).
>
>> being parented especially because that tells you something about how
>> the memory is driver-managed and which mechanism might be in play.
>
> The could be communicated to some degree via the resource hierarchy.
>
> E.g.,
>
> [root@localhost ~]# cat /proc/iomem
> ...
> 14000-33fff : Persistent Memory
>   14000-1481f : namespace0.0
>   15000-33fff : dax0.0
> 15000-33fff : System RAM (driver managed)
>
> vs.
>
>:/# cat /proc/iomem
> [...]
> 14000-333ff : virtio-mem (virtio0)
>   14000-147ff : System RAM (driver managed)
>   14800-14fff : System RAM (driver managed)

Re: [RFC PATCH v2 1/5] powerpc/mm: Introduce temporary mm

2020-05-01 Thread Christopher M. Riedl
On Wed Apr 29, 2020 at 7:48 AM, Christophe Leroy wrote:
>
> 
>
> 
> Le 29/04/2020 à 04:05, Christopher M. Riedl a écrit :
> > x86 supports the notion of a temporary mm which restricts access to
> > temporary PTEs to a single CPU. A temporary mm is useful for situations
> > where a CPU needs to perform sensitive operations (such as patching a
> > STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> > said mappings to other CPUs. A side benefit is that other CPU TLBs do
> > not need to be flushed when the temporary mm is torn down.
> > 
> > Mappings in the temporary mm can be set in the userspace portion of the
> > address-space.
> > 
> > Interrupts must be disabled while the temporary mm is in use. HW
> > breakpoints, which may have been set by userspace as watchpoints on
> > addresses now within the temporary mm, are saved and disabled when
> > loading the temporary mm. The HW breakpoints are restored when unloading
> > the temporary mm. All HW breakpoints are indiscriminately disabled while
> > the temporary mm is in use.
>
> 
> Why do we need to use a temporary mm all the time ?
>

Not sure I understand, the temporary mm is only in use for kernel
patching in this series. We could have other uses in the future maybe
where it's beneficial to keep mappings local.

> 
> Doesn't each CPU have its own mm already ? Only the upper address space
> is shared between all mm's but each mm has its own lower address space,
> at least when it is running a user process. Why not just use that mm ?
> As we are mapping then unmapping with interrupts disabled, there is no
> risk at all that the user starts running while the patch page is mapped,
> so I'm not sure why switching to a temporary mm is needed.
>
> 

I suppose that's an option, but then we have to save and restore the
mapping which we temporarily "steal" from userspace. I admit I didn't
consider that as an option when I started this series based on the x86
patches. I think it's cleaner to switch mm, but that's a rather weak
argument. Are you concerned about performance with the temporary mm?

>
> 
> > 
> > Based on x86 implementation:
> > 
> > commit cefa929c034e
> > ("x86/mm: Introduce temporary mm structs")
> > 
> > Signed-off-by: Christopher M. Riedl 
>
> 
> Christophe
>
> 
>
> 



Re: [RFC PATCH v2 1/5] powerpc/mm: Introduce temporary mm

2020-05-01 Thread Christopher M. Riedl
On Wed Apr 29, 2020 at 7:39 AM, Christophe Leroy wrote:
>
> 
>
> 
> Le 29/04/2020 à 04:05, Christopher M. Riedl a écrit :
> > x86 supports the notion of a temporary mm which restricts access to
> > temporary PTEs to a single CPU. A temporary mm is useful for situations
> > where a CPU needs to perform sensitive operations (such as patching a
> > STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> > said mappings to other CPUs. A side benefit is that other CPU TLBs do
> > not need to be flushed when the temporary mm is torn down.
> > 
> > Mappings in the temporary mm can be set in the userspace portion of the
> > address-space.
> > 
> > Interrupts must be disabled while the temporary mm is in use. HW
> > breakpoints, which may have been set by userspace as watchpoints on
> > addresses now within the temporary mm, are saved and disabled when
> > loading the temporary mm. The HW breakpoints are restored when unloading
> > the temporary mm. All HW breakpoints are indiscriminately disabled while
> > the temporary mm is in use.
> > 
> > Based on x86 implementation:
> > 
> > commit cefa929c034e
> > ("x86/mm: Introduce temporary mm structs")
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   arch/powerpc/include/asm/debug.h   |  1 +
> >   arch/powerpc/include/asm/mmu_context.h | 54 ++
> >   arch/powerpc/kernel/process.c  |  5 +++
> >   3 files changed, 60 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/debug.h 
> > b/arch/powerpc/include/asm/debug.h
> > index 7756026b95ca..b945bc16c932 100644
> > --- a/arch/powerpc/include/asm/debug.h
> > +++ b/arch/powerpc/include/asm/debug.h
> > @@ -45,6 +45,7 @@ static inline int debugger_break_match(struct pt_regs 
> > *regs) { return 0; }
> >   static inline int debugger_fault_handler(struct pt_regs *regs) { return 
> > 0; }
> >   #endif
> >   
> > +void __get_breakpoint(struct arch_hw_breakpoint *brk);
> >   void __set_breakpoint(struct arch_hw_breakpoint *brk);
> >   bool ppc_breakpoint_available(void);
> >   #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> > diff --git a/arch/powerpc/include/asm/mmu_context.h 
> > b/arch/powerpc/include/asm/mmu_context.h
> > index 360367c579de..57a8695fe63f 100644
> > --- a/arch/powerpc/include/asm/mmu_context.h
> > +++ b/arch/powerpc/include/asm/mmu_context.h
> > @@ -10,6 +10,7 @@
> >   #include   
> >   #include 
> >   #include 
> > +#include 
> >   
> >   /*
> >* Most if the context management is out of line
> > @@ -270,5 +271,58 @@ static inline int arch_dup_mmap(struct mm_struct 
> > *oldmm,
> > return 0;
> >   }
> >   
> > +struct temp_mm {
> > +   struct mm_struct *temp;
> > +   struct mm_struct *prev;
> > +   bool is_kernel_thread;
> > +   struct arch_hw_breakpoint brk;
> > +};
> > +
> > +static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct 
> > *mm)
> > +{
> > +   temp_mm->temp = mm;
> > +   temp_mm->prev = NULL;
> > +   temp_mm->is_kernel_thread = false;
> > +   memset(_mm->brk, 0, sizeof(temp_mm->brk));
> > +}
> > +
> > +static inline void use_temporary_mm(struct temp_mm *temp_mm)
> > +{
> > +   lockdep_assert_irqs_disabled();
> > +
> > +   temp_mm->is_kernel_thread = current->mm == NULL;
> > +   if (temp_mm->is_kernel_thread)
> > +   temp_mm->prev = current->active_mm;
> > +   else
> > +   temp_mm->prev = current->mm;
> > +
> > +   /*
> > +* Hash requires a non-NULL current->mm to allocate a userspace address
> > +* when handling a page fault. Does not appear to hurt in Radix either.
> > +*/
> > +   current->mm = temp_mm->temp;
> > +   switch_mm_irqs_off(NULL, temp_mm->temp, current);
> > +
> > +   if (ppc_breakpoint_available()) {
> > +   __get_breakpoint(_mm->brk);
> > +   if (temp_mm->brk.type != 0)
> > +   hw_breakpoint_disable();
> > +   }
> > +}
> > +
> > +static inline void unuse_temporary_mm(struct temp_mm *temp_mm)
>
> 
> Not sure "unuse" is a best naming, allthought I don't have a better
> suggestion a the moment. If not using temporary_mm anymore, what are we
> using now ?
>
> 

I'm not too fond of 'unuse' either, but it's what x86 uses and I
couldn't come up with anything better on the spot. Maybe 'undo' is
better since we're switching back to whatever mm was in use before?

> > +{
> > +   lockdep_assert_irqs_disabled();
> > +
> > +   if (temp_mm->is_kernel_thread)
> > +   current->mm = NULL;
> > +   else
> > +   current->mm = temp_mm->prev;
> > +   switch_mm_irqs_off(NULL, temp_mm->prev, current);
> > +
> > +   if (ppc_breakpoint_available() && temp_mm->brk.type != 0)
> > +   __set_breakpoint(_mm->brk);
> > +}
> > +
> >   #endif /* __KERNEL__ */
> >   #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 9c21288f8645..ec4cf890d92c 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -800,6 +800,11 @@ static inline int 

Re: [RFC PATCH v2 3/5] powerpc/lib: Use a temporary mm for code patching

2020-05-01 Thread Christopher M. Riedl
On Wed Apr 29, 2020 at 7:52 AM, Christophe Leroy wrote:
>
> 
>
> 
> Le 29/04/2020 à 04:05, Christopher M. Riedl a écrit :
> > Currently, code patching a STRICT_KERNEL_RWX exposes the temporary
> > mappings to other CPUs. These mappings should be kept local to the CPU
> > doing the patching. Use the pre-initialized temporary mm and patching
> > address for this purpose. Also add a check after patching to ensure the
> > patch succeeded.
> > 
> > Use the KUAP functions on non-BOOKS3_64 platforms since the temporary
> > mapping for patching uses a userspace address (to keep the mapping
> > local). On BOOKS3_64 platforms hash does not implement KUAP and on radix
> > the use of PAGE_KERNEL sets EAA[0] for the PTE which means the AMR
> > (KUAP) protection is ignored (see PowerISA v3.0b, Fig, 35).
> > 
> > Based on x86 implementation:
> > 
> > commit b3fd8e83ada0
> > ("x86/alternatives: Use temporary mm for text poking")
> > 
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >   arch/powerpc/lib/code-patching.c | 149 ---
> >   1 file changed, 55 insertions(+), 94 deletions(-)
> > 
> > diff --git a/arch/powerpc/lib/code-patching.c 
> > b/arch/powerpc/lib/code-patching.c
> > index 259c19480a85..26f06cdb5d7e 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -19,6 +19,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   
> >   static int __patch_instruction(unsigned int *exec_addr, unsigned int 
> > instr,
> >unsigned int *patch_addr)
> > @@ -72,101 +73,58 @@ void __init poking_init(void)
> > pte_unmap_unlock(ptep, ptl);
> >   }
> >   
> > -static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> > -
> > -static int text_area_cpu_up(unsigned int cpu)
> > -{
> > -   struct vm_struct *area;
> > -
> > -   area = get_vm_area(PAGE_SIZE, VM_ALLOC);
> > -   if (!area) {
> > -   WARN_ONCE(1, "Failed to create text area for cpu %d\n",
> > -   cpu);
> > -   return -1;
> > -   }
> > -   this_cpu_write(text_poke_area, area);
> > -
> > -   return 0;
> > -}
> > -
> > -static int text_area_cpu_down(unsigned int cpu)
> > -{
> > -   free_vm_area(this_cpu_read(text_poke_area));
> > -   return 0;
> > -}
> > -
> > -/*
> > - * Run as a late init call. This allows all the boot time patching to be 
> > done
> > - * simply by patching the code, and then we're called here prior to
> > - * mark_rodata_ro(), which happens after all init calls are run. Although
> > - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we 
> > judge
> > - * it as being preferable to a kernel that will crash later when someone 
> > tries
> > - * to use patch_instruction().
> > - */
> > -static int __init setup_text_poke_area(void)
> > -{
> > -   BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > -   "powerpc/text_poke:online", text_area_cpu_up,
> > -   text_area_cpu_down));
> > -
> > -   return 0;
> > -}
> > -late_initcall(setup_text_poke_area);
> > +struct patch_mapping {
> > +   spinlock_t *ptl; /* for protecting pte table */
> > +   pte_t *ptep;
> > +   struct temp_mm temp_mm;
> > +};
> >   
> >   /*
> >* This can be called for kernel text or a module.
> >*/
> > -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> > +static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
> >   {
> > -   unsigned long pfn;
> > -   int err;
> > +   struct page *page;
> > +   pte_t pte;
> > +   pgprot_t pgprot;
> >   
> > if (is_vmalloc_addr(addr))
> > -   pfn = vmalloc_to_pfn(addr);
> > +   page = vmalloc_to_page(addr);
> > else
> > -   pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> > +   page = virt_to_page(addr);
> >   
> > -   err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> > +   if (radix_enabled())
> > +   pgprot = PAGE_KERNEL;
> > +   else
> > +   pgprot = PAGE_SHARED;
> >   
> > -   pr_devel("Mapped addr %lx with pfn %lx:%d\n", text_poke_addr, pfn, err);
> > -   if (err)
> > +   patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> > +_mapping->ptl);
> > +   if (unlikely(!patch_mapping->ptep)) {
> > +   pr_warn("map patch: failed to allocate pte for patching\n");
> > return -1;
> > +   }
> > +
> > +   pte = mk_pte(page, pgprot);
> > +   if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64))
> > +   pte = pte_mkdirty(pte);
>
> 
> Why only when CONFIG_PPC_BOOK3S_64 is not set ?
>
> 
> PAGE_KERNEL should already be dirty, so making it dirty all the time
> shouldn't hurt.
>
> 
Ok, I'll remove this check to simplify.
> > +   set_pte_at(patching_mm, patching_addr, patch_mapping->ptep, pte);
> > +
> > +   init_temp_mm(_mapping->temp_mm, patching_mm);
> > +   use_temporary_mm(_mapping->temp_mm);
> >   
> > return 0;
> >   }
> >   
> > -static inline int unmap_patch_area(unsigned long addr)
> 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 12:18 PM David Hildenbrand  wrote:
>
> On 01.05.20 20:43, Dan Williams wrote:
> > On Fri, May 1, 2020 at 11:14 AM David Hildenbrand  wrote:
> >>
> >> On 01.05.20 20:03, Dan Williams wrote:
> >>> On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  
> >>> wrote:
> 
>  On 01.05.20 19:45, David Hildenbrand wrote:
> > On 01.05.20 19:39, Dan Williams wrote:
> >> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  
> >> wrote:
> >>>
> >>> On 01.05.20 18:56, Dan Williams wrote:
>  On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
>  wrote:
> >
> > On 01.05.20 00:24, Andrew Morton wrote:
> >> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
> >>  wrote:
> >>
> 
>  Why does the firmware map support hotplug entries?
> >>>
> >>> I assume:
> >>>
> >>> The firmware memmap was added primarily for x86-64 kexec (and 
> >>> still, is
> >>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. 
> >>> When DIMMs
> >>> get hotplugged on real HW, they get added to e820. Same applies to
> >>> memory added via HyperV balloon (unless memory is unplugged via
> >>> ballooning and you reboot ... the the e820 is changed as well). I 
> >>> assume
> >>> we wanted to be able to reflect that, to make kexec look like a 
> >>> real reboot.
> >>>
> >>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
> >>>
> >>>
> >>> But I assume only Andrew can enlighten us.
> >>>
> >>> @Andrew, any guidance here? Should we really add all memory to the
> >>> firmware memmap, even if this contradicts with the existing
> >>> documentation? (especially, if the actual firmware memmap will 
> >>> *not*
> >>> contain that memory after a reboot)
> >>
> >> For some reason that patch is misattributed - it was authored by
> >> Shaohui Zheng , who hasn't been heard 
> >> from in
> >> a decade.  I looked through the email discussion from that time 
> >> and I'm
> >> not seeing anything useful.  But I wasn't able to locate Dave 
> >> Hansen's
> >> review comments.
> >
> > Okay, thanks for checking. I think the documentation from 2008 is 
> > pretty
> > clear what has to be done here. I will add some of these details to 
> > the
> > patch description.
> >
> > Also, now that I know that esp. kexec-tools already don't consider
> > dax/kmem memory properly (memory will not get dumped via kdump) and
> > won't really suffer from a name change in /proc/iomem, I will go 
> > back to
> > the MHP_DRIVER_MANAGED approach and
> > 1. Don't create firmware memmap entries
> > 2. Name the resource "System RAM (driver managed)"
> > 3. Flag the resource via something like 
> > IORESOURCE_MEM_DRIVER_MANAGED.
> >
> > This way, kernel users and user space can figure out that this 
> > memory
> > has different semantics and handle it accordingly - I think that was
> > what Eric was asking for.
> >
> > Of course, open for suggestions.
> 
>  I'm still more of a fan of this being communicated by "System RAM"
> >>>
> >>> I was mentioning somewhere in this thread that "System RAM" inside a
> >>> hierarchy (like dax/kmem) will already be basically ignored by
> >>> kexec-tools. So, placing it inside a hierarchy already makes it look
> >>> special already.
> >>>
> >>> But after all, as we have to change kexec-tools either way, we can
> >>> directly go ahead and flag it properly as special (in case there will
> >>> ever be other cases where we could no longer distinguish it).
> >>>
>  being parented especially because that tells you something about how
>  the memory is driver-managed and which mechanism might be in play.
> >>>
> >>> The could be communicated to some degree via the resource hierarchy.
> >>>
> >>> E.g.,
> >>>
> >>> [root@localhost ~]# cat /proc/iomem
> >>> ...
> >>> 14000-33fff : Persistent Memory
> >>>   14000-1481f : namespace0.0
> >>>   15000-33fff : dax0.0
> >>> 15000-33fff : System RAM (driver managed)
> >>>
> >>> vs.
> >>>
> >>>:/# cat /proc/iomem
> >>> [...]
> >>> 14000-333ff : virtio-mem (virtio0)
> >>>   14000-147ff : System RAM (driver managed)
> >>>   14800-14fff : System RAM (driver managed)
> >>>   15000-157ff : 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 20:43, Dan Williams wrote:
> On Fri, May 1, 2020 at 11:14 AM David Hildenbrand  wrote:
>>
>> On 01.05.20 20:03, Dan Williams wrote:
>>> On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  wrote:

 On 01.05.20 19:45, David Hildenbrand wrote:
> On 01.05.20 19:39, Dan Williams wrote:
>> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  
>> wrote:
>>>
>>> On 01.05.20 18:56, Dan Williams wrote:
 On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
 wrote:
>
> On 01.05.20 00:24, Andrew Morton wrote:
>> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
>>  wrote:
>>

 Why does the firmware map support hotplug entries?
>>>
>>> I assume:
>>>
>>> The firmware memmap was added primarily for x86-64 kexec (and 
>>> still, is
>>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When 
>>> DIMMs
>>> get hotplugged on real HW, they get added to e820. Same applies to
>>> memory added via HyperV balloon (unless memory is unplugged via
>>> ballooning and you reboot ... the the e820 is changed as well). I 
>>> assume
>>> we wanted to be able to reflect that, to make kexec look like a 
>>> real reboot.
>>>
>>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>>>
>>>
>>> But I assume only Andrew can enlighten us.
>>>
>>> @Andrew, any guidance here? Should we really add all memory to the
>>> firmware memmap, even if this contradicts with the existing
>>> documentation? (especially, if the actual firmware memmap will *not*
>>> contain that memory after a reboot)
>>
>> For some reason that patch is misattributed - it was authored by
>> Shaohui Zheng , who hasn't been heard from 
>> in
>> a decade.  I looked through the email discussion from that time and 
>> I'm
>> not seeing anything useful.  But I wasn't able to locate Dave 
>> Hansen's
>> review comments.
>
> Okay, thanks for checking. I think the documentation from 2008 is 
> pretty
> clear what has to be done here. I will add some of these details to 
> the
> patch description.
>
> Also, now that I know that esp. kexec-tools already don't consider
> dax/kmem memory properly (memory will not get dumped via kdump) and
> won't really suffer from a name change in /proc/iomem, I will go back 
> to
> the MHP_DRIVER_MANAGED approach and
> 1. Don't create firmware memmap entries
> 2. Name the resource "System RAM (driver managed)"
> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>
> This way, kernel users and user space can figure out that this memory
> has different semantics and handle it accordingly - I think that was
> what Eric was asking for.
>
> Of course, open for suggestions.

 I'm still more of a fan of this being communicated by "System RAM"
>>>
>>> I was mentioning somewhere in this thread that "System RAM" inside a
>>> hierarchy (like dax/kmem) will already be basically ignored by
>>> kexec-tools. So, placing it inside a hierarchy already makes it look
>>> special already.
>>>
>>> But after all, as we have to change kexec-tools either way, we can
>>> directly go ahead and flag it properly as special (in case there will
>>> ever be other cases where we could no longer distinguish it).
>>>
 being parented especially because that tells you something about how
 the memory is driver-managed and which mechanism might be in play.
>>>
>>> The could be communicated to some degree via the resource hierarchy.
>>>
>>> E.g.,
>>>
>>> [root@localhost ~]# cat /proc/iomem
>>> ...
>>> 14000-33fff : Persistent Memory
>>>   14000-1481f : namespace0.0
>>>   15000-33fff : dax0.0
>>> 15000-33fff : System RAM (driver managed)
>>>
>>> vs.
>>>
>>>:/# cat /proc/iomem
>>> [...]
>>> 14000-333ff : virtio-mem (virtio0)
>>>   14000-147ff : System RAM (driver managed)
>>>   14800-14fff : System RAM (driver managed)
>>>   15000-157ff : System RAM (driver managed)
>>>
>>> Good enough for my taste.
>>>
 What about adding an optional /sys/firmware/memmap/X/parent attribute.
>>>
>>> I really don't want any firmware memmap entries for something that is
>>> not part of the firmware provided memmap. In addition,
>>> 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 11:14 AM David Hildenbrand  wrote:
>
> On 01.05.20 20:03, Dan Williams wrote:
> > On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  wrote:
> >>
> >> On 01.05.20 19:45, David Hildenbrand wrote:
> >>> On 01.05.20 19:39, Dan Williams wrote:
>  On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  
>  wrote:
> >
> > On 01.05.20 18:56, Dan Williams wrote:
> >> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
> >> wrote:
> >>>
> >>> On 01.05.20 00:24, Andrew Morton wrote:
>  On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
>   wrote:
> 
> >>
> >> Why does the firmware map support hotplug entries?
> >
> > I assume:
> >
> > The firmware memmap was added primarily for x86-64 kexec (and 
> > still, is
> > mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When 
> > DIMMs
> > get hotplugged on real HW, they get added to e820. Same applies to
> > memory added via HyperV balloon (unless memory is unplugged via
> > ballooning and you reboot ... the the e820 is changed as well). I 
> > assume
> > we wanted to be able to reflect that, to make kexec look like a 
> > real reboot.
> >
> > This worked for a while. Then came dax/kmem. Now comes virtio-mem.
> >
> >
> > But I assume only Andrew can enlighten us.
> >
> > @Andrew, any guidance here? Should we really add all memory to the
> > firmware memmap, even if this contradicts with the existing
> > documentation? (especially, if the actual firmware memmap will *not*
> > contain that memory after a reboot)
> 
>  For some reason that patch is misattributed - it was authored by
>  Shaohui Zheng , who hasn't been heard from 
>  in
>  a decade.  I looked through the email discussion from that time and 
>  I'm
>  not seeing anything useful.  But I wasn't able to locate Dave 
>  Hansen's
>  review comments.
> >>>
> >>> Okay, thanks for checking. I think the documentation from 2008 is 
> >>> pretty
> >>> clear what has to be done here. I will add some of these details to 
> >>> the
> >>> patch description.
> >>>
> >>> Also, now that I know that esp. kexec-tools already don't consider
> >>> dax/kmem memory properly (memory will not get dumped via kdump) and
> >>> won't really suffer from a name change in /proc/iomem, I will go back 
> >>> to
> >>> the MHP_DRIVER_MANAGED approach and
> >>> 1. Don't create firmware memmap entries
> >>> 2. Name the resource "System RAM (driver managed)"
> >>> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
> >>>
> >>> This way, kernel users and user space can figure out that this memory
> >>> has different semantics and handle it accordingly - I think that was
> >>> what Eric was asking for.
> >>>
> >>> Of course, open for suggestions.
> >>
> >> I'm still more of a fan of this being communicated by "System RAM"
> >
> > I was mentioning somewhere in this thread that "System RAM" inside a
> > hierarchy (like dax/kmem) will already be basically ignored by
> > kexec-tools. So, placing it inside a hierarchy already makes it look
> > special already.
> >
> > But after all, as we have to change kexec-tools either way, we can
> > directly go ahead and flag it properly as special (in case there will
> > ever be other cases where we could no longer distinguish it).
> >
> >> being parented especially because that tells you something about how
> >> the memory is driver-managed and which mechanism might be in play.
> >
> > The could be communicated to some degree via the resource hierarchy.
> >
> > E.g.,
> >
> > [root@localhost ~]# cat /proc/iomem
> > ...
> > 14000-33fff : Persistent Memory
> >   14000-1481f : namespace0.0
> >   15000-33fff : dax0.0
> > 15000-33fff : System RAM (driver managed)
> >
> > vs.
> >
> >:/# cat /proc/iomem
> > [...]
> > 14000-333ff : virtio-mem (virtio0)
> >   14000-147ff : System RAM (driver managed)
> >   14800-14fff : System RAM (driver managed)
> >   15000-157ff : System RAM (driver managed)
> >
> > Good enough for my taste.
> >
> >> What about adding an optional /sys/firmware/memmap/X/parent attribute.
> >
> > I really don't want any firmware memmap entries for something that is
> > not part of the firmware provided memmap. In addition,
> > /sys/firmware/memmap/ is still a fairly x86_64 specific 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 20:03, Dan Williams wrote:
> On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  wrote:
>>
>> On 01.05.20 19:45, David Hildenbrand wrote:
>>> On 01.05.20 19:39, Dan Williams wrote:
 On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  wrote:
>
> On 01.05.20 18:56, Dan Williams wrote:
>> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
>> wrote:
>>>
>>> On 01.05.20 00:24, Andrew Morton wrote:
 On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
  wrote:

>>
>> Why does the firmware map support hotplug entries?
>
> I assume:
>
> The firmware memmap was added primarily for x86-64 kexec (and still, 
> is
> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When 
> DIMMs
> get hotplugged on real HW, they get added to e820. Same applies to
> memory added via HyperV balloon (unless memory is unplugged via
> ballooning and you reboot ... the the e820 is changed as well). I 
> assume
> we wanted to be able to reflect that, to make kexec look like a real 
> reboot.
>
> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>
>
> But I assume only Andrew can enlighten us.
>
> @Andrew, any guidance here? Should we really add all memory to the
> firmware memmap, even if this contradicts with the existing
> documentation? (especially, if the actual firmware memmap will *not*
> contain that memory after a reboot)

 For some reason that patch is misattributed - it was authored by
 Shaohui Zheng , who hasn't been heard from in
 a decade.  I looked through the email discussion from that time and I'm
 not seeing anything useful.  But I wasn't able to locate Dave Hansen's
 review comments.
>>>
>>> Okay, thanks for checking. I think the documentation from 2008 is pretty
>>> clear what has to be done here. I will add some of these details to the
>>> patch description.
>>>
>>> Also, now that I know that esp. kexec-tools already don't consider
>>> dax/kmem memory properly (memory will not get dumped via kdump) and
>>> won't really suffer from a name change in /proc/iomem, I will go back to
>>> the MHP_DRIVER_MANAGED approach and
>>> 1. Don't create firmware memmap entries
>>> 2. Name the resource "System RAM (driver managed)"
>>> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>>>
>>> This way, kernel users and user space can figure out that this memory
>>> has different semantics and handle it accordingly - I think that was
>>> what Eric was asking for.
>>>
>>> Of course, open for suggestions.
>>
>> I'm still more of a fan of this being communicated by "System RAM"
>
> I was mentioning somewhere in this thread that "System RAM" inside a
> hierarchy (like dax/kmem) will already be basically ignored by
> kexec-tools. So, placing it inside a hierarchy already makes it look
> special already.
>
> But after all, as we have to change kexec-tools either way, we can
> directly go ahead and flag it properly as special (in case there will
> ever be other cases where we could no longer distinguish it).
>
>> being parented especially because that tells you something about how
>> the memory is driver-managed and which mechanism might be in play.
>
> The could be communicated to some degree via the resource hierarchy.
>
> E.g.,
>
> [root@localhost ~]# cat /proc/iomem
> ...
> 14000-33fff : Persistent Memory
>   14000-1481f : namespace0.0
>   15000-33fff : dax0.0
> 15000-33fff : System RAM (driver managed)
>
> vs.
>
>:/# cat /proc/iomem
> [...]
> 14000-333ff : virtio-mem (virtio0)
>   14000-147ff : System RAM (driver managed)
>   14800-14fff : System RAM (driver managed)
>   15000-157ff : System RAM (driver managed)
>
> Good enough for my taste.
>
>> What about adding an optional /sys/firmware/memmap/X/parent attribute.
>
> I really don't want any firmware memmap entries for something that is
> not part of the firmware provided memmap. In addition,
> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
> and two arm configs enable it at all.
>
> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.

 I think that's a policy decision and policy decisions do not belong in
 the kernel. Give the tooling the opportunity to decide whether System
 RAM stays that way over a kexec. The parenthetical 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 10:51 AM David Hildenbrand  wrote:
>
> On 01.05.20 19:45, David Hildenbrand wrote:
> > On 01.05.20 19:39, Dan Williams wrote:
> >> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  wrote:
> >>>
> >>> On 01.05.20 18:56, Dan Williams wrote:
>  On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  
>  wrote:
> >
> > On 01.05.20 00:24, Andrew Morton wrote:
> >> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand 
> >>  wrote:
> >>
> 
>  Why does the firmware map support hotplug entries?
> >>>
> >>> I assume:
> >>>
> >>> The firmware memmap was added primarily for x86-64 kexec (and still, 
> >>> is
> >>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When 
> >>> DIMMs
> >>> get hotplugged on real HW, they get added to e820. Same applies to
> >>> memory added via HyperV balloon (unless memory is unplugged via
> >>> ballooning and you reboot ... the the e820 is changed as well). I 
> >>> assume
> >>> we wanted to be able to reflect that, to make kexec look like a real 
> >>> reboot.
> >>>
> >>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
> >>>
> >>>
> >>> But I assume only Andrew can enlighten us.
> >>>
> >>> @Andrew, any guidance here? Should we really add all memory to the
> >>> firmware memmap, even if this contradicts with the existing
> >>> documentation? (especially, if the actual firmware memmap will *not*
> >>> contain that memory after a reboot)
> >>
> >> For some reason that patch is misattributed - it was authored by
> >> Shaohui Zheng , who hasn't been heard from in
> >> a decade.  I looked through the email discussion from that time and I'm
> >> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
> >> review comments.
> >
> > Okay, thanks for checking. I think the documentation from 2008 is pretty
> > clear what has to be done here. I will add some of these details to the
> > patch description.
> >
> > Also, now that I know that esp. kexec-tools already don't consider
> > dax/kmem memory properly (memory will not get dumped via kdump) and
> > won't really suffer from a name change in /proc/iomem, I will go back to
> > the MHP_DRIVER_MANAGED approach and
> > 1. Don't create firmware memmap entries
> > 2. Name the resource "System RAM (driver managed)"
> > 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
> >
> > This way, kernel users and user space can figure out that this memory
> > has different semantics and handle it accordingly - I think that was
> > what Eric was asking for.
> >
> > Of course, open for suggestions.
> 
>  I'm still more of a fan of this being communicated by "System RAM"
> >>>
> >>> I was mentioning somewhere in this thread that "System RAM" inside a
> >>> hierarchy (like dax/kmem) will already be basically ignored by
> >>> kexec-tools. So, placing it inside a hierarchy already makes it look
> >>> special already.
> >>>
> >>> But after all, as we have to change kexec-tools either way, we can
> >>> directly go ahead and flag it properly as special (in case there will
> >>> ever be other cases where we could no longer distinguish it).
> >>>
>  being parented especially because that tells you something about how
>  the memory is driver-managed and which mechanism might be in play.
> >>>
> >>> The could be communicated to some degree via the resource hierarchy.
> >>>
> >>> E.g.,
> >>>
> >>> [root@localhost ~]# cat /proc/iomem
> >>> ...
> >>> 14000-33fff : Persistent Memory
> >>>   14000-1481f : namespace0.0
> >>>   15000-33fff : dax0.0
> >>> 15000-33fff : System RAM (driver managed)
> >>>
> >>> vs.
> >>>
> >>>:/# cat /proc/iomem
> >>> [...]
> >>> 14000-333ff : virtio-mem (virtio0)
> >>>   14000-147ff : System RAM (driver managed)
> >>>   14800-14fff : System RAM (driver managed)
> >>>   15000-157ff : System RAM (driver managed)
> >>>
> >>> Good enough for my taste.
> >>>
>  What about adding an optional /sys/firmware/memmap/X/parent attribute.
> >>>
> >>> I really don't want any firmware memmap entries for something that is
> >>> not part of the firmware provided memmap. In addition,
> >>> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
> >>> and two arm configs enable it at all.
> >>>
> >>> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.
> >>
> >> I think that's a policy decision and policy decisions do not belong in
> >> the kernel. Give the tooling the opportunity to decide whether System
> >> RAM stays that way over a kexec. The parenthetical reference otherwise
> >> looks out of 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 19:45, David Hildenbrand wrote:
> On 01.05.20 19:39, Dan Williams wrote:
>> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  wrote:
>>>
>>> On 01.05.20 18:56, Dan Williams wrote:
 On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  wrote:
>
> On 01.05.20 00:24, Andrew Morton wrote:
>> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  
>> wrote:
>>

 Why does the firmware map support hotplug entries?
>>>
>>> I assume:
>>>
>>> The firmware memmap was added primarily for x86-64 kexec (and still, is
>>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
>>> get hotplugged on real HW, they get added to e820. Same applies to
>>> memory added via HyperV balloon (unless memory is unplugged via
>>> ballooning and you reboot ... the the e820 is changed as well). I assume
>>> we wanted to be able to reflect that, to make kexec look like a real 
>>> reboot.
>>>
>>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>>>
>>>
>>> But I assume only Andrew can enlighten us.
>>>
>>> @Andrew, any guidance here? Should we really add all memory to the
>>> firmware memmap, even if this contradicts with the existing
>>> documentation? (especially, if the actual firmware memmap will *not*
>>> contain that memory after a reboot)
>>
>> For some reason that patch is misattributed - it was authored by
>> Shaohui Zheng , who hasn't been heard from in
>> a decade.  I looked through the email discussion from that time and I'm
>> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
>> review comments.
>
> Okay, thanks for checking. I think the documentation from 2008 is pretty
> clear what has to be done here. I will add some of these details to the
> patch description.
>
> Also, now that I know that esp. kexec-tools already don't consider
> dax/kmem memory properly (memory will not get dumped via kdump) and
> won't really suffer from a name change in /proc/iomem, I will go back to
> the MHP_DRIVER_MANAGED approach and
> 1. Don't create firmware memmap entries
> 2. Name the resource "System RAM (driver managed)"
> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>
> This way, kernel users and user space can figure out that this memory
> has different semantics and handle it accordingly - I think that was
> what Eric was asking for.
>
> Of course, open for suggestions.

 I'm still more of a fan of this being communicated by "System RAM"
>>>
>>> I was mentioning somewhere in this thread that "System RAM" inside a
>>> hierarchy (like dax/kmem) will already be basically ignored by
>>> kexec-tools. So, placing it inside a hierarchy already makes it look
>>> special already.
>>>
>>> But after all, as we have to change kexec-tools either way, we can
>>> directly go ahead and flag it properly as special (in case there will
>>> ever be other cases where we could no longer distinguish it).
>>>
 being parented especially because that tells you something about how
 the memory is driver-managed and which mechanism might be in play.
>>>
>>> The could be communicated to some degree via the resource hierarchy.
>>>
>>> E.g.,
>>>
>>> [root@localhost ~]# cat /proc/iomem
>>> ...
>>> 14000-33fff : Persistent Memory
>>>   14000-1481f : namespace0.0
>>>   15000-33fff : dax0.0
>>> 15000-33fff : System RAM (driver managed)
>>>
>>> vs.
>>>
>>>:/# cat /proc/iomem
>>> [...]
>>> 14000-333ff : virtio-mem (virtio0)
>>>   14000-147ff : System RAM (driver managed)
>>>   14800-14fff : System RAM (driver managed)
>>>   15000-157ff : System RAM (driver managed)
>>>
>>> Good enough for my taste.
>>>
 What about adding an optional /sys/firmware/memmap/X/parent attribute.
>>>
>>> I really don't want any firmware memmap entries for something that is
>>> not part of the firmware provided memmap. In addition,
>>> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
>>> and two arm configs enable it at all.
>>>
>>> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.
>>
>> I think that's a policy decision and policy decisions do not belong in
>> the kernel. Give the tooling the opportunity to decide whether System
>> RAM stays that way over a kexec. The parenthetical reference otherwise
>> looks out of place to me in the /proc/iomem output. What makes it
>> "driver managed" is how the kernel handles it, not how the kernel
>> names it.
> 
> At least, virtio-mem is different. It really *has to be handled* by the
> driver. This is not a policy. It's how it works.
> 

Oh, and I don't see why "System RAM (driver 

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 19:39, Dan Williams wrote:
> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  wrote:
>>
>> On 01.05.20 18:56, Dan Williams wrote:
>>> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  wrote:

 On 01.05.20 00:24, Andrew Morton wrote:
> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  
> wrote:
>
>>>
>>> Why does the firmware map support hotplug entries?
>>
>> I assume:
>>
>> The firmware memmap was added primarily for x86-64 kexec (and still, is
>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
>> get hotplugged on real HW, they get added to e820. Same applies to
>> memory added via HyperV balloon (unless memory is unplugged via
>> ballooning and you reboot ... the the e820 is changed as well). I assume
>> we wanted to be able to reflect that, to make kexec look like a real 
>> reboot.
>>
>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>>
>>
>> But I assume only Andrew can enlighten us.
>>
>> @Andrew, any guidance here? Should we really add all memory to the
>> firmware memmap, even if this contradicts with the existing
>> documentation? (especially, if the actual firmware memmap will *not*
>> contain that memory after a reboot)
>
> For some reason that patch is misattributed - it was authored by
> Shaohui Zheng , who hasn't been heard from in
> a decade.  I looked through the email discussion from that time and I'm
> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
> review comments.

 Okay, thanks for checking. I think the documentation from 2008 is pretty
 clear what has to be done here. I will add some of these details to the
 patch description.

 Also, now that I know that esp. kexec-tools already don't consider
 dax/kmem memory properly (memory will not get dumped via kdump) and
 won't really suffer from a name change in /proc/iomem, I will go back to
 the MHP_DRIVER_MANAGED approach and
 1. Don't create firmware memmap entries
 2. Name the resource "System RAM (driver managed)"
 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.

 This way, kernel users and user space can figure out that this memory
 has different semantics and handle it accordingly - I think that was
 what Eric was asking for.

 Of course, open for suggestions.
>>>
>>> I'm still more of a fan of this being communicated by "System RAM"
>>
>> I was mentioning somewhere in this thread that "System RAM" inside a
>> hierarchy (like dax/kmem) will already be basically ignored by
>> kexec-tools. So, placing it inside a hierarchy already makes it look
>> special already.
>>
>> But after all, as we have to change kexec-tools either way, we can
>> directly go ahead and flag it properly as special (in case there will
>> ever be other cases where we could no longer distinguish it).
>>
>>> being parented especially because that tells you something about how
>>> the memory is driver-managed and which mechanism might be in play.
>>
>> The could be communicated to some degree via the resource hierarchy.
>>
>> E.g.,
>>
>> [root@localhost ~]# cat /proc/iomem
>> ...
>> 14000-33fff : Persistent Memory
>>   14000-1481f : namespace0.0
>>   15000-33fff : dax0.0
>> 15000-33fff : System RAM (driver managed)
>>
>> vs.
>>
>>:/# cat /proc/iomem
>> [...]
>> 14000-333ff : virtio-mem (virtio0)
>>   14000-147ff : System RAM (driver managed)
>>   14800-14fff : System RAM (driver managed)
>>   15000-157ff : System RAM (driver managed)
>>
>> Good enough for my taste.
>>
>>> What about adding an optional /sys/firmware/memmap/X/parent attribute.
>>
>> I really don't want any firmware memmap entries for something that is
>> not part of the firmware provided memmap. In addition,
>> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
>> and two arm configs enable it at all.
>>
>> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.
> 
> I think that's a policy decision and policy decisions do not belong in
> the kernel. Give the tooling the opportunity to decide whether System
> RAM stays that way over a kexec. The parenthetical reference otherwise
> looks out of place to me in the /proc/iomem output. What makes it
> "driver managed" is how the kernel handles it, not how the kernel
> names it.

At least, virtio-mem is different. It really *has to be handled* by the
driver. This is not a policy. It's how it works.

-- 
Thanks,

David / dhildenb



Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 10:21 AM David Hildenbrand  wrote:
>
> On 01.05.20 18:56, Dan Williams wrote:
> > On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  wrote:
> >>
> >> On 01.05.20 00:24, Andrew Morton wrote:
> >>> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  
> >>> wrote:
> >>>
> >
> > Why does the firmware map support hotplug entries?
> 
>  I assume:
> 
>  The firmware memmap was added primarily for x86-64 kexec (and still, is
>  mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
>  get hotplugged on real HW, they get added to e820. Same applies to
>  memory added via HyperV balloon (unless memory is unplugged via
>  ballooning and you reboot ... the the e820 is changed as well). I assume
>  we wanted to be able to reflect that, to make kexec look like a real 
>  reboot.
> 
>  This worked for a while. Then came dax/kmem. Now comes virtio-mem.
> 
> 
>  But I assume only Andrew can enlighten us.
> 
>  @Andrew, any guidance here? Should we really add all memory to the
>  firmware memmap, even if this contradicts with the existing
>  documentation? (especially, if the actual firmware memmap will *not*
>  contain that memory after a reboot)
> >>>
> >>> For some reason that patch is misattributed - it was authored by
> >>> Shaohui Zheng , who hasn't been heard from in
> >>> a decade.  I looked through the email discussion from that time and I'm
> >>> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
> >>> review comments.
> >>
> >> Okay, thanks for checking. I think the documentation from 2008 is pretty
> >> clear what has to be done here. I will add some of these details to the
> >> patch description.
> >>
> >> Also, now that I know that esp. kexec-tools already don't consider
> >> dax/kmem memory properly (memory will not get dumped via kdump) and
> >> won't really suffer from a name change in /proc/iomem, I will go back to
> >> the MHP_DRIVER_MANAGED approach and
> >> 1. Don't create firmware memmap entries
> >> 2. Name the resource "System RAM (driver managed)"
> >> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
> >>
> >> This way, kernel users and user space can figure out that this memory
> >> has different semantics and handle it accordingly - I think that was
> >> what Eric was asking for.
> >>
> >> Of course, open for suggestions.
> >
> > I'm still more of a fan of this being communicated by "System RAM"
>
> I was mentioning somewhere in this thread that "System RAM" inside a
> hierarchy (like dax/kmem) will already be basically ignored by
> kexec-tools. So, placing it inside a hierarchy already makes it look
> special already.
>
> But after all, as we have to change kexec-tools either way, we can
> directly go ahead and flag it properly as special (in case there will
> ever be other cases where we could no longer distinguish it).
>
> > being parented especially because that tells you something about how
> > the memory is driver-managed and which mechanism might be in play.
>
> The could be communicated to some degree via the resource hierarchy.
>
> E.g.,
>
> [root@localhost ~]# cat /proc/iomem
> ...
> 14000-33fff : Persistent Memory
>   14000-1481f : namespace0.0
>   15000-33fff : dax0.0
> 15000-33fff : System RAM (driver managed)
>
> vs.
>
>:/# cat /proc/iomem
> [...]
> 14000-333ff : virtio-mem (virtio0)
>   14000-147ff : System RAM (driver managed)
>   14800-14fff : System RAM (driver managed)
>   15000-157ff : System RAM (driver managed)
>
> Good enough for my taste.
>
> > What about adding an optional /sys/firmware/memmap/X/parent attribute.
>
> I really don't want any firmware memmap entries for something that is
> not part of the firmware provided memmap. In addition,
> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
> and two arm configs enable it at all.
>
> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.

I think that's a policy decision and policy decisions do not belong in
the kernel. Give the tooling the opportunity to decide whether System
RAM stays that way over a kexec. The parenthetical reference otherwise
looks out of place to me in the /proc/iomem output. What makes it
"driver managed" is how the kernel handles it, not how the kernel
names it.


Re: [PATCH v3 0/2] PCI/ERR: Allow Native AER/DPC using _OSC

2020-05-01 Thread Derrick, Jonathan
On Fri, 2020-05-01 at 12:16 -0500, Bjorn Helgaas wrote:
> On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> > Hi Bjorn & Kuppuswamy,
> > 
> > I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way 
> > to
> > determine if firmware supports _OSC DPC negotation, and therefore how to 
> > handle
> > DPC.
> > 
> > Here is the wording of the ECN that implies that Firmware without _OSC DPC
> > negotiation support should have the OSPM rely on _OSC AER negotiation when
> > determining DPC control:
> > 
> >   PCIe Base Specification suggests that Downstream Port Containment may be
> >   controlled either by the Firmware or the Operating System. It also 
> > suggests
> >   that the Firmware retain ownership of Downstream Port Containment if it 
> > also
> >   owns AER. When the Firmware owns Downstream Port Containment, it is 
> > expected
> >   to use the new "Error Disconnect Recover" notification to alert OSPM of a
> >   Downstream Port Containment event.
> > 
> > In legacy platforms, as bits in _OSC are reserved prior to implementation, 
> > ACPI
> > Root Bus enumeration will mark these Host Bridges as without Native DPC
> > support, even though the specification implies it's expected that AER _OSC
> > negotiation determines DPC control for these platforms. There seems to be a
> > need for a way to determine if the DPC control bit in _OSC is supported and
> > fallback on AER otherwise.
> > 
> > 
> > Currently portdrv assumes DPC control if the port has Native AER services:
> > 
> > static int get_port_device_capability(struct pci_dev *dev)
> > ...
> > if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
> > pci_aer_available() &&
> > (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
> > services |= PCIE_PORT_SERVICE_DPC;
> > 
> > Newer firmware may not grant OSPM DPC control, if for instance, it expects 
> > to
> > use Error Disconnect Recovery. However it looks like ACPI will use DPC 
> > services
> > via the EDR driver, without binding the full DPC port service driver.
> > 
> > 
> > If we change portdrv to probe based on host->native_dpc and not AER, then we
> > break instances with legacy firmware where OSPM will clear host->native_dpc
> > solely due to _OSC bits being reserved:
> > 
> > struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> > ...
> > if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> > host_bridge->native_dpc = 0;
> > 
> > 
> > 
> > So my assumption instead is that host->native_dpc can be 0 and expect Native
> > DPC services if AER is used. In other words, if and only if DPC probe is
> > invoked from portdrv, then it needs to rely on the AER dependency. 
> > Otherwise it
> > should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.
> > 
> > However it seems like that could be trouble with newer firmware that might 
> > give
> > OSPM control of AER but not DPC, and would result in both Native DPC and EDR
> > being in effect.
> > 
> > 
> > Anyways here are two patches that give control of AER and DPC on the 
> > results of
> > _OSC. They don't mess with the HEST parser as I expect those to be removed 
> > at
> > some point. I need these for VMD support which doesn't even rely on _OSC, 
> > but I
> > suspect this won't be the last effort as we detangle Firmware First.
> > 
> > [1] https://members.pcisig.com/wg/PCI-SIG/document/12888
> 
> Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
> from Alex and Sathy first, then see what needs to be done on top of
> those, so I'm going to push these off for a few days and they'll
> probably need a refresh.
> 
> Bjorn


Agreed, no need to merge now. Just wanted to bring up the DPC
ambiguity, which I think was first addressed by dpc-native:

commit 35a0b2378c199d4f26e458b2ca38ea56aaf2d9b8
Author: Olof Johansson 
Date:   Wed Oct 23 12:22:05 2019 -0700

PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control

Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
Linux handled DPC events regardless of whether firmware had granted it
ownership of AER or DPC, e.g., via _OSC.

PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
has control of AER.

On platforms that do not grant OS control of AER via _OSC, Linux DPC
handling worked before eed85ff4c0da7 but not after.

To make Linux DPC handling work on those platforms the same way they did
before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
handle DPC events regardless of whether it has control of AER.

[bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
Link: https://lore.kernel.org/r/20191023192205.97024-1-o...@lixom.net
Signed-off-by: Olof Johansson 
Signed-off-by: Bjorn Helgaas 


Thanks,
Jon


Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 18:56, Dan Williams wrote:
> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  wrote:
>>
>> On 01.05.20 00:24, Andrew Morton wrote:
>>> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  
>>> wrote:
>>>
>
> Why does the firmware map support hotplug entries?

 I assume:

 The firmware memmap was added primarily for x86-64 kexec (and still, is
 mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
 get hotplugged on real HW, they get added to e820. Same applies to
 memory added via HyperV balloon (unless memory is unplugged via
 ballooning and you reboot ... the the e820 is changed as well). I assume
 we wanted to be able to reflect that, to make kexec look like a real 
 reboot.

 This worked for a while. Then came dax/kmem. Now comes virtio-mem.


 But I assume only Andrew can enlighten us.

 @Andrew, any guidance here? Should we really add all memory to the
 firmware memmap, even if this contradicts with the existing
 documentation? (especially, if the actual firmware memmap will *not*
 contain that memory after a reboot)
>>>
>>> For some reason that patch is misattributed - it was authored by
>>> Shaohui Zheng , who hasn't been heard from in
>>> a decade.  I looked through the email discussion from that time and I'm
>>> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
>>> review comments.
>>
>> Okay, thanks for checking. I think the documentation from 2008 is pretty
>> clear what has to be done here. I will add some of these details to the
>> patch description.
>>
>> Also, now that I know that esp. kexec-tools already don't consider
>> dax/kmem memory properly (memory will not get dumped via kdump) and
>> won't really suffer from a name change in /proc/iomem, I will go back to
>> the MHP_DRIVER_MANAGED approach and
>> 1. Don't create firmware memmap entries
>> 2. Name the resource "System RAM (driver managed)"
>> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>>
>> This way, kernel users and user space can figure out that this memory
>> has different semantics and handle it accordingly - I think that was
>> what Eric was asking for.
>>
>> Of course, open for suggestions.
> 
> I'm still more of a fan of this being communicated by "System RAM"

I was mentioning somewhere in this thread that "System RAM" inside a
hierarchy (like dax/kmem) will already be basically ignored by
kexec-tools. So, placing it inside a hierarchy already makes it look
special already.

But after all, as we have to change kexec-tools either way, we can
directly go ahead and flag it properly as special (in case there will
ever be other cases where we could no longer distinguish it).

> being parented especially because that tells you something about how
> the memory is driver-managed and which mechanism might be in play.

The could be communicated to some degree via the resource hierarchy.

E.g.,

[root@localhost ~]# cat /proc/iomem
...
14000-33fff : Persistent Memory
  14000-1481f : namespace0.0
  15000-33fff : dax0.0
15000-33fff : System RAM (driver managed)

vs.

   :/# cat /proc/iomem
[...]
14000-333ff : virtio-mem (virtio0)
  14000-147ff : System RAM (driver managed)
  14800-14fff : System RAM (driver managed)
  15000-157ff : System RAM (driver managed)

Good enough for my taste.

> What about adding an optional /sys/firmware/memmap/X/parent attribute.

I really don't want any firmware memmap entries for something that is
not part of the firmware provided memmap. In addition,
/sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
and two arm configs enable it at all.

So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.

-- 
Thanks,

David / dhildenb



Re: [PATCH V1 00/10] Remove duplicated kmap code

2020-05-01 Thread Ira Weiny
On Fri, May 01, 2020 at 01:54:56AM -0700, Christoph Hellwig wrote:
> In addition to the work already it the series, it seems like
> LAST_PKMAP_MASK, PKMAP_ADDR and PKMAP_NR can also be consolidated
> to common code.

Agreed, I mentioned in the cover letter there are similarities...

> 
> Also kmap_atomic_high_prot / kmap_atomic_pfn could move into common
> code, maybe keyed off a symbol selected by the actual users that
> need it.  It also seems like it doesn't actually ever need to be
> exported.

...  but these are not as readily obvious, at least to me.  I do see a pattern
but the differences seemed subtle enough that it would take a while to ensure
correctness.  So I'd like to see this series go in and build on it.

> 
> This in turn would lead to being able to allow io_mapping_map_atomic_wc
> on all architectures, which might make nouveau and qxl happy, but maybe
> that can be left for another series.

I agree, that this should be follow on patches.  I still need to fix the
bisect-ability and I don't want to bog down 0-day with a longer series.

Thanks for the review!
Ira



Re: [PATCH v3 0/2] PCI/ERR: Allow Native AER/DPC using _OSC

2020-05-01 Thread Bjorn Helgaas
On Thu, Apr 30, 2020 at 12:46:07PM -0600, Jon Derrick wrote:
> Hi Bjorn & Kuppuswamy,
> 
> I see a problem in the DPC ECN [1] to _OSC in that it doesn't give us a way to
> determine if firmware supports _OSC DPC negotation, and therefore how to 
> handle
> DPC.
> 
> Here is the wording of the ECN that implies that Firmware without _OSC DPC
> negotiation support should have the OSPM rely on _OSC AER negotiation when
> determining DPC control:
> 
>   PCIe Base Specification suggests that Downstream Port Containment may be
>   controlled either by the Firmware or the Operating System. It also suggests
>   that the Firmware retain ownership of Downstream Port Containment if it also
>   owns AER. When the Firmware owns Downstream Port Containment, it is expected
>   to use the new "Error Disconnect Recover" notification to alert OSPM of a
>   Downstream Port Containment event.
> 
> In legacy platforms, as bits in _OSC are reserved prior to implementation, 
> ACPI
> Root Bus enumeration will mark these Host Bridges as without Native DPC
> support, even though the specification implies it's expected that AER _OSC
> negotiation determines DPC control for these platforms. There seems to be a
> need for a way to determine if the DPC control bit in _OSC is supported and
> fallback on AER otherwise.
> 
> 
> Currently portdrv assumes DPC control if the port has Native AER services:
> 
> static int get_port_device_capability(struct pci_dev *dev)
> ...
>   if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
>   pci_aer_available() &&
>   (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER)))
>   services |= PCIE_PORT_SERVICE_DPC;
> 
> Newer firmware may not grant OSPM DPC control, if for instance, it expects to
> use Error Disconnect Recovery. However it looks like ACPI will use DPC 
> services
> via the EDR driver, without binding the full DPC port service driver.
> 
> 
> If we change portdrv to probe based on host->native_dpc and not AER, then we
> break instances with legacy firmware where OSPM will clear host->native_dpc
> solely due to _OSC bits being reserved:
> 
> struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> ...
>   if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>   host_bridge->native_dpc = 0;
> 
> 
> 
> So my assumption instead is that host->native_dpc can be 0 and expect Native
> DPC services if AER is used. In other words, if and only if DPC probe is
> invoked from portdrv, then it needs to rely on the AER dependency. Otherwise 
> it
> should be assumed that ACPI set up DPC via EDR. This covers legacy firmware.
> 
> However it seems like that could be trouble with newer firmware that might 
> give
> OSPM control of AER but not DPC, and would result in both Native DPC and EDR
> being in effect.
> 
> 
> Anyways here are two patches that give control of AER and DPC on the results 
> of
> _OSC. They don't mess with the HEST parser as I expect those to be removed at
> some point. I need these for VMD support which doesn't even rely on _OSC, but 
> I
> suspect this won't be the last effort as we detangle Firmware First.
> 
> [1] https://members.pcisig.com/wg/PCI-SIG/document/12888

Hi Jon, I think we need to sort out the _OSC/FIRMWARE_FIRST patches
from Alex and Sathy first, then see what needs to be done on top of
those, so I'm going to push these off for a few days and they'll
probably need a refresh.

Bjorn


Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread Dan Williams
On Fri, May 1, 2020 at 2:34 AM David Hildenbrand  wrote:
>
> On 01.05.20 00:24, Andrew Morton wrote:
> > On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  
> > wrote:
> >
> >>>
> >>> Why does the firmware map support hotplug entries?
> >>
> >> I assume:
> >>
> >> The firmware memmap was added primarily for x86-64 kexec (and still, is
> >> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
> >> get hotplugged on real HW, they get added to e820. Same applies to
> >> memory added via HyperV balloon (unless memory is unplugged via
> >> ballooning and you reboot ... the the e820 is changed as well). I assume
> >> we wanted to be able to reflect that, to make kexec look like a real 
> >> reboot.
> >>
> >> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
> >>
> >>
> >> But I assume only Andrew can enlighten us.
> >>
> >> @Andrew, any guidance here? Should we really add all memory to the
> >> firmware memmap, even if this contradicts with the existing
> >> documentation? (especially, if the actual firmware memmap will *not*
> >> contain that memory after a reboot)
> >
> > For some reason that patch is misattributed - it was authored by
> > Shaohui Zheng , who hasn't been heard from in
> > a decade.  I looked through the email discussion from that time and I'm
> > not seeing anything useful.  But I wasn't able to locate Dave Hansen's
> > review comments.
>
> Okay, thanks for checking. I think the documentation from 2008 is pretty
> clear what has to be done here. I will add some of these details to the
> patch description.
>
> Also, now that I know that esp. kexec-tools already don't consider
> dax/kmem memory properly (memory will not get dumped via kdump) and
> won't really suffer from a name change in /proc/iomem, I will go back to
> the MHP_DRIVER_MANAGED approach and
> 1. Don't create firmware memmap entries
> 2. Name the resource "System RAM (driver managed)"
> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>
> This way, kernel users and user space can figure out that this memory
> has different semantics and handle it accordingly - I think that was
> what Eric was asking for.
>
> Of course, open for suggestions.

I'm still more of a fan of this being communicated by "System RAM"
being parented especially because that tells you something about how
the memory is driver-managed and which mechanism might be in play.
What about adding an optional /sys/firmware/memmap/X/parent attribute.
This lets tooling check if it cares via that interface and lets it
lookup the related infrastructure to interact with if it would do
something different for virtio-mem vs dax/kmem?


Re: sparc-related comment, to Re: [PATCH V1 07/10] arch/kmap: Ensure kmap_prot visibility

2020-05-01 Thread Ira Weiny
On Fri, May 01, 2020 at 01:44:46AM -0700, Christoph Hellwig wrote:
> > --- a/arch/sparc/mm/highmem.c
> > +++ b/arch/sparc/mm/highmem.c
> > @@ -33,6 +33,7 @@
> >  #include 
> >  
> >  pgprot_t kmap_prot;
> > +EXPORT_SYMBOL(kmap_prot);
> 
> Btw, I don't see why sparc needs this as a variable, as there is just
> a single assignment to it.

Because sparc uses non-standard defines which I'm not familiar with.

kmap_prot = __pgprot(SRMMU_ET_PTE | SRMMU_PRIV | SRMMU_CACHE);

SRMMU_ET_PTE and friends are defined in 

arch/sparc/include/asm/pgtsrmmu.h

Since I can't readily test sparc this was easier to put out than let 0-day
crank on the entire series checking if including that header in the common
header chain would be an issue.

> 
> If sparc is sorted out we can always make it a define, and use a define
> for kmap_prot that defaults to PAGE_KERNEL, avoiding a little
> more duplication.

Agreed.  But it seems easier as a follow up (for me with 0-day).  Perhaps
someone from sparc can weigh in on the specifics of those defines and why they
are different from the normal ones?  Or even provide a follow on patch?

Ira



[PATCH v2] powerpc/ima: fix secure boot rules in ima arch policy

2020-05-01 Thread Nayna Jain
To prevent verifying the kernel module appended signature twice
(finit_module), once by the module_sig_check() and again by IMA, powerpc
secure boot rules define an IMA architecture specific policy rule
only if CONFIG_MODULE_SIG_FORCE is not enabled. This, unfortunately, does
not take into account the ability of enabling "sig_enforce" on the boot
command line (module.sig_enforce=1).

Including the IMA module appraise rule results in failing the finit_module
syscall, unless the module signing public key is loaded onto the IMA
keyring.

This patch fixes secure boot policy rules to be based on CONFIG_MODULE_SIG
instead.

Fixes: 4238fad366a6 ("powerpc/ima: Add support to initialize ima policy rules")
Signed-off-by: Nayna Jain 
---
v2:
* Fixes the patch description to specify the problem more clearly as asked 
by Michael Ellerman.

 arch/powerpc/kernel/ima_arch.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/ima_arch.c b/arch/powerpc/kernel/ima_arch.c
index e34116255ced..957abd592075 100644
--- a/arch/powerpc/kernel/ima_arch.c
+++ b/arch/powerpc/kernel/ima_arch.c
@@ -19,12 +19,12 @@ bool arch_ima_get_secureboot(void)
  * to be stored as an xattr or as an appended signature.
  *
  * To avoid duplicate signature verification as much as possible, the IMA
- * policy rule for module appraisal is added only if CONFIG_MODULE_SIG_FORCE
+ * policy rule for module appraisal is added only if CONFIG_MODULE_SIG
  * is not enabled.
  */
 static const char *const secure_rules[] = {
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist 
appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist 
appraise_type=imasig|modsig",
 #endif
NULL
@@ -50,7 +50,7 @@ static const char *const secure_and_trusted_rules[] = {
"measure func=KEXEC_KERNEL_CHECK template=ima-modsig",
"measure func=MODULE_CHECK template=ima-modsig",
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist 
appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist 
appraise_type=imasig|modsig",
 #endif
NULL
-- 
2.18.1



[powerpc:topic/uaccess] BUILD REGRESSION 5bb3b9d986426296507d3ef58d1e5fe4625de01f

2020-05-01 Thread kbuild test robot
   holly_defconfig
powerpc   ppc64_defconfig
powerpc  rhel-kconfig
powerpc   allnoconfig
powerpc  mpc866_ads_defconfig
powerpcamigaone_defconfig
powerpcadder875_defconfig
powerpc ep8248e_defconfig
powerpc  g5_defconfig
powerpc mpc512x_defconfig
m68k randconfig-a001-20200501
mips randconfig-a001-20200501
nds32randconfig-a001-20200501
alpharandconfig-a001-20200501
parisc   randconfig-a001-20200501
riscvrandconfig-a001-20200501
microblaze   randconfig-a001-20200430
nios2randconfig-a001-20200430
h8300randconfig-a001-20200430
c6x  randconfig-a001-20200430
sparc64  randconfig-a001-20200430
s390 randconfig-a001-20200430
xtensa   randconfig-a001-20200430
csky randconfig-a001-20200430
openrisc randconfig-a001-20200430
sh   randconfig-a001-20200430
s390 randconfig-a001-20200501
xtensa   randconfig-a001-20200501
sh   randconfig-a001-20200501
openrisc randconfig-a001-20200501
csky randconfig-a001-20200501
i386 randconfig-b001-20200430
i386 randconfig-b002-20200430
x86_64   randconfig-b001-20200430
i386 randconfig-b003-20200430
x86_64   randconfig-b002-20200430
x86_64   randconfig-b003-20200430
i386 randconfig-b003-20200501
x86_64   randconfig-b002-20200501
i386 randconfig-b001-20200501
x86_64   randconfig-b003-20200501
x86_64   randconfig-b001-20200501
i386 randconfig-b002-20200501
x86_64   randconfig-c001-20200501
x86_64   randconfig-c002-20200501
i386 randconfig-c002-20200501
x86_64   randconfig-c003-20200501
i386 randconfig-c001-20200501
i386 randconfig-c003-20200501
x86_64   randconfig-c001-20200430
i386 randconfig-c001-20200430
i386 randconfig-c002-20200430
x86_64   randconfig-c002-20200430
x86_64   randconfig-c003-20200430
i386 randconfig-c003-20200430
x86_64   randconfig-d001-20200501
i386 randconfig-d003-20200501
x86_64   randconfig-d003-20200501
i386 randconfig-d001-20200501
x86_64   randconfig-d002-20200501
i386 randconfig-d002-20200501
x86_64   randconfig-d002-20200430
x86_64   randconfig-d001-20200430
i386 randconfig-d001-20200430
i386 randconfig-d003-20200430
i386 randconfig-d002-20200430
x86_64   randconfig-d003-20200430
x86_64   randconfig-e002-20200430
i386 randconfig-e003-20200430
x86_64   randconfig-e003-20200430
i386 randconfig-e002-20200430
x86_64   randconfig-e001-20200430
i386 randconfig-e001-20200430
x86_64   randconfig-e002-20200501
x86_64   randconfig-e003-20200501
i386 randconfig-e003-20200501
x86_64   randconfig-e001-20200501
i386 randconfig-e002-20200501
i386 randconfig-e001-20200501
x86_64   randconfig-f001-20200430
i386 randconfig-f002-20200430
i386 randconfig-f003-20200430
i386 randconfig-f001-20200430
x86_64   randconfig-f003-20200430
i386 randconfig-f003-20200501
x86_64   randconfig-f001-20200501
x86_64   randconfig-f003-20200501
i386 randconfig-f001-20200501
i386 randconfig-f002-20200501
i386 randconfig-g003-20200501
i386 randconfig-g002-20200501
x86_64   randconfig-g002-20200501
i386 randconfig-g001-20200501
x86_64   randconfig-a003-20200501
x86_64   randconfig-a001-20200501
i386 randconfig-a003-20200501
i386 randconfig-a002-20200501
i386 randconfig-a001-20200501
i386 randconfig-h001-20200501
i386 randconfig-h002-20200501
i386 randconfig-h003-20200501
x86_64   randconfig-h001-20200501
x86_64   randconfig-h003-20200501
i386 randconfig-h002-20200430
i386 randconfig-h003-20200430
x86_64   randconfig-h001-20200430
x86_64   randconfig-h003-20200430
i386 randconfig-h001-20200430
sparcrandconfig-a001-20200430
arc

Re: [PATCH 2/3] ASoC: fsl_esai: Add support for imx8qm

2020-05-01 Thread Mark Brown
On Fri, May 01, 2020 at 04:12:05PM +0800, Shengjiu Wang wrote:
> The difference for esai on imx8qm is that DMA device is EDMA.
> 
> EDMA requires the period size to be multiple of maxburst. Otherwise
> the remaining bytes are not transferred and thus noise is produced.

If this constraint comes from the DMA controller then normally you'd
expect the DMA controller integration to be enforcing this - is there no
information in the DMA API that lets us know that this constraint is
there?


signature.asc
Description: PGP signature


Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP

2020-05-01 Thread David Hildenbrand
On 01.05.20 00:24, Andrew Morton wrote:
> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand  wrote:
> 
>>>
>>> Why does the firmware map support hotplug entries?
>>
>> I assume:
>>
>> The firmware memmap was added primarily for x86-64 kexec (and still, is
>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
>> get hotplugged on real HW, they get added to e820. Same applies to
>> memory added via HyperV balloon (unless memory is unplugged via
>> ballooning and you reboot ... the the e820 is changed as well). I assume
>> we wanted to be able to reflect that, to make kexec look like a real reboot.
>>
>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>>
>>
>> But I assume only Andrew can enlighten us.
>>
>> @Andrew, any guidance here? Should we really add all memory to the
>> firmware memmap, even if this contradicts with the existing
>> documentation? (especially, if the actual firmware memmap will *not*
>> contain that memory after a reboot)
> 
> For some reason that patch is misattributed - it was authored by
> Shaohui Zheng , who hasn't been heard from in
> a decade.  I looked through the email discussion from that time and I'm
> not seeing anything useful.  But I wasn't able to locate Dave Hansen's
> review comments.

Okay, thanks for checking. I think the documentation from 2008 is pretty
clear what has to be done here. I will add some of these details to the
patch description.

Also, now that I know that esp. kexec-tools already don't consider
dax/kmem memory properly (memory will not get dumped via kdump) and
won't really suffer from a name change in /proc/iomem, I will go back to
the MHP_DRIVER_MANAGED approach and
1. Don't create firmware memmap entries
2. Name the resource "System RAM (driver managed)"
3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.

This way, kernel users and user space can figure out that this memory
has different semantics and handle it accordingly - I think that was
what Eric was asking for.

Of course, open for suggestions.

-- 
Thanks,

David / dhildenb



Re: [PATCH V1 00/10] Remove duplicated kmap code

2020-05-01 Thread Christoph Hellwig
In addition to the work already it the series, it seems like
LAST_PKMAP_MASK, PKMAP_ADDR and PKMAP_NR can also be consolidated
to common code.

Also kmap_atomic_high_prot / kmap_atomic_pfn could move into common
code, maybe keyed off a symbol selected by the actual users that
need it.  It also seems like it doesn't actually ever need to be
exported.

This in turn would lead to being able to allow io_mapping_map_atomic_wc
on all architectures, which might make nouveau and qxl happy, but maybe
that can be left for another series.


Re: [PATCH V1 10/10] drm: Remove drm specific kmap_atomic code

2020-05-01 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 09/10] arch/kmap: Define kmap_atomic_prot() for all arch's

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:44PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> To support kmap_atomic_prot(), all architectures need to support
> protections passed to their kmap_atomic_high() function.  Pass
> protections into kmap_atomic_high() and change the name to
> kmap_atomic_high_prot() to match.
> 
> Then define kmap_atomic_prot() as a core function which calls
> kmap_atomic_high_prot() when needed.
> 
> Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with
> the default kmap_prot exported by the architectures.

Looks good,

Reviewed-by: Christoph Hellwig 

But can you also consolidate the kmap_atomic_high_prot and
kunmap_atomic_high in linux/highmem.h instead of keeping the duplicates
in all arch headers?

> 
> Signed-off-by: Ira Weiny 
> ---
>  arch/arc/include/asm/highmem.h| 2 +-
>  arch/arc/mm/highmem.c | 6 +++---
>  arch/arm/include/asm/highmem.h| 2 +-
>  arch/arm/mm/highmem.c | 6 +++---
>  arch/csky/include/asm/highmem.h   | 2 +-
>  arch/csky/mm/highmem.c| 6 +++---
>  arch/microblaze/include/asm/highmem.h | 7 +--
>  arch/microblaze/mm/highmem.c  | 4 ++--
>  arch/mips/include/asm/highmem.h   | 2 +-
>  arch/mips/mm/highmem.c| 6 +++---
>  arch/nds32/include/asm/highmem.h  | 2 +-
>  arch/nds32/mm/highmem.c   | 6 +++---
>  arch/powerpc/include/asm/highmem.h| 8 +---
>  arch/powerpc/mm/highmem.c | 4 ++--
>  arch/sparc/include/asm/highmem.h  | 2 +-
>  arch/sparc/mm/highmem.c   | 6 +++---
>  arch/x86/include/asm/highmem.h| 6 +-
>  arch/x86/mm/highmem_32.c  | 4 ++--
>  arch/xtensa/include/asm/highmem.h | 2 +-
>  arch/xtensa/mm/highmem.c  | 6 +++---
>  include/linux/highmem.h   | 5 +++--
>  21 files changed, 40 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/arc/include/asm/highmem.h b/arch/arc/include/asm/highmem.h
> index e16531495620..09f86bde6809 100644
> --- a/arch/arc/include/asm/highmem.h
> +++ b/arch/arc/include/asm/highmem.h
> @@ -30,7 +30,7 @@
>  
>  #include 
>  
> -extern void *kmap_atomic_high(struct page *page);
> +extern void *kmap_atomic_high_prot(struct page *page, pgprot_t prot);
>  extern void kunmap_atomic_high(void *kvaddr);
>  
>  extern void kmap_init(void);
> diff --git a/arch/arc/mm/highmem.c b/arch/arc/mm/highmem.c
> index 5d3eab4ac0b0..479b0d72d3cf 100644
> --- a/arch/arc/mm/highmem.c
> +++ b/arch/arc/mm/highmem.c
> @@ -49,7 +49,7 @@
>  extern pte_t * pkmap_page_table;
>  static pte_t * fixmap_page_table;
>  
> -void *kmap_atomic_high(struct page *page)
> +void *kmap_atomic_high_prot(struct page *page, pgprot_t prot)
>  {
>   int idx, cpu_idx;
>   unsigned long vaddr;
> @@ -59,11 +59,11 @@ void *kmap_atomic_high(struct page *page)
>   vaddr = FIXMAP_ADDR(idx);
>  
>   set_pte_at(_mm, vaddr, fixmap_page_table + idx,
> -mk_pte(page, kmap_prot));
> +mk_pte(page, prot));
>  
>   return (void *)vaddr;
>  }
> -EXPORT_SYMBOL(kmap_atomic_high);
> +EXPORT_SYMBOL(kmap_atomic_high_prot);
>  
>  void kunmap_atomic_high(void *kv)
>  {
> diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h
> index a9d5e9bce1cc..e35f2f73f6aa 100644
> --- a/arch/arm/include/asm/highmem.h
> +++ b/arch/arm/include/asm/highmem.h
> @@ -60,7 +60,7 @@ static inline void *kmap_high_get(struct page *page)
>   * when CONFIG_HIGHMEM is not set.
>   */
>  #ifdef CONFIG_HIGHMEM
> -extern void *kmap_atomic_high(struct page *page);
> +extern void *kmap_atomic_high_prot(struct page *page, pgprot_t prot);
>  extern void kunmap_atomic_high(void *kvaddr);
>  extern void *kmap_atomic_pfn(unsigned long pfn);
>  #endif
> diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c
> index ac8394655a6e..e013f6b81328 100644
> --- a/arch/arm/mm/highmem.c
> +++ b/arch/arm/mm/highmem.c
> @@ -31,7 +31,7 @@ static inline pte_t get_fixmap_pte(unsigned long vaddr)
>   return *ptep;
>  }
>  
> -void *kmap_atomic_high(struct page *page)
> +void *kmap_atomic_high_prot(struct page *page, pgprot_t prot)
>  {
>   unsigned int idx;
>   unsigned long vaddr;
> @@ -67,11 +67,11 @@ void *kmap_atomic_high(struct page *page)
>* in place, so the contained TLB flush ensures the TLB is updated
>* with the new mapping.
>*/
> - set_fixmap_pte(idx, mk_pte(page, kmap_prot));
> + set_fixmap_pte(idx, mk_pte(page, prot));
>  
>   return (void *)vaddr;
>  }
> -EXPORT_SYMBOL(kmap_atomic_high);
> +EXPORT_SYMBOL(kmap_atomic_high_prot);
>  
>  void kunmap_atomic_high(void *kvaddr)
>  {
> diff --git a/arch/csky/include/asm/highmem.h b/arch/csky/include/asm/highmem.h
> index 5bbbe59e60a9..59854c7ccf78 100644
> --- a/arch/csky/include/asm/highmem.h
> +++ b/arch/csky/include/asm/highmem.h
> @@ -32,7 +32,7 @@ extern pte_t *pkmap_page_table;
>  
>  #define ARCH_HAS_KMAP_FLUSH_TLB
>  extern void 

Re: [PATCH V1 08/10] arch/kmap: Don't hard code kmap_prot values

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:43PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> To support kmap_atomic_prot() on all architectures each arch must
> support protections passed in to them.
> 
> Change csky, mips, nds32 and xtensa to use their global kmap_prot value
> rather than a hard coded value which was equal.

Looks good,

Reviewed-by: Christoph Hellwig 


sparc-related comment, to Re: [PATCH V1 07/10] arch/kmap: Ensure kmap_prot visibility

2020-05-01 Thread Christoph Hellwig
> --- a/arch/sparc/mm/highmem.c
> +++ b/arch/sparc/mm/highmem.c
> @@ -33,6 +33,7 @@
>  #include 
>  
>  pgprot_t kmap_prot;
> +EXPORT_SYMBOL(kmap_prot);

Btw, I don't see why sparc needs this as a variable, as there is just
a single assignment to it.

If sparc is sorted out we can always make it a define, and use a define
for kmap_prot that defaults to PAGE_KERNEL, avoiding a little
more duplication.


Re: [PATCH V1 06/10] arch/kunmap_atomic: Consolidate duplicate code

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:41PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Every single architecture (including !CONFIG_HIGHMEM) calls...
> 
>   pagefault_enable();
>   preempt_enable();
> 
> ... before returning from __kunmap_atomic().  Lift this code into the
> kunmap_atomic() macro.
> 
> While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
> be consistent.
> 
> Signed-off-by: Ira Weiny 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 05/10] arch/kmap_atomic: Consolidate duplicate code

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:40PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Every arch has the same code to ensure atomic operations and a check for
> !HIGHMEM page.
> 
> Remove the duplicate code by defining a core kmap_atomic() which only
> calls the arch specific kmap_atomic_high() when the page is high memory.
> 
> Signed-off-by: Ira Weiny 

Looks good:

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 04/10] arch/kunmap: Remove duplicate kunmap implementations

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:39PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> All architectures do exactly the same thing for kunmap(); remove all the
> duplicate definitions and lift the call to the core.
> 
> This also has the benefit of changing kmap_unmap() on a number of
> architectures to be an inline call rather than an actual function.
> 
> Signed-off-by: Ira Weiny 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 03/10] arch/kmap: Remove redundant arch specific kmaps

2020-05-01 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 02/10] arch/xtensa: Move kmap build bug out of the way

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:37PM -0700, ira.we...@intel.com wrote:
> @@ -88,6 +88,11 @@ void __init kmap_init(void)
>  {
>   unsigned long kmap_vstart;
>  
> + /* Check if this memory layout is broken because PKMAP overlaps
> +  * page table.
> +  */
> + BUILD_BUG_ON(PKMAP_BASE <
> +  TLBTEMP_BASE_1 + TLBTEMP_SIZE);

This can fit on a single line.  Otherwise looks good:

Reviewed-by: Christoph Hellwig 


Re: [PATCH V1 01/10] arch/kmap: Remove BUG_ON()

2020-05-01 Thread Christoph Hellwig
On Thu, Apr 30, 2020 at 01:38:36PM -0700, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Replace the use of BUG_ON(in_interrupt()) in the kmap() and kunmap()
> in favor of might_sleep().
> 
> Besides the benefits of might_sleep(), this normalizes the
> implementations such that they can be made generic in subsequent
> patches.
> 
> Reviewed-by: Dan Williams 
> Signed-off-by: Ira Weiny 

Looks good,

Reviewed-by: Christoph Hellwig 


[PATCH 1/3] ASoC: fsl_esai: introduce SoC specific data

2020-05-01 Thread Shengjiu Wang
Introduce a SoC specific data structure which contains the
differences between the different SoCs.
This makes it easier to support more differences without having
to introduce a new if/else each time.

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_esai.c | 46 
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index 84290be778f0..bac65ba7fbad 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -21,6 +21,17 @@
SNDRV_PCM_FMTBIT_S20_3LE | \
SNDRV_PCM_FMTBIT_S24_LE)
 
+/**
+ * fsl_esai_soc_data: soc specific data
+ *
+ * @imx: for imx platform
+ * @reset_at_xrun: flags for enable reset operaton
+ */
+struct fsl_esai_soc_data {
+   bool imx;
+   bool reset_at_xrun;
+};
+
 /**
  * fsl_esai: ESAI private data
  *
@@ -33,6 +44,7 @@
  * @fsysclk: system clock source to derive HCK, SCK and FS
  * @spbaclk: SPBA clock (optional, depending on SoC design)
  * @task: tasklet to handle the reset operation
+ * @soc: soc specific data
  * @lock: spin lock between hw_reset() and trigger()
  * @fifo_depth: depth of tx/rx FIFO
  * @slot_width: width of each DAI slot
@@ -44,7 +56,6 @@
  * @sck_div: if using PSR/PM dividers for SCKx clock
  * @slave_mode: if fully using DAI slave mode
  * @synchronous: if using tx/rx synchronous mode
- * @reset_at_xrun: flags for enable reset operaton
  * @name: driver name
  */
 struct fsl_esai {
@@ -57,6 +68,7 @@ struct fsl_esai {
struct clk *fsysclk;
struct clk *spbaclk;
struct tasklet_struct task;
+   const struct fsl_esai_soc_data *soc;
spinlock_t lock; /* Protect hw_reset and trigger */
u32 fifo_depth;
u32 slot_width;
@@ -70,10 +82,24 @@ struct fsl_esai {
bool sck_div[2];
bool slave_mode;
bool synchronous;
-   bool reset_at_xrun;
char name[32];
 };
 
+static struct fsl_esai_soc_data fsl_esai_vf610 = {
+   .imx = false,
+   .reset_at_xrun = true,
+};
+
+static struct fsl_esai_soc_data fsl_esai_imx35 = {
+   .imx = true,
+   .reset_at_xrun = true,
+};
+
+static struct fsl_esai_soc_data fsl_esai_imx6ull = {
+   .imx = true,
+   .reset_at_xrun = false,
+};
+
 static irqreturn_t esai_isr(int irq, void *devid)
 {
struct fsl_esai *esai_priv = (struct fsl_esai *)devid;
@@ -85,7 +111,7 @@ static irqreturn_t esai_isr(int irq, void *devid)
regmap_read(esai_priv->regmap, REG_ESAI_SAISR, );
 
if ((saisr & (ESAI_SAISR_TUE | ESAI_SAISR_ROE)) &&
-   esai_priv->reset_at_xrun) {
+   esai_priv->soc->reset_at_xrun) {
dev_dbg(>dev, "reset module for xrun\n");
regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR,
   ESAI_xCR_xEIE_MASK, 0);
@@ -936,9 +962,11 @@ static int fsl_esai_probe(struct platform_device *pdev)
esai_priv->pdev = pdev;
snprintf(esai_priv->name, sizeof(esai_priv->name), "%pOFn", np);
 
-   if (of_device_is_compatible(np, "fsl,vf610-esai") ||
-   of_device_is_compatible(np, "fsl,imx35-esai"))
-   esai_priv->reset_at_xrun = true;
+   esai_priv->soc = of_device_get_match_data(>dev);
+   if (!esai_priv->soc) {
+   dev_err(>dev, "failed to get soc data\n");
+   return -ENODEV;
+   }
 
/* Get the addresses and IRQ */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1063,9 +1091,9 @@ static int fsl_esai_remove(struct platform_device *pdev)
 }
 
 static const struct of_device_id fsl_esai_dt_ids[] = {
-   { .compatible = "fsl,imx35-esai", },
-   { .compatible = "fsl,vf610-esai", },
-   { .compatible = "fsl,imx6ull-esai", },
+   { .compatible = "fsl,imx35-esai", .data = _esai_imx35 },
+   { .compatible = "fsl,vf610-esai", .data = _esai_vf610 },
+   { .compatible = "fsl,imx6ull-esai", .data = _esai_imx6ull },
{}
 };
 MODULE_DEVICE_TABLE(of, fsl_esai_dt_ids);
-- 
2.21.0



[PATCH 2/3] ASoC: fsl_esai: Add support for imx8qm

2020-05-01 Thread Shengjiu Wang
The difference for esai on imx8qm is that DMA device is EDMA.

EDMA requires the period size to be multiple of maxburst. Otherwise
the remaining bytes are not transferred and thus noise is produced.

We can handle this issue by adding a constraint on
SNDRV_PCM_HW_PARAM_PERIOD_SIZE to be multiple of tx/rx maxburst value.

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_esai.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index bac65ba7fbad..61b5c0bde788 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -26,10 +26,12 @@
  *
  * @imx: for imx platform
  * @reset_at_xrun: flags for enable reset operaton
+ * @use_edma: edma is used.
  */
 struct fsl_esai_soc_data {
bool imx;
bool reset_at_xrun;
+   bool use_edma;
 };
 
 /**
@@ -88,16 +90,25 @@ struct fsl_esai {
 static struct fsl_esai_soc_data fsl_esai_vf610 = {
.imx = false,
.reset_at_xrun = true,
+   .use_edma = false,
 };
 
 static struct fsl_esai_soc_data fsl_esai_imx35 = {
.imx = true,
.reset_at_xrun = true,
+   .use_edma = false,
 };
 
 static struct fsl_esai_soc_data fsl_esai_imx6ull = {
.imx = true,
.reset_at_xrun = false,
+   .use_edma = false,
+};
+
+static struct fsl_esai_soc_data fsl_esai_imx8qm = {
+   .imx = true,
+   .reset_at_xrun = false,
+   .use_edma = true,
 };
 
 static irqreturn_t esai_isr(int irq, void *devid)
@@ -513,6 +524,7 @@ static int fsl_esai_startup(struct snd_pcm_substream 
*substream,
struct snd_soc_dai *dai)
 {
struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
+   bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
 
if (!dai->active) {
/* Set synchronous mode */
@@ -527,6 +539,12 @@ static int fsl_esai_startup(struct snd_pcm_substream 
*substream,
   ESAI_xCCR_xDC_MASK, ESAI_xCCR_xDC(2));
}
 
+   if (esai_priv->soc->use_edma)
+   snd_pcm_hw_constraint_step(substream->runtime, 0,
+  SNDRV_PCM_HW_PARAM_PERIOD_SIZE,
+  tx ? 
esai_priv->dma_params_tx.maxburst :
+  esai_priv->dma_params_rx.maxburst);
+
return 0;
 
 }
@@ -1094,6 +1112,7 @@ static const struct of_device_id fsl_esai_dt_ids[] = {
{ .compatible = "fsl,imx35-esai", .data = _esai_imx35 },
{ .compatible = "fsl,vf610-esai", .data = _esai_vf610 },
{ .compatible = "fsl,imx6ull-esai", .data = _esai_imx6ull },
+   { .compatible = "fsl,imx8qm-esai", .data = _esai_imx8qm },
{}
 };
 MODULE_DEVICE_TABLE(of, fsl_esai_dt_ids);
-- 
2.21.0



[PATCH 3/3] ASoC: fsl_esai: Add new compatible string for imx8qm

2020-05-01 Thread Shengjiu Wang
Add new compatible string "fsl,imx8qm-esai" in the binding document.

Signed-off-by: Shengjiu Wang 
---
 Documentation/devicetree/bindings/sound/fsl,esai.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/sound/fsl,esai.txt 
b/Documentation/devicetree/bindings/sound/fsl,esai.txt
index 0e6e2166f76c..0a2480aeecf0 100644
--- a/Documentation/devicetree/bindings/sound/fsl,esai.txt
+++ b/Documentation/devicetree/bindings/sound/fsl,esai.txt
@@ -12,6 +12,7 @@ Required properties:
  "fsl,imx35-esai",
  "fsl,vf610-esai",
  "fsl,imx6ull-esai",
+ "fsl,imx8qm-esai",
 
   - reg: Offset and length of the register set for the 
device.
 
-- 
2.21.0



[PATCH 0/3] ASoC: fsl_esai: Add support for imx8qm

2020-05-01 Thread Shengjiu Wang
Add support for imx8qm.

Shengjiu Wang (3):
  ASoC: fsl_esai: introduce SoC specific data
  ASoC: fsl_esai: Add support for imx8qm
  ASoC: fsl_esai: Add new compatible string for imx8qm

 .../devicetree/bindings/sound/fsl,esai.txt|  1 +
 sound/soc/fsl/fsl_esai.c  | 65 ---
 2 files changed, 57 insertions(+), 9 deletions(-)

-- 
2.21.0



Re: [PATCH V1 10/10] drm: Remove drm specific kmap_atomic code

2020-05-01 Thread Christian König

Am 30.04.20 um 22:38 schrieb ira.we...@intel.com:

From: Ira Weiny 

kmap_atomic_prot() is now exported by all architectures.  Use this
function rather than open coding a driver specific kmap_atomic.

Signed-off-by: Ira Weiny 


Ah, yes looking into this once more this was on my TODO list for quite a 
while as well.


Patch is Reviewed-by: Christian König , feel 
free to push it upstream through whatever channel you like or ping me if 
I should pick it up into drm-misc-next.


Regards,
Christian.


---
  drivers/gpu/drm/ttm/ttm_bo_util.c| 56 ++--
  drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 16 
  include/drm/ttm/ttm_bo_api.h |  4 --
  3 files changed, 12 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 52d2b71f1588..f09b096ba4fd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -257,54 +257,6 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned 
long page)
return 0;
  }
  
-#ifdef CONFIG_X86

-#define __ttm_kmap_atomic_prot(__page, __prot) kmap_atomic_prot(__page, __prot)
-#define __ttm_kunmap_atomic(__addr) kunmap_atomic(__addr)
-#else
-#define __ttm_kmap_atomic_prot(__page, __prot) vmap(&__page, 1, 0,  __prot)
-#define __ttm_kunmap_atomic(__addr) vunmap(__addr)
-#endif
-
-
-/**
- * ttm_kmap_atomic_prot - Efficient kernel map of a single page with
- * specified page protection.
- *
- * @page: The page to map.
- * @prot: The page protection.
- *
- * This function maps a TTM page using the kmap_atomic api if available,
- * otherwise falls back to vmap. The user must make sure that the
- * specified page does not have an aliased mapping with a different caching
- * policy unless the architecture explicitly allows it. Also mapping and
- * unmapping using this api must be correctly nested. Unmapping should
- * occur in the reverse order of mapping.
- */
-void *ttm_kmap_atomic_prot(struct page *page, pgprot_t prot)
-{
-   if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
-   return kmap_atomic(page);
-   else
-   return __ttm_kmap_atomic_prot(page, prot);
-}
-EXPORT_SYMBOL(ttm_kmap_atomic_prot);
-
-/**
- * ttm_kunmap_atomic_prot - Unmap a page that was mapped using
- * ttm_kmap_atomic_prot.
- *
- * @addr: The virtual address from the map.
- * @prot: The page protection.
- */
-void ttm_kunmap_atomic_prot(void *addr, pgprot_t prot)
-{
-   if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL))
-   kunmap_atomic(addr);
-   else
-   __ttm_kunmap_atomic(addr);
-}
-EXPORT_SYMBOL(ttm_kunmap_atomic_prot);
-
  static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
unsigned long page,
pgprot_t prot)
@@ -316,13 +268,13 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void 
*src,
return -ENOMEM;
  
  	src = (void *)((unsigned long)src + (page << PAGE_SHIFT));

-   dst = ttm_kmap_atomic_prot(d, prot);
+   dst = kmap_atomic_prot(d, prot);
if (!dst)
return -ENOMEM;
  
  	memcpy_fromio(dst, src, PAGE_SIZE);
  
-	ttm_kunmap_atomic_prot(dst, prot);

+   kunmap_atomic(dst);
  
  	return 0;

  }
@@ -338,13 +290,13 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void 
*dst,
return -ENOMEM;
  
  	dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT));

-   src = ttm_kmap_atomic_prot(s, prot);
+   src = kmap_atomic_prot(s, prot);
if (!src)
return -ENOMEM;
  
  	memcpy_toio(dst, src, PAGE_SIZE);
  
-	ttm_kunmap_atomic_prot(src, prot);

+   kunmap_atomic(src);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
index bb46ca0c458f..94d456a1d1a9 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -374,12 +374,12 @@ static int vmw_bo_cpu_blit_line(struct 
vmw_bo_blit_line_data *d,
copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
  
  		if (unmap_src) {

-   ttm_kunmap_atomic_prot(d->src_addr, d->src_prot);
+   kunmap_atomic(d->src_addr);
d->src_addr = NULL;
}
  
  		if (unmap_dst) {

-   ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot);
+   kunmap_atomic(d->dst_addr);
d->dst_addr = NULL;
}
  
@@ -388,8 +388,8 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d,

return -EINVAL;
  
  			d->dst_addr =

-   ttm_kmap_atomic_prot(d->dst_pages[dst_page],
-d->dst_prot);
+   kmap_atomic_prot(d->dst_pages[dst_page],
+d->dst_prot);