Re: [PATCH v4 0/3] Use dma_default_coherent for devicetree default coherency

2023-04-06 Thread Christoph Hellwig
Thanks,

applied to the dma-mapping tree for 6.4.


[PATCH] powerpc/boot: Fix crt0.S current address branch form

2023-04-06 Thread Nicholas Piggin
Use the preferred form of branch-and-link for finding the current
address so objtool doesn't think it is an unannotated intra-function
call.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/boot/crt0.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S
index 44544720daae..121cab9d579b 100644
--- a/arch/powerpc/boot/crt0.S
+++ b/arch/powerpc/boot/crt0.S
@@ -51,7 +51,7 @@ _zimage_start:
 _zimage_start_lib:
/* Work out the offset between the address we were linked at
   and the address where we're running. */
-   bl  .+4
+   bcl 20,31,.+4
 p_base: mflr    r10     /* r10 now points to runtime addr of p_base */
 #ifndef __powerpc64__
/* grab the link address of the dynamic section in r11 */
@@ -274,7 +274,7 @@ prom:
mtsrr1  r10
 
/* Load FW address, set LR to label 1, and jump to FW */
-   bl  0f
+   bcl 20,31,0f
 0: mflr    r10
    addi    r11,r10,(1f-0b)
    mtlr    r11
-- 
2.40.0



[PATCH] powerpc/boot: Fix boot wrapper code generation with CONFIG_POWER10_CPU

2023-04-06 Thread Nicholas Piggin
-mcpu=power10 will generate prefixed and pcrel code by default, which
we do not support. The general kernel disables these with cflags, but
those were missed for the boot wrapper.

Reported-by: Danny Tsen 
Fixes: 4b2a9315f20d9 ("powerpc/64s: POWER10 CPU Kconfig build option")
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/boot/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 295f76df13b5..13fad4f0a6d8 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -34,6 +34,8 @@ endif
 
 BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx \
+$(call cc-option,-mno-prefixed) $(call cc-option,-mno-pcrel) \
+$(call cc-option,-mno-mma) \
 $(call cc-option,-mno-spe) $(call cc-option,-mspe=no) \
 -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc \
 $(LINUXINCLUDE)
-- 
2.40.0



Re: [PATCH] powerpc/64: Always build with 128-bit long double

2023-04-06 Thread Hamza Mahfooz



On 4/4/23 06:28, Michael Ellerman wrote:

The amdgpu driver builds some of its code with hard-float enabled,
whereas the rest of the kernel is built with soft-float.

When building with 64-bit long double, if soft-float and hard-float
objects are linked together, the build fails due to incompatible ABI
tags.

In the past there have been build errors in the amdgpu driver caused by
this. Some of those were due to bad intermingling of soft & hard-float
code, but those issues have all been fixed since commit c92b7fe0d92a
("drm/amd/display: move remaining FPU code to dml folder").

However it's still possible for soft & hard-float objects to end up
linked together, if the amdgpu driver is built-in to the kernel along
with the test_emulate_step.c code, which uses soft-float. That happens
in an allyesconfig build.

Currently those build errors are avoided because the amdgpu driver is
gated on 128-bit long double being enabled. But that's not a detail the
amdgpu driver should need to be aware of, and if another driver starts
using hard-float the same problem would occur.

All versions of the 64-bit ABI specify that long-double is 128-bits.
However some compilers, notably the kernel.org ones, are built to use
64-bit long double by default.

Apart from this issue of soft vs hard-float, the kernel doesn't care
what size long double is. In particular the kernel using 128-bit long
double doesn't impact userspace's ability to use 64-bit long double, as
musl does.

So always build the 64-bit kernel with 128-bit long double. That should
avoid any build errors due to the incompatible ABI tags. Excluding the
code that uses soft/hard-float, the vmlinux is identical with/without
the flag.
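
To see which case a given toolchain falls into, here is a minimal sketch of the kind of probe the removed Kconfig symbol performed. It assumes a GCC- or Clang-style compiler; `__LONG_DOUBLE_128__` is the predefined macro the Kconfig test looks for, while `long_double_is_128()` and `report_long_double()` are hypothetical helper names used only for illustration.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler predefines __LONG_DOUBLE_128__. */
static int long_double_is_128(void)
{
#ifdef __LONG_DOUBLE_128__
	return 1;
#else
	return 0;
#endif
}

/* Print what the compiler actually chose for long double. */
static void report_long_double(void)
{
	printf("__LONG_DOUBLE_128__ defined: %d\n", long_double_is_128());
	printf("sizeof(long double) = %zu, LDBL_MANT_DIG = %d\n",
	       sizeof(long double), LDBL_MANT_DIG);
}
```

On a powerpc64 compiler, building this with and without -mlong-double-128 should show whether the default matches the ABI; on other architectures the macro is normally absent and the output is only informational.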

It does mean any code which is incorrectly intermingling soft &
hard-float code will build without error, so those bugs will need to be
caught by testing rather than at build time.

For more background see:
   - commit d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine")
   - commit c653c591789b ("drm/amdgpu: Re-enable DCN for 64-bit powerpc")
   - https://lore.kernel.org/r/dab9cbd8-2626-4b99-8098-31fe76397...@app.fastmail.com

Signed-off-by: Michael Ellerman 


Reviewed-by: Hamza Mahfooz 

If you'd prefer to have this go through the amdgpu branch, please let
me know.


---
  arch/powerpc/Kconfig| 4 
  arch/powerpc/Makefile   | 1 +
  drivers/gpu/drm/amd/display/Kconfig | 2 +-
  3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index fc4e81dafca7..3fb2c2766139 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -291,10 +291,6 @@ config PPC
# Please keep this list sorted alphabetically.
#
  
-config PPC_LONG_DOUBLE_128
-   depends on PPC64 && ALTIVEC
-   def_bool $(success,test "$(shell,echo __LONG_DOUBLE_128__ | $(CC) -E -P -)" = 1)
-
  config PPC_BARRIER_NOSPEC
bool
default y
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 12447b2361e4..4343cca57cb3 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -133,6 +133,7 @@ endif
  endif
  CFLAGS-$(CONFIG_PPC64)+= $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc))
  CFLAGS-$(CONFIG_PPC64)+= $(call cc-option,-mno-pointers-to-nested-functions)
+CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mlong-double-128)
  
  # Clang unconditionally reserves r2 on ppc32 and does not support the flag

  # https://bugs.llvm.org/show_bug.cgi?id=39555
diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 0c9bd0a53e60..e36261d546af 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-   select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128 || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
+   select DRM_AMD_DC_DCN if (X86 || (PPC64 && ALTIVEC) || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
help
  Choose this option if you want to use the new display engine
  support for AMDGPU. This adds required support for Vega and

--
Hamza



Re: [PATCHv2 pci-next 2/2] PCI/AER: Rate limit the reporting of the correctable errors

2023-04-06 Thread Bjorn Helgaas
On Fri, Mar 17, 2023 at 10:51:09AM -0700, Grant Grundler wrote:
> From: Rajat Khandelwal 
> 
> There are many instances where correctable errors tend to inundate
> the message buffer. We observe such instances during thunderbolt PCIe
> tunneling.
> 
> It's true that they are mitigated by the hardware and are non-fatal
> but we shouldn't be spamming the logs with such correctable errors as it
> confuses other kernel developers less familiar with PCI errors, support
> staff, and users who happen to look at the logs, hence rate limit them.
> 
> A typical example log inside an HP TBT4 dock:
> [54912.661142] pcieport :00:07.0: AER: Multiple Corrected error received: :2b:00.0
> [54912.661194] igc :2b:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> [54912.661203] igc :2b:00.0:   device [8086:5502] error status/mask=1100/2000
> [54912.661211] igc :2b:00.0:[ 8] Rollover
> [54912.661219] igc :2b:00.0:[12] Timeout
> [54982.838760] pcieport :00:07.0: AER: Corrected error received: :2b:00.0
> [54982.838798] igc :2b:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> [54982.838808] igc :2b:00.0:   device [8086:5502] error status/mask=1000/2000
> [54982.838817] igc :2b:00.0:[12] Timeout

The timestamps don't contribute to understanding the problem, so we
can omit them.

> This gets repeated continuously, thus inundating the buffer.
> 
> Signed-off-by: Rajat Khandelwal 
> Signed-off-by: Grant Grundler 
> ---
>  drivers/pci/pcie/aer.c | 42 --
>  1 file changed, 28 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index cb6b96233967..b592cea8bffe 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -706,8 +706,8 @@ static void __aer_print_error(struct pci_dev *dev,
>   errmsg = "Unknown Error Bit";
>  
>   if (info->severity == AER_CORRECTABLE)
> - pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
> - info->first_error == i ? " (First)" : "");
> + pci_info_ratelimited(dev, "   [%2d] %-22s%s\n", i, errmsg,
> +  info->first_error == i ? " (First)" : "");

I don't think this is going to reliably work the way we want.  We have
a bunch of pci_info_ratelimited() calls, and each caller has its own
ratelimit_state data.  Unless we call pci_info_ratelimited() exactly
the same number of times for each error, the ratelimit counters will
get out of sync and we'll end up printing fragments from error A mixed
with fragments from error B.

I think we need to explicitly manage the ratelimiting ourselves,
similar to print_hmi_event_info() or print_extlog_rcd().  Then we can
have a *single* ratelimit_state, and we can check it once to determine
whether to log this correctable error.
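
A minimal userspace sketch of that idea, to make the "check once, then print everything or nothing" shape concrete. Everything here is an illustrative assumption: `struct demo_ratelimit` and `demo_ratelimit_ok()` are simplified stand-ins for the kernel's `struct ratelimit_state` and `__ratelimit()`, not their real implementation, and the interval and burst values are arbitrary.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for struct ratelimit_state (illustrative only). */
struct demo_ratelimit {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* events allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* events allowed so far in this window */
};

/* Returns true if the caller may log now. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* One state shared by every line of a correctable-error report. */
static struct demo_ratelimit cor_rs = { .interval = 5, .burst = 2 };

/*
 * Check the limit once per error, then print all lines or none, so
 * fragments of different errors can never be interleaved.
 * Returns the number of lines emitted.
 */
static int demo_print_correctable(long now, int nr_status_bits)
{
	int lines = 0;

	if (!demo_ratelimit_ok(&cor_rs, now))
		return 0;

	printf("PCIe Bus Error: severity=Corrected\n");
	lines++;
	for (int i = 0; i < nr_status_bits; i++) {
		printf("   [%2d] status bit\n", i);
		lines++;
	}
	return lines;
}
```

Because the decision is taken once per error, the number of status bits in a given error no longer affects which errors get logged.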

>   else
>   pci_err(dev, "   [%2d] %-22s%s\n", i, errmsg,
>   info->first_error == i ? " (First)" : "");
> @@ -719,7 +719,6 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>  {
>   int layer, agent;
>   int id = ((dev->bus->number << 8) | dev->devfn);
> - const char *level;
>  
>   if (!info->status) {
>   pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, 
> (Unregistered Agent ID)\n",
> @@ -730,14 +729,21 @@ void aer_print_error(struct pci_dev *dev, struct 
> aer_err_info *info)
>   layer = AER_GET_LAYER_ERROR(info->severity, info->status);
>   agent = AER_GET_AGENT(info->severity, info->status);
>  
> - level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
> + if (info->severity == AER_CORRECTABLE) {
> + pci_info_ratelimited(dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
> +  aer_error_severity_string[info->severity],
> +  aer_error_layer[layer], aer_agent_string[agent]);
>  
> - pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
> -aer_error_severity_string[info->severity],
> -aer_error_layer[layer], aer_agent_string[agent]);
> + pci_info_ratelimited(dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> +  dev->vendor, dev->device, info->status, info->mask);
> + } else {
> + pci_err(dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
> + aer_error_severity_string[info->severity],
> + aer_error_layer[layer], aer_agent_string[agent]);
>  
> - pci_printk(level, dev, "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> -dev->vendor, dev->device, info->status, info->mask);
> + pci_err(dev, "  device [%04x:%04x] error 

Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 05:51:52PM +0200, David Hildenbrand wrote:
> On 06.04.23 17:02, Peter Zijlstra wrote:

> > DavidH, what do you think about reviving Jann's patches here:
> > 
> >https://bugs.chromium.org/p/project-zero/issues/detail?id=2365#c1
> > 
> > Those are far more invasive, but afaict they seem to do the right thing.
> > 
> 
> I recall seeing those while discussed on secur...@kernel.org. What we
> currently have was (IMHO for good reasons) deemed better to fix the issue,
> especially when caring about backports and getting it right.

Yes, and I think that was the right call. However, we can now revisit
without having the pressure of a known defect and backport
considerations.

> The alternative that was discussed in that context IIRC was to simply
> allocate a fresh page table, place the fresh page table into the list
> instead, and simply free the old page table (then using common machinery).
> 
> TBH, I'd wish (and recently raised) that we could just stop wasting memory
> on page tables for THPs that are maybe never going to get PTE-mapped ... and
> eventually just allocate on demand (with some caching?) and handle the
> places where we're OOM and cannot PTE-map a THP in some decent way.
> 
> ... instead of trying to figure out how to deal with these page tables we
> cannot free but have to special-case simply because of GUP-fast.

Not keeping them around sounds good to me, but I'm not *that* familiar
with the THP code, most of that happened after I stopped tracking mm. So
I'm not sure how feasible it is.

But it does look entirely feasible to rework this page-table freeing
along the lines Jann did.


[PATCH] powerpc/32: Include thread_info.h in head_booke.h

2023-04-06 Thread Nathan Chancellor
When building with W=1 after commit 80b6093b55e3 ("kbuild: add -Wundef
to KBUILD_CPPFLAGS for W=1 builds"), the following warning occurs.

  In file included from arch/powerpc/kvm/bookehv_interrupts.S:26:
  arch/powerpc/kvm/../kernel/head_booke.h:20:6: warning: "THREAD_SHIFT" is not defined, evaluates to 0 [-Wundef]
 20 | #if (THREAD_SHIFT < 15)
|  ^~~~

THREAD_SHIFT is defined in thread_info.h but it is not directly included
in head_booke.h, so it is possible for THREAD_SHIFT to be undefined. Add
the include to ensure that THREAD_SHIFT is always defined.
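
A small standalone sketch of the behaviour -Wundef is warning about. `DEMO_SHIFT` and `demo_branch_taken()` are hypothetical names chosen for this example; the point is only that the C preprocessor treats an unknown identifier in `#if` as 0.

```c
#include <stdio.h>

/*
 * DEMO_SHIFT is deliberately never defined. In an #if expression the
 * preprocessor replaces an unknown identifier with 0, so the test below
 * is evaluated as (0 < 15) and the first branch is taken silently
 * unless -Wundef is enabled.
 */
#if (DEMO_SHIFT < 15)
#define DEMO_BRANCH_TAKEN 1
#else
#define DEMO_BRANCH_TAKEN 0
#endif

/* Reports which preprocessor branch was selected at build time. */
static int demo_branch_taken(void)
{
	return DEMO_BRANCH_TAKEN;
}
```

With the macro's defining header included first, the comparison uses the real value instead of the silent 0, which is what adding the include achieves for THREAD_SHIFT.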

Reported-by: kernel test robot 
Link: https://lore.kernel.org/202304050954.yskldczh-...@intel.com/
Signed-off-by: Nathan Chancellor 
---
 arch/powerpc/kernel/head_booke.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 37d43c172676..b6b5b01a173c 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -5,6 +5,7 @@
 #include /* for STACK_FRAME_REGS_MARKER */
 #include 
 #include 
+#include/* for THREAD_SHIFT */
 
 #ifdef __ASSEMBLY__
 

---
base-commit: b0bbe5a2915201e3231e788d716d39dc54493b03
change-id: 20230406-wundef-thread_shift_booke-e08d806ed656

Best regards,
-- 
Nathan Chancellor 



Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread David Hildenbrand

On 06.04.23 17:02, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 04:04:23PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
> > > On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
> > >
> > > > > To actually hit this path you're doing something really dodgy.
> > > >
> > > > Apparently khugepaged is using the same infrastructure:
> > > >
> > > > $ grep tlb_remove_table khugepaged.c
> > > > tlb_remove_table_sync_one();
> > > > tlb_remove_table_sync_one();
> > > >
> > > > So just enabling khugepaged will hit that path.
> > >
> > > Urgh, WTF..
> > >
> > > Let me go read that stuff :/
> >
> > At the very least the one on collapse_and_free_pmd() could easily become
> > a call_rcu() based free.
> >
> > I'm not sure I'm following what collapse_huge_page() does just yet.
>
> DavidH, what do you think about reviving Jann's patches here:
>
>   https://bugs.chromium.org/p/project-zero/issues/detail?id=2365#c1
>
> Those are far more invasive, but afaict they seem to do the right thing.


I recall seeing those while discussed on secur...@kernel.org. What we 
currently have was (IMHO for good reasons) deemed better to fix the 
issue, especially when caring about backports and getting it right.


The alternative that was discussed in that context IIRC was to simply 
allocate a fresh page table, place the fresh page table into the list 
instead, and simply free the old page table (then using common machinery).


TBH, I'd wish (and recently raised) that we could just stop wasting 
memory on page tables for THPs that are maybe never going to get 
PTE-mapped ... and eventually just allocate on demand (with some 
caching?) and handle the places where we're OOM and cannot PTE-map a THP 
in some decent way.


... instead of trying to figure out how to deal with these page tables 
we cannot free but have to special-case simply because of GUP-fast.


--
Thanks,

David / dhildenb



Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 04:42:02PM +0200, David Hildenbrand wrote:
> On 06.04.23 16:04, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
> > > On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
> > > 
> > > > > To actually hit this path you're doing something really dodgy.
> > > > 
> > > > Apparently khugepaged is using the same infrastructure:
> > > > 
> > > > $ grep tlb_remove_table khugepaged.c
> > > > tlb_remove_table_sync_one();
> > > > tlb_remove_table_sync_one();
> > > > 
> > > > So just enabling khugepaged will hit that path.
> > > 
> > > Urgh, WTF..
> > > 
> > > Let me go read that stuff :/
> > 
> > At the very least the one on collapse_and_free_pmd() could easily become
> > a call_rcu() based free.
> > 
> > I'm not sure I'm following what collapse_huge_page() does just yet.
> 
> It wants to replace a leaf page table by a THP (Transparent Huge Page mapped
> by a PMD). So we want to rip out a leaf page table while other code
> (GUP-fast) might still be walking it. 

Right, I got that far.

> In contrast to freeing the page table,
> we put it into a list where it can be reused when having to PTE-map a THP
> again.

Yeah, this is the bit I couldn't find, that code is a bit of a maze.

> Now, similar to after freeing the page table, someone else could reuse that
> page table and modify it.

So ideally we'll RCU free the page instead of sticking it on that list.


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 04:04:23PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
> > 
> > > > To actually hit this path you're doing something really dodgy.
> > > 
> > > Apparently khugepaged is using the same infrastructure:
> > > 
> > > $ grep tlb_remove_table khugepaged.c 
> > >   tlb_remove_table_sync_one();
> > >   tlb_remove_table_sync_one();
> > > 
> > > So just enabling khugepaged will hit that path.
> > 
> > Urgh, WTF..
> > 
> > Let me go read that stuff :/
> 
> At the very least the one on collapse_and_free_pmd() could easily become
> a call_rcu() based free.
> 
> I'm not sure I'm following what collapse_huge_page() does just yet.

DavidH, what do you think about reviving Jann's patches here:

  https://bugs.chromium.org/p/project-zero/issues/detail?id=2365#c1

Those are far more invasive, but afaict they seem to do the right thing.


Re: [PATCH v2 09/19] arch/mips: Implement with generic helpers

2023-04-06 Thread Arnd Bergmann
On Thu, Apr 6, 2023, at 16:30, Thomas Zimmermann wrote:
> Replace the architecture's fb_is_primary_device() with the generic
> one from . No functional changes.
>
> Signed-off-by: Thomas Zimmermann 
> Cc: Thomas Bogendoerfer 

I think you should at least mention that the existing
fb_pgprotect() function is probably incorrect and should
be replaced with the generic version.

For reference, the fb_pgprotect function using pgprot_uncached()
was introduced in 2.6.22 along with all the other ones, but
the pgprot_writecombine function was only added in commit
4b050ba7a66c ("MIPS: pgtable.h: Implement the pgprot_writecombine
function for MIPS") for 3.18.

 Arnd


[PATCH 1/4] powerpc/64: Mark prep_irq_for_idle() __cpuidle

2023-04-06 Thread Michael Ellerman
Code in the idle path is not allowed to be instrumented because RCU is
disabled, see commit 0e985e9d2286 ("cpuidle: Add comments about
noinstr/__cpuidle usage").

Mark prep_irq_for_idle() __cpuidle, which is equivalent to noinstr, to
enforce that.

Suggested-by: Peter Zijlstra 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/irq_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/irq_64.c b/arch/powerpc/kernel/irq_64.c
index c788c55512ed..2ab0e8d84c1d 100644
--- a/arch/powerpc/kernel/irq_64.c
+++ b/arch/powerpc/kernel/irq_64.c
@@ -354,7 +354,7 @@ EXPORT_SYMBOL(arch_local_irq_restore);
  * disabled and marked as such, so the local_irq_enable() call
  * in arch_cpu_idle() will properly re-enable everything.
  */
-bool prep_irq_for_idle(void)
+__cpuidle bool prep_irq_for_idle(void)
 {
/*
 * First we need to hard disable to ensure no interrupt
-- 
2.39.2



[PATCH 2/4] powerpc/64: Don't call trace_hardirqs_on() in prep_irq_for_idle()

2023-04-06 Thread Michael Ellerman
Since commit a01353cf1896 ("cpuidle: Fix ct_idle_*() usage"), the
cpuidle entry code calls trace_hardirqs_on() (actually
trace_hardirqs_on_prepare()) in ct_cpuidle_enter() before calling into
the cpuidle driver.

Suggested-by: Peter Zijlstra 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/irq_64.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/irq_64.c b/arch/powerpc/kernel/irq_64.c
index 2ab0e8d84c1d..938e66829eae 100644
--- a/arch/powerpc/kernel/irq_64.c
+++ b/arch/powerpc/kernel/irq_64.c
@@ -348,9 +348,8 @@ EXPORT_SYMBOL(arch_local_irq_restore);
  * already the case when ppc_md.power_save is called). The function
  * will return whether to enter power save or just return.
  *
- * In the former case, it will have notified lockdep of interrupts
- * being re-enabled and generally sanitized the lazy irq state,
- * and in the latter case it will leave with interrupts hard
+ * In the former case, it will have generally sanitized the lazy irq
+ * state, and in the latter case it will leave with interrupts hard
  * disabled and marked as such, so the local_irq_enable() call
  * in arch_cpu_idle() will properly re-enable everything.
  */
@@ -370,9 +369,6 @@ __cpuidle bool prep_irq_for_idle(void)
if (lazy_irq_pending())
return false;
 
-   /* Tell lockdep we are about to re-enable */
-   trace_hardirqs_on();
-
/*
 * Mark interrupts as soft-enabled and clear the
 * PACA_IRQ_HARD_DIS from the pending mask since we
-- 
2.39.2



[PATCH 4/4] powerpc/pseries: Always inline functions called from cpuidle

2023-04-06 Thread Michael Ellerman
Code in the idle path is not allowed to be instrumented because RCU is
disabled, see commit 0e985e9d2286 ("cpuidle: Add comments about
noinstr/__cpuidle usage").

Force inlining of the inline functions called from cpuidle, to ensure
they are not emitted out-of-line and then available for tracing.
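
As a rough userspace illustration of the difference: plain `inline` is only a hint, so the compiler may still emit a separate, traceable copy of the function, whereas the always_inline attribute forces the body into the caller. The macro `demo_always_inline` and the `demo_*` functions below are hypothetical stand-ins, written under the assumption of a GCC- or Clang-compatible compiler; they are not the kernel's definitions.

```c
#include <stdint.h>

/* Stand-in for the kernel's __always_inline (GCC/Clang attribute). */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Stands in for a per-CPU snapshot such as idle_entry_purr_snap. */
static uint64_t demo_snapshot;

/* Forced inline: no out-of-line symbol exists for a tracer to hook. */
static demo_always_inline void demo_snapshot_entry(uint64_t now)
{
	demo_snapshot = now;
}

static demo_always_inline uint64_t demo_cycles_since_entry(uint64_t now)
{
	return now - demo_snapshot;
}

/* Models an idle routine: record entry time, then account on exit. */
static uint64_t demo_idle(uint64_t enter, uint64_t exit)
{
	demo_snapshot_entry(enter);
	return demo_cycles_since_entry(exit);
}
```

Both helpers end up as part of `demo_idle()` itself, which mirrors the goal of keeping the helpers inside the non-instrumentable idle code.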

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/idle.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index accd1f50085a..00f360667391 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -9,17 +9,17 @@ DECLARE_PER_CPU(u64, idle_spurr_cycles);
 DECLARE_PER_CPU(u64, idle_entry_purr_snap);
 DECLARE_PER_CPU(u64, idle_entry_spurr_snap);
 
-static inline void snapshot_purr_idle_entry(void)
+static __always_inline void snapshot_purr_idle_entry(void)
 {
*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
 }
 
-static inline void snapshot_spurr_idle_entry(void)
+static __always_inline void snapshot_spurr_idle_entry(void)
 {
*this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
 }
 
-static inline void update_idle_purr_accounting(void)
+static __always_inline void update_idle_purr_accounting(void)
 {
u64 wait_cycles;
u64 in_purr = *this_cpu_ptr(&idle_entry_purr_snap);
@@ -29,7 +29,7 @@ static inline void update_idle_purr_accounting(void)
get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
 }
 
-static inline void update_idle_spurr_accounting(void)
+static __always_inline void update_idle_spurr_accounting(void)
 {
u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
@@ -37,7 +37,7 @@ static inline void update_idle_spurr_accounting(void)
*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
 }
 
-static inline void pseries_idle_prolog(void)
+static __always_inline void pseries_idle_prolog(void)
 {
ppc64_runlatch_off();
snapshot_purr_idle_entry();
@@ -49,7 +49,7 @@ static inline void pseries_idle_prolog(void)
get_lppaca()->idle = 1;
 }
 
-static inline void pseries_idle_epilog(void)
+static __always_inline void pseries_idle_epilog(void)
 {
update_idle_purr_accounting();
update_idle_spurr_accounting();
-- 
2.39.2



[PATCH 3/4] cpuidle: pseries: Mark ->enter() functions as __cpuidle

2023-04-06 Thread Michael Ellerman
Code in the idle path is not allowed to be instrumented because RCU is
disabled, see commit 0e985e9d2286 ("cpuidle: Add comments about
noinstr/__cpuidle usage").

Mark the cpuidle ->enter() callbacks as __cpuidle and use the
raw_local_irq_*() routines to ensure that is the case.

Reported-by: Sachin Sant 
Suggested-by: Peter Zijlstra 
Signed-off-by: Michael Ellerman 
---
 drivers/cpuidle/cpuidle-pseries.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 1bad4d2b7be3..a7d33f3ee01e 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -33,16 +33,16 @@ static struct cpuidle_state *cpuidle_state_table __read_mostly;
 static u64 snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;
 
-static int snooze_loop(struct cpuidle_device *dev,
-   struct cpuidle_driver *drv,
-   int index)
+static __cpuidle
+int snooze_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv,
+   int index)
 {
u64 snooze_exit_time;
 
set_thread_flag(TIF_POLLING_NRFLAG);
 
pseries_idle_prolog();
-   local_irq_enable();
+   raw_local_irq_enable();
snooze_exit_time = get_tb() + snooze_timeout;
dev->poll_time_limit = false;
 
@@ -65,14 +65,14 @@ static int snooze_loop(struct cpuidle_device *dev,
HMT_medium();
clear_thread_flag(TIF_POLLING_NRFLAG);
 
-   local_irq_disable();
+   raw_local_irq_disable();
 
pseries_idle_epilog();
 
return index;
 }
 
-static void check_and_cede_processor(void)
+static __cpuidle void check_and_cede_processor(void)
 {
/*
 * Ensure our interrupt state is properly tracked,
@@ -216,9 +216,9 @@ static int __init parse_cede_parameters(void)
 #define NR_DEDICATED_STATES 2 /* snooze, CEDE */
 static u8 cede_latency_hint[NR_DEDICATED_STATES];
 
-static int dedicated_cede_loop(struct cpuidle_device *dev,
-   struct cpuidle_driver *drv,
-   int index)
+static __cpuidle
+int dedicated_cede_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv,
+   int index)
 {
u8 old_latency_hint;
 
@@ -230,7 +230,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
HMT_medium();
check_and_cede_processor();
 
-   local_irq_disable();
+   raw_local_irq_disable();
get_lppaca()->donate_dedicated_cpu = 0;
get_lppaca()->cede_latency_hint = old_latency_hint;
 
@@ -239,9 +239,9 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
return index;
 }
 
-static int shared_cede_loop(struct cpuidle_device *dev,
-   struct cpuidle_driver *drv,
-   int index)
+static __cpuidle
+int shared_cede_loop(struct cpuidle_device *dev, struct cpuidle_driver *drv,
+int index)
 {
 
pseries_idle_prolog();
@@ -255,7 +255,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 */
check_and_cede_processor();
 
-   local_irq_disable();
+   raw_local_irq_disable();
pseries_idle_epilog();
 
return index;
-- 
2.39.2



Re: [PATCH v2 02/19] arch/arc: Implement with generic helpers

2023-04-06 Thread Arnd Bergmann
On Thu, Apr 6, 2023, at 16:30, Thomas Zimmermann wrote:
> +
>  static inline void fb_pgprotect(struct file *file, struct vm_area_struct 
> *vma,
>   unsigned long off)
>  {
>   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>  }
> +#define fb_pgprotect fb_pgprotect

I still feel that for architectures like arc that don't have
pgprot_writecombine(), it would be best to go with the
generic implementation that currently behaves the exact
same way. If pgprot_writecombine() gets added in the future,
it would cause the architecture to behave as expected rather
than introducing the same bug that mips has.
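
To spell out why the generic version "behaves the exact same way" today, here is a small userspace model of the fallback. All `demo_*` names and the bit values are hypothetical; the sketch only assumes that an architecture without its own write-combine helper gets it aliased to the uncached one.

```c
/* Toy protection type and bits, for illustration only. */
typedef unsigned long demo_pgprot_t;
#define DEMO_PROT_NONCACHED	0x1UL
#define DEMO_PROT_WRITECOMBINE	0x2UL

static inline demo_pgprot_t demo_pgprot_noncached(demo_pgprot_t prot)
{
	return prot | DEMO_PROT_NONCACHED;
}

/*
 * An architecture that gains write-combining later would add:
 *   #define demo_pgprot_writecombine(prot) ((prot) | DEMO_PROT_WRITECOMBINE)
 * Until then, the generic fallback aliases it to the uncached helper.
 */
#ifndef demo_pgprot_writecombine
#define demo_pgprot_writecombine demo_pgprot_noncached
#endif

/* Generic framebuffer helper: always asks for write-combining. */
static inline demo_pgprot_t demo_fb_pgprotect(demo_pgprot_t prot)
{
	return demo_pgprot_writecombine(prot);
}
```

With the alias in place the generic helper yields uncached mappings now, and would pick up write-combining automatically once the architecture defines it.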

  Arnd


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread David Hildenbrand

On 06.04.23 16:04, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
> >
> > > > To actually hit this path you're doing something really dodgy.
> > >
> > > Apparently khugepaged is using the same infrastructure:
> > >
> > > $ grep tlb_remove_table khugepaged.c
> > > tlb_remove_table_sync_one();
> > > tlb_remove_table_sync_one();
> > >
> > > So just enabling khugepaged will hit that path.
> >
> > Urgh, WTF..
> >
> > Let me go read that stuff :/
>
> At the very least the one on collapse_and_free_pmd() could easily become
> a call_rcu() based free.
>
> I'm not sure I'm following what collapse_huge_page() does just yet.


It wants to replace a leaf page table by a THP (Transparent Huge Page 
mapped by a PMD). So we want to rip out a leaf page table while other 
code (GUP-fast) might still be walking it. In contrast to freeing the 
page table, we put it into a list where it can be reused when having to 
PTE-map a THP again.


Now, similar to after freeing the page table, someone else could reuse 
that page table and modify it.


If we have GUP-fast walking the page table while that is happening, 
we're in trouble. So we have to make sure GUP-fast is done before 
enqueuing the now-free page table.


That's why the tlb_remove_table_sync_one() was recently added (by Jann 
IIRC).


--
Thanks,

David / dhildenb



Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 03:11:52PM +0100, Valentin Schneider wrote:
> On 06/04/23 15:38, Peter Zijlstra wrote:
> > On Wed, Apr 05, 2023 at 01:45:02PM +0100, Valentin Schneider wrote:
> >>
> >> I've been hacking on something like this (CSD deferral for NOHZ-full),
> >> and unfortunately this uses the CPU-local cfd_data storage thing, which
> >> means any further smp_call_function() from the same CPU to the same
> >> destination will spin on csd_lock_wait(), waiting for the target CPU to
> >> come out of userspace and flush the queue - and we've just spent extra
> >> effort into *not* disturbing it, so that'll take a while :(
> >
> > I'm not sure I buy into deferring stuff.. a NOHZ_FULL cpu might 'never'
> > come back. Queueing data just in case it does seems wasteful.
> 
> Putting those callbacks straight into the bin would make my life much
> easier!

Well, it's either they get inhibited at the source like the parent patch
does, or they go through. I really don't see a sane middle way here.

> Unfortunately, even if they really should, I don't believe all of the
> things being crammed onto NOHZ_FULL CPUs have the same definition of
> 'never' as we do :/

That's not entirely the point, the point is that there are proper
NOHZ_FULL users that won't return to the kernel until the machine shuts
down. Buffering stuff for them is more or less a direct memory leak.



Re: [PATCH v2 01/19] fbdev: Prepare generic architecture helpers

2023-04-06 Thread Arnd Bergmann
On Thu, Apr 6, 2023, at 16:30, Thomas Zimmermann wrote:
> Generic implementations of fb_pgprotect() and fb_is_primary_device()
> have been in the source code for a long time. Prepare the header file
> to make use of them.
>
> Improve the code by using an inline function for fb_pgprotect()
> and by removing include statements. The default mode set by
> fb_pgprotect() is now writecombine, which is what most platforms
> want.
>
> Symbols are protected by preprocessor guards. Architectures that
> provide a symbol need to define a preprocessor token of the same
> name and value. Otherwise the header file will provide a generic
> implementation. This pattern has been taken from .
>
> v2:
>   *  use writecombine mappings by default (Arnd)
>
> Signed-off-by: Thomas Zimmermann 

Acked-by: Arnd Bergmann 


[PATCH v2 19/19] arch/x86: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Include  and set the required preprocessor tokens
correctly. x86 now implements its own set of fb helpers, but still
follows the overall pattern.

Signed-off-by: Thomas Zimmermann 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: "H. Peter Anvin" 
---
 arch/x86/include/asm/fb.h | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fb.h b/arch/x86/include/asm/fb.h
index ab4c960146e3..a3fb801f12f1 100644
--- a/arch/x86/include/asm/fb.h
+++ b/arch/x86/include/asm/fb.h
@@ -2,10 +2,11 @@
 #ifndef _ASM_X86_FB_H
 #define _ASM_X86_FB_H
 
-#include 
-#include 
 #include 
 
+struct fb_info;
+struct file;
+
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
@@ -16,7 +17,11 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
pgprot_val(vma->vm_page_prot) =
prot | cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS);
 }
+#define fb_pgprotect fb_pgprotect
+
+int fb_is_primary_device(struct fb_info *info);
+#define fb_is_primary_device fb_is_primary_device
 
-extern int fb_is_primary_device(struct fb_info *info);
+#include 
 
 #endif /* _ASM_X86_FB_H */
-- 
2.40.0



[PATCH v2 18/19] arch/sparc: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Include  for correctness. Sparc does provide
its own implementation of the contained functions.

v2:
* restore the original fb_pgprotect()

Signed-off-by: Thomas Zimmermann 
Cc: "David S. Miller" 
---
 arch/sparc/include/asm/fb.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/fb.h b/arch/sparc/include/asm/fb.h
index 28609f7a965c..496e58d22e7b 100644
--- a/arch/sparc/include/asm/fb.h
+++ b/arch/sparc/include/asm/fb.h
@@ -2,11 +2,10 @@
 #ifndef _SPARC_FB_H_
 #define _SPARC_FB_H_
 
-#include 
-
 #include 
 
 struct fb_info;
+struct file;
 
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
@@ -15,7 +14,11 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 #endif
 }
+#define fb_pgprotect fb_pgprotect
 
 int fb_is_primary_device(struct fb_info *info);
+#define fb_is_primary_device fb_is_primary_device
+
+#include 
 
 #endif /* _SPARC_FB_H_ */
-- 
2.40.0



[PATCH v2 17/19] arch/sparc: Implement fb_is_primary_device() in source file

2023-04-06 Thread Thomas Zimmermann
Other architectures implement fb_is_primary_device() in a source
file. Do the same on sparc. No functional changes, but this allows
removing several include statements from .

v2:
* don't include  in header file

Signed-off-by: Thomas Zimmermann 
Cc: "David S. Miller" 
---
 arch/sparc/Makefile |  1 +
 arch/sparc/include/asm/fb.h | 23 +--
 arch/sparc/video/Makefile   |  3 +++
 arch/sparc/video/fbdev.c| 24 
 4 files changed, 33 insertions(+), 18 deletions(-)
 create mode 100644 arch/sparc/video/Makefile
 create mode 100644 arch/sparc/video/fbdev.c

diff --git a/arch/sparc/Makefile b/arch/sparc/Makefile
index a4ea5b05f288..95a9211e48e3 100644
--- a/arch/sparc/Makefile
+++ b/arch/sparc/Makefile
@@ -60,6 +60,7 @@ libs-y += arch/sparc/prom/
 libs-y += arch/sparc/lib/
 
 drivers-$(CONFIG_PM) += arch/sparc/power/
+drivers-$(CONFIG_FB) += arch/sparc/video/
 
 boot := arch/sparc/boot
 
diff --git a/arch/sparc/include/asm/fb.h b/arch/sparc/include/asm/fb.h
index f699962e9ddf..28609f7a965c 100644
--- a/arch/sparc/include/asm/fb.h
+++ b/arch/sparc/include/asm/fb.h
@@ -1,11 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _SPARC_FB_H_
 #define _SPARC_FB_H_
-#include 
-#include 
+
 #include 
+
 #include 
-#include 
+
+struct fb_info;
 
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
@@ -15,20 +16,6 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
 #endif
 }
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   struct device *dev = info->device;
-   struct device_node *node;
-
-   if (console_set_on_cmdline)
-   return 0;
-
-   node = dev->of_node;
-   if (node &&
-   node == of_console_device)
-   return 1;
-
-   return 0;
-}
+int fb_is_primary_device(struct fb_info *info);
 
 #endif /* _SPARC_FB_H_ */
diff --git a/arch/sparc/video/Makefile b/arch/sparc/video/Makefile
new file mode 100644
index ..6baddbd58e4d
--- /dev/null
+++ b/arch/sparc/video/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_FB) += fbdev.o
diff --git a/arch/sparc/video/fbdev.c b/arch/sparc/video/fbdev.c
new file mode 100644
index ..dadd5799fbb3
--- /dev/null
+++ b/arch/sparc/video/fbdev.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+int fb_is_primary_device(struct fb_info *info)
+{
+   struct device *dev = info->device;
+   struct device_node *node;
+
+   if (console_set_on_cmdline)
+   return 0;
+
+   node = dev->of_node;
+   if (node && node == of_console_device)
+   return 1;
+
+   return 0;
+}
+EXPORT_SYMBOL(fb_is_primary_device);
-- 
2.40.0



[PATCH v2 16/19] arch/sh: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fbdev helpers with the generic
ones from . No functional changes.

v2:
* use default implementation for fb_pgprotect() (Arnd)

Signed-off-by: Thomas Zimmermann 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
---
 arch/sh/include/asm/fb.h | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/sh/include/asm/fb.h b/arch/sh/include/asm/fb.h
index 9a0bca2686fd..19df13ee9ca7 100644
--- a/arch/sh/include/asm/fb.h
+++ b/arch/sh/include/asm/fb.h
@@ -2,19 +2,6 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
-#include 
-
-static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
-   unsigned long off)
-{
-   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-}
-
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 15/19] arch/powerpc: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from . No functional changes.

Signed-off-by: Thomas Zimmermann 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
---
 arch/powerpc/include/asm/fb.h | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/fb.h b/arch/powerpc/include/asm/fb.h
index 6541ab77c5b9..5f1a2e5f7654 100644
--- a/arch/powerpc/include/asm/fb.h
+++ b/arch/powerpc/include/asm/fb.h
@@ -2,8 +2,8 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
 #include 
+
 #include 
 
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
@@ -13,10 +13,8 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
 vma->vm_end - vma->vm_start,
 vma->vm_page_prot);
 }
+#define fb_pgprotect fb_pgprotect
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 14/19] arch/parisc: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from  on systems without CONFIG_STI_CORE. No
functional changes.

Signed-off-by: Thomas Zimmermann 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
---
 arch/parisc/include/asm/fb.h | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/parisc/include/asm/fb.h b/arch/parisc/include/asm/fb.h
index 0b9a38ced5c8..66bb401c0cda 100644
--- a/arch/parisc/include/asm/fb.h
+++ b/arch/parisc/include/asm/fb.h
@@ -2,23 +2,24 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
 #include 
+#include 
+
+struct fb_info;
+struct file;
 
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
 }
+#define fb_pgprotect fb_pgprotect
 
 #if defined(CONFIG_STI_CORE)
 int fb_is_primary_device(struct fb_info *info);
-#else
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#define fb_is_primary_device fb_is_primary_device
 #endif
 
+#include 
+
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 13/19] arch/parisc: Implement fb_is_primary_device() under arch/parisc

2023-04-06 Thread Thomas Zimmermann
Move PARISC's implementation of fb_is_primary_device() into the
architecture directory. This is the place of the declaration and
where other architectures implement this function. No functional
changes.

Signed-off-by: Thomas Zimmermann 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
---
 arch/parisc/Makefile |  2 ++
 arch/parisc/include/asm/fb.h |  2 +-
 arch/parisc/video/Makefile   |  3 +++
 arch/parisc/video/fbdev.c| 27 +++
 drivers/video/sticore.c  | 19 ---
 include/video/sticore.h  |  2 ++
 6 files changed, 35 insertions(+), 20 deletions(-)
 create mode 100644 arch/parisc/video/Makefile
 create mode 100644 arch/parisc/video/fbdev.c

diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index 0d049a6f6a60..968ebe17494c 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -119,6 +119,8 @@ export LIBGCC
 
 libs-y += arch/parisc/lib/ $(LIBGCC)
 
+drivers-y += arch/parisc/video/
+
 boot   := arch/parisc/boot
 
 PALO := $(shell if (which palo 2>&1); then : ; \
diff --git a/arch/parisc/include/asm/fb.h b/arch/parisc/include/asm/fb.h
index 55d29c4f716e..0b9a38ced5c8 100644
--- a/arch/parisc/include/asm/fb.h
+++ b/arch/parisc/include/asm/fb.h
@@ -12,7 +12,7 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
 }
 
-#if defined(CONFIG_FB_STI)
+#if defined(CONFIG_STI_CORE)
 int fb_is_primary_device(struct fb_info *info);
 #else
 static inline int fb_is_primary_device(struct fb_info *info)
diff --git a/arch/parisc/video/Makefile b/arch/parisc/video/Makefile
new file mode 100644
index ..16a73cce4661
--- /dev/null
+++ b/arch/parisc/video/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_STI_CORE) += fbdev.o
diff --git a/arch/parisc/video/fbdev.c b/arch/parisc/video/fbdev.c
new file mode 100644
index ..4a0ae08fc75b
--- /dev/null
+++ b/arch/parisc/video/fbdev.c
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2000 Philipp Rumpf 
+ * Copyright (C) 2001-2020 Helge Deller 
+ * Copyright (C) 2001-2002 Thomas Bogendoerfer 
+ */
+
+#include 
+
+#include 
+
+#include 
+
+int fb_is_primary_device(struct fb_info *info)
+{
+   struct sti_struct *sti;
+
+   sti = sti_get_rom(0);
+
+   /* if no built-in graphics card found, allow any fb driver as default */
+   if (!sti)
+   return true;
+
+   /* return true if it's the default built-in framebuffer driver */
+   return (sti->info == info);
+}
+EXPORT_SYMBOL(fb_is_primary_device);
diff --git a/drivers/video/sticore.c b/drivers/video/sticore.c
index f8aaedea437d..7eb925f2ba9c 100644
--- a/drivers/video/sticore.c
+++ b/drivers/video/sticore.c
@@ -30,7 +30,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 
@@ -1148,24 +1147,6 @@ int sti_call(const struct sti_struct *sti, unsigned long 
func,
return ret;
 }
 
-#if defined(CONFIG_FB_STI)
-/* check if given fb_info is the primary device */
-int fb_is_primary_device(struct fb_info *info)
-{
-   struct sti_struct *sti;
-
-   sti = sti_get_rom(0);
-
-   /* if no built-in graphics card found, allow any fb driver as default */
-   if (!sti)
-   return true;
-
-   /* return true if it's the default built-in framebuffer driver */
-   return (sti->info == info);
-}
-EXPORT_SYMBOL(fb_is_primary_device);
-#endif
-
 MODULE_AUTHOR("Philipp Rumpf, Helge Deller, Thomas Bogendoerfer");
 MODULE_DESCRIPTION("Core STI driver for HP's NGLE series graphics cards in HP 
PARISC machines");
 MODULE_LICENSE("GPL v2");
diff --git a/include/video/sticore.h b/include/video/sticore.h
index c0879352cde4..fbb78d7e7565 100644
--- a/include/video/sticore.h
+++ b/include/video/sticore.h
@@ -2,6 +2,8 @@
 #ifndef STICORE_H
 #define STICORE_H
 
+struct fb_info;
+
 /* generic STI structures & functions */
 
 #define MAX_STI_ROMS 4 /* max no. of ROMs which this driver handles */
-- 
2.40.0



[PATCH v2 12/19] arch/parisc: Remove trailing whitespaces

2023-04-06 Thread Thomas Zimmermann
Fix trailing whitespaces. No functional changes.

Signed-off-by: Thomas Zimmermann 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
---
 arch/parisc/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index a2d8600521f9..0d049a6f6a60 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -11,7 +11,7 @@
 # Copyright (C) 1994 by Linus Torvalds
 # Portions Copyright (C) 1999 The Puffin Group
 #
-# Modified for PA-RISC Linux by Paul Lahaie, Alex deVries, 
+# Modified for PA-RISC Linux by Paul Lahaie, Alex deVries,
 # Mike Shaver, Helge Deller and Martin K. Petersen
 #
 
-- 
2.40.0



[PATCH v2 10/19] video: Remove trailing whitespaces

2023-04-06 Thread Thomas Zimmermann
Fix trailing whitespaces. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/console/sticon.c  |   4 +-
 drivers/video/console/sticore.c | 102 ++---
 drivers/video/fbdev/sticore.h   |  14 +--
 drivers/video/fbdev/stifb.c | 156 
 4 files changed, 138 insertions(+), 138 deletions(-)

diff --git a/drivers/video/console/sticon.c b/drivers/video/console/sticon.c
index 2cea69418a83..89ad7ade6cf9 100644
--- a/drivers/video/console/sticon.c
+++ b/drivers/video/console/sticon.c
@@ -282,7 +282,7 @@ static void sticon_init(struct vc_data *c, int init)
 vc_cols = sti_onscreen_x(sti) / sti->font->width;
 vc_rows = sti_onscreen_y(sti) / sti->font->height;
 c->vc_can_do_color = 1;
-
+
 if (init) {
c->vc_cols = vc_cols;
c->vc_rows = vc_rows;
@@ -374,7 +374,7 @@ static const struct consw sti_con = {
.con_font_set   = sticon_font_set,
.con_font_default   = sticon_font_default,
.con_build_attr = sticon_build_attr,
-   .con_invert_region  = sticon_invert_region, 
+   .con_invert_region  = sticon_invert_region,
 };
 
 
diff --git a/drivers/video/console/sticore.c b/drivers/video/console/sticore.c
index db568f67e4dc..6ea9596a3c4b 100644
--- a/drivers/video/console/sticore.c
+++ b/drivers/video/console/sticore.c
@@ -6,12 +6,12 @@
  * Copyright (C) 2000 Philipp Rumpf 
  * Copyright (C) 2001-2020 Helge Deller 
  * Copyright (C) 2001-2002 Thomas Bogendoerfer 
- * 
+ *
  * TODO:
  * - call STI in virtual mode rather than in real mode
- * - screen blanking with state_mgmt() in text mode STI ? 
+ * - screen blanking with state_mgmt() in text mode STI ?
  * - try to make it work on m68k hp workstations ;)
- * 
+ *
  */
 
 #define pr_fmt(fmt) "%s: " fmt, KBUILD_MODNAME
@@ -66,12 +66,12 @@ static const u8 col_trans[8] = {
 #define c_index(sti, c) ((c) & 0xff)
 
 static const struct sti_init_flags default_init_flags = {
-   .wait   = STI_WAIT, 
+   .wait   = STI_WAIT,
.reset  = 1,
-   .text   = 1, 
+   .text   = 1,
.nontext = 1,
-   .no_chg_bet = 1, 
-   .no_chg_bei = 1, 
+   .no_chg_bet = 1,
+   .no_chg_bei = 1,
.init_cmap_tx = 1,
 };
 
@@ -104,7 +104,7 @@ static int sti_init_graph(struct sti_struct *sti)
pr_err("STI init_graph failed (ret %d, errno %d)\n", ret, err);
return -1;
}
-   
+
return 0;
 }
 
@@ -120,7 +120,7 @@ static void sti_inq_conf(struct sti_struct *sti)
s32 ret;
 
   outptr->ext_ptr = STI_PTR(&sti->sti_data->inq_outptr_ext);
-   
+
do {
   spin_lock_irqsave(&sti->lock, flags);
memset(inptr, 0, sizeof(*inptr));
@@ -162,9 +162,9 @@ sti_putc(struct sti_struct *sti, int c, int y, int x,
 }
 
 static const struct sti_blkmv_flags clear_blkmv_flags = {
-   .wait   = STI_WAIT, 
-   .color  = 1, 
-   .clear  = 1, 
+   .wait   = STI_WAIT,
+   .color  = 1,
+   .clear  = 1,
 };
 
 void
@@ -185,7 +185,7 @@ sti_set(struct sti_struct *sti, int src_y, int src_x,
   struct sti_blkmv_outptr *outptr = &sti->sti_data->blkmv_outptr;
s32 ret;
unsigned long flags;
-   
+
do {
   spin_lock_irqsave(&sti->lock, flags);
*inptr = inptr_default;
@@ -224,7 +224,7 @@ sti_clear(struct sti_struct *sti, int src_y, int src_x,
 }
 
 static const struct sti_blkmv_flags default_blkmv_flags = {
-   .wait = STI_WAIT, 
+   .wait = STI_WAIT,
 };
 
 void
@@ -291,14 +291,14 @@ static int __init sti_setup(char *str)
 {
if (str)
strscpy(default_sti_path, str, sizeof(default_sti_path));
-   
+
return 1;
 }
 
 /* Assuming the machine has multiple STI consoles (=graphic cards) which
  * all get detected by sticon, the user may define with the linux kernel
  * parameter sti= which of them will be the initial boot-console.
- *  is a number between 0 and MAX_STI_ROMS, with 0 as the default 
+ *  is a number between 0 and MAX_STI_ROMS, with 0 as the default
  * STI screen.
  */
 __setup("sti=", sti_setup);
@@ -341,13 +341,13 @@ static int sti_font_setup(char *str)
  * should be used by the sticon driver to draw characters to the screen.
  * Possible values are:
  * - sti_font=:
- *  is the name of one of the linux-kernel built-in 
- * framebuffer font names (e.g. VGA8x16, SUN22x18). 
- * This is only available if the fonts have been statically 
compiled 
+ *  is the name of one of the linux-kernel built-in
+ * framebuffer font names (e.g. VGA8x16, SUN22x18).
+ * This is only available if the fonts have been statically 
compiled
  * in with e.g. the CONFIG_FONT_8x16 or CONFIG_FONT_SUN12x22 
options.
  * - sti_font= ( = 1,2,3,...)
  * most STI ROMs have built-in HP specific fonts, which 

[PATCH v2 09/19] arch/mips: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from . No functional changes.

Signed-off-by: Thomas Zimmermann 
Cc: Thomas Bogendoerfer 
---
 arch/mips/include/asm/fb.h | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/mips/include/asm/fb.h b/arch/mips/include/asm/fb.h
index bd3f68c9ddfc..6bda0a81d8ca 100644
--- a/arch/mips/include/asm/fb.h
+++ b/arch/mips/include/asm/fb.h
@@ -1,19 +1,17 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
 #include 
 
+struct file;
+
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 }
+#define fb_pgprotect fb_pgprotect
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 11/19] video: Move HP PARISC STI core code to shared location

2023-04-06 Thread Thomas Zimmermann
STI core files have been located in console and fbdev code. Move
the source code and header to the directories for video helpers.
Also update the config and build rules such that the code depends
on the config symbol CONFIG_STI_CORE, which STI console and STI
framebuffer select automatically.

Cleans up the console makefile and prepares PARISC to implement
fb_is_primary_device() within the arch/ directory. No functional
changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/video/Kconfig| 7 +++
 drivers/video/Makefile   | 1 +
 drivers/video/console/Kconfig| 1 +
 drivers/video/console/Makefile   | 4 +---
 drivers/video/console/sticon.c   | 2 +-
 drivers/video/fbdev/Kconfig  | 3 +--
 drivers/video/fbdev/stifb.c  | 2 +-
 drivers/video/{console => }/sticore.c| 2 +-
 {drivers/video/fbdev => include/video}/sticore.h | 0
 9 files changed, 14 insertions(+), 8 deletions(-)
 rename drivers/video/{console => }/sticore.c (99%)
 rename {drivers/video/fbdev => include/video}/sticore.h (100%)

diff --git a/drivers/video/Kconfig b/drivers/video/Kconfig
index bf05363d8906..8b2b9ac37c3d 100644
--- a/drivers/video/Kconfig
+++ b/drivers/video/Kconfig
@@ -11,6 +11,13 @@ config APERTURE_HELPERS
  Support tracking and hand-over of aperture ownership. Required
  by graphics drivers for firmware-provided framebuffers.
 
+config STI_CORE
+   bool
+   depends on PARISC
+   help
+ STI refers to the HP "Standard Text Interface" which is a set of
+ BIOS routines contained in a ROM chip in HP PA-RISC based machines.
+
 config VIDEO_CMDLINE
bool
 
diff --git a/drivers/video/Makefile b/drivers/video/Makefile
index 831c9fa57a6c..6bbc03950899 100644
--- a/drivers/video/Makefile
+++ b/drivers/video/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_APERTURE_HELPERS)+= aperture.o
+obj-$(CONFIG_STI_CORE)+= sticore.o
 obj-$(CONFIG_VGASTATE)+= vgastate.o
 obj-$(CONFIG_VIDEO_CMDLINE)   += cmdline.o
 obj-$(CONFIG_VIDEO_NOMODESET) += nomodeset.o
diff --git a/drivers/video/console/Kconfig b/drivers/video/console/Kconfig
index 22cea5082ac4..a2a88d42edf0 100644
--- a/drivers/video/console/Kconfig
+++ b/drivers/video/console/Kconfig
@@ -141,6 +141,7 @@ config STI_CONSOLE
depends on PARISC && HAS_IOMEM
select FONT_SUPPORT
select CRC32
+   select STI_CORE
default y
help
  The STI console is the builtin display/keyboard on HP-PARISC
diff --git a/drivers/video/console/Makefile b/drivers/video/console/Makefile
index db07b784bd2c..fd79016a0d95 100644
--- a/drivers/video/console/Makefile
+++ b/drivers/video/console/Makefile
@@ -5,8 +5,6 @@
 
 obj-$(CONFIG_DUMMY_CONSOLE)   += dummycon.o
 obj-$(CONFIG_SGI_NEWPORT_CONSOLE) += newport_con.o
-obj-$(CONFIG_STI_CONSOLE) += sticon.o sticore.o
+obj-$(CONFIG_STI_CONSOLE) += sticon.o
 obj-$(CONFIG_VGA_CONSOLE) += vgacon.o
 obj-$(CONFIG_MDA_CONSOLE) += mdacon.o
-
-obj-$(CONFIG_FB_STI)  += sticore.o
diff --git a/drivers/video/console/sticon.c b/drivers/video/console/sticon.c
index 89ad7ade6cf9..d11cfd2d68b5 100644
--- a/drivers/video/console/sticon.c
+++ b/drivers/video/console/sticon.c
@@ -50,7 +50,7 @@
 
 #include 
 
-#include "../fbdev/sticore.h"
+#include 
 
 /* switching to graphics mode */
 #define BLANK 0
diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index 96e91570cdd3..485e8c35d5c6 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -551,10 +551,9 @@ config FB_STI
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
select FB_CFB_IMAGEBLIT
+   select STI_CORE
default y
help
- STI refers to the HP "Standard Text Interface" which is a set of
- BIOS routines contained in a ROM chip in HP PA-RISC based machines.
  Enabling this option will implement the linux framebuffer device
  using calls to the STI BIOS routines for initialisation.
 
diff --git a/drivers/video/fbdev/stifb.c b/drivers/video/fbdev/stifb.c
index 6bc7e6d9..baca6974e288 100644
--- a/drivers/video/fbdev/stifb.c
+++ b/drivers/video/fbdev/stifb.c
@@ -69,7 +69,7 @@
 #include   /* for HP-UX compatibility */
 #include 
 
-#include "sticore.h"
+#include 
 
 /* REGION_BASE(fb_info, index) returns the virtual address for region  
*/
 #define REGION_BASE(fb_info, index) \
diff --git a/drivers/video/console/sticore.c b/drivers/video/sticore.c
similarity index 99%
rename from drivers/video/console/sticore.c
rename to drivers/video/sticore.c
index 6ea9596a3c4b..f8aaedea437d 100644
--- a/drivers/video/console/sticore.c
+++ b/drivers/video/sticore.c
@@ -32,7 +32,7 @@
 #include 
 #include 
 
-#include "../fbdev/sticore.h"
+#include 
 
 #define STI_DRIVERVERSION "Version 

[PATCH v2 08/19] arch/m68k: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from . No functional changes.

v2:
* provide empty fb_pgprotect() on non-MMU systems

Signed-off-by: Thomas Zimmermann 
Cc: Geert Uytterhoeven 
---
 arch/m68k/include/asm/fb.h | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/m68k/include/asm/fb.h b/arch/m68k/include/asm/fb.h
index 4f96989922af..24273fc7ad91 100644
--- a/arch/m68k/include/asm/fb.h
+++ b/arch/m68k/include/asm/fb.h
@@ -2,11 +2,11 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
 #include 
 #include 
 
+struct file;
+
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
@@ -24,10 +24,8 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
 #endif /* CONFIG_SUN3 */
 #endif /* CONFIG_MMU */
 }
+#define fb_pgprotect fb_pgprotect
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 07/19] arch/m68k: Merge variants of fb_pgprotect() into single function

2023-04-06 Thread Thomas Zimmermann
Merge all variants of fb_pgprotect() into a single function body.
There are two different cases for MMU systems. For non-MMU systems,
the function body will be empty. No functional changes, but this
will help with the switch to .

Signed-off-by: Thomas Zimmermann 
---
 arch/m68k/include/asm/fb.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/include/asm/fb.h b/arch/m68k/include/asm/fb.h
index b86c6e2e26dd..4f96989922af 100644
--- a/arch/m68k/include/asm/fb.h
+++ b/arch/m68k/include/asm/fb.h
@@ -7,17 +7,13 @@
 #include 
 #include 
 
-#ifdef CONFIG_MMU
-#ifdef CONFIG_SUN3
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
+#ifdef CONFIG_MMU
+#ifdef CONFIG_SUN3
pgprot_val(vma->vm_page_prot) |= SUN3_PAGE_NOCACHE;
-}
 #else
-static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
-   unsigned long off)
-{
if (CPU_IS_020_OR_030)
pgprot_val(vma->vm_page_prot) |= _PAGE_NOCACHE030;
if (CPU_IS_040_OR_060) {
@@ -25,11 +21,9 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
/* Use no-cache mode, serialized */
pgprot_val(vma->vm_page_prot) |= _PAGE_NOCACHE_S;
}
-}
 #endif /* CONFIG_SUN3 */
-#else
-#define fb_pgprotect(...) do {} while (0)
 #endif /* CONFIG_MMU */
+}
 
 static inline int fb_is_primary_device(struct fb_info *info)
 {
-- 
2.40.0



[PATCH v2 06/19] arch/loongarch: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fbdev helpers with the generic
ones from . No functional changes.

v2:
* use default implementation for fb_pgprotect() (Arnd)

Signed-off-by: Thomas Zimmermann 
Cc: Huacai Chen 
Cc: WANG Xuerui 
---
 arch/loongarch/include/asm/fb.h | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/loongarch/include/asm/fb.h b/arch/loongarch/include/asm/fb.h
index 3116bde8772d..ff82f20685c8 100644
--- a/arch/loongarch/include/asm/fb.h
+++ b/arch/loongarch/include/asm/fb.h
@@ -5,19 +5,6 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
-#include 
-
-static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
-   unsigned long off)
-{
-   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-}
-
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 05/19] arch/ia64: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from . No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 arch/ia64/include/asm/fb.h | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/ia64/include/asm/fb.h b/arch/ia64/include/asm/fb.h
index 5f95782bfa46..0208f64a0da0 100644
--- a/arch/ia64/include/asm/fb.h
+++ b/arch/ia64/include/asm/fb.h
@@ -2,11 +2,12 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
 #include 
+
 #include 
 
+struct file;
+
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
@@ -15,10 +16,8 @@ static inline void fb_pgprotect(struct file *file, struct 
vm_area_struct *vma,
else
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 }
+#define fb_pgprotect fb_pgprotect
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 04/19] arch/arm64: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fbdev helpers with the generic
ones from . No functional changes.

v2:
* use default implementation for fb_pgprotect() (Arnd)

Signed-off-by: Thomas Zimmermann 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/fb.h | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/fb.h b/arch/arm64/include/asm/fb.h
index bdc735ee1f67..1a495d8fb2ce 100644
--- a/arch/arm64/include/asm/fb.h
+++ b/arch/arm64/include/asm/fb.h
@@ -5,19 +5,6 @@
 #ifndef __ASM_FB_H_
 #define __ASM_FB_H_
 
-#include 
-#include 
-#include 
-
-static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
-   unsigned long off)
-{
-   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-}
-
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* __ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 03/19] arch/arm: Implement with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fbdev helpers with the generic
ones from . No functional changes.

v2:
* use default implementation for fb_pgprotect() (Arnd)

Signed-off-by: Thomas Zimmermann 
Cc: Russell King 
---
 arch/arm/include/asm/fb.h | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/fb.h b/arch/arm/include/asm/fb.h
index d92e99cd8c8a..ce20a43c3033 100644
--- a/arch/arm/include/asm/fb.h
+++ b/arch/arm/include/asm/fb.h
@@ -1,19 +1,6 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
-#include 
-
-static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
-   unsigned long off)
-{
-   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
-}
-
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



[PATCH v2 01/19] fbdev: Prepare generic architecture helpers

2023-04-06 Thread Thomas Zimmermann
Generic implementations of fb_pgprotect() and fb_is_primary_device()
have been in the source code for a long time. Prepare the header file
to make use of them.

Improve the code by using an inline function for fb_pgprotect()
and by removing include statements. The default mode set by
fb_pgprotect() is now writecombine, which is what most platforms
want.

Symbols are protected by preprocessor guards. Architectures that
provide a symbol need to define a preprocessor token of the same
name and value. Otherwise the header file will provide a generic
implementation. This pattern has been taken from .

v2:
*  use writecombine mappings by default (Arnd)

Signed-off-by: Thomas Zimmermann 
---
 include/asm-generic/fb.h | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/fb.h b/include/asm-generic/fb.h
index f9f18101ed36..631d97c507ca 100644
--- a/include/asm-generic/fb.h
+++ b/include/asm-generic/fb.h
@@ -1,13 +1,32 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+
 #ifndef __ASM_GENERIC_FB_H_
 #define __ASM_GENERIC_FB_H_
-#include 
 
-#define fb_pgprotect(...) do {} while (0)
+/*
+ * Only include this header file from your architecture's .
+ */
+
+#include 
+
+struct fb_info;
+struct file;
+
+#ifndef fb_pgprotect
+#define fb_pgprotect fb_pgprotect
+static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+   unsigned long off)
+{
+   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+}
+#endif
 
+#ifndef fb_is_primary_device
+#define fb_is_primary_device fb_is_primary_device
 static inline int fb_is_primary_device(struct fb_info *info)
 {
return 0;
 }
+#endif
 
 #endif /* __ASM_GENERIC_FB_H_ */
-- 
2.40.0
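
A minimal user-space sketch of the guard pattern described in the commit
message above: a helper is a function plus a preprocessor token of the same
name, and the generic part only supplies what the architecture part left
undefined. The "arch" return value of 1 and the helper fb_demo_helper() are
hypothetical and exist only for illustration; nothing here is taken from a
real architecture header.

```c
struct fb_info;	/* opaque; only a pointer is passed around */

/* --- stands in for an architecture header (hypothetical content) --- */
static inline int fb_is_primary_device(struct fb_info *info)
{
	(void)info;
	return 1;	/* made-up arch-specific answer */
}
/* Token of the same name and value tells the generic part to back off. */
#define fb_is_primary_device fb_is_primary_device

/* --- stands in for the generic header --- */
#ifndef fb_is_primary_device
#define fb_is_primary_device fb_is_primary_device
static inline int fb_is_primary_device(struct fb_info *info)
{
	(void)info;
	return 0;	/* generic default, skipped in this sketch */
}
#endif

#ifndef fb_demo_helper	/* hypothetical helper the "arch" did not provide */
#define fb_demo_helper fb_demo_helper
static inline int fb_demo_helper(void)
{
	return 42;	/* generic default is used */
}
#endif
```

Because the macro expands to its own name, callers are unaffected; the token
only serves as a compile-time marker for the #ifndef test.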



[PATCH v2 00/19] arch: Consolidate <asm/fb.h>

2023-04-06 Thread Thomas Zimmermann
Various architectures provide <asm/fb.h> with helpers for fbdev
framebuffer devices. Share the contained code where possible. There
is already <asm-generic/fb.h>, which implements generic (as in
'empty') functions of the fbdev helpers. The header was added in
commit aafe4dbed0bf ("asm-generic: add generic versions of common
headers"), but never used.

Each per-architecture header file declares and/or implements fbdev
helpers and defines a preprocessor token for each. The generic
header then provides the remaining helpers. It works like the I/O
helpers in .

For PARISC, the architecture helpers are mixed up with helpers
for the system's STI graphics firmware. We first move the STI code
to appropriate locations under video/ and then move the architecture
helper under arch/parisc.

For Sparc, there's an additional patch that moves the implementation
from the header into a source file. This avoids some include
statements in the header file.

Built on arm, arm64, m68k, mips, parisc, powerpc, sparc and x86.

v2:
* make writecombine the default mapping mode (Arnd)
* rework fb_pgprotect() on m68k

Thomas Zimmermann (19):
  fbdev: Prepare generic architecture helpers
  arch/arc: Implement <asm/fb.h> with generic helpers
  arch/arm: Implement <asm/fb.h> with generic helpers
  arch/arm64: Implement <asm/fb.h> with generic helpers
  arch/ia64: Implement <asm/fb.h> with generic helpers
  arch/loongarch: Implement <asm/fb.h> with generic helpers
  arch/m68k: Merge variants of fb_pgprotect() into single function
  arch/m68k: Implement <asm/fb.h> with generic helpers
  arch/mips: Implement <asm/fb.h> with generic helpers
  video: Remove trailing whitespaces
  video: Move HP PARISC STI core code to shared location
  arch/parisc: Remove trailing whitespaces
  arch/parisc: Implement fb_is_primary_device() under arch/parisc
  arch/parisc: Implement <asm/fb.h> with generic helpers
  arch/powerpc: Implement <asm/fb.h> with generic helpers
  arch/sh: Implement <asm/fb.h> with generic helpers
  arch/sparc: Implement fb_is_primary_device() in source file
  arch/sparc: Implement <asm/fb.h> with generic helpers
  arch/x86: Implement <asm/fb.h> with generic helpers

 arch/arc/include/asm/fb.h |  11 +-
 arch/arm/include/asm/fb.h |  15 +-
 arch/arm64/include/asm/fb.h   |  15 +-
 arch/ia64/include/asm/fb.h|  11 +-
 arch/loongarch/include/asm/fb.h   |  15 +-
 arch/m68k/include/asm/fb.h|  22 +--
 arch/mips/include/asm/fb.h|  10 +-
 arch/parisc/Makefile  |   4 +-
 arch/parisc/include/asm/fb.h  |  17 +-
 arch/parisc/video/Makefile|   3 +
 arch/parisc/video/fbdev.c |  27 +++
 arch/powerpc/include/asm/fb.h |   8 +-
 arch/sh/include/asm/fb.h  |  15 +-
 arch/sparc/Makefile   |   1 +
 arch/sparc/include/asm/fb.h   |  26 +--
 arch/sparc/video/Makefile |   3 +
 arch/sparc/video/fbdev.c  |  24 +++
 arch/x86/include/asm/fb.h |  11 +-
 drivers/video/Kconfig |   7 +
 drivers/video/Makefile|   1 +
 drivers/video/console/Kconfig |   1 +
 drivers/video/console/Makefile|   4 +-
 drivers/video/console/sticon.c|   6 +-
 drivers/video/fbdev/Kconfig   |   3 +-
 drivers/video/fbdev/stifb.c   | 158 +-
 drivers/video/{console => }/sticore.c | 123 ++
 include/asm-generic/fb.h  |  23 ++-
 .../video/fbdev => include/video}/sticore.h   |  16 +-
 28 files changed, 289 insertions(+), 291 deletions(-)
 create mode 100644 arch/parisc/video/Makefile
 create mode 100644 arch/parisc/video/fbdev.c
 create mode 100644 arch/sparc/video/Makefile
 create mode 100644 arch/sparc/video/fbdev.c
 rename drivers/video/{console => }/sticore.c (95%)
 rename {drivers/video/fbdev => include/video}/sticore.h (99%)


base-commit: a7180debb9c631375684f4d717466cfb9f238660
prerequisite-patch-id: 0aa359f6144c4015c140c8a6750be19099c676fb
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
prerequisite-patch-id: d571861118316645b0124ac21260571720b632e2
prerequisite-patch-id: 15c0024c23be42851054ea840eec195f69716c08
prerequisite-patch-id: 441ee4341b183e4577c20a3f3bdebee529d21c34
prerequisite-patch-id: cbc453ee02fae02af22fbfdce56ab732c7a88c36
-- 
2.40.0



[PATCH v2 02/19] arch/arc: Implement <asm/fb.h> with generic helpers

2023-04-06 Thread Thomas Zimmermann
Replace the architecture's fb_is_primary_device() with the generic
one from <asm-generic/fb.h>. No functional changes.

Signed-off-by: Thomas Zimmermann 
Cc: Vineet Gupta 
---
 arch/arc/include/asm/fb.h | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/arc/include/asm/fb.h b/arch/arc/include/asm/fb.h
index dc2e303cdbbb..dff149eaecaf 100644
--- a/arch/arc/include/asm/fb.h
+++ b/arch/arc/include/asm/fb.h
@@ -1,20 +1,19 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-#include 
-#include 
 #include 
 
+struct file;
+
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
unsigned long off)
 {
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 }
+#define fb_pgprotect fb_pgprotect
 
-static inline int fb_is_primary_device(struct fb_info *info)
-{
-   return 0;
-}
+#include 
 
 #endif /* _ASM_FB_H_ */
-- 
2.40.0



Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Valentin Schneider
On 06/04/23 15:38, Peter Zijlstra wrote:
> On Wed, Apr 05, 2023 at 01:45:02PM +0100, Valentin Schneider wrote:
>>
>> I've been hacking on something like this (CSD deferral for NOHZ-full),
>> and unfortunately this uses the CPU-local cfd_data storage thing, which
>> means any further smp_call_function() from the same CPU to the same
>> destination will spin on csd_lock_wait(), waiting for the target CPU to
>> come out of userspace and flush the queue - and we've just spent extra
>> effort into *not* disturbing it, so that'll take a while :(
>
> I'm not sure I buy into deferring stuff.. a NOHZ_FULL cpu might 'never'
> come back. Queueing data just in case it does seems wasteful.

Putting those callbacks straight into the bin would make my life much
easier!

Unfortunately, even if they really should, I don't believe all of the
things being crammed onto NOHZ_FULL CPUs have the same definition of
'never' as we do :/



Re: [PATCH 01/18] fbdev: Prepare generic architecture helpers

2023-04-06 Thread Thomas Zimmermann

Hi

On 05.04.23 at 17:53, Arnd Bergmann wrote:

On Wed, Apr 5, 2023, at 17:05, Thomas Zimmermann wrote:

Generic implementations of fb_pgprotect() and fb_is_primary_device()
have been in the source code for a long time. Prepare the header file
to make use of them.

Improve the code by using an inline function for fb_pgprotect() and
by removing include statements.

Symbols are protected by preprocessor guards. Architectures that
provide a symbol need to define a preprocessor token of the same
name and value. Otherwise the header file will provide a generic
implementation. This pattern has been taken from .

Signed-off-by: Thomas Zimmermann 


Moving this into generic code is good, but I'm not sure
about the default for fb_pgprotect():


+
+#ifndef fb_pgprotect
+#define fb_pgprotect fb_pgprotect
+static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+   unsigned long off)
+{ }
+#endif


I think most architectures will want the version we have on
arc, arm, arm64, loongarch, and sh already:

static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
 unsigned long off)
{
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
}

so I'd suggest making that version the default, and treating the
empty ones (m68knommu, sparc32) as architecture specific
workarounds.

I see that sparc64 and parisc use pgprot_uncached here, but as
they don't define a custom pgprot_writecombine, this ends up being
the same, and they can use the above definition as well.

mips defines pgprot_writecombine but uses pgprot_noncached
in fb_pgprotect(), which is probably a mistake and should have
been updated as part of commit 4b050ba7a66c ("MIPS: pgtable.h:
Implement the pgprot_writecombine function for MIPS").


I would not want to change any of the other platforms' functions unless
the respective platform maintainers ask me to.


Best regards
Thomas



 Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
> 
> > > To actually hit this path you're doing something really dodgy.
> > 
> > Apparently khugepaged is using the same infrastructure:
> > 
> > $ grep tlb_remove_table khugepaged.c 
> > tlb_remove_table_sync_one();
> > tlb_remove_table_sync_one();
> > 
> > So just enabling khugepaged will hit that path.
> 
> Urgh, WTF..
> 
> Let me go read that stuff :/

At the very least the one on collapse_and_free_pmd() could easily become
a call_rcu() based free.

I'm not sure I'm following what collapse_huge_page() does just yet.


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Wed, Apr 05, 2023 at 01:45:02PM +0100, Valentin Schneider wrote:
> On 05/04/23 14:05, Frederic Weisbecker wrote:
> >  static void smp_call_function_many_cond(const struct cpumask *mask,
> >   smp_call_func_t func, void *info,
> > @@ -946,10 +948,13 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> >  #endif
> >   cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
> >   if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
> > -   __cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> > -   nr_cpus++;
> > -   last_cpu = cpu;
> > -
> > +   if (!(scf_flags & SCF_NO_USER) ||
> > +   !IS_ENABLED(CONFIG_GENERIC_ENTRY) ||
> > +ct_state_cpu(cpu) != CONTEXT_USER) {
> > +   __cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> > +   nr_cpus++;
> > +   last_cpu = cpu;
> > +   }
> 
> I've been hacking on something like this (CSD deferral for NOHZ-full),
> and unfortunately this uses the CPU-local cfd_data storage thing, which
> means any further smp_call_function() from the same CPU to the same
> destination will spin on csd_lock_wait(), waiting for the target CPU to
> come out of userspace and flush the queue - and we've just spent extra
> effort into *not* disturbing it, so that'll take a while :(

I'm not sure I buy into deferring stuff.. a NOHZ_FULL cpu might 'never'
come back. Queueing data just in case it does seems wasteful.


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 09:49:22AM -0300, Marcelo Tosatti wrote:

> > > 2) Depends on the application and the definition of "occasional".
> > > 
> > > For certain types of applications (for example PLC software or
> > > RAN processing), upon occurrence of an event, it is necessary to
> > > complete a certain task in a maximum amount of time (deadline).
> > 
> > If the application is properly NOHZ_FULL and never does a kernel entry,
> > it will never get that IPI. If it is a pile of shit and does kernel
> > entries while it pretends to be NOHZ_FULL it gets to keep the pieces and
> > no amount of crying will get me to care.
> 
> I suppose it's common practice to use certain system calls in latency
> sensitive applications, for example nanosleep. Some examples:
> 
> 1) cyclictest (nanosleep)

cyclictest is not a NOHZ_FULL application, if you think it is, you're
deluded.

> 2) PLC programs   (nanosleep)

What's a PLC? Programmable Logic Circuit?

> A system call does not necessarily have to take locks, does it ?

This all is unrelated to locks

> Or even if application does system calls, but runs under a VM,
> then you are requiring it to never VM-exit.

That seems to be a goal for performance anyway.

> This reduces the flexibility of developing such applications.

Yeah, that's the cards you're dealt, deal with it.


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:

> > To actually hit this path you're doing something really dodgy.
> 
> Apparently khugepaged is using the same infrastructure:
> 
> $ grep tlb_remove_table khugepaged.c 
>   tlb_remove_table_sync_one();
>   tlb_remove_table_sync_one();
> 
> So just enabling khugepaged will hit that path.

Urgh, WTF..

Let me go read that stuff :/


Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Marcelo Tosatti
On Wed, Apr 05, 2023 at 09:54:57PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 05, 2023 at 04:43:14PM -0300, Marcelo Tosatti wrote:
> 
> > Two points:
> > 
> > 1) For a virtualized system, the overhead is not only of executing the
> > IPI but:
> > 
> > VM-exit
> > run VM-exit code in host
> > handle IPI
> > run VM-entry code in host
> > VM-entry
> 
> I thought we could do IPIs without VMexit these days? 

Yes, IPIs to vCPU (guest context). In this case we can consider
an IPI to the host pCPU (which requires VM-exit from guest context).

> Also virt... /me walks away.
> 
> > 2) Depends on the application and the definition of "occasional".
> > 
> > For certain types of applications (for example PLC software or
> > RAN processing), upon occurrence of an event, it is necessary to
> > complete a certain task in a maximum amount of time (deadline).
> 
> If the application is properly NOHZ_FULL and never does a kernel entry,
> it will never get that IPI. If it is a pile of shit and does kernel
> entries while it pretends to be NOHZ_FULL it gets to keep the pieces and
> no amount of crying will get me to care.

I suppose it's common practice to use certain system calls in latency
sensitive applications, for example nanosleep. Some examples:

1) cyclictest   (nanosleep)
2) PLC programs (nanosleep)

A system call does not necessarily have to take locks, does it ?

Or even if application does system calls, but runs under a VM,
then you are requiring it to never VM-exit.

This reduces the flexibility of developing such applications.





Re: [PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

2023-04-06 Thread Marcelo Tosatti
On Wed, Apr 05, 2023 at 09:52:26PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 05, 2023 at 04:45:32PM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 05, 2023 at 01:10:07PM +0200, Frederic Weisbecker wrote:
> > > On Wed, Apr 05, 2023 at 12:44:04PM +0200, Frederic Weisbecker wrote:
> > > > On Tue, Apr 04, 2023 at 04:42:24PM +0300, Yair Podemsky wrote:
> > > > > + int state = atomic_read(&ct->state);
> > > > > + /* will return true only for cpus in kernel space */
> > > > > + return state & CT_STATE_MASK == CONTEXT_KERNEL;
> > > > > +}
> > > > 
> > > > Also note that this doesn't strictly prevent userspace from being interrupted.
> > > > You may well observe the CPU in kernel but it may receive the IPI later after
> > > > switching to userspace.
> > > > 
> > > > We could arrange for avoiding that with marking ct->state with a pending work bit
> > > > to flush upon user entry/exit but that's a bit more overhead so I first need to
> > > > know about your expectations here, ie: can you tolerate such an occasional
> > > > interruption or not?
> > > 
> > > Bah, actually what can we do to prevent from that racy IPI? Not much I fear...
> > 
> > Use a different mechanism other than an IPI to ensure in progress
> > __get_free_pages_fast() has finished execution.
> > 
> > Isn't this codepath slow path enough that it can use
> > synchronize_rcu_expedited?
> 
> To actually hit this path you're doing something really dodgy.

Apparently khugepaged is using the same infrastructure:

$ grep tlb_remove_table khugepaged.c 
tlb_remove_table_sync_one();
tlb_remove_table_sync_one();

So just enabling khugepaged will hit that path.
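
A side note on the check quoted at the top of this message, written as
"state & CT_STATE_MASK == CONTEXT_KERNEL": in C, == binds more tightly
than &, so the expression may not test what its comment says. The sketch
below shows the difference. The DEMO_* constants are hypothetical values
picked for illustration and are not taken from the kernel headers.

```c
#include <stdbool.h>

/* Hypothetical values for the sketch; not the kernel's definitions. */
#define DEMO_STATE_MASK		0x3
#define DEMO_CONTEXT_KERNEL	0x0
#define DEMO_CONTEXT_USER	0x2

/*
 * How C parses "state & MASK == KERNEL": the comparison runs first.
 * The parentheses are spelled out here to make that grouping visible.
 */
static inline bool demo_in_kernel_as_parsed(int state)
{
	return state & (DEMO_STATE_MASK == DEMO_CONTEXT_KERNEL);
}

/* What the comment in the quoted hunk appears to intend. */
static inline bool demo_in_kernel_intended(int state)
{
	return (state & DEMO_STATE_MASK) == DEMO_CONTEXT_KERNEL;
}
```

With these assumed values the first form is false for every state, because
the mask never equals the kernel constant, while the second form is true
exactly when the masked bits are zero.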



[PATCH] powerpc/irq: Mark check_return_regs_valid() notrace

2023-04-06 Thread Michael Ellerman
check_return_regs_valid() is called from the middle of the irq exit
handling, which is all notrace, so mark it notrace also.

Reported-by: Sachin Sant 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/interrupt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 0ec1581619db..e34c72285b4e 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -95,7 +95,7 @@ static notrace void booke_load_dbcr0(void)
 #endif
 }
 
-static void check_return_regs_valid(struct pt_regs *regs)
+static notrace void check_return_regs_valid(struct pt_regs *regs)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
unsigned long trap, srr0, srr1;
-- 
2.39.2



Re: [PATCH v8 0/7] Add pci_dev_for_each_resource() helper and update users

2023-04-06 Thread Andy Shevchenko
On Wed, Apr 05, 2023 at 03:18:32PM -0500, Bjorn Helgaas wrote:
> On Wed, Apr 05, 2023 at 11:28:27AM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 04, 2023 at 11:11:01AM -0500, Bjorn Helgaas wrote:
> > > On Thu, Mar 30, 2023 at 07:24:27PM +0300, Andy Shevchenko wrote:

...

> > > I omitted
> > > 
> > >   [1/7] kernel.h: Split out COUNT_ARGS() and CONCATENATE()
> > > 
> > > only because it's not essential to this series and has only a trivial
> > > one-line impact on include/linux/pci.h.
> > 
> > I'm not sure I understood what exactly "essentiality" means to you, but
> > I included that because it makes the split which can be used later by
> > others and not including kernel.h in the header is the objective I want
> > to achieve. Without this patch the achievement is going to be deferred.
> > Yet, this, as you have noticed, allows to compile and use the macros in
> > the rest of the patches.
> 
> I haven't followed the kernel.h splitting, and I try to avoid
> incidental changes outside of the files I maintain, so I just wanted
> to keep this series purely PCI and avoid any possible objections to a
> new include file or discussion about how it should be done.

Okay, fair enough :-) Thank you for elaboration, I will send the new version of
patch 7 separately.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 13/21] arc: dma-mapping: skip invalidating before bidirectional DMA

2023-04-06 Thread Shahab Vahedi
On 4/2/23 08:52, Vineet Gupta wrote:
> CC Shahab
> 
> On 3/27/23 17:43, Arnd Bergmann wrote:
>> From: Arnd Bergmann
>>
>> Some architectures that need to invalidate buffers after bidirectional
>> DMA because of speculative prefetching only do a simpler writeback
>> before that DMA, while architectures that don't need to do the second
>> invalidate tend to have a combined writeback+invalidate before the
>> DMA.
>>
>> arc is one of the architectures that does both, which seems unnecessary.
>>
>> Change it to behave like arm/arm64/xtensa instead, and use just a
>> writeback before the DMA when we do the invalidate afterwards.
>>
>> Signed-off-by: Arnd Bergmann
> 
> Reviewed-by: Vineet Gupta 
> 
> Shahab can you give this a spin on hsdk - run glibc testsuite over ssh
> and make sure nothing strange happens.
> 
> Thx,
> -Vineet

Tested-by: Shahab Vahedi 

No regression was observed for the ARC target before and after applying
these 21 patches. The test environment and its summary follow.

board:  ARC HSDK
base:   repo:   linux-next
tag:next-20230403
commit: 31bd35b66249 Add linux-next specific files for 20230403
hotfix: net: stmmac: check fwnode for phy device before scanning for phy [1]
glibc:  2.37

Summary of test results:
 20 FAIL
   4227 PASS
 38 UNSUPPORTED
 16 XFAIL
  2 XPASS

[1]
https://lore.kernel.org/lkml/20230405093945.3549491-1-michael.wei.hong@intel.com/#r

-- 
Shahab



RE: [PATCH v2 0/5] locking: Introduce local{,64}_try_cmpxchg

2023-04-06 Thread David Laight
From: Uros Bizjak
> Sent: 06 April 2023 09:39
> 
> On Thu, Apr 6, 2023 at 10:26 AM David Laight  wrote:
> >
> > From: Dave Hansen
> > > Sent: 05 April 2023 17:37
> > >
> > > On 4/5/23 07:17, Uros Bizjak wrote:
> > > > Add generic and target specific support for local{,64}_try_cmpxchg
> > > > and wire up support for all targets that use local_t infrastructure.
> > >
> > > I feel like I'm missing some context.
> > >
> > > What are the actual end user visible effects of this series?  Is there a
> > > measurable decrease in perf overhead?  Why go to all this trouble for
> > > perf?  Who else will use local_try_cmpxchg()?
> >
> > I'm assuming the local_xxx operations only have to be safe wrt interrupts?
> > On x86 it is possible that an alternate instruction sequence
> > that doesn't use a locked instruction may actually be faster!
> 
> Please note that "local" functions do not use lock prefix. Only atomic
> properties of cmpxchg instruction are exploited since it only needs to
> be safe wrt interrupts.

Gah, I was assuming that LOCK was implied - like it is for xchg
and all the bit instructions.

In any case I suspect it makes little difference unless the
locked variant affects the instruction pipeline.
In fact, you may want to stop the cacheline being invalidated
between the read and write in order to avoid an extra cache
line bounce.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH v2 0/5] locking: Introduce local{,64}_try_cmpxchg

2023-04-06 Thread Uros Bizjak
On Thu, Apr 6, 2023 at 10:26 AM David Laight  wrote:
>
> From: Dave Hansen
> > Sent: 05 April 2023 17:37
> >
> > On 4/5/23 07:17, Uros Bizjak wrote:
> > > Add generic and target specific support for local{,64}_try_cmpxchg
> > > and wire up support for all targets that use local_t infrastructure.
> >
> > I feel like I'm missing some context.
> >
> > What are the actual end user visible effects of this series?  Is there a
> > measurable decrease in perf overhead?  Why go to all this trouble for
> > perf?  Who else will use local_try_cmpxchg()?
>
> I'm assuming the local_xxx operations only have to be safe wrt interrupts?
> On x86 it is possible that an alternate instruction sequence
> that doesn't use a locked instruction may actually be faster!

Please note that "local" functions do not use lock prefix. Only atomic
properties of cmpxchg instruction are exploited since it only needs to
be safe wrt interrupts.

Uros.

> Although, maybe, any kind of locked cmpxchg just needs to ensure
> the cache line isn't 'stolen', so apart from possible slight
> delays on another cpu that gets a cache miss for the line in
> all makes little difference.
> The cache line miss costs a lot anyway, line bouncing more
> and is best avoided.
> So is there actually much of a benefit at all?
>
> Clearly the try_cmpxchg help - but that is a different issue.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
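
For context on the try_cmpxchg form this thread refers to, here is a
user-space sketch of the two calling conventions, built on C11 atomics.
It only illustrates the shape of the API: the demo_* names are hypothetical,
and C11 atomics are stronger than what local_t needs, so this says nothing
about the instructions the kernel helpers actually emit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Classic shape: returns the value found; the caller compares it itself. */
static inline long demo_cmpxchg(_Atomic long *p, long old, long newval)
{
	/* On failure, 'old' is overwritten with the current value of *p. */
	atomic_compare_exchange_strong(p, &old, newval);
	return old;
}

/* "try" shape: returns success and refreshes *old on failure. */
static inline bool demo_try_cmpxchg(_Atomic long *p, long *old, long newval)
{
	return atomic_compare_exchange_strong(p, old, newval);
}

/* Update loop with the classic shape: explicit compare and reload. */
static inline long demo_add_cmpxchg(_Atomic long *p, long delta)
{
	long old = atomic_load(p);

	for (;;) {
		long seen = demo_cmpxchg(p, old, old + delta);

		if (seen == old)
			return old + delta;
		old = seen;
	}
}

/* Same loop with the "try" shape: the failure path reloads for free. */
static inline long demo_add_try_cmpxchg(_Atomic long *p, long delta)
{
	long old = atomic_load(p);

	while (!demo_try_cmpxchg(p, &old, old + delta))
		;
	return old + delta;
}
```

The second loop needs no separate comparison or reload in the caller, which
is the convenience the try variant offers regardless of any locking cost.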


RE: [PATCH v2 0/5] locking: Introduce local{,64}_try_cmpxchg

2023-04-06 Thread David Laight
From: Dave Hansen
> Sent: 05 April 2023 17:37
> 
> On 4/5/23 07:17, Uros Bizjak wrote:
> > Add generic and target specific support for local{,64}_try_cmpxchg
> > and wire up support for all targets that use local_t infrastructure.
> 
> I feel like I'm missing some context.
> 
> What are the actual end user visible effects of this series?  Is there a
> measurable decrease in perf overhead?  Why go to all this trouble for
> perf?  Who else will use local_try_cmpxchg()?

I'm assuming the local_xxx operations only have to be safe wrt interrupts?
On x86 it is possible that an alternate instruction sequence
that doesn't use a locked instruction may actually be faster!

Although, maybe, any kind of locked cmpxchg just needs to ensure
the cache line isn't 'stolen', so apart from possible slight
delays on another cpu that gets a cache miss for the line in
all makes little difference.
The cache line miss costs a lot anyway, line bouncing more
and is best avoided.
So is there actually much of a benefit at all?

Clearly the try_cmpxchg help - but that is a different issue.

David



[PATCH] powerpc/bpf: populate extable entries only during the last pass

2023-04-06 Thread Hari Bathini
Since commit 85e031154c7c ("powerpc/bpf: Perform complete extra passes
to update addresses"), two additional passes are performed to avoid
space and CPU time wastage on powerpc. But these extra passes led to
WARN_ON_ONCE() hits in bpf_add_extable_entry(). Fix it by not adding
extable entries during the extra pass.

Fixes: 85e031154c7c ("powerpc/bpf: Perform complete extra passes to update addresses")
Signed-off-by: Hari Bathini 
---
 arch/powerpc/net/bpf_jit_comp32.c | 2 +-
 arch/powerpc/net/bpf_jit_comp64.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 7f91ea064c08..e788b1fbeee6 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -977,7 +977,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
if (size != BPF_DW && !fp->aux->verifier_zext)
EMIT(PPC_RAW_LI(dst_reg_h, 0));
 
-   if (BPF_MODE(code) == BPF_PROBE_MEM) {
+   if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) {
int insn_idx = ctx->idx - 1;
int jmp_off = 4;
 
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 8dd3cabaa83a..1cc2777ec846 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -921,7 +921,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
addrs[++i] = ctx->idx * 4;
 
-   if (BPF_MODE(code) == BPF_PROBE_MEM) {
+   if (BPF_MODE(code) == BPF_PROBE_MEM && !extra_pass) {
ret = bpf_add_extable_entry(fp, image, pass, 
ctx, ctx->idx - 1,
4, dst_reg);
if (ret)
-- 
2.39.2



Re: [PATCH v4 0/3] Use dma_default_coherent for devicetree default coherency

2023-04-06 Thread Jiaxun Yang



> On April 1, 2023, at 10:15, Jiaxun Yang wrote:
> 
> Hi all,
> 
> This series split out second half of my previous series
> "[PATCH 0/4] MIPS DMA coherence fixes".
> 
> It intends to use dma_default_coherent to determine the default coherency of
> devicetree probed devices instead of hardcoding it with Kconfig options.
> 
> For some MIPS systems, dma_default_coherent is determined with either
> bootloader or hardware registers in platform initialization code, and devicetree
> does not explicitly specify the coherency of the device, so we need the ability
> to change the default coherency of devicetree probed devices.
> 
> For other platforms that supports noncoherent, dma_default_coherent is a fixed
> value set by arch code. It's defaulted to false for most archs except RISC-V
> and powerpc in some cases.

Ping.

Are there any remaining issues with this series?

Thanks
Jiaxun

> 
> Thanks
> - Jiaxun
> ---
> v2:
>  - Add PATCH 1 to help with backporting
>  - Use Kconfig option to set dma_default_coherent 
> 
> v3:
>  - Style fixes
>  - Squash setting ARCH_DMA_DEFAULT_COHERENT into PATCH 4
>  - Setting ARCH_DMA_DEFAULT_COHERENT for PowerPC
> 
> v4:
>  - Drop first patch
> 
> Jiaxun Yang (3):
>  dma-mapping: Provide a fallback dma_default_coherent
>  dma-mapping: Provide CONFIG_ARCH_DMA_DEFAULT_COHERENT
>  of: address: Always use dma_default_coherent for default coherency
> 
> arch/powerpc/Kconfig| 2 +-
> arch/riscv/Kconfig  | 2 +-
> drivers/of/Kconfig  | 4 
> drivers/of/address.c| 2 +-
> include/linux/dma-map-ops.h | 2 ++
> kernel/dma/Kconfig  | 7 +++
> kernel/dma/mapping.c| 6 +-
> 7 files changed, 17 insertions(+), 8 deletions(-)
> 
> -- 
> 2.39.2 (Apple Git-143)
> 



Re: [PATCH] powerpc: Use of_address_to_resource()

2023-04-06 Thread Michael Ellerman
Michael Ellerman  writes:
> On Sun, 19 Mar 2023 11:31:53 -0500, Rob Herring wrote:
>> Replace open coded reading of "reg" or of_get_address()/
>> of_translate_address() calls with a single call to
>> of_address_to_resource().
>> 
>> 
>
> Applied to powerpc/next.
>
> [1/1] powerpc: Use of_address_to_resource()
>   
> https://git.kernel.org/powerpc/c/2500763dd3db37fad94d9b506907c59c2f5e97c6

I actually merged v3, b4 got confused when sending the thanks.

cheers