Re: [PATCH] powerpc/eeh: Permanently disable the removed device

2024-04-14 Thread Ganesh G R

On 4/9/24 14:37, Michael Ellerman wrote:


Hi Ganesh,

Ganesh Goudar  writes:

When a device is hot removed on powernv, the hotplug
driver clears the device's state. However, on pseries,
if a device is removed by phyp after reaching the error
threshold, the kernel remains unaware, leading to the
device not being torn down. This prevents necessary
remediation actions like failover.

Permanently disable the device if the presence check
fails.

You can wrap your changelogs a bit wider, 70 or 80 columns is fine.


ok


diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index ab316e155ea9..8d1606406d3f 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -508,7 +508,9 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
 * state, PE is in good state.
 */
if ((ret < 0) ||
-   (ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
+   (ret == EEH_STATE_NOT_SUPPORT &&
+dev->error_state == pci_channel_io_perm_failure) ||
+   eeh_state_active(ret)) {
eeh_stats.false_positives++;
pe->false_positives++;
rc = 0;

How does this hunk relate to the changelog?

This is adding an extra condition to the false positive check, so
there's a risk this causes devices to go into failure when previously
they didn't, right? So please explain why it's a good change. The
comment above the if needs updating too.


We need this change to log the event and get the device removed. I will
explain this in the commit message.
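
To make the behaviour change in the hunk above concrete, here is a minimal
user-space sketch of the two checks. It is only a model: the numeric value of
EEH_STATE_NOT_SUPPORT, the channel_state enum and both function names are
placeholders invented for illustration, and the state_active flag stands in
for the result of eeh_state_active(ret).

```c
#include <stdbool.h>

/* Placeholder value for illustration only; the kernel defines the real one. */
#define EEH_STATE_NOT_SUPPORT 0x2

/* Hypothetical stand-in for the PCI channel state of the device. */
enum channel_state { CHANNEL_NORMAL, CHANNEL_PERM_FAILURE };

/* Check before the patch: any "not supported" state is a false positive. */
bool false_positive_before(int ret, bool state_active)
{
    return ret < 0 || ret == EEH_STATE_NOT_SUPPORT || state_active;
}

/*
 * Check as posted: "not supported" only counts as a false positive when
 * the device is already marked permanently failed.
 */
bool false_positive_after(int ret, enum channel_state error_state,
                          bool state_active)
{
    return ret < 0 ||
           (ret == EEH_STATE_NOT_SUPPORT &&
            error_state == CHANNEL_PERM_FAILURE) ||
           state_active;
}
```

In this model a device that reports "not supported" while still in the normal
channel state is treated as a false positive before the patch, but falls
through to failure handling after it, which is the case Michael asks about.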


diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 48773d2d9be3..10317badf471 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -867,7 +867,13 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
if (!devices) {
pr_debug("EEH: Frozen PHB#%x-PE#%x is empty!\n",
pe->phb->global_number, pe->addr);
-   goto out; /* nothing to recover */

The other cases that go to recover_failed usually print something at
warn level, so this probably should too. So either make the above a
pr_warn(), or change it to a warn with a more helpful message.


ok


+   /*
+* The device is removed, Tear down its state,
+* On powernv hotplug driver would take care of
+* it but not on pseries, Permanently disable the
+* card as it is hot removed.
+*/

Formatting and punctuation is weird. It can be wider, and capital letter
is only required after a full stop, not a comma.


OK, I will take care of it.


Also you say that the powernv hotplug driver "would" take care of it,
that's past tense, is that what you mean? Does the powernv hotplug
driver still take care of it after this change? And (how) does that
driver cope with it happening here also?


Yes, the hotplug driver can still remove the device, and the removal of
the device is covered by the PCI rescan lock.


+   goto recover_failed;
}


cheers


[PATCH v2] KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exception

2024-04-14 Thread Vaibhav Jain
This reverts commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not
cancel pending decrementer exception") [1] which prevented canceling a
pending HDEC exception for nestedv2 KVM guests. It was done to avoid
overhead of a H_GUEST_GET_STATE hcall to read the 'DEC expiry TB' register
which was higher compared to handling extra decrementer exceptions.

However, recent benchmarks indicate that the overhead of not handling 'DECR'
expiry for a nested KVM guest (L2) is higher and results in many more exits
to the pseries host (L1), as indicated by the Unixbench-arithoh bench [2]:

Metric                | Current upstream | Revert [1]  | Difference %
----------------------+------------------+-------------+-------------
arithoh-count (10)    | 3244831634       | 3403089673  | +04.88%
kvm_hv:kvm_guest_exit | 513558           | 152441      | -70.32%
probe:kvmppc_gsb_recv | 28060            | 28110       | +00.18%

N=1

As the data above indicates, reverting [1] results in a substantial
reduction in the number of L2->L1 exits, with only a slight increase in the
number of H_GUEST_GET_STATE hcalls to read the value of 'DEC expiry TB'.
This results in an overall ~4% improvement in arithoh [2] throughput.
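
The "Difference %" column can be reproduced with the usual relative-change
formula. The helper below is only a sketch added for checking the arithmetic;
the function name is made up and is not part of the patch.

```c
/* Relative change of `after` versus `before`, expressed in percent. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, percent_change(513558, 152441) is roughly -70.32 and
percent_change(3244831634.0, 3403089673.0) is roughly +4.88, matching the
table above.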

[1] commit 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
[2] https://github.com/kdlucas/byte-unixbench/

Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
Signed-off-by: Vaibhav Jain 

---
Changelog:
Since v1: https://lore.kernel.org/all/20240313072625.76804-1-vaib...@linux.ibm.com
* Updated/Corrected patch title and description
* Included data on test benchmark results for Unixbench-arithoh bench.
---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 8e86eb577eb8..692a7c6f5fd9 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4857,7 +4857,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 * entering a nested guest in which case the decrementer is now owned
 * by L2 and the L1 decrementer is provided in hdec_expires
 */
-   if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
+   if (kvmppc_core_pending_dec(vcpu) &&
((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
 (trap == BOOK3S_INTERRUPT_SYSCALL &&
  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
-- 
2.44.0
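
For readers comparing the two versions of the condition in the hunk above, a
small user-space model may help. Everything here is an assumption made for
illustration: struct vcpu_model and its fields stand in for the results of
kvmhv_is_nestedv2(), kvmppc_core_pending_dec(), kvmppc_dec_expires_host_tb()
and the H_ENTER_NESTED syscall test, and the two function names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the inputs to the condition in the hunk. */
struct vcpu_model {
    bool nestedv2;                /* running as a nestedv2 guest */
    bool pending_dec;             /* a decrementer exception is queued */
    bool entering_nested;         /* trap was a syscall requesting H_ENTER_NESTED */
    uint64_t dec_expires_host_tb; /* decrementer expiry in host timebase */
};

/* Condition before the revert: nestedv2 guests never cancel. */
bool cancel_dec_before(const struct vcpu_model *v, uint64_t tb)
{
    return !v->nestedv2 && v->pending_dec &&
           (tb < v->dec_expires_host_tb || v->entering_nested);
}

/* Condition after the revert: the nestedv2 exclusion is gone. */
bool cancel_dec_after(const struct vcpu_model *v, uint64_t tb)
{
    return v->pending_dec &&
           (tb < v->dec_expires_host_tb || v->entering_nested);
}
```

In this model a nestedv2 vcpu with a queued decrementer exception that has not
yet expired keeps the exception before the revert and has it cancelled after.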



Re: [PATCH] bug: Fix no-return-statement warning with !CONFIG_BUG

2024-04-14 Thread Michael Ellerman
"Arnd Bergmann"  writes:
> On Thu, Apr 11, 2024, at 11:27, Adrian Hunter wrote:
>> On 11/04/24 11:22, Christophe Leroy wrote:
>>> Le 11/04/2024 à 10:12, Christophe Leroy a écrit :

>>>> Looking at the report, I think the correct fix should be to use
>>>> BUILD_BUG() instead of BUG()
>>> 
>>> I confirm the error goes away with the following change to next-20240411 
>>> on powerpc tinyconfig with gcc 13.2
>>> 
>>> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>>> index 4e18db1819f8..3d5ac0cdd721 100644
>>> --- a/kernel/time/timekeeping.c
>>> +++ b/kernel/time/timekeeping.c
>>> @@ -282,7 +282,7 @@ static inline void timekeeping_check_update(struct timekeeper *tk, u64 offset)
>>>   }
>>>   static inline u64 timekeeping_debug_get_ns(const struct tk_read_base *tkr)
>>>   {
>>> -   BUG();
>>> +   BUILD_BUG();
>>>   }
>>>   #endif
>>> 
>>
>> That is fragile because it depends on defined(__OPTIMIZE__),
>> so it should still be:
>
> If there is a function that is defined but that must never be
> called, I think we are doing something wrong.

It's a pretty inevitable result of using IS_ENABLED(), which the docs
encourage people to use.

In this case it could easily be turned into a build error by just making
it an extern rather than a static inline.

But I think Christophe's solution is actually better, because it's more
explicit, ie. this function should not be called and if it is that's a
build time error.
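
As a rough illustration of the extern variant mentioned above, here is a
user-space sketch. The macro DEBUG_TIMEKEEPING_ENABLED is a hypothetical
stand-in for an IS_ENABLED(CONFIG_...) test, and debug_get_ns() and get_ns()
are invented names, not the timekeeping functions themselves.

```c
/* Hypothetical stand-in for IS_ENABLED(CONFIG_...); 0 means "option off". */
#define DEBUG_TIMEKEEPING_ENABLED 0

/*
 * Declared but deliberately never defined. Any call that survives into
 * the object file fails at link time instead of at run time.
 */
extern unsigned long debug_get_ns(unsigned long raw);

unsigned long get_ns(unsigned long raw)
{
    /* The condition is a compile-time constant, so the call is dead code. */
    if (DEBUG_TIMEKEEPING_ENABLED)
        return debug_get_ns(raw);

    return raw;
}
```

This only links because the compiler discards the constant-false branch. If
the macro were 1, the missing definition would show up as an undefined
reference, which is the build error being described; BUILD_BUG() states the
same intent explicitly at the call site.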

cheers


Re: [PATCH 1/3] x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n

2024-04-14 Thread Stephen Rothwell
Hi all,

On Sat, 13 Apr 2024 19:38:47 +1000 Michael Ellerman  wrote:
>
> Michael Ellerman  writes:
> > Stephen Rothwell  writes:  
> ...
> >> On Tue,  9 Apr 2024 10:51:05 -0700 Sean Christopherson wrote:
> ...
> >>> diff --git a/kernel/cpu.c b/kernel/cpu.c
> >>> index 8f6affd051f7..07ad53b7f119 100644
> >>> --- a/kernel/cpu.c
> >>> +++ b/kernel/cpu.c
> >>> @@ -3207,7 +3207,8 @@ enum cpu_mitigations {
> >>>  };
> >>>  
> >>>  static enum cpu_mitigations cpu_mitigations __ro_after_init =
> >>> - CPU_MITIGATIONS_AUTO;
> >>> + IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) ? CPU_MITIGATIONS_AUTO :
> >>> +  CPU_MITIGATIONS_OFF;
> >>>  
> >>>  static int __init mitigations_parse_cmdline(char *arg)
> >>>  {  
> 
> I think a minimal workaround/fix would be:
> 
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 2b8fd6bb7da0..290be2f9e909 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -191,6 +191,10 @@ config GENERIC_CPU_AUTOPROBE
>  config GENERIC_CPU_VULNERABILITIES
> bool
> 
> +config SPECULATION_MITIGATIONS
> +   def_bool y
> +   depends on !X86
> +
>  config SOC_BUS
> bool
> select GLOB

The original commit is now in Linus' tree.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v12 8/8] PCI: endpoint: Remove "core_init_notifier" flag

2024-04-14 Thread Manivannan Sadhasivam
On Fri, Apr 12, 2024 at 03:22:16PM -0500, Bjorn Helgaas wrote:
> On Wed, Mar 27, 2024 at 02:43:37PM +0530, Manivannan Sadhasivam wrote:
> > "core_init_notifier" flag is set by the glue drivers requiring refclk from
> > the host to complete the DWC core initialization. Also, those drivers will
> > send a notification to the EPF drivers once the initialization is fully
> > completed using the pci_epc_init_notify() API. Only then, the EPF drivers
> > will start functioning.
> > 
> > For the rest of the drivers generating refclk locally, EPF drivers will
> > start functioning post binding with them. EPF drivers rely on the
> > 'core_init_notifier' flag to differentiate between the drivers.
> > Unfortunately, this creates two different flows for the EPF drivers.
> > 
> > So to avoid that, let's get rid of the "core_init_notifier" flag and follow
> > a single initialization flow for the EPF drivers. This is done by calling
> > the dw_pcie_ep_init_notify() from all glue drivers after the completion of
> > dw_pcie_ep_init_registers() API. This will allow all the glue drivers to
> > send the notification to the EPF drivers once the initialization is fully
> > completed.
> 
> Thanks for doing this!  I think this is a significantly nicer
> solution than core_init_notifier was.
> 
> One question: both qcom and tegra194 call dw_pcie_ep_init_registers()
> from an interrupt handler, but they register that handler in a
> different order with respect to dw_pcie_ep_init().
> 
> I don't know what actually starts the process that leads to the
> interrupt, but if it's dw_pcie_ep_init(), then one of these (qcom, I
> think) must be racy:
> 

Your analysis is correct. But there is no race observed as of now since the IRQ
will only be enabled by configuring the endpoint using configfs interface and
right now I use an init script to do that. By that time, the driver would've
already probed completely.

But there is a slight chance that if the driver gets loaded as a module and
a userspace script starts configuring the endpoint interface using an inotify
watch or something similar, then a race could occur since the IRQ handler may
not be registered at that point.

>   qcom_pcie_ep_probe
> dw_pcie_ep_init <- A
> qcom_pcie_ep_enable_irq_resources
>   devm_request_threaded_irq(qcom_pcie_ep_perst_irq_thread)  <- B
> 
>   qcom_pcie_ep_perst_irq_thread
> qcom_pcie_perst_deassert
>   dw_pcie_ep_init_registers
> 
>   tegra_pcie_dw_probe
> tegra_pcie_config_ep
>   devm_request_threaded_irq(tegra_pcie_ep_pex_rst_irq)  <- B
>   dw_pcie_ep_init   <- A
> 
>   tegra_pcie_ep_pex_rst_irq
> pex_ep_event_pex_rst_deassert
>   dw_pcie_ep_init_registers
> 
> Whatever the right answer is, I think qcom and tegra194 should both
> order dw_pcie_ep_init() and the devm_request_threaded_irq() the same
> way.
> 

Agree. The right way is to register the IRQ handler first and then do
dw_pcie_ep_init(). I will fix it in the qcom driver.
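
The ordering issue can be shown with a tiny user-space model. This is only a
sketch: struct ep_model and the ep_*() functions are invented for
illustration and merely mimic "request the IRQ" and "initialize the endpoint";
they are not the DWC or qcom driver APIs.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*demo_handler_t)(void *ctx);

/* Hypothetical endpoint state for the model. */
struct ep_model {
    demo_handler_t handler; /* NULL until the IRQ is requested */
    bool initialized;       /* set by the init step */
    int registers_inited;   /* how often the handler body ran */
};

/* Stands in for the PERST# deassert handler that initializes registers. */
static void perst_handler(void *ctx)
{
    struct ep_model *ep = ctx;

    ep->registers_inited++;
}

/* Models devm_request_threaded_irq(): from here on, events are handled. */
void ep_request_irq(struct ep_model *ep)
{
    ep->handler = perst_handler;
}

/* Models the init step after which the IRQ may be triggered. */
void ep_init(struct ep_model *ep)
{
    ep->initialized = true;
}

/* Returns false when the event arrives with no handler registered. */
bool ep_raise_irq(struct ep_model *ep)
{
    if (ep->handler == NULL)
        return false;

    ep->handler(ep);
    return true;
}
```

In the model, calling ep_init() and then raising the interrupt before
ep_request_irq() loses the event, while requesting the IRQ first handles it
no matter when the init step runs.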

Thanks for spotting!

- Mani

-- 
மணிவண்ணன் சதாசிவம்


[powerpc] WARN at drivers/md/dm-bio-prison-v1.c:128 [dm_bio_prison]

2024-04-14 Thread Sachin Sant
While running file system tests (xfstests) on IBM Power, the following
warning was seen:

[ 750.845015] run fstests generic/347 at 2024-04-13 03:58:42
[ 751.017900] XFS (loop0): Mounting V5 Filesystem 
998a731d-ad3f-467d-ad31-92990b381696
[ 751.019105] XFS (loop0): Ending clean mount
[ 751.372715] [ cut here ]
[ 751.372729] WARNING: CPU: 2 PID: 12 at drivers/md/dm-bio-prison-v1.c:128 
dm_cell_key_has_valid_range+0x44/0x68 [dm_bio_prison]
[ 751.372741] Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison 
dm_snapshot dm_bufio dm_flakey xfs loop dm_mod nft_fib_inet nft_fib_ipv4 
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject 
nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding 
tls rfkill ip_set nf_tables libcrc32c nfnetlink sunrpc pseries_rng vmx_crypto 
fuse ext4 mbcache jbd2 sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft 
crc64 sg ibmvscsi scsi_transport_srp ibmveth [last unloaded: scsi_debug]
[ 751.372785] CPU: 2 PID: 12 Comm: kworker/u256:1 Kdump: loaded Not tainted 
6.9.0-rc3-next-20240412 #1
[ 751.372790] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[ 751.372795] Workqueue: dm-thin do_worker [dm_thin_pool]
[ 751.372801] NIP: c0080ca80100 LR: c0080cfd66e8 CTR: c0080ca800bc
[ 751.372805] REGS: c4bbf9b0 TRAP: 0700 Not tainted 
(6.9.0-rc3-next-20240412)
[ 751.372810] MSR: 8282b033  CR: 
44002482 XER: 2004
[ 751.372820] CFAR: c0080ca800d4 IRQMASK: 0 
[ 751.372820] GPR00: c0080cfd66e8 c4bbfc50 c0080d008b00 
c4bbfcb8 
[ 751.372820] GPR04: c001f7afb1b8 c00156dd4a30 0005 
0400 
[ 751.372820] GPR08: 1000 1000  
c0080cfdf390 
[ 751.372820] GPR12: c0080ca800bc c00ecf00 c01a063c 
c4045380 
[ 751.372820] GPR16:    
 
[ 751.372820] GPR20: c4082000 c00156dd4890 0001 
c0006683d24c 
[ 751.372820] GPR24: c00156dd4a40   
c0006683d200 
[ 751.372820] GPR28: c001f7afb1b8 c0006683d200  
1000 
[ 751.372865] NIP [c0080ca80100] dm_cell_key_has_valid_range+0x44/0x68 
[dm_bio_prison]
[ 751.372871] LR [c0080cfd66e8] process_discard_bio+0xac/0x1f0 
[dm_thin_pool]
[ 751.372877] Call Trace:
[ 751.372880] [c4bbfd00] [c0080cfd89d4] 
process_thin_deferred_bios+0x158/0x428 [dm_thin_pool]
[ 751.372887] [c4bbfdc0] [c0080cfd8d00] 
process_deferred_bios+0x5c/0x2f4 [dm_thin_pool]
[ 751.372894] [c4bbfe00] [c0080cfd9098] do_worker+0x100/0x1f8 
[dm_thin_pool]
[ 751.372900] [c4bbfe40] [c019326c] process_one_work+0x20c/0x4f4
[ 751.372908] [c4bbfef0] [c01941ec] worker_thread+0x378/0x544
[ 751.372914] [c4bbff90] [c01a076c] kthread+0x138/0x140
[ 751.372919] [c4bbffe0] [c000df98] 
start_kernel_thread+0x14/0x18
[ 751.372924] Code: 28280400 4181002c 3929 794ab282 3861 7929b282 
7c2a4800 40820024 786307e0 4e800020 6000 6000 <0fe0> 3860 
786307e0 4e800020 
[ 751.372938] ---[ end trace  ]---
[ 751.372941] device-mapper: thin: Discard doesn't respect bio prison limits
[ 751.373000] device-mapper: thin: Discard doesn't respect bio prison limits
[ 751.373022] device-mapper: thin: Discard doesn't respect bio prison limits

This WARN_ON_ONCE was introduced by

commit 3f8d3f5432078a558151e27230e20bcf93c23ffe
dm bio prison v1: add dm_cell_key_has_valid_range


bool dm_cell_key_has_valid_range(struct dm_cell_key *key)
{
	if (WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE))
		return false;

— Sachin



Re: [RFC PATCH 5/7] x86/module: perpare module loading for ROX allocations of text

2024-04-14 Thread Mike Rapoport
On Fri, Apr 12, 2024 at 11:08:00AM +0200, Ingo Molnar wrote:
> 
> * Mike Rapoport  wrote:
> 
> > for (s = start; s < end; s++) {
> > void *addr = (void *)s + *s;
> > +   void *wr_addr = addr + module_writable_offset(mod, addr);
> 
> So instead of repeating this pattern in a dozen of places, why not use a 
> simpler method:
> 
>   void *wr_addr = module_writable_address(mod, addr);
> 
> or so, since we have to pass 'addr' to the module code anyway.

Agree.
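
Something along these lines, perhaps. This is only a user-space sketch of the
shape of such a helper: struct module_model, its fields and the body of
module_writable_offset() are assumptions made for illustration and do not
reflect how the RFC tracks the writable copy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: module text at rox_base, writable copy at rw_base. */
struct module_model {
    uintptr_t rox_base;
    uintptr_t rw_base;
    size_t size;
};

/* Offset to add to a text address to reach its writable alias, else 0. */
intptr_t module_writable_offset(const struct module_model *mod,
                                const void *addr)
{
    uintptr_t a = (uintptr_t)addr;

    if (a >= mod->rox_base && a - mod->rox_base < mod->size)
        return (intptr_t)(mod->rw_base - mod->rox_base);

    return 0;
}

/* The wrapper suggested above: callers no longer add the offset by hand. */
void *module_writable_address(const struct module_model *mod, void *addr)
{
    return (void *)((uintptr_t)addr +
                    (uintptr_t)module_writable_offset(mod, addr));
}
```

Call sites then shrink to a single line,
wr_addr = module_writable_address(mod, addr), and addresses outside the
module text come back unchanged.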
 
> The text patching code is pretty complex already.
> 
> Thanks,
> 
>   Ingo

-- 
Sincerely yours,
Mike.


Re: [RFC PATCH 2/7] mm: vmalloc: don't account for number of nodes for HUGE_VMAP allocations

2024-04-14 Thread Mike Rapoport
On Fri, Apr 12, 2024 at 06:07:19AM +, Christophe Leroy wrote:
> 
> 
> Le 11/04/2024 à 18:05, Mike Rapoport a écrit :
> > From: "Mike Rapoport (IBM)" 
> > 
> > vmalloc allocations with VM_ALLOW_HUGE_VMAP that do not explicitly
> > specify node ID will use huge pages only if size_per_node is larger than
> > PMD_SIZE.
> > Still the actual allocated memory is not distributed between nodes and
> > there is no advantage in such approach.
> > On the contrary, BPF allocates PMD_SIZE * num_possible_nodes() for each
> > new bpf_prog_pack, while it could do with PMD_SIZE'ed packs.
> > 
> > Don't account for number of nodes for VM_ALLOW_HUGE_VMAP with
> > NUMA_NO_NODE and use huge pages whenever the requested allocation size
> > is larger than PMD_SIZE.
> 
> Patch looks ok but message is confusing. We also use huge pages at PTE 
> size, for instance 512k pages or 16k pages on powerpc 8xx, while 
> PMD_SIZE is 4M.

Ok, I'll rephrase.
 
> Christophe
> 
> > 
> > Signed-off-by: Mike Rapoport (IBM) 
> > ---
> >   mm/vmalloc.c | 9 ++---
> >   1 file changed, 2 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 22aa63f4ef63..5fc8b514e457 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3737,8 +3737,6 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
> > }
> >   
> > if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP)) {
> > -   unsigned long size_per_node;
> > -
> > /*
> >  * Try huge pages. Only try for PAGE_KERNEL allocations,
> >  * others like modules don't yet expect huge pages in
> > @@ -3746,13 +3744,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
> >  * supporting them.
> >  */
> >   
> > -   size_per_node = size;
> > -   if (node == NUMA_NO_NODE)
> > -   size_per_node /= num_online_nodes();
> > -   if (arch_vmap_pmd_supported(prot) && size_per_node >= PMD_SIZE)
> > +   if (arch_vmap_pmd_supported(prot) && size >= PMD_SIZE)
> > shift = PMD_SHIFT;
> > else
> > -   shift = arch_vmap_pte_supported_shift(size_per_node);
> > +   shift = arch_vmap_pte_supported_shift(size);
> >   
> > align = max(real_align, 1UL << shift);
> > size = ALIGN(real_size, 1UL << shift);
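
The effect of dropping size_per_node can be sketched in user space as follows.
The constants are assumptions chosen for illustration (4K base pages and a 2M
PMD), the model assumes PMD mappings are supported, and it simplifies
arch_vmap_pte_supported_shift() to always return the base page shift, which
is not true on every platform (see the 8xx remark above).

```c
/* Illustrative constants only: 4K base pages and a 2M PMD. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PMD_SHIFT  21
#define DEMO_PMD_SIZE   (1UL << DEMO_PMD_SHIFT)

/* Old behaviour: with no node given, the size is split across nodes first. */
unsigned int shift_before(unsigned long size, unsigned int online_nodes,
                          int node_unspecified)
{
    unsigned long size_per_node = size;

    if (node_unspecified)
        size_per_node /= online_nodes;

    return size_per_node >= DEMO_PMD_SIZE ? DEMO_PMD_SHIFT : DEMO_PAGE_SHIFT;
}

/* New behaviour: only the requested size decides. */
unsigned int shift_after(unsigned long size)
{
    return size >= DEMO_PMD_SIZE ? DEMO_PMD_SHIFT : DEMO_PAGE_SHIFT;
}
```

With two online nodes, a PMD-sized request without a node ID gets the base
page shift in the old model and the PMD shift in the new one.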

-- 
Sincerely yours,
Mike.


Re: [PATCH v4 06/15] mm/execmem, arch: convert simple overrides of module_alloc to execmem

2024-04-14 Thread Mike Rapoport
On Thu, Apr 11, 2024 at 10:53:46PM +0200, Sam Ravnborg wrote:
> Hi Mike.
> 
> On Thu, Apr 11, 2024 at 07:00:42PM +0300, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" 
> > 
> > Several architectures override module_alloc() only to define address
> > range for code allocations different than VMALLOC address space.
> > 
> > Provide a generic implementation in execmem that uses the parameters for
> > address space ranges, required alignment and page protections provided
> > by architectures.
> > 
> > The architectures must fill execmem_info structure and implement
> > execmem_arch_setup() that returns a pointer to that structure. This way the
> > execmem initialization won't be called from every architecture, but rather
> > from a central place, namely a core_initcall() in execmem.
> > 
> > The execmem provides execmem_alloc() API that wraps __vmalloc_node_range()
> > with the parameters defined by the architectures.  If an architecture does
> > not implement execmem_arch_setup(), execmem_alloc() will fall back to
> > module_alloc().
> > 
> > Signed-off-by: Mike Rapoport (IBM) 
> > ---
> 
> This code snippet could be more readable ...
> > diff --git a/arch/sparc/kernel/module.c b/arch/sparc/kernel/module.c
> > index 66c45a2764bc..b70047f944cc 100644
> > --- a/arch/sparc/kernel/module.c
> > +++ b/arch/sparc/kernel/module.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -21,34 +22,26 @@
> >  
> >  #include "entry.h"
> >  
> > +static struct execmem_info execmem_info __ro_after_init = {
> > +   .ranges = {
> > +   [EXECMEM_DEFAULT] = {
> >  #ifdef CONFIG_SPARC64
> > -
> > -#include 
> > -
> > -static void *module_map(unsigned long size)
> > -{
> > -   if (PAGE_ALIGN(size) > MODULES_LEN)
> > -   return NULL;
> > -   return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > -   GFP_KERNEL, PAGE_KERNEL, 0, NUMA_NO_NODE,
> > -   __builtin_return_address(0));
> > -}
> > +   .start = MODULES_VADDR,
> > +   .end = MODULES_END,
> >  #else
> > -static void *module_map(unsigned long size)
> > +   .start = VMALLOC_START,
> > +   .end = VMALLOC_END,
> > +#endif
> > +   .alignment = 1,
> > +   },
> > +   },
> > +};
> > +
> > +struct execmem_info __init *execmem_arch_setup(void)
> >  {
> > -   return vmalloc(size);
> > -}
> > -#endif /* CONFIG_SPARC64 */
> > -
> > -void *module_alloc(unsigned long size)
> > -{
> > -   void *ret;
> > -
> > -   ret = module_map(size);
> > -   if (ret)
> > -   memset(ret, 0, size);
> > +   execmem_info.ranges[EXECMEM_DEFAULT].pgprot = PAGE_KERNEL;
> >  
> > -   return ret;
> > +   return &execmem_info;
> >  }
> >  
> >  /* Make generic code ignore STT_REGISTER dummy undefined symbols.  */
> 
> ... if the following was added:
> 
> diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
> index 9e85d57ac3f2..62bcafe38b1f 100644
> --- a/arch/sparc/include/asm/pgtable_32.h
> +++ b/arch/sparc/include/asm/pgtable_32.h
> @@ -432,6 +432,8 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma,
> 
>  #define VMALLOC_START   _AC(0xfe60,UL)
>  #define VMALLOC_END _AC(0xffc0,UL)
> +#define MODULES_VADDR   VMALLOC_START
> +#define MODULES_END VMALLOC_END
> 
> 
> Then the #ifdef CONFIG_SPARC64 could be dropped and the code would be
> the same for 32 and 64 bits.
 
Yeah, the #ifdef there can be dropped even regardless of execmem.
I'll add a patch for that.
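
Roughly, the idea is that once the 32-bit header provides the MODULES_*
names, a single initializer works for both configurations. The sketch below
is a user-space model only: the addresses are made up, and
struct execmem_range_model is a cut-down stand-in for the execmem range
structure, not its real layout.

```c
/* Made-up addresses for illustration; the real ones come from the arch headers. */
#define VMALLOC_START 0x10000000UL
#define VMALLOC_END   0x20000000UL

/* What the 32-bit header would add, following Sam's suggestion. */
#define MODULES_VADDR VMALLOC_START
#define MODULES_END   VMALLOC_END

/* Hypothetical cut-down version of an execmem range description. */
struct execmem_range_model {
    unsigned long start;
    unsigned long end;
    unsigned int alignment;
};

/* No #ifdef CONFIG_SPARC64 needed: both configurations provide MODULES_*. */
const struct execmem_range_model default_range = {
    .start = MODULES_VADDR,
    .end = MODULES_END,
    .alignment = 1,
};
```

On 64-bit the same initializer would pick up the dedicated module range from
its own header, so the source stays identical.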

> Just a drive-by comment.
> 
>   Sam
> 

-- 
Sincerely yours,
Mike.