Re: [PATCH v5] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
On Thu, Oct 05, 2023 at 12:24:03PM -0400, Kris Van Hees wrote: > On Wed, Sep 27, 2023 at 05:35:16PM +, Alessandro Carminati (Red Hat) > wrote: > > It is not uncommon for drivers or modules related to similar peripherals > > to have symbols with the exact same name. > > While this is not a problem for the kernel's binary itself, it becomes an > > issue when attempting to trace or probe specific functions using > > infrastructure like ftrace or kprobe. > > > > The tracing subsystem relies on the `nm -n vmlinux` output, which provides > > symbol information from the kernel's ELF binary. However, when multiple > > symbols share the same name, the standard nm output does not differentiate > > between them. This can lead to confusion and difficulty when trying to > > probe the intended symbol. > > > > ~ # cat /proc/kallsyms | grep " name_show" > > 8c4f76d0 t name_show > > 8c9cccb0 t name_show > > 8cb0ac20 t name_show > > 8cc728c0 t name_show > > 8ce0efd0 t name_show > > 8ce126c0 t name_show > > 8ce1dd20 t name_show > > 8ce24e70 t name_show > > 8d1104c0 t name_show > > 8d1fe480 t name_show > > One problem that remains as far as I can see is that this approach does not > take into consideration that there can be duplicate symbols in the core > kernel, but also between core kernel and loadable modules, and even between > loadable modules. So, I think more is needed to also ensure that this > approach of adding alias symbols is also done for loadable modules. Re-reading my email a bit more, I realize that I did not accurate demonstrate the issue I arefer to. Obviously, the current patch provides aliases for duplicate symbols in the core kernel. And loadable modules are annotated with the module name. So, all is well unless modules have duplicate symbols themselves. And although it is rare, it does happen: # grep ' metadata_show' /proc/kallsyms c05659c0 t metadata_show[md_mod] c05739f0 t metadata_show[md_mod] This again shows a case where there are duplicate symbols that cannot be distinguished for tracing purposes without additional information. So, the proposed patch unfortunately does not cover all cases because of loadable modules. > Earlier work that cover all symbols (core kernel and loadable modules) was > posted quite a while ago by Nick Alcock: > > https://lore.kernel.org/all/20221205163157.269335-1-nick.alc...@oracle.com/ > > It takes a different approach and adds in other info that is very useful for > tracing, but unfortunately it has been dormant for a long time now. > > While module symbols are handled quite differently (for kallsyms) from the > core kernel symbols, I think that a similar approach tied in with modpost > ought to be quite possible. It will add to the size of modules because the > data needs to be stored in the .ko but that is unavoidable. But not doing it > unfortunately would mean that the duplicate symbol issue remains unresolved > in the presence of loadable modules. > > > kas_alias addresses this challenge by enhancing symbol names with > > meaningful suffixes generated from the source file and line number > > during the kernel build process. > > These newly generated aliases provide tracers with the ability to > > comprehend the symbols they are interacting with when utilizing the > > ftracefs interface. > > This approach may also allow for the probing by name of previously > > inaccessible symbols. > > > > ~ # cat /proc/kallsyms | grep gic_mask_irq > > d15671e505ac t gic_mask_irq > > d15671e505ac t gic_mask_irq@drivers_irqchip_irq_gic_c_167 > > d15671e532a4 t gic_mask_irq > > d15671e532a4 t gic_mask_irq@drivers_irqchip_irq_gic_v3_c_407 > > ~ # > > In the same context as mentioned above (module symbols), I am hoping that the > alias you generate might also be able to contain a module identifier name, > much like the aforementioned patch series by Nick Alcock added. We have it > for loadable modules already of course, but as has been discussed in relation > to the earlier work, being able to associate a module name with a symbol > regardless of whether that module is configured to be built into the kernel > or whether it is configured to be a loadable module is helpful for tracing > purposes. Especially for tracers that use tracing scripts that might get > deployed on diverse systems where it cannot always be known at the time of > developing the tracer scripts whether a kernel module is configured to be > loadable or built-in. > > I'd be happy to work on something like this as a contribution to your work. > I would envision the alias entry not needing to have the typical [module] > added > to it because that will already be annotated on the actual symbol entry. So, > the alias could be extended to be something like: > > c0533720 t floppy_open [floppy] > c0533720 t
Re: [PATCH] bus: mhi: host: Add tracing support
On Thu, Oct 05, 2023 at 03:55:20PM +0530, Krishna chaitanya chundru wrote: > This change adds ftrace support for following: > 1. mhi_intvec_threaded_handler > 2. mhi_process_data_event_ring > 3. mhi_process_ctrl_ev_ring > 4. mhi_gen_tre > 5. mhi_update_channel_state > 6. mhi_tryset_pm_state > 7. mhi_pm_st_worker This is not the best "problem description". > > Usage: > echo 1 > /sys/kernel/debug/tracing/events/mhi_host/enable > cat /sys/kernel/debug/tracing/trace This does not need to be included in the commit message, how to use the tracing framework is documented elsewhere. [..] > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c > index dcf627b36e82..499590437e9b 100644 > --- a/drivers/bus/mhi/host/main.c > +++ b/drivers/bus/mhi/host/main.c > @@ -491,11 +491,10 @@ irqreturn_t mhi_intvec_threaded_handler(int irq_number, > void *priv) > > state = mhi_get_mhi_state(mhi_cntrl); > ee = mhi_get_exec_env(mhi_cntrl); > - dev_dbg(dev, "local ee: %s state: %s device ee: %s state: %s\n", > - TO_MHI_EXEC_STR(mhi_cntrl->ee), > - mhi_state_str(mhi_cntrl->dev_state), > - TO_MHI_EXEC_STR(ee), mhi_state_str(state)); > > + trace_mhi_intvec_threaded_handler(mhi_cntrl->mhi_dev->name, > TO_MHI_EXEC_STR(mhi_cntrl->ee), > + mhi_state_str(mhi_cntrl->dev_state), > + TO_MHI_EXEC_STR(ee), > mhi_state_str(state)); All these helper functions that translates a state to a string, pass the raw state into the trace event and use __print_symbolic() in your TP_printk() instead. This will allow you to read the state, but you can have tools act of the numerical value. (This comment applies to all the trace events) > if (state == MHI_STATE_SYS_ERR) { > dev_dbg(dev, "System error detected\n"); > pm_state = mhi_tryset_pm_state(mhi_cntrl, [..] > diff --git a/include/trace/events/mhi_host.h b/include/trace/events/mhi_host.h [..] > + > +TRACE_EVENT(mhi_pm_st_worker, Why is this trace event called "worker", isn't the event a "mhi_pm_state_transition"? Don't just name your trace event based on the function that triggers them, but what they represent and make sure they carry useful information to understand the system. If you want to trace the flow through your functions, you can use e.g. ftrace. Regards, Bjorn
Re: [PATCH v7 00/13] selftests/sgx: Fix compilation errors
Hi Jo, Just FYI I won't review the rest patches in this series. One of the reasons is I am not that familiar with the rest. Jarkko has reviewed anyway :-). On Thu, 2023-10-05 at 17:38 +0200, Jo Van Bulck wrote: > Hi, > > This patch series ensures that all SGX selftests succeed when compiling with > optimizations (as tested with -O{0,1,2,3,s} for both gcc 11.3.0 and clang > 14.0.0). The aim of the patches is to avoid reliance on undefined, > compiler-specific behavior that can make the test results fragile. > > As far as I see, all commits in this series now have an explicit reviewed-by > tag, so hopefully this can get merged upstream? Please let me know if any > concerns remain and I'd happily address them. > > Reference output below: > > .. Testing gcc -O0[OK] > .. Testing gcc -O1[OK] > .. Testing gcc -O2[OK] > .. Testing gcc -O3[OK] > .. Testing gcc -Os[OK] > .. Testing gcc -Ofast [OK] > .. Testing gcc -Og[OK] > .. Testing clang -O0[OK] > .. Testing clang -O1[OK] > .. Testing clang -O2[OK] > .. Testing clang -O3[OK] > .. Testing clang -Os[OK] > .. Testing clang -Ofast [OK] > .. Testing clang -Og[OK] > > Changelog > - > > v7 > - Add reviewed-by tag (Jarkko) > > v6 > - Collect final ack/reviewed-by tags (Jarkko, Kai) > > v5 > - Reorder patches (Jarkko, Kai) > - Include fixes tag for inline asm memory clobber patch (Kai) > - Include linker error in static-pie commit message (Kai) > - Include generated assembly in relocations commit (Kai) > > v4 > - Remove redundant -nostartfiles compiler flag (Jarkko) > - Split dynamic symbol table removal in separate commit (Kai) > - Split redundant push/pop elimination in separate commit (Kai) > - Remove (incomplete) register cleansing on enclave exit > - Fix possibly uninitialized pointer dereferences in load.c > > v3 > - Refactor encl_op_array declaration and indexing (Jarkko) > - Annotate encl_buffer with "used" attribute (Kai) > - Split encl_buffer size and placement commits (Kai) > > v2 > - Add additional check for NULL pointer (Kai) > - Refine to produce proper static-pie executable > - Fix linker script assertions > - Specify memory clobber for inline asm instead of volatile (Kai) > - Clarify why encl_buffer non-static (Jarkko, Kai) > - Clarify -ffreestanding (Jarkko) > > Best, > Jo > > Jo Van Bulck (13): > selftests/sgx: Fix uninitialized pointer dereference in error path > selftests/sgx: Fix uninitialized pointer dereferences in > encl_get_entry > selftests/sgx: Include memory clobber for inline asm in test enclave > selftests/sgx: Separate linker options > selftests/sgx: Specify freestanding environment for enclave > compilation > selftests/sgx: Remove redundant enclave base address save/restore > selftests/sgx: Produce static-pie executable for test enclave > selftests/sgx: Handle relocations in test enclave > selftests/sgx: Fix linker script asserts > selftests/sgx: Ensure test enclave buffer is entirely preserved > selftests/sgx: Ensure expected location of test enclave buffer > selftests/sgx: Discard unsupported ELF sections > selftests/sgx: Remove incomplete ABI sanitization code in test enclave > > tools/testing/selftests/sgx/Makefile | 12 ++-- > tools/testing/selftests/sgx/defines.h | 2 + > tools/testing/selftests/sgx/load.c| 9 ++- > tools/testing/selftests/sgx/sigstruct.c | 5 +- > tools/testing/selftests/sgx/test_encl.c | 67 +-- > tools/testing/selftests/sgx/test_encl.lds | 10 +-- > .../selftests/sgx/test_encl_bootstrap.S | 28 +++- > 7 files changed, 77 insertions(+), 56 deletions(-) >
RE: [PATCH v5 1/2] mm/memory_hotplug: split memmap_on_memory requests across memblocks
Vishal Verma wrote: > The MHP_MEMMAP_ON_MEMORY flag for hotplugged memory is restricted to > 'memblock_size' chunks of memory being added. Adding a larger span of > memory precludes memmap_on_memory semantics. > > For users of hotplug such as kmem, large amounts of memory might get > added from the CXL subsystem. In some cases, this amount may exceed the > available 'main memory' to store the memmap for the memory being added. > In this case, it is useful to have a way to place the memmap on the > memory being added, even if it means splitting the addition into > memblock-sized chunks. > > Change add_memory_resource() to loop over memblock-sized chunks of > memory if caller requested memmap_on_memory, and if other conditions for > it are met. Teach try_remove_memory() to also expect that a memory > range being removed might have been split up into memblock sized chunks, > and to loop through those as needed. > > Cc: Andrew Morton > Cc: David Hildenbrand > Cc: Michal Hocko > Cc: Oscar Salvador > Cc: Dan Williams > Cc: Dave Jiang > Cc: Dave Hansen > Cc: Huang Ying > Suggested-by: David Hildenbrand > Signed-off-by: Vishal Verma > --- > mm/memory_hotplug.c | 162 > > 1 file changed, 99 insertions(+), 63 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index f8d3e7427e32..77ec6f15f943 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1380,6 +1380,44 @@ static bool mhp_supports_memmap_on_memory(unsigned > long size) > return arch_supports_memmap_on_memory(vmemmap_size); > } > > +static int add_memory_create_devices(int nid, struct memory_group *group, > + u64 start, u64 size, mhp_t mhp_flags) > +{ > + struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; > + struct vmem_altmap mhp_altmap = { > + .base_pfn = PHYS_PFN(start), > + .end_pfn = PHYS_PFN(start + size - 1), > + }; > + int ret; > + > + if ((mhp_flags & MHP_MEMMAP_ON_MEMORY)) { > + mhp_altmap.free = memory_block_memmap_on_memory_pages(); > + params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL); > + if (!params.altmap) > + return -ENOMEM; > + > + memcpy(params.altmap, _altmap, sizeof(mhp_altmap)); Isn't this just open coded kmemdup()? Other than that, I am not seeing anything else to comment on, you can add: Reviewed-by: Dan Williams
RE: [PATCH v5 2/2] dax/kmem: allow kmem to add memory with memmap_on_memory
Vishal Verma wrote: > Large amounts of memory managed by the kmem driver may come in via CXL, > and it is often desirable to have the memmap for this memory on the new > memory itself. > > Enroll kmem-managed memory for memmap_on_memory semantics if the dax > region originates via CXL. For non-CXL dax regions, retain the existing > default behavior of hot adding without memmap_on_memory semantics. > > Add a sysfs override under the dax device to control this behavior and > override either default. > > Cc: Andrew Morton > Cc: David Hildenbrand > Cc: Michal Hocko > Cc: Oscar Salvador > Cc: Dan Williams > Cc: Dave Jiang > Cc: Dave Hansen > Cc: Huang Ying > Reviewed-by: Jonathan Cameron > Reviewed-by: David Hildenbrand > Signed-off-by: Vishal Verma > --- > drivers/dax/bus.h | 1 + > drivers/dax/dax-private.h | 1 + > drivers/dax/bus.c | 38 ++ > drivers/dax/cxl.c | 1 + > drivers/dax/hmem/hmem.c | 1 + > drivers/dax/kmem.c| 8 +++- > drivers/dax/pmem.c| 1 + > 7 files changed, 50 insertions(+), 1 deletion(-) > > diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h > index 1ccd23360124..cbbf64443098 100644 > --- a/drivers/dax/bus.h > +++ b/drivers/dax/bus.h > @@ -23,6 +23,7 @@ struct dev_dax_data { > struct dev_pagemap *pgmap; > resource_size_t size; > int id; > + bool memmap_on_memory; > }; > > struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data); > diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h > index 27cf2d79..446617b73aea 100644 > --- a/drivers/dax/dax-private.h > +++ b/drivers/dax/dax-private.h > @@ -70,6 +70,7 @@ struct dev_dax { > struct ida ida; > struct device dev; > struct dev_pagemap *pgmap; > + bool memmap_on_memory; > int nr_range; > struct dev_dax_range { > unsigned long pgoff; > diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c > index 0ee96e6fc426..43be95a231c9 100644 > --- a/drivers/dax/bus.c > +++ b/drivers/dax/bus.c > @@ -367,6 +367,7 @@ static ssize_t create_store(struct device *dev, struct > device_attribute *attr, > .dax_region = dax_region, > .size = 0, > .id = -1, > + .memmap_on_memory = false, > }; > struct dev_dax *dev_dax = devm_create_dev_dax(); > > @@ -1269,6 +1270,40 @@ static ssize_t numa_node_show(struct device *dev, > } > static DEVICE_ATTR_RO(numa_node); > > +static ssize_t memmap_on_memory_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct dev_dax *dev_dax = to_dev_dax(dev); > + > + return sprintf(buf, "%d\n", dev_dax->memmap_on_memory); > +} > + > +static ssize_t memmap_on_memory_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t len) > +{ > + struct dev_dax *dev_dax = to_dev_dax(dev); > + struct dax_region *dax_region = dev_dax->region; > + ssize_t rc; > + bool val; > + > + rc = kstrtobool(buf, ); > + if (rc) > + return rc; Perhaps: if (dev_dax->memmap_on_memory == val) return len; ...and skip the check below when it is going to be a nop > + > + device_lock(dax_region->dev); > + if (!dax_region->dev->driver) { Is the polarity backwards here? I.e. if the device is already attached to the kmem driver it is too late to modify memmap_on_memory policy. > + device_unlock(dax_region->dev); > + return -ENXIO; I would expect -EBUSY since disabling the device allows the property to be set and -ENXIO implies a more permanent error. > + } > + > + dev_dax->memmap_on_memory = val; > + > + device_unlock(dax_region->dev); > + return len; > +} > +static DEVICE_ATTR_RW(memmap_on_memory); This new attribute needs a new Documentation/ABI/ entry... in fact all of these attributes need Documentation/ entries. I can help with that base document to get things started. Perhaps split this sysfs ABI into its own patch and, depending on how fast we can pull the Documentation together, start with the region-driver-conveyed approach in the meantime.
Re: [PATCH v5 16/18] x86/sgx: Limit process EPC usage with misc cgroup controller
> --- > arch/x86/Kconfig | 13 + > arch/x86/kernel/cpu/sgx/Makefile | 1 + > arch/x86/kernel/cpu/sgx/epc_cgroup.c | 415 +++ > arch/x86/kernel/cpu/sgx/epc_cgroup.h | 59 > arch/x86/kernel/cpu/sgx/main.c | 68 - > arch/x86/kernel/cpu/sgx/sgx.h| 17 +- > 6 files changed, 556 insertions(+), 17 deletions(-) > create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.c > create mode 100644 arch/x86/kernel/cpu/sgx/epc_cgroup.h Given how large this patch is, it's better to split if we can. It seems we can at least split ... [...] > > @@ -970,6 +1005,7 @@ static void __init arch_update_sysfs_visibility(int nid) > {} > static bool __init sgx_page_cache_init(void) > { > u32 eax, ebx, ecx, edx, type; > + u64 capacity = 0; > u64 pa, size; > int nid; > int i; > @@ -1020,6 +1056,7 @@ static bool __init sgx_page_cache_init(void) > > sgx_epc_sections[i].node = _numa_nodes[nid]; > sgx_numa_nodes[nid].size += size; > + capacity += size; > > sgx_nr_epc_sections++; > } > @@ -1029,6 +1066,9 @@ static bool __init sgx_page_cache_init(void) > return false; > } > > + misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, capacity); > + sgx_epc_total_pages = capacity >> PAGE_SHIFT; > + > return true; > } > ... setting up capacity out as a separate patch, as it is a top-level only file showing the maximum instances of the resource. I'll review rest later.
Re: [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs
On Thu, 2023-10-05 at 14:33 -0500, Haitao Huang wrote: > On Thu, 05 Oct 2023 07:30:46 -0500, Huang, Kai wrote: > > > On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > > > +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct > > > sgx_epc_page *epc_page) > > > +{ > > > + return _global_lru; > > > +} > > > + > > > +static inline bool sgx_can_reclaim(void) > > > +{ > > > + return !list_empty(_global_lru.reclaimable); > > > +} > > > + > > > > Shouldn't sgx_can_reclaim() also take a 'struct sgx_epc_lru_lists *'? > > > > I thought we also need to check whether a cgroup's LRU lists can be > > reclaimed? > > This is only used to check if any pages reclaimable at the top level in > this file. Later sgx_epc_cgroup_lru_empty(NULL) is used in this function > to recursively check all cgroups starting from the root. > > This again falls to the "impossible to review unless review a later patch first" category. This patch says nothing about sgx_can_reclaim() will only be used at the top level. Even if it does, why cannot it take LRU lists as input? All this patch says is we need to prepare these functions to suit multiple LRU lists. Btw, why sgx_reclaim_epc_pages() doesn't take LRU lists as input either? Is it possible that it can be called across multiple LRU lists, or across different lists in one LRU? Why do we need to find some particular LRU lists by given EPC page? +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) +{ + return _global_lru; +} + Maybe it's clear for other people, but to me it sounds some necessary design background is missing at least. Please try best to make the patch self-reviewable by justifying all of those.
Re: [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup
> > > > > > -/* > > > +/** > > > + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers > > > + * @nr_to_scan: Number of EPC pages to scan for reclaim > > > + * @ignore_age: Reclaim a page even if it is young > > > + * > > > * Take a fixed number of pages from the head of the active page pool > > > and > > > * reclaim them to the enclave's private shmem files. Skip the pages, > > > which have > > > * been accessed since the last scan. Move those pages to the tail of > > > active > > > @@ -292,15 +301,14 @@ static void sgx_reclaimer_write(struct > > > sgx_epc_page *epc_page, > > > * problematic as it would increase the lock contention too much, > > > which would > > > * halt forward progress. > > > */ > > > -static void sgx_reclaim_pages(void) > > > +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) > > > > 'size_t' looks odd. Any reason to use it? > > > > Given you only scan 32 at maximum, seems 'int' is good enough? > > > > Initially was int. > Jarkko was suggesting ssize_t. I changed to size_t as this function will > never return negative. Then 'unsigned int'. We are talking about 32 at max here. size_t is more suitable for bytes, but we are dealing with number of pages. Maybe Jarkko could comment why size_t is better. [...] > > > > > i = 0; > > > list_for_each_entry_safe(epc_page, tmp, , list) { > > > encl_page = epc_page->encl_page; > > > > > > - if (!sgx_reclaimer_age(epc_page)) > > > + if (i == SGX_NR_TO_SCAN_MAX || > > > > i == nr_to_scan? > > > Not needed if above for statement fixed for nr_to_scan. > Anything above MAX will be skipped and put back to LRU. I believe using nr_to_scan is more logically correct. [...] > > > > I found this function a little bit odd, given the mixing of 'nr_to_scan', > > SGX_NR_TO_SCAN and SGX_NR_TO_SCAN_MAX. > > > > From the changelog: > > > > 1) To take a parameter that specifies the number of pages to scan for > > reclaiming. Define a max value of 32, but scan 16 in the case for the > > global reclaimer (ksgxd). > > > > It appears we want to make this function to scan @nr_to_scan for cgroup, > > but > > still want to scan a fixed value for ksgxd, which is SGX_NR_TO_SCAN. And > > @nr_to_scan can be larger than SGX_NR_TO_SCAN but smaller than > > SGX_NR_TO_SCAN_MAX. > > > > Putting behind the mystery of why above is needed, to achieve it, is it > > more > > clear if we do below? > > > > int __sgx_reclaim_epc_pages(int nr_to_scan, bool ignore_age) > > { > > struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; > > ... > > > > if (nr_to_scan > SGX_NR_TO_SCAN_MAX) > > return 0; > > We could set nr_to_scan to MAX but since this is code internal to driver, > maybe just make sure callers don't call with bigger numbers. Please add this check, using WARN_ON_ONCE() if it's better. Then the code is much easier to review. > > > > > for (i = 0; i < nr_to_scan; i++) { > > ... > > } > > > > yes please fix this up, then ... > > > return reclaimed; > > } > > > > /* This is for ksgxd() */ > > int sgx_reclaim_epc_page(void) > > { > > return __sgx_reclaim_epc_pages(SGX_NR_TO_SCAN, false); > > } > > Some maintainers may prefer no wrapping. > ... OK.
Re: [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs
On Thu, 05 Oct 2023 07:30:46 -0500, Huang, Kai wrote: On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page *epc_page) +{ + return _global_lru; +} + +static inline bool sgx_can_reclaim(void) +{ + return !list_empty(_global_lru.reclaimable); +} + Shouldn't sgx_can_reclaim() also take a 'struct sgx_epc_lru_lists *'? I thought we also need to check whether a cgroup's LRU lists can be reclaimed? This is only used to check if any pages reclaimable at the top level in this file. Later sgx_epc_cgroup_lru_empty(NULL) is used in this function to recursively check all cgroups starting from the root. Haitao
Re: [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup
On Thu, 05 Oct 2023 07:24:12 -0500, Huang, Kai wrote: On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: From: Sean Christopherson Adjust and expose the top-level reclaim function as sgx_reclaim_epc_pages() for use by the upcoming EPC cgroup, which will initiate reclaim to enforce the max limit. Make these adjustments to the function signature. 1) To take a parameter that specifies the number of pages to scan for reclaiming. Define a max value of 32, but scan 16 in the case for the global reclaimer (ksgxd). The EPC cgroup will use it to specify a desired number of pages to be reclaimed up to the max value of 32. 2) To take a flag to force reclaiming a page regardless of its age. The EPC cgroup will use the flag to enforce its limits by draining the reclaimable lists before resorting to other measures, e.g. forcefully kill enclaves. 3) Return the number of reclaimed pages. The EPC cgroup will use the result to track reclaiming progress and escalate to a more forceful reclaiming mode, e.g., calling this function with the flag to ignore age of pages. Signed-off-by: Sean Christopherson Co-developed-by: Kristen Carlson Accardi Signed-off-by: Kristen Carlson Accardi Co-developed-by: Haitao Huang Signed-off-by: Haitao Huang Cc: Sean Christopherson --- V4: - Combined the 3 patches that made the individual changes to the function signature. - Removed 'high' limit in commit message. --- arch/x86/kernel/cpu/sgx/main.c | 31 +-- arch/x86/kernel/cpu/sgx/sgx.h | 1 + 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 3b875ab4dcd0..4e1a3e038db5 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -18,6 +18,11 @@ #include "encl.h" #include "encls.h" +/* + * Maximum number of pages to scan for reclaiming. + */ +#define SGX_NR_TO_SCAN_MAX 32 + struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; @@ -279,7 +284,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, mutex_unlock(>lock); } -/* +/** + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers + * @nr_to_scan: Number of EPC pages to scan for reclaim + * @ignore_age: Reclaim a page even if it is young + * * Take a fixed number of pages from the head of the active page pool and * reclaim them to the enclave's private shmem files. Skip the pages, which have * been accessed since the last scan. Move those pages to the tail of active @@ -292,15 +301,14 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page, * problematic as it would increase the lock contention too much, which would * halt forward progress. */ -static void sgx_reclaim_pages(void) +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) 'size_t' looks odd. Any reason to use it? Given you only scan 32 at maximum, seems 'int' is good enough? Initially was int. Jarkko was suggesting ssize_t. I changed to size_t as this function will never return negative. { - struct sgx_backing backing[SGX_NR_TO_SCAN]; + struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; struct sgx_epc_page *epc_page, *tmp; struct sgx_encl_page *encl_page; pgoff_t page_index; LIST_HEAD(iso); - int ret; - int i; + size_t ret, i; spin_lock(_global_lru.lock); for (i = 0; i < SGX_NR_TO_SCAN; i++) { This should be nr_to_scan It was missed during some rebase and reordering operations. The function comment says * @nr_to_scan: Number of EPC pages to scan for reclaim But I don't see it is even used, if my eye isn't deceiving me? @@ -326,13 +334,14 @@ static void sgx_reclaim_pages(void) spin_unlock(_global_lru.lock); if (list_empty()) - return; + return 0; i = 0; list_for_each_entry_safe(epc_page, tmp, , list) { encl_page = epc_page->encl_page; - if (!sgx_reclaimer_age(epc_page)) + if (i == SGX_NR_TO_SCAN_MAX || i == nr_to_scan? Not needed if above for statement fixed for nr_to_scan. Anything above MAX will be skipped and put back to LRU. And should we have a if (nr_to_scan < SGX_NR_TO_SCAN_MAX) return 0; at the very beginning of this function? In final version caller to make sure not call with nr_to_scan not larger than SGX_NR_TO_SCAN_MAX + (!ignore_age && !sgx_reclaimer_age(epc_page))) goto skip; page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); @@ -371,6 +380,8 @@ static void sgx_reclaim_pages(void) sgx_free_epc_page(epc_page); } + + return i; } I found this function a little bit odd, given the mixing of
Re: [PATCH v1 2/9] docs: firmware-guide: ACPI: Clarify ACPI bus concepts
On 10/5/2023 8:28 PM, Wilczynski, Michal wrote: > > On 10/5/2023 7:57 PM, Rafael J. Wysocki wrote: >> On Monday, September 25, 2023 4:48:35 PM CEST Michal Wilczynski wrote: >>> Some devices implement ACPI driver as a way to manage devices >>> enumerated by the ACPI. This might be confusing as a preferred way to >>> implement a driver for devices not connected to any bus is a platform >>> driver, as stated in the documentation. Clarify relationships between >>> ACPI device, platform device and ACPI entries. >>> >>> Suggested-by: Elena Reshetova >>> Signed-off-by: Michal Wilczynski >>> --- >>> Documentation/firmware-guide/acpi/enumeration.rst | 13 + >>> 1 file changed, 13 insertions(+) >>> >>> diff --git a/Documentation/firmware-guide/acpi/enumeration.rst >>> b/Documentation/firmware-guide/acpi/enumeration.rst >>> index 56d9913a3370..f56cc79a9e83 100644 >>> --- a/Documentation/firmware-guide/acpi/enumeration.rst >>> +++ b/Documentation/firmware-guide/acpi/enumeration.rst >>> @@ -64,6 +64,19 @@ If the driver needs to perform more complex >>> initialization like getting and >>> configuring GPIOs it can get its ACPI handle and extract this information >>> from ACPI tables. >>> >>> +ACPI bus >>> + >>> + >>> +Historically some devices not connected to any bus were represented as ACPI >>> +devices, and had to implement ACPI driver. This is not a preferred way for >>> new >>> +drivers. As explained above devices not connected to any bus should >>> implement >>> +platform driver. ACPI device would be created during enumeration >>> nonetheless, >>> +and would be accessible through ACPI_COMPANION() macro, and the ACPI >>> handle would >>> +be accessible through ACPI_HANDLE() macro. ACPI device is meant to describe >>> +information related to ACPI entry e.g. handle of the ACPI entry. Think - >>> +ACPI device interfaces with the FW, and the platform device with the rest >>> of >>> +the system. >>> + >>> DMA support >>> === >> I rewrote the above entirely, so here's a new patch to replace this one: >> >> --- >> From: Rafael J. Wysocki >> Subject: [PATCH v2 2/9] ACPI: docs: enumeration: Clarify ACPI bus concepts >> >> In some cases, ACPI drivers are implemented as a way to manage devices >> enumerated with the help of the platform firmware through ACPI. >> >> This might be confusing, since the preferred way to implement a driver >> for a device that cannot be enumerated natively, is a platform >> driver, as stated in the documentation. >> >> Clarify relationships between ACPI device objects, platform devices and >> ACPI Namespace entries. >> >> Suggested-by: Elena Reshetova >> Co-developed-by: Michal Wilczynski >> Signed-off-by: Michal Wilczynski >> Signed-off-by: Rafael J. Wysocki >> --- >> Documentation/firmware-guide/acpi/enumeration.rst | 43 >> ++ >> 1 file changed, 43 insertions(+) >> >> Index: linux-pm/Documentation/firmware-guide/acpi/enumeration.rst >> === >> --- linux-pm.orig/Documentation/firmware-guide/acpi/enumeration.rst >> +++ linux-pm/Documentation/firmware-guide/acpi/enumeration.rst >> @@ -64,6 +64,49 @@ If the driver needs to perform more comp >> configuring GPIOs it can get its ACPI handle and extract this information >> from ACPI tables. >> >> +ACPI device objects >> +=== >> + >> +Generally speaking, there are two categories of devices in a system in which >> +ACPI is used as an interface between the platform firmware and the OS: >> Devices >> +that can be discovered and enumerated natively, through a protocol defined >> for >> +the specific bus that they are on (for example, configuration space in PCI), >> +without the platform firmware assistance, and devices that need to be >> described >> +by the platform firmware so that they can be discovered. Still, for any >> device >> +known to the platform firmware, regardless of which category it falls into, >> +there can be a corresponding ACPI device object in the ACPI Namespace in >> which >> +case the Linux kernel will create a struct acpi_device object based on it >> for >> +that device. >> + >> +Those struct acpi_device objects are never used for binding drivers to >> natively >> +discoverable devices, because they are represented by other types of device >> +objects (for example, struct pci_dev for PCI devices) that are bound to by >> +device drivers (the corresponding struct acpi_device object is then used as >> +an additional source of information on the configuration of the given >> device). >> +Moreover, the core ACPI device enumeration code creates struct >> platform_device >> +objects for the majority of devices that are discovered and enumerated with >> the >> +help of the platform firmware and those platform device objects can be >> bound to >> +by platform drivers in direct analogy with the natively enumerable devices >> +case. Therefore it is logically
[PATCH v5 1/2] mm/memory_hotplug: split memmap_on_memory requests across memblocks
The MHP_MEMMAP_ON_MEMORY flag for hotplugged memory is restricted to 'memblock_size' chunks of memory being added. Adding a larger span of memory precludes memmap_on_memory semantics. For users of hotplug such as kmem, large amounts of memory might get added from the CXL subsystem. In some cases, this amount may exceed the available 'main memory' to store the memmap for the memory being added. In this case, it is useful to have a way to place the memmap on the memory being added, even if it means splitting the addition into memblock-sized chunks. Change add_memory_resource() to loop over memblock-sized chunks of memory if caller requested memmap_on_memory, and if other conditions for it are met. Teach try_remove_memory() to also expect that a memory range being removed might have been split up into memblock sized chunks, and to loop through those as needed. Cc: Andrew Morton Cc: David Hildenbrand Cc: Michal Hocko Cc: Oscar Salvador Cc: Dan Williams Cc: Dave Jiang Cc: Dave Hansen Cc: Huang Ying Suggested-by: David Hildenbrand Signed-off-by: Vishal Verma --- mm/memory_hotplug.c | 162 1 file changed, 99 insertions(+), 63 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index f8d3e7427e32..77ec6f15f943 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1380,6 +1380,44 @@ static bool mhp_supports_memmap_on_memory(unsigned long size) return arch_supports_memmap_on_memory(vmemmap_size); } +static int add_memory_create_devices(int nid, struct memory_group *group, +u64 start, u64 size, mhp_t mhp_flags) +{ + struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; + struct vmem_altmap mhp_altmap = { + .base_pfn = PHYS_PFN(start), + .end_pfn = PHYS_PFN(start + size - 1), + }; + int ret; + + if ((mhp_flags & MHP_MEMMAP_ON_MEMORY)) { + mhp_altmap.free = memory_block_memmap_on_memory_pages(); + params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL); + if (!params.altmap) + return -ENOMEM; + + memcpy(params.altmap, _altmap, sizeof(mhp_altmap)); + } + + /* call arch's memory hotadd */ + ret = arch_add_memory(nid, start, size, ); + if (ret < 0) + goto error; + + /* create memory block devices after memory was added */ + ret = create_memory_block_devices(start, size, params.altmap, group); + if (ret) + goto err_bdev; + + return 0; + +err_bdev: + arch_remove_memory(start, size, NULL); +error: + kfree(params.altmap); + return ret; +} + /* * NOTE: The caller must call lock_device_hotplug() to serialize hotplug * and online/offline operations (triggered e.g. by sysfs). @@ -1388,14 +1426,10 @@ static bool mhp_supports_memmap_on_memory(unsigned long size) */ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { - struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) }; + unsigned long memblock_size = memory_block_size_bytes(); enum memblock_flags memblock_flags = MEMBLOCK_NONE; - struct vmem_altmap mhp_altmap = { - .base_pfn = PHYS_PFN(res->start), - .end_pfn = PHYS_PFN(res->end), - }; struct memory_group *group = NULL; - u64 start, size; + u64 start, size, cur_start; bool new_node = false; int ret; @@ -1436,28 +1470,21 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) /* * Self hosted memmap array */ - if (mhp_flags & MHP_MEMMAP_ON_MEMORY) { - if (mhp_supports_memmap_on_memory(size)) { - mhp_altmap.free = memory_block_memmap_on_memory_pages(); - params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL); - if (!params.altmap) + if ((mhp_flags & MHP_MEMMAP_ON_MEMORY) && + mhp_supports_memmap_on_memory(memblock_size)) { + for (cur_start = start; cur_start < start + size; +cur_start += memblock_size) { + ret = add_memory_create_devices(nid, group, cur_start, + memblock_size, + mhp_flags); + if (ret) goto error; - - memcpy(params.altmap, _altmap, sizeof(mhp_altmap)); } - /* fallback to not using altmap */ - } - - /* call arch's memory hotadd */ - ret = arch_add_memory(nid, start, size, ); - if (ret < 0) - goto error_free; - - /* create memory block devices after memory was added */ - ret =
[PATCH v5 0/2] mm: use memmap_on_memory semantics for dax/kmem
The dax/kmem driver can potentially hot-add large amounts of memory originating from CXL memory expanders, or NVDIMMs, or other 'device memories'. There is a chance there isn't enough regular system memory available to fit the memmap for this new memory. It's therefore desirable, if all other conditions are met, for the kmem managed memory to place its memmap on the newly added memory itself. The main hurdle for accomplishing this for kmem is that memmap_on_memory can only be done if the memory being added is equal to the size of one memblock. To overcome this, allow the hotplug code to split an add_memory() request into memblock-sized chunks, and try_remove_memory() to also expect and handle such a scenario. Patch 1 teaches the memory_hotplug code to allow for splitting add_memory() and remove_memory() requests over memblock sized chunks. Patch 2 adds a sysfs control for the kmem driver that would allow an opt-out of using memmap_on_memory for the memory being added. Signed-off-by: Vishal Verma --- Changes in v5: - Separate out per-memblock operations from per memory block operations in try_remove_memory(), and rename the inner function appropriately. This does expand the scope of the memory hotplug lock to include remove_memory_block_devices(), but the alternative was to drop the lock in the inner function separately for each iteration, and then re-acquire it in try_remove_memory() creating a small window where the lock isn't held. (David Hildenbrand) - Remove unnecessary rc check from the memmap_on_memory_store sysfs helper in patch 2 (Dan Carpenter) - Link to v4: https://lore.kernel.org/r/20230928-vv-kmem_memmap-v4-0-6ff73fec5...@intel.com Changes in v4: - Rebase to Aneesh's PPC64 memmap_on_memory series v8 [2]. - Tweak a goto / error path in add_memory_create_devices() (Jonathan) - Retain the old behavior for dax devices, only default to memmap_on_memory for CXL (Jonathan) - Link to v3: https://lore.kernel.org/r/20230801-vv-kmem_memmap-v3-0-406e9aaf5...@intel.com [2]: https://lore.kernel.org/linux-mm/20230808091501.287660-1-aneesh.ku...@linux.ibm.com Changes in v3: - Rebase on Aneesh's patches [1] - Drop Patch 1 - it is not needed since [1] allows for dynamic setting of the memmap_on_memory param (David) - Link to v2: https://lore.kernel.org/r/20230720-vv-kmem_memmap-v2-0-88bdaab34...@intel.com [1]: https://lore.kernel.org/r/20230801044116.10674-1-aneesh.ku...@linux.ibm.com Changes in v2: - Drop the patch to create an override path for the memmap_on_memory module param (David) - Move the chunking into memory_hotplug.c so that any caller of add_memory() can request this behavior. (David) - Handle remove_memory() too. (David, Ying) - Add a sysfs control in the kmem driver for memmap_on_memory semantics (David, Jonathan) - Add a #else case to define mhp_supports_memmap_on_memory() if CONFIG_MEMORY_HOTPLUG is unset. (0day report) - Link to v1: https://lore.kernel.org/r/20230613-vv-kmem_memmap-v1-0-f6de9c6af...@intel.com --- Vishal Verma (2): mm/memory_hotplug: split memmap_on_memory requests across memblocks dax/kmem: allow kmem to add memory with memmap_on_memory drivers/dax/bus.h | 1 + drivers/dax/dax-private.h | 1 + drivers/dax/bus.c | 38 +++ drivers/dax/cxl.c | 1 + drivers/dax/hmem/hmem.c | 1 + drivers/dax/kmem.c| 8 ++- drivers/dax/pmem.c| 1 + mm/memory_hotplug.c | 162 -- 8 files changed, 149 insertions(+), 64 deletions(-) --- base-commit: 25b5b1a0646c3d39e1d885e27c10be1c9e202bf2 change-id: 20230613-vv-kmem_memmap-5483c8d04279 Best regards, -- Vishal Verma
[PATCH v5 2/2] dax/kmem: allow kmem to add memory with memmap_on_memory
Large amounts of memory managed by the kmem driver may come in via CXL, and it is often desirable to have the memmap for this memory on the new memory itself. Enroll kmem-managed memory for memmap_on_memory semantics if the dax region originates via CXL. For non-CXL dax regions, retain the existing default behavior of hot adding without memmap_on_memory semantics. Add a sysfs override under the dax device to control this behavior and override either default. Cc: Andrew Morton Cc: David Hildenbrand Cc: Michal Hocko Cc: Oscar Salvador Cc: Dan Williams Cc: Dave Jiang Cc: Dave Hansen Cc: Huang Ying Reviewed-by: Jonathan Cameron Reviewed-by: David Hildenbrand Signed-off-by: Vishal Verma --- drivers/dax/bus.h | 1 + drivers/dax/dax-private.h | 1 + drivers/dax/bus.c | 38 ++ drivers/dax/cxl.c | 1 + drivers/dax/hmem/hmem.c | 1 + drivers/dax/kmem.c| 8 +++- drivers/dax/pmem.c| 1 + 7 files changed, 50 insertions(+), 1 deletion(-) diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index 1ccd23360124..cbbf64443098 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -23,6 +23,7 @@ struct dev_dax_data { struct dev_pagemap *pgmap; resource_size_t size; int id; + bool memmap_on_memory; }; struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data); diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index 27cf2d79..446617b73aea 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -70,6 +70,7 @@ struct dev_dax { struct ida ida; struct device dev; struct dev_pagemap *pgmap; + bool memmap_on_memory; int nr_range; struct dev_dax_range { unsigned long pgoff; diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 0ee96e6fc426..43be95a231c9 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -367,6 +367,7 @@ static ssize_t create_store(struct device *dev, struct device_attribute *attr, .dax_region = dax_region, .size = 0, .id = -1, + .memmap_on_memory = false, }; struct dev_dax *dev_dax = devm_create_dev_dax(); @@ -1269,6 +1270,40 @@ static ssize_t numa_node_show(struct device *dev, } static DEVICE_ATTR_RO(numa_node); +static ssize_t memmap_on_memory_show(struct device *dev, +struct device_attribute *attr, char *buf) +{ + struct dev_dax *dev_dax = to_dev_dax(dev); + + return sprintf(buf, "%d\n", dev_dax->memmap_on_memory); +} + +static ssize_t memmap_on_memory_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct dev_dax *dev_dax = to_dev_dax(dev); + struct dax_region *dax_region = dev_dax->region; + ssize_t rc; + bool val; + + rc = kstrtobool(buf, ); + if (rc) + return rc; + + device_lock(dax_region->dev); + if (!dax_region->dev->driver) { + device_unlock(dax_region->dev); + return -ENXIO; + } + + dev_dax->memmap_on_memory = val; + + device_unlock(dax_region->dev); + return len; +} +static DEVICE_ATTR_RW(memmap_on_memory); + static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n) { struct device *dev = container_of(kobj, struct device, kobj); @@ -1295,6 +1330,7 @@ static struct attribute *dev_dax_attributes[] = { _attr_align.attr, _attr_resource.attr, _attr_numa_node.attr, + _attr_memmap_on_memory.attr, NULL, }; @@ -1400,6 +1436,8 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data) dev_dax->align = dax_region->align; ida_init(_dax->ida); + dev_dax->memmap_on_memory = data->memmap_on_memory; + inode = dax_inode(dax_dev); dev->devt = inode->i_rdev; dev->bus = _bus_type; diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 8bc9d04034d6..c696837ab23c 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -26,6 +26,7 @@ static int cxl_dax_region_probe(struct device *dev) .dax_region = dax_region, .id = -1, .size = range_len(_dax->hpa_range), + .memmap_on_memory = true, }; return PTR_ERR_OR_ZERO(devm_create_dev_dax()); diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 5d2ddef0f8f5..b9da69f92697 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -36,6 +36,7 @@ static int dax_hmem_probe(struct platform_device *pdev) .dax_region = dax_region, .id = -1, .size = region_idle ? 0 : range_len(>range), + .memmap_on_memory = false, }; return
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On Thu, Oct 5, 2023 at 8:27 PM Wilczynski, Michal wrote: > > > > On 10/5/2023 7:03 PM, Rafael J. Wysocki wrote: > > On Thursday, October 5, 2023 5:30:59 PM CEST Rafael J. Wysocki wrote: > >> On Thu, Oct 5, 2023 at 2:05 PM Wilczynski, Michal > >> wrote: > >>> On 10/5/2023 12:57 PM, Rafael J. Wysocki wrote: > On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal > wrote: > >> [cut] > >> > >> That said, why exactly is it better to use acpi_handle instead of a > >> struct acpi_device pointer? > > I wanted to make the wrapper as close as possible to the wrapped > > function. > > This way it would be easier to remove it in the future i.e if we ever > > deem > > extra synchronization not worth it etc. What the ACPICA function need to > > install a wrapper is a handle not a pointer to a device. > > So there is no need for a middle man. > Taking a struct acpi_device pointer as the first argument is part of > duplication reduction, however, because in the most common case it > saves the users of it the need to dereference the struct acpi_device > they get from ACPI_COMPANION() in order to obtain the handle. > >>> User don't even have to use acpi device anywhere, as he can choose > >>> to use ACPI_HANDLE() instead on 'struct device*' and never interact > >>> with acpi device directly. > >> Have you actually looked at this macro? It is a wrapper around > >> ACPI_COMPANION(). > >> > >> So they may think that they don't use struct acpi_device pointers, but > >> in fact they do. > >> > Arguably, acpi_handle is an ACPICA concept and it is better to reduce > its usage outside ACPICA. > >>> Use of acpi_handle is deeply entrenched in the kernel. There is even > >>> a macro ACPI_HANDLE() that returns acpi_handle. I would say it's > >>> way too late to limit it to ACPICA internal code. > >> So there is a difference between "limiting to ACPICA" and "reducing". > >> It cannot be limited to ACPICA, because the code outside ACPICA needs > >> to evaluate ACPI objects sometimes and ACPI handles are needed for > >> that. > >> > >> And this observation doesn't invalidate the point. > >> > >> Realistically, in a platform driver you'll need the latter to obtain > >> the former anyway. > > I don't want to introduce arbitrary limitations where they are not > > necessary. > I'm not sure what you mean. This patch is changing existing functions. > >>> That's true, but those functions aren't yet deeply entrenched in the > >>> kernel yet, so in my view how they should look like should still be > >>> a subject for discussion, as for now they're only used locally in > >>> drivers/acpi, and my next patchset, that would remove .notify in > >>> platform directory would spread them more, and would > >>> make them harder to change. For now we can change how they > >>> work pretty painlessly. > >> I see no particular reason to do that, though. > >> > >> What specifically is a problem with passing struct acpi_device > >> pointers to the wrappers? I don't see any TBH. > >> > > It is often the case that driver allocates it's own private struct > > using kmalloc > > family of functions, and that structure already contains everything > > that is > > needed to remove the handler, so why force ? There are already examples > > in the drivers that do that i.e in acpi_video the function > > acpi_video_dev_add_notify_handler() uses raw ACPICA handler to install > > a notify handler and it passes private structure there. > > So there is value in leaving the choice of an actual type to the user > > of the > > API. > No, if the user has a pointer to struct acpi_device already, there is > no difference between passing this and passing the acpi_handle from it > except for the extra dereference in the latter case. > >>> Dereference would happen anyway in the wrapper, and it doesn't cause > >>> any harm anyway for readability in my opinion. And of course you don't > >>> have to use acpi device at all, you can use ACPI_HANDLE() macro. > >> So one can use ACPI_COMPANION() just as well and it is slightly less > >> overhead. > >> > If the user doesn't have a struct acpi_device pointer, let them use > the raw ACPICA handler directly and worry about the synchronization > themselves. > >>> As mentioned acpi_device pointer is not really required to use the > >>> wrapper. > >>> Instead we can use ACPI_HANDLE() macro directly. Look at the usage of > >>> the wrapper in the AC driver [1]. > >> You don't really have to repeat the same argument several times and I > >> know how ACPI_HANDLE() works. Also I don't like some of the things > >> done by this patch. > >> > >> Whoever uses ACPI_HANDLE(), they also use ACPI_COMPANION() which is > >> hidden in the former. > >> > >> If they don't need to store either the acpi_handle or the struct > >> acpi_device pointer, there is no reason at all to use the
Re: [PATCH v1 2/9] docs: firmware-guide: ACPI: Clarify ACPI bus concepts
On 10/5/2023 7:57 PM, Rafael J. Wysocki wrote: > On Monday, September 25, 2023 4:48:35 PM CEST Michal Wilczynski wrote: >> Some devices implement ACPI driver as a way to manage devices >> enumerated by the ACPI. This might be confusing as a preferred way to >> implement a driver for devices not connected to any bus is a platform >> driver, as stated in the documentation. Clarify relationships between >> ACPI device, platform device and ACPI entries. >> >> Suggested-by: Elena Reshetova >> Signed-off-by: Michal Wilczynski >> --- >> Documentation/firmware-guide/acpi/enumeration.rst | 13 + >> 1 file changed, 13 insertions(+) >> >> diff --git a/Documentation/firmware-guide/acpi/enumeration.rst >> b/Documentation/firmware-guide/acpi/enumeration.rst >> index 56d9913a3370..f56cc79a9e83 100644 >> --- a/Documentation/firmware-guide/acpi/enumeration.rst >> +++ b/Documentation/firmware-guide/acpi/enumeration.rst >> @@ -64,6 +64,19 @@ If the driver needs to perform more complex >> initialization like getting and >> configuring GPIOs it can get its ACPI handle and extract this information >> from ACPI tables. >> >> +ACPI bus >> + >> + >> +Historically some devices not connected to any bus were represented as ACPI >> +devices, and had to implement ACPI driver. This is not a preferred way for >> new >> +drivers. As explained above devices not connected to any bus should >> implement >> +platform driver. ACPI device would be created during enumeration >> nonetheless, >> +and would be accessible through ACPI_COMPANION() macro, and the ACPI handle >> would >> +be accessible through ACPI_HANDLE() macro. ACPI device is meant to describe >> +information related to ACPI entry e.g. handle of the ACPI entry. Think - >> +ACPI device interfaces with the FW, and the platform device with the rest of >> +the system. >> + >> DMA support >> === > I rewrote the above entirely, so here's a new patch to replace this one: > > --- > From: Rafael J. Wysocki > Subject: [PATCH v2 2/9] ACPI: docs: enumeration: Clarify ACPI bus concepts > > In some cases, ACPI drivers are implemented as a way to manage devices > enumerated with the help of the platform firmware through ACPI. > > This might be confusing, since the preferred way to implement a driver > for a device that cannot be enumerated natively, is a platform > driver, as stated in the documentation. > > Clarify relationships between ACPI device objects, platform devices and > ACPI Namespace entries. > > Suggested-by: Elena Reshetova > Co-developed-by: Michal Wilczynski > Signed-off-by: Michal Wilczynski > Signed-off-by: Rafael J. Wysocki > --- > Documentation/firmware-guide/acpi/enumeration.rst | 43 > ++ > 1 file changed, 43 insertions(+) > > Index: linux-pm/Documentation/firmware-guide/acpi/enumeration.rst > === > --- linux-pm.orig/Documentation/firmware-guide/acpi/enumeration.rst > +++ linux-pm/Documentation/firmware-guide/acpi/enumeration.rst > @@ -64,6 +64,49 @@ If the driver needs to perform more comp > configuring GPIOs it can get its ACPI handle and extract this information > from ACPI tables. > > +ACPI device objects > +=== > + > +Generally speaking, there are two categories of devices in a system in which > +ACPI is used as an interface between the platform firmware and the OS: > Devices > +that can be discovered and enumerated natively, through a protocol defined > for > +the specific bus that they are on (for example, configuration space in PCI), > +without the platform firmware assistance, and devices that need to be > described > +by the platform firmware so that they can be discovered. Still, for any > device > +known to the platform firmware, regardless of which category it falls into, > +there can be a corresponding ACPI device object in the ACPI Namespace in > which > +case the Linux kernel will create a struct acpi_device object based on it for > +that device. > + > +Those struct acpi_device objects are never used for binding drivers to > natively > +discoverable devices, because they are represented by other types of device > +objects (for example, struct pci_dev for PCI devices) that are bound to by > +device drivers (the corresponding struct acpi_device object is then used as > +an additional source of information on the configuration of the given > device). > +Moreover, the core ACPI device enumeration code creates struct > platform_device > +objects for the majority of devices that are discovered and enumerated with > the > +help of the platform firmware and those platform device objects can be bound > to > +by platform drivers in direct analogy with the natively enumerable devices > +case. Therefore it is logically inconsistent and so generally invalid to > bind > +drivers to struct acpi_device objects, including drivers for devices that are > +discovered with the help of the platform
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On 10/5/2023 7:03 PM, Rafael J. Wysocki wrote: > On Thursday, October 5, 2023 5:30:59 PM CEST Rafael J. Wysocki wrote: >> On Thu, Oct 5, 2023 at 2:05 PM Wilczynski, Michal >> wrote: >>> On 10/5/2023 12:57 PM, Rafael J. Wysocki wrote: On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal wrote: >> [cut] >> >> That said, why exactly is it better to use acpi_handle instead of a >> struct acpi_device pointer? > I wanted to make the wrapper as close as possible to the wrapped function. > This way it would be easier to remove it in the future i.e if we ever deem > extra synchronization not worth it etc. What the ACPICA function need to > install a wrapper is a handle not a pointer to a device. > So there is no need for a middle man. Taking a struct acpi_device pointer as the first argument is part of duplication reduction, however, because in the most common case it saves the users of it the need to dereference the struct acpi_device they get from ACPI_COMPANION() in order to obtain the handle. >>> User don't even have to use acpi device anywhere, as he can choose >>> to use ACPI_HANDLE() instead on 'struct device*' and never interact >>> with acpi device directly. >> Have you actually looked at this macro? It is a wrapper around >> ACPI_COMPANION(). >> >> So they may think that they don't use struct acpi_device pointers, but >> in fact they do. >> Arguably, acpi_handle is an ACPICA concept and it is better to reduce its usage outside ACPICA. >>> Use of acpi_handle is deeply entrenched in the kernel. There is even >>> a macro ACPI_HANDLE() that returns acpi_handle. I would say it's >>> way too late to limit it to ACPICA internal code. >> So there is a difference between "limiting to ACPICA" and "reducing". >> It cannot be limited to ACPICA, because the code outside ACPICA needs >> to evaluate ACPI objects sometimes and ACPI handles are needed for >> that. >> >> And this observation doesn't invalidate the point. >> >> Realistically, in a platform driver you'll need the latter to obtain >> the former anyway. > I don't want to introduce arbitrary limitations where they are not > necessary. I'm not sure what you mean. This patch is changing existing functions. >>> That's true, but those functions aren't yet deeply entrenched in the >>> kernel yet, so in my view how they should look like should still be >>> a subject for discussion, as for now they're only used locally in >>> drivers/acpi, and my next patchset, that would remove .notify in >>> platform directory would spread them more, and would >>> make them harder to change. For now we can change how they >>> work pretty painlessly. >> I see no particular reason to do that, though. >> >> What specifically is a problem with passing struct acpi_device >> pointers to the wrappers? I don't see any TBH. >> > It is often the case that driver allocates it's own private struct using > kmalloc > family of functions, and that structure already contains everything that > is > needed to remove the handler, so why force ? There are already examples > in the drivers that do that i.e in acpi_video the function > acpi_video_dev_add_notify_handler() uses raw ACPICA handler to install > a notify handler and it passes private structure there. > So there is value in leaving the choice of an actual type to the user of > the > API. No, if the user has a pointer to struct acpi_device already, there is no difference between passing this and passing the acpi_handle from it except for the extra dereference in the latter case. >>> Dereference would happen anyway in the wrapper, and it doesn't cause >>> any harm anyway for readability in my opinion. And of course you don't >>> have to use acpi device at all, you can use ACPI_HANDLE() macro. >> So one can use ACPI_COMPANION() just as well and it is slightly less >> overhead. >> If the user doesn't have a struct acpi_device pointer, let them use the raw ACPICA handler directly and worry about the synchronization themselves. >>> As mentioned acpi_device pointer is not really required to use the wrapper. >>> Instead we can use ACPI_HANDLE() macro directly. Look at the usage of >>> the wrapper in the AC driver [1]. >> You don't really have to repeat the same argument several times and I >> know how ACPI_HANDLE() works. Also I don't like some of the things >> done by this patch. >> >> Whoever uses ACPI_HANDLE(), they also use ACPI_COMPANION() which is >> hidden in the former. >> >> If they don't need to store either the acpi_handle or the struct >> acpi_device pointer, there is no reason at all to use the former >> instead of the latter. >> >> If they get an acpi_handle from somewhere else than ACPI_HANDLE(), >> then yes, they would need to get the ACPI devices from there (which is >> possible still), but they may be better off by using the raw ACPICA >> interface for events
Re: [PATCH] [v3] ieee802154: ca8210: Fix a potential UAF in ca8210_probe
Hello Dinghao, On 01.10.23 09:19, kernel test robot wrote: Hi Dinghao, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.6-rc3 next-20230929] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Dinghao-Liu/ieee802154-ca8210-Fix-a-potential-UAF-in-ca8210_probe/20231001-135130 base: linus/master patch link: https://lore.kernel.org/r/20231001054949.14624-1-dinghao.liu%40zju.edu.cn patch subject: [PATCH] [v3] ieee802154: ca8210: Fix a potential UAF in ca8210_probe config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231001/202310011548.qyqmuodi-...@intel.com/config) compiler: m68k-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231001/202310011548.qyqmuodi-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202310011548.qyqmuodi-...@intel.com/ All warnings (new ones prefixed by >>): drivers/net/ieee802154/ca8210.c: In function 'ca8210_register_ext_clock': drivers/net/ieee802154/ca8210.c:2743:13: warning: unused variable 'ret' [-Wunused-variable] 2743 | int ret = 0; | ^~~ Please take care of this now unused variable after your re-factor. With this fixed and send out as v4 I am happy to get this applied to the wpan tree. regards Stefan Schmidt
Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
On Mon, 2023-09-18 at 10:29 +0300, Mike Rapoport wrote: > +/** > + * struct execmem_range - definition of a memory range suitable for > code and > + * related data allocations > + * @start: address space start > + * @end: address space end (inclusive) > + * @pgprot:permissions for memory in this address space > + * @alignment: alignment required for text allocations > + */ > +struct execmem_range { > + unsigned long start; > + unsigned long end; > + pgprot_t pgprot; > + unsigned intalignment; > +}; Not a strong opinion, but range doesn't seem an appropriate name. It *has* a range, but also other allocation configuration. It gets especially confusing when multiple "ranges" have the same range. Maybe execmem_alloc_params?
Re: [PATCH v3 03/13] mm/execmem, arch: convert simple overrides of module_alloc to execmem
On Thu, 2023-10-05 at 08:26 +0300, Mike Rapoport wrote: > On Wed, Oct 04, 2023 at 03:39:26PM +, Edgecombe, Rick P wrote: > > On Tue, 2023-10-03 at 17:29 -0700, Rick Edgecombe wrote: > > > It seems a bit weird to copy all of this. Is it trying to be > > > faster > > > or > > > something? > > > > > > Couldn't it just check r->start in execmem_text/data_alloc() path > > > and > > > switch to EXECMEM_DEFAULT if needed then? The > > > execmem_range_is_data() > > > part that comes later could be added to the logic there too. So > > > this > > > seems like unnecessary complexity to me or I don't see the > > > reason. > > > > I guess this is a bad idea because if you have the full size array > > sitting around anyway you might as well use it and reduce the > > exec_mem_alloc() logic. > > That's was the idea, indeed. :) > > > Just looking at it from the x86 side (and > > similar) though, where there is actually only one execmem_range and > > it > > building this whole array with identical data and it seems weird. > > Right, most architectures have only one range, but to support all > variants > that we have, execmem has to maintain the whole array. What about just having an index into a smaller set of ranges. The module area and the extra JIT area. So ->ranges can be size 3 (statically allocated in the arch code) for three areas and then the index array can be size EXECMEM_TYPE_MAX. The default 0 value of the indexing array will point to the default area and any special areas can be set in the index point to the desired range. Looking at how it would do for x86 and arm64, it looks maybe a bit better to me. A little bit less code and memory usage, and a bit easier to trace the configuration through to the final state (IMO). What do you think? Very rough, on top of this series, below. As I was playing around with this, I was also wondering why it needs two copies of struct execmem_params: one returned from the arch code and one in exec mem. And why the temporary arch copy is ro_after_init, but the final execmem.c copy is not ro_after_init? arch/arm64/mm/init.c| 67 ++--- -- arch/x86/mm/init.c | 24 +--- include/linux/execmem.h | 5 +++-- mm/execmem.c| 61 - 4 files changed, 70 insertions(+), 87 deletions(-) diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 9b7716b4d84c..7df119101f20 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -633,49 +633,58 @@ static int __init module_init_limits(void) return 0; } -static struct execmem_params execmem_params __ro_after_init = { - .ranges = { - [EXECMEM_DEFAULT] = { - .flags = EXECMEM_KASAN_SHADOW, - .alignment = MODULE_ALIGN, - }, - [EXECMEM_KPROBES] = { - .start = VMALLOC_START, - .end = VMALLOC_END, - .alignment = 1, - }, - [EXECMEM_BPF] = { - .start = VMALLOC_START, - .end = VMALLOC_END, - .alignment = 1, - }, +static struct execmem_range[2] ranges __ro_after_init = { + /* Module area */ + [0] = { + .flags = EXECMEM_KASAN_SHADOW, + .alignment = MODULE_ALIGN, + }, + /* Kprobes area */ + [1] = { + .start = VMALLOC_START, + .end = VMALLOC_END, + .alignment = 1, + }, + /* BPF area */ + [2] = { + .start = VMALLOC_START, + .end = VMALLOC_END, + .alignment = 1, }, }; -struct execmem_params __init *execmem_arch_params(void) +void __init execmem_arch_params(struct execmem_params *p) { - struct execmem_range *r = _params.ranges[EXECMEM_DEFAULT]; + struct execmem_range *default; + struct execmem_range *jit; + + p->ranges = module_init_limits(); - r->pgprot = PAGE_KERNEL; - + /* Default area */ + default = [0]; + default->pgprot = PAGE_KERNEL; if (module_direct_base) { - r->start = module_direct_base; - r->end = module_direct_base + SZ_128M; + default->start = module_direct_base; + default->end = module_direct_base + SZ_128M; if (module_plt_base) { - r->fallback_start = module_plt_base; - r->fallback_end = module_plt_base + SZ_2G; + default->fallback_start = module_plt_base; + default->fallback_end = module_plt_base + SZ_2G; } } else if (module_plt_base) { - r->start = module_plt_base; - r->end = module_plt_base + SZ_2G; + default->start =
Re: [PATCH v1 2/9] docs: firmware-guide: ACPI: Clarify ACPI bus concepts
On Monday, September 25, 2023 4:48:35 PM CEST Michal Wilczynski wrote: > Some devices implement ACPI driver as a way to manage devices > enumerated by the ACPI. This might be confusing as a preferred way to > implement a driver for devices not connected to any bus is a platform > driver, as stated in the documentation. Clarify relationships between > ACPI device, platform device and ACPI entries. > > Suggested-by: Elena Reshetova > Signed-off-by: Michal Wilczynski > --- > Documentation/firmware-guide/acpi/enumeration.rst | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/Documentation/firmware-guide/acpi/enumeration.rst > b/Documentation/firmware-guide/acpi/enumeration.rst > index 56d9913a3370..f56cc79a9e83 100644 > --- a/Documentation/firmware-guide/acpi/enumeration.rst > +++ b/Documentation/firmware-guide/acpi/enumeration.rst > @@ -64,6 +64,19 @@ If the driver needs to perform more complex initialization > like getting and > configuring GPIOs it can get its ACPI handle and extract this information > from ACPI tables. > > +ACPI bus > + > + > +Historically some devices not connected to any bus were represented as ACPI > +devices, and had to implement ACPI driver. This is not a preferred way for > new > +drivers. As explained above devices not connected to any bus should implement > +platform driver. ACPI device would be created during enumeration nonetheless, > +and would be accessible through ACPI_COMPANION() macro, and the ACPI handle > would > +be accessible through ACPI_HANDLE() macro. ACPI device is meant to describe > +information related to ACPI entry e.g. handle of the ACPI entry. Think - > +ACPI device interfaces with the FW, and the platform device with the rest of > +the system. > + > DMA support > === I rewrote the above entirely, so here's a new patch to replace this one: --- From: Rafael J. Wysocki Subject: [PATCH v2 2/9] ACPI: docs: enumeration: Clarify ACPI bus concepts In some cases, ACPI drivers are implemented as a way to manage devices enumerated with the help of the platform firmware through ACPI. This might be confusing, since the preferred way to implement a driver for a device that cannot be enumerated natively, is a platform driver, as stated in the documentation. Clarify relationships between ACPI device objects, platform devices and ACPI Namespace entries. Suggested-by: Elena Reshetova Co-developed-by: Michal Wilczynski Signed-off-by: Michal Wilczynski Signed-off-by: Rafael J. Wysocki --- Documentation/firmware-guide/acpi/enumeration.rst | 43 ++ 1 file changed, 43 insertions(+) Index: linux-pm/Documentation/firmware-guide/acpi/enumeration.rst === --- linux-pm.orig/Documentation/firmware-guide/acpi/enumeration.rst +++ linux-pm/Documentation/firmware-guide/acpi/enumeration.rst @@ -64,6 +64,49 @@ If the driver needs to perform more comp configuring GPIOs it can get its ACPI handle and extract this information from ACPI tables. +ACPI device objects +=== + +Generally speaking, there are two categories of devices in a system in which +ACPI is used as an interface between the platform firmware and the OS: Devices +that can be discovered and enumerated natively, through a protocol defined for +the specific bus that they are on (for example, configuration space in PCI), +without the platform firmware assistance, and devices that need to be described +by the platform firmware so that they can be discovered. Still, for any device +known to the platform firmware, regardless of which category it falls into, +there can be a corresponding ACPI device object in the ACPI Namespace in which +case the Linux kernel will create a struct acpi_device object based on it for +that device. + +Those struct acpi_device objects are never used for binding drivers to natively +discoverable devices, because they are represented by other types of device +objects (for example, struct pci_dev for PCI devices) that are bound to by +device drivers (the corresponding struct acpi_device object is then used as +an additional source of information on the configuration of the given device). +Moreover, the core ACPI device enumeration code creates struct platform_device +objects for the majority of devices that are discovered and enumerated with the +help of the platform firmware and those platform device objects can be bound to +by platform drivers in direct analogy with the natively enumerable devices +case. Therefore it is logically inconsistent and so generally invalid to bind +drivers to struct acpi_device objects, including drivers for devices that are +discovered with the help of the platform firmware. + +Historically, ACPI drivers that bound directly to struct acpi_device objects +were implemented for some devices enumerated with the help of the platform +firmware, but this is not recommended for any new drivers. As explained
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On Thursday, October 5, 2023 5:30:59 PM CEST Rafael J. Wysocki wrote: > On Thu, Oct 5, 2023 at 2:05 PM Wilczynski, Michal > wrote: > > On 10/5/2023 12:57 PM, Rafael J. Wysocki wrote: > > > On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal > > > wrote: > > [cut] > > > >>> > > >>> That said, why exactly is it better to use acpi_handle instead of a > > >>> struct acpi_device pointer? > > >> I wanted to make the wrapper as close as possible to the wrapped > > >> function. > > >> This way it would be easier to remove it in the future i.e if we ever > > >> deem > > >> extra synchronization not worth it etc. What the ACPICA function need to > > >> install a wrapper is a handle not a pointer to a device. > > >> So there is no need for a middle man. > > > Taking a struct acpi_device pointer as the first argument is part of > > > duplication reduction, however, because in the most common case it > > > saves the users of it the need to dereference the struct acpi_device > > > they get from ACPI_COMPANION() in order to obtain the handle. > > > > User don't even have to use acpi device anywhere, as he can choose > > to use ACPI_HANDLE() instead on 'struct device*' and never interact > > with acpi device directly. > > Have you actually looked at this macro? It is a wrapper around > ACPI_COMPANION(). > > So they may think that they don't use struct acpi_device pointers, but > in fact they do. > > > > > > > Arguably, acpi_handle is an ACPICA concept and it is better to reduce > > > its usage outside ACPICA. > > > > Use of acpi_handle is deeply entrenched in the kernel. There is even > > a macro ACPI_HANDLE() that returns acpi_handle. I would say it's > > way too late to limit it to ACPICA internal code. > > So there is a difference between "limiting to ACPICA" and "reducing". > It cannot be limited to ACPICA, because the code outside ACPICA needs > to evaluate ACPI objects sometimes and ACPI handles are needed for > that. > > And this observation doesn't invalidate the point. > > > > > > >>> Realistically, in a platform driver you'll need the latter to obtain > > >>> the former anyway. > > >> I don't want to introduce arbitrary limitations where they are not > > >> necessary. > > > I'm not sure what you mean. This patch is changing existing functions. > > > > That's true, but those functions aren't yet deeply entrenched in the > > kernel yet, so in my view how they should look like should still be > > a subject for discussion, as for now they're only used locally in > > drivers/acpi, and my next patchset, that would remove .notify in > > platform directory would spread them more, and would > > make them harder to change. For now we can change how they > > work pretty painlessly. > > I see no particular reason to do that, though. > > What specifically is a problem with passing struct acpi_device > pointers to the wrappers? I don't see any TBH. > > > > > > >> It is often the case that driver allocates it's own private struct using > > >> kmalloc > > >> family of functions, and that structure already contains everything that > > >> is > > >> needed to remove the handler, so why force ? There are already examples > > >> in the drivers that do that i.e in acpi_video the function > > >> acpi_video_dev_add_notify_handler() uses raw ACPICA handler to install > > >> a notify handler and it passes private structure there. > > >> So there is value in leaving the choice of an actual type to the user of > > >> the > > >> API. > > > No, if the user has a pointer to struct acpi_device already, there is > > > no difference between passing this and passing the acpi_handle from it > > > except for the extra dereference in the latter case. > > > > Dereference would happen anyway in the wrapper, and it doesn't cause > > any harm anyway for readability in my opinion. And of course you don't > > have to use acpi device at all, you can use ACPI_HANDLE() macro. > > So one can use ACPI_COMPANION() just as well and it is slightly less overhead. > > > > > > > If the user doesn't have a struct acpi_device pointer, let them use > > > the raw ACPICA handler directly and worry about the synchronization > > > themselves. > > > > As mentioned acpi_device pointer is not really required to use the wrapper. > > Instead we can use ACPI_HANDLE() macro directly. Look at the usage of > > the wrapper in the AC driver [1]. > > You don't really have to repeat the same argument several times and I > know how ACPI_HANDLE() works. Also I don't like some of the things > done by this patch. > > Whoever uses ACPI_HANDLE(), they also use ACPI_COMPANION() which is > hidden in the former. > > If they don't need to store either the acpi_handle or the struct > acpi_device pointer, there is no reason at all to use the former > instead of the latter. > > If they get an acpi_handle from somewhere else than ACPI_HANDLE(), > then yes, they would need to get the ACPI devices from there (which is > possible still), but they may be better off by using
Re: [PATCH] bus: mhi: host: Add tracing support
Hi Krishna, kernel test robot noticed the following build warnings: [auto build test WARNING on 3006adf3be79cde4d14b1800b963b82b6e5572e0] url: https://github.com/intel-lab-lkp/linux/commits/Krishna-chaitanya-chundru/bus-mhi-host-Add-tracing-support/20231005-231430 base: 3006adf3be79cde4d14b1800b963b82b6e5572e0 patch link: https://lore.kernel.org/r/20231005-ftrace_support-v1-1-23a2f394fa49%40quicinc.com patch subject: [PATCH] bus: mhi: host: Add tracing support config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231006/202310060033.z0ojejxe-...@intel.com/config) compiler: m68k-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231006/202310060033.z0ojejxe-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202310060033.z0ojejxe-...@intel.com/ All warnings (new ones prefixed by >>): drivers/bus/mhi/host/main.c: In function 'mhi_process_ctrl_ev_ring': >> drivers/bus/mhi/host/main.c:834:74: warning: cast from pointer to integer of >> different size [-Wpointer-to-int-cast] 834 | trace_mhi_process_ctrl_ev_ring(mhi_cntrl->mhi_dev->name, (u64)(local_rp), | ^ drivers/bus/mhi/host/main.c: In function 'mhi_gen_tre': drivers/bus/mhi/host/main.c:1245:69: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] 1245 | trace_mhi_gen_tre(mhi_cntrl->mhi_dev->name, mhi_chan->chan, (u64)(mhi_tre), | ^ -- drivers/bus/mhi/host/pm.c: In function 'mhi_pm_st_worker': >> drivers/bus/mhi/host/pm.c:758:24: warning: unused variable 'dev' >> [-Wunused-variable] 758 | struct device *dev = _cntrl->mhi_dev->dev; |^~~ vim +834 drivers/bus/mhi/host/main.c 799 800 int mhi_process_ctrl_ev_ring(struct mhi_controller *mhi_cntrl, 801 struct mhi_event *mhi_event, 802 u32 event_quota) 803 { 804 struct mhi_ring_element *dev_rp, *local_rp; 805 struct mhi_ring *ev_ring = _event->ring; 806 struct mhi_event_ctxt *er_ctxt = 807 _cntrl->mhi_ctxt->er_ctxt[mhi_event->er_index]; 808 struct mhi_chan *mhi_chan; 809 struct device *dev = _cntrl->mhi_dev->dev; 810 u32 chan; 811 int count = 0; 812 dma_addr_t ptr = le64_to_cpu(er_ctxt->rp); 813 814 /* 815 * This is a quick check to avoid unnecessary event processing 816 * in case MHI is already in error state, but it's still possible 817 * to transition to error state while processing events 818 */ 819 if (unlikely(MHI_EVENT_ACCESS_INVALID(mhi_cntrl->pm_state))) 820 return -EIO; 821 822 if (!is_valid_ring_ptr(ev_ring, ptr)) { 823 dev_err(_cntrl->mhi_dev->dev, 824 "Event ring rp points outside of the event ring\n"); 825 return -EIO; 826 } 827 828 dev_rp = mhi_to_virtual(ev_ring, ptr); 829 local_rp = ev_ring->rp; 830 831 while (dev_rp != local_rp) { 832 enum mhi_pkt_type type = MHI_TRE_GET_EV_TYPE(local_rp); 833 > 834 > trace_mhi_process_ctrl_ev_ring(mhi_cntrl->mhi_dev->name, (u64)(local_rp), 835 local_rp->ptr, local_rp->dword[0], 836 local_rp->dword[1], 837 mhi_state_str(MHI_TRE_GET_EV_STATE(local_rp))); 838 839 switch (type) { 840 case MHI_PKT_TYPE_BW_REQ_EVENT: 841 { 842 struct mhi_link_info *link_info; 843 844 link_info = _cntrl->mhi_link_info; 845 write_lock_irq(_cntrl->pm_lock); 846 link_info->target_link_speed = 847 MHI_TRE_GET_EV_LINKSPEED(local_rp); 848 link_info->target_link_width = 849 MHI_TRE_GET_EV_LINKWIDTH(local_rp); 850 write_unlock_irq(_cntrl->pm_lock); 851 dev_dbg(dev, "Received BW_REQ event\n"); 852 m
[PATCH v7 13/13] selftests/sgx: Remove incomplete ABI sanitization code in test enclave
As the selftest enclave is *not* intended for production, simplify the code by not initializing CPU configuration registers as expected by the ABI on enclave entry or cleansing caller-save registers on enclave exit. Link: https://lore.kernel.org/all/da0cfb1e-e347-f7f2-ac72-aec0ee0d8...@intel.com/ Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen --- .../testing/selftests/sgx/test_encl_bootstrap.S | 16 +++- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S index 28fe5d2ac0af..d8c4ac94e032 100644 --- a/tools/testing/selftests/sgx/test_encl_bootstrap.S +++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S @@ -59,21 +59,11 @@ encl_entry_core: push%rcx # push the address after EENTER + # NOTE: as the selftest enclave is *not* intended for production, + # simplify the code by not initializing ABI registers on entry or + # cleansing caller-save registers on exit. callencl_body - /* Clear volatile GPRs, except RAX (EEXIT function). */ - xor %rcx, %rcx - xor %rdx, %rdx - xor %rdi, %rdi - xor %rsi, %rsi - xor %r8, %r8 - xor %r9, %r9 - xor %r10, %r10 - xor %r11, %r11 - - # Reset status flags. - add %rdx, %rdx # OF = SF = AF = CF = 0; ZF = PF = 1 - # Prepare EEXIT target by popping the address of the instruction after # EENTER to RBX. pop %rbx -- 2.25.1
[PATCH v7 10/13] selftests/sgx: Ensure test enclave buffer is entirely preserved
Attach the "used" attribute to instruct the compiler to preserve the static encl_buffer, even if it appears it is not entirely referenced in the enclave code, as expected by the external tests manipulating page permissions. Link: https://lore.kernel.org/all/a2732938-f3db-a0af-3d68-a18060f66...@cs.kuleuven.be/ Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/defines.h | 1 + tools/testing/selftests/sgx/test_encl.c | 9 + 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index d8587c971941..b8f482667ce1 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -13,6 +13,7 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __packed __attribute__((packed)) +#define __used __attribute__((used)) #include "../../../../arch/x86/include/asm/sgx.h" #include "../../../../arch/x86/include/asm/enclu.h" diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index 649604c526e7..7465f121fb74 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -5,11 +5,12 @@ #include "defines.h" /* - * Data buffer spanning two pages that will be placed first in .data - * segment. Even if not used internally the second page is needed by - * external test manipulating page permissions. + * Data buffer spanning two pages that will be placed first in the .data + * segment. Even if not used internally the second page is needed by external + * test manipulating page permissions, so mark encl_buffer as "used" to make + * sure it is entirely preserved by the compiler. */ -static uint8_t encl_buffer[8192] = { 1 }; +static uint8_t __used encl_buffer[8192] = { 1 }; enum sgx_enclu_function { EACCEPT = 0x5, -- 2.25.1
Re: [PATCH v5] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms
On Wed, Sep 27, 2023 at 05:35:16PM +, Alessandro Carminati (Red Hat) wrote: > It is not uncommon for drivers or modules related to similar peripherals > to have symbols with the exact same name. > While this is not a problem for the kernel's binary itself, it becomes an > issue when attempting to trace or probe specific functions using > infrastructure like ftrace or kprobe. > > The tracing subsystem relies on the `nm -n vmlinux` output, which provides > symbol information from the kernel's ELF binary. However, when multiple > symbols share the same name, the standard nm output does not differentiate > between them. This can lead to confusion and difficulty when trying to > probe the intended symbol. > > ~ # cat /proc/kallsyms | grep " name_show" > 8c4f76d0 t name_show > 8c9cccb0 t name_show > 8cb0ac20 t name_show > 8cc728c0 t name_show > 8ce0efd0 t name_show > 8ce126c0 t name_show > 8ce1dd20 t name_show > 8ce24e70 t name_show > 8d1104c0 t name_show > 8d1fe480 t name_show One problem that remains as far as I can see is that this approach does not take into consideration that there can be duplicate symbols in the core kernel, but also between core kernel and loadable modules, and even between loadable modules. So, I think more is needed to also ensure that this approach of adding alias symbols is also done for loadable modules. Earlier work that cover all symbols (core kernel and loadable modules) was posted quite a while ago by Nick Alcock: https://lore.kernel.org/all/20221205163157.269335-1-nick.alc...@oracle.com/ It takes a different approach and adds in other info that is very useful for tracing, but unfortunately it has been dormant for a long time now. While module symbols are handled quite differently (for kallsyms) from the core kernel symbols, I think that a similar approach tied in with modpost ought to be quite possible. It will add to the size of modules because the data needs to be stored in the .ko but that is unavoidable. But not doing it unfortunately would mean that the duplicate symbol issue remains unresolved in the presence of loadable modules. > kas_alias addresses this challenge by enhancing symbol names with > meaningful suffixes generated from the source file and line number > during the kernel build process. > These newly generated aliases provide tracers with the ability to > comprehend the symbols they are interacting with when utilizing the > ftracefs interface. > This approach may also allow for the probing by name of previously > inaccessible symbols. > > ~ # cat /proc/kallsyms | grep gic_mask_irq > d15671e505ac t gic_mask_irq > d15671e505ac t gic_mask_irq@drivers_irqchip_irq_gic_c_167 > d15671e532a4 t gic_mask_irq > d15671e532a4 t gic_mask_irq@drivers_irqchip_irq_gic_v3_c_407 > ~ # In the same context as mentioned above (module symbols), I am hoping that the alias you generate might also be able to contain a module identifier name, much like the aforementioned patch series by Nick Alcock added. We have it for loadable modules already of course, but as has been discussed in relation to the earlier work, being able to associate a module name with a symbol regardless of whether that module is configured to be built into the kernel or whether it is configured to be a loadable module is helpful for tracing purposes. Especially for tracers that use tracing scripts that might get deployed on diverse systems where it cannot always be known at the time of developing the tracer scripts whether a kernel module is configured to be loadable or built-in. I'd be happy to work on something like this as a contribution to your work. I would envision the alias entry not needing to have the typical [module] added to it because that will already be annotated on the actual symbol entry. So, the alias could be extended to be something like: c0533720 t floppy_open [floppy] c0533720 t floppy_open@floppy:drivers_block_floppy_c_3988 (absence of a name: prefix to the path would indicate the symbol is not associated with any module) Doing this is more realistic now as a result of the clean-up patches that Nick introduced, e.g. https://lore.kernel.org/lkml/20230302211759.30135-1-nick.alc...@oracle.com/ > Changes from v1: > - Integrated changes requested by Masami to exclude symbols with prefixes > "_cfi" and "_pfx". > - Introduced a small framework to handle patterns that need to be excluded > from the alias production. > - Excluded other symbols using the framework. > - Introduced the ability to discriminate between text and data symbols. > - Added two new config symbols in this version: CONFIG_KALLSYMS_ALIAS_DATA, > which allows data for data, and CONFIG_KALLSYMS_ALIAS_DATA_ALL, which > excludes all filters and provides an alias for each duplicated symbol. > >
[PATCH v7 09/13] selftests/sgx: Fix linker script asserts
DEFINED only considers symbols, not section names. Hence, replace the check for .got.plt with the _GLOBAL_OFFSET_TABLE_ symbol and remove other (non-essential) asserts. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen --- tools/testing/selftests/sgx/test_encl.lds | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index 62d37160f59b..6ffdfc9fb4cf 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -35,8 +35,4 @@ SECTIONS } } -ASSERT(!DEFINED(.altinstructions), "ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.altinstr_replacement), "ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.discard.retpoline_safe), "RETPOLINE ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.discard.nospec), "RETPOLINE ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.got.plt), "Libcalls are not supported in enclaves") +ASSERT(!DEFINED(_GLOBAL_OFFSET_TABLE_), "Libcalls through GOT are not supported in enclaves") -- 2.25.1
[PATCH v7 07/13] selftests/sgx: Produce static-pie executable for test enclave
The current combination of -static and -fPIC creates a static executable with position-dependent addresses for global variables. Use -static-pie and -fPIE to create a proper static position independent executable that can be loaded at any address without a dynamic linker. When building the original "lea (encl_stack)(%rbx), %rax" assembly code with -static-pie -fPIE, the linker complains about a relocation it cannot resolve: /usr/local/bin/ld: /tmp/cchIWyfG.o: relocation R_X86_64_32S against `.data' can not be used when making a PIE object; recompile with -fPIE collect2: error: ld returned 1 exit status Thus, since only RIP-relative addressing is legit for local symbols, use "encl_stack(%rip)" and declare an explicit "__encl_base" symbol at the start of the linker script to be able to calculate the stack address relative to the current TCS in the enclave assembly entry code. Link: https://lore.kernel.org/all/f9c24d89-ed72-7d9e-c650-050d722c6...@cs.kuleuven.be/ Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/Makefile | 2 +- tools/testing/selftests/sgx/test_encl.lds | 1 + tools/testing/selftests/sgx/test_encl_bootstrap.S | 9 ++--- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile index 7eb890bdd3f0..8d2ba6adc92b 100644 --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -14,7 +14,7 @@ endif INCLUDES := -I$(top_srcdir)/tools/include HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC HOST_LDFLAGS := -z noexecstack -lcrypto -ENCL_CFLAGS += -Wall -Werror -static -nostdlib -ffreestanding -fPIC \ +ENCL_CFLAGS += -Wall -Werror -static-pie -nostdlib -ffreestanding -fPIE \ -fno-stack-protector -mrdrnd $(INCLUDES) ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index a1ec64f7d91f..62d37160f59b 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -10,6 +10,7 @@ PHDRS SECTIONS { . = 0; +__encl_base = .; .tcs : { *(.tcs*) } : tcs diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S index e0ce993d3f2c..28fe5d2ac0af 100644 --- a/tools/testing/selftests/sgx/test_encl_bootstrap.S +++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S @@ -42,9 +42,12 @@ encl_entry: # RBX contains the base address for TCS, which is the first address # inside the enclave for TCS #1 and one page into the enclave for - # TCS #2. By adding the value of encl_stack to it, we get - # the absolute address for the stack. - lea (encl_stack)(%rbx), %rax + # TCS #2. First make it relative by substracting __encl_base and + # then add the address of encl_stack to get the address for the stack. + lea __encl_base(%rip), %rax + sub %rax, %rbx + lea encl_stack(%rip), %rax + add %rbx, %rax jmp encl_entry_core encl_dyn_entry: # Entry point for dynamically created TCS page expected to follow -- 2.25.1
[PATCH v7 06/13] selftests/sgx: Remove redundant enclave base address save/restore
Remove redundant push/pop pair that stores and restores the enclave base address in the test enclave, as it is never used after the pop and can anyway be easily retrieved via the __encl_base symbol. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/test_encl_bootstrap.S | 3 --- 1 file changed, 3 deletions(-) diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S index 03ae0f57e29d..e0ce993d3f2c 100644 --- a/tools/testing/selftests/sgx/test_encl_bootstrap.S +++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S @@ -55,12 +55,9 @@ encl_entry_core: push%rax push%rcx # push the address after EENTER - push%rbx # push the enclave base address callencl_body - pop %rbx # pop the enclave base address - /* Clear volatile GPRs, except RAX (EEXIT function). */ xor %rcx, %rcx xor %rdx, %rdx -- 2.25.1
[PATCH v7 04/13] selftests/sgx: Separate linker options
Fixes "'linker' input unused [-Wunused-command-line-argument]" errors when compiling with clang. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen --- tools/testing/selftests/sgx/Makefile | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile index 50aab6b57da3..dcdd04b322f8 100644 --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -12,9 +12,11 @@ OBJCOPY := $(CROSS_COMPILE)objcopy endif INCLUDES := -I$(top_srcdir)/tools/include -HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC -z noexecstack -ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \ +HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC +HOST_LDFLAGS := -z noexecstack -lcrypto +ENCL_CFLAGS += -Wall -Werror -static -nostdlib -nostartfiles -fPIC \ -fno-stack-protector -mrdrnd $(INCLUDES) +ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx TEST_FILES := $(OUTPUT)/test_encl.elf @@ -28,7 +30,7 @@ $(OUTPUT)/test_sgx: $(OUTPUT)/main.o \ $(OUTPUT)/sigstruct.o \ $(OUTPUT)/call.o \ $(OUTPUT)/sign_key.o - $(CC) $(HOST_CFLAGS) -o $@ $^ -lcrypto + $(CC) $(HOST_CFLAGS) -o $@ $^ $(HOST_LDFLAGS) $(OUTPUT)/main.o: main.c $(CC) $(HOST_CFLAGS) -c $< -o $@ @@ -45,8 +47,8 @@ $(OUTPUT)/call.o: call.S $(OUTPUT)/sign_key.o: sign_key.S $(CC) $(HOST_CFLAGS) -c $< -o $@ -$(OUTPUT)/test_encl.elf: test_encl.lds test_encl.c test_encl_bootstrap.S - $(CC) $(ENCL_CFLAGS) -T $^ -o $@ -Wl,--build-id=none +$(OUTPUT)/test_encl.elf: test_encl.c test_encl_bootstrap.S + $(CC) $(ENCL_CFLAGS) $^ -o $@ $(ENCL_LDFLAGS) EXTRA_CLEAN := \ $(OUTPUT)/test_encl.elf \ -- 2.25.1
[PATCH v7 02/13] selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
Ensure sym_tab and sym_names are zero-initialized and add an early-out condition in the unlikely (erroneous) case that the enclave ELF file would not contain a symbol table. This addresses -Werror=maybe-uninitialized compiler warnings for gcc -O2. Fixes: 33c5aac3bf32 ("selftests/sgx: Test complete changing of page type flow") Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen --- tools/testing/selftests/sgx/load.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c index 94bdeac1cf04..c9f658e44de6 100644 --- a/tools/testing/selftests/sgx/load.c +++ b/tools/testing/selftests/sgx/load.c @@ -136,11 +136,11 @@ static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg) */ uint64_t encl_get_entry(struct encl *encl, const char *symbol) { + Elf64_Sym *symtab = NULL; + char *sym_names = NULL; Elf64_Shdr *sections; - Elf64_Sym *symtab; Elf64_Ehdr *ehdr; - char *sym_names; - int num_sym; + int num_sym = 0; int i; ehdr = encl->bin; @@ -161,6 +161,9 @@ uint64_t encl_get_entry(struct encl *encl, const char *symbol) } } + if (!symtab || !sym_names) + return 0; + for (i = 0; i < num_sym; i++) { Elf64_Sym *sym = [i]; -- 2.25.1
[PATCH v7 08/13] selftests/sgx: Handle relocations in test enclave
Static-pie binaries normally include a startup routine to perform any ELF relocations from .rela.dyn. Since the enclave loading process is different and glibc is not included, do the necessary relocation for encl_op_array entries manually at runtime relative to the enclave base to ensure correct function pointers. When keeping encl_op_array as a local variable on the stack, gcc without optimizations generates code that explicitly gets the right function addresses and stores them to create the array on the stack: encl_body: /* snipped */ leado_encl_op_put_to_buf(%rip), %rax mov%rax, -0x50(%rbp) leado_encl_op_get_from_buf(%rip), %rax mov%rax,-0x48(%rbp) leado_encl_op_put_to_addr(%rip), %rax /* snipped */ However, gcc -Os or clang generate more efficient code that initializes encl_op_array by copying a "prepared copy" containing the absolute addresses of the functions (i.e., relative to the image base starting from 0) generated by the compiler/linker: encl_body: /* snipped */ leaprepared_copy(%rip), %rsi lea-0x48(%rsp), %rdi mov$0x10,%ecx rep movsl %ds:(%rsi),%es:(%rdi) /* snipped */ When building the enclave with -static-pie, the compiler/linker includes relocation entries for the function symbols in the "prepared copy": Relocation section '.rela.dyn' at offset 0x4000 contains 12 entries: Offset Info Type Symbol /* snipped; "prepared_copy" starts at 0x6000 */ 6000 0008 R_X86_64_RELATIVE 6008 0008 R_X86_64_RELATIVE 6010 0008 R_X86_64_RELATIVE 6018 0008 R_X86_64_RELATIVE 6020 0008 R_X86_64_RELATIVE 6028 0008 R_X86_64_RELATIVE 6030 0008 R_X86_64_RELATIVE 6038 0008 R_X86_64_RELATIVE Static-pie binaries normally include a glibc "_dl_relocate_static_pie" routine that will perform these relocations as part of the startup. However, since the enclave loading process is different and glibc is not included, we cannot rely on these relocations to be performed. Without relocations, the code would erroneously jump to the _absolute_ function address loaded from the local copy. Thus, declare "encl_op_array" as global and manually relocate the loaded function-pointer entries relative to the enclave base at runtime. This generates the following code: encl_body: /* snipped */ leaencl_op_array(%rip), %rcx lea__encl_base(%rip), %rax add(%rcx,%rdx,8),%rax jmp*%rax Link: https://lore.kernel.org/all/150d8ca8-2c66-60d1-f9fc-8e6279824...@cs.kuleuven.be/ Link: https://lore.kernel.org/all/5c22de5a-4b3b-1f38-9771-409b4ec7f...@cs.kuleuven.be/#r Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/test_encl.c | 50 + 1 file changed, 35 insertions(+), 15 deletions(-) diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index ae791df3e5a5..649604c526e7 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -121,21 +121,41 @@ static void do_encl_op_nop(void *_op) } +/* + * Symbol placed at the start of the enclave image by the linker script. + * Declare this extern symbol with visibility "hidden" to ensure the compiler + * does not access it through the GOT and generates position-independent + * addressing as __encl_base(%rip), so we can get the actual enclave base + * during runtime. + */ +extern const uint8_t __attribute__((visibility("hidden"))) __encl_base; + +typedef void (*encl_op_t)(void *); +static const encl_op_t encl_op_array[ENCL_OP_MAX] = { + do_encl_op_put_to_buf, + do_encl_op_get_from_buf, + do_encl_op_put_to_addr, + do_encl_op_get_from_addr, + do_encl_op_nop, + do_encl_eaccept, + do_encl_emodpe, + do_encl_init_tcs_page, +}; + void encl_body(void *rdi, void *rsi) { - const void (*encl_op_array[ENCL_OP_MAX])(void *) = { - do_encl_op_put_to_buf, - do_encl_op_get_from_buf, - do_encl_op_put_to_addr, - do_encl_op_get_from_addr, - do_encl_op_nop, - do_encl_eaccept, - do_encl_emodpe, - do_encl_init_tcs_page, - }; - - struct encl_op_header *op = (struct encl_op_header *)rdi; - - if (op->type < ENCL_OP_MAX) - (*encl_op_array[op->type])(op); + struct encl_op_header *header = (struct encl_op_header *)rdi; + encl_op_t op; + + if (header->type >= ENCL_OP_MAX) + return; + + /* +* The enclave base address needs to be added, as this call site +* *cannot be* made rip-relative by the compiler, or fixed up by +* any other possible means. +*/ + op = ((uint64_t)&__encl_base) +
[PATCH v7 03/13] selftests/sgx: Include memory clobber for inline asm in test enclave
Add the "memory" clobber to the EMODPE and EACCEPT asm blocks to tell the compiler the assembly code accesses to the secinfo struct. This ensures the compiler treats the asm block as a memory barrier and the write to secinfo will be visible to ENCLU. Fixes: 20404a808593 ("selftests/sgx: Add test for EPCM permission changes") Signed-off-by: Jo Van Bulck Reviewed-by: Kai Huang Reviewed-by: Jarkko Sakkinen --- tools/testing/selftests/sgx/test_encl.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index c0d6397295e3..ae791df3e5a5 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -24,10 +24,11 @@ static void do_encl_emodpe(void *_op) secinfo.flags = op->flags; asm volatile(".byte 0x0f, 0x01, 0xd7" - : + : /* no outputs */ : "a" (EMODPE), "b" (), - "c" (op->epc_addr)); + "c" (op->epc_addr) + : "memory" /* read from secinfo pointer */); } static void do_encl_eaccept(void *_op) @@ -42,7 +43,8 @@ static void do_encl_eaccept(void *_op) : "=a" (rax) : "a" (EACCEPT), "b" (), - "c" (op->epc_addr)); + "c" (op->epc_addr) + : "memory" /* read from secinfo pointer */); op->ret = rax; } -- 2.25.1
[PATCH v7 12/13] selftests/sgx: Discard unsupported ELF sections
Building the test enclave with -static-pie may produce a dynamic symbol table, but this is not supported for enclaves and any relocations need to happen manually (e.g., as for "encl_op_array"). Thus, opportunistically discard ".dyn*" and ".gnu.hash" which the enclave loader cannot handle. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen --- tools/testing/selftests/sgx/test_encl.lds | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index 333a3e78fdc9..ffe851a1cac4 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -33,6 +33,8 @@ SECTIONS *(.note*) *(.debug*) *(.eh_frame*) + *(.dyn*) + *(.gnu.hash) } } -- 2.25.1
[PATCH v7 00/13] selftests/sgx: Fix compilation errors
Hi, This patch series ensures that all SGX selftests succeed when compiling with optimizations (as tested with -O{0,1,2,3,s} for both gcc 11.3.0 and clang 14.0.0). The aim of the patches is to avoid reliance on undefined, compiler-specific behavior that can make the test results fragile. As far as I see, all commits in this series now have an explicit reviewed-by tag, so hopefully this can get merged upstream? Please let me know if any concerns remain and I'd happily address them. Reference output below: .. Testing gcc -O0[OK] .. Testing gcc -O1[OK] .. Testing gcc -O2[OK] .. Testing gcc -O3[OK] .. Testing gcc -Os[OK] .. Testing gcc -Ofast [OK] .. Testing gcc -Og[OK] .. Testing clang -O0[OK] .. Testing clang -O1[OK] .. Testing clang -O2[OK] .. Testing clang -O3[OK] .. Testing clang -Os[OK] .. Testing clang -Ofast [OK] .. Testing clang -Og[OK] Changelog - v7 - Add reviewed-by tag (Jarkko) v6 - Collect final ack/reviewed-by tags (Jarkko, Kai) v5 - Reorder patches (Jarkko, Kai) - Include fixes tag for inline asm memory clobber patch (Kai) - Include linker error in static-pie commit message (Kai) - Include generated assembly in relocations commit (Kai) v4 - Remove redundant -nostartfiles compiler flag (Jarkko) - Split dynamic symbol table removal in separate commit (Kai) - Split redundant push/pop elimination in separate commit (Kai) - Remove (incomplete) register cleansing on enclave exit - Fix possibly uninitialized pointer dereferences in load.c v3 - Refactor encl_op_array declaration and indexing (Jarkko) - Annotate encl_buffer with "used" attribute (Kai) - Split encl_buffer size and placement commits (Kai) v2 - Add additional check for NULL pointer (Kai) - Refine to produce proper static-pie executable - Fix linker script assertions - Specify memory clobber for inline asm instead of volatile (Kai) - Clarify why encl_buffer non-static (Jarkko, Kai) - Clarify -ffreestanding (Jarkko) Best, Jo Jo Van Bulck (13): selftests/sgx: Fix uninitialized pointer dereference in error path selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry selftests/sgx: Include memory clobber for inline asm in test enclave selftests/sgx: Separate linker options selftests/sgx: Specify freestanding environment for enclave compilation selftests/sgx: Remove redundant enclave base address save/restore selftests/sgx: Produce static-pie executable for test enclave selftests/sgx: Handle relocations in test enclave selftests/sgx: Fix linker script asserts selftests/sgx: Ensure test enclave buffer is entirely preserved selftests/sgx: Ensure expected location of test enclave buffer selftests/sgx: Discard unsupported ELF sections selftests/sgx: Remove incomplete ABI sanitization code in test enclave tools/testing/selftests/sgx/Makefile | 12 ++-- tools/testing/selftests/sgx/defines.h | 2 + tools/testing/selftests/sgx/load.c| 9 ++- tools/testing/selftests/sgx/sigstruct.c | 5 +- tools/testing/selftests/sgx/test_encl.c | 67 +-- tools/testing/selftests/sgx/test_encl.lds | 10 +-- .../selftests/sgx/test_encl_bootstrap.S | 28 +++- 7 files changed, 77 insertions(+), 56 deletions(-) -- 2.25.1
[PATCH v7 01/13] selftests/sgx: Fix uninitialized pointer dereference in error path
Ensure ctx is zero-initialized, such that the encl_measure function will not call EVP_MD_CTX_destroy with an uninitialized ctx pointer in case of an early error during key generation. Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX") Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/sigstruct.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/sgx/sigstruct.c b/tools/testing/selftests/sgx/sigstruct.c index a07896a46364..d73b29becf5b 100644 --- a/tools/testing/selftests/sgx/sigstruct.c +++ b/tools/testing/selftests/sgx/sigstruct.c @@ -318,9 +318,9 @@ bool encl_measure(struct encl *encl) struct sgx_sigstruct *sigstruct = >sigstruct; struct sgx_sigstruct_payload payload; uint8_t digest[SHA256_DIGEST_LENGTH]; + EVP_MD_CTX *ctx = NULL; unsigned int siglen; RSA *key = NULL; - EVP_MD_CTX *ctx; int i; memset(sigstruct, 0, sizeof(*sigstruct)); @@ -384,7 +384,8 @@ bool encl_measure(struct encl *encl) return true; err: - EVP_MD_CTX_destroy(ctx); + if (ctx) + EVP_MD_CTX_destroy(ctx); RSA_free(key); return false; } -- 2.25.1
[PATCH v7 05/13] selftests/sgx: Specify freestanding environment for enclave compilation
Use -ffreestanding to assert the enclave compilation targets a freestanding environment (i.e., without "main" or standard libraries). This fixes clang reporting "undefined reference to `memset'" after erroneously optimizing away the provided memset/memcpy implementations. Still need to instruct the linker from using standard system startup functions, but drop -nostartfiles as it is implied by -nostdlib. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile index dcdd04b322f8..7eb890bdd3f0 100644 --- a/tools/testing/selftests/sgx/Makefile +++ b/tools/testing/selftests/sgx/Makefile @@ -14,7 +14,7 @@ endif INCLUDES := -I$(top_srcdir)/tools/include HOST_CFLAGS := -Wall -Werror -g $(INCLUDES) -fPIC HOST_LDFLAGS := -z noexecstack -lcrypto -ENCL_CFLAGS += -Wall -Werror -static -nostdlib -nostartfiles -fPIC \ +ENCL_CFLAGS += -Wall -Werror -static -nostdlib -ffreestanding -fPIC \ -fno-stack-protector -mrdrnd $(INCLUDES) ENCL_LDFLAGS := -Wl,-T,test_encl.lds,--build-id=none -- 2.25.1
[PATCH v7 11/13] selftests/sgx: Ensure expected location of test enclave buffer
The external tests manipulating page permissions expect encl_buffer to be placed at the start of the test enclave's .data section. As this is not guaranteed per the C standard, explicitly place encl_buffer in a separate section that is explicitly placed at the start of the .data segment in the linker script to avoid the compiler placing it somewhere else in .data. Signed-off-by: Jo Van Bulck Reviewed-by: Jarkko Sakkinen Acked-by: Kai Huang --- tools/testing/selftests/sgx/defines.h | 1 + tools/testing/selftests/sgx/test_encl.c | 8 tools/testing/selftests/sgx/test_encl.lds | 1 + 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index b8f482667ce1..402f8787a71c 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -14,6 +14,7 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __packed __attribute__((packed)) #define __used __attribute__((used)) +#define __section(x)__attribute__((__section__(x))) #include "../../../../arch/x86/include/asm/sgx.h" #include "../../../../arch/x86/include/asm/enclu.h" diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index 7465f121fb74..2c4d709cce2d 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -6,11 +6,11 @@ /* * Data buffer spanning two pages that will be placed first in the .data - * segment. Even if not used internally the second page is needed by external - * test manipulating page permissions, so mark encl_buffer as "used" to make - * sure it is entirely preserved by the compiler. + * segment via the linker script. Even if not used internally the second page + * is needed by external test manipulating page permissions, so mark + * encl_buffer as "used" to make sure it is entirely preserved by the compiler. */ -static uint8_t __used encl_buffer[8192] = { 1 }; +static uint8_t __used __section(".data.encl_buffer") encl_buffer[8192] = { 1 }; enum sgx_enclu_function { EACCEPT = 0x5, diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index 6ffdfc9fb4cf..333a3e78fdc9 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -24,6 +24,7 @@ SECTIONS } : text .data : { + *(.data.encl_buffer) *(.data*) } : data -- 2.25.1
Re: (subset) [PATCH v2 1/2] dt-bindings: mfd: qcom,spmi-pmic: Drop unused labels from examples
On Mon, 02 Oct 2023 09:00:11 +0200, Luca Weiss wrote: > There's not much point in having unused labels in the binding example, > so drop them. > > This patch was originally motivated by ea25d61b448a ("arm64: dts: qcom: > Use plural _gpios node label for PMIC gpios") updating all dts files to > use the plural _gpios label instead of the singular _gpio as label but > this example wasn't updated. But since we should just drop the label > alltogether, do that. > > [...] Applied, thanks! [1/2] dt-bindings: mfd: qcom,spmi-pmic: Drop unused labels from examples commit: cac94656ff2b16827d7cd455f0d3746280cf3138 -- Lee Jones [李琼斯]
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On Thu, Oct 5, 2023 at 2:05 PM Wilczynski, Michal wrote: > On 10/5/2023 12:57 PM, Rafael J. Wysocki wrote: > > On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal > > wrote: [cut] > >>> > >>> That said, why exactly is it better to use acpi_handle instead of a > >>> struct acpi_device pointer? > >> I wanted to make the wrapper as close as possible to the wrapped function. > >> This way it would be easier to remove it in the future i.e if we ever deem > >> extra synchronization not worth it etc. What the ACPICA function need to > >> install a wrapper is a handle not a pointer to a device. > >> So there is no need for a middle man. > > Taking a struct acpi_device pointer as the first argument is part of > > duplication reduction, however, because in the most common case it > > saves the users of it the need to dereference the struct acpi_device > > they get from ACPI_COMPANION() in order to obtain the handle. > > User don't even have to use acpi device anywhere, as he can choose > to use ACPI_HANDLE() instead on 'struct device*' and never interact > with acpi device directly. Have you actually looked at this macro? It is a wrapper around ACPI_COMPANION(). So they may think that they don't use struct acpi_device pointers, but in fact they do. > > > > Arguably, acpi_handle is an ACPICA concept and it is better to reduce > > its usage outside ACPICA. > > Use of acpi_handle is deeply entrenched in the kernel. There is even > a macro ACPI_HANDLE() that returns acpi_handle. I would say it's > way too late to limit it to ACPICA internal code. So there is a difference between "limiting to ACPICA" and "reducing". It cannot be limited to ACPICA, because the code outside ACPICA needs to evaluate ACPI objects sometimes and ACPI handles are needed for that. And this observation doesn't invalidate the point. > > > >>> Realistically, in a platform driver you'll need the latter to obtain > >>> the former anyway. > >> I don't want to introduce arbitrary limitations where they are not > >> necessary. > > I'm not sure what you mean. This patch is changing existing functions. > > That's true, but those functions aren't yet deeply entrenched in the > kernel yet, so in my view how they should look like should still be > a subject for discussion, as for now they're only used locally in > drivers/acpi, and my next patchset, that would remove .notify in > platform directory would spread them more, and would > make them harder to change. For now we can change how they > work pretty painlessly. I see no particular reason to do that, though. What specifically is a problem with passing struct acpi_device pointers to the wrappers? I don't see any TBH. > > > >> It is often the case that driver allocates it's own private struct using > >> kmalloc > >> family of functions, and that structure already contains everything that is > >> needed to remove the handler, so why force ? There are already examples > >> in the drivers that do that i.e in acpi_video the function > >> acpi_video_dev_add_notify_handler() uses raw ACPICA handler to install > >> a notify handler and it passes private structure there. > >> So there is value in leaving the choice of an actual type to the user of > >> the > >> API. > > No, if the user has a pointer to struct acpi_device already, there is > > no difference between passing this and passing the acpi_handle from it > > except for the extra dereference in the latter case. > > Dereference would happen anyway in the wrapper, and it doesn't cause > any harm anyway for readability in my opinion. And of course you don't > have to use acpi device at all, you can use ACPI_HANDLE() macro. So one can use ACPI_COMPANION() just as well and it is slightly less overhead. > > > > If the user doesn't have a struct acpi_device pointer, let them use > > the raw ACPICA handler directly and worry about the synchronization > > themselves. > > As mentioned acpi_device pointer is not really required to use the wrapper. > Instead we can use ACPI_HANDLE() macro directly. Look at the usage of > the wrapper in the AC driver [1]. You don't really have to repeat the same argument several times and I know how ACPI_HANDLE() works. Also I don't like some of the things done by this patch. Whoever uses ACPI_HANDLE(), they also use ACPI_COMPANION() which is hidden in the former. If they don't need to store either the acpi_handle or the struct acpi_device pointer, there is no reason at all to use the former instead of the latter. If they get an acpi_handle from somewhere else than ACPI_HANDLE(), then yes, they would need to get the ACPI devices from there (which is possible still), but they may be better off by using the raw ACPICA interface for events in that case. > -static void acpi_ac_remove(struct acpi_device *device) > +static void acpi_ac_remove(struct platform_device *pdev) > { > - struct acpi_ac *ac = acpi_driver_data(device); > + struct acpi_ac *ac = platform_get_drvdata(pdev); > > -
[PATCH rebased] kbuild: rpm-pkg: Fix build with non-default MODLIB
The default MODLIB value is composed of two variables and the hardcoded string '/lib/modules/'. MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) Defining this middle part as a variable was rejected on the basis that users can pass the whole MODLIB to make, such as make 'MODLIB=$(INSTALL_MOD_PATH)/usr/lib/modules/$(KERNELRELEASE)' However, this middle part of MODLIB is independently hardcoded by rpm-pkg, and when the user alters MODLIB this is not reflected when building the package. Given that $(INSTALL_MOD_PATH) is overridden during the rpm package build it is likely going to be empty. Then MODLIB can be passed to the rpm package, and used in place of the whole /usr/lib/modules/$(KERNELRELEASE) part. Signed-off-by: Michal Suchanek --- scripts/package/kernel.spec | 8 scripts/package/mkspec | 1 + 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/scripts/package/kernel.spec b/scripts/package/kernel.spec index 3eee0143e0c5..15f49c5077db 100644 --- a/scripts/package/kernel.spec +++ b/scripts/package/kernel.spec @@ -67,7 +67,7 @@ cp $(%{make} %{makeflags} -s image_name) %{buildroot}/boot/vmlinuz-%{KERNELRELEA %{make} %{makeflags} INSTALL_HDR_PATH=%{buildroot}/usr headers_install cp System.map %{buildroot}/boot/System.map-%{KERNELRELEASE} cp .config %{buildroot}/boot/config-%{KERNELRELEASE} -ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}/lib/modules/%{KERNELRELEASE}/build +ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}%{MODLIB}/build %if %{with_devel} %{make} %{makeflags} run-command KBUILD_RUN_COMMAND='${srctree}/scripts/package/install-extmod-build %{buildroot}/usr/src/kernels/%{KERNELRELEASE}' %endif @@ -98,8 +98,8 @@ fi %files %defattr (-, root, root) -/lib/modules/%{KERNELRELEASE} -%exclude /lib/modules/%{KERNELRELEASE}/build +%{MODLIB} +%exclude %{MODLIB}/build /boot/* %files headers @@ -110,5 +110,5 @@ fi %files devel %defattr (-, root, root) /usr/src/kernels/%{KERNELRELEASE} -/lib/modules/%{KERNELRELEASE}/build +%{MODLIB}/build %endif diff --git a/scripts/package/mkspec b/scripts/package/mkspec index d41608efb747..d41b2e5304ac 100755 --- a/scripts/package/mkspec +++ b/scripts/package/mkspec @@ -18,6 +18,7 @@ fi cat<
Re: [PATCH v5 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages
On Wed, 2023-10-04 at 23:22 -0500, Haitao Huang wrote: > On Wed, 04 Oct 2023 16:13:41 -0500, Huang, Kai wrote: > > > On Wed, 2023-10-04 at 10:03 -0500, Haitao Huang wrote: > > > On Tue, 03 Oct 2023 15:07:42 -0500, Huang, Kai > > > wrote: > > > > > > > On Tue, 2023-10-03 at 01:45 -0500, Haitao Huang wrote: > > > > > > > > > > > > Btw, probably a dumb question: > > > > > > > > > > > > Theoretically if you only need to find a victim enclave you don't > > > need > > > > > > to put VA > > > > > > pages to the unreclaimable list, because those VA pages will be > > > freed > > > > > > anyway > > > > > > when enclave is killed. So keeping VA pages in the list is for> > > > > > accounting all > > > > > > the pages that the cgroup is having? > > > > > > > > > > Yes basically tracking them in cgroups as they are allocated. > > > > > > > > > > VAs and SECS may also come and go as swapping/unswapping happens. > > > But > > > > > if acgroup is OOM, and all reclaimables are gone (swapped out), it'd > > > > > have toreclaim VAs/SECs in the same cgroup starting from the front > > > of > > > > > the LRUlist. To reclaim a VA/SECS, it identifies the enclave from > > > the > > > > > owner ofthe VA/SECS page and kills it, as killing enclave is the > > > only > > > > > way toreclaim VA/SECS pages. > > > > > > > > To kill enclave you just need to track SECS in the unreclaimable > > > list. > > > > Only when you want to account the total EPC pages via some list you > > > > _probably_ > > > > need to track VA as well. But I am not quite sure about this either. > > > > > > There is a case where even SECS is paged out for an enclave with all > > > reclaimables out. > > > > Yes. But this essentially means these enclaves are not active, thus > > shouldn't > > be the victim of OOM? > > > > But there are VA pages for the enclave at that moment. So it can be > candidate for OOM victim. Yes. I am not familiar with how does OOM choose victim, but it seems choosing inactive enclaves seems more reasonable. [...] > > > There were some discussion on paging out VAs without killing enclaves > > > but > > > it'd be complicated and not implemented yet. > > > > No we don't involve swapping VA pages now. It's a separate topic. > > > Only mentioned it as a kind of constraints impacting current design. > > Another potential alternative: we don't reclaim SECS either until OOM and > only track SECS pages for cgroups. But that would change current behavior. > And I'm not sure about other consequences, e.g., enclaves theoretically > can allocate pages (including VA pages) in different cgroups/processes, so > we may still end up tracking all VA pages for cgroups or we track SECS > page in all cgroups in which enclave allocated any pages. Let me know your > thoughts. Let's not change current behaviour. I seriously doubt that is needed. So it seems to me that what we need is just some way to let the OOM find some victim enclave. I am not sure whether "tracking EPC pages in some lists" has anything to do with cgroup accounting EPC pages, so will take a look the rest of the patches.
Re: [PATCH 2/2] arm64: dts: qcom: msm8939-huawei-kiwi: Add initial device tree
>> Is this fine? >> Should I send a V2 with the signoff and reserved-memory changes? >I don't quite get it, what signoff? Krzysztof noticed that the Signed-off-by are in the wrong order. But I was told this alone is not worth a V2.
Re: [PATCH v5 13/18] x86/sgx: Expose sgx_reclaim_pages() for use by EPC cgroup
On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > From: Sean Christopherson > > Adjust and expose the top-level reclaim function as > sgx_reclaim_epc_pages() for use by the upcoming EPC cgroup, which will > initiate reclaim to enforce the max limit. > > Make these adjustments to the function signature. > > 1) To take a parameter that specifies the number of pages to scan for > reclaiming. Define a max value of 32, but scan 16 in the case for the > global reclaimer (ksgxd). The EPC cgroup will use it to specify a > desired number of pages to be reclaimed up to the max value of 32. > > 2) To take a flag to force reclaiming a page regardless of its age. The > EPC cgroup will use the flag to enforce its limits by draining the > reclaimable lists before resorting to other measures, e.g. forcefully > kill enclaves. > > 3) Return the number of reclaimed pages. The EPC cgroup will use the > result to track reclaiming progress and escalate to a more forceful > reclaiming mode, e.g., calling this function with the flag to ignore age > of pages. > > Signed-off-by: Sean Christopherson > Co-developed-by: Kristen Carlson Accardi > Signed-off-by: Kristen Carlson Accardi > Co-developed-by: Haitao Huang > Signed-off-by: Haitao Huang > Cc: Sean Christopherson > --- > V4: > - Combined the 3 patches that made the individual changes to the > function signature. > - Removed 'high' limit in commit message. > --- > arch/x86/kernel/cpu/sgx/main.c | 31 +-- > arch/x86/kernel/cpu/sgx/sgx.h | 1 + > 2 files changed, 22 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 3b875ab4dcd0..4e1a3e038db5 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -18,6 +18,11 @@ > #include "encl.h" > #include "encls.h" > > +/* > + * Maximum number of pages to scan for reclaiming. > + */ > +#define SGX_NR_TO_SCAN_MAX 32 > + > struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; > static int sgx_nr_epc_sections; > static struct task_struct *ksgxd_tsk; > @@ -279,7 +284,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page > *epc_page, > mutex_unlock(>lock); > } > > -/* > +/** > + * sgx_reclaim_epc_pages() - Reclaim EPC pages from the consumers > + * @nr_to_scan: Number of EPC pages to scan for reclaim > + * @ignore_age: Reclaim a page even if it is young > + * > * Take a fixed number of pages from the head of the active page pool and > * reclaim them to the enclave's private shmem files. Skip the pages, which > have > * been accessed since the last scan. Move those pages to the tail of active > @@ -292,15 +301,14 @@ static void sgx_reclaimer_write(struct sgx_epc_page > *epc_page, > * problematic as it would increase the lock contention too much, which would > * halt forward progress. > */ > -static void sgx_reclaim_pages(void) > +size_t sgx_reclaim_epc_pages(size_t nr_to_scan, bool ignore_age) 'size_t' looks odd. Any reason to use it? Given you only scan 32 at maximum, seems 'int' is good enough? > { > - struct sgx_backing backing[SGX_NR_TO_SCAN]; > + struct sgx_backing backing[SGX_NR_TO_SCAN_MAX]; > struct sgx_epc_page *epc_page, *tmp; > struct sgx_encl_page *encl_page; > pgoff_t page_index; > LIST_HEAD(iso); > - int ret; > - int i; > + size_t ret, i; > > spin_lock(_global_lru.lock); > for (i = 0; i < SGX_NR_TO_SCAN; i++) { The function comment says * @nr_to_scan: Number of EPC pages to scan for reclaim But I don't see it is even used, if my eye isn't deceiving me? > @@ -326,13 +334,14 @@ static void sgx_reclaim_pages(void) > spin_unlock(_global_lru.lock); > > if (list_empty()) > - return; > + return 0; > > i = 0; > list_for_each_entry_safe(epc_page, tmp, , list) { > encl_page = epc_page->encl_page; > > - if (!sgx_reclaimer_age(epc_page)) > + if (i == SGX_NR_TO_SCAN_MAX || i == nr_to_scan? And should we have a if (nr_to_scan < SGX_NR_TO_SCAN_MAX) return 0; at the very beginning of this function? > + (!ignore_age && !sgx_reclaimer_age(epc_page))) > goto skip; > > page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base); > @@ -371,6 +380,8 @@ static void sgx_reclaim_pages(void) > > sgx_free_epc_page(epc_page); > } > + > + return i; > } > I found this function a little bit odd, given the mixing of 'nr_to_scan', SGX_NR_TO_SCAN and SGX_NR_TO_SCAN_MAX. From the changelog: 1) To take a parameter that specifies the number of pages to scan for reclaiming. Define a max value of 32, but scan 16 in the case for the global reclaimer (ksgxd). It appears we want to make this function to scan @nr_to_scan for
Re: [PATCH v5 09/18] x86/sgx: Store struct sgx_encl when allocating new VA pages
On Wed, 04 Oct 2023 16:13:41 -0500, Huang, Kai wrote: On Wed, 2023-10-04 at 10:03 -0500, Haitao Huang wrote: On Tue, 03 Oct 2023 15:07:42 -0500, Huang, Kai wrote: > On Tue, 2023-10-03 at 01:45 -0500, Haitao Huang wrote: > > > > > > Btw, probably a dumb question: > > > > > > Theoretically if you only need to find a victim enclave you don't need > > > to put VA > > > pages to the unreclaimable list, because those VA pages will be freed > > > anyway > > > when enclave is killed. So keeping VA pages in the list is for> > > accounting all > > > the pages that the cgroup is having? > > > > Yes basically tracking them in cgroups as they are allocated. > > > > VAs and SECS may also come and go as swapping/unswapping happens. But > > if acgroup is OOM, and all reclaimables are gone (swapped out), it'd > > have toreclaim VAs/SECs in the same cgroup starting from the front of > > the LRUlist. To reclaim a VA/SECS, it identifies the enclave from the > > owner ofthe VA/SECS page and kills it, as killing enclave is the only > > way toreclaim VA/SECS pages. > > To kill enclave you just need to track SECS in the unreclaimable list. > Only when you want to account the total EPC pages via some list you > _probably_ > need to track VA as well. But I am not quite sure about this either. There is a case where even SECS is paged out for an enclave with all reclaimables out. Yes. But this essentially means these enclaves are not active, thus shouldn't be the victim of OOM? But there are VA pages for the enclave at that moment. So it can be candidate for OOM victim. So cgroup needs to track each page used by an enclave and kill enclave when cgroup needs to lower usage by evicting an VA or SECS page. Let's discuss more on tracking SECS on unreclaimable list only. Could we assume that when the OOM wants to pick up a victim to serve the new enclave, there must be at least another one *active* enclave which still has the SECS page in EPC? No, at a given instant when OOM happens, "active" enclave's SECS may not be in EPC, but lots of VAs. OOM := "no reclaimable pages left in the cgroup to reclaim and total usage is still at/near limit". If yes, that enclave will be selected as victim. If not, then no other enclave will be selected as victim. Instead, only the new enclave which is requesting more EPC will be selected, because it's SECS is on the unreclaimable list. You can't assume the requesting enclave's SECS is in unreclaimable list either. Think the request is from #PF in the scenario we fixed the NULL pointer of SECS by reloading it. Somehow this is unacceptable, thus we need to track VA pages too in order to kill other inactive enclave? If we know for sure SECS will always be in EPC, thus tracked in unreclaimables, then we probably can do it (see below). I hope the reason given above is clear. There were some discussion on paging out VAs without killing enclaves but it'd be complicated and not implemented yet. No we don't involve swapping VA pages now. It's a separate topic. Only mentioned it as a kind of constraints impacting current design. Another potential alternative: we don't reclaim SECS either until OOM and only track SECS pages for cgroups. But that would change current behavior. And I'm not sure about other consequences, e.g., enclaves theoretically can allocate pages (including VA pages) in different cgroups/processes, so we may still end up tracking all VA pages for cgroups or we track SECS page in all cgroups in which enclave allocated any pages. Let me know your thoughts. BTW, I need clarify tracking pages which is done by LRUs vs usage accounting which is done by charge/uncharge to misc. To me tracking is for reclaiming not accounting. Also vEPCs not tracked at all but they are accounted for. I'll review the rest patches. Thanks. Thank you! Haitao
Re: [PATCH v5 15/18] x86/sgx: Prepare for multiple LRUs
On Fri, 2023-09-22 at 20:06 -0700, Haitao Huang wrote: > +static inline struct sgx_epc_lru_lists *sgx_lru_lists(struct sgx_epc_page > *epc_page) > +{ > + return _global_lru; > +} > + > +static inline bool sgx_can_reclaim(void) > +{ > + return !list_empty(_global_lru.reclaimable); > +} > + Shouldn't sgx_can_reclaim() also take a 'struct sgx_epc_lru_lists *'? I thought we also need to check whether a cgroup's LRU lists can be reclaimed?
Re: [RFC PATCH] tracing: change syscall number type in struct syscall_trace_*
On Wed, Oct 04, 2023 at 02:55:47PM +0200, Artem Savkov wrote: > On Tue, Oct 03, 2023 at 09:38:44PM -0400, Steven Rostedt wrote: > > On Mon, 2 Oct 2023 15:52:42 +0200 > > Artem Savkov wrote: > > > > > linux-rt-devel tree contains a patch that adds an extra member to struct > > > trace_entry. This causes the offset of args field in struct > > > trace_event_raw_sys_enter be different from the one in struct > > > syscall_trace_enter: > > > > This patch looks like it's fixing the symptom and not the issue. No code > > should rely on the two event structures to be related. That's an unwanted > > coupling, that will likely cause issues down the road (like the RT patch > > you mentioned). > > I agree, but I didn't see a better solution and that was my way of > starting conversation, thus the RFC. > > > > > > > struct trace_event_raw_sys_enter { > > > struct trace_entry ent; /* 012 */ > > > > > > /* XXX last struct has 3 bytes of padding */ > > > /* XXX 4 bytes hole, try to pack */ > > > > > > long int id; /*16 8 */ > > > long unsigned int args[6]; /*2448 */ > > > /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ > > > char __data[]; /*72 0 */ > > > > > > /* size: 72, cachelines: 2, members: 4 */ > > > /* sum members: 68, holes: 1, sum holes: 4 */ > > > /* paddings: 1, sum paddings: 3 */ > > > /* last cacheline: 8 bytes */ > > > }; > > > > > > struct syscall_trace_enter { > > > struct trace_entry ent; /* 012 */ > > > > > > /* XXX last struct has 3 bytes of padding */ > > > > > > intnr; /*12 4 */ > > > long unsigned int args[]; /*16 0 */ > > > > > > /* size: 16, cachelines: 1, members: 3 */ > > > /* paddings: 1, sum paddings: 3 */ > > > /* last cacheline: 16 bytes */ > > > }; > > > > > > This, in turn, causes perf_event_set_bpf_prog() fail while running bpf > > > test_profiler testcase because max_ctx_offset is calculated based on the > > > former struct, while off on the latter: > > > > The above appears to be pointing to the real bug. The "is calculated based > > on the former struct while off on the latter" Why are the two being used > > together? They are supposed to be *unrelated*! > > > > > > > > > > 10488 if (is_tracepoint || is_syscall_tp) { > > > 10489 int off = > > > trace_event_get_offsets(event->tp_event); > > > > So basically this is clumping together the raw_syscalls with the syscalls > > events as if they are the same. But the are not. They are created > > differently. It's basically like using one structure to get the offsets of > > another structure. That would be a bug anyplace else in the kernel. Sounds > > like it's a bug here too. > > > > I think the issue is with this code, not the tracing code. > > > > We could expose the struct syscall_trace_enter and syscall_trace_exit if > > the offsets to those are needed. > > I don't think we need syscall_trace_* offsets, looks like > trace_event_get_offsets() should return offset trace_event_raw_sys_enter > instead. I am still trying to figure out how all of this works together. > Maybe Alexei or Andrii have more context here. Turns out it is even more confusing. The tests dereference the context as struct trace_event_raw_sys_enter so bpf verifier sets max_ctx_offset based on that, then perf_event_set_bpf_prog() checks this offset against the one in struct syscall_trace_enter, but what bpf prog really gets is a pointer to struct syscall_tp_t from kernel/trace/trace_syscalls.c. I don't know the history behind these decisions, but should the tests dereference context as struct syscall_trace_enter instead and struct syscall_tp_t be changed to have syscall_nr as int? -- Artem
[PATCH] bus: mhi: host: Add tracing support
ord0, dword1), + + TP_STRUCT__entry( + __string(name, name) + __field(u64, ptr) + __field(int, dword0) + __field(int, dword1) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->ptr = ptr; + __entry->dword0 = dword0; + __entry->dword1 = dword1; + ), + + TP_printk("%s: Processing Event:0x%llx 0x%08x 0x%08x\n", + __get_str(name), __entry->ptr, __entry->dword0, __entry->dword1) +); + +TRACE_EVENT(mhi_process_ctrl_ev_ring, + + TP_PROTO(const char *name, u64 rp, u64 ptr, int dword0, int dword1, const char *state), + + TP_ARGS(name, rp, ptr, dword0, dword1, state), + + TP_STRUCT__entry( + __string(name, name) + __field(u64, rp) + __field(u64, ptr) + __field(int, dword0) + __field(int, dword1) + __string(state, state) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->rp = rp; + __entry->ptr = ptr; + __entry->dword0 = dword0; + __entry->dword1 = dword1; + __assign_str(state, state); + ), + + TP_printk("%s: RP:0x%llx Processing Event:0x%llx 0x%08x 0x%08x state:%s\n", + __get_str(name), __entry->rp, __entry->ptr, __entry->dword0, + __entry->dword1, __get_str(state)) +); + +TRACE_EVENT(mhi_update_channel_state_start, + + TP_PROTO(const char *name, int ch_num, const char *state), + + TP_ARGS(name, ch_num, state), + + TP_STRUCT__entry( + __string(name, name) + __field(int, ch_num) + __string(state, state) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->ch_num = ch_num; + __assign_str(state, state); + ), + + TP_printk("%s: ch%d: Updating state to: %s\n", + __get_str(name), __entry->ch_num, __get_str(state)) +); + +TRACE_EVENT(mhi_update_channel_state_end, + + TP_PROTO(const char *name, int ch_num, const char *state), + + TP_ARGS(name, ch_num, state), + + TP_STRUCT__entry( + __string(name, name) + __field(int, ch_num) + __string(state, state) + ), + + TP_fast_assign( + __assign_str(name, name); + __entry->ch_num = ch_num; + __assign_str(state, state); + ), + + TP_printk("%s: ch%d: Updated state to: %s\n", + __get_str(name), __entry->ch_num, __get_str(state)) +); + +TRACE_EVENT(mhi_pm_st_worker, + + TP_PROTO(const char *name, const char *state), + + TP_ARGS(name, state), + + TP_STRUCT__entry( + __string(name, name) + __string(state, state) + ), + + TP_fast_assign( + __assign_str(name, name); + __assign_str(state, state); + ), + + TP_printk("%s: Handling state transition: %s\n", __get_str(name), __get_str(state)) +); + +#endif +#include --- base-commit: 3006adf3be79cde4d14b1800b963b82b6e5572e0 change-id: 20231005-ftrace_support-6869d4156139 Best regards, -- Krishna chaitanya chundru
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On 10/5/2023 12:57 PM, Rafael J. Wysocki wrote: > On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal > wrote: >> Hi, >> >> Thanks for your review ! >> >> On 10/4/2023 9:09 PM, Rafael J. Wysocki wrote: >>> On Mon, Sep 25, 2023 at 6:31 PM Michal Wilczynski >>> wrote: acpi_dev_install_notify_handler() and acpi_dev_remove_notify_handler() are wrappers around ACPICA installers. They are meant to save some duplicated code from drivers. However as we're moving towards drivers operating on platform_device they become a bit inconvenient to use as inside the driver code we mostly want to use driver data of platform device instead of ACPI device. >>> That's fair enough, but -> >>> Make notify handlers installer wrappers more generic, while still saving some code that would be duplicated otherwise. Reviewed-by: Andy Shevchenko Signed-off-by: Michal Wilczynski --- Notes: So one solution could be to just replace acpi_device with platform_device as an argument in those functions. However I don't believe this is a correct solution, as it is very often the case that drivers declare their own private structures which gets allocated during the .probe() callback, and become the heart of the driver. When drivers do that it makes much more sense to just pass the private structure to the notify handler instead of forcing user to dance with the platform_device or acpi_device. drivers/acpi/ac.c | 6 +++--- drivers/acpi/acpi_video.c | 6 +++--- drivers/acpi/battery.c| 6 +++--- drivers/acpi/bus.c| 14 ++ drivers/acpi/hed.c| 6 +++--- drivers/acpi/nfit/core.c | 6 +++--- drivers/acpi/thermal.c| 6 +++--- include/acpi/acpi_bus.h | 9 - 8 files changed, 28 insertions(+), 31 deletions(-) diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c index 225dc6818751..0b245f9f7ec8 100644 --- a/drivers/acpi/ac.c +++ b/drivers/acpi/ac.c @@ -256,8 +256,8 @@ static int acpi_ac_add(struct acpi_device *device) ac->battery_nb.notifier_call = acpi_ac_battery_notify; register_acpi_notifier(>battery_nb); - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, -acpi_ac_notify); + result = acpi_dev_install_notify_handler(device->handle, ACPI_ALL_NOTIFY, +acpi_ac_notify, device); if (result) goto err_unregister; @@ -306,7 +306,7 @@ static void acpi_ac_remove(struct acpi_device *device) ac = acpi_driver_data(device); - acpi_dev_remove_notify_handler(device, ACPI_ALL_NOTIFY, + acpi_dev_remove_notify_handler(device->handle, ACPI_ALL_NOTIFY, acpi_ac_notify); power_supply_unregister(ac->charger); unregister_acpi_notifier(>battery_nb); diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c index 948e31f7ce6e..025c17890127 100644 --- a/drivers/acpi/acpi_video.c +++ b/drivers/acpi/acpi_video.c @@ -2059,8 +2059,8 @@ static int acpi_video_bus_add(struct acpi_device *device) acpi_video_bus_add_notify_handler(video); - error = acpi_dev_install_notify_handler(device, ACPI_DEVICE_NOTIFY, - acpi_video_bus_notify); + error = acpi_dev_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY, + acpi_video_bus_notify, device); if (error) goto err_remove; @@ -2092,7 +2092,7 @@ static void acpi_video_bus_remove(struct acpi_device *device) video = acpi_driver_data(device); - acpi_dev_remove_notify_handler(device, ACPI_DEVICE_NOTIFY, + acpi_dev_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY, acpi_video_bus_notify); mutex_lock(_list_lock); diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c index 969bf81e8d54..45dae32a8646 100644 --- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -1213,8 +1213,8 @@ static int acpi_battery_add(struct acpi_device *device) device_init_wakeup(>dev, 1); - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, -acpi_battery_notify); + result = acpi_dev_install_notify_handler(device->handle, ACPI_ALL_NOTIFY, +acpi_battery_notify, device);
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
On Thu, Oct 5, 2023 at 10:10 AM Wilczynski, Michal wrote: > > Hi, > > Thanks for your review ! > > On 10/4/2023 9:09 PM, Rafael J. Wysocki wrote: > > On Mon, Sep 25, 2023 at 6:31 PM Michal Wilczynski > > wrote: > >> acpi_dev_install_notify_handler() and acpi_dev_remove_notify_handler() > >> are wrappers around ACPICA installers. They are meant to save some > >> duplicated code from drivers. However as we're moving towards drivers > >> operating on platform_device they become a bit inconvenient to use as > >> inside the driver code we mostly want to use driver data of platform > >> device instead of ACPI device. > > That's fair enough, but -> > > > >> Make notify handlers installer wrappers more generic, while still > >> saving some code that would be duplicated otherwise. > >> > >> Reviewed-by: Andy Shevchenko > >> Signed-off-by: Michal Wilczynski > >> --- > >> > >> Notes: > >> So one solution could be to just replace acpi_device with > >> platform_device as an argument in those functions. However I don't > >> believe this is a correct solution, as it is very often the case that > >> drivers declare their own private structures which gets allocated > >> during > >> the .probe() callback, and become the heart of the driver. When drivers > >> do that it makes much more sense to just pass the private structure > >> to the notify handler instead of forcing user to dance with the > >> platform_device or acpi_device. > >> > >> drivers/acpi/ac.c | 6 +++--- > >> drivers/acpi/acpi_video.c | 6 +++--- > >> drivers/acpi/battery.c| 6 +++--- > >> drivers/acpi/bus.c| 14 ++ > >> drivers/acpi/hed.c| 6 +++--- > >> drivers/acpi/nfit/core.c | 6 +++--- > >> drivers/acpi/thermal.c| 6 +++--- > >> include/acpi/acpi_bus.h | 9 - > >> 8 files changed, 28 insertions(+), 31 deletions(-) > >> > >> diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c > >> index 225dc6818751..0b245f9f7ec8 100644 > >> --- a/drivers/acpi/ac.c > >> +++ b/drivers/acpi/ac.c > >> @@ -256,8 +256,8 @@ static int acpi_ac_add(struct acpi_device *device) > >> ac->battery_nb.notifier_call = acpi_ac_battery_notify; > >> register_acpi_notifier(>battery_nb); > >> > >> - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, > >> -acpi_ac_notify); > >> + result = acpi_dev_install_notify_handler(device->handle, > >> ACPI_ALL_NOTIFY, > >> +acpi_ac_notify, device); > >> if (result) > >> goto err_unregister; > >> > >> @@ -306,7 +306,7 @@ static void acpi_ac_remove(struct acpi_device *device) > >> > >> ac = acpi_driver_data(device); > >> > >> - acpi_dev_remove_notify_handler(device, ACPI_ALL_NOTIFY, > >> + acpi_dev_remove_notify_handler(device->handle, ACPI_ALL_NOTIFY, > >>acpi_ac_notify); > >> power_supply_unregister(ac->charger); > >> unregister_acpi_notifier(>battery_nb); > >> diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c > >> index 948e31f7ce6e..025c17890127 100644 > >> --- a/drivers/acpi/acpi_video.c > >> +++ b/drivers/acpi/acpi_video.c > >> @@ -2059,8 +2059,8 @@ static int acpi_video_bus_add(struct acpi_device > >> *device) > >> > >> acpi_video_bus_add_notify_handler(video); > >> > >> - error = acpi_dev_install_notify_handler(device, ACPI_DEVICE_NOTIFY, > >> - acpi_video_bus_notify); > >> + error = acpi_dev_install_notify_handler(device->handle, > >> ACPI_DEVICE_NOTIFY, > >> + acpi_video_bus_notify, > >> device); > >> if (error) > >> goto err_remove; > >> > >> @@ -2092,7 +2092,7 @@ static void acpi_video_bus_remove(struct acpi_device > >> *device) > >> > >> video = acpi_driver_data(device); > >> > >> - acpi_dev_remove_notify_handler(device, ACPI_DEVICE_NOTIFY, > >> + acpi_dev_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY, > >>acpi_video_bus_notify); > >> > >> mutex_lock(_list_lock); > >> diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c > >> index 969bf81e8d54..45dae32a8646 100644 > >> --- a/drivers/acpi/battery.c > >> +++ b/drivers/acpi/battery.c > >> @@ -1213,8 +1213,8 @@ static int acpi_battery_add(struct acpi_device > >> *device) > >> > >> device_init_wakeup(>dev, 1); > >> > >> - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, > >> -acpi_battery_notify); > >> + result = acpi_dev_install_notify_handler(device->handle, > >> ACPI_ALL_NOTIFY, > >> +acpi_battery_notify, > >> device); > >> if (result) > >> goto fail_pm; > >> > >>
Re: [PATCH v1 1/9] ACPI: bus: Make notify wrappers more generic
Hi, Thanks for your review ! On 10/4/2023 9:09 PM, Rafael J. Wysocki wrote: > On Mon, Sep 25, 2023 at 6:31 PM Michal Wilczynski > wrote: >> acpi_dev_install_notify_handler() and acpi_dev_remove_notify_handler() >> are wrappers around ACPICA installers. They are meant to save some >> duplicated code from drivers. However as we're moving towards drivers >> operating on platform_device they become a bit inconvenient to use as >> inside the driver code we mostly want to use driver data of platform >> device instead of ACPI device. > That's fair enough, but -> > >> Make notify handlers installer wrappers more generic, while still >> saving some code that would be duplicated otherwise. >> >> Reviewed-by: Andy Shevchenko >> Signed-off-by: Michal Wilczynski >> --- >> >> Notes: >> So one solution could be to just replace acpi_device with >> platform_device as an argument in those functions. However I don't >> believe this is a correct solution, as it is very often the case that >> drivers declare their own private structures which gets allocated during >> the .probe() callback, and become the heart of the driver. When drivers >> do that it makes much more sense to just pass the private structure >> to the notify handler instead of forcing user to dance with the >> platform_device or acpi_device. >> >> drivers/acpi/ac.c | 6 +++--- >> drivers/acpi/acpi_video.c | 6 +++--- >> drivers/acpi/battery.c| 6 +++--- >> drivers/acpi/bus.c| 14 ++ >> drivers/acpi/hed.c| 6 +++--- >> drivers/acpi/nfit/core.c | 6 +++--- >> drivers/acpi/thermal.c| 6 +++--- >> include/acpi/acpi_bus.h | 9 - >> 8 files changed, 28 insertions(+), 31 deletions(-) >> >> diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c >> index 225dc6818751..0b245f9f7ec8 100644 >> --- a/drivers/acpi/ac.c >> +++ b/drivers/acpi/ac.c >> @@ -256,8 +256,8 @@ static int acpi_ac_add(struct acpi_device *device) >> ac->battery_nb.notifier_call = acpi_ac_battery_notify; >> register_acpi_notifier(>battery_nb); >> >> - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, >> -acpi_ac_notify); >> + result = acpi_dev_install_notify_handler(device->handle, >> ACPI_ALL_NOTIFY, >> +acpi_ac_notify, device); >> if (result) >> goto err_unregister; >> >> @@ -306,7 +306,7 @@ static void acpi_ac_remove(struct acpi_device *device) >> >> ac = acpi_driver_data(device); >> >> - acpi_dev_remove_notify_handler(device, ACPI_ALL_NOTIFY, >> + acpi_dev_remove_notify_handler(device->handle, ACPI_ALL_NOTIFY, >>acpi_ac_notify); >> power_supply_unregister(ac->charger); >> unregister_acpi_notifier(>battery_nb); >> diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c >> index 948e31f7ce6e..025c17890127 100644 >> --- a/drivers/acpi/acpi_video.c >> +++ b/drivers/acpi/acpi_video.c >> @@ -2059,8 +2059,8 @@ static int acpi_video_bus_add(struct acpi_device >> *device) >> >> acpi_video_bus_add_notify_handler(video); >> >> - error = acpi_dev_install_notify_handler(device, ACPI_DEVICE_NOTIFY, >> - acpi_video_bus_notify); >> + error = acpi_dev_install_notify_handler(device->handle, >> ACPI_DEVICE_NOTIFY, >> + acpi_video_bus_notify, >> device); >> if (error) >> goto err_remove; >> >> @@ -2092,7 +2092,7 @@ static void acpi_video_bus_remove(struct acpi_device >> *device) >> >> video = acpi_driver_data(device); >> >> - acpi_dev_remove_notify_handler(device, ACPI_DEVICE_NOTIFY, >> + acpi_dev_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY, >>acpi_video_bus_notify); >> >> mutex_lock(_list_lock); >> diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c >> index 969bf81e8d54..45dae32a8646 100644 >> --- a/drivers/acpi/battery.c >> +++ b/drivers/acpi/battery.c >> @@ -1213,8 +1213,8 @@ static int acpi_battery_add(struct acpi_device *device) >> >> device_init_wakeup(>dev, 1); >> >> - result = acpi_dev_install_notify_handler(device, ACPI_ALL_NOTIFY, >> -acpi_battery_notify); >> + result = acpi_dev_install_notify_handler(device->handle, >> ACPI_ALL_NOTIFY, >> +acpi_battery_notify, >> device); >> if (result) >> goto fail_pm; >> >> @@ -1241,7 +1241,7 @@ static void acpi_battery_remove(struct acpi_device >> *device) >> >> battery = acpi_driver_data(device); >> >> - acpi_dev_remove_notify_handler(device, ACPI_ALL_NOTIFY, >> + acpi_dev_remove_notify_handler(device->handle, ACPI_ALL_NOTIFY, >>