Re: [PATCH v1 09/10] powerpc/pseries/iommu: Make use of DDW even if it does not map the partition
On 18/08/2020 09:40, Leonardo Bras wrote: > As of today, if the biggest DDW that can be created can't map the whole > partition, it's creation is skipped and the default DMA window > "ibm,dma-window" is used instead. > > DDW is 16x bigger than the default DMA window, 16x only under very specific circumstances which are 1. phyp 2. sriov 3. device class in hmc (or what that priority number is in the lpar config). > having the same amount of > pages, but increasing the page size to 64k. > Besides larger DMA window, "Besides being larger"? > it performs better for allocations over 4k, Better how? > so it would be nice to use it instead. I'd rather say something like: === So far we assumed we can map the guest RAM 1:1 to the bus which worked with a small number of devices. SRIOV changes it as the user can configure hundreds VFs and since phyp preallocates TCEs and does not allow IOMMU pages bigger than 64K, it has to limit the number of TCEs per a PE to limit waste of physical pages. === > > The DDW created will be used for direct mapping by default. > If it's not available, indirect mapping will be used instead. > > For indirect mapping, it's necessary to update the iommu_table so > iommu_alloc() can use the DDW created. For this, > iommu_table_update_window() is called when everything else succeeds > at enable_ddw(). > > Removing the default DMA window for using DDW with indirect mapping > is only allowed if there is no current IOMMU memory allocated in > the iommu_table. enable_ddw() is aborted otherwise. > > As there will never have both direct and indirect mappings at the same > time, the same property name can be used for the created DDW. > > So renaming > define DIRECT64_PROPNAME "linux,direct64-ddr-window-info" > to > define DMA64_PROPNAME "linux,dma64-ddr-window-info" > looks the right thing to do. I know I suggested this but this does not look so good anymore as I suspect it breaks kexec (from older kernel to this one) so you either need to check for both DT names or just keep the old one. Changing the macro name is fine. > > To make sure the property differentiates both cases, a new u32 for flags > was added at the end of the property, where BIT(0) set means direct > mapping. > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 108 +++-- > 1 file changed, 84 insertions(+), 24 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index 3a1ef02ad9d5..9544e3c91ced 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -350,8 +350,11 @@ struct dynamic_dma_window_prop { > __be64 dma_base; /* address hi,lo */ > __be32 tce_shift; /* ilog2(tce_page_size) */ > __be32 window_shift; /* ilog2(tce_window_size) */ > + __be32 flags; /* DDW properties, see bellow */ > }; > > +#define DDW_FLAGS_DIRECT 0x01 This is set if ((1<= ddw_memory_hotplug_max()), you could simply check window_shift and drop the flags. > + > struct direct_window { > struct device_node *device; > const struct dynamic_dma_window_prop *prop; > @@ -377,7 +380,7 @@ static LIST_HEAD(direct_window_list); > static DEFINE_SPINLOCK(direct_window_list_lock); > /* protects initializing window twice for same device */ > static DEFINE_MUTEX(direct_window_init_mutex); > -#define DIRECT64_PROPNAME "linux,direct64-ddr-window-info" > +#define DMA64_PROPNAME "linux,dma64-ddr-window-info" > > static int tce_clearrange_multi_pSeriesLP(unsigned long start_pfn, > unsigned long num_pfn, const void *arg) > @@ -836,7 +839,7 @@ static void remove_ddw(struct device_node *np, bool > remove_prop) > if (ret) > return; > > - win = of_find_property(np, DIRECT64_PROPNAME, NULL); > + win = of_find_property(np, DMA64_PROPNAME, NULL); > if (!win) > return; > > @@ -852,7 +855,7 @@ static void remove_ddw(struct device_node *np, bool > remove_prop) > np, ret); > } > > -static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr) > +static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, bool > *direct_mapping) > { > struct direct_window *window; > const struct dynamic_dma_window_prop *direct64; > @@ -864,6 +867,7 @@ static bool find_existing_ddw(struct device_node *pdn, > u64 *dma_addr) > if (window->device == pdn) { > direct64 = window->prop; > *dma_addr = be64_to_cpu(direct64->dma_base); > + *direct_mapping = be32_to_cpu(direct64->flags) & > DDW_FLAGS_DIRECT; > found = true; > break; > } > @@ -901,8 +905,8 @@ static int find_existing_ddw_windows(void) > if (!firmware_has_feature(FW_FEATURE_LPAR)) > return 0; > >
Re: kernel since 5.6 do not boot anymore on Apple PowerBook
Hello Giuseppe, Le 22/08/2020 à 10:28, Giuseppe Sacco a écrit : Hello Christophe, Il giorno ven, 21/08/2020 alle 16.03 +0200, Christophe Leroy ha scritto: [...] You also said in a previous mail that your original issue also happens when CONFIG_VMAP_STACK is not selected. The above bug being linked to CONFIG_VMAP_STACK, maybe it would be easier to bisect with CONFIG_VMAP_STACK unselected. I was wrong. Disabling CONFIG_VMAP_STACK led me to all "good" compile and bisect ended without finding the culprit commit. So, I started from scratch: I rebuilt HEAD and found that it does show the original problem I am facing, then I rebuilt it without CONFIG_VMAP_STACK and found that it does pass (fix?) the problem, since kernel continue booting, but then it stops with three Oops related to command systemd-udevd. You may find a video that displays the complete boot, vmlinux, config, and system.map files here: The Oopses in the video are fixed in 5.9-rc2, see my response to your other mail. So now we know that your kernel doesn't boot when CONFIG_VMAP_STACK is set. Can you remind the exact problem ? One common problem with CONFIG_VMAP_STACK is when some drivers are invalidly using buffers in stack for DMA. Couldn't try with CONFIG_DEBUG_VIRTUAL (without CONFIG_VMAP_STACK) and see if it triggers some warnings ? Thanks Christophe
Re: Oops decoding help request
Hi, Le 23/08/2020 à 19:26, Giuseppe Sacco a écrit : Hello, I am not a kernel developer and I need much help in order to understand a kernel Oops (the first of a series of three Oops). It is: This is a bug in the kernel. It is fixed in 5.9-rc2. See the following commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/mm/book3s32/mmu.c?h=v5.9-rc2=541cebb51f3422d4f2c6cb95c1e5cc3dcc9e5021 Christophe kernel tried to execute exec-protected page (f102) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xf102 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash PowerMac Modules linked in: crct10dif_generic (+) crct10dif_common drm_panel_orientation_quirks CPU: 0 PID: 71 Comm: systemd-udevd Not tainted 5.9.0-rc1+ #298 NIP: f102 LR: c00053a4 CTR: f102 REGS: c1c6dd50 TRAP: 0400 Not tainted (5.9.0-rc1+) MSR: 10009032 CR: 2284 XER: GPR00: c0005390 c1c6de08 c1c6b400 0cc0 0008 ef6db038 0001 GPR08: 002e 2284 00b6fb58 0005 GPR16: bff0d768 bff0d770 01032cc0 00b0b31f 01020960 GPR24: 00b70954 010206c0 ef39f4a0 00a28380 f102 f10193a0 NIP [f102] crct10dif_mod_init+0x0/0x60 [crct10dif_generic] LR [c00053a4] do_one_initcall+0x50/0x1f4 Call Trace: [c1c6de08] [c0005390] do_one_initcall+0x3c/0x1f4 (unreliable) [c1c6de78] [c0102068] do_init_module+0x6c/0x27c [c1c6dea8] [c01053cc] sys_finit_module+0xc0/0x12c [c1c6df38] [c001c11c] ret_from_syscall+0x0/0x34 --- interrupt: c01 at 0x7a7780 LR = 0xa1bf64 Instruction dump: <7c0802a6> 90010004 6000 9421fff0 ---[ end trace 257a4bbda691894e ]--- From what I understand, this is a problem in the init function of module crct10dif_generic jumping at address f102. I think I understand that f102 is an address for data and not for code. In fact it belongs to "vmalloc & ioremap" area of the virtual memory layout: * 0xffbee000..0xf000 : fixmap * 0xff40..0xff80 : highmem PTEs * 0xfda27000..0xff40 : early ioremap * 0xf100..0xfda27000 : vmalloc & ioremap The init function is: : 0: 7c 08 02 a6 mflrr0 4: 90 01 00 04 stw r0,4(r1) 8: 48 00 00 01 bl 8 c: 94 21 ff f0 stwur1,-16(r1) 10: 7c 08 02 a6 mflrr0 14: 3c 60 00 00 lis r3,0 18: 90 01 00 14 stw r0,20(r1) 1c: 38 63 00 00 addir3,r3,0 20: 80 01 00 14 lwz r0,20(r1) 24: 38 21 00 10 addir1,r1,16 28: 7c 08 03 a6 mtlrr0 2c: 48 00 00 00 b 2c and its source code is: static int __init crct10dif_mod_init(void) { return crypto_register_shash(); } This is what I am not understanding. The error message seems to imply that code jumps to an invalid address, so the problem would be that address of function crypto_register_shash is calculated wrongly. About stack addresses, please note that CONFIG_VMAP_STACK is not set. Is this a correct reasoning? Thank you very much, Giuseppe
Re: [PATCH v1 08/10] powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
On 18/08/2020 09:40, Leonardo Bras wrote: > Code used to create a ddw property that was previously scattered in > enable_ddw() is now gathered in ddw_property_create(), which deals with > allocation and filling the property, letting it ready for > of_property_add(), which now occurs in sequence. > > This created an opportunity to reorganize the second part of enable_ddw(): > > Without this patch enable_ddw() does, in order: > kzalloc() property & members, create_ddw(), fill ddwprop inside property, > ddw_list_add(), do tce_setrange_multi_pSeriesLP_walk in all memory, > of_add_property(). > > With this patch enable_ddw() does, in order: > create_ddw(), ddw_property_create(), of_add_property(), ddw_list_add(), > do tce_setrange_multi_pSeriesLP_walk in all memory. > > This change requires of_remove_property() in case anything fails after > of_add_property(), but we get to do tce_setrange_multi_pSeriesLP_walk > in all memory, which looks the most expensive operation, only if > everything else succeeds. > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 97 +++--- > 1 file changed, 57 insertions(+), 40 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index 4031127c9537..3a1ef02ad9d5 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -1123,6 +1123,31 @@ static void reset_dma_window(struct pci_dev *dev, > struct device_node *par_dn) >ret); > } > > +static int ddw_property_create(struct property **ddw_win, const char > *propname, @propname is always the same, do you really want to pass it every time? > +u32 liobn, u64 dma_addr, u32 page_shift, u32 > window_shift) > +{ > + struct dynamic_dma_window_prop *ddwprop; > + struct property *win64; > + > + *ddw_win = win64 = kzalloc(sizeof(*win64), GFP_KERNEL); > + if (!win64) > + return -ENOMEM; > + > + win64->name = kstrdup(propname, GFP_KERNEL); Not clear why "win64->name = DIRECT64_PROPNAME" would not work here, the generic OF code does not try kfree() it but it is probably out of scope here. > + ddwprop = kzalloc(sizeof(*ddwprop), GFP_KERNEL); > + win64->value = ddwprop; > + win64->length = sizeof(*ddwprop); > + if (!win64->name || !win64->value) > + return -ENOMEM; Up to 2 memory leaks here. I see the cleanup at "out_free_prop:" but still looks fragile. Instead you could simply return win64 as the only error possible here is -ENOMEM and returning NULL is equally good. > + > + ddwprop->liobn = cpu_to_be32(liobn); > + ddwprop->dma_base = cpu_to_be64(dma_addr); > + ddwprop->tce_shift = cpu_to_be32(page_shift); > + ddwprop->window_shift = cpu_to_be32(window_shift); > + > + return 0; > +} > + > /* > * If the PE supports dynamic dma windows, and there is space for a table > * that can map all pages in a linear offset, then setup such a table, > @@ -1140,12 +1165,11 @@ static bool enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > struct ddw_query_response query; > struct ddw_create_response create; > int page_shift; > - u64 max_addr; > + u64 max_addr, win_addr; > struct device_node *dn; > u32 ddw_avail[DDW_APPLICABLE_SIZE]; > struct direct_window *window; > - struct property *win64; > - struct dynamic_dma_window_prop *ddwprop; > + struct property *win64 = NULL; > struct failed_ddw_pdn *fpdn; > bool default_win_removed = false; > > @@ -1244,38 +1268,34 @@ static bool enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > goto out_failed; > } > len = order_base_2(max_addr); > - win64 = kzalloc(sizeof(struct property), GFP_KERNEL); > - if (!win64) { > - dev_info(>dev, > - "couldn't allocate property for 64bit dma window\n"); > + > + ret = create_ddw(dev, ddw_avail, , page_shift, len); > + if (ret != 0) It is usually just "if (ret)" > goto out_failed; > - } > - win64->name = kstrdup(DIRECT64_PROPNAME, GFP_KERNEL); > - win64->value = ddwprop = kmalloc(sizeof(*ddwprop), GFP_KERNEL); > - win64->length = sizeof(*ddwprop); > - if (!win64->name || !win64->value) { > + > + dev_dbg(>dev, "created tce table LIOBN 0x%x for %pOF\n", > + create.liobn, dn); > + > + win_addr = ((u64)create.addr_hi << 32) | create.addr_lo; > + ret = ddw_property_create(, DIRECT64_PROPNAME, create.liobn, > win_addr, > + page_shift, len); > + if (ret) { > dev_info(>dev, > - "couldn't allocate property name and value\n"); > + "couldn't allocate property, property name, or > value\n"); > goto out_free_prop; > } > > - ret = create_ddw(dev, ddw_avail,
Re: [PATCH v1 06/10] powerpc/pseries/iommu: Add ddw_list_add() helper
On 18/08/2020 09:40, Leonardo Bras wrote: > There are two functions adding DDW to the direct_window_list in a > similar way, so create a ddw_list_add() to avoid duplicity and > simplify those functions. > > Also, on enable_ddw(), add list_del() on out_free_window to allow > removing the window from list if any error occurs. > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 42 -- > 1 file changed, 26 insertions(+), 16 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index 39617ce0ec83..fcdefcc0f365 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -872,6 +872,24 @@ static u64 find_existing_ddw(struct device_node *pdn) > return dma_addr; > } > > +static struct direct_window *ddw_list_add(struct device_node *pdn, > + const struct dynamic_dma_window_prop > *dma64) > +{ > + struct direct_window *window; > + > + window = kzalloc(sizeof(*window), GFP_KERNEL); > + if (!window) > + return NULL; > + > + window->device = pdn; > + window->prop = dma64; > + spin_lock(_window_list_lock); > + list_add(>list, _window_list); > + spin_unlock(_window_list_lock); > + > + return window; > +} > + > static int find_existing_ddw_windows(void) > { > int len; > @@ -887,18 +905,11 @@ static int find_existing_ddw_windows(void) > if (!direct64) > continue; > > - window = kzalloc(sizeof(*window), GFP_KERNEL); > - if (!window || len < sizeof(struct dynamic_dma_window_prop)) { > + window = ddw_list_add(pdn, direct64); > + if (!window || len < sizeof(*direct64)) { Since you are touching this code, it looks like the "len < sizeof(*direct64)" part should go above to "if (!direct64)". > kfree(window); > remove_ddw(pdn, true); > - continue; > } > - > - window->device = pdn; > - window->prop = direct64; > - spin_lock(_window_list_lock); > - list_add(>list, _window_list); > - spin_unlock(_window_list_lock); > } > > return 0; > @@ -1261,7 +1272,8 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > dev_dbg(>dev, "created tce table LIOBN 0x%x for %pOF\n", > create.liobn, dn); > > - window = kzalloc(sizeof(*window), GFP_KERNEL); > + /* Add new window to existing DDW list */ The comment seems to duplicate what the ddw_list_add name already suggests. > + window = ddw_list_add(pdn, ddwprop); > if (!window) > goto out_clear_window; > > @@ -1280,16 +1292,14 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > goto out_free_window; > } > > - window->device = pdn; > - window->prop = ddwprop; > - spin_lock(_window_list_lock); > - list_add(>list, _window_list); > - spin_unlock(_window_list_lock); I'd leave these 3 lines here and in find_existing_ddw_windows() (which would make ddw_list_add -> ddw_prop_alloc). In general you want to have less stuff to do on the failure path. kmalloc may fail and needs kfree but you can safely delay list_add (which cannot fail) and avoid having the lock help twice in the same function (one of them is hidden inside ddw_list_add). Not sure if this change is really needed after all. Thanks, > - > dma_addr = be64_to_cpu(ddwprop->dma_base); > goto out_unlock; > > out_free_window: > + spin_lock(_window_list_lock); > + list_del(>list); > + spin_unlock(_window_list_lock); > + > kfree(window); > > out_clear_window: > -- Alexey
Re: [PATCH v1 07/10] powerpc/pseries/iommu: Allow DDW windows starting at 0x00
On 18/08/2020 09:40, Leonardo Bras wrote: > enable_ddw() currently returns the address of the DMA window, which is > considered invalid if has the value 0x00. > > Also, it only considers valid an address returned from find_existing_ddw > if it's not 0x00. > > Changing this behavior makes sense, given the users of enable_ddw() only > need to know if direct mapping is possible. It can also allow a DMA window > starting at 0x00 to be used. > > This will be helpful for using a DDW with indirect mapping, as the window > address will be different than 0x00, but it will not map the whole > partition. > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 30 -- > 1 file changed, 14 insertions(+), 16 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index fcdefcc0f365..4031127c9537 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -852,24 +852,25 @@ static void remove_ddw(struct device_node *np, bool > remove_prop) > np, ret); > } > > -static u64 find_existing_ddw(struct device_node *pdn) > +static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr) > { > struct direct_window *window; > const struct dynamic_dma_window_prop *direct64; > - u64 dma_addr = 0; > + bool found = false; > > spin_lock(_window_list_lock); > /* check if we already created a window and dupe that config if so */ > list_for_each_entry(window, _window_list, list) { > if (window->device == pdn) { > direct64 = window->prop; > - dma_addr = be64_to_cpu(direct64->dma_base); > + *dma_addr = be64_to_cpu(direct64->dma_base); > + found = true; > break; > } > } > spin_unlock(_window_list_lock); > > - return dma_addr; > + return found; > } > > static struct direct_window *ddw_list_add(struct device_node *pdn, > @@ -1131,15 +1132,15 @@ static void reset_dma_window(struct pci_dev *dev, > struct device_node *par_dn) > * pdn: the parent pe node with the ibm,dma_window property > * Future: also check if we can remap the base window for our base page size > * > - * returns the dma offset for use by the direct mapped DMA code. > + * returns true if can map all pages (direct mapping), false otherwise.. > */ > -static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn) > +static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) > { > int len, ret; > struct ddw_query_response query; > struct ddw_create_response create; > int page_shift; > - u64 dma_addr, max_addr; > + u64 max_addr; > struct device_node *dn; > u32 ddw_avail[DDW_APPLICABLE_SIZE]; > struct direct_window *window; > @@ -1150,8 +1151,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > > mutex_lock(_window_init_mutex); > > - dma_addr = find_existing_ddw(pdn); > - if (dma_addr != 0) > + if (find_existing_ddw(pdn, >dev.archdata.dma_offset)) > goto out_unlock; > > /* > @@ -1292,7 +1292,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > goto out_free_window; > } > > - dma_addr = be64_to_cpu(ddwprop->dma_base); > + dev->dev.archdata.dma_offset = be64_to_cpu(ddwprop->dma_base); Do not you need the same chunk in the find_existing_ddw() case above as well? Thanks, > goto out_unlock; > > out_free_window: > @@ -1309,6 +1309,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > kfree(win64->name); > kfree(win64->value); > kfree(win64); > + win64 = NULL; > > out_failed: > if (default_win_removed) > @@ -1322,7 +1323,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct > device_node *pdn) > > out_unlock: > mutex_unlock(_window_init_mutex); > - return dma_addr; > + return win64; > } > > static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev) > @@ -1401,11 +1402,8 @@ static bool iommu_bypass_supported_pSeriesLP(struct > pci_dev *pdev, u64 dma_mask) > break; > } > > - if (pdn && PCI_DN(pdn)) { > - pdev->dev.archdata.dma_offset = enable_ddw(pdev, pdn); > - if (pdev->dev.archdata.dma_offset) > - return true; > - } > + if (pdn && PCI_DN(pdn)) > + return enable_ddw(pdev, pdn); > > return false; > } > -- Alexey
Re: [PATCH] powerpc/prom_init: Check display props exist before enabling btext
Alexey Kardashevskiy writes: > On 21/08/2020 20:34, Michael Ellerman wrote: >> It's possible to enable CONFIG_PPC_EARLY_DEBUG_BOOTX for a pseries >> kernel (maybe it shouldn't be), which is then booted with qemu/slof. > > > CONFIG_BOOTX_TEXT=y > CONFIG_PPC_EARLY_DEBUG=y > CONFIG_PPC_EARLY_DEBUG_BOOTX=y > > this does not crash my VM. The changed chunk is sitting under "if > (prom_getprop(node, "linux,boot-display", NULL, 0)" and I cannot find > what creates this property - it is neither slof/grub/qemu, unlikely that > it is phyp so it must be this one: > > arch/powerpc/platforms/powermac/bootx_init.c|244| > bootx_dt_add_string("linux,boot-display", mem_end); It's in prom_init.c: static void __init prom_init_stdout(void) { ... stdout_node = call_prom("instance-to-package", 1, 1, prom.stdout); if (stdout_node != PROM_ERROR) { val = cpu_to_be32(stdout_node); /* If it's a display, note it */ memset(type, 0, sizeof(type)); prom_getprop(stdout_node, "device_type", type, sizeof(type)); if (prom_strcmp(type, "display") == 0) prom_setprop(stdout_node, path, "linux,boot-display", NULL, 0); } } > which is powermac and not pseries. Or may be that pmac firmware. > > Where did you see this crash? Qemu pseries either TCG or KVM with eg: $ qemu-system-ppc64 -M pseries -cpu POWER8 -m 1G -kernel build~/vmlinux >> But if you do that the kernel crashes in draw_byte(), with a DAR >> pointing somewhere near INT_MAX. >> >> Adding some debug to prom_init we see that we're not able to read the >> "address" property from OF, so we're just using whatever junk value >> was on the stack. >> >> So check the properties can be read properly from OF, if not we bail >> out before initialising btext, which avoids the crash. > > This is a right thing any way, just the commit log is confusing. > > Reviewed-by: Alexey Kardashevskiy Thanks. cheers
[powerpc:merge] BUILD SUCCESS 3829b105d464789815a333dd34ceac63cfe9bd22
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge branch HEAD: 3829b105d464789815a333dd34ceac63cfe9bd22 Automatic merge of 'master', 'next' and 'fixes' (2020-08-23 21:44) elapsed time: 846m configs tested: 107 configs skipped: 8 The following configs have been built successfully. More configs may be tested in the coming days. arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig m68k m5475evb_defconfig m68k bvme6000_defconfig m68k sun3_defconfig powerpc chrp32_defconfig armqcom_defconfig arm mv78xx0_defconfig c6xevmc6457_defconfig powerpc pasemi_defconfig armhisi_defconfig armspear3xx_defconfig nds32alldefconfig powerpc powernv_defconfig shapsh4ad0a_defconfig arm colibri_pxa270_defconfig mips sb1250_swarm_defconfig arm shannon_defconfig m68kq40_defconfig ia64 bigsur_defconfig powerpc pmac32_defconfig powerpc skiroot_defconfig archsdk_defconfig csky alldefconfig riscvnommu_virt_defconfig armneponset_defconfig h8300allyesconfig sh polaris_defconfig sh urquell_defconfig powerpc defconfig m68k apollo_defconfig powerpc alldefconfig armpleb_defconfig armdove_defconfig powerpcamigaone_defconfig arm tango4_defconfig shshmin_defconfig arm tegra_defconfig armmps2_defconfig arm64alldefconfig arc nps_defconfig arm ep93xx_defconfig um x86_64_defconfig arm cns3420vb_defconfig armmulti_v5_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64 randconfig-a002-20200823 x86_64 randconfig-a003-20200823 x86_64 randconfig-a005-20200823 x86_64 randconfig-a001-20200823 x86_64 randconfig-a006-20200823 x86_64 randconfig-a004-20200823 i386 randconfig-a002-20200823 i386 randconfig-a004-20200823 i386 randconfig-a003-20200823 i386 randconfig-a005-20200823 i386 randconfig-a006-20200823 i386 randconfig-a001-20200823 i386 randconfig-a013-20200823 i386 randconfig-a012-20200823 i386 randconfig-a011-20200823 i386 randconfig-a016-20200823 i386 randconfig-a014-20200823 i386 randconfig-a015-20200823 riscvallyesconfig riscv allnoconfig riscv defconfig riscv
[powerpc:next-test] BUILD SUCCESS 02a4db8feb0e35a215b6b803bce0ab8a1fd32838
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test branch HEAD: 02a4db8feb0e35a215b6b803bce0ab8a1fd32838 powerpc/64: Remove unused generic_secondary_thread_init() elapsed time: 845m configs tested: 112 configs skipped: 8 The following configs have been built successfully. More configs may be tested in the coming days. arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig m68k m5475evb_defconfig m68k bvme6000_defconfig m68k sun3_defconfig powerpc chrp32_defconfig armqcom_defconfig arm mv78xx0_defconfig c6xevmc6457_defconfig powerpc pasemi_defconfig armhisi_defconfig armspear3xx_defconfig nds32alldefconfig powerpc powernv_defconfig shapsh4ad0a_defconfig arm colibri_pxa270_defconfig mips sb1250_swarm_defconfig arm shannon_defconfig m68kq40_defconfig ia64 bigsur_defconfig powerpc pmac32_defconfig powerpc skiroot_defconfig archsdk_defconfig csky alldefconfig riscvnommu_virt_defconfig armneponset_defconfig h8300allyesconfig sh polaris_defconfig sh urquell_defconfig powerpc defconfig m68k apollo_defconfig powerpc alldefconfig armpleb_defconfig armdove_defconfig mips malta_kvm_defconfig powerpc g5_defconfig x86_64 alldefconfig arm ep93xx_defconfig arm lpc32xx_defconfig sh shx3_defconfig powerpcamigaone_defconfig arm tango4_defconfig shshmin_defconfig arm tegra_defconfig armmps2_defconfig arm64alldefconfig arc nps_defconfig um x86_64_defconfig arm cns3420vb_defconfig armmulti_v5_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64 randconfig-a002-20200823 x86_64 randconfig-a003-20200823 x86_64 randconfig-a005-20200823 x86_64 randconfig-a001-20200823 x86_64 randconfig-a006-20200823 x86_64 randconfig-a004-20200823 i386 randconfig-a002-20200823 i386 randconfig-a004-20200823 i386 randconfig-a003-20200823 i386 randconfig-a005-20200823 i386 randconfig-a006-20200823 i386 randconfig-a001-20200823 i386 randconfig-a013-20200823 i386 randconfig-a012-20200823 i386 randconfig-a011-20200823 i386 randconfig-a016-20200823 i386 randconfig
Re: [PATCH v1 05/10] powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
On 18/08/2020 09:40, Leonardo Bras wrote: > Creates a helper to allow allocating a new iommu_table without the need > to reallocate the iommu_group. > > This will be helpful for replacing the iommu_table for the new DMA window, > after we remove the old one with iommu_tce_table_put(). > > Signed-off-by: Leonardo Bras > --- > arch/powerpc/platforms/pseries/iommu.c | 25 ++--- > 1 file changed, 14 insertions(+), 11 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/iommu.c > b/arch/powerpc/platforms/pseries/iommu.c > index 8fe23b7dff3a..39617ce0ec83 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -53,28 +53,31 @@ enum { > DDW_EXT_QUERY_OUT_SIZE = 2 > }; > > -static struct iommu_table_group *iommu_pseries_alloc_group(int node) > +static struct iommu_table *iommu_pseries_alloc_table(int node) > { > - struct iommu_table_group *table_group; > struct iommu_table *tbl; > > - table_group = kzalloc_node(sizeof(struct iommu_table_group), GFP_KERNEL, > -node); > - if (!table_group) > - return NULL; > - > tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL, node); > if (!tbl) > - goto free_group; > + return NULL; > > INIT_LIST_HEAD_RCU(>it_group_list); > kref_init(>it_kref); > + return tbl; > +} > > - table_group->tables[0] = tbl; > +static struct iommu_table_group *iommu_pseries_alloc_group(int node) > +{ > + struct iommu_table_group *table_group; > + > + table_group = kzalloc_node(sizeof(*table_group), GFP_KERNEL, node); I'd prefer you did not make unrelated changes (sizeof(struct iommu_table_group) -> sizeof(*table_group)) so the diff stays shorter and easier to follow. You changed sizeof(struct iommu_table_group) but not sizeof(struct iommu_table) and this confused me enough to spend more time than this straight forward change deserves. Not important in this case though so Reviewed-by: Alexey Kardashevskiy > + if (!table_group) > + return NULL; > > - return table_group; > + table_group->tables[0] = iommu_pseries_alloc_table(node); > + if (table_group->tables[0]) > + return table_group; > > -free_group: > kfree(table_group); > return NULL; > } > -- Alexey
[PATCH] Documentation/powerpc: fix malformed table in syscall64-abi
From: Randy Dunlap Fix malformed table warning in powerpc/syscall64-abi.rst by making two tables and moving the headings. Documentation/powerpc/syscall64-abi.rst:53: WARNING: Malformed table. Text in column margin in table line 2. === = --- For the sc instruction, differences with the ELF ABI --- r0 Volatile (System call number.) r3 Volatile (Parameter 1, and return value.) r4-r8 Volatile (Parameters 2-6.) cr0 Volatile (cr0.SO is the return error condition.) cr1, cr5-7 Nonvolatile lr Nonvolatile --- For the scv 0 instruction, differences with the ELF ABI --- r0 Volatile (System call number.) r3 Volatile (Parameter 1, and return value.) r4-r8 Volatile (Parameters 2-6.) === = Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Signed-off-by: Randy Dunlap Cc: Nicholas Piggin Cc: Michael Ellerman --- Documentation/powerpc/syscall64-abi.rst |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- lnx-59-rc2.orig/Documentation/powerpc/syscall64-abi.rst +++ lnx-59-rc2/Documentation/powerpc/syscall64-abi.rst @@ -49,16 +49,18 @@ Register preservation rules Register preservation rules match the ELF ABI calling sequence with the following differences: -=== = --- For the sc instruction, differences with the ELF ABI --- +=== = r0 Volatile (System call number.) r3 Volatile (Parameter 1, and return value.) r4-r8 Volatile (Parameters 2-6.) cr0 Volatile (cr0.SO is the return error condition.) cr1, cr5-7 Nonvolatile lr Nonvolatile +=== = --- For the scv 0 instruction, differences with the ELF ABI --- +=== = r0 Volatile (System call number.) r3 Volatile (Parameter 1, and return value.) r4-r8 Volatile (Parameters 2-6.)
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.9-3 tag
The pull request you sent on Sun, 23 Aug 2020 22:50:13 +1000: > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git > tags/powerpc-5.9-3 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/cb95712138ec5e480db5160b41172bbc6f6494cc Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Oops decoding help request
Hello, I am not a kernel developer and I need much help in order to understand a kernel Oops (the first of a series of three Oops). It is: kernel tried to execute exec-protected page (f102) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xf102 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash PowerMac Modules linked in: crct10dif_generic (+) crct10dif_common drm_panel_orientation_quirks CPU: 0 PID: 71 Comm: systemd-udevd Not tainted 5.9.0-rc1+ #298 NIP: f102 LR: c00053a4 CTR: f102 REGS: c1c6dd50 TRAP: 0400 Not tainted (5.9.0-rc1+) MSR: 10009032 CR: 2284 XER: GPR00: c0005390 c1c6de08 c1c6b400 0cc0 0008 ef6db038 0001 GPR08: 002e 2284 00b6fb58 0005 GPR16: bff0d768 bff0d770 01032cc0 00b0b31f 01020960 GPR24: 00b70954 010206c0 ef39f4a0 00a28380 f102 f10193a0 NIP [f102] crct10dif_mod_init+0x0/0x60 [crct10dif_generic] LR [c00053a4] do_one_initcall+0x50/0x1f4 Call Trace: [c1c6de08] [c0005390] do_one_initcall+0x3c/0x1f4 (unreliable) [c1c6de78] [c0102068] do_init_module+0x6c/0x27c [c1c6dea8] [c01053cc] sys_finit_module+0xc0/0x12c [c1c6df38] [c001c11c] ret_from_syscall+0x0/0x34 --- interrupt: c01 at 0x7a7780 LR = 0xa1bf64 Instruction dump: <7c0802a6> 90010004 6000 9421fff0 ---[ end trace 257a4bbda691894e ]--- >From what I understand, this is a problem in the init function of module crct10dif_generic jumping at address f102. I think I understand that f102 is an address for data and not for code. In fact it belongs to "vmalloc & ioremap" area of the virtual memory layout: * 0xffbee000..0xf000 : fixmap * 0xff40..0xff80 : highmem PTEs * 0xfda27000..0xff40 : early ioremap * 0xf100..0xfda27000 : vmalloc & ioremap The init function is: : 0: 7c 08 02 a6 mflrr0 4: 90 01 00 04 stw r0,4(r1) 8: 48 00 00 01 bl 8 c: 94 21 ff f0 stwur1,-16(r1) 10: 7c 08 02 a6 mflrr0 14: 3c 60 00 00 lis r3,0 18: 90 01 00 14 stw r0,20(r1) 1c: 38 63 00 00 addir3,r3,0 20: 80 01 00 14 lwz r0,20(r1) 24: 38 21 00 10 addir1,r1,16 28: 7c 08 03 a6 mtlrr0 2c: 48 00 00 00 b 2c and its source code is: static int __init crct10dif_mod_init(void) { return crypto_register_shash(); } This is what I am not understanding. The error message seems to imply that code jumps to an invalid address, so the problem would be that address of function crypto_register_shash is calculated wrongly. About stack addresses, please note that CONFIG_VMAP_STACK is not set. Is this a correct reasoning? Thank you very much, Giuseppe
[GIT PULL] Please pull powerpc/linux.git powerpc-5.9-3 tag
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hi Linus, Please pull some more powerpc fixes for 5.9. There's one non-fix, which is the perf extended regs support. That was posted way back but I waited for the tools/perf part to land, which it now has. cheers The following changes since commit 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5: Linux 5.9-rc1 (2020-08-16 13:04:57 -0700) are available in the git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.9-3 for you to fetch changes up to 64ef8f2c4791940d7f3945507b6a45c20d959260: powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver (2020-08-21 23:35:27 +1000) - -- powerpc fixes for 5.9 #3 Add perf support for emitting extended registers for power10. A fix for CPU hotplug on pseries, where on large/loaded systems we may not wait long enough for the CPU to be offlined, leading to crashes. Addition of a raw cputable entry for Power10, which is not required to boot, but is required to make our PMU setup work correctly in guests. Three fixes for the recent changes on 32-bit Book3S to move modules into their own segment for strict RWX. A fix for a recent change in our powernv PCI code that could lead to crashes. A change to our perf interrupt accounting to avoid soft lockups when using some events, found by syzkaller. A change in the way we handle power loss events from the hypervisor on pseries. We no longer immediately shut down if we're told we're running on a UPS. A few other minor fixes. Thanks to: Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz, Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth, Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann, Vaidyanathan Srinivasan, Vasant Hegde. - -- Aneesh Kumar K.V (1): powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled Anju T Sudhakar (1): powerpc/perf: Add support for outputting extended regs in perf intr_regs Athira Rajeev (2): powerpc/perf: Add extended regs support for power10 platform powerpc/perf: Fix soft lockups due to missed interrupt accounting Christophe Leroy (4): powerpc/fixmap: Fix the size of the early debug area powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf000 Frederic Barrat (1): powerpc/powernv/pci: Fix possible crash when releasing DMA resources Kajol Jain (1): powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver Madhavan Srinivasan (2): powerpc: Add POWER10 raw mode cputable entry powerpc/kernel: Cleanup machine check function declarations Michael Neuling (1): powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores Michael Roth (1): powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death Vasant Hegde (1): powerpc/pseries: Do not initiate shutdown when system is running on UPS Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 | 2 +- arch/powerpc/include/asm/cputable.h | 5 +++ arch/powerpc/include/asm/fixmap.h| 2 +- arch/powerpc/include/asm/kasan.h | 9 +++- arch/powerpc/include/asm/mce.h | 7 arch/powerpc/include/asm/perf_event.h| 3 ++ arch/powerpc/include/asm/perf_event_server.h | 5 +++ arch/powerpc/include/uapi/asm/perf_regs.h| 20 - arch/powerpc/kernel/cputable.c | 22 -- arch/powerpc/kernel/dt_cpu_ftrs.c| 4 -- arch/powerpc/kernel/setup-common.c | 1 + arch/powerpc/mm/book3s32/mmu.c | 9 +++- arch/powerpc/mm/book3s64/hash_utils.c| 4 +- arch/powerpc/perf/core-book3s.c | 5 +++ arch/powerpc/perf/hv-24x7.c | 11 - arch/powerpc/perf/perf_regs.c| 44 ++-- arch/powerpc/perf/power10-pmu.c | 6 +++ arch/powerpc/perf/power9-pmu.c | 6 +++ arch/powerpc/platforms/powernv/pci-ioda.c| 2 +- arch/powerpc/platforms/pseries/hotplug-cpu.c | 18 +--- arch/powerpc/platforms/pseries/ras.c | 1 - 21 files changed, 161 insertions(+), 25 deletions(-) -BEGIN PGP SIGNATURE-