Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
On Thu, 26 Apr 2018 19:56:58 +0200
Geert Uytterhoeven wrote:

> Hi Boris,
>
> On Thu, Apr 26, 2018 at 7:53 PM, Boris Brezillon wrote:
> > On Tue, 10 Apr 2018 15:26:20 +0200
> > Geert Uytterhoeven wrote:
> >> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
> >> > On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
> >> >> Currently add_mtd_device() failures are plainly ignored, which may lead
> >> >> to kernel crashes later.
> >>
> >> >> Fix this by ignoring and freeing partitions that failed to add in
> >> >> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
> >> >> so fix that as well.
> >> >>
> >> >> Signed-off-by: Geert Uytterhoeven
> >> >> ---
> >> >> I don't know if it is worthwhile factoring out the common handling.
> >> >>
> >> >> Should allocate_partition() fail instead? There's a comment saying
> >> >> "let's register it anyway to preserve ordering".
> >>
> >> >> --- a/drivers/mtd/mtdpart.c
> >> >> +++ b/drivers/mtd/mtdpart.c
> >>
> >> >> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
> >> >>               list_add(&slave->list, &mtd_partitions);
> >> >>               mutex_unlock(&mtd_partitions_mutex);
> >> >>
> >> >> -             add_mtd_device(&slave->mtd);
> >> >> +             ret = add_mtd_device(&slave->mtd);
> >> >> +             if (ret) {
> >> >> +                     mutex_lock(&mtd_partitions_mutex);
> >> >> +                     list_del(&slave->list);
> >> >> +                     mutex_unlock(&mtd_partitions_mutex);
> >> >> +                     free_partition(slave);
> >> >> +                     continue;
> >> >> +             }
> >> >
> >> > Why is the partition even in the list in the first place ? Can we avoid
> >> > adding it rather than adding and removing it ?
> >>
> >> Hence my question "Should allocate_partition() fail instead?".
> >
> > I'd prefer this option too. Can you prepare a new version doing that?
>
> OK, then I have another question ;-)
>
> Should this be a special failure, so all other valid partitions on the
> same FLASH are still added, or should it be fatal, so no partitions are
> added at all?

I guess we can go for the "drop the invalid partitions and print a
warning" approach.

Anyway, I'm sure people will notice really quickly when one of their
partitions is missing, so it's not a big deal IMO.
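The "drop the invalid partitions and print a warning" approach can be illustrated with a small user-space sketch. This is not the kernel implementation: struct part_desc, check_part() and keep_valid_parts() are hypothetical names invented for this example, and the sizes used in the usage note are illustrative values only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct part_desc {
	const char *name;
	uint64_t offset;
	uint64_t size;
};

enum part_status { PART_OK, PART_TRUNCATED, PART_INVALID };

/* Clamp a partition to the device, or reject it if it starts past the end. */
static enum part_status check_part(struct part_desc *p, uint64_t dev_size)
{
	if (p->size == 0 || p->offset >= dev_size)
		return PART_INVALID;
	if (p->size > dev_size - p->offset) {
		p->size = dev_size - p->offset;
		return PART_TRUNCATED;
	}
	return PART_OK;
}

/*
 * Soft failure: warn about each unusable partition, skip it, and keep going.
 * The array is compacted in place; the return value is the number kept.
 */
static int keep_valid_parts(struct part_desc *parts, int n, uint64_t dev_size)
{
	int kept = 0;

	for (int i = 0; i < n; i++) {
		enum part_status st = check_part(&parts[i], dev_size);

		if (st == PART_INVALID) {
			fprintf(stderr, "partition \"%s\" is out of reach -- dropped\n",
				parts[i].name);
			continue;
		}
		if (st == PART_TRUNCATED)
			fprintf(stderr, "partition \"%s\" truncated to 0x%" PRIx64 "\n",
				parts[i].name, parts[i].size);
		parts[kept++] = parts[i];
	}
	return kept;
}
```

With a 0x400000-byte device and three partitions at offsets 0x0, 0x80000 and 0x600000, the first is kept as-is, the second is truncated, and the third is dropped with a warning. Note that compacting the array renumbers everything after a dropped entry, which is the ordering trade-off raised elsewhere in this thread.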
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
Hi Boris,

On Thu, Apr 26, 2018 at 7:53 PM, Boris Brezillon wrote:
> On Tue, 10 Apr 2018 15:26:20 +0200
> Geert Uytterhoeven wrote:
>> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
>> > On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
>> >> Currently add_mtd_device() failures are plainly ignored, which may lead
>> >> to kernel crashes later.
>>
>> >> Fix this by ignoring and freeing partitions that failed to add in
>> >> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
>> >> so fix that as well.
>> >>
>> >> Signed-off-by: Geert Uytterhoeven
>> >> ---
>> >> I don't know if it is worthwhile factoring out the common handling.
>> >>
>> >> Should allocate_partition() fail instead? There's a comment saying
>> >> "let's register it anyway to preserve ordering".
>>
>> >> --- a/drivers/mtd/mtdpart.c
>> >> +++ b/drivers/mtd/mtdpart.c
>>
>> >> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
>> >>               list_add(&slave->list, &mtd_partitions);
>> >>               mutex_unlock(&mtd_partitions_mutex);
>> >>
>> >> -             add_mtd_device(&slave->mtd);
>> >> +             ret = add_mtd_device(&slave->mtd);
>> >> +             if (ret) {
>> >> +                     mutex_lock(&mtd_partitions_mutex);
>> >> +                     list_del(&slave->list);
>> >> +                     mutex_unlock(&mtd_partitions_mutex);
>> >> +                     free_partition(slave);
>> >> +                     continue;
>> >> +             }
>> >
>> > Why is the partition even in the list in the first place ? Can we avoid
>> > adding it rather than adding and removing it ?
>>
>> Hence my question "Should allocate_partition() fail instead?".
>
> I'd prefer this option too. Can you prepare a new version doing that?

OK, then I have another question ;-)

Should this be a special failure, so all other valid partitions on the
same FLASH are still added, or should it be fatal, so no partitions are
added at all?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
Hi Geert,

Sorry for the late reply.

On Tue, 10 Apr 2018 15:26:20 +0200
Geert Uytterhoeven wrote:

> Hi Marek,
>
> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
> > On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
> >> Currently add_mtd_device() failures are plainly ignored, which may lead
> >> to kernel crashes later.
> >
> >> Fix this by ignoring and freeing partitions that failed to add in
> >> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
> >> so fix that as well.
> >>
> >> Signed-off-by: Geert Uytterhoeven
> >> ---
> >> I don't know if it is worthwhile factoring out the common handling.
> >>
> >> Should allocate_partition() fail instead? There's a comment saying
> >> "let's register it anyway to preserve ordering".
> >
> >> --- a/drivers/mtd/mtdpart.c
> >> +++ b/drivers/mtd/mtdpart.c
> >
> >> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
> >>               list_add(&slave->list, &mtd_partitions);
> >>               mutex_unlock(&mtd_partitions_mutex);
> >>
> >> -             add_mtd_device(&slave->mtd);
> >> +             ret = add_mtd_device(&slave->mtd);
> >> +             if (ret) {
> >> +                     mutex_lock(&mtd_partitions_mutex);
> >> +                     list_del(&slave->list);
> >> +                     mutex_unlock(&mtd_partitions_mutex);
> >> +                     free_partition(slave);
> >> +                     continue;
> >> +             }
> >
> > Why is the partition even in the list in the first place ? Can we avoid
> > adding it rather than adding and removing it ?
>
> Hence my question "Should allocate_partition() fail instead?".

I'd prefer this option too. Can you prepare a new version doing that?

Thanks,

Boris
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
On Tue, Apr 10, 2018 at 7:47 AM, Geert Uytterhoeven wrote:
> Hi Marek,
>
> On Tue, Apr 10, 2018 at 4:37 PM, Marek Vasut wrote:
>> On 04/10/2018 03:26 PM, Geert Uytterhoeven wrote:
>>> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
>>>> On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
>>>>> Currently add_mtd_device() failures are plainly ignored, which may lead
>>>>> to kernel crashes later.
>>>
>>>>> Fix this by ignoring and freeing partitions that failed to add in
>>>>> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
>>>>> so fix that as well.
>>>>>
>>>>> Signed-off-by: Geert Uytterhoeven
>>>>> ---
>>>>> I don't know if it is worthwhile factoring out the common handling.
>>>>>
>>>>> Should allocate_partition() fail instead? There's a comment saying
>>>>> "let's register it anyway to preserve ordering".
>>>
>>>>> --- a/drivers/mtd/mtdpart.c
>>>>> +++ b/drivers/mtd/mtdpart.c
>>>
>>>>> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
>>>>>               list_add(&slave->list, &mtd_partitions);
>>>>>               mutex_unlock(&mtd_partitions_mutex);
>>>>>
>>>>> -             add_mtd_device(&slave->mtd);
>>>>> +             ret = add_mtd_device(&slave->mtd);
>>>>> +             if (ret) {
>>>>> +                     mutex_lock(&mtd_partitions_mutex);
>>>>> +                     list_del(&slave->list);
>>>>> +                     mutex_unlock(&mtd_partitions_mutex);
>>>>> +                     free_partition(slave);
>>>>> +                     continue;
>>>>> +             }
>>>>
>>>> Why is the partition even in the list in the first place ? Can we avoid
>>>> adding it rather than adding and removing it ?
>>>
>>> Hence my question "Should allocate_partition() fail instead?".
>>> Note that if we go that route, it should be a "soft" failure, as we
>>> probably don't want to drop all other partitions on the device.
>>
>> Is the number of partitions ie. in /proc/mtdparts an ABI ?
>
> I don't know.
>

I don't know if it's an ABI, but having consistent /dev/mtdX numbering is
important, even in the case of a failed partition. Many scripts on
embedded systems are hard-coded to /dev/mtdX identifiers with the
expectation that they can access a particular address region of flash.
I'm sure that's what the "let's register it anyway to preserve ordering"
comment was trying to get across. I've even seen weird things in dts
files where later entries specify earlier addresses in order to leave the
old /dev/mtdX numbering alone.

Obviously, a better user solution is to construct the mtdX number from
/proc/mtd based on filtering for the name field, but not everyone does.

I'd be wary about doing any fix that disturbs the numbering as you'll be
disturbing users. At a minimum, there should be a loud warning in the log.

That said - obviously fixing the kernel crash must happen.

- Steve
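As an illustration of the name-based lookup suggested above, here is a minimal user-space sketch that scans text in /proc/mtd format and returns the index of the entry with a given name. The function name mtd_index_by_name() is made up for this example, and the parser assumes the usual layout of a header line followed by lines such as: mtd0: 00080000 00010000 "loader".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of any existing tool or library.
 * Reads /proc/mtd-style text from fp and returns N for the "mtdN" entry
 * whose quoted name equals `name`, or -1 if no entry matches.
 */
static int mtd_index_by_name(FILE *fp, const char *name)
{
	char line[256];

	while (fgets(line, sizeof(line), fp)) {
		int idx;
		char found[128];

		/* The header line does not start with "mtd<N>:" and is skipped. */
		if (sscanf(line, "mtd%d: %*x %*x \"%127[^\"]\"", &idx, found) == 2 &&
		    strcmp(found, name) == 0)
			return idx;
	}
	return -1;
}
```

A caller would open /proc/mtd with fopen(), pass the stream to this helper, and build the /dev/mtdN path from the returned index, so that a script keeps working even if the numbering shifts because an earlier partition was dropped.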
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
Hi Marek,

On Tue, Apr 10, 2018 at 4:37 PM, Marek Vasut wrote:
> On 04/10/2018 03:26 PM, Geert Uytterhoeven wrote:
>> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
>>> On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
>>>> Currently add_mtd_device() failures are plainly ignored, which may lead
>>>> to kernel crashes later.
>>
>>>> Fix this by ignoring and freeing partitions that failed to add in
>>>> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
>>>> so fix that as well.
>>>>
>>>> Signed-off-by: Geert Uytterhoeven
>>>> ---
>>>> I don't know if it is worthwhile factoring out the common handling.
>>>>
>>>> Should allocate_partition() fail instead? There's a comment saying
>>>> "let's register it anyway to preserve ordering".
>>
>>>> --- a/drivers/mtd/mtdpart.c
>>>> +++ b/drivers/mtd/mtdpart.c
>>
>>>> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
>>>>               list_add(&slave->list, &mtd_partitions);
>>>>               mutex_unlock(&mtd_partitions_mutex);
>>>>
>>>> -             add_mtd_device(&slave->mtd);
>>>> +             ret = add_mtd_device(&slave->mtd);
>>>> +             if (ret) {
>>>> +                     mutex_lock(&mtd_partitions_mutex);
>>>> +                     list_del(&slave->list);
>>>> +                     mutex_unlock(&mtd_partitions_mutex);
>>>> +                     free_partition(slave);
>>>> +                     continue;
>>>> +             }
>>>
>>> Why is the partition even in the list in the first place ? Can we avoid
>>> adding it rather than adding and removing it ?
>>
>> Hence my question "Should allocate_partition() fail instead?".
>> Note that if we go that route, it should be a "soft" failure, as we
>> probably don't want to drop all other partitions on the device.
>
> Is the number of partitions ie. in /proc/mtdparts an ABI ?

I don't know.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
On 04/10/2018 03:26 PM, Geert Uytterhoeven wrote:
> Hi Marek,
>
> On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
>> On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
>>> Currently add_mtd_device() failures are plainly ignored, which may lead
>>> to kernel crashes later.
>
>>> Fix this by ignoring and freeing partitions that failed to add in
>>> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
>>> so fix that as well.
>>>
>>> Signed-off-by: Geert Uytterhoeven
>>> ---
>>> I don't know if it is worthwhile factoring out the common handling.
>>>
>>> Should allocate_partition() fail instead? There's a comment saying
>>> "let's register it anyway to preserve ordering".
>
>>> --- a/drivers/mtd/mtdpart.c
>>> +++ b/drivers/mtd/mtdpart.c
>
>>> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
>>>               list_add(&slave->list, &mtd_partitions);
>>>               mutex_unlock(&mtd_partitions_mutex);
>>>
>>> -             add_mtd_device(&slave->mtd);
>>> +             ret = add_mtd_device(&slave->mtd);
>>> +             if (ret) {
>>> +                     mutex_lock(&mtd_partitions_mutex);
>>> +                     list_del(&slave->list);
>>> +                     mutex_unlock(&mtd_partitions_mutex);
>>> +                     free_partition(slave);
>>> +                     continue;
>>> +             }
>>
>> Why is the partition even in the list in the first place ? Can we avoid
>> adding it rather than adding and removing it ?
>
> Hence my question "Should allocate_partition() fail instead?".
> Note that if we go that route, it should be a "soft" failure, as we
> probably don't want to drop all other partitions on the device.

Is the number of partitions ie. in /proc/mtdparts an ABI ?

--
Best regards,
Marek Vasut
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
Hi Marek,

On Mon, Apr 9, 2018 at 11:59 PM, Marek Vasut wrote:
> On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
>> Currently add_mtd_device() failures are plainly ignored, which may lead
>> to kernel crashes later.

>> Fix this by ignoring and freeing partitions that failed to add in
>> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
>> so fix that as well.
>>
>> Signed-off-by: Geert Uytterhoeven
>> ---
>> I don't know if it is worthwhile factoring out the common handling.
>>
>> Should allocate_partition() fail instead? There's a comment saying
>> "let's register it anyway to preserve ordering".

>> --- a/drivers/mtd/mtdpart.c
>> +++ b/drivers/mtd/mtdpart.c

>> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info *master,
>>               list_add(&slave->list, &mtd_partitions);
>>               mutex_unlock(&mtd_partitions_mutex);
>>
>> -             add_mtd_device(&slave->mtd);
>> +             ret = add_mtd_device(&slave->mtd);
>> +             if (ret) {
>> +                     mutex_lock(&mtd_partitions_mutex);
>> +                     list_del(&slave->list);
>> +                     mutex_unlock(&mtd_partitions_mutex);
>> +                     free_partition(slave);
>> +                     continue;
>> +             }
>
> Why is the partition even in the list in the first place ? Can we avoid
> adding it rather than adding and removing it ?

Hence my question "Should allocate_partition() fail instead?".
Note that if we go that route, it should be a "soft" failure, as we
probably don't want to drop all other partitions on the device.

>>               mtd_add_partition_attrs(slave);
>>               if (parts[i].types)
>>                       mtd_parse_part(slave, parts[i].types);
>>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Re: [PATCH] mtd: partitions: Handle add_mtd_device() failures gracefully
On 04/09/2018 02:25 PM, Geert Uytterhoeven wrote:
> Currently add_mtd_device() failures are plainly ignored, which may lead
> to kernel crashes later.
>
> E.g. after flipping SW17 on r8a7791/koelsch, to switch from the large to
> the small QSPI FLASH, without updating the partition description in DT,
> the following happens:
>
>     m25p80 spi0.0: found s25sl032p, expected s25fl512s
>     3 fixed-partitions partitions found on MTD device spi0.0
>     Creating 3 MTD partitions on "spi0.0":
>     0x-0x0008 : "loader"
>     0x0008-0x0060 : "user"
>     mtd: partition "user" extends beyond the end of device "spi0.0" -- size truncated to 0x38
>
> The second partition is truncated correctly.
>
>     0x0060-0x0400 : "flash"
>     mtd: partition "flash" is out of reach -- disabled
>
> The third partition is disabled by allocate_partition(), which means
> fields like erasesize are not filled in. Hence add_mtd_device() fails
> and screams, rightfully:
>
>     ------------[ cut here ]------------
>     WARNING: CPU: 1 PID: 1 at drivers/mtd/mtdcore.c:508 add_mtd_device+0x2a0/0x2e0
>     Modules linked in:
>     CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0-koelsch-08649-g58e35e77b00c075d #4029
>     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
>     [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
>     [] (show_stack) from [] (dump_stack+0x7c/0x9c)
>     [] (dump_stack) from [] (__warn+0xd4/0x104)
>     [] (__warn) from [] (warn_slowpath_null+0x38/0x44)
>     [] (warn_slowpath_null) from [] (add_mtd_device+0x2a0/0x2e0)
>     [] (add_mtd_device) from [] (add_mtd_partitions+0xd0/0x16c)
>     [] (add_mtd_partitions) from [] (mtd_device_parse_register+0xc4/0x1b4)
>     [] (mtd_device_parse_register) from [] (m25p_probe+0x148/0x188)
>     [] (m25p_probe) from [] (spi_drv_probe+0x84/0xa0)
>
>     [...]
>
>     ---[ end trace d43ce221bca7ab5c ]---
>
> However, that failure is ignored by add_mtd_partitions(), leading to a
> crash later:
>
>     ------------[ cut here ]------------
>     kernel BUG at fs/sysfs/file.c:330!
>     Internal error: Oops - BUG: 0 [#1] SMP ARM
>     Modules linked in:
>     CPU: 1 PID: 1 Comm: swapper/0 Tainted: GW 4.16.0-koelsch-08649-g58e35e77b00c075d #4029
>     Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
>     PC is at sysfs_create_file_ns+0x24/0x40
>     LR is at 0x1
>     pc : []lr : [<0001>]psr: 6013
>     sp : eb447c00 ip : fp : c0e20174
>     r10: 0003 r9 : c0e20150 r8 : eb7e3818
>     r7 : ea8b20f8 r6 : c0e2017c r5 : r4 :
>     r3 : 0200 r2 : r1 : c0e2019c r0 :
>     Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
>     Control: 30c5387d Table: 40003000 DAC:
>     Process swapper/0 (pid: 1, stack limit = 0x7eba272f)
>     Stack: (0xeb447c00 to 0xeb448000)
>
>     [...]
>
>     [] (sysfs_create_file_ns) from [] (sysfs_create_files+0x34/0x70)
>     [] (sysfs_create_files) from [] (mtd_add_partition_attrs+0x10/0x34)
>     [] (mtd_add_partition_attrs) from [] (add_mtd_partitions+0xd8/0x16c)
>     [] (add_mtd_partitions) from [] (mtd_device_parse_register+0xc4/0x1b4)
>     [] (mtd_device_parse_register) from [] (m25p_probe+0x148/0x188)
>     [] (m25p_probe) from [] (spi_drv_probe+0x84/0xa0)
>
> Fix this by ignoring and freeing partitions that failed to add in
> add_mtd_partitions(). The same issue is present in mtd_add_partition(),
> so fix that as well.
>
> Signed-off-by: Geert Uytterhoeven
> ---
> I don't know if it is worthwhile factoring out the common handling.
>
> Should allocate_partition() fail instead? There's a comment saying
> "let's register it anyway to preserve ordering".
> ---
>  drivers/mtd/mtdpart.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index 023516a632766c42..d41adc1397dcf95e 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -637,7 +637,14 @@ int mtd_add_partition(struct mtd_info *parent, const char *name,
>       list_add(&new->list, &mtd_partitions);
>       mutex_unlock(&mtd_partitions_mutex);
>
> -     add_mtd_device(&new->mtd);
> +     ret = add_mtd_device(&new->mtd);
> +     if (ret) {
> +             mutex_lock(&mtd_partitions_mutex);
> +             list_del(&new->list);
> +             mutex_unlock(&mtd_partitions_mutex);
> +             free_partition(new);
> +             return ret;
> +     }
>
>       mtd_add_partition_attrs(new);
>
> @@ -731,7 +738,7 @@ int add_mtd_partitions(struct mtd_info *master,
>  {
>       struct mtd_part *slave;
>       uint64_t cur_offset = 0;
> -     int i;
> +     int i, ret;
>
>       printk(KERN_NOTICE "Creating %d MTD partitions on \"%s\":\n", nbparts, master->name);
>
> @@ -746,7 +753,15 @@ int add_mtd_partitions(struct mtd_info
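The error-handling pattern in the mtd_add_partition() hunk quoted above (link the new entry, try to register it, and on failure unlink and free it before returning the error) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: struct part, register_part(), unlink_part(), add_part() and count_parts() are hypothetical stand-ins, the -EINVAL return value is an arbitrary choice for the example, and locking is omitted.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mtd_part; not kernel code. */
struct part {
	struct part *next;
	unsigned int erasesize;	/* 0 models a partition disabled at allocation */
};

static struct part *part_list;	/* stand-in for the global partition list */

/* Models a registration step that rejects an incompletely set up partition. */
static int register_part(const struct part *p)
{
	return p->erasesize ? 0 : -EINVAL;	/* error code is an assumption */
}

/* Remove p from part_list if it is linked. */
static void unlink_part(struct part *p)
{
	for (struct part **pp = &part_list; *pp; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			return;
		}
	}
}

/* Link, register, and roll back on failure. Returns 0 or a negative errno. */
static int add_part(unsigned int erasesize)
{
	struct part *p = calloc(1, sizeof(*p));
	int ret;

	if (!p)
		return -ENOMEM;
	p->erasesize = erasesize;
	p->next = part_list;
	part_list = p;

	ret = register_part(p);
	if (ret) {
		/* Undo the list insertion so no half-registered entry remains. */
		unlink_part(p);
		free(p);
		return ret;
	}
	return 0;
}

/* Number of entries currently linked. */
static int count_parts(void)
{
	int n = 0;

	for (const struct part *p = part_list; p; p = p->next)
		n++;
	return n;
}
```

The point of the rollback is that later steps only ever see fully registered entries. Validating the partition before linking it at all, as suggested in the replies, would make the unlink step unnecessary.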