HI Peter
On Tue, Dec 16, 2025 at 2:41 PM Peter Suti <[email protected]> wrote: > > When performing an 'mtd read' on a NAND partition, the presence of bad > blocks can cause the command to fail with error -22 (EINVAL) if the > requested size is equal to (or close to) the partition size. > When size is not provided, the whole partition is read by default. > > This issue occurs because the bad block skipping logic increments the > physical read offset ('off') without decrementing the 'remaining' byte > count. If enough bad blocks are skipped, the 'off' pointer eventually > slides past the end of the partition boundary ('mtd->size') while > 'remaining' is still non-zero. The subsequent call to 'mtd_read_oob' > then fails its internal boundary checks. > > The Scenario: > Partition Size: 0x1200000 > Bad Blocks: 1 > Command: mtd read swufit ${loadaddr} > (Defaults to reading full 0x1200000 bytes) > > 1. U-Boot attempts to read 0x1200000 valid bytes. > 2. It encounters the bad block and skips it (off += mtd->erasesize). > 3. To satisfy the full request, it must read > (0x1200000 + mtd->erasesize) physical bytes. > 4. The read pointer 'off' exceeds 'mtd->size'. > 5. Command aborts with: "Read on swufit failed with error -22". > > Add a boundary check at the start of the read loop. If the read pointer > 'off' reaches the end of the physical device, check if the user explicitly > requested a size (argc >= 2). > > * If no size was specified (default read): Stop the read gracefully, > print a notice, and return success. We assume the user wants "all > available data" up to the physical limit. > * If a size was explicitly specified: Continue standard execution. This > will result in an error (-22) as expected, since the specific requirement > cannot be met. > > It is physically impossible to read the full logical size of a partition > if it contains bad blocks (as the bad blocks consume physical space > required for the data). In this case, truncating the read is reasonable. > For common use cases like loading FIT images (which often > contain padding at the end), missing the trailing 0xFF data is acceptable, > whereas failing the entire load operation is not. > > Also add some logging when bad blocks are encountered. > > NOTE: This changes the behavior of 'mtd read' when no size is provided. > Previously, it failed if the full read was impossible. Now, it > succeeds with a truncated buffer. The caller is responsible for > verifying data integrity (e.g., via fitImage hash verification). > > Signed-off-by: Peter Suti <[email protected]> > [polished the comment and commit message] > Signed-off-by: Radek Dostál <[email protected]> > --- > cmd/mtd.c | 27 ++++++++++++++++++++++++++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/cmd/mtd.c b/cmd/mtd.c > index 7f25144098b..1d1845bce44 100644 > --- a/cmd/mtd.c > +++ b/cmd/mtd.c > @@ -559,8 +559,10 @@ static int do_mtd_io(struct cmd_tbl *cmdtp, int flag, > int argc, > > /* Search for the first good block after the given offset */ > off = start_off; > - while (mtd_block_isbad(mtd, off)) > + while (mtd_block_isbad(mtd, off)) { > + printf("Bad block: failed to read at offset 0x%llx, > skipping.\n", off); > off += mtd->erasesize; > + } Would like to avoid adding more debug stuff or printing here. > > led_activity_blink(); > > @@ -569,9 +571,32 @@ static int do_mtd_io(struct cmd_tbl *cmdtp, int flag, > int argc, > > /* Loop over the pages to do the actual read/write */ > while (remaining) { > + /* > + * Boundary Check: Bad block skipping may push the physical > read offset > + * past the partition end (mtd->size). > + * If the read pointer 'off' has reached the end of the > physical flash, > + * we physically cannot read more. > + * > + * Check 'argc' to see if the user explicitly requested a > specific size. > + * * If argc < 2, the user did NOT specify a size (defaulted > to partition size). > + * In this case, we treat "end of flash" as a natural > stopping point > + * and truncate the read gracefully. > + * * If argc >= 2, the user explicitly asked for N bytes. > + * In this case, we do NOT truncate. We let the loop > continue, which > + * will trigger the standard error (-22) because the > request cannot be met. > + */ > + if (argc < 2 && off >= mtd->size) { > + printf("Notice: Reached end of partition with %lld > bytes remaining. " > + "Truncating read.\n", remaining); This is ok as print, to inform about the boundary check > + remaining = 0; > + ret = CMD_RET_SUCCESS; What is the drawback to failing here, I mean you can still verify the integrity. GIve a script example where this is needed in the commit message > + break; > + } > + > /* Skip the block if it is bad */ > if (mtd_is_aligned_with_block_size(mtd, off) && > mtd_block_isbad(mtd, off)) { > + printf("Bad block: failed to read at offset 0x%llx, > skipping.\n", off); > off += mtd->erasesize; This is an extra print for now, you can propose it on a separated patch Michael > continue; > } > -- > 2.43.0 > -- Michael Nazzareno Trimarchi Co-Founder & Chief Executive Officer M. +39 347 913 2170 [email protected] __________________________________ Amarula Solutions BV Joop Geesinkweg 125, 1114 AB, Amsterdam, NL T. +31 (0)85 111 9172 [email protected] www.amarulasolutions.com

