On 3/30/26 01:11, Marek Vasut wrote:
ufshcd_read_string_desc() can perform an out-of-bounds write and
corrupt the heap if the input utf-16 string contains code points
which convert to anything other than plain 7-bit ASCII.

This occurs because utf16_to_utf8(dst, src, size) in U-Boot behaves
differently than Linux utf16s_to_utf8s(..., maxlen), but the porting
process did not take that into consideration. The U-Boot variant of
the function converts up to $size utf-16 fixed-length 16-bit input
characters into as many 1..4 Byte long variable-length utf-8 output
characters. That means for 16 Byte input, the output can be up to 64
Bytes long. The Linux variant converts utf-16 input into up to
$maxlen Bytes worth of utf-8 output and stops at the $maxlen limit.
That means for 16 Byte input with maxlen=32, the processing will stop
after writing 32 output Bytes.
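
The difference between the two calling conventions can be sketched in
plain C. This is a simplified model, not the U-Boot or Linux code: the
names encode_bmp(), convert_by_input_count() and
convert_by_output_limit() are hypothetical, and surrogate pairs are
deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one non-surrogate UTF-16 code unit as UTF-8 (1..3 bytes). */
static size_t encode_bmp(uint16_t c, uint8_t *out)
{
	if (c < 0x80) {
		out[0] = (uint8_t)c;
		return 1;
	}
	if (c < 0x800) {
		out[0] = (uint8_t)(0xC0 | (c >> 6));
		out[1] = (uint8_t)(0x80 | (c & 0x3F));
		return 2;
	}
	out[0] = (uint8_t)(0xE0 | (c >> 12));
	out[1] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
	out[2] = (uint8_t)(0x80 | (c & 0x3F));
	return 3;
}

/*
 * Model of the "size counts input units" convention: every input unit
 * is converted and nothing bounds the output, so dst must be sized for
 * the worst case by the caller.
 */
static size_t convert_by_input_count(uint8_t *dst, const uint16_t *src,
				     size_t units)
{
	uint8_t *p = dst;

	assert(dst && src);
	while (units--)
		p += encode_bmp(*src++, p);
	return (size_t)(p - dst);
}

/*
 * Model of the "maxlen bounds the output" convention: conversion stops
 * before a character that would not fit into maxlen output bytes.
 */
static size_t convert_by_output_limit(uint8_t *dst, size_t maxlen,
				      const uint16_t *src, size_t units)
{
	uint8_t tmp[3];
	size_t written = 0;

	assert(dst && src);
	while (units--) {
		size_t n = encode_bmp(*src++, tmp);

		if (written + n > maxlen)
			break;
		memcpy(dst + written, tmp, n);
		written += n;
	}
	return written;
}
```

With four U+20AC code units as input (8 input Bytes), the first model
writes 12 output Bytes regardless of how large the caller's buffer is,
while the second model with a limit of 4 writes only 3.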

In case of U-Boot, use of utf16_to_utf8() leads to potential corruption
of data past the $size Bytes and therefore corruption of surrounding
content on the heap.

The fix is simple: allocate a buffer that is sufficient to fit the
utf-8 string. The rest of the code in ufshcd_read_string_desc() does
correctly limit the buffer to fit into the DMA descriptor afterward.

Signed-off-by: Marek Vasut <[email protected]>
---
NOTE: This is for 2026.04, but please do test it on your hardware too.
---
Cc: Bhupesh Sharma <[email protected]>
Cc: Julien Stephan <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Neha Malcom Francis <[email protected]>
Cc: Neil Armstrong <[email protected]>
Cc: Padmarao Begari <[email protected]>
Cc: Tom Rini <[email protected]>
Cc: [email protected]
---
  drivers/ufs/ufs-uclass.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/ufs/ufs-uclass.c b/drivers/ufs/ufs-uclass.c
index 81fd431f951..6a51f337e47 100644
--- a/drivers/ufs/ufs-uclass.c
+++ b/drivers/ufs/ufs-uclass.c
@@ -1751,7 +1751,15 @@ static int ufshcd_read_string_desc(struct ufs_hba *hba, int desc_index,
                        goto out;
                }
-               buff_ascii = kmalloc(ascii_len, GFP_KERNEL);

I think the whole function is a mess; I think we should rewrite it with
something like this:

...
int max_len;
int ascii_len;

max_len = (desc_len - QUERY_DESC_HDR_SIZE) * 2 + 1;
buff_ascii = kmalloc(max_len, GFP_KERNEL);

ascii_len = utf16_to_utf8(buff_ascii,
                (uint16_t *)&buf[QUERY_DESC_HDR_SIZE], max_len);
...

So we stop having a random len, and use the _real_ len returned by utf16_to_utf8().
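
A minimal sketch of that pattern, under two assumptions that should be
checked against the U-Boot source: that the converter takes a count of
input code units, and that it returns a pointer past the last Byte
written rather than a length. demo_utf16_to_utf8() is a hypothetical
stand-in with that shape (no surrogate handling), and
utf16_units_to_cstr() is an illustrative helper, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: converts `units` code units, returns end of output. */
static uint8_t *demo_utf16_to_utf8(uint8_t *dest, const uint16_t *src,
				   size_t units)
{
	while (units--) {
		uint16_t c = *src++;

		if (c < 0x80) {
			*dest++ = (uint8_t)c;
		} else if (c < 0x800) {
			*dest++ = (uint8_t)(0xC0 | (c >> 6));
			*dest++ = (uint8_t)(0x80 | (c & 0x3F));
		} else {
			*dest++ = (uint8_t)(0xE0 | (c >> 12));
			*dest++ = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
			*dest++ = (uint8_t)(0x80 | (c & 0x3F));
		}
	}
	return dest;
}

/*
 * Allocate for the worst case, convert, and derive the real length
 * from the returned end pointer. Returns NULL on allocation failure;
 * the caller frees the result.
 */
static char *utf16_units_to_cstr(const uint16_t *src, size_t units,
				 size_t *out_len)
{
	/* Conservative bound: 4 output bytes per input unit, plus NUL. */
	size_t max_len = units * 4 + 1;
	uint8_t *buf, *end;

	assert(src && out_len);
	buf = malloc(max_len);
	if (!buf)
		return NULL;

	end = demo_utf16_to_utf8(buf, src, units);
	*end = '\0';
	*out_len = (size_t)(end - buf);
	return (char *)buf;
}
```

The length then comes from pointer subtraction, so any later truncation
to fit the descriptor can work from the actual number of utf-8 Bytes
instead of a value derived from the input length.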

Neil


+               /*
+                * utf-8 is encoded using up to 4-Bytes per character,
+                * however, we only allocate such a buffer because the
+                * utf16_to_utf8() converts the entire $ascii_len worth
+                * of input characters into up to 4-Byte long utf-8
+                * characters. The rest of the function uses only up to
+                * $ascii_len bytes of that utf-8 string.
+                */
+               buff_ascii = kmalloc(ascii_len * 4, GFP_KERNEL);
                if (!buff_ascii) {
                        err = -ENOMEM;
                        goto out;
