On 3/30/26 01:11, Marek Vasut wrote:
The ufshcd_read_string_desc() can perform out of bounds write and
corrupt heap in case the input utf-16 string contains code points
which convert to anything more than plain 7-bit ASCII string.
This occurs because utf16_to_utf8(dst, src, size) in U-Boot behaves
differently than Linux utf16s_to_utf8s(..., maxlen), but the porting
process did not take that into consideration. The U-Boot variant of
the function converts up to $size utf-16 fixed-length 16-bit input
characters into as many 1..4 Byte long variable-length utf-8 output
characters. That means for 16 Byte input, the output can be up to 64
Bytes long. The Linux variant converts up utf-16 input into up to
$maxlen Bytes worth of utf-8 output, but stops at the $maxlen limit.
That means for 16 Byte input with maxlen=32, the processing will stop
after writing 32 output Bytes.
In case of U-Boot, use of utf16_to_utf8() leads to potential corruption
of data past the $size Bytes and therefore corruption of surrounding
content on the heap.
The fix is as simple, allocate buffer that is sufficient to fit the
utf-8 string. The rest of the code in ufshcd_read_string_desc() does
correctly limit the buffer to fit into the DMA descriptor afterward.
Signed-off-by: Marek Vasut <[email protected]>
---
NOTE: This is for 2026.04, but please do test it on your hardware too.
---
Cc: Bhupesh Sharma <[email protected]>
Cc: Julien Stephan <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Neha Malcom Francis <[email protected]>
Cc: Neil Armstrong <[email protected]>
Cc: Padmarao Begari <[email protected]>
Cc: Tom Rini <[email protected]>
Cc: [email protected]
---
drivers/ufs/ufs-uclass.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/ufs/ufs-uclass.c b/drivers/ufs/ufs-uclass.c
index 81fd431f951..6a51f337e47 100644
--- a/drivers/ufs/ufs-uclass.c
+++ b/drivers/ufs/ufs-uclass.c
@@ -1751,7 +1751,15 @@ static int ufshcd_read_string_desc(struct ufs_hba *hba,
int desc_index,
goto out;
}
- buff_ascii = kmalloc(ascii_len, GFP_KERNEL);
I think the whole function is a mess, I think we would rewrite with something
like this:
...
int max_len;
int ascii_len;
max_len = (desc_len - QUERY_DESC_HDR_SIZE) * 2 + 1;
buff_ascii = kmalloc(max_len, GFP_KERNEL);
ascii_len = utf16_to_utf8(buff_ascii,
(uint16_t *)&buf[QUERY_DESC_HDR_SIZE], max_len);
...
So we stop having random len, and use the _real_ len returned by utf16_to_utf8.
Neil
+ /*
+ * utf-8 is encoded using up to 4-Bytes per character,
+ * however, we only allocate such a buffer because the
+ * utf16_to_utf8() converts the entire $ascii_len worth
+ * of input characters into up to 4-Byte long utf-8
+ * characters. The rest of the function uses only up to
+ * $ascii_len bytes of that utf-8 string.
+ */
+ buff_ascii = kmalloc(ascii_len * 4, GFP_KERNEL);
if (!buff_ascii) {
err = -ENOMEM;
goto out;