This is an automated email from the ASF dual-hosted git repository.
Mryange pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 86ac986053c [fix](function) Return null for invalid base64 input length (#64788)
86ac986053c is described below
commit 86ac986053c20b5378fc5d7235035c2903b5e442
Author: Mryange <[email protected]>
AuthorDate: Thu Jun 25 10:22:26 2026 +0800
[fix](function) Return null for invalid base64 input length (#64788)
### What problem does this PR solve?
`from_base64` and `from_base64_binary` call the base64 decoder after
sizing the output buffer from `len / 4 * 3`. For invalid input whose
length is not a multiple of four, this can pass an undersized
destination buffer into the decoder before the function marks the row as
invalid.
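To make the sizing problem concrete, the sketch below compares the `len / 4 * 3` capacity with the number of whole bytes that `len` base64 characters can carry at 6 bits each. The helper names are hypothetical and not part of Doris. Whether a given decoder really writes that many bytes for malformed input depends on its implementation, which this message does not show.

```cpp
#include <cassert>
#include <cstddef>

// Capacity formula quoted in the commit message (integer division).
constexpr std::size_t sized_capacity(std::size_t len) { return len / 4 * 3; }

// Whole bytes that `len` unpadded base64 characters can carry:
// each character holds 6 bits, so floor(len * 6 / 8) bytes.
constexpr std::size_t unpadded_payload(std::size_t len) { return len * 6 / 8; }

// Hypothetical helper: true when the formula yields fewer bytes than the
// characters could encode, i.e. a lenient decoder might overrun the buffer.
constexpr bool undersized(std::size_t len) {
    return sized_capacity(len) < unpadded_payload(len);
}

// Worked examples; call this from main() to run the checks.
inline void check_sizing_examples() {
    assert(sized_capacity(2) == 0);    // "YQ": the buffer gets no room at all
    assert(unpadded_payload(2) == 1);  // yet 12 bits cover one whole byte
    assert(undersized(6));             // 3 bytes reserved, 4 representable
    assert(!undersized(12));           // multiples of four are sized safely
}
```

For lengths that are a multiple of four the two quantities agree, which is why rejecting every other length up front is enough to keep the capacity formula safe.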
Root cause: the functions only handled decoder failure after invoking
the decoder, but did not reject impossible base64 lengths first.
This patch returns `NULL` for inputs with invalid base64 length before
decoding, keeping the existing invalid-input SQL behavior while avoiding
unsafe decoder calls.
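The check-then-decode order described above can be sketched outside Doris as follows. This is a minimal illustration under stated assumptions: `strict_decode` is a hypothetical stand-in for Doris's `base64_decode`, `from_base64_sketch` is an invented name, and `std::nullopt` plays the role of SQL `NULL`. The real functions operate on columns and null maps, as the diff below shows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Map one base64 character to its 6-bit value, or -1 if it is not in the alphabet.
static int sextet(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Hypothetical strict decoder (NOT Doris's base64_decode).
// Precondition: len % 4 == 0, so it writes at most len / 4 * 3 bytes to dst.
// Returns the number of bytes written, or -1 on an invalid character.
static long strict_decode(const char* src, std::size_t len, unsigned char* dst) {
    assert(len % 4 == 0);
    std::size_t out = 0;
    for (std::size_t i = 0; i < len; i += 4) {
        // Padding is only accepted at the end of the final group.
        int pad = 0;
        if (i + 4 == len && src[i + 3] == '=') pad = (src[i + 2] == '=') ? 2 : 1;
        std::uint32_t acc = 0;
        for (int k = 0; k < 4; ++k) {
            const int v = (k >= 4 - pad) ? 0 : sextet(src[i + k]);
            if (v < 0) return -1;
            acc = (acc << 6) | static_cast<std::uint32_t>(v);
        }
        dst[out++] = static_cast<unsigned char>((acc >> 16) & 0xFF);
        if (pad < 2) dst[out++] = static_cast<unsigned char>((acc >> 8) & 0xFF);
        if (pad < 1) dst[out++] = static_cast<unsigned char>(acc & 0xFF);
    }
    return static_cast<long>(out);
}

// Reject impossible lengths first, then size the buffer, then decode.
std::optional<std::string> from_base64_sketch(std::string_view in) {
    if (in.size() % 4 != 0) return std::nullopt;  // the guard added by this patch
    std::string out(in.size() / 4 * 3, '\0');     // safe only because of the guard
    const long n = strict_decode(in.data(), in.size(),
                                 reinterpret_cast<unsigned char*>(out.data()));
    if (n < 0) return std::nullopt;               // existing invalid-input path
    out.resize(static_cast<std::size_t>(n));
    return out;
}
```

With this sketch, `"YQ"` is rejected by the length guard without touching the decoder, while `"bad@"` and `"===="` pass the guard and are rejected by the decoder itself, matching the three test rows added in this commit.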
### Release note
None
---
be/src/exprs/function/function_string.cpp | 6 ++++++
be/src/exprs/function/function_varbinary.cpp | 5 +++++
be/test/exprs/function/function_string_test.cpp | 3 +++
be/test/exprs/function/function_varbinary_test.cpp | 3 +++
4 files changed, 17 insertions(+)
diff --git a/be/src/exprs/function/function_string.cpp b/be/src/exprs/function/function_string.cpp
index 86c3e0f6a69..c1b2c3bad51 100644
--- a/be/src/exprs/function/function_string.cpp
+++ b/be/src/exprs/function/function_string.cpp
@@ -1185,6 +1185,12 @@ struct FromBase64Impl {
continue;
}
+ if (UNLIKELY(srclen % 4 != 0)) {
+ null_map[i] = 1;
+ dst_offsets[i] = cast_set<uint32_t>(offset);
+ continue;
+ }
+
auto outlen = base64_decode(source, srclen, dst_data_ptr + offset);
if (outlen < 0) {
diff --git a/be/src/exprs/function/function_varbinary.cpp b/be/src/exprs/function/function_varbinary.cpp
index 5c02e637a5c..4cc406fa258 100644
--- a/be/src/exprs/function/function_varbinary.cpp
+++ b/be/src/exprs/function/function_varbinary.cpp
@@ -230,6 +230,11 @@ struct FromBase64BinaryImpl {
continue;
}
+ if (UNLIKELY(slen % 4 != 0)) {
+ null_map[i] = 1;
+ continue;
+ }
+
int cipher_len = slen / 4 * 3;
auto [cipher_inline, dst] = VarBinaryOP::alloc(res, i, cipher_len);
diff --git a/be/test/exprs/function/function_string_test.cpp b/be/test/exprs/function/function_string_test.cpp
index 2e1aaa839c4..9b7e0d793ab 100644
--- a/be/test/exprs/function/function_string_test.cpp
+++ b/be/test/exprs/function/function_string_test.cpp
@@ -1599,6 +1599,9 @@ TEST(function_string_test, function_from_base64_test) {
{{std::string("5ZWK5ZOI5ZOI5ZOI8J+YhCDjgILigJTigJQh")},
std::string("啊哈哈哈😄 。——!")},
{{std::string("ò&ø")}, Null()},
+ {{std::string("bad@")}, Null()},
+ {{std::string("====")}, Null()},
+ {{std::string("YQ")}, Null()},
{{std::string("TVl0ZXN0U1RS")}, std::string("MYtestSTR")},
{{Null()}, Null()},
};
diff --git a/be/test/exprs/function/function_varbinary_test.cpp b/be/test/exprs/function/function_varbinary_test.cpp
index bfd87ae1143..c0fe3b02f1c 100644
--- a/be/test/exprs/function/function_varbinary_test.cpp
+++ b/be/test/exprs/function/function_varbinary_test.cpp
@@ -100,6 +100,9 @@ TEST(function_binary_test, function_from_base64_binary_test) {
{{std::string("")}, VARBINARY("")},
{{std::string("5ZWK5ZOI5ZOI5ZOI8J+YhCDjgILigJTigJQh")},
VARBINARY("啊哈哈哈😄 。——!")},
{{std::string("ò&ø")}, Null()},
+ {{std::string("bad@")}, Null()},
+ {{std::string("====")}, Null()},
+ {{std::string("YQ")}, Null()},
{{std::string("TVl0ZXN0U1RS")}, VARBINARY("MYtestSTR")},
{{std::string("YWFhYWFhYWFhYWE=")}, VARBINARY("aaaaaaaaaaa")},
{{std::string("TVl0ZXN0U1RSTVl0ZXN0U1RSTVl0ZXN0U1RS")},