dmitry-chirkov-dremio commented on code in PR #49660:
URL: https://github.com/apache/arrow/pull/49660#discussion_r3051556051
##########
cpp/src/arrow/vendored/base64.cpp:
##########
@@ -93,18 +97,51 @@ std::string base64_encode(std::string_view string_to_encode) {
return base64_encode(bytes_to_encode, in_len);
}
-std::string base64_decode(std::string_view encoded_string) {
+arrow::Result<std::string> base64_decode(std::string_view encoded_string) {
size_t in_len = encoded_string.size();
int i = 0;
int j = 0;
int in_ = 0;
unsigned char char_array_4[4], char_array_3[3];
std::string ret;
- while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) {
+ static const std::string base64_chars =
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+ "abcdefghijklmnopqrstuvwxyz"
+ "0123456789+/";
+
+ auto is_base64 = [](unsigned char c) -> bool {
+ return (std::isalnum(c) || (c == '+') || (c == '/'));
Review Comment:
Seeing this logic right below the base64_chars definition is a bit strange: is_base64 re-derives the alphabet with std::isalnum instead of reusing base64_chars.
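One way to keep a single source of truth would be to derive the membership check from the alphabet itself. The sketch below is only an illustration, not code from the PR: `kBase64Chars`, `kInvalidSymbol`, `DecodeTable` and `IsBase64` are hypothetical names. It builds a 256-entry reverse table once and uses it both for validation and for mapping a character to its 6-bit value, which also avoids the locale dependence of std::isalnum.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical names; the alphabet string matches the one in the diff.
constexpr std::string_view kBase64Chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

constexpr std::int8_t kInvalidSymbol = -1;

// Reverse lookup: byte value -> 6-bit index, or kInvalidSymbol.
// Built once, on first use, directly from kBase64Chars.
inline const std::array<std::int8_t, 256>& DecodeTable() {
  static const std::array<std::int8_t, 256> table = [] {
    std::array<std::int8_t, 256> t;
    t.fill(kInvalidSymbol);
    for (std::size_t i = 0; i < kBase64Chars.size(); ++i) {
      t[static_cast<unsigned char>(kBase64Chars[i])] =
          static_cast<std::int8_t>(i);
    }
    return t;
  }();
  return table;
}

// Membership test that cannot drift from the alphabet definition.
inline bool IsBase64(unsigned char c) {
  return DecodeTable()[c] != kInvalidSymbol;
}
```

With such a table the decoder no longer needs a `find` on the alphabet string per character; the same lookup answers both "is it valid" and "what is its value".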
##########
cpp/src/arrow/vendored/base64.cpp:
##########
@@ -93,18 +97,51 @@ std::string base64_encode(std::string_view string_to_encode) {
return base64_encode(bytes_to_encode, in_len);
}
-std::string base64_decode(std::string_view encoded_string) {
+arrow::Result<std::string> base64_decode(std::string_view encoded_string) {
size_t in_len = encoded_string.size();
int i = 0;
int j = 0;
int in_ = 0;
unsigned char char_array_4[4], char_array_3[3];
std::string ret;
- while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) {
+ static const std::string base64_chars =
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+ "abcdefghijklmnopqrstuvwxyz"
+ "0123456789+/";
+
+ auto is_base64 = [](unsigned char c) -> bool {
+ return (std::isalnum(c) || (c == '+') || (c == '/'));
+ };
+
+ if (encoded_string.size() % 4 != 0) {
+ return arrow::Status::Invalid("Invalid base64 input: length is not a multiple of 4");
+ }
+
+ size_t padding_start = encoded_string.find('=');
Review Comment:
I am not a fan of a separate validation loop, since it adds performance overhead.
Please validate and decode in a single pass.
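A rough sketch of what a single pass could look like, under some assumptions: std::optional stands in for arrow::Result so the snippet compiles on its own (std::nullopt plays the role of Status::Invalid), and `kAlphabet`, `BuildReverseTable` and `DecodeBase64SinglePass` are hypothetical names. Each 4-character group is checked and decoded in the same loop iteration; padding is accepted only as the last one or two characters of the final group. The sketch does not reject non-zero trailing bits before padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper names; std::optional is a stand-in for arrow::Result.
constexpr std::string_view kAlphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

inline std::array<std::int8_t, 256> BuildReverseTable() {
  std::array<std::int8_t, 256> t;
  t.fill(-1);  // -1 marks bytes outside the alphabet
  for (std::size_t i = 0; i < kAlphabet.size(); ++i) {
    t[static_cast<unsigned char>(kAlphabet[i])] = static_cast<std::int8_t>(i);
  }
  return t;
}

inline std::optional<std::string> DecodeBase64SinglePass(std::string_view in) {
  static const std::array<std::int8_t, 256> table = BuildReverseTable();
  if (in.size() % 4 != 0) return std::nullopt;

  std::string out;
  out.reserve(in.size() / 4 * 3);
  for (std::size_t pos = 0; pos < in.size(); pos += 4) {
    const bool last_group = (pos + 4 == in.size());
    std::uint32_t bits = 0;  // accumulates 4 x 6 bits
    int pad = 0;
    for (int k = 0; k < 4; ++k) {
      const unsigned char c = static_cast<unsigned char>(in[pos + k]);
      if (c == '=') {
        // '=' is only legal at positions 2 and 3 of the final group.
        if (!last_group || k < 2) return std::nullopt;
        ++pad;
        bits <<= 6;
      } else {
        if (pad > 0) return std::nullopt;  // data after padding
        const std::int8_t value = table[c];
        if (value < 0) return std::nullopt;  // not in the alphabet
        bits = (bits << 6) | static_cast<std::uint32_t>(value);
      }
    }
    out.push_back(static_cast<char>((bits >> 16) & 0xFF));
    if (pad < 2) out.push_back(static_cast<char>((bits >> 8) & 0xFF));
    if (pad < 1) out.push_back(static_cast<char>(bits & 0xFF));
  }
  return out;
}
```

Because validation happens on the character that is about to be decoded, there is no separate scan and no `find('=')` over the whole input.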
##########
cpp/src/arrow/util/base64.h:
##########
@@ -29,7 +31,7 @@ ARROW_EXPORT
std::string base64_encode(std::string_view s);
ARROW_EXPORT
-std::string base64_decode(std::string_view s);
+arrow::Result<std::string> base64_decode(std::string_view s);
Review Comment:
Is there no need to change the callers?
Perhaps add a single integration test that calls a caller function with
invalid input to validate the error path.
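To illustrate the concern, here is a self-contained sketch of a caller that has to propagate the failure once the decoder's return type changes. Everything in it is hypothetical: `FakeBase64Decode` is a stand-in that reproduces only the length check from the diff and does no real decoding, `DecodedPayloadSize` is an invented caller, and std::optional again replaces arrow::Result. In Arrow itself the propagation would more likely go through ARROW_ASSIGN_OR_RAISE.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Stand-in decoder (hypothetical): std::nullopt plays the role of
// Status::Invalid. Only the "multiple of 4" check is reproduced here.
inline std::optional<std::string> FakeBase64Decode(std::string_view s) {
  if (s.size() % 4 != 0) return std::nullopt;
  return std::string(s);  // placeholder payload, NOT a real decode
}

// Hypothetical caller: it can no longer use the result as a plain string
// and has to forward the failure to its own caller.
inline std::optional<std::size_t> DecodedPayloadSize(std::string_view encoded) {
  std::optional<std::string> decoded = FakeBase64Decode(encoded);
  if (!decoded.has_value()) return std::nullopt;
  return decoded->size();
}
```

An integration test along these lines would feed the caller an input such as "abc" and check that the failure reaches the caller's own return value instead of being swallowed.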