Copilot commented on code in PR #49748:
URL: https://github.com/apache/arrow/pull/49748#discussion_r3089398999
##########
cpp/src/arrow/vendored/base64.cpp:
##########
@@ -40,6 +41,17 @@ static const std::string base64_chars =
"abcdefghijklmnopqrstuvwxyz"
"0123456789+/";
+static const std::array<int8_t, 256> kBase64Lookup = [] {
+ std::array<int8_t, 256> table{};
+ table.fill(-1);
+
+ for (size_t i = 0; i < base64_chars.size(); ++i) {
+ table[static_cast<uint8_t>(base64_chars[i])] = i;
Review Comment:
`i` is a `size_t` and is implicitly narrowed when assigned into the `int8_t`
lookup table. Some compilers (notably MSVC with /W4 or /WX) warn/error on this
conversion; cast `i` to `int8_t` (or change the table element type) to avoid
build issues.
```suggestion
table[static_cast<uint8_t>(base64_chars[i])] = static_cast<int8_t>(i);
```
##########
cpp/src/arrow/vendored/base64.cpp:
##########
@@ -30,6 +30,7 @@
*/
#include "arrow/util/base64.h"
+#include <array>
Review Comment:
This file now uses int8_t/uint8_t (for kBase64Lookup and indexing) but
doesn’t include <cstdint>. Relying on indirect includes (e.g., via iostream) is
not guaranteed and can break builds on some toolchains; add an explicit
#include <cstdint>.
```suggestion
#include <array>
#include <cstdint>
```
##########
cpp/src/arrow/vendored/base64.cpp:
##########
@@ -119,22 +131,16 @@ Result<std::string> base64_decode(std::string_view
encoded_string) {
return Status::Invalid("Invalid base64 input: padding in wrong
position");
}
- if (base64_chars.find(c) == std::string::npos) {
+ if (kBase64Lookup[static_cast<uint8_t>(c)] == -1) {
return Status::Invalid("Invalid base64 input: character is not valid
base64 character");
}
- char_array_4[i++] = c;
+ char_array_4[i++] = kBase64Lookup[static_cast<uint8_t>(c)];
Review Comment:
The lookup table is indexed twice for each non-padding character (once for
validation, once for assignment). Cache the lookup result in a local variable
to avoid redundant loads and keep the hot path minimal, especially since this
PR is performance-motivated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]