wgtmac commented on code in PR #49334:
URL: https://github.com/apache/arrow/pull/49334#discussion_r2987795470
##########
cpp/src/parquet/bloom_filter.cc:
##########
@@ -104,6 +127,139 @@ static ::arrow::Status ValidateBloomFilterHeader(
return ::arrow::Status::OK();
}
+BlockSplitBloomFilter DeserializeEncryptedFromStream(
+ const ReaderProperties& properties, ArrowInputStream* input,
+ std::optional<int64_t> bloom_filter_length, Decryptor* decryptor,
+ int16_t row_group_ordinal, int16_t column_ordinal) {
+ ThriftDeserializer deserializer(properties);
+ format::BloomFilterHeader header;
+
+ // Read the length-prefixed ciphertext for the header.
+ PARQUET_ASSIGN_OR_THROW(auto length_buf, input->Read(kCiphertextLengthSize));
+ if (ARROW_PREDICT_FALSE(length_buf->size() < kCiphertextLengthSize)) {
Review Comment:
Several similar length checks are scattered around this function. It would be
good to extract them into a dedicated helper like the one below:
```cpp
enum class LengthCheckMode { kShortRead, kExactMatch };

// kShortRead fails when fewer bytes than expected were read;
// kExactMatch fails on any difference between expected and actual.
void CheckBloomFilterLength(int64_t expected, int64_t actual,
                            std::string_view context, LengthCheckMode mode) {
  const bool failed = mode == LengthCheckMode::kShortRead ? actual < expected
                                                          : actual != expected;
  if (ARROW_PREDICT_FALSE(failed)) {
    std::stringstream ss;
    ss << context;
    if (mode == LengthCheckMode::kShortRead) {
      ss << " read failed: expected ";
    } else {
      ss << " length mismatch: expected ";
    }
    ss << expected << " bytes, got " << actual;
    throw ParquetException(ss.str());
  }
}
```
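To illustrate how the two modes would behave at a call site, here is a rough standalone sketch. It is not code from the PR: `SketchParquetException` is a stand-in for `ParquetException`, `ARROW_PREDICT_FALSE` is dropped, and `TryCheck` plus the context strings are hypothetical names used only to make the behavior observable.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in for parquet::ParquetException so the sketch compiles on its own.
struct SketchParquetException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

enum class LengthCheckMode { kShortRead, kExactMatch };

// Same logic as the helper suggested above, without ARROW_PREDICT_FALSE.
inline void CheckBloomFilterLength(int64_t expected, int64_t actual,
                                   std::string_view context,
                                   LengthCheckMode mode) {
  const bool failed = mode == LengthCheckMode::kShortRead ? actual < expected
                                                          : actual != expected;
  if (failed) {
    std::stringstream ss;
    ss << context
       << (mode == LengthCheckMode::kShortRead ? " read failed: expected "
                                               : " length mismatch: expected ");
    ss << expected << " bytes, got " << actual;
    throw SketchParquetException(ss.str());
  }
}

// Hypothetical wrapper: returns the error text, or "" when the check passes.
inline std::string TryCheck(int64_t expected, int64_t actual,
                            std::string_view context, LengthCheckMode mode) {
  try {
    CheckBloomFilterLength(expected, actual, context, mode);
    return "";
  } catch (const SketchParquetException& e) {
    return e.what();
  }
}
```

With this shape, a short-read check such as the one on `length_buf->size()` in the hunk above would become a single call, for example `CheckBloomFilterLength(kCiphertextLengthSize, length_buf->size(), "Bloom filter header length", LengthCheckMode::kShortRead)`, assuming the context string is chosen by the author. Note that `kShortRead` accepts reads longer than expected, while `kExactMatch` rejects them.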