alamb commented on code in PR #7052:
URL: https://github.com/apache/arrow-rs/pull/7052#discussion_r1938506382
##########
parquet/src/thrift.rs:
##########
@@ -251,7 +255,12 @@ impl TInputProtocol for TCompactSliceInputProtocol<'_> {
fn collection_u8_to_type(b: u8) -> thrift::Result<TType> {
match b {
- 0x01 => Ok(TType::Bool),
+ // For historical and compatibility reasons, a reader should be
capable to deal with both cases.
+ // The only valid value in the original spec was 2, but due to an
widespread implementation bug
+ // the defacto standard across large parts of the library became 1
instead.
+ // As a result, both values are now allowed.
+ //
https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set
+ 0x01 | 0x02 => Ok(TType::Bool),
Review Comment:
Upstream thrift doesn't seem to support 0x02 for this 🤔
https://github.com/apache/thrift/blob/a45618e05bbb2d29737514541b6d61f6850d9b16/lib/rs/src/protocol/compact.rs#L655
##########
parquet/src/thrift.rs:
##########
@@ -170,9 +170,13 @@ impl TInputProtocol for TCompactSliceInputProtocol<'_> {
Some(b) => Ok(b),
None => {
let b = self.read_byte()?;
+ // Previous versions of the thrift specification said to use 0
and 1 inside collections,
+ // but that differed from existing implementations.
+ // The specification was updated in
https://github.com/apache/thrift/commit/2c29c5665bc442e703480bb0ee60fe925ffe02e8.
+ // At least the go implementation seems to have followed the
previously documented values.
match b {
0x01 => Ok(true),
Review Comment:
It seems the upstream thrift implementation may simply treat anything that
is not 0x1 as false:
https://github.com/apache/thrift/blob/a45618e05bbb2d29737514541b6d61f6850d9b16/lib/rs/src/protocol/binary.rs#L167-L169
This doesn't look like it has changed for 9 years 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]