GregBowyer commented on a change in pull request #8698:
URL: https://github.com/apache/arrow/pull/8698#discussion_r534652479



##########
File path: rust/parquet/src/data_type.rs
##########
@@ -229,6 +235,82 @@ impl fmt::Display for ByteArray {
     }
 }
 
+/// Wrapper type for performance reasons, this represents 
`FIXED_LEN_BYTE_ARRAY` but in all other
+/// considerations behaves the same as `ByteArray`
+#[repr(transparent)]
+#[derive(Clone, Debug, Default)]
+pub struct FixedLenByteArray(ByteArray);

Review comment:
       Performance review note:
   
   This type is a little unfortunate, without it the compiler generates code 
that takes quite a big hit on the CPU pipeline. Essentially the previous 
version stalls awaiting the result of `T::get_physical_type() == 
Type::FIXED_LEN_BYTE_ARRAY`.
   
   Its debatable if this is wanted, it is out of spec for what parquet 
documents as its base types, although I feel that enough code paths in the rust 
(and potentially the C++) versions warrant this.
   
   With this wrapper type the compiler generates more targetted code paths 
matching the higher level logical types, removing the data-hazard from all 
decoding and encoding paths.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to