Jefffrey commented on code in PR #5860:
URL: https://github.com/apache/arrow-datafusion/pull/5860#discussion_r1160493739
##########
datafusion/core/src/datasource/file_format/file_type.rs:
##########
@@ -130,6 +136,56 @@ impl FileCompressionType {
None => Into::<DataFusionError>::into(e),
};
+ Ok(match self.variant {
+ #[cfg(feature = "compression")]
+ GZIP => Box::new(
+ ReaderStream::new(AsyncGzEncoder::new(StreamReader::new(s)))
+ .map_err(err_converter),
+ ),
+ #[cfg(feature = "compression")]
+ BZIP2 => Box::new(
+ ReaderStream::new(AsyncBzEncoder::new(StreamReader::new(s)))
+ .map_err(err_converter),
+ ),
+ #[cfg(feature = "compression")]
+ XZ => Box::new(
+ ReaderStream::new(AsyncXzEncoder::new(StreamReader::new(s)))
+ .map_err(err_converter),
+ ),
+ #[cfg(feature = "compression")]
+ ZSTD => Box::new(
+ ReaderStream::new(AsyncZstdEncoer::new(StreamReader::new(s)))
+ .map_err(err_converter),
+ ),
+ #[cfg(not(feature = "compression"))]
+ GZIP | BZIP2 | XZ | ZSTD => {
+ return Err(DataFusionError::NotImplemented(
+ "Compression feature is not enabled".to_owned(),
+ ))
+ }
+ UNCOMPRESSED => Box::new(s),
+ })
+ }
+
+ /// Given a `Stream`, create a `Stream` which data are decompressed with
`FileCompressionType`.
+ pub fn convert_stream<T: Stream<Item = Result<Bytes>> + Unpin + Send +
'static>(
+ &self,
+ s: T,
+ ) -> Result<Box<dyn Stream<Item = Result<Bytes>> + Send + Unpin>> {
+ // #[cfg(feature = "compression")]
+ let err_converter = |e: std::io::Error| match e
+ .get_ref()
+ .and_then(|e| e.downcast_ref::<DataFusionError>())
+ {
+ Some(_) => {
+ *(e.into_inner()
+ .unwrap()
+ .downcast::<DataFusionError>()
+ .unwrap())
+ }
+ None => Into::<DataFusionError>::into(e),
+ };
Review Comment:
I find this syntax confusing and not ideal, but I couldn't find an
alternative when I experimented with it myself.
This unstable feature seems to cover the use case, if we ever get around to
refactoring this once it becomes available:
https://github.com/rust-lang/rust/issues/99262
Some other points:
- Is the first commented-out line (the `cfg` attribute) supposed to be there or not?
- `None => Into::<DataFusionError>::into(e)` -> `None =>
DataFusionError::from(e)` looks cleaner, I think.
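
To make the second point concrete, here is a minimal, self-contained sketch of the converter on stable Rust. `AppError` is a hypothetical stand-in for `DataFusionError` so the snippet compiles on its own; this is only an illustration of the shape, not a tested patch against this PR. It still needs the two `unwrap()` calls, but the `is::<T>()` check keeps the reason they cannot fail next to them:

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical stand-in for `DataFusionError` (assumption, not the real type).
#[derive(Debug)]
pub enum AppError {
    NotImplemented(String),
    Io(io::Error),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::NotImplemented(msg) => write!(f, "not implemented: {msg}"),
            AppError::Io(e) => write!(f, "io error: {e}"),
        }
    }
}

impl Error for AppError {}

impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Io(e)
    }
}

/// Recover an `AppError` that was wrapped inside an `io::Error`;
/// otherwise fall back to the plain `From<io::Error>` conversion.
pub fn err_converter(e: io::Error) -> AppError {
    // Check the payload type first; the borrow ends before `e` is consumed.
    let wraps_app_error = e.get_ref().map_or(false, |inner| inner.is::<AppError>());
    if wraps_app_error {
        // Both unwraps are guarded by the `is::<AppError>()` check above.
        *e.into_inner().unwrap().downcast::<AppError>().unwrap()
    } else {
        AppError::from(e)
    }
}
```

With this shape, an `io::Error` built via `io::Error::new(kind, AppError::NotImplemented(..))` comes back out as the original `NotImplemented` variant, while an ordinary I/O failure ends up as `AppError::Io`.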