ayushi-agarwal opened a new issue, #5427: URL: https://github.com/apache/incubator-gluten/issues/5427
### Description

Starting with Spark 3.5, Parquet has been upgraded to 1.13.0, which supports the LZ4_RAW codec. Spark now supports reading and writing Parquet data with this codec (https://github.com/apache/spark/pull/41507). A change in Velox is needed to support reading and writing files with this codec natively.

On read, Velox currently maps the codec to LZ4:

    case thrift::CompressionCodec::LZ4_RAW:
      return common::CompressionKind::CompressionKind_LZ4;

and the read fails with this stack trace:

    01:04:52.170 ERROR org.apache.spark.util.TaskResources: Task 94 failed by error:
    org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxException
    Error Source: RUNTIME
    Error Code: UNKNOWN
    Reason: [Range Constraint Violation : 80>=4030799104] : {} decompression failed, remainingOutputSize is less than decompressedBlockSize, remainingOutputSize: {}, decompressedBlockSize: {}5804030799104
    Retriable: False
    Expression: remainingOutputSize >= decompressedBlockSize
    Context: Split [Hive: file:///tmp/spark-dd5b62c5-00b8-4f3d-815c-337ac74e1916/part-00002-a44b5206-3087-4464-902c-a11df39ae28f-c000.lz4raw.parquet 0 - 846] Task Gluten_Stage_47_TID_94
    Top-Level Context: Same as context.
    Function: decompress
    File: /home/workspace/gluten/ep/build-velox/build/velox_ep/velox/dwio/common/compression/Compression.cpp
    Line: 258
    Stack trace:
    # 0 facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
    # 1 facebook::velox::dwio::common::exception::LoggedException::LoggedException(char const*, unsigned long, char const*, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
    # 2 facebook::velox::dwio::common::compression::(anonymous namespace)::LzoAndLz4DecompressorCommon::decompress(char const*, unsigned long, char*, unsigned long) [clone .cold]
    # 3 facebook::velox::dwio::common::compression::PagedInputStream::readOrSkip(void const**, int*)
    # 4 facebook::velox::dwio::common::SeekableInputStream::readFully(char*, unsigned long)
    # 5 facebook::velox::parquet::PageReader::decompressData(char const*, unsigned int, unsigned int)
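
One plausible reading of the numbers in the error, sketched below under stated assumptions: the LZ4 path expects Hadoop-style framing, where each block is preceded by a 4-byte big-endian decompressed size, while an LZ4_RAW page holds a bare LZ4 block with no such header. If that is the case, the first four bytes of the raw block get read as a size. The logged value 4030799104 is 0xF0411D00, and `F0 41` happens to be the LZ4 token and length byte for a run of exactly 80 literals, which matches the logged remainingOutputSize of 80. The page bytes `1D 00` are inferred from the logged number, not taken from the file, and all function names here are hypothetical, not Velox APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a raw LZ4 block that contains only literals (no matches).
// Per the LZ4 block format: a token byte whose high nibble holds the literal
// length (15 means "extra length bytes follow"), then the extra length bytes,
// then the literals themselves.
std::vector<uint8_t> makeLiteralOnlyLz4Block(const std::vector<uint8_t>& literals) {
  std::vector<uint8_t> block;
  const size_t len = literals.size();
  if (len < 15) {
    block.push_back(static_cast<uint8_t>(len << 4));
  } else {
    block.push_back(0xF0);
    size_t rest = len - 15;
    while (rest >= 255) {
      block.push_back(255);
      rest -= 255;
    }
    block.push_back(static_cast<uint8_t>(rest));
  }
  block.insert(block.end(), literals.begin(), literals.end());
  return block;
}

// Reads a 4-byte big-endian unsigned integer, the way a reader expecting a
// Hadoop-style size header would interpret the start of the buffer.
uint32_t readBigEndian32(const uint8_t* p) {
  return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) |
         (uint32_t{p[2]} << 8) | uint32_t{p[3]};
}

// Hypothetical 80-byte page payload. Only the first two bytes (0x1D, 0x00)
// matter for this illustration; they are inferred from the logged value.
std::vector<uint8_t> makeExamplePage() {
  std::vector<uint8_t> page(80, 0);
  page[0] = 0x1D;
  return page;
}
```

Calling `readBigEndian32(makeLiteralOnlyLz4Block(makeExamplePage()).data())` yields 4030799104, the decompressedBlockSize in the error. If this reading is right, the fix is to give LZ4_RAW its own decompression path that treats the whole page as a single raw block, not to reuse the framed LZ4 path.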
