umehrot2 commented on issue #915: Shade and relocate Avro dependency in 
hadoop-mr-bundle
URL: https://github.com/apache/incubator-hudi/pull/915#issuecomment-534259843
 
 
   > My question to you is: can Hive 2.3.5, as is, support Avro tables (not 
Parquet) that have logical types? If yes, we can look into what we can do to 
get parity.
   
   @vinothchandar I don't think we need to be concerned about whether Hive 
2.3.5 can support Avro tables that have logical types. If this were a problem, 
it would already exist today. Spark 2.4.3 supports a higher version of Avro and 
handles logical types by converting them to fixed-length byte arrays. On the 
Hive 2.3.5 side, I believe it will try to convert these fixed-length byte 
arrays back to its own decimal type. It should not necessarily have to 
understand LogicalType (if I understand correctly).
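
   To illustrate the round trip described above, here is a minimal Python 
sketch of how a decimal value maps to a fixed-length byte array and back. It 
assumes the encoding given in the Avro specification for the decimal logical 
type (the unscaled integer stored as big-endian two's complement, with the 
scale kept in the schema). The function names are hypothetical, and this is not 
the actual Spark or Hive code path.

```python
from decimal import Decimal


def encode_decimal_fixed(value: Decimal, scale: int, size: int) -> bytes:
    """Encode a decimal as a fixed-length, big-endian two's-complement integer.

    Hypothetical helper: the unscaled integer is value * 10**scale. Digits
    beyond `scale` are truncated by int(), so callers should pass values that
    already fit the schema's scale.
    """
    unscaled = int(value.scaleb(scale))  # e.g. 123.45 with scale 2 -> 12345
    return unscaled.to_bytes(size, byteorder="big", signed=True)


def decode_decimal_fixed(raw: bytes, scale: int) -> Decimal:
    """Decode the bytes back to a decimal using only the scale.

    The reader needs the scale from the schema, but nothing else about the
    logical type, to recover the value.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


if __name__ == "__main__":
    raw = encode_decimal_fixed(Decimal("123.45"), scale=2, size=4)
    print(raw.hex())
    print(decode_decimal_fixed(raw, scale=2))
```

   The point of the sketch is that a reader which knows the column's precision 
and scale can recover the decimal from the raw bytes without interpreting the 
logical type annotation itself.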
   
   The problem is that we are already bundling parquet-avro within the bundle 
jars, which makes it really difficult to upgrade the Parquet version. I think 
Hudi should strive to work with its own versions of Parquet and Avro, 
irrespective of the consuming application. This particular change should make 
at least the Avro version used by Hudi common with that of Spark, and we can 
claim to always compile Hudi with the version of Spark that is actually writing 
the dataset.
   
   If you are not confident about this change, I can definitely make it 
configurable, as you suggested. But on the EMR side we will have to maintain 
this change to be able to support Hudi with Spark 2.4.3 and Hive 2.3.5.

