gtwuser opened a new issue, #5636:
URL: https://github.com/apache/hudi/issues/5636

   **Describe the problem you faced**
   We need to use a higher version of the Spark libraries in order to support 
casting array<string> to array<null>. Because we don't know which combination 
of hudi-spark-bundle and spark-avro jars would work, we are stuck with 
Glue 2.0 and Spark 2.4. 
   The jars currently used for creating Hudi tables in the Glue catalog are 
listed below.
   Setup/Env config:
   
   AWS Glue 2.0,
   Python 3,
   Spark 2.4
   External dependency jars for connecting AWS Glue and Hudi:
   1. httpclient-4.5.9.jar
   2. hudi-spark-bundle_2.11-0.8.0.jar
   3. spark-avro_2.11-2.4.4.jar
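
   For reference, a minimal sketch of how the three jars listed above could be handed to a Glue job through the `--extra-jars` job parameter, which takes a comma-separated list of S3 paths. The bucket and prefix `s3://example-bucket/jars` and the helper name `build_extra_jars` are assumptions made for illustration only; they are not part of our actual setup.

```python
# Jar names are copied from the list above.
JARS = [
    "httpclient-4.5.9.jar",
    "hudi-spark-bundle_2.11-0.8.0.jar",
    "spark-avro_2.11-2.4.4.jar",
]


def build_extra_jars(s3_prefix, jar_names):
    """Join jar names into the comma-separated S3 path list used by --extra-jars."""
    prefix = s3_prefix.rstrip("/")  # tolerate a trailing slash on the prefix
    return ",".join(f"{prefix}/{name}" for name in jar_names)


# Hypothetical S3 location; replace with wherever the jars are really uploaded.
default_arguments = {
    "--extra-jars": build_extra_jars("s3://example-bucket/jars", JARS),
}

print(default_arguments["--extra-jars"])
```

   The resulting dictionary has the shape of a Glue job's default arguments. Moving to Glue 3.0 would mean swapping the jar names in `JARS` for builds that match its Spark and Scala versions, and identifying those builds is the open question in this issue.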
   
   
   
   We have a use case in which received records carry an empty array as the 
value in a few columns, and we need to update the schema of those columns to 
the array<null> type. 
   
   
   Reference link for the issue: 
   
https://stackoverflow.com/questions/72294587/how-to-automate-casting-of-empty-arraystring-elements-to-arraystruct-eleme
   
   Ultimately, we want to know which versions of `hudi-spark-bundle.jar` and 
`spark-avro.jar` to use so that we can switch to Glue 3.0, which runs on 
Spark 3.1 internally.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
