jmnatzaganian commented on issue #2498: URL: https://github.com/apache/hudi/issues/2498#issuecomment-974507962
I'm seeing the same type of issue on EMR 6.4 after building and deploying Hudi 0.9.0. As mentioned [above](https://github.com/apache/hudi/issues/2498#issuecomment-969228521), the default binaries (EMR 6.4 with Hudi 0.8.0) work just fine, so something is likely off with the build or with how it is referenced. I built with `mvn clean package -DskipTests -Dspark3 -Dscala-2.12 -T 30`.

What's really interesting is that I can create a MoR table without issue, but a `load` appears to succeed and the resulting DataFrame then becomes unusable.

This [tip](https://github.com/apache/hudi/issues/2498#issuecomment-942282671) also worked for me (i.e. using `spark.sql` and referencing the table from the Glue Data Catalog). Unfortunately, querying the data this way seems to be *much* slower than with 0.8.0.

I documented my build and installation process in [this](https://apache-hudi.slack.com/archives/C4D716NPQ/p1637354714476100) Slack thread.
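For anyone comparing the two read paths discussed here, a minimal PySpark sketch follows. It is an illustration only, not the exact code from my job: the helper names `read_by_path` and `read_by_catalog`, the S3 path, and the database and table names are all hypothetical placeholders, and `spark` is assumed to be a `SparkSession` already configured with the Hudi bundle.

```python
def read_by_path(spark, base_path):
    """Path-based read: the `load` call that yields the unusable DataFrame here."""
    # "hudi" is the data source short name registered by the Hudi Spark bundle.
    return spark.read.format("hudi").load(base_path)


def read_by_catalog(spark, database, table):
    """Catalog-based read: the workaround, resolving the table through the metastore."""
    # Assumes the table was synced to the catalog (Glue, in my case).
    return spark.sql(f"SELECT * FROM {database}.{table}")


# Hypothetical usage (placeholder path and names, not real resources):
#   df_path = read_by_path(spark, "s3://my-bucket/hudi/my_table")
#   df_sql = read_by_catalog(spark, "my_db", "my_table")
```

Running the same action (for example `count()`) on both DataFrames should make it easier to tell whether the failure and the slowdown are tied to one read path or to the build itself.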
