vinothchandar edited a comment on issue #751: Clean up poms, unused deps and thinning bundles
URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511621749

Got past that by also including `hive-exec` in the spark-bundle, but hit

```
Caused by: java.lang.ClassNotFoundException: com.uber.hoodie.org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.hive.ql.parse.ParseUtils.ensureClassExists(ParseUtils.java:261)
```

when I actually have it

```
$ jar tf packaging/hoodie-spark-bundle/target/hoodie-spark-bundle-0.4.8-SNAPSHOT.jar | grep MapredParquetOutputFormat
com/uber/hoodie/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.class
```

Unshading hive from the spark-bundle solves this issue, and `mvn verify -DskipUTs=true -B` no longer errors out. But it hangs after

```
264440 [main] INFO com.uber.hoodie.integ.ITTestBase - Got error output for ([/bin/bash, /var/hoodie/ws/docker/demo/setup_demo_container.sh]) :
264440 [main] INFO com.uber.hoodie.integ.ITTestBase - Executing command ([spark-submit, --class, com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer, /var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar, --storage-type, COPY_ON_WRITE, --source-class, com.uber.hoodie.utilities.sources.JsonDFSSource, --source-ordering-field, ts, --target-base-path, /user/hive/warehouse/stock_ticks_cow, --target-table, stock_ticks_cow, --props, /var/demo/config/dfs-source.properties, --schemaprovider-class, com.uber.hoodie.utilities.schema.FilebasedSchemaProvider]) in container /adhoc-1
```

Do you want to give this a shot again in your branch with these fixes? In any case, it seems prudent to first land yours and then do #751 on top.
If you have your hands full with bugs, I can take over your PR as well. lmk
