[ 
https://issues.apache.org/jira/browse/HUDI-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480544#comment-17480544
 ] 

sivabalan narayanan edited comment on HUDI-3262 at 1/23/22, 3:21 AM:
---------------------------------------------------------------------

Guess we need to fix the way our bundles are packaged. For example, I tried to 
query a Hudi table using the hudi-utilities bundle: it succeeds with 0.10.1 but 
fails with master, with the same stack trace as above. This is likely the same 
reason the integ test suite bundle fails to query a Hudi table. 
{code:java}
./bin/spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --jars ~/Documents/personal/projects/apache_hudi_dec/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.10.1-rc2.jar


scala> val df = spark.read.format("hudi").load("/tmp/hudi-deltastreamer-ny/")
scala> df.count
 {code}
 
{code:java}
./bin/spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --jars ~/Documents/personal/projects/nov26/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.11.0-SNAPSHOT.jar

scala> val df = spark.read.format("hudi").load("/tmp/hudi-deltastreamer-ny/")
java.lang.ClassNotFoundException: Failed to find data source: hudi. Please find packages at http://spark.apache.org/third-party-projects.html
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:675)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:213)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:197)
  ... 49 elided
Caused by: java.lang.ClassNotFoundException: hudi.DefaultSource
  at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:652)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:652)
  at scala.util.Try$.apply(Try.scala:192)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:652)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:652)
  at scala.util.Try.orElse(Try.scala:84)
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:652)
  ... 51 more
{code}
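
A possible way to check whether packaging is at fault, offered as a sketch rather than a confirmed diagnosis: Spark resolves a short name such as "hudi" through the java.util.ServiceLoader file META-INF/services/org.apache.spark.sql.sources.DataSourceRegister, and when no registered provider matches it falls back to loading a class named hudi.DefaultSource, which is the class the trace above fails to find. The helper names below are hypothetical, the org.apache.hudi. package prefix is an assumption about how the provider is named, and the last helper assumes unzip is installed.

```shell
#!/bin/sh
# Service file that Spark reads (via java.util.ServiceLoader) to map
# short names such as "hudi" to data source classes.
SERVICE_FILE='META-INF/services/org.apache.spark.sql.sources.DataSourceRegister'

# Hypothetical helper: reads service-file text on stdin and prints the
# registered provider classes, dropping comments, whitespace and blank lines.
registered_sources() {
  sed -e 's/#.*$//' -e 's/[[:space:]]//g' -e '/^$/d'
}

# Hypothetical helper: succeeds if the service-file text on stdin lists a
# provider under org.apache.hudi. (the prefix is an assumption).
registers_hudi() {
  registered_sources | grep -q '^org\.apache\.hudi\.'
}

# Hypothetical helper: prints the service file stored in the jar given as $1.
# Assumes unzip is available; prints nothing if the entry is absent.
service_file_of() {
  unzip -p "$1" "$SERVICE_FILE" 2>/dev/null
}
```

Running `service_file_of <bundle jar> | registers_hudi && echo registered || echo missing` against the 0.10.1 and 0.11.0-SNAPSHOT bundles would show whether the two jars differ in this file. A "missing" result would point at the bundle's shading configuration, but that is only one candidate cause.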
 

was (Author: shivnarayan):
Guess we need to fix the way our bundles are packaged. For example, I tried to 
query a Hudi table using the hudi-utilities bundle: it succeeds with 0.10.1 but 
fails with master. This is likely the same reason the integ test suite bundle 
fails to query a Hudi table. 
{code:java}
./bin/spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --jars ~/Documents/personal/projects/apache_hudi_dec/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.10.1-rc2.jar


scala> val df = spark.read.format("hudi").load("/tmp/hudi-deltastreamer-ny/")
scala> df.count
 {code}

> Integration test suite failure
> ------------------------------
>
>                 Key: HUDI-3262
>                 URL: https://issues.apache.org/jira/browse/HUDI-3262
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: tests-ci
>            Reporter: Raymond Xu
>            Assignee: sivabalan narayanan
>            Priority: Critical
>              Labels: sev:normal
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> detailed in https://github.com/apache/hudi/issues/4621



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
