[GitHub] [hudi] vinothchandar commented on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

GitBox Wed, 28 Apr 2021 18:47:54 -0700


vinothchandar commented on pull request #2645:
URL: https://github.com/apache/hudi/pull/2645#issuecomment-828888958



   @pengzhiwei2018 Could we make the spark-shell experience better? I think the extensions need to be added by default when the jar is pulled in. Without them, a plain `create table` fails:
   
   ```Scala 
   $ spark-shell --jars $HUDI_SPARK_BUNDLE --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
   
   scala> spark.sql("create table t1 (id int, name string, price double, ts long) using hudi options(primaryKey= 'id', preCombineField = 'ts')").show
   t, returning NoSuchObjectException
   org.apache.hudi.exception.HoodieException: 'path' or 'hoodie.datasource.read.paths' or both must be specified.
     at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:77)
     at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:337)
     at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:78)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
     at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229)
     at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3616)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
     at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
     at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3614)
     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229)
     at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
     at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:606)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
     at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:601)
   ```
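
   For comparison, a minimal sketch of the same launch with the session extension registered by hand. `spark.sql.extensions` is a standard Spark setting; the class name `org.apache.spark.sql.hudi.HoodieSparkSessionExtension` is an assumption here and should be checked against the class this PR actually adds. With it set, the `create table` statement is presumably handled by the Hudi SQL layer instead of falling through to `DefaultSource.createRelation`:

   ```shell
   # Assumption: the extension class name below matches what the PR ships.
   # $HUDI_SPARK_BUNDLE is the same bundle jar path used in the trace above.
   spark-shell --jars "$HUDI_SPARK_BUNDLE" \
     --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
     --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
   ```

   Requiring this extra `--conf` on every launch is the friction the request above is about.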


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
