Hi Periyasamy,

Thanks for reaching out. For questions or issue reports, please describe the
problem in detail in a GitHub issue:
https://github.com/apache/incubator-hudi/issues
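Before filing it, one thing worth checking (this is an assumption on my side, not a confirmed diagnosis, since the attached error logs are not quoted here): spark-shell and spark-submit can end up with different classpath ordering, and the shaded parquet/avro classes inside the Hudi bundle may conflict with your cluster's Hive 1.2.2 jars even when no Hudi table is read. Also make sure the session built by your `SparkUtil.buildSession` helper calls `.enableHiveSupport()`, since spark-shell enables Hive support automatically when Hive is configured. As a sketch, you could try prepending the bundle via `extraClassPath` (which is documented to prepend, unlike `--jars`) and see whether the behavior changes; paths are the ones from your mail:

```shell
# Sketch, not a confirmed fix: ship the bundle with --jars as before, but also
# prepend it to the driver classpath so its classes win over conflicting ones.
spark-submit \
  --jars /Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar \
  --conf spark.driver.extraClassPath=/Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --class Test \
  /Test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
```

If that changes the error, it points at a classpath/version conflict rather than at the Hudi code itself.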

cooper

selvaraj periyasamy <[email protected]> wrote on Wed, Apr 22, 2020 at
7:35 AM:

> Folks,
>
> I am using Apache Hudi 0.5.0. Our Hadoop cluster is a mix of Spark
> 2.3.0, Scala 2.11.8, and Hive 1.2.2.
> There are multiple use cases already working in Hudi.
>
> I need to read a sequence table that is continuously loaded into new
> partitions by another process using Hive, not Hudi, and then write the
> resulting DataFrame into another COW table using Hudi.
>
> When I use spark.sql in a spark-shell started with the Hudi jar, I am
> able to run the select below.
>
> spark-shell --jars
> /Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar --conf
> 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
>
>
> scala> spark.sql("select * from poc.request_result__ct").show
>
> 2020-04-21 15:56:07 WARN  ObjectStore:568 - Failed to get database
> global_temp, returning NoSuchObjectException
>
>
> +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
> |request_id|prev_request_id|ref_no|type_code|   transaction_date|         process_ts|          commit_ts|header__change_oper|header__partition_name|
> +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
> |2020041011|           null|  null|       PA|2020-04-10 11:11:23|2020-04-10 11:11:30|2020-04-10 11:11:35|                  I|  20200117T235000_2...|
> +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
>
>
>
>
>
> Whereas when I convert the same code into a Scala file and execute it
> with spark-submit, I get an error. The error logs are attached.
>
>
> spark-submit --jars
> /Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar --conf
> 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --class Test
> /Test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
>
>
> import java.util.Calendar
>
> import org.apache.spark.sql.SparkSession
>
> object Test {
>
>   def main(args: Array[String]): Unit = {
>
>     val now = Calendar.getInstance()
>
>     implicit val sparkSession: SparkSession = SparkUtil.buildSession("Test_" +
>       now.get(Calendar.HOUR_OF_DAY) + now.get(Calendar.MINUTE) + now.get(Calendar.SECOND))
>
>     sparkSession.sql("select * from poc.request_result__ct").show()
>
>   }
> }
>
>
> When I remove the Hudi bundle jar and run the same code, it works.
>
>
> spark-submit --class Test /Test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
>
>
> Even though the Hudi code should come into the picture only when I insert
> data into the other table, the read fails for some reason. Could anyone
> shed some light on this issue?
>
>
>
