Sure. Issue https://github.com/apache/incubator-hudi/issues/1546 has been raised.
Thanks,
Selva

On Tue, Apr 21, 2020 at 6:25 PM cooper <[email protected]> wrote:

> hi, periyasamy
> Thanks for asking questions or reporting issues; please describe it in
> detail by filing a GitHub issue:
> https://github.com/apache/incubator-hudi/issues
>
> cooper
>
> selvaraj periyasamy <[email protected]> wrote on Wed, Apr 22,
> 2020 at 7:35 AM:
>
> > Folks,
> >
> > I am using Apache Hudi 0.5.0. Our Hadoop cluster is a mix of Spark
> > 2.3.0, Scala 2.11.8, and Hive 1.2.2. There are multiple use cases
> > already working with Hudi.
> >
> > I need to read a sequence table, which is continuously inserted into
> > (on new partitions) by another process using Hive, not Hudi, and then
> > write this DataFrame into another COW table using Hudi.
> >
> > When I use spark.sql in a spark-shell started with the Hudi jar, I am
> > able to do the select as shown below.
> >
> > spark-shell --jars \
> >   /Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar \
> >   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
> >
> > scala> spark.sql("select * from poc.request_result__ct").show
> >
> > 2020-04-21 15:56:07 WARN ObjectStore:568 - Failed to get database
> > global_temp, returning NoSuchObjectException
> >
> > +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
> > |request_id|prev_request_id|ref_no|type_code|   transaction_date|         process_ts|          commit_ts|header__change_oper|header__partition_name|
> > +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
> > |2020041011|           null|  null|       PA|2020-04-10 11:11:23|2020-04-10 11:11:30|2020-04-10 11:11:35|                  I|  20200117T235000_2...|
> > +----------+---------------+------+---------+-------------------+-------------------+-------------------+-------------------+----------------------+
> >
> > Whereas when I convert the same code into a Scala file and execute it
> > using spark-submit, I get an error. I have attached the error logs.
> >
> > spark-submit --jars \
> >   /Users/seperiya/Downloads/hudi-spark-bundle-0.5.0-incubating.jar \
> >   --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
> >   --class Test /Test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
> >
> > object Test {
> >
> >   def main(args: Array[String]): Unit = {
> >
> >     implicit val sparkSession: SparkSession = SparkUtil.buildSession("Test_" +
> >       now.get(Calendar.HOUR_OF_DAY) + now.get(Calendar.MINUTE) +
> >       now.get(Calendar.SECOND))
> >
> >     sparkSession.sql(s"select * from poc.request_result__ct").show()
> >   }
> > }
> >
> > When I remove the Hudi bundle jar and run the same command, it works.
> >
> > spark-submit --class Test /Test-1.0.0-SNAPSHOT-jar-with-dependencies.jar
> >
> > Even though the Hudi code comes into the picture only when I insert
> > data into the other table, for some reason the read fails. Could
> > anyone shed some light on this issue?
