Hive api vs Dataset api

igor.berman Fri, 16 Sep 2016 04:27:44 -0700

Hi,
I would like to understand whether there is any advantage, besides API
syntax, to using the Hive/table API over the Dataset API in Spark SQL (v2.0).
Are there any additional optimizations, for example?
I'm most interested in partitioned Parquet tables stored on S3. Does it make
any difference, given that I'm comfortable with the Dataset API too?


In general, our use case is to stream data into S3, partitioned by some
business keys (3 levels of nesting).
In addition, does the Hive API somehow help with the "small files" problem?
(I'm aware of coalesce.)
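
To make the layout concrete, here is a rough sketch in plain Python (not Spark
code) of the Hive-style key=value directory nesting I have in mind, plus the
worst-case file count that makes me worry about small files. The bucket name,
the three key names, and both helper functions are made up for illustration,
and the file-count formula assumes every write task holds rows for every
partition combination.

```python
def partition_path(base, partition_values):
    """Build a Hive-style partition directory: base/k1=v1/k2=v2/...

    Assumes the values are already safe to use in a path (no escaping is done).
    """
    parts = [f"{key}={value}" for key, value in partition_values]
    return "/".join([base.rstrip("/")] + parts)


def worst_case_file_count(num_write_tasks, num_partition_combinations):
    """Upper bound on files per write, assuming each task writes one file
    into every partition directory (the pessimistic case)."""
    return num_write_tasks * num_partition_combinations


# Hypothetical bucket and business keys, three levels deep.
example = partition_path(
    "s3a://example-bucket/events",
    [("customer", "acme"), ("region", "eu"), ("day", "2016-09-16")],
)
print(example)
print(worst_case_file_count(200, 50))
```

Under that assumption, reducing the number of write tasks (which is what
coalesce does) shrinks the first factor but does nothing about the second.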


Thanks in advance




