Hello everyone,
I use Spark Streaming to receive data from Kafka and need to store the data
into Hive. I found the following ways to insert data into Hive on the Internet:
1.use tmp_table
TmpDF = spark.createDataFrame(rdd, schema)
TmpDF.createOrReplaceTempView('TmpData')
spark.sql('insert overwrite table tmp_table select * from TmpData')
2.use DataFrameWriter.insertInto
3.use DataFrameWriter.saveAsTable
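As I understand them so far, the three options would look roughly like this (a sketch only; `spark` is a SparkSession with Hive support, `df` is the DataFrame built from the streaming RDD, and the table name `mydb.events` is just a placeholder):

```python
def write_via_temp_view(spark, df):
    """Approach 1: register a temp view and run INSERT OVERWRITE via SQL."""
    df.createOrReplaceTempView('TmpData')
    spark.sql('insert overwrite table mydb.events select * from TmpData')

def write_via_insert_into(df):
    """Approach 2: DataFrameWriter.insertInto -- the target table must
    already exist; columns are matched by position, not by name."""
    df.write.mode('append').insertInto('mydb.events')

def write_via_save_as_table(df):
    """Approach 3: DataFrameWriter.saveAsTable -- creates the table from
    the DataFrame schema if it does not exist; columns match by name."""
    df.write.mode('append').saveAsTable('mydb.events')
```

My impression is that insertInto appends by column position into an existing table, while saveAsTable can create the table itself, but I'm not sure which is preferred for a streaming job.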
I didn't find many examples, and I don't know whether there is any difference
between them or whether there is a better way to write into Hive. Please give
me some help.
Thank you