Spark 2.3.0 supports INSERT OVERWRITE DIRECTORY, which writes the result of
a query directly to the filesystem.

I have run into a problem with the following SQL:

 "INSERT OVERWRITE  DIRECTORY '/tmp/test-insert-spark' select vrid, query,
url, loc_city from custom.common_wap_vr where logdate >= '2018073000' and
logdate <= '2018073023' and vrid = '11000801' group by vrid,query,
loc_city,url;" 

This creates an empty file /tmp/test-insert-spark in HDFS rather than a
directory.

But if I add 'using json' to the SQL:
"INSERT OVERWRITE  DIRECTORY '/tmp/test-insert-spark'  using json select
vrid, query, url, loc_city from custom.common_wap_vr where logdate >=
'2018073000' and logdate <= '2018073023' and vrid = '11000801' group by
vrid,query, loc_city,url;" 

this creates the /tmp/test-insert-spark directory correctly and writes JSON
files into it.
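
For comparison, here is a sketch of the two forms of the statement laid out
side by side, using the same query as above. The STORED AS clause in the first
statement is an assumption added for illustration and is not part of the
statement that produced the empty file. A statement with no USING clause
appears to be parsed as the Hive-format variant, which likely needs Hive
support to be enabled in the session.

```sql
-- Hive-format variant: no USING clause.
-- STORED AS textfile is an assumed addition for illustration only.
INSERT OVERWRITE DIRECTORY '/tmp/test-insert-spark'
STORED AS textfile
SELECT vrid, query, url, loc_city
FROM custom.common_wap_vr
WHERE logdate >= '2018073000'
  AND logdate <= '2018073023'
  AND vrid = '11000801'
GROUP BY vrid, query, loc_city, url;

-- Data source variant: USING names the output format explicitly.
-- This is the form that produced the directory of JSON files.
INSERT OVERWRITE DIRECTORY '/tmp/test-insert-spark'
USING json
SELECT vrid, query, url, loc_city
FROM custom.common_wap_vr
WHERE logdate >= '2018073000'
  AND logdate <= '2018073023'
  AND vrid = '11000801'
GROUP BY vrid, query, loc_city, url;
```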

Am I using it incorrectly? Is there detailed documentation on how to use
it?
 


