Spark 2.3.0 supports INSERT OVERWRITE DIRECTORY, which writes the result of a query directly to the filesystem.
I have run into a problem with the following SQL:

    INSERT OVERWRITE DIRECTORY '/tmp/test-insert-spark'
    select vrid, query, url, loc_city
    from custom.common_wap_vr
    where logdate >= '2018073000' and logdate <= '2018073023' and vrid = '11000801'
    group by vrid,query, loc_city,url;

This creates an empty file /tmp/test-insert-spark in HDFS rather than a directory.

But if I add 'using json' to the SQL:

    INSERT OVERWRITE DIRECTORY '/tmp/test-insert-spark' using json
    select vrid, query, url, loc_city
    from custom.common_wap_vr
    where logdate >= '2018073000' and logdate <= '2018073023' and vrid = '11000801'
    group by vrid,query, loc_city,url;

this creates the /tmp/test-insert-spark directory correctly and writes JSON files into it.

Is this because I am using it in the wrong way? Is there detailed documentation on how to use it?