Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-15 Thread Hyukjin Kwon
Hi, the reason is simply that the JSON data source depends on Hadoop's LineRecordReader when we first try to read the files. There is a workaround described at this link: http://searchdatascience.com/spark-adventures-1-processing-multi-line-json-files/ I hope this is helpful. Thanks! 2016-1
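
To illustrate the point above, here is a minimal sketch in plain Python (not Spark code) of what a line-oriented reader implies: every line is handed to the JSON parser as an independent record, so a pretty-printed document spanning several lines fails line by line even though it is valid JSON as a whole. The helper name parse_line_records and the sample data are hypothetical and exist only for this illustration. In Spark itself, a commonly used approach for multi-line files is to read each file whole (for example with SparkContext.wholeTextFiles) and parse it as a single document; that is an assumption about the general technique, not a summary of the linked post.

```python
import json


def parse_line_records(text):
    """Parse each non-empty line as its own JSON record, as a line-based reader would.

    Returns (records, bad_lines). Hypothetical helper for illustration only.
    """
    records, bad_lines = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines.append(line)  # this line is not valid JSON on its own
    return records, bad_lines


# One complete JSON object per line: the layout a line-based reader expects.
json_lines = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# The same data as a single pretty-printed JSON document spanning many lines.
pretty = json.dumps(
    [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], indent=2
)

ok_records, ok_bad = parse_line_records(json_lines)
pretty_records, pretty_bad = parse_line_records(pretty)

# Parsing the whole text at once succeeds where the per-line pass could not.
whole_document = json.loads(pretty)

print(len(ok_records), len(ok_bad))
print(len(pretty_records), len(pretty_bad))
print(len(whole_document))
```

The per-line pass recovers both records from the first input and none from the pretty-printed one, while parsing the full text recovers both, which is the distinction the workaround relies on.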

Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-15 Thread WangJianfei
Hi devs: I have a question about the design of spark.read.json. Why does it not accept a standard JSON file? Can anyone explain the internal reason? Any advice is appreciated. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Why-the-json-file-used-by-sparkSe

Re: source for org.spark-project.hive:1.2.1.spark2

2016-10-15 Thread Steve Loughran
On 15 Oct 2016, at 01:28, Ryan Blue <rb...@netflix.com.INVALID> wrote: The Spark 2 branch is based on this one: https://github.com/JoshRosen/hive/commits/release-1.2.1-spark2 Didn't know this had moved. I had an outstanding PR against Patrick's which should really go in, if not alre