Hi, I am new to Spark. I ran into a problem while trying to load a dataset.
I have a dataset in JSON format that I would like to load as an RDD. Because one record may span multiple lines, SparkContext.textFile() does not work here: it treats each line as a separate record. I also tried parsing the JSON manually with json4s and then merging the records into an RDD one by one, but that approach is inconvenient and inefficient. Spark SQL seems to have a JsonRDD, but it appears to be intended for queries only.

Could anyone suggest how to load JSON data as an RDD? For example, given a file path, load the dataset as an RDD[JObject].

Thank you very much!

Regards,
J