Eventually it would be nice for us to have some sort of function to do the
conversion you are talking about on a single column, but for now I usually
hack it as you suggested:
val withId = origRDD.map { case (id, str) =>
  s"""{"id":$id,${str.trim.drop(1)}"""
}
val table = sqlContext.jsonRDD(withId)
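The splice above drops the JSON object's leading `{` and prepends an `"id"` field before re-parsing. A minimal sketch of just that string manipulation, without Spark (the `addId` helper name is illustrative, not part of any API):

```scala
// Hypothetical helper mirroring the map body above: trim the JSON,
// drop its opening '{', and splice in an "id" field at the front.
def addId(id: Long, json: String): String =
  s"""{"id":$id,${json.trim.drop(1)}"""

val spliced = addId(42L, """{"name":"foo","value":1}""")
// spliced is {"id":42,"name":"foo","value":1}
```

Note this assumes each string is a single well-formed JSON object starting with `{`; anything else (arrays, leading whitespace inside the value) would need more careful handling.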
Hi Ayoub,
thanks for your mail!
On Thu, Jan 29, 2015 at 6:23 PM, Ayoub wrote:
>
> SQLContext and HiveContext have a "jsonRDD" method which accepts an
> RDD[String], where each string is a JSON string, and returns a SchemaRDD;
> it extends RDD[Row], which is the type you want.
>
> Afterwards you should be
> schema inference. Is there *any*
> way other than string-manipulating the JSON string and adding the timestamp
> to it?
>
> Thanks
> Tobias
>
--
Hi,
I have data as RDD[(Long, String)], where the Long is a timestamp and the
String is a JSON-encoded string. I want to infer the schema of the JSON and
then do a SQL statement on the data (no aggregates, just column selection
and UDF application), but still have the timestamp associated with each row.
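The data shape described above can be sketched as plain (timestamp, JSON) pairs; the names and values here are purely illustrative:

```scala
// Hypothetical sample matching the description: RDD[(Long, String)],
// where the Long is a timestamp and the String is JSON-encoded.
val events: Seq[(Long, String)] = Seq(
  (1422549780L, """{"user":"alice","action":"click"}"""),
  (1422549781L, """{"user":"bob","action":"view"}""")
)
// In Spark this would typically be created with
//   val eventsRDD = sc.parallelize(events)
// Feeding only the JSON side to jsonRDD, e.g.
//   sqlContext.jsonRDD(eventsRDD.map(_._2))
// infers the schema but discards the Long timestamp,
// which is exactly the problem being asked about.
```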