Hi,

I have data as an RDD[(Long, String)], where the Long is a timestamp and the
String is a JSON-encoded string. I want to infer the schema of the JSON and
then run a SQL statement on the data (no aggregates, just column selection
and UDF application), but still have the timestamp associated with each row
of the result. I completely fail to see how that would be possible. Any
suggestions?

I can't even see how I would get an RDD[(Long, Row)] so that I *might* be
able to add the timestamp to the row after schema inference. Is there *any*
way other than string-manipulating the JSON string and adding the timestamp
to it?

Thanks
Tobias
