I don’t think SQLContext is “deprecated” in this sense. It’s still accessible in earlier versions of Spark.
But yes, at first glance it looks like you are correct. I don’t see a recordWriter method for Parquet outside of the SQL package.
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.streaming.DataStreamWriter

Here is an example that uses the SQL context. I believe the SQL context is necessary for strongly typed, self-describing, binary, columnar file formats like Parquet.
https://community.hortonworks.com/articles/72941/writing-parquet-on-hdfs-using-spark-streaming.html

Otherwise you’ll probably be looking at a customWriter.
https://parquet.apache.org/documentation/latest/

AFAIK, if you were to implement a custom writer, you still wouldn’t escape the Parquet formatting paradigm that the DF API solves. Spark needs a way to map data types for Parquet conversion.

Hope this helps,
-Pat

On 2/28/18, 11:09 AM, "karthikus" <aswin8...@gmail.com> wrote:

    Hi all,

    I have a Kafka stream and I need to save the data in Parquet format
    without using Structured Streaming (due to the lack of Kafka message
    header support).

    val kafkaStream = KafkaUtils.createDirectStream(
      streamingContext,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topics, kafkaParams)
    )

    // process the messages
    val messages = kafkaStream.map(record => (record.key, record.value))
    val lines = messages.map(_._2)

    Now, how do I save it as Parquet? All the examples that I have come
    across use SQLContext, which is deprecated.

    Any help appreciated!

    --
    Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

    ---------------------------------------------------------------------
    To unsubscribe e-mail: user-unsubscr...@spark.apache.org
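
For reference, the DStream-to-Parquet route discussed in this thread can be sketched roughly as below. This is only a sketch under assumptions, not a tested solution for the original job: it assumes Spark 2.x, where SparkSession is the entry point that wraps the older SQLContext, and it assumes `lines` is a `DStream[String]` like the one in the question. The object name `ParquetSink`, the helper `saveAsParquet`, the `outputPath` parameter, and the column name "value" are hypothetical names introduced here for illustration.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.streaming.dstream.DStream

object ParquetSink {
  // Hypothetical helper: appends each non-empty micro-batch to outputPath as Parquet.
  def saveAsParquet(lines: DStream[String], outputPath: String): Unit = {
    lines.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        // Reuse (or lazily create) a SparkSession on top of the existing SparkContext.
        val spark = SparkSession.builder
          .config(rdd.sparkContext.getConf)
          .getOrCreate()
        import spark.implicits._

        // Assumption: one string column named "value". A real job would parse
        // each record into a case class so Parquet gets a richer schema.
        val df = rdd.toDF("value")
        df.write.mode(SaveMode.Append).parquet(outputPath)
      }
    }
  }
}
```

It would be called once, before `streamingContext.start()`, for example `ParquetSink.saveAsParquet(lines, "/tmp/kafka-parquet")` (the path is a placeholder). The DataFrame step is what gives Spark the schema it needs to map data types for the Parquet conversion, which is the point made in the reply above.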