Hello,
Code:
ZkState zkState = new ZkState(kafkaConfig);
DynamicBrokersReader kafkaBrokerReader = new DynamicBrokersReader(kafkaConfig, zkState);
int partitionCount = kafkaBrokerReader.getNumPartitions();
SparkConf _sparkConf = new SparkConf().setAppName("KafkaReceiver");
final JavaStreamingContext
You can use Spark SQL for that very easily. You can take the RDDs you get
from the Kafka input stream, convert them to RDDs of case classes, and save
them as Parquet files.
More information here.
https://spark.apache.org/docs/latest/sql-programming-guide.html#parquet-files
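For instance, here is a minimal Java sketch of that idea (not code from the guide), assuming the Spark 1.1-era JavaSQLContext API, where jsonRDD infers a schema from JSON strings; the class name, sample records, and HDFS path are hypothetical stand-ins:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.api.java.JavaSQLContext;
import org.apache.spark.sql.api.java.JavaSchemaRDD;

public class JsonToParquet {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("JsonToParquet");
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaSQLContext sqlContext = new JavaSQLContext(sc);

    // Stand-in for the JSON strings you already consume from Kafka (one object per record).
    JavaRDD<String> jsonLines = sc.parallelize(Arrays.asList(
        "{\"id\":1,\"event\":\"click\"}",
        "{\"id\":2,\"event\":\"view\"}"));

    // jsonRDD infers the schema from the JSON records; the result is then written as Parquet.
    JavaSchemaRDD events = sqlContext.jsonRDD(jsonLines);
    events.saveAsParquetFile("hdfs:///data/events.parquet");

    sc.stop();
  }
}

In a streaming job you would do the same conversion per batch inside foreachRDD instead of on a one-off RDD.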
On Wed, Aug 6, 2014 at 5:23 AM, rafeeq s wrote:
Hello,
I have referred to the link "https://github.com/dibbhatt/kafka-spark-consumer" and
have successfully consumed tuples from Kafka.
The tuples are JSON objects, and I want to store those objects in HDFS in Parquet
format.
Please suggest a sample example for that.
Thanks in advance.
On Tue, Aug
You can use DStream.foreach { rdd => rdd.saveAsHadoopFile(…) } to specify the OutputFormat
you want.
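A minimal sketch of that pattern in Java, assuming a receiver that yields a JavaPairDStream<String, String>; the stream, class name, and HDFS path are made up for illustration:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.api.java.JavaPairDStream;

public class BatchWriter {
  // messages: a hypothetical JavaPairDStream<String, String> built from the Kafka receiver
  public static void writeEachBatch(JavaPairDStream<String, String> messages) {
    messages.foreachRDD(new Function<JavaPairRDD<String, String>, Void>() {
      @Override
      public Void call(JavaPairRDD<String, String> rdd) {
        // Each micro-batch gets its own HDFS directory, since Hadoop will not overwrite an
        // existing output path; swap TextOutputFormat for whichever OutputFormat you need.
        rdd.saveAsHadoopFile("hdfs:///data/kafka-batches/" + System.currentTimeMillis(),
            Text.class, Text.class, TextOutputFormat.class);
        return null;
      }
    });
  }
}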
Thanks
Jerry
From: rafeeq s [mailto:rafeeq.ec...@gmail.com]
Sent: Tuesday, August 05, 2014 5:37 PM
To: Dibyendu Bhattacharya
Cc: u...@spark.incubator.apache.org
Subject: Re: Spark stream data from kafka topics
Thanks Dibyendu.
1. Spark itself has an API jar for Kafka; do we still require manual offset
management (using the simple consumer concept) and a manual consumer?
2. The Kafka Spark Consumer is implemented against Kafka 0.8.0; can we use it
with Kafka 0.8.1?
3. How do we use the Kafka Spark Consumer to produce output?
You can try this Kafka Spark Consumer, which I recently wrote. It uses the
low-level Kafka consumer:
https://github.com/dibbhatt/kafka-spark-consumer
Dibyendu
On Tue, Aug 5, 2014 at 12:52 PM, rafeeq s wrote:
> Hi,
>
> I am new to Apache Spark and trying to develop a Spark Streaming program to