Hi Sourav,
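For the number of records received per second, you could use something like
this to count the records in each batch and divide that count by your batch
interval in seconds.
yourKafkaStream.foreachRDD(rdd => {
  // batchIntervalSeconds is the batch duration (in seconds) given to the StreamingContext
  val count = rdd.count()
  // Convert to Double so the division is not truncated to a whole number
  println("Current rate = " + (count.toDouble / batchIntervalSeconds) + " records / second")
})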
For the delay metrics, take a look at the StreamingListener interface
<http://spark.incubator.apache.org/docs/latest/api/streaming/index.html#org.apache.spark.streaming.scheduler.StreamingListener>.
You can create a streaming listener object and use
streamingContext.addStreamingListener(listener) to attach it.
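In case it helps, here is a minimal sketch of such a listener. The class name
DelayLoggingListener is made up for illustration, and I am assuming the
BatchInfo fields (schedulingDelay, processingDelay, totalDelay) are exposed as
Option[Long] values in milliseconds, so please check this against the API docs
for your version.

```scala
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Hypothetical listener (the name is illustrative): logs the delays of each completed batch.
class DelayLoggingListener extends StreamingListener {
  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted): Unit = {
    val info = batchCompleted.batchInfo
    // Assumption: each delay is an Option[Long] in milliseconds; print "n/a" when it is not set.
    def show(delay: Option[Long]): String = delay.map(_ + " ms").getOrElse("n/a")
    println("Batch " + info.batchTime +
      ": scheduling delay = " + show(info.schedulingDelay) +
      ", processing delay = " + show(info.processingDelay) +
      ", total delay = " + show(info.totalDelay))
  }
}
```

You would attach it before starting the context, e.g.
streamingContext.addStreamingListener(new DelayLoggingListener). The total
delay is probably the closest thing to the end-to-end time you are asking about.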
On Tue, Feb 4, 2014 at 10:32 PM, Sourav Chandra <
[email protected]> wrote:
> Hi,
>
> We are currently evaluating Spark Streaming for our analytics application.
>
> It reads from Kafka, processes the data, and then persists it into Cassandra.
>
> As part of a PoC project, we need to see the message processing rate of
> Spark, i.e. the end-to-end time taken.
>
> I looked into the metrics but did not find a way to capture this info.
>
> Is there any way to see these metrics?
>
> I am using Spark 0.9.0.
>
> Thanks,
> --
>
> Sourav Chandra