Measuring cluster utilization of a streaming job

2017-11-14 Thread Nadeem Lalani
Hi,

I was wondering whether anyone has done any work on measuring the cluster
resource utilization of a "typical" Spark Streaming job.

We are trying to build a message ingestion system that will read from
Kafka and do some processing. Some concerns have been raised in the team
that a 24x7 streaming job might not be the best use of cluster resources,
especially since our use cases process data in a micro-batch fashion
and are not truly streaming.

We want to measure how much of the cluster's resources a Spark Streaming
job consumes. Any pointers on where to start?

We are on YARN and plan to use Spark 2.1.

Thanks in advance,
Nadeem

