: correct use of DStream foreachRDD
I would like to make sure that I am using the DStream foreachRDD operation
correctly. I would like to read from a DStream transform the input and write to
HBase. The code below works , but I became confused when I read Note that the
function func is executed
Yes, for example val sensorRDD = rdd.map(Sensor.parseSensor) is a
line of code executed on the driver; it's part the function you
supplied to foreachRDD. However that line defines an operation on an
RDD, and the map function you supplied (parseSensor) will ultimately
be carried out on the cluster.
I would like to make sure that I am using the DStream foreachRDD
operation correctly. I would like to read from a DStream transform the
input and write to HBase. The code below works , but I became confused
when I read Note that the function *func* is executed in the driver
process ?
val
Thanks, this looks better
// parse the lines of data into sensor objects
val sensorDStream = ssc.textFileStream(/stream).
map(Sensor.parseSensor)
sensorDStream.foreachRDD { rdd =
// filter sensor data for low psi
val alertRDD = rdd.filter(sensor = sensor.psi 5.0)