This is a bit strange, because you’re trying to create an RDD inside a
foreach function (over the jsonElements). That function executes on the
workers, so it will actually produce a different instance in each JVM on
each worker, not one
single RDD referenced by the driver, which is what I think you’re try
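
To make the worker-versus-driver point concrete, here is a rough plain-Java simulation that runs without Spark. It only assumes that a function shipped to a worker arrives there as a serialized copy. The names WorkerCopyDemo, CollectTask and shipToWorker are hypothetical and are not part of any Spark API; the list stands in for whatever object the foreach body creates or fills.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: simulates shipping a function to workers. Not Spark code.
class WorkerCopyDemo {

    // Stand-in for the body of a foreach function that builds up some state.
    static class CollectTask implements Serializable {
        final List<String> collected = new ArrayList<>();

        void run(String element) {
            collected.add(element);
        }
    }

    // Simulates sending the task to another JVM: serialize it, then deserialize a fresh copy.
    static CollectTask shipToWorker(CollectTask task) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (CollectTask) in.readObject();
        }
    }

    // Returns the number of collected elements as {driver, worker1, worker2}.
    static int[] simulate() throws IOException, ClassNotFoundException {
        CollectTask driverTask = new CollectTask();

        // Each "worker" receives its own independent copy of the task.
        CollectTask worker1 = shipToWorker(driverTask);
        CollectTask worker2 = shipToWorker(driverTask);

        worker1.run("{\"a\":1}");
        worker2.run("{\"a\":2}");
        worker2.run("{\"a\":3}");

        return new int[] {
            driverTask.collected.size(), worker1.collected.size(), worker2.collected.size()
        };
    }

    public static void main(String[] args) throws Exception {
        int[] sizes = simulate();
        System.out.println(
            "driver=" + sizes[0] + " worker1=" + sizes[1] + " worker2=" + sizes[2]);
    }
}
```

In this simulation each copy ends up with its own state and the driver's instance stays empty, which is the same shape of problem as building an object inside a function that runs on the workers.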
Hello,
I tried to use Spark SQL to analyse JSON data streams within a standalone
application.
Here is the code snippet that receives the streaming data:
final JavaReceiverInputDStream<String> lines =
    streamCtx.socketTextStream("localhost", Integer.parseInt(args[0]),
        StorageLevel.MEMORY_AND_DISK_SER_2());