Hi everyone,
I have a Spark (1.6.0-cdh5.7.1) streaming job that receives data from
multiple Kafka topics. After starting the job, everything works fine at first
(around 700 req/sec), but after a while (a couple of days or a week) it starts
processing only part of the data (around 350 req/sec). When
The problem was solved after adding the hadoop-core dependency. But I think there
is a misunderstanding about local files. I found this note:
Note that if you've connected to a Spark master, it's possible that it will
attempt to load the file on one of the different machines in the cluster, so
make sure