Hi,
I have a Java program packaged as a jar that parses some Spark API and performs a
distcp action.
I want this jar to execute on a fixed interval of 5 minutes (an arbitrary value)
and do its work each time.
What's the easiest way to set up such a window, run the above jar in it, and
make sure it runs every 5
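
One possible approach, sketched here only as an illustration, is a small Java launcher that uses `ScheduledExecutorService` to start the jar at a fixed rate. The class name `PeriodicJarRunner`, the jar path, and the choice to launch the jar with `java -jar` are all assumptions; the actual job may need a different launcher command or extra arguments.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: the class name, jar path, and interval are placeholders.
class PeriodicJarRunner {

    // Builds the command line used to launch the jar in a child JVM.
    static List<String> buildCommand(String jarPath) {
        return Arrays.asList("java", "-jar", jarPath);
    }

    // Runs the jar once, waits for it to finish, and returns its exit code.
    static int runOnce(String jarPath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(jarPath))
                .inheritIO() // forward the child's stdout/stderr to this process
                .start();
        return process.waitFor();
    }

    // Schedules the task at a fixed rate, starting immediately.
    // Runtime exceptions are caught so that one failed run does not cancel later runs.
    static ScheduledExecutorService schedule(Runnable task, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("Run failed: " + e);
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) {
        // Placeholder path; pass the real jar location as the first argument.
        String jarPath = args.length > 0 ? args[0] : "/path/to/job.jar";
        schedule(() -> {
            try {
                System.out.println("Jar exited with code " + runOnce(jarPath));
            } catch (IOException e) {
                System.err.println("Could not start jar: " + e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 5, TimeUnit.MINUTES);
    }
}
```

Because the scheduler is single-threaded, a run that takes longer than the interval delays the next run rather than overlapping with it. If a long-lived launcher process is not wanted, a cron entry such as `*/5 * * * *` that invokes the jar would be a simpler alternative on Unix-like systems.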
+1 to dropping Hadoop 1.x
I am fairly certain there are very few legacy Hadoop users. 2.x is heavily
used at the moment.
Spark actually changed not just Hadoop but Python versions as well.
Hadoop 3 will take a while to mature, so I would suggest holding off on
that until it is well baked in and