Hi, I want to read multiple paths into a single RDD.
I know I can do it this way:
sc.sequenceFile("/data/new_rdd_/*", -, -, -)
What if the paths belong to different directories, or maybe even different machines?
Is the only way to read each path into its own RDD
and then join all of them?
But my real requirement is not to join the RDDs but to MERGE them, i.e.
append the 2nd to the 1st, and so on.
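To make "merge" concrete, here is a rough sketch of the behaviour I am after. It assumes that union (sc.union on a list of RDDs) is the right operation for appending, which is part of what I would like confirmed. The object name, the mergeAll helper, the local master, the app name and the parallelize calls are all made up for illustration; they only stand in for RDDs read from real paths.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object MergeSketch {
  // Hypothetical helper: append several RDDs of the same element type into one.
  // Elements are kept as they are; nothing is matched on keys.
  def mergeAll(sc: SparkContext, parts: Seq[RDD[Int]]): RDD[Int] =
    sc.union(parts)

  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for this sketch only.
    val conf = new SparkConf().setMaster("local[2]").setAppName("merge-sketch")
    val sc = new SparkContext(conf)
    try {
      // Stand-ins for two RDDs that would really come from different paths.
      val first = sc.parallelize(Seq(1, 2, 3))
      val second = sc.parallelize(Seq(4, 5, 6))

      // The merged RDD holds the elements of both inputs.
      val merged = mergeAll(sc, Seq(first, second))
      println(merged.count())
    } finally {
      sc.stop()
    }
  }
}
```

If this is the right idea, I assume the same call would work on RDDs returned by separate sc.sequenceFile(...) calls, one per directory.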
What is the best way to do this?
Thanks and Regards,
Archit Thakur.
