Confused about shuffle read and shuffle write

2015-01-21 Thread Darin McBeath
I have the following code in a Spark Job. // Get the baseline input file(s) JavaPairRDD<Text,Text> hsfBaselinePairRDDReadable = sc.hadoopFile(baselineInputBucketFile, SequenceFileInputFormat.class, Text.class, Text.class); JavaPairRDD<String, String> hsfBaselinePairRDD =

Confused about shuffle read and shuffle write

2015-01-20 Thread Darin McBeath
I have the following code in a Spark Job. // Get the baseline input file(s) JavaPairRDD<Text,Text> hsfBaselinePairRDDReadable = sc.hadoopFile(baselineInputBucketFile, SequenceFileInputFormat.class, Text.class, Text.class); JavaPairRDD<String, String> hsfBaselinePairRDD =