oh I see — you are defining your own RDD and Partition types, and you had a bug where partition.index did not line up with the partition's slot in rdd.getPartitions. Is that correct?
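For reference, a minimal sketch of the invariant being discussed (plain Scala, no Spark dependency; `MyPartition` and the standalone `Partition` trait here are illustrative stand-ins, not names from Akhil's code): every slot i of the array returned by getPartitions must hold a partition whose index is exactly i, starting at 0, with no duplicates.

```scala
// The invariant the scheduler relies on: partitions(i).index == i for every
// slot i. MyPartition is a hypothetical stand-in for a custom Partition class.
trait Partition extends Serializable {
  def index: Int
}

class MyPartition(val index: Int) extends Partition

def getPartitions(totalPartitions: Int): Array[Partition] = {
  // Indices start at 0 and are unique -- the reproduction below instead
  // started the ids at 1 and produced duplicates.
  Array.tabulate[Partition](totalPartitions)(i => new MyPartition(i))
}

val parts = getPartitions(4)
assert(parts.zipWithIndex.forall { case (p, i) => p.index == i })
```

With ids built this way there is nothing for the scheduler to mis-track; the reproduction quoted below violates the invariant on both counts (duplicate ids, and ids starting at 1).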
On Thu, Aug 13, 2015 at 2:40 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> I figured that out, and these are my findings:
>
> -> It just enters an infinite loop when there's a duplicate partition id.
>
> -> It enters an infinite loop when the partition id starts from 1 rather than 0.
>
> Something like this piece of code can reproduce it (in getPartitions()):
>
>     val total_partitions = 4
>     val partitionsArray: Array[Partition] =
>       Array.ofDim[Partition](total_partitions)
>
>     var i = 0
>
>     for (outer <- 0 to 1) {
>       for (partition <- 1 to total_partitions) {
>         partitionsArray(i) = new DeadLockPartitions(partition)
>         i = i + 1
>       }
>     }
>
>     partitionsArray
>
> Thanks
> Best Regards
>
> On Wed, Aug 12, 2015 at 10:57 PM, Imran Rashid <iras...@cloudera.com> wrote:
>
>> yikes.
>>
>> Was this a one-time thing? Or does it happen consistently? Can you turn
>> on debug logging for o.a.s.scheduler? (dunno if it will help, but maybe ...)
>>
>> On Tue, Aug 11, 2015 at 8:59 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>
>>> Hi,
>>>
>>> My Spark job (running in local[*] with Spark 1.4.1) reads data from a
>>> thrift server (I created an RDD; it computes the partitions in the
>>> getPartitions() call, and in compute() hasNext returns records from these
>>> partitions). count() and foreach() work fine and return the correct
>>> number of records. But whenever there is a shuffle map stage (like
>>> reduceByKey etc.), all the tasks execute properly but then it enters an
>>> infinite loop saying:
>>>
>>>     15/08/11 13:05:54 INFO DAGScheduler: Resubmitting ShuffleMapStage
>>>     1 (map at FilterMain.scala:59) because some of its tasks had failed:
>>>     0, 3
>>>
>>> Here's the complete stack-trace: http://pastebin.com/hyK7cG8S
>>>
>>> What could be the root cause of this problem? I looked up and bumped
>>> into this closed JIRA <https://issues.apache.org/jira/browse/SPARK-583>
>>> (which is very, very old).
>>>
>>> Thanks
>>> Best Regards