By the way, there are a few places one can look for logs while testing:

- Unit test runner logs (should contain driver and worker logs): core/target/unit-tests.log
- Executor logs: work/app-*
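If it helps, here is a quick, throwaway Scala sketch for pulling the exception-related lines out of one of those logs (the object name, the default path, and the strings it matches on are just illustrative; this is not part of Spark):

import scala.io.Source

// Print the first few exception-related lines from a test log so the root
// cause is easier to spot than by scrolling through the whole file.
object FindRootException {
  def main(args: Array[String]): Unit = {
    val path = if (args.nonEmpty) args(0) else "core/target/unit-tests.log"
    Source.fromFile(path)
      .getLines()
      .filter(line => line.contains("Exception") || line.contains("Caused by"))
      .take(20)
      .foreach(println)
  }
}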
Those logs should help you find the root exception when you see one caught by the DAGScheduler, such as in this case.

On Tue, Nov 12, 2013 at 6:21 PM, Kyle Ellrott <kellr...@soe.ucsc.edu> wrote:

> Sure, do you have a URL for your patch?
>
> Kyle
>
> On Nov 12, 2013 5:59 PM, "Xia, Junluan" <junluan....@intel.com> wrote:
>
> > Hi Kyle,
> >
> > I have also built a patch for this issue, and it passes the tests. Could
> > you help review it when you have time?
> >
> > -----Original Message-----
> > From: Kyle Ellrott [mailto:kellr...@soe.ucsc.edu]
> > Sent: Wednesday, November 13, 2013 8:44 AM
> > To: dev@spark.incubator.apache.org
> > Subject: Re: SPARK-942
> >
> > I've posted a patch that I think produces the correct behavior at
> > https://github.com/kellrott/incubator-spark/commit/efe1102c8a7436b2fe112d3bece9f35fedea0dc8
> >
> > It works fine in my programs, but when I run the unit tests I get errors
> > like:
> >
> > [info] - large number of iterations *** FAILED ***
> > [info] org.apache.spark.SparkException: Job aborted: Task 4.0:0 failed
> > more than 0 times; aborting job java.lang.ClassCastException:
> > scala.collection.immutable.StreamIterator cannot be cast to
> > scala.collection.mutable.ArrayBuffer
> > [info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:818)
> > [info] at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:816)
> > [info] at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
> > [info] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> > [info] at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:816)
> > [info] at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:431)
> > [info] at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:493)
> > [info] at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:158)
> >
> > I can't figure out the line number where the original error occurred, or
> > why I can't replicate it in my various test programs. Any help would be
> > appreciated.
> >
> > Kyle
> >
> > On Tue, Nov 12, 2013 at 11:35 AM, Alex Boisvert
> > <alex.boisv...@gmail.com> wrote:
> >
> > > On Tue, Nov 12, 2013 at 11:07 AM, Stephen Haberman
> > > <stephen.haber...@gmail.com> wrote:
> > >
> > > > Huge disclaimer that this is probably a big pita to implement, and
> > > > could likely not be as worthwhile as I naively think it would be.
> > >
> > > My perspective on this is that it's already a big pita for Spark users
> > > today.
> > >
> > > In the absence of explicit directions/hints, Spark should be able to
> > > make ballpark estimates and conservatively pick the number of
> > > partitions, storage strategies (e.g., memory vs. disk), and other
> > > runtime parameters that fit the deployment architecture/capacities. If
> > > this requires code and extra runtime resources for sampling/measuring
> > > data, guesstimating job size, and so on, so be it.
> > >
> > > Users want working jobs first. Optimal performance / resource
> > > utilization follow from that.
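For reference, the kind of explicit directions/hints Alex is describing looks roughly like this in user code today; a minimal sketch, where the input path, partition counts, and storage level are arbitrary placeholders rather than recommendations:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.storage.StorageLevel

object ExplicitHints {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[4]", "explicit-hints-example")

    // Hint the number of input partitions up front instead of relying on defaults.
    val lines = sc.textFile("data.txt", 64)

    // Ask Spark to spill to disk rather than fail when the data doesn't fit in memory.
    val words = lines.flatMap(_.split(" ")).persist(StorageLevel.MEMORY_AND_DISK)

    // Hint the number of reduce-side partitions as well.
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _, 64)

    println(counts.count())
    sc.stop()
  }
}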