Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/1124#issuecomment-46531776
  
    Just tested `- 1024` in `SchedulerBackend`. The system did hang when the task size was close to 10M - 1024 ...
    
    ~~~
    scala> val random = new java.util.Random(0); val a = new Array[Byte](10 * 1024 * 1024 - 110000); random.nextBytes(a); sc.parallelize(0 until 1, 1).map(i => a).count()
    14/06/19 00:33:59 INFO SparkContext: Starting job: count at <console>:14
    14/06/19 00:33:59 INFO DAGScheduler: Got job 5 (count at <console>:14) with 1 output partitions (allowLocal=false)
    14/06/19 00:33:59 INFO DAGScheduler: Final stage: Stage 5(count at <console>:14)
    14/06/19 00:33:59 INFO DAGScheduler: Parents of final stage: List()
    14/06/19 00:33:59 INFO DAGScheduler: Missing parents: List()
    14/06/19 00:33:59 INFO DAGScheduler: Submitting Stage 5 (MappedRDD[11] at map at <console>:14), which has no missing parents
    14/06/19 00:34:00 INFO DAGScheduler: Submitting 1 missing tasks from Stage 5 (MappedRDD[11] at map at <console>:14)
    14/06/19 00:34:00 INFO TaskSchedulerImpl: Adding task set 5.0 with 1 tasks
    14/06/19 00:34:00 INFO TaskSetManager: Starting task 5.0:0 as TID 5 on executor 0: xm.att.net (PROCESS_LOCAL)
    14/06/19 00:34:00 INFO TaskSetManager: Serialized task 5.0:0 as 10431605 bytes in 214 ms
    task size: 10482813
    ~~~
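
    To get a feel for where the serialized size above comes from, the raw Java-serialization overhead of the captured array can be measured outside Spark. This is only a rough sketch (the `TaskSizeProbe` object and `serializedSize` helper are made up here, and the number it prints covers just the array, not the extra task metadata Spark serializes on top):

    ~~~
    import java.io.{ByteArrayOutputStream, ObjectOutputStream}

    object TaskSizeProbe {
      // Serialize an object into an in-memory buffer and report its byte count.
      def serializedSize(obj: AnyRef): Int = {
        val bytes = new ByteArrayOutputStream()
        val out = new ObjectOutputStream(bytes)
        out.writeObject(obj)
        out.close()
        bytes.size()
      }

      def main(args: Array[String]): Unit = {
        // Same payload as the shell session above: 10 * 1024 * 1024 - 110000 bytes.
        val a = new Array[Byte](10 * 1024 * 1024 - 110000)
        // Slightly larger than a.length due to the serialization stream header.
        println(TaskSizeProbe.serializedSize(a))
      }
    }
    ~~~

    The gap between this number and the 10431605 bytes logged by `TaskSetManager` would then be the closure and task metadata that get serialized alongside the array.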

