Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/15249#discussion_r81864312
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -98,6 +84,14 @@ private[spark] class TaskSetManager(
var totalResultSize = 0L
var calculatedTasks = 0
+ val taskSetBlacklistOpt: Option[TaskSetBlacklist] = {
--- End diff ---
yeah, the "taskSet" is there because this code needs both in the later PR.
The taskSetManager needs to communicate some info back to the app-level
blacklist, so it needs to work with both. I could ignore it here and we can
deal with that naming issue when we get there, if you want.
I go back and forth on liking the "Opt" suffix ... I'm happy to go either
way, but I often like "Opt" so that the naming & meaning is clear for code like
```scala
fooOpt.foreach { foo =>
  ...
}
```
e.g.
https://github.com/squito/spark/blob/taskset_blacklist_only/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L594
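As a self-contained sketch of the point above (the `TaskSetBlacklist` case class, `blacklistEnabled` flag, and `stageId` value here are hypothetical, purely for illustration):

```scala
// Hypothetical sketch: the "Opt" suffix signals at the call site that
// the value is an Option, so `foreach` reads as "run this only if the
// value is present" rather than iterating over a collection.
case class TaskSetBlacklist(stageId: Int)

val blacklistEnabled = true // assumed config flag, for illustration
val stageId = 0

val taskSetBlacklistOpt: Option[TaskSetBlacklist] =
  if (blacklistEnabled) Some(TaskSetBlacklist(stageId)) else None

// Without the suffix, `taskSetBlacklist.foreach` could be misread as a
// loop over many blacklists instead of unwrapping a single Option.
taskSetBlacklistOpt.foreach { blacklist =>
  println(s"task set blacklist active for stage ${blacklist.stageId}")
}
```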
the naming pattern is used elsewhere in spark:
```
> find . -name "*.scala" | xargs grep "val .*Opt ="
./core/src/main/scala/org/apache/spark/MapOutputTracker.scala: val arrayOpt = mapStatuses.get(shuffleId)
./core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala: private val underlyingMethodOpt = {
./core/src/main/scala/org/apache/spark/status/api/v1/OneJobResource.scala: val jobOpt = statusToJobs.flatMap(_._2).find { jobInfo => jobInfo.jobId == jobId}
./core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala: val completionTimeOpt = jobUIData.completionTime
./core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala: val metricsOpt = taskUIData.metrics
./examples/src/main/scala/org/apache/spark/examples/graphx/Analytics.scala: val numIterOpt = options.remove("numIter").map(_.toInt)
./external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/SparkAvroCallbackHandler.scala: val transactionExecutorOpt = Option(Executors.newFixedThreadPool(threads,
./graphx/src/main/scala/org/apache/spark/graphx/impl/GraphImpl.scala: val activeDirectionOpt = activeSetOpt.map(_._2)
./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: val intOpt = ctx.freshName("intOpt")
./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: val longOpt = ctx.freshName("longOpt")
./sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: val newConditionOpt = conditionOpt match {
./streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala: private val serializableConfOpt = conf.map(new SerializableConfiguration(_))
./yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala: val hostOpt = allocatedContainerToHostMap.get(containerId)
```
but all that said, I don't feel strongly; I just wanted to explain what I was
thinking. If you still want a change, that's totally fine.