Github user bomeng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21638#discussion_r215022562
  
    --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
    @@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat[T]
       def setMinPartitions(sc: SparkContext, context: JobContext, minPartitions: Int) {
         val defaultMaxSplitBytes = sc.getConf.get(config.FILES_MAX_PARTITION_BYTES)
         val openCostInBytes = sc.getConf.get(config.FILES_OPEN_COST_IN_BYTES)
    -    val defaultParallelism = sc.defaultParallelism
    +    val defaultParallelism = Math.max(sc.defaultParallelism, minPartitions)
    --- End diff ---
    
    From the code, you can see this calculation is only an intermediate result, and the method does not return a value. Checking the split size does not make sense for this test case because it depends on multiple variables, and defaultParallelism is just one of them.
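    
    To illustrate the point, here is a minimal standalone sketch (not the actual Spark method) of how a split size can be derived from several inputs in the style of setMinPartitions; the object name, parameter names, and example values are hypothetical, chosen only to show that defaultParallelism is one factor among several:
    
    ```scala
    // Hypothetical sketch: the final split size is shaped by the configured
    // maximum, the per-file open cost, and the parallelism, not by any one value.
    object SplitSizeSketch {
      def maxSplitSize(
          defaultMaxSplitBytes: Long, // cf. spark.files.maxPartitionBytes
          openCostInBytes: Long,      // cf. spark.files.openCostInBytes
          defaultParallelism: Int,    // here: max(sc.defaultParallelism, minPartitions)
          fileSizes: Seq[Long]): Long = {
        // Total work: file bytes plus a per-file open cost.
        val totalBytes = fileSizes.map(_ + openCostInBytes).sum
        // Bytes each core would process if the work were spread evenly.
        val bytesPerCore = totalBytes / defaultParallelism
        // Bound the result by the configured maximum and the open cost.
        Math.min(defaultMaxSplitBytes, Math.max(openCostInBytes, bytesPerCore))
      }
    
      def main(args: Array[String]): Unit = {
        // Example: four 64 MB files, default-like config values, parallelism of 8.
        val size = maxSplitSize(
          defaultMaxSplitBytes = 128L * 1024 * 1024,
          openCostInBytes = 4L * 1024 * 1024,
          defaultParallelism = 8,
          fileSizes = Seq.fill(4)(64L * 1024 * 1024))
        println(s"maxSplitSize = $size bytes")
      }
    }
    ```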


---
