LuciferYang commented on code in PR #43890:
URL: https://github.com/apache/spark/pull/43890#discussion_r1399115205


##########
core/src/main/scala/org/apache/spark/resource/ResourceUtils.scala:
##########
@@ -476,7 +476,7 @@ private[spark] object ResourceUtils extends Logging {
       if (maxTaskPerExec < (execAmount * numParts / taskAmount)) {
         val origTaskAmount = treq.amount
         val taskReqStr = s"${origTaskAmount}/${numParts}"
-        val resourceNumSlots = Math.floor(execAmount * numParts / taskAmount).toInt
+        val resourceNumSlots = Math.floor(execAmount * numParts / taskAmount.toDouble).toInt

Review Comment:
   Before this PR, `execAmount * numParts / taskAmount` already produced a Long value. Is there any need to apply `Math.floor` to it? @tgravescs do you have time to help check this logic? Thanks
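   
   For reference, a minimal sketch (with hypothetical values) of why the question arises: when every operand is integral, `/` already truncates, so for the positive quantities involved here `Math.floor` adds nothing.
   
```scala
// Hypothetical values; all operands integral, so `/` truncates already.
val execAmount = 4L
val numParts = 3
val taskAmount = 5L
val viaLongDiv = execAmount * numParts / taskAmount                    // 2L
val viaFloor = Math.floor(execAmount * numParts / taskAmount.toDouble) // 2.0
assert(viaLongDiv == viaFloor.toLong)
```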



##########
common/unsafe/src/test/scala/org/apache/spark/unsafe/types/UTF8StringPropertyCheckSuite.scala:
##########
@@ -80,7 +80,9 @@ class UTF8StringPropertyCheckSuite extends AnyFunSuite with ScalaCheckDrivenProp
 
   test("compare") {
     forAll { (s1: String, s2: String) =>
-      assert(Math.signum(toUTF8(s1).compareTo(toUTF8(s2))) === Math.signum(s1.compareTo(s2)))
+      assert(Math.signum {

Review Comment:
   Is there a particular reason to use `{}` instead of `()` here?
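   
   (A small illustrative snippet, not from the PR: in Scala a single-argument call can take a block in braces, and the two forms are equivalent.)
   
```scala
val (s1, s2) = ("a", "b")
val viaParens = Math.signum(s1.compareTo(s2).toDouble)
val viaBraces = Math.signum { s1.compareTo(s2).toDouble }
assert(viaParens == viaBraces) // braces are usually reserved for multi-line blocks
```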



##########
graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala:
##########
@@ -321,7 +321,7 @@ class PageRankSuite extends SparkFunSuite with LocalSparkContext {
         val rank = if (vid < source) {
           0.0
         } else {
-          a * Math.pow(1 - resetProb, vid - source)
+          a * Math.pow(1.0 - resetProb, vid.toDouble - source)

Review Comment:
   nit: `resetProb` is already a `Double`, so explicitly changing `1` to `1.0` seems unnecessary, right?
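   
   (A quick check, using a made-up value: an `Int` literal mixed with a `Double` operand is promoted automatically.)
   
```scala
val resetProb: Double = 0.15 // hypothetical value
assert((1 - resetProb) == (1.0 - resetProb)) // the Int 1 widens to Double here
```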
   



##########
streaming/src/main/scala/org/apache/spark/streaming/ui/UIUtils.scala:
##########
@@ -67,9 +67,9 @@ private[streaming] object UIUtils {
    * will discard the fractional part.
    */
   def convertToTimeUnit(milliseconds: Long, unit: TimeUnit): Double = unit match {
-    case TimeUnit.NANOSECONDS => milliseconds * 1000 * 1000
-    case TimeUnit.MICROSECONDS => milliseconds * 1000
-    case TimeUnit.MILLISECONDS => milliseconds
+    case TimeUnit.NANOSECONDS => milliseconds * 1000.0 * 1000.0

Review Comment:
   How about unifying these cases with `milliseconds.toDouble`?
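   
   Something like the following sketch; the non-millisecond fallback here is my assumption (delegating to `TimeUnit.convert`), not the actual Spark code.
   
```scala
import java.util.concurrent.TimeUnit

def convertToTimeUnit(milliseconds: Long, unit: TimeUnit): Double = unit match {
  case TimeUnit.NANOSECONDS => milliseconds.toDouble * 1000 * 1000
  case TimeUnit.MICROSECONDS => milliseconds.toDouble * 1000
  case TimeUnit.MILLISECONDS => milliseconds.toDouble
  // assumed fallback for coarser units; the real method may differ
  case other => other.convert(milliseconds, TimeUnit.MILLISECONDS).toDouble
}
```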



##########
core/src/main/scala/org/apache/spark/input/FixedLengthBinaryInputFormat.scala:
##########
@@ -74,7 +74,7 @@ private[spark] class FixedLengthBinaryInputFormat
     if (defaultSize < recordLength) {
       recordLength.toLong
     } else {
-      (Math.floor(defaultSize / recordLength) * recordLength).toLong
+      (Math.floor(defaultSize.toDouble / recordLength.toDouble) * recordLength).toLong

Review Comment:
   1. Is it feasible here to just add `.toDouble` to `defaultSize`?
   2. In the current scenario, both `defaultSize` and `recordLength` should be integers (or longs) greater than 0, so `defaultSize / recordLength` is already a long value. Is there any need to apply `Math.floor` to it?
   
   This is a very old piece of code, do you have time to help check this? @dongjoon-hyun @srowen Thanks
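   
   A sketch of point 2, with hypothetical sizes: for positive integral operands the whole computation can stay in long arithmetic.
   
```scala
val defaultSize = 1000L // hypothetical
val recordLength = 300  // hypothetical
val viaFloor = (Math.floor(defaultSize.toDouble / recordLength) * recordLength).toLong
val viaLongDiv = defaultSize / recordLength * recordLength // rounds down to a multiple
assert(viaFloor == viaLongDiv) // both 900
```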
   
   



##########
core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala:
##########
@@ -165,7 +165,8 @@ private[spark] class Benchmark(
     val best = runTimes.min
     val avg = runTimes.sum / runTimes.size

Review Comment:
   How about just making `avg` a `Double`?
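   
   That is, something along these lines (a sketch; the `runTimes` values are made up):
   
```scala
val runTimes = Seq(3L, 4L, 6L) // hypothetical nanosecond timings
val best = runTimes.min
val avg = runTimes.sum.toDouble / runTimes.size // 4.333..., no later widening needed
```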



##########
core/src/test/scala/org/apache/spark/deploy/history/EventLogTestHelper.scala:
##########
@@ -56,7 +56,7 @@ object EventLogTestHelper {
       eventStr: String,
       desiredSize: Long): Seq[String] = {
     val stringLen = eventStr.getBytes(StandardCharsets.UTF_8).length
-    val repeatCount = Math.floor(desiredSize / stringLen).toInt
+    val repeatCount = Math.floor(desiredSize.toDouble / stringLen).toInt

Review Comment:
   This is the third case related to `Math.floor`: `desiredSize / stringLen` is also a long value, so is it necessary to apply `Math.floor` here?
   
   @HeartSaVioR do you have time to help check this logic? Thanks
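   
   If not, the line reduces to plain long division (sketch with made-up sizes):
   
```scala
val desiredSize = 100L // hypothetical
val stringLen = 7      // hypothetical
val repeatCount = (desiredSize / stringLen).toInt // 14, same as the floored version
```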



##########
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorRollPlugin.scala:
##########
@@ -150,14 +150,15 @@ class ExecutorRollDriverPlugin extends DriverPlugin with Logging {
    * Since we will choose only first item, the duplication is okay.
    */
   private def outliersFromMultipleDimensions(listWithoutDriver: Seq[v1.ExecutorSummary]) =
-    outliers(listWithoutDriver.filter(_.totalTasks > 0), e => e.totalDuration / e.totalTasks) ++
-      outliers(listWithoutDriver, e => e.totalDuration) ++
-      outliers(listWithoutDriver, e => e.totalGCTime) ++
-      outliers(listWithoutDriver, e => e.failedTasks) ++
-      outliers(listWithoutDriver, e => getPeakMetrics(e, "JVMHeapMemory")) ++
-      outliers(listWithoutDriver, e => getPeakMetrics(e, "JVMOffHeapMemory")) 
++
-      outliers(listWithoutDriver, e => e.totalShuffleWrite) ++
-      outliers(listWithoutDriver, e => e.diskUsed)
+    outliers(listWithoutDriver.filter(_.totalTasks > 0),
+      e => e.totalDuration.toFloat / e.totalTasks) ++

Review Comment:
   Directly changing this to `e.totalDuration.toFloat` seems to change the result. It might need @dongjoon-hyun to confirm this change.
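   
   Concretely (a sketch with a made-up duration): `Float` carries only 24 bits of mantissa, so a large `Long` is rounded before the division, whereas the old code did pure long division.
   
```scala
val totalDuration = 123456789L // hypothetical
val totalTasks = 3
val asLong = totalDuration / totalTasks            // 41152263 (old behavior)
val asFloat = totalDuration.toFloat / totalTasks   // 4.1152264E7 (input rounded)
val asDouble = totalDuration.toDouble / totalTasks // 4.1152263E7 (exact here)
```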



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

