fitermay commented on issue #23986: [SPARK-27070] Fix performance bug in DefaultPartitionCoalescer
URL: https://github.com/apache/spark/pull/23986#issuecomment-471147196

> Hm! That's surprising. Looking at min vs minBy, it even seems like min has more indirection (calls foldLeft). The implicit still involves calling a function to compare and get num partitions in both cases. If you're pretty sure this is accurate I'm OK returning to the implicit.

@srowen
There is some non-obvious indirection here. Below is the relevant bytecode that ends up being generated.

This sets up the lambda that's passed into `minBy`. Notice that the return type of the closure must be `Ljava/lang/Object`, so it can't return a primitive int.

```
LINENUMBER 223 L0
ALOAD 0
INVOKEDYNAMIC apply()Lscala/Function1; [
  // handle kind 0x6 : INVOKESTATIC
  java/lang/invoke/LambdaMetafactory.altMetafactory(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
  // arguments:
  (Ljava/lang/Object;)Ljava/lang/Object;,
  // handle kind 0x6 : INVOKESTATIC
  org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;,
  (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;,
  7,
  1,
  scala.Serializable.class,
  1,
  (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
]
GETSTATIC scala/math/Ordering$Int$.MODULE$ : Lscala/math/Ordering$Int$;
INVOKEVIRTUAL scala/collection/mutable/ArrayBuffer.minBy (Lscala/Function1;Lscala/math/Ordering;)Ljava/lang/Object;
```

The lambda first invokes the function below, whose only job is to box the primitive int:

```
// access flags 0x1019
public final static synthetic $anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
  // parameter final x$7
 L0
  LINENUMBER 223 L0
  ALOAD 0
  INVOKESTATIC org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3 (Lorg/apache/spark/rdd/PartitionGroup;)I
  INVOKESTATIC scala/runtime/BoxesRunTime.boxToInteger (I)Ljava/lang/Integer;
  ARETURN
 L1
  LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
  MAXSTACK = 1
  MAXLOCALS = 1
```

Then the actual method that returns `numPartitions` for the comparison gets invoked:

```
public final static synthetic $anonfun$getLeastGroupHash$3(Lorg/apache/spark/rdd/PartitionGroup;)I
  // parameter final x$7
 L0
  LINENUMBER 223 L0
  ALOAD 0
  INVOKEVIRTUAL org/apache/spark/rdd/PartitionGroup.numPartitions ()I
  IRETURN
 L1
  LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
  MAXSTACK = 1
  MAXLOCALS = 1
```
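
For reference, the two call shapes being compared can be sketched in plain Scala. This is only an illustration: `Group` is a hypothetical stand-in for `PartitionGroup`, and the `Ordering` shown is an assumption about what "the implicit" might look like, not the code in this PR.

```scala
// Hypothetical stand-in for org.apache.spark.rdd.PartitionGroup.
final class Group(val numPartitions: Int)

object MinVsMinBy {
  // Assumed shape of "the implicit": compares two groups directly on the
  // primitive Int field, so no key value has to be returned from a closure.
  implicit val byNumPartitions: Ordering[Group] = new Ordering[Group] {
    override def compare(a: Group, b: Group): Int =
      java.lang.Integer.compare(a.numPartitions, b.numPartitions)
  }

  // minBy takes a Function1 key extractor. Per the bytecode in this comment,
  // the lambda is reached through an `$adapted` wrapper that boxes the Int.
  def leastViaMinBy(groups: Seq[Group]): Group =
    groups.minBy(_.numPartitions)

  // min picks up byNumPartitions and never materialises a key object.
  def leastViaMin(groups: Seq[Group]): Group =
    groups.min

  def main(args: Array[String]): Unit = {
    val groups = Seq(new Group(5), new Group(2), new Group(9))
    // Both variants select the same element; only the call path differs.
    assert(leastViaMinBy(groups).numPartitions == 2)
    assert(leastViaMin(groups).numPartitions == 2)
  }
}
```

Both methods return the same group, so any difference between them is purely in the per-element call overhead and would need a benchmark to quantify.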
