fitermay commented on issue #23986: [SPARK-27070] Fix performance bug in 
DefaultPartitionCoalescer
URL: https://github.com/apache/spark/pull/23986#issuecomment-471147196
 
 
   > Hm! That's surprising. Looking at min vs minBy, it even seems like min has 
more indirection (calls foldLeft). The implicit still involves calling a 
function to compare and get num partitions in both cases. If you're pretty sure 
this is accurate I'm OK returning to the implicit.
   
   @srowen 
   There is some non-obvious indirection here. Below is the relevant bytecode 
that gets generated.
   
   First, the call site sets up the lambda that's passed into `minBy`. Notice 
that the return type of the closure must be `Ljava/lang/Object;`, so it can't 
return a primitive int.
   ```
       LINENUMBER 223 L0
       ALOAD 0
       INVOKEDYNAMIC apply()Lscala/Function1; [
         // handle kind 0x6 : INVOKESTATIC
         java/lang/invoke/LambdaMetafactory.altMetafactory(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
         // arguments:
         (Ljava/lang/Object;)Ljava/lang/Object;, 
         // handle kind 0x6 : INVOKESTATIC
         org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;,
         (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;, 
         7, 
         1, 
         scala.Serializable.class, 
         1, 
         (Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
       ]
       GETSTATIC scala/math/Ordering$Int$.MODULE$ : Lscala/math/Ordering$Int$;
       INVOKEVIRTUAL scala/collection/mutable/ArrayBuffer.minBy (Lscala/Function1;Lscala/math/Ordering;)Ljava/lang/Object;
   ```
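
For reference, here is a minimal Scala sketch of the source shape that would lead to a call site like the one above. `Group` is a hypothetical stand-in for `PartitionGroup`, and `MinByShape`, `key`, and `leastLoaded` are illustrative names, not Spark code.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for Spark's PartitionGroup; only numPartitions matters here.
final class Group(val numPartitions: Int)

object MinByShape {
  // The key function is a Function1[Group, Int]. Function1 is not specialized
  // for reference-typed arguments, so its erased apply method returns Object
  // and the Int result has to be boxed on each call.
  val key: Group => Int = _.numPartitions

  // minBy takes the key function plus an implicit Ordering for the key type,
  // which resolves to Ordering.Int (the GETSTATIC in the dump above).
  def leastLoaded(groups: ArrayBuffer[Group]): Group =
    groups.minBy(key)
}
```

The result is the same as with any other way of finding the minimum; the difference is only in how the key travels through the generic `Function1` interface.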
   
   The lambda first invokes the function below, whose only job is to box the 
primitive int:
   ```
     // access flags 0x1019
     public final static synthetic $anonfun$getLeastGroupHash$3$adapted(Lorg/apache/spark/rdd/PartitionGroup;)Ljava/lang/Object;
       // parameter final  x$7
      L0
       LINENUMBER 223 L0
       ALOAD 0
       INVOKESTATIC org/apache/spark/rdd/DefaultPartitionCoalescer.$anonfun$getLeastGroupHash$3 (Lorg/apache/spark/rdd/PartitionGroup;)I
       INVOKESTATIC scala/runtime/BoxesRunTime.boxToInteger (I)Ljava/lang/Integer;
       ARETURN
      L1
       LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
       MAXSTACK = 1
       MAXLOCALS = 1
   ```
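
Written back as Scala source, the adapter above amounts to roughly the following. This is a hand-written approximation for illustration only: `Group`, `AdaptedSketch`, `body`, and `adapted` are hypothetical names, and the compiler generates the real bridge itself.

```scala
// Hypothetical stand-in for Spark's PartitionGroup.
final class Group(val numPartitions: Int)

object AdaptedSketch {
  // Counterpart of $anonfun$getLeastGroupHash$3: returns a primitive int.
  def body(g: Group): Int = g.numPartitions

  // Counterpart of the $adapted bridge: make the same call, then box the
  // result. Int.box yields a java.lang.Integer, much like boxToInteger.
  def adapted(g: Group): AnyRef = Int.box(body(g))
}
```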
   
   Then the actual method that returns `numPartitions` for the comparison gets 
invoked:
   ```
     public final static synthetic $anonfun$getLeastGroupHash$3(Lorg/apache/spark/rdd/PartitionGroup;)I
       // parameter final  x$7
      L0
       LINENUMBER 223 L0
       ALOAD 0
       INVOKEVIRTUAL org/apache/spark/rdd/PartitionGroup.numPartitions ()I
       IRETURN
      L1
       LOCALVARIABLE x$7 Lorg/apache/spark/rdd/PartitionGroup; L0 L1 0
       MAXSTACK = 1
       MAXLOCALS = 1
   ```
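
For comparison, a sketch of the implicit-ordering approach mentioned in the quoted comment might look like the code below. It is an assumption about the general shape, not the PR's actual code: `Group`, `ImplicitOrderingSketch`, `byNumPartitions`, and `leastLoaded` are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for Spark's PartitionGroup.
final class Group(val numPartitions: Int)

object ImplicitOrderingSketch {
  // compare reads both primitive ints itself, so no key value has to be
  // returned through the generic Function1 interface.
  implicit val byNumPartitions: Ordering[Group] = new Ordering[Group] {
    override def compare(a: Group, b: Group): Int =
      java.lang.Integer.compare(a.numPartitions, b.numPartitions)
  }

  // min picks up the implicit Ordering[Group] defined above.
  def leastLoaded(groups: ArrayBuffer[Group]): Group = groups.min
}
```

Both forms still go through a function call per comparison, as the quoted comment points out; the sketch only shows where the boxed key drops out of the picture.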
   
   
