LuciferYang commented on PR #36925:
URL: https://github.com/apache/spark/pull/36925#issuecomment-1161176292

   > Merged to master. It looks like this wasn't a problem in 3.3.x; this 
change removed the overrides:
   > 
   > https://issues.apache.org/jira/browse/SPARK-38836 #36121
   > 
   > I wonder if we will have other similar issues -- sounds like ++ is OK. 
CCing @minyyy
   
   The default implementation of `++` reuse `concat`:
   
   ```scala
   def concat(that: collection.IterableOnce[A]): C = fromSpecific(that match {
       case that: collection.Iterable[A] => new View.Concat(this, that)
       case _ => iterator.concat(that.iterator)
     })
   ```
   
   So the semantics of `++` should be ok, but if override `concat` method as 
follows, maybe a little quick than default implementation:
   
   ```scala
    override def concat(that: collection.IterableOnce[Expression]): 
ExpressionSet = {
       val newSet = clone()
       that.iterator.foreach(newSet.add)
       newSet
     }
   ```
   
   for example:
   
   ```
   val initialSet = ExpressionSet(aUpper + 1 :: Rand(0) :: Nil)
   val setToAddWithSameDeterministicExpression = ExpressionSet(aUpper + 1 :: 
Rand(0) :: Nil)
   benchmark.addCase("Test ++") { _: Int =>
       for (_ <- 0L until 100000) {
          setToAddWithSameDeterministicExpression ++ initialSet
        }
      }
   ```
   
   default implementation:
   
   ```
   OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Mac OS X 11.4
   Apple M1
   Test ExpressionSet ++ :                   Best Time(ms)   Avg Time(ms)   
Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   
------------------------------------------------------------------------------------------------------------------------
   Test ++                                              10             10       
    0         10.3          96.7       1.0X
   ```
   
   override `concat`:
   
   ```
   OpenJDK 64-Bit Server VM 1.8.0_322-b06 on Mac OS X 11.4
   Apple M1
   Test ExpressionSet ++ :                   Best Time(ms)   Avg Time(ms)   
Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   
------------------------------------------------------------------------------------------------------------------------
   Test ++                                               7              7       
    0         15.0          66.5       1.0X
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to