[GitHub] spark pull request #19452: [SPARK-22136][SS] Evaluate one-sided conditions e...

tdas Thu, 12 Oct 2017 16:32:06 -0700

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19452#discussion_r144435539
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala ---
    @@ -221,43 +237,36 @@ case class StreamingSymmetricHashJoinExec(
         //    matching new left input with new right input, since the new left input has become stored
         //    by that point. This tiny asymmetry is necessary to avoid duplication.
         val leftOutputIter = leftSideJoiner.storeAndJoinWithOtherSide(rightSideJoiner) {
    -      (input: UnsafeRow, matched: UnsafeRow) => joinedRow.withLeft(input).withRight(matched)
    +      (input: InternalRow, matched: InternalRow) => joinedRow.withLeft(input).withRight(matched)
         }
         val rightOutputIter = rightSideJoiner.storeAndJoinWithOtherSide(leftSideJoiner) {
    -      (input: UnsafeRow, matched: UnsafeRow) => joinedRow.withLeft(matched).withRight(input)
    +      (input: InternalRow, matched: InternalRow) => joinedRow.withLeft(matched).withRight(input)
         }
     
    -    // Filter the joined rows based on the given condition.
    -    val outputFilterFunction = newPredicate(condition.getOrElse(Literal(true)), output).eval _
    -
         // We need to save the time that the inner join output iterator completes, since outer join
         // output counts as both update and removal time.
         var innerOutputCompletionTimeNs: Long = 0
         def onInnerOutputCompletion = {
           innerOutputCompletionTimeNs = System.nanoTime
         }
    -    val filteredInnerOutputIter = CompletionIterator[InternalRow, Iterator[InternalRow]](
    -      (leftOutputIter ++ rightOutputIter).filter(outputFilterFunction), onInnerOutputCompletion)
    +    val innerOutputIter = CompletionIterator[InternalRow, Iterator[InternalRow]](
    --- End diff --
    
    nit: Can you add a description of what "inner output" is? The code is
    getting more complex, so I think it would be better to add more docs.
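
For readers following the thread, here is a minimal sketch of the pattern this hunk touches. Judging only from the code comment in the diff, "inner output" appears to mean the rows produced by the inner-join part (left-side and right-side matches concatenated), whose completion time is recorded because outer join output counts as both update and removal time. `OnCompletionIterator` and `InnerOutputSketch` are hypothetical stand-ins written for this illustration, not Spark's `CompletionIterator`, and plain strings stand in for joined rows.

```scala
// Hypothetical stand-in for a completion-aware iterator (NOT Spark's CompletionIterator).
// It invokes `onCompletion` exactly once, the first time the wrapped iterator is exhausted.
final class OnCompletionIterator[A](underlying: Iterator[A], onCompletion: () => Unit)
    extends Iterator[A] {
  private var completed = false

  override def hasNext: Boolean = {
    val more = underlying.hasNext
    if (!more && !completed) {
      completed = true
      onCompletion()
    }
    more
  }

  override def next(): A = underlying.next()
}

object InnerOutputSketch {
  def main(args: Array[String]): Unit = {
    // Recorded when the inner join output has been fully consumed.
    var innerOutputCompletionTimeNs: Long = 0L

    // Placeholder "joined rows"; in the diff these come from storeAndJoinWithOtherSide.
    val leftOutputIter = Iterator("left1+right1", "left2+right1")
    val rightOutputIter = Iterator("left1+right2")

    // "Inner output" in this sketch: all inner-join matches from both sides.
    val innerOutputIter = new OnCompletionIterator[String](
      leftOutputIter ++ rightOutputIter,
      () => { innerOutputCompletionTimeNs = System.nanoTime() })

    val rows = innerOutputIter.toList
    println(s"inner rows: ${rows.size}, callback fired: ${innerOutputCompletionTimeNs != 0L}")
  }
}
```

A doc comment along these lines next to `innerOutputIter` would let a reader see at a glance which rows the completion timestamp covers.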


