Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19847#discussion_r153823326
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala ---
@@ -141,29 +136,35 @@ class VectorizedHashMapGenerator(
}
/**
- * Generates a method that returns a mutable
- * [[org.apache.spark.sql.execution.vectorized.ColumnarRow]] which keeps track of the
+ * Generates a method that returns a
+ * [[org.apache.spark.sql.execution.vectorized.MutableColumnarRow]] which keeps track of the
* aggregate value(s) for a given set of keys. If the corresponding row doesn't exist, the
* generated method adds the corresponding row in the associated
- * [[org.apache.spark.sql.execution.vectorized.ColumnarBatch]]. For instance, if we
+ * [[org.apache.spark.sql.execution.vectorized.OnHeapColumnVector]]. For instance, if we
* have 2 long group-by keys, the generated function would be of the form:
*
* {{{
- * public org.apache.spark.sql.execution.vectorized.ColumnarRow findOrInsert(
- * long agg_key, long agg_key1) {
+ * public MutableColumnarRow findOrInsert(long agg_key, long agg_key1) {
* long h = hash(agg_key, agg_key1);
* int step = 0;
* int idx = (int) h & (numBuckets - 1);
* while (step < maxSteps) {
* // Return bucket index if it's either an empty slot or already contains the key
* if (buckets[idx] == -1) {
- * batchVectors[0].putLong(numRows, agg_key);
- * batchVectors[1].putLong(numRows, agg_key1);
- * batchVectors[2].putLong(numRows, 0);
- * buckets[idx] = numRows++;
- * return batch.getRow(buckets[idx]);
+ * if (numRows < capacity) {
--- End diff --
Update the comment to match the real code.
---
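For context, the probing logic shown in the diff above can be sketched as a standalone, simplified analogue. This is a hedged illustration only: the class and field names (`SimpleLongMap`, `keys`, `values`) are hypothetical and this is not Spark's actual generated code, which operates on column vectors and multiple keys.

```java
import java.util.Arrays;

// Simplified sketch of the open-addressing findOrInsert scheme discussed
// in the diff: hash the key, then linear-probe up to maxSteps buckets.
public class SimpleLongMap {
    private final int numBuckets = 1 << 4;             // must be a power of two
    private final int maxSteps = numBuckets;           // give up after a full sweep
    private final int capacity = numBuckets;           // max rows we can store
    private final int[] buckets = new int[numBuckets]; // -1 marks an empty slot
    private final long[] keys = new long[capacity];    // row storage for keys
    private final long[] values = new long[capacity];  // row storage for aggregates
    private int numRows = 0;

    public SimpleLongMap() { Arrays.fill(buckets, -1); }

    private long hash(long key) { return key * 0x9E3779B97F4A7C15L; }

    /**
     * Returns the row index for the key, inserting a fresh row if absent.
     * Returns -1 when the map is full or probing exceeds maxSteps, in which
     * case a real implementation would fall back to a regular hash map.
     */
    public int findOrInsert(long key) {
        long h = hash(key);
        int step = 0;
        int idx = (int) h & (numBuckets - 1);
        while (step < maxSteps) {
            if (buckets[idx] == -1) {
                // Empty slot: insert only if there is still row capacity,
                // mirroring the `if (numRows < capacity)` check in the diff.
                if (numRows < capacity) {
                    keys[numRows] = key;
                    values[numRows] = 0;       // initial aggregate value
                    buckets[idx] = numRows++;
                    return buckets[idx];
                }
                return -1;                     // batch is full
            } else if (keys[buckets[idx]] == key) {
                return buckets[idx];           // key already present
            }
            idx = (idx + 1) & (numBuckets - 1); // linear probing: next bucket
            step++;
        }
        return -1;
    }

    public static void main(String[] args) {
        SimpleLongMap m = new SimpleLongMap();
        int r1 = m.findOrInsert(42L);
        int r2 = m.findOrInsert(42L);  // same key resolves to the same row
        int r3 = m.findOrInsert(7L);   // distinct key gets a distinct row
        System.out.println(r1 == r2 && r1 != r3);
    }
}
```

The capacity check before inserting is the point of the review thread: the new generated code guards the insert with `numRows < capacity`, which the surrounding doc comment should reflect.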