viirya commented on code in PR #490:
URL: https://github.com/apache/datafusion-comet/pull/490#discussion_r1619295940


##########
common/src/main/java/org/apache/comet/vector/CometDictionary.java:
##########
@@ -59,121 +46,57 @@ public ValueVector getValueVector() {
   }
 
   public boolean decodeToBoolean(int index) {
-    return booleans[index];
+    return values.getBoolean(index);
   }
 
   public byte decodeToByte(int index) {
-    return bytes[index];
+    return values.getByte(index);
   }
 
   public short decodeToShort(int index) {
-    return shorts[index];
+    return values.getShort(index);
   }
 
   public int decodeToInt(int index) {
-    return ints[index];
+    return values.getInt(index);
   }
 
   public long decodeToLong(int index) {
-    return longs[index];
+    return values.getLong(index);
   }
 
   public float decodeToFloat(int index) {
-    return floats[index];
+    return values.getFloat(index);
   }
 
   public double decodeToDouble(int index) {
-    return doubles[index];
+    return values.getDouble(index);
   }
 
   public byte[] decodeToBinary(int index) {
-    return binaries[index].bytes;
-  }
-
-  public UTF8String decodeToUTF8String(int index) {
-    return strings[index];
-  }
-
-  @Override
-  public void close() {
-    values.close();
-  }
-
-  private void initialize() {
     switch (values.getValueVector().getMinorType()) {
-      case BIT:
-        booleans = new boolean[numValues];
-        for (int i = 0; i < numValues; i++) {
-          booleans[i] = values.getBoolean(i);
-        }
-        break;
-      case TINYINT:
-        bytes = new byte[numValues];
-        for (int i = 0; i < numValues; i++) {
-          bytes[i] = values.getByte(i);
-        }
-        break;
-      case SMALLINT:
-        shorts = new short[numValues];
-        for (int i = 0; i < numValues; i++) {
-          shorts[i] = values.getShort(i);
-        }
-        break;
-      case INT:
-      case DATEDAY:
-        ints = new int[numValues];
-        for (int i = 0; i < numValues; i++) {
-          ints[i] = values.getInt(i);
-        }
-        break;
-      case BIGINT:
-      case TIMESTAMPMICRO:
-      case TIMESTAMPMICROTZ:
-        longs = new long[numValues];
-        for (int i = 0; i < numValues; i++) {
-          longs[i] = values.getLong(i);
-        }
-        break;
-      case FLOAT4:
-        floats = new float[numValues];
-        for (int i = 0; i < numValues; i++) {
-          floats[i] = values.getFloat(i);
-        }
-        break;
-      case FLOAT8:
-        doubles = new double[numValues];
-        for (int i = 0; i < numValues; i++) {
-          doubles[i] = values.getDouble(i);
-        }
-        break;
       case VARBINARY:
       case FIXEDSIZEBINARY:
-        binaries = new ByteArrayWrapper[numValues];
-        for (int i = 0; i < numValues; i++) {
-          binaries[i] = new ByteArrayWrapper(values.getBinary(i));
-        }
-        break;
-      case VARCHAR:
-        strings = new UTF8String[numValues];
-        for (int i = 0; i < numValues; i++) {
-          strings[i] = values.getUTF8String(i);
-        }
-        break;
+        return values.getBinary(index);
       case DECIMAL:
-        binaries = new ByteArrayWrapper[numValues];
-        for (int i = 0; i < numValues; i++) {
-          // Need copying here since we re-use byte array for decimal
-          byte[] bytes = new byte[DECIMAL_BYTE_WIDTH];
-          bytes = values.copyBinaryDecimal(i, bytes);
-          binaries[i] = new ByteArrayWrapper(bytes);
-        }
-        break;
+        byte[] bytes = new byte[DECIMAL_BYTE_WIDTH];

Review Comment:
   > I'm basically trying to understand whether the changes in this PR is 
necessary. Is it trying to address some perf regression we saw recently?
   
   This is motivated by recent regression on CometDictionary after 
re-initiating CometDictionary after re-importing array, though it is not 
directly related. It inspires me that this part of copying might be not good 
for performance.
   
   Maybe we just need to do copying for binary types? For other types like int, 
long, etc, I don't see the good reason to copy the values actually.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to