cryptoe commented on code in PR #14900:
URL: https://github.com/apache/druid/pull/14900#discussion_r1343704788


##########
processing/src/main/java/org/apache/druid/frame/field/NumericArrayFieldWriter.java:
##########
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.frame.write.FrameWriterUtils;
+import org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Writes the values of the type ARRAY<X> where X is a numeric type to row 
based frames.
+ * The format of the array written is as follows:
+ * <p>
+ * Format:
+ * - 1 Byte - {@link #NULL_ROW} or {@link #NON_NULL_ROW} denoting whether the 
array itself is null
+ * - If the array is null, then the writer stops here
+ * - If the array is not null, then it proceeds to the following steps
+ * <p>
+ * For each value in the non-null array:
+ * - 1 Byte - {@link NumericFieldWriter#ARRAY_ELEMENT_NULL_BYTE} or {@link 
NumericFieldWriter#ARRAY_ELEMENT_NOT_NULL_BYTE}
+ * denothing whether the proceeding value is null or not.
+ * - ElementSize Bytes - The encoded value of the element
+ * <p>
+ * Once all the values in the non-null arrays are over, writes {@link 
#ARRAY_TERMINATOR}. This is to aid the byte
+ * comparison, and also let the reader know that the number of elements in the 
array are over.
+ * <p>
+ * The format doesn't add the number of elements in the array at the 
beginning, though that would have been more
+ * convenient to keep the written array value byte comparable

Review Comment:
   I did not understand the sentence 
   `though that would have been more convenient to keep the written array value 
byte comparable`
   
   Even if we had lengths we could not make the bytes comparable since we sort 
arrays like 
   `[a,a,a] < [b]`



##########
processing/src/main/java/org/apache/druid/frame/field/NumericFieldReader.java:
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.Memory;
+import org.apache.druid.query.extraction.ExtractionFn;
+import org.apache.druid.segment.ColumnValueSelector;
+import org.apache.druid.segment.DimensionSelector;
+import org.apache.druid.segment.column.ValueType;
+import org.apache.druid.segment.column.ValueTypes;
+
+import javax.annotation.Nullable;
+
+/**
+ * Reads the fields created by the {@link NumericFieldWriter}. See the Javadoc 
for the writer for format details
+ */
+public abstract class NumericFieldReader implements FieldReader
+{
+
+  /**
+   * The indicator byte which denotes that the following value is null.
+   */
+  private final byte nullIndicatorByte;
+
+  public NumericFieldReader(boolean forArray)
+  {
+    if (!forArray) {
+      this.nullIndicatorByte = NumericFieldWriter.NULL_BYTE;
+    } else {
+      this.nullIndicatorByte = NumericFieldWriter.ARRAY_ELEMENT_NULL_BYTE;
+    }
+  }
+
+  @Override
+  public ColumnValueSelector<?> makeColumnValueSelector(Memory memory, 
ReadableFieldPointer fieldPointer)
+  {
+    return getColumnValueSelector(memory, fieldPointer, nullIndicatorByte);
+  }
+
+  @Override
+  public DimensionSelector makeDimensionSelector(
+      Memory memory,
+      ReadableFieldPointer fieldPointer,
+      @Nullable ExtractionFn extractionFn
+  )
+  {
+    return ValueTypes.makeNumericWrappingDimensionSelector(
+        getValueType(),
+        makeColumnValueSelector(memory, fieldPointer),
+        extractionFn
+    );
+  }
+
+  @Override
+  public boolean isNull(Memory memory, long position)
+  {
+    return memory.getByte(position) == nullIndicatorByte;
+  }
+
+
+  @Override
+  public boolean isComparable()
+  {
+    return true;
+  }
+
+  public abstract ColumnValueSelector<?> getColumnValueSelector(
+      Memory memory,
+      ReadableFieldPointer fieldPointer,
+      byte nullIndicatorByte
+  );
+
+  public abstract ValueType getValueType();
+
+  public abstract static class Selector

Review Comment:
   Is there a reason why this is not a class of its own?
   Where is this selector being used in the abstract class `NumericFieldReader` 
?



##########
processing/src/main/java/org/apache/druid/frame/field/NumericArrayFieldWriter.java:
##########
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.frame.write.FrameWriterUtils;
+import org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Writes the values of the type ARRAY<X> where X is a numeric type to row 
based frames.
+ * The format of the array written is as follows:
+ * <p>
+ * Format:
+ * - 1 Byte - {@link #NULL_ROW} or {@link #NON_NULL_ROW} denoting whether the 
array itself is null
+ * - If the array is null, then the writer stops here
+ * - If the array is not null, then it proceeds to the following steps
+ * <p>
+ * For each value in the non-null array:
+ * - 1 Byte - {@link NumericFieldWriter#ARRAY_ELEMENT_NULL_BYTE} or {@link 
NumericFieldWriter#ARRAY_ELEMENT_NOT_NULL_BYTE}
+ * denothing whether the proceeding value is null or not.
+ * - ElementSize Bytes - The encoded value of the element
+ * <p>
+ * Once all the values in the non-null arrays are over, writes {@link 
#ARRAY_TERMINATOR}. This is to aid the byte
+ * comparison, and also let the reader know that the number of elements in the 
array are over.
+ * <p>
+ * The format doesn't add the number of elements in the array at the 
beginning, though that would have been more
+ * convenient to keep the written array value byte comparable
+ * <p>
+ * Examples:
+ * 1. null
+ * | Bytes  | Value | Interpretation              |
+ * |--------|-------|-----------------------------|
+ * | 1      | 0x00  | Denotes that the array null |
+ * <p>
+ * 2. [] (empty array)
+ * | Bytes  | Value | Interpretation                     |
+ * |--------|----- -|------------------------------------|
+ * | 1      | 0x01  | Denotes that the array is not null |
+ * | 2      | 0x00  | End of the array                   |
+ * <p>
+ * 3. [5L, null, 6L]
+ * | Bytes   | Value        | Interpretation                                   
                                 |
+ * 
|---------|--------------|-----------------------------------------------------------------------------------|
+ * | 1       | 0x01         | Denotes that the array is not null               
                                 |
+ * | 2       | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 3-10    | transform(5) | Representation of 5                              
                                 |
+ * | 11      | 0x01         | Denotes that the next element is null            
                                 |
+ * | 12-19   | transform(0) | Representation of 0 (default value, the reader 
will ignore it if SqlCompatible mode is on |
+ * | 20      | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 21-28   | transform(6) | Representation of 6                              
                                 |
+ * | 29      | 0x00         | End of array                                     
                                 |
+ */
+public class NumericArrayFieldWriter implements FieldWriter
+{
+
+  /**
+   * Denotes that the array itself is null
+   */
+  public static final byte NULL_ROW = 0x00;
+
+  /**
+   * Denotes that the array is non null
+   */
+  public static final byte NON_NULL_ROW = 0x01;
+
+  /**
+   * Marks the end of the array. Since {@link #NULL_ROW}  and {@link 
#ARRAY_TERMINATOR} will only occur at different
+   * locations, therefore there is no clash in keeping both's values at 0x00
+   */
+  public static final byte ARRAY_TERMINATOR = 0x00;
+
+  private final ColumnValueSelector selector;
+  private final NumericFieldWriterFactory writerFactory;
+
+  /**
+   * Returns the writer for ARRAY<LONG>
+   */
+  public static NumericArrayFieldWriter getLongArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, LongFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<FLOAT>
+   */
+  public static NumericArrayFieldWriter getFloatArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, FloatFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<DOUBLE>
+   */
+  public static NumericArrayFieldWriter getDoubleArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, DoubleFieldWriter::forArray);
+  }
+
+  public NumericArrayFieldWriter(final ColumnValueSelector selector, 
NumericFieldWriterFactory writerFactory)
+  {
+    this.selector = selector;
+    this.writerFactory = writerFactory;
+  }
+
+  @Override
+  public long writeTo(WritableMemory memory, long position, long maxSize)
+  {
+    Object row = selector.getObject();
+    if (row == null) {
+      int requiredSize = Byte.BYTES;
+      if (requiredSize > maxSize) {
+        return -1;
+      }
+      memory.putByte(position, NULL_ROW);
+      return requiredSize;
+    } else {
+
+      List<? extends Number> list = 
FrameWriterUtils.getNumericArrayFromNumericArray(row);
+
+      if (list == null) {
+        int requiredSize = Byte.BYTES;
+        if (requiredSize > maxSize) {
+          return -1;
+        }
+        memory.putByte(position, NULL_ROW);
+        return requiredSize;
+      }
+
+      // Create a columnValueSelector to write the individual elements 
re-using the NumericFieldWriter
+      AtomicInteger index = new AtomicInteger(0);
+      ColumnValueSelector<Number> columnValueSelector = new 
ColumnValueSelector<Number>()

Review Comment:
   Can we push column value selector to another class or a subclass. That would 
make the code more readable. 



##########
processing/src/main/java/org/apache/druid/frame/field/NumericArrayFieldWriter.java:
##########
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.frame.write.FrameWriterUtils;
+import org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Writes the values of the type ARRAY<X> where X is a numeric type to row 
based frames.
+ * The format of the array written is as follows:
+ * <p>
+ * Format:
+ * - 1 Byte - {@link #NULL_ROW} or {@link #NON_NULL_ROW} denoting whether the 
array itself is null
+ * - If the array is null, then the writer stops here
+ * - If the array is not null, then it proceeds to the following steps
+ * <p>
+ * For each value in the non-null array:
+ * - 1 Byte - {@link NumericFieldWriter#ARRAY_ELEMENT_NULL_BYTE} or {@link 
NumericFieldWriter#ARRAY_ELEMENT_NOT_NULL_BYTE}
+ * denothing whether the proceeding value is null or not.
+ * - ElementSize Bytes - The encoded value of the element
+ * <p>
+ * Once all the values in the non-null arrays are over, writes {@link 
#ARRAY_TERMINATOR}. This is to aid the byte
+ * comparison, and also let the reader know that the number of elements in the 
array are over.
+ * <p>
+ * The format doesn't add the number of elements in the array at the 
beginning, though that would have been more
+ * convenient to keep the written array value byte comparable
+ * <p>
+ * Examples:
+ * 1. null
+ * | Bytes  | Value | Interpretation              |
+ * |--------|-------|-----------------------------|
+ * | 1      | 0x00  | Denotes that the array null |
+ * <p>
+ * 2. [] (empty array)
+ * | Bytes  | Value | Interpretation                     |
+ * |--------|----- -|------------------------------------|
+ * | 1      | 0x01  | Denotes that the array is not null |
+ * | 2      | 0x00  | End of the array                   |
+ * <p>
+ * 3. [5L, null, 6L]
+ * | Bytes   | Value        | Interpretation                                   
                                 |
+ * 
|---------|--------------|-----------------------------------------------------------------------------------|
+ * | 1       | 0x01         | Denotes that the array is not null               
                                 |
+ * | 2       | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 3-10    | transform(5) | Representation of 5                              
                                 |
+ * | 11      | 0x01         | Denotes that the next element is null            
                                 |
+ * | 12-19   | transform(0) | Representation of 0 (default value, the reader 
will ignore it if SqlCompatible mode is on |
+ * | 20      | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 21-28   | transform(6) | Representation of 6                              
                                 |
+ * | 29      | 0x00         | End of array                                     
                                 |
+ */
+public class NumericArrayFieldWriter implements FieldWriter
+{
+
+  /**
+   * Denotes that the array itself is null
+   */
+  public static final byte NULL_ROW = 0x00;
+
+  /**
+   * Denotes that the array is non null
+   */
+  public static final byte NON_NULL_ROW = 0x01;
+
+  /**
+   * Marks the end of the array. Since {@link #NULL_ROW}  and {@link 
#ARRAY_TERMINATOR} will only occur at different
+   * locations, therefore there is no clash in keeping both's values at 0x00
+   */
+  public static final byte ARRAY_TERMINATOR = 0x00;
+
+  private final ColumnValueSelector selector;
+  private final NumericFieldWriterFactory writerFactory;
+
+  /**
+   * Returns the writer for ARRAY<LONG>
+   */
+  public static NumericArrayFieldWriter getLongArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, LongFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<FLOAT>
+   */
+  public static NumericArrayFieldWriter getFloatArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, FloatFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<DOUBLE>
+   */
+  public static NumericArrayFieldWriter getDoubleArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, DoubleFieldWriter::forArray);
+  }
+
+  public NumericArrayFieldWriter(final ColumnValueSelector selector, 
NumericFieldWriterFactory writerFactory)
+  {
+    this.selector = selector;
+    this.writerFactory = writerFactory;
+  }
+
+  @Override
+  public long writeTo(WritableMemory memory, long position, long maxSize)
+  {
+    Object row = selector.getObject();
+    if (row == null) {
+      int requiredSize = Byte.BYTES;
+      if (requiredSize > maxSize) {
+        return -1;
+      }
+      memory.putByte(position, NULL_ROW);
+      return requiredSize;
+    } else {
+
+      List<? extends Number> list = 
FrameWriterUtils.getNumericArrayFromNumericArray(row);
+
+      if (list == null) {
+        int requiredSize = Byte.BYTES;
+        if (requiredSize > maxSize) {
+          return -1;
+        }
+        memory.putByte(position, NULL_ROW);
+        return requiredSize;
+      }
+
+      // Create a columnValueSelector to write the individual elements 
re-using the NumericFieldWriter
+      AtomicInteger index = new AtomicInteger(0);
+      ColumnValueSelector<Number> columnValueSelector = new 
ColumnValueSelector<Number>()
+      {
+        @Override
+        public double getDouble()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.doubleValue() : 0d;
+        }
+
+        @Override
+        public float getFloat()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.floatValue() : 0f;
+        }
+
+        @Override
+        public long getLong()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.longValue() : 0L;
+        }
+
+        @Override
+        public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+        {
+
+        }
+
+        @Override
+        public boolean isNull()
+        {
+          // Arrays preserve the individual element's nullity when they are 
written and read.
+          // Therefore, when working with SQL incompatible mode, [7, null] 
won't change to [7, 0] when written to and
+          // read from the underlying serialization (as compared with the 
primitives). Therefore,
+          // even when NullHandling.replaceWithDefault() is true we need to 
write null as is, and not convert it to their
+          // default value when writing the array. Therefore, the check is 
`getObject() == null` ignoring the value of
+          // `NullHandling.replaceWithDefaul()`.
+          return getObject() == null;

Review Comment:
   @clintropolis Could you please review this. 



##########
processing/src/main/java/org/apache/druid/frame/field/NumericFieldWriter.java:
##########
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.segment.BaseNullableColumnValueSelector;
+
+/**
+ * FieldWriter for numeric datatypes. The parent class does the null handling 
for the underlying data, while
+ * the individual subclasses write the individual element (long, float or 
double type). This also allows for a clean
+ * reuse of the readers and writers between the numeric types and also 
allowing the array writers ({@link NumericArrayFieldWriter})
+ * to use these methods directly without duplication
+ *
+ * Format:
+ *  - 1 byte: Whether the following value is null or not. Take a look at the 
note on the indicator bytes.
+ *  - X bytes: Encoded value of the selector, or the default value if it is 
null. X denotes the size of the numeric value
+ *
+ * Indicator bytes for denoting whether the element is null or not null 
changes depending on whether the writer is used
+ * to write the data for individual value (like LONG) or for an element of an 
array (like ARRAY<LONG>). This is because
+ * array support for the numeric types was added later and by then the field 
writers for individual fields were using
+ * 0x00 to denote the null byte, which is reserved for denoting the array end 
when we are writing the elements as part
+ * of the array instead. (0x00 is used for array end because it helps in 
preserving the byte comparison property of the
+ * numeric array field writers).
+ *
+ * Therefore, to preserve backward and forward compatibility, the individual 
element's writers were left unchanged,

Review Comment:
   Please provide and example of the above. This will help user reason about 
the choice of bytes.



##########
processing/src/main/java/org/apache/druid/frame/field/NumericFieldReader.java:
##########
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.Memory;
+import org.apache.druid.query.extraction.ExtractionFn;
+import org.apache.druid.segment.ColumnValueSelector;
+import org.apache.druid.segment.DimensionSelector;
+import org.apache.druid.segment.column.ValueType;
+import org.apache.druid.segment.column.ValueTypes;
+
+import javax.annotation.Nullable;
+
+/**
+ * Reads the fields created by the {@link NumericFieldWriter}. See the Javadoc 
for the writer for format details
+ */
+public abstract class NumericFieldReader implements FieldReader
+{
+
+  /**
+   * The indicator byte which denotes that the following value is null.
+   */
+  private final byte nullIndicatorByte;
+
+  public NumericFieldReader(boolean forArray)
+  {
+    if (!forArray) {
+      this.nullIndicatorByte = NumericFieldWriter.NULL_BYTE;
+    } else {
+      this.nullIndicatorByte = NumericFieldWriter.ARRAY_ELEMENT_NULL_BYTE;
+    }
+  }
+
+  @Override
+  public ColumnValueSelector<?> makeColumnValueSelector(Memory memory, 
ReadableFieldPointer fieldPointer)
+  {
+    return getColumnValueSelector(memory, fieldPointer, nullIndicatorByte);
+  }
+
+  @Override
+  public DimensionSelector makeDimensionSelector(
+      Memory memory,
+      ReadableFieldPointer fieldPointer,
+      @Nullable ExtractionFn extractionFn
+  )
+  {
+    return ValueTypes.makeNumericWrappingDimensionSelector(
+        getValueType(),
+        makeColumnValueSelector(memory, fieldPointer),
+        extractionFn
+    );
+  }
+
+  @Override
+  public boolean isNull(Memory memory, long position)
+  {
+    return memory.getByte(position) == nullIndicatorByte;
+  }
+
+
+  @Override
+  public boolean isComparable()
+  {
+    return true;
+  }
+
+  public abstract ColumnValueSelector<?> getColumnValueSelector(

Review Comment:
   Could you please java doc the abstract class methods. 



##########
processing/src/main/java/org/apache/druid/frame/read/FrameReader.java:
##########
@@ -57,15 +61,28 @@ public class FrameReader
   // Field readers, for row-based frames.
   private final List<FieldReader> fieldReaders;
 
+  /**
+   * Currently, only ROW_BASED frames support numerical array columns, while 
the COLUMNAR frames donot. While creating
+   * a FrameReader, for types unsupported by COLUMNAR frames, we populate this 
field to denote that the FrameReader is
+   * "incomplete" and can't be used to read the columnar frame. However, the 
FrameReader performs as expected for the
+   * row-based frames.
+   * In short, this is a temporary measure till columnar frames support the 
numerical array types to punt the unsupported
+   * type check for the numerical arrays (for COLUMNAR frames only) at the 
usage time, rather than the creation time
+   */
+  private final Optional<Pair<String, ColumnType>> unsupportedColumnAndType;

Review Comment:
   +1 to @gianm comment. I think frameReader should not track the incompatible 
column types. That logic should be pushed to the `fieldReaders`and they should 
just throw impl not found. 



##########
processing/src/main/java/org/apache/druid/frame/field/NumericArrayFieldWriter.java:
##########
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.frame.write.FrameWriterUtils;
+import org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Writes the values of the type ARRAY<X> where X is a numeric type to row 
based frames.
+ * The format of the array written is as follows:
+ * <p>
+ * Format:
+ * - 1 Byte - {@link #NULL_ROW} or {@link #NON_NULL_ROW} denoting whether the 
array itself is null
+ * - If the array is null, then the writer stops here
+ * - If the array is not null, then it proceeds to the following steps
+ * <p>
+ * For each value in the non-null array:
+ * - 1 Byte - {@link NumericFieldWriter#ARRAY_ELEMENT_NULL_BYTE} or {@link 
NumericFieldWriter#ARRAY_ELEMENT_NOT_NULL_BYTE}
+ * denothing whether the proceeding value is null or not.
+ * - ElementSize Bytes - The encoded value of the element
+ * <p>
+ * Once all the values in the non-null arrays are over, writes {@link 
#ARRAY_TERMINATOR}. This is to aid the byte
+ * comparison, and also let the reader know that the number of elements in the 
array are over.
+ * <p>
+ * The format doesn't add the number of elements in the array at the 
beginning, though that would have been more
+ * convenient to keep the written array value byte comparable
+ * <p>
+ * Examples:
+ * 1. null
+ * | Bytes  | Value | Interpretation              |
+ * |--------|-------|-----------------------------|
+ * | 1      | 0x00  | Denotes that the array null |
+ * <p>
+ * 2. [] (empty array)
+ * | Bytes  | Value | Interpretation                     |
+ * |--------|----- -|------------------------------------|
+ * | 1      | 0x01  | Denotes that the array is not null |
+ * | 2      | 0x00  | End of the array                   |
+ * <p>
+ * 3. [5L, null, 6L]
+ * | Bytes   | Value        | Interpretation                                   
                                 |
+ * 
|---------|--------------|-----------------------------------------------------------------------------------|
+ * | 1       | 0x01         | Denotes that the array is not null               
                                 |
+ * | 2       | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 3-10    | transform(5) | Representation of 5                              
                                 |
+ * | 11      | 0x01         | Denotes that the next element is null            
                                 |
+ * | 12-19   | transform(0) | Representation of 0 (default value, the reader 
will ignore it if SqlCompatible mode is on |
+ * | 20      | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 21-28   | transform(6) | Representation of 6                              
                                 |
+ * | 29      | 0x00         | End of array                                     
                                 |
+ */
+public class NumericArrayFieldWriter implements FieldWriter
+{
+
+  /**
+   * Denotes that the array itself is null
+   */
+  public static final byte NULL_ROW = 0x00;
+
+  /**
+   * Denotes that the array is non null
+   */
+  public static final byte NON_NULL_ROW = 0x01;
+
+  /**
+   * Marks the end of the array. Since {@link #NULL_ROW}  and {@link 
#ARRAY_TERMINATOR} will only occur at different
+   * locations, therefore there is no clash in keeping both's values at 0x00
+   */
+  public static final byte ARRAY_TERMINATOR = 0x00;
+
+  private final ColumnValueSelector selector;
+  private final NumericFieldWriterFactory writerFactory;
+
+  /**
+   * Returns the writer for ARRAY<LONG>
+   */
+  public static NumericArrayFieldWriter getLongArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, LongFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<FLOAT>
+   */
+  public static NumericArrayFieldWriter getFloatArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, FloatFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<DOUBLE>
+   */
+  public static NumericArrayFieldWriter getDoubleArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, DoubleFieldWriter::forArray);
+  }
+
+  public NumericArrayFieldWriter(final ColumnValueSelector selector, 
NumericFieldWriterFactory writerFactory)
+  {
+    this.selector = selector;
+    this.writerFactory = writerFactory;
+  }
+
+  @Override
+  public long writeTo(WritableMemory memory, long position, long maxSize)
+  {
+    Object row = selector.getObject();
+    if (row == null) {
+      int requiredSize = Byte.BYTES;
+      if (requiredSize > maxSize) {
+        return -1;
+      }
+      memory.putByte(position, NULL_ROW);
+      return requiredSize;
+    } else {
+
+      List<? extends Number> list = 
FrameWriterUtils.getNumericArrayFromNumericArray(row);
+
+      if (list == null) {
+        int requiredSize = Byte.BYTES;
+        if (requiredSize > maxSize) {
+          return -1;
+        }
+        memory.putByte(position, NULL_ROW);
+        return requiredSize;
+      }
+
+      // Create a columnValueSelector to write the individual elements 
re-using the NumericFieldWriter
+      AtomicInteger index = new AtomicInteger(0);
+      ColumnValueSelector<Number> columnValueSelector = new 
ColumnValueSelector<Number>()
+      {
+        @Override
+        public double getDouble()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.doubleValue() : 0d;
+        }
+
+        @Override
+        public float getFloat()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.floatValue() : 0f;
+        }
+
+        @Override
+        public long getLong()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.longValue() : 0L;
+        }
+
+        @Override
+        public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+        {
+
+        }
+
+        @Override
+        public boolean isNull()
+        {
+          // Arrays preserve the individual element's nullity when they are 
written and read.
+          // Therefore, when working with SQL incompatible mode, [7, null] 
won't change to [7, 0] when written to and
+          // read from the underlying serialization (as compared with the 
primitives). Therefore,
+          // even when NullHandling.replaceWithDefault() is true we need to 
write null as is, and not convert it to their
+          // default value when writing the array. Therefore, the check is 
`getObject() == null` ignoring the value of
+          // `NullHandling.replaceWithDefaul()`.
+          return getObject() == null;
+        }
+
+        @Nullable
+        @Override
+        public Number getObject()
+        {
+          return list.get(index.get());
+        }
+
+        @Override
+        public Class<? extends Number> classOfObject()
+        {
+          return Number.class;
+        }
+      };
+
+      NumericFieldWriter writer = writerFactory.get(columnValueSelector);
+
+      int requiredSize = Byte.BYTES + (writer.getNumericSizeBytes() + 
Byte.BYTES) * list.size() + Byte.BYTES;

Review Comment:
   Could you please a java comment here. 
   I think its [startMarker, fields ..... end Marker] 



##########
processing/src/main/java/org/apache/druid/frame/field/NumericArrayFieldWriter.java:
##########
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.field;
+
+import org.apache.datasketches.memory.WritableMemory;
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.frame.write.FrameWriterUtils;
+import org.apache.druid.query.monomorphicprocessing.RuntimeShapeInspector;
+import org.apache.druid.segment.ColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+
+/**
+ * Writes the values of the type ARRAY<X> where X is a numeric type to row 
based frames.
+ * The format of the array written is as follows:
+ * <p>
+ * Format:
+ * - 1 Byte - {@link #NULL_ROW} or {@link #NON_NULL_ROW} denoting whether the 
array itself is null
+ * - If the array is null, then the writer stops here
+ * - If the array is not null, then it proceeds to the following steps
+ * <p>
+ * For each value in the non-null array:
+ * - 1 Byte - {@link NumericFieldWriter#ARRAY_ELEMENT_NULL_BYTE} or {@link 
NumericFieldWriter#ARRAY_ELEMENT_NOT_NULL_BYTE}
+ * denothing whether the proceeding value is null or not.
+ * - ElementSize Bytes - The encoded value of the element
+ * <p>
+ * Once all the values in the non-null arrays are over, writes {@link 
#ARRAY_TERMINATOR}. This is to aid the byte
+ * comparison, and also let the reader know that the number of elements in the 
array are over.
+ * <p>
+ * The format doesn't add the number of elements in the array at the 
beginning, though that would have been more
+ * convenient to keep the written array value byte comparable
+ * <p>
+ * Examples:
+ * 1. null
+ * | Bytes  | Value | Interpretation              |
+ * |--------|-------|-----------------------------|
+ * | 1      | 0x00  | Denotes that the array null |
+ * <p>
+ * 2. [] (empty array)
+ * | Bytes  | Value | Interpretation                     |
+ * |--------|----- -|------------------------------------|
+ * | 1      | 0x01  | Denotes that the array is not null |
+ * | 2      | 0x00  | End of the array                   |
+ * <p>
+ * 3. [5L, null, 6L]
+ * | Bytes   | Value        | Interpretation                                   
                                 |
+ * 
|---------|--------------|-----------------------------------------------------------------------------------|
+ * | 1       | 0x01         | Denotes that the array is not null               
                                 |
+ * | 2       | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 3-10    | transform(5) | Representation of 5                              
                                 |
+ * | 11      | 0x01         | Denotes that the next element is null            
                                 |
+ * | 12-19   | transform(0) | Representation of 0 (default value, the reader 
will ignore it if SqlCompatible mode is on |
+ * | 20      | 0x02         | Denotes that the next element is not null        
                                 |
+ * | 21-28   | transform(6) | Representation of 6                              
                                 |
+ * | 29      | 0x00         | End of array                                     
                                 |
+ */
+public class NumericArrayFieldWriter implements FieldWriter
+{
+
+  /**
+   * Denotes that the array itself is null
+   */
+  public static final byte NULL_ROW = 0x00;
+
+  /**
+   * Denotes that the array is non null
+   */
+  public static final byte NON_NULL_ROW = 0x01;
+
+  /**
+   * Marks the end of the array. Since {@link #NULL_ROW}  and {@link 
#ARRAY_TERMINATOR} will only occur at different
+   * locations, therefore there is no clash in keeping both's values at 0x00
+   */
+  public static final byte ARRAY_TERMINATOR = 0x00;
+
+  private final ColumnValueSelector selector;
+  private final NumericFieldWriterFactory writerFactory;
+
+  /**
+   * Returns the writer for ARRAY<LONG>
+   */
+  public static NumericArrayFieldWriter getLongArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, LongFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<FLOAT>
+   */
+  public static NumericArrayFieldWriter getFloatArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, FloatFieldWriter::forArray);
+  }
+
+  /**
+   * Returns the writer for ARRAY<DOUBLE>
+   */
+  public static NumericArrayFieldWriter getDoubleArrayFieldWriter(final 
ColumnValueSelector selector)
+  {
+    return new NumericArrayFieldWriter(selector, DoubleFieldWriter::forArray);
+  }
+
+  public NumericArrayFieldWriter(final ColumnValueSelector selector, 
NumericFieldWriterFactory writerFactory)
+  {
+    this.selector = selector;
+    this.writerFactory = writerFactory;
+  }
+
+  @Override
+  public long writeTo(WritableMemory memory, long position, long maxSize)
+  {
+    Object row = selector.getObject();
+    if (row == null) {
+      int requiredSize = Byte.BYTES;
+      if (requiredSize > maxSize) {
+        return -1;
+      }
+      memory.putByte(position, NULL_ROW);
+      return requiredSize;
+    } else {
+
+      List<? extends Number> list = 
FrameWriterUtils.getNumericArrayFromNumericArray(row);
+
+      if (list == null) {
+        int requiredSize = Byte.BYTES;
+        if (requiredSize > maxSize) {
+          return -1;
+        }
+        memory.putByte(position, NULL_ROW);
+        return requiredSize;
+      }
+
+      // Create a columnValueSelector to write the individual elements 
re-using the NumericFieldWriter
+      AtomicInteger index = new AtomicInteger(0);
+      ColumnValueSelector<Number> columnValueSelector = new 
ColumnValueSelector<Number>()
+      {
+        @Override
+        public double getDouble()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.doubleValue() : 0d;
+        }
+
+        @Override
+        public float getFloat()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.floatValue() : 0f;
+        }
+
+        @Override
+        public long getLong()
+        {
+          final Number n = getObject();
+          assert NullHandling.replaceWithDefault() || n != null;
+          return n != null ? n.longValue() : 0L;
+        }
+
+        @Override
+        public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+        {
+
+        }
+
+        @Override
+        public boolean isNull()
+        {
+          // Arrays preserve the individual element's nullity when they are 
written and read.
+          // Therefore, when working with SQL incompatible mode, [7, null] 
won't change to [7, 0] when written to and
+          // read from the underlying serialization (as compared with the 
primitives). Therefore,
+          // even when NullHandling.replaceWithDefault() is true we need to 
write null as is, and not convert it to their
+          // default value when writing the array. Therefore, the check is 
`getObject() == null` ignoring the value of
+          // `NullHandling.replaceWithDefaul()`.
+          return getObject() == null;
+        }
+
+        @Nullable
+        @Override
+        public Number getObject()
+        {
+          return list.get(index.get());
+        }
+
+        @Override
+        public Class<? extends Number> classOfObject()
+        {
+          return Number.class;
+        }
+      };
+
+      NumericFieldWriter writer = writerFactory.get(columnValueSelector);
+
+      int requiredSize = Byte.BYTES + (writer.getNumericSizeBytes() + 
Byte.BYTES) * list.size() + Byte.BYTES;
+
+      if (requiredSize > maxSize) {
+        return -1;
+      }
+
+      long offset = 0;
+      memory.putByte(position + offset, NON_NULL_ROW);
+      offset += Byte.BYTES;
+
+      for (; index.get() < list.size(); index.incrementAndGet()) {

Review Comment:
   Can we iterate the list here ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


Reply via email to