cecemei commented on code in PR #18722:
URL: https://github.com/apache/druid/pull/18722#discussion_r2604049544


##########
processing/src/main/java/org/apache/druid/segment/column/BitmapIndexEncodingStrategy.java:
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.column;
+
+import com.fasterxml.jackson.annotation.JsonSubTypes;
+import com.fasterxml.jackson.annotation.JsonTypeInfo;
+import org.apache.druid.collections.bitmap.BitmapFactory;
+import org.apache.druid.collections.bitmap.ImmutableBitmap;
+import org.apache.druid.collections.bitmap.MutableBitmap;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.segment.data.GenericIndexedWriter;
+import org.apache.druid.segment.file.SegmentFileBuilder;
+import org.apache.druid.segment.serde.Serializer;
+
+import javax.annotation.Nullable;
+import java.io.IOException;
+import java.nio.channels.WritableByteChannel;
+import java.util.Objects;
+
+@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
+@JsonSubTypes({
+    @JsonSubTypes.Type(value = BitmapIndexEncodingStrategy.DictionaryId.class, 
name = "dictionaryId"),
+    @JsonSubTypes.Type(value = BitmapIndexEncodingStrategy.NullsOnly.class, 
name = "nullsOnly")
+})
+public abstract class BitmapIndexEncodingStrategy implements Serializer
+{
+  /**
+   * Assigned in {@link #init(BitmapFactory, int)}
+   */
+  @Nullable
+  protected MutableBitmap[] bitmaps;
+  /**
+   * Assigned in {@link #close(BitmapFactory, GenericIndexedWriter)}
+   */
+  @Nullable
+  GenericIndexedWriter<ImmutableBitmap> writer;
+
+  public abstract void init(BitmapFactory bitmapFactory, int dictionarySize);
+
+  public abstract void add(int row, int sortedId, @Nullable Object o);
+
+  public void close(BitmapFactory bitmapFactory, 
GenericIndexedWriter<ImmutableBitmap> writer) throws IOException

Review Comment:
   renamed to finalize



##########
processing/src/main/java/org/apache/druid/segment/nested/GlobalDictionaryEncodedFieldColumnWriter.java:
##########
@@ -187,7 +189,11 @@ long getSerializedColumnSize() throws IOException
    * Defines how to write the column, including the dictionary id column, 
along with any additional columns
    * such as the long or double value column as type appropriate.
    */
-  abstract void writeColumnTo(WritableByteChannel channel, SegmentFileBuilder 
fileBuilder) throws IOException;
+  void writeColumnTo(WritableByteChannel channel, SegmentFileBuilder 
fileBuilder) throws IOException
+  {
+    writeLongAndDoubleColumnLength(channel, 0, 0);
+    encodedValueSerializer.writeTo(channel, fileBuilder);
+  }

Review Comment:
   i was jumping through classes to compare `getSerializedColumnSize` and 
`writeColumnTo` and thought it'd be nice if inside the abstraction class these 
could all match. so like either overwrite all or overwrite none.



##########
processing/src/main/java/org/apache/druid/segment/nested/GlobalDictionaryEncodedFieldColumnWriter.java:
##########
@@ -272,21 +272,21 @@ public long getSerializedSize() throws IOException
         } else {
           arraySize = 0;
         }
-        return 1 + Integer.BYTES +
+        return 1 + Integer.BYTES + // version + feature flags
                sortedDictionaryWriter.getSerializedSize() +
-               bitmapIndexWriter.getSerializedSize() +
-               arraySize +
-               getSerializedColumnSize();
+               getSerializedColumnSize() +
+               bitmapIndexEncoding.getSerializedSize() +
+               arraySize;
       }
 
       @Override
       public void writeTo(WritableByteChannel channel, SegmentFileBuilder 
fileBuilder) throws IOException
       {
         Channels.writeFully(channel, ByteBuffer.wrap(new 
byte[]{version.asByte()}));
-        channel.write(ByteBuffer.wrap(Ints.toByteArray(flags)));
+        Channels.writeFully(channel, ByteBuffer.wrap(Ints.toByteArray(flags)));

Review Comment:
   no just for consistency with line above 



##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumnFormatSpec.java:
##########
@@ -172,14 +184,18 @@ public static NestedCommonFormatColumnFormatSpec 
getEffectiveFormatSpec(
   private final CompressionStrategy doubleColumnCompression;
   @Nullable
   private final BitmapSerdeFactory bitmapEncoding;
+  @Nullable
+  private final BitmapIndexEncodingStrategy numericFieldsBitmapIndexEncoding;
 
   @JsonCreator
   public NestedCommonFormatColumnFormatSpec(
       @JsonProperty("objectFieldsDictionaryEncoding") @Nullable 
StringEncodingStrategy objectFieldsDictionaryEncoding,
+      @JsonProperty("numericFieldsBitmapIndexEncoding") @Nullable 
BitmapIndexEncodingStrategy numericFieldsBitmapIndexEncoding,

Review Comment:
   i was thinking more granular controls by field names, with numeric as 
default. kinda feel like nothing in particular that double and long might be 
different? adding `longFieldIndexType` and `doubleFieldIndexType` is not too 
much work either, can do that if you have strong opinions on this.
   
   rn i renamed to `numericFieldsBitmapIndexType` and `BitmapIndexType`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to