gianm commented on code in PR #17762:
URL: https://github.com/apache/druid/pull/17762#discussion_r2354498651


##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumn.java:
##########
@@ -125,10 +132,11 @@ public ColumnFormat merge(@Nullable ColumnFormat 
otherFormat)
 
       if (otherFormat instanceof Format) {
         final Format other = (Format) otherFormat;
+        // todo (clint): actually merge columnFormatSpec, maybe

Review Comment:
   please don't commit todos. Should this be done now or later?



##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumnFormatSpec.java:
##########
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.nested;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.segment.IndexSpec;
+import org.apache.druid.segment.column.StringEncodingStrategy;
+import org.apache.druid.segment.data.BitmapSerdeFactory;
+import org.apache.druid.segment.data.CompressionFactory;
+import org.apache.druid.segment.data.CompressionStrategy;
+
+import javax.annotation.Nullable;
+import java.util.Objects;
+
+public class NestedCommonFormatColumnFormatSpec

Review Comment:
   It'd be nice to have a javadoc for this class talking about how it's 
expected to be used.



##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnSchema.java:
##########
@@ -84,6 +95,9 @@ public AutoTypeColumnSchema(
     } else {
       this.castToType = castToType;
     }
+    this.columnFormatSpec = columnFormatSpec == null

Review Comment:
   IMO, the user provided object should be merged with the server-wide default 
object at the level of individual keys. If you set `objectStorageCompression` 
here in the schema, it shouldn't override a server-wide default that was 
specified for `stringDictionaryEncoding`.



##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumnFormatSpec.java:
##########
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.nested;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.segment.IndexSpec;
+import org.apache.druid.segment.column.StringEncodingStrategy;
+import org.apache.druid.segment.data.BitmapSerdeFactory;
+import org.apache.druid.segment.data.CompressionFactory;
+import org.apache.druid.segment.data.CompressionStrategy;
+
+import javax.annotation.Nullable;
+import java.util.Objects;
+
+public class NestedCommonFormatColumnFormatSpec
+{
+  public static Builder builder()
+  {
+    return new Builder();
+  }
+
+  public static NestedCommonFormatColumnFormatSpec getEffectiveFormatSpec(
+      @Nullable NestedCommonFormatColumnFormatSpec columnFormatSpec,
+      IndexSpec indexSpec
+  )
+  {
+    return Objects.requireNonNullElseGet(
+        columnFormatSpec,
+        () -> NestedCommonFormatColumnFormatSpec.builder()
+                                                
.setObjectFieldsDictionaryEncoding(StringEncodingStrategy.UTF8_STRATEGY)
+                                                
.setObjectStorageEncoding(ObjectStorageEncoding.SMILE)
+                                                .build()
+    ).getEffectiveSpec(indexSpec);
+  }
+
+  @Nullable
+  @JsonProperty
+  private final StringEncodingStrategy objectFieldsDictionaryEncoding;
+  @Nullable
+  @JsonProperty
+  private final ObjectStorageEncoding objectStorageEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy objectStorageCompression;
+  @Nullable
+  @JsonProperty
+  private final StringEncodingStrategy stringDictionaryEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy dictionaryEncodedColumnCompression;
+  @Nullable
+  @JsonProperty
+  private final CompressionFactory.LongEncodingStrategy longColumnEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy longColumnCompression;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy doubleColumnCompression;
+  @Nullable
+  @JsonProperty
+  private final BitmapSerdeFactory bitmapEncoding;
+
+  @JsonCreator
+  public NestedCommonFormatColumnFormatSpec(
+      @JsonProperty("objectFieldsDictionaryEncoding") @Nullable 
StringEncodingStrategy objectFieldsDictionaryEncoding,
+      @JsonProperty("objectStorageEncoding") @Nullable ObjectStorageEncoding 
objectStorageEncoding,
+      @JsonProperty("objectStorageCompression") @Nullable CompressionStrategy 
objectStorageCompression,
+      @JsonProperty("stringDictionaryEncoding") @Nullable 
StringEncodingStrategy stringDictionaryEncoding,
+      @JsonProperty("dictionaryEncodedColumnCompression") @Nullable 
CompressionStrategy dictionaryEncodedColumnCompression,
+      @JsonProperty("longColumnEncoding") @Nullable 
CompressionFactory.LongEncodingStrategy longColumnEncoding,
+      @JsonProperty("longColumnCompression") @Nullable CompressionStrategy 
longColumnCompression,
+      @JsonProperty("doubleColumnCompression") @Nullable CompressionStrategy 
doubleColumnCompression,
+      @JsonProperty("bitmapEncoding") @Nullable BitmapSerdeFactory 
bitmapEncoding
+  )
+  {
+    this.objectFieldsDictionaryEncoding = objectFieldsDictionaryEncoding;
+    this.objectStorageEncoding = objectStorageEncoding;
+    this.objectStorageCompression = objectStorageCompression;
+    this.stringDictionaryEncoding = stringDictionaryEncoding;
+    this.dictionaryEncodedColumnCompression = 
dictionaryEncodedColumnCompression;
+    this.longColumnEncoding = longColumnEncoding;
+    this.longColumnCompression = longColumnCompression;
+    this.doubleColumnCompression = doubleColumnCompression;
+    this.bitmapEncoding = bitmapEncoding;
+  }
+
+  public NestedCommonFormatColumnFormatSpec getEffectiveSpec(IndexSpec 
indexSpec)
+  {
+    if (bitmapEncoding != null && 
!bitmapEncoding.equals(indexSpec.getBitmapSerdeFactory())) {

Review Comment:
   Why let the `bitmapEncoding` be set at all, if it can only have one valid 
value?



##########
processing/src/main/java/org/apache/druid/segment/AutoTypeColumnIndexer.java:
##########
@@ -793,10 +799,11 @@ public ColumnFormat merge(@Nullable ColumnFormat 
otherFormat)
       }
       if (otherFormat instanceof Format) {
         final Format other = (Format) otherFormat;
+        // todo (clint): actually merge column format spec, maybe?

Review Comment:
   please don't commit todos



##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumnFormatSpec.java:
##########
@@ -0,0 +1,338 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.segment.nested;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.druid.java.util.common.ISE;
+import org.apache.druid.segment.IndexSpec;
+import org.apache.druid.segment.column.StringEncodingStrategy;
+import org.apache.druid.segment.data.BitmapSerdeFactory;
+import org.apache.druid.segment.data.CompressionFactory;
+import org.apache.druid.segment.data.CompressionStrategy;
+
+import javax.annotation.Nullable;
+import java.util.Objects;
+
+public class NestedCommonFormatColumnFormatSpec
+{
+  public static Builder builder()
+  {
+    return new Builder();
+  }
+
+  public static NestedCommonFormatColumnFormatSpec getEffectiveFormatSpec(
+      @Nullable NestedCommonFormatColumnFormatSpec columnFormatSpec,
+      IndexSpec indexSpec
+  )
+  {
+    return Objects.requireNonNullElseGet(
+        columnFormatSpec,
+        () -> NestedCommonFormatColumnFormatSpec.builder()
+                                                
.setObjectFieldsDictionaryEncoding(StringEncodingStrategy.UTF8_STRATEGY)
+                                                
.setObjectStorageEncoding(ObjectStorageEncoding.SMILE)
+                                                .build()
+    ).getEffectiveSpec(indexSpec);
+  }
+
+  @Nullable
+  @JsonProperty
+  private final StringEncodingStrategy objectFieldsDictionaryEncoding;
+  @Nullable
+  @JsonProperty
+  private final ObjectStorageEncoding objectStorageEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy objectStorageCompression;
+  @Nullable
+  @JsonProperty
+  private final StringEncodingStrategy stringDictionaryEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy dictionaryEncodedColumnCompression;
+  @Nullable
+  @JsonProperty
+  private final CompressionFactory.LongEncodingStrategy longColumnEncoding;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy longColumnCompression;
+  @Nullable
+  @JsonProperty
+  private final CompressionStrategy doubleColumnCompression;
+  @Nullable
+  @JsonProperty

Review Comment:
   Seems strange to have `@JsonProperty` on all of the fields, the constructor, 
and the getters. i would think just constructor and getters is fine? Or just 
fields?



##########
processing/src/main/java/org/apache/druid/segment/DefaultColumnFormatConfig.java:
##########
@@ -69,17 +77,30 @@ private static void 
validateMultiValueHandlingMode(@Nullable String stringMultiV
   @JsonProperty("stringMultiValueHandlingMode")
   private final String stringMultiValueHandlingMode;
 
+  @Nullable
+  @JsonProperty
+  private final NestedCommonFormatColumnFormatSpec nestedColumnFormatSpec;
+
   @JsonCreator
   public DefaultColumnFormatConfig(
+      @JsonProperty("stringMultiValueHandlingMode") @Nullable String 
stringMultiValueHandlingMode,
       @JsonProperty("nestedColumnFormatVersion") @Nullable Integer 
nestedColumnFormatVersion,
-      @JsonProperty("stringMultiValueHandlingMode") @Nullable String 
stringMultiValueHandlingMode
+      @JsonProperty("nestedColumnFormatSpec") @Nullable 
NestedCommonFormatColumnFormatSpec nestedColumnFormatSpec

Review Comment:
   If I read the code right, this applies to both nested and auto typed 
columns. If that's right, the name is misleading. Also, I wonder why this 
should be part of DefaultColumnFormatConfig, rather than IndexSpec? What I'm 
suggesting is like:
   
   1. IndexSpec remains the place where default encoding/compression/etc stuff 
is specified. It seems useful to be able to set server-wide defaults, and IIRC 
there isn't currently a way to set a server-wide default IndexSpec. But, I 
don't see why there shouldn't be.
   2. You can still override nestedColumnFormatSpec at the level of 
`NestedDataColumnSchema` and `AutoTypeColumnSchema`.
   3. Maybe rename `nestedColumnFormatSpec` to `autoTypeColumnSpec`? To me that 
seems more descriptive, since all nested columns are auto typed, but all auto 
typed columns are not necessarily nested.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to