imply-cheddar commented on code in PR #14014:
URL: https://github.com/apache/druid/pull/14014#discussion_r1156544077
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java:
##########
@@ -1109,16 +1104,25 @@ private void processDimensionsSpec(final QueryableIndex
index)
);
if (!uniqueDims.containsKey(dimension)) {
- Preconditions.checkNotNull(
+ final DimensionHandler dimensionHandler = Preconditions.checkNotNull(
dimensionHandlerMap.get(dimension),
"Cannot find dimensionHandler for dimension[%s]",
dimension
);
+
uniqueDims.put(dimension, uniqueDims.size());
dimensionSchemaMap.put(
dimension,
- columnHolder.getColumnFormat().getColumnSchema(dimension)
+ // this should use:
+ // columnHolder.getColumnFormat().getColumnSchema(dimension)
+ // someday...
Review Comment:
But it cannot because? (Please add to the comment)
##########
processing/src/main/java/org/apache/druid/segment/nested/NestedCommonFormatColumn.java:
##########
@@ -17,27 +17,31 @@
* under the License.
*/
-package org.apache.druid.segment.column;
+package org.apache.druid.segment.nested;
Review Comment:
Package nit: I wonder if it shouldn't be `segment.column.nested`? Probably
doesn't really make that much of a difference, but seems more technically
accurate.
##########
processing/src/main/java/org/apache/druid/segment/QueryableIndexIndexableAdapter.java:
##########
@@ -113,6 +111,13 @@ public <T extends Comparable<? super T>>
CloseableIndexed<T> getDimValueLookup(S
final BaseColumn col = columnHolder.getColumn();
if (!(col instanceof DictionaryEncodedColumn)) {
+ // this shouldn't happen, but if it does, try to close to prevent a leak
+ try {
+ col.close();
+ }
+ catch (IOException e) {
+ throw new RuntimeException(e);
+ }
Review Comment:
A bit of a nit, but if you add this text to the exception message, you get
the benefit of the comment explaining that you don't expect it to occur and if
the exception ever does get thrown, the message is there too for whoever sees
it.
##########
processing/src/main/java/org/apache/druid/segment/nested/NestedDataColumnV5.java:
##########
@@ -30,16 +30,24 @@
import org.apache.druid.segment.data.FrontCodedIntArrayIndexed;
import org.apache.druid.segment.data.GenericIndexed;
import org.apache.druid.segment.data.Indexed;
-import org.apache.druid.segment.serde.StandardTypeColumnSerializer;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.List;
-public class StandardNestedDataColumn<TStringDictionary extends
Indexed<ByteBuffer>>
+/**
+ * Nested data column with optimized support for simple arrays. Not actually
v5 in the segment since columns are now
+ * serialized using {@link
org.apache.druid.segment.serde.NestedCommonFormatColumnPartSerde} instead of
the generic
+ * complex type system.
+ *
+ * Not really stored in a segment as V5 since instead of V5 we migrated to
{@link NestedCommonFormatColumn} which
+ * specializes physical format based on the types of data encountered during
processing, and so versions are now
+ * {@link NestedCommonFormatColumnSerializer#V0} for all associated
specializations.
Review Comment:
I like the `NestedCommon` name that you picked for other stuff. It wouldn't
bother me if you named this `NestedCommon` as well, fwiw. I'm also fine with
the current naming, so more stream-of-consciousness thought than a request for
change.
##########
processing/src/main/java/org/apache/druid/segment/NestedDataColumnMerger.java:
##########
@@ -92,7 +92,12 @@ public void
writeMergedValueDictionary(List<IndexableAdapter> adapters) throws I
for (int i = 0; i < adapters.size(); i++) {
final IndexableAdapter adapter = adapters.get(i);
- final SortedValueDictionary dimValues =
closer.register(adapter.getSortedValueLookup(name, mergedFields));
+
+ final IndexableAdapter.NestedColumnMergable mergable = closer.register(
+ adapter.getNestedColumnMergeables(name)
+ );
+ final SortedValueDictionary dimValues = mergable.getValueDictionary();
+ mergable.mergeFieldsInto(mergedFields);
Review Comment:
It's unclear if the `dimValues` dictionary would be mutated or not and if
that's expected or not. I *think* that it's expected that it is not mutated.
In which case, it might be nice to push `mergable.mergeFieldsInto()` down below
the `if (!allNulls){ }` block just to make it clear that those values are
being evaluated and grabbed before any merge occurs.
##########
processing/src/main/java/org/apache/druid/segment/serde/ComplexMetricSerde.java:
##########
@@ -43,14 +43,6 @@
public abstract ComplexMetricExtractor getExtractor();
- /**
- * Deserializes a ByteBuffer and adds it to the ColumnBuilder. This method
allows for the ComplexMetricSerde
- * to implement it's own versioning scheme to allow for changes of binary
format in a forward-compatible manner.
- *
- * @param buffer the buffer to deserialize
- * @param builder ColumnBuilder to add the column to
- * @param columnConfig ColumnConfiguration used during deserialization
- */
Review Comment:
Looks like the javadoc on this method got clobbered, probably by accident?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]