clintropolis commented on a change in pull request #11884:
URL: https://github.com/apache/druid/pull/11884#discussion_r744025968



##########
File path: 
processing/src/main/java/org/apache/druid/segment/RowBasedColumnSelectorFactory.java
##########
@@ -340,9 +330,67 @@ public Class classOfObject()
         @Override
         public void inspectRuntimeShape(RuntimeShapeInspector inspector)
         {
-          inspector.visit("row", supplier);
+          inspector.visit("row", rowSupplier);
           inspector.visit("extractionFn", extractionFn);
         }
+
+        private void updateCurrentValues()
+        {
+          if (rowIdSupplier == null || rowIdSupplier.getAsLong() != currentId) 
{
+            try {
+              final Object rawValue = dimFunction.apply(rowSupplier.get());
+
+              if (rawValue == null || rawValue instanceof String) {
+                final String s = NullHandling.emptyToNullIfNeeded((String) 
rawValue);
+
+                if (extractionFn == null) {
+                  dimensionValues = Collections.singletonList(s);
+                } else {
+                  dimensionValues = 
Collections.singletonList(extractionFn.apply(s));
+                }
+              } else if (rawValue instanceof List) {
+                // Consistent behavior with Rows.objectToStrings, but applies 
extractionFn too.
+                //noinspection rawtypes
+                final List<String> values = new ArrayList<>(((List) 
rawValue).size());
+
+                //noinspection rawtypes
+                for (final Object item : ((List) rawValue)) {
+                  // Behavior with null item is to convert it to string 
"null". This is not what most other areas of Druid
+                  // would do when treating a null as a string, but it's 
consistent with Rows.objectToStrings, which is
+                  // commonly used when retrieving strings from input-row-like 
objects.
+                  if (extractionFn == null) {
+                    
values.add(NullHandling.emptyToNullIfNeeded(String.valueOf(item)));

Review comment:
       heh, i've never paid close attention to this, so that means `null` 
becomes `"null"` but `""` becomes `null`? but only if it is a list of strings, 
not a single string...
   
   I don't think this needs done now, but I can't help but wonder if things 
that want this transformation to happen should like provide a mechanism to make 
it happen so that it isn't implicit functionality of the row selector stuff

##########
File path: 
processing/src/main/java/org/apache/druid/segment/RowBasedColumnSelectorFactory.java
##########
@@ -47,59 +51,71 @@
  */
 public class RowBasedColumnSelectorFactory<T> implements ColumnSelectorFactory
 {
-  private final Supplier<T> supplier;
+  private static final long NO_ID = -1;
+
+  private final Supplier<T> rowSupplier;
+
+  @Nullable
+  private final LongSupplier rowIdSupplier;
   private final RowAdapter<T> adapter;
-  private final RowSignature rowSignature;
+  private final ColumnInspector columnInspector;
   private final boolean throwParseExceptions;
 
-  private RowBasedColumnSelectorFactory(
-      final Supplier<T> supplier,
+  /**
+   * Package-private constructor for {@link RowBasedCursor}. Allows passing in 
a rowIdSupplier, which enables
+   * column value reuse optimizations.
+   */
+  RowBasedColumnSelectorFactory(
+      final Supplier<T> rowSupplier,
+      @Nullable final LongSupplier rowIdSupplier,
       final RowAdapter<T> adapter,
-      final RowSignature rowSignature,
+      final ColumnInspector columnInspector,
       final boolean throwParseExceptions
   )
   {
-    this.supplier = supplier;
-    this.adapter = adapter;
-    this.rowSignature = Preconditions.checkNotNull(rowSignature, "rowSignature 
must be nonnull");
+    this.rowSupplier = Preconditions.checkNotNull(rowSupplier, "rowSupplier");
+    this.rowIdSupplier = rowIdSupplier;
+    this.adapter = Preconditions.checkNotNull(adapter, "adapter");
+    this.columnInspector = Preconditions.checkNotNull(columnInspector, 
"columnInspector must be nonnull");
     this.throwParseExceptions = throwParseExceptions;
   }
 
   /**
    * Create an instance based on any object, along with a {@link RowAdapter} 
for that object.
    *
    * @param adapter              adapter for these row objects
-   * @param supplier             supplier of row objects
-   * @param signature            will be used for reporting available columns 
and their capabilities. Note that the this
+   * @param rowSupplier          supplier of row objects
+   * @param columnInspector      will be used for reporting available columns 
and their capabilities. Note that this
    *                             factory will still allow creation of 
selectors on any named field in the rows, even if
-   *                             it doesn't appear in "rowSignature". (It only 
needs to be accessible via
+   *                             it doesn't appear in "columnInspector". (It 
only needs to be accessible via
    *                             {@link RowAdapter#columnFunction}.) As a 
result, you can achieve an untyped mode by
    *                             passing in {@link RowSignature#empty()}.
    * @param throwParseExceptions whether numeric selectors should throw parse 
exceptions or use a default/null value
    *                             when their inputs are not actually numeric
    */
   public static <RowType> RowBasedColumnSelectorFactory<RowType> create(
       final RowAdapter<RowType> adapter,
-      final Supplier<RowType> supplier,
-      final RowSignature signature,
+      final Supplier<RowType> rowSupplier,
+      final ColumnInspector columnInspector,
       final boolean throwParseExceptions
   )
   {
-    return new RowBasedColumnSelectorFactory<>(supplier, adapter, signature, 
throwParseExceptions);
+    return new RowBasedColumnSelectorFactory<>(rowSupplier, null, adapter, 
columnInspector, throwParseExceptions);
   }
 
   @Nullable
   static ColumnCapabilities getColumnCapabilities(
-      final RowSignature rowSignature,
+      final ColumnInspector columnInspector,
       final String columnName
   )
   {
     if (ColumnHolder.TIME_COLUMN_NAME.equals(columnName)) {
-      // TIME_COLUMN_NAME is handled specially; override the provided 
rowSignature.
+      // TIME_COLUMN_NAME is handled specially; override the provided 
inspector.
       return 
ColumnCapabilitiesImpl.createSimpleNumericColumnCapabilities(ColumnType.LONG);
     } else {
-      final ColumnType valueType = 
rowSignature.getColumnType(columnName).orElse(null);
-
+      final Optional<ColumnCapabilities> inspectedCapabilities =
+          
Optional.ofNullable(columnInspector.getColumnCapabilities(columnName));
+      final ColumnType valueType = 
inspectedCapabilities.map(ColumnCapabilities::toColumnType).orElse(null);

Review comment:
       nit: I should have probably made `ColumnCapabilitiesImpl.setType` accept 
`TypeSignature<ValueType>` instead of `ColumnType` so that you could just use 
the capabilities from the inspector directly, want to make this change? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to