xiangfu0 commented on code in PR #18504:
URL: https://github.com/apache/pinot/pull/18504#discussion_r3246102493


##########
pinot-core/src/main/java/org/apache/pinot/core/common/BlockValSet.java:
##########
@@ -48,11 +48,27 @@ public interface BlockValSet {
   boolean isSingleValue();
 
   /**
-   * Returns the dictionary for the column, or {@code null} if the column is 
not dictionary-encoded.
+   * Returns the dictionary file for the column if one exists, or {@code null} 
otherwise. The dictionary may be
+   * present even when {@link #isDictionaryEncoded()} returns {@code false} — 
a column declared as
+   * {@code EncodingType.RAW} with an explicit {@code dictionaryIndex} carries 
a dictionary file on disk but a RAW
+   * forward index. Callers that select between a dictionary-id read path
+   * ({@link #getDictionaryIdsSV()} / {@link #getDictionaryIdsMV()}) and a 
value read path MUST gate on
+   * {@link #isDictionaryEncoded()}, not {@code getDictionary() != null}.

Review Comment:
   Good nit — fixed in bf6327d21e. `getDictionary()` Javadoc no longer says 
"file"; it now reads "may live on disk (segment-backed columns) or be built on 
the fly (transform functions)" and points callers at `isDictionaryEncoded()` as 
the canonical gate.



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/ColumnContext.java:
##########
@@ -49,25 +52,44 @@ public boolean isSingleValue() {
     return _isSingleValue;
   }
 
+  /// Returns the column's dictionary file if one exists, regardless of 
whether the forward index can answer
+  /// dictionary-id reads. Callers that need to select between a dict-id read 
path and a value read path MUST gate
+  /// on {@link #isDictionaryEncoded()} rather than {@code getDictionary() != 
null} — a column declared as
+  /// {@code EncodingType.RAW} with an explicit {@code dictionaryIndex} 
returns a non-null dictionary here but its
+  /// forward index throws on {@link ForwardIndexReader#readDictIds}.
   @Nullable
   public Dictionary getDictionary() {
     return _dictionary;
   }
 
+  /// Returns {@code true} if the column's forward index is dictionary-encoded 
and the dict-id read path
+  /// ({@link org.apache.pinot.core.common.BlockValSet#getDictionaryIdsSV()}) 
is callable. A column with
+  /// {@code EncodingType.RAW} + an explicit {@code dictionaryIndex} returns 
{@code false} here even though
+  /// {@link #getDictionary()} is non-null.
+  public boolean isDictionaryEncoded() {
+    return _dictionaryEncoded;
+  }
+
   @Nullable
   public DataSource getDataSource() {
     return _dataSource;
   }
 
   public static ColumnContext fromDataSource(DataSource dataSource) {
     DataSourceMetadata dataSourceMetadata = dataSource.getDataSourceMetadata();
-    return new ColumnContext(dataSourceMetadata.getDataType(), 
dataSourceMetadata.isSingleValue(),
-        dataSource.getDictionary(), dataSource);
+    Dictionary dictionary = dataSource.getDictionary();
+    ForwardIndexReader<?> forwardIndex = dataSource.getForwardIndex();
+    boolean dictEncoded = dictionary != null && (forwardIndex == null || 
forwardIndex.isDictionaryEncoded());

Review Comment:
   Fixed in bf6327d21e — `ColumnContext.fromDataSource` now requires 
`forwardIndex != null && forwardIndex.isDictionaryEncoded()`, so a disabled 
forward index correctly reports `isDictionaryEncoded() == false`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to