clintropolis commented on code in PR #13803:
URL: https://github.com/apache/druid/pull/13803#discussion_r1136487243


##########
extensions-core/parquet-extensions/src/test/java/org/apache/druid/data/input/parquet/NestedColumnParquetReaderTest.java:
##########
@@ -181,7 +181,7 @@ public void 
testNestedColumnSchemalessNestedTestFileNoNested() throws IOExceptio
     );
 
     List<InputRow> rows = readAllRows(reader);
-    Assert.assertEquals(ImmutableList.of("dim1", "metric1", "timestamp"), 
rows.get(0).getDimensions());
+    Assert.assertEquals(ImmutableList.of("dim1", "metric1"), 
rows.get(0).getDimensions());

Review Comment:
   this is related to the change in dimension filter in `MapBasedInputRow` to 
always filter out the timestamp spec column.
   
   In 'normal' production code the timestamp spec is added to dimension 
exclusions so the code in `MapBasedInputRow` that computes dimensions to ensure 
time would not be in the dimensions list. However, in test code, especially in 
processing, which doesn't have access to the methods that take a dataschema and 
transform it into an input row schema, its pretty easy to not explicitly add 
timestamp column to the dimensions exclusion list. So as a result of not 
manually adding timestamp column to exclusions, it would end up in the 
dimensions list in schema discovery modes, as a string (or nested column, 
depending on config), which when doing rollup tests means it ends up as part of 
the rollup key, and so on. (Again this doesn't happen in current production 
code because it goes through that translator utility method).
   
   I made the change there to always filter the timestamp spec column from the 
dimensions list to make it easier to not write wrong tests for schema discovery 
mode, which caused the change here and other places.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to