clintropolis commented on code in PR #18078:
URL: https://github.com/apache/druid/pull/18078#discussion_r2130496388


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -7386,4 +7477,173 @@ public void testCountPathWithArraysReturning()
                     .build()
     );
   }
+
+  @Test
+  public void testSumPathWithArraysRealtime()
+  {
+    /*
+    "obj":{... "c": [100], ...}
+    "obj":{... "c": ["a", "b"], ...}
+    "obj":{...}
+    "obj":{... "c": {"a": 1}, ...},
+    "obj":{... "c": "hello", ...},
+    "obj":{... "c": 12.3, ...},
+    "obj":{... "c": null, ...},
+     */
+    skipVectorize();
+    testQuery(
+        "SELECT "
+        + "SUM(JSON_VALUE(obj, '$.c')) "
+        + "FROM druid.all_auto_realtime",
+        ImmutableList.of(
+            Druids.newTimeseriesQueryBuilder()
+                  .dataSource(DATA_SOURCE_ALL_REALTIME)
+                  .intervals(querySegmentSpec(Filtration.eternity()))
+                  .granularity(Granularities.ALL)
+                  .virtualColumns(new NestedFieldVirtualColumn("obj", "$.c", "v0", ColumnType.DOUBLE))
+                  .aggregators(aggregators(new DoubleSumAggregatorFactory("a0", "v0")))
+                  .context(QUERY_CONTEXT_DEFAULT)
+                  .build()
+        ),
+        ImmutableList.of(

Review Comment:
   With regard to the single-element-to-scalar casting, this is just being 
consistent with Druid's array handling in the expression layer, which is itself 
heavily shaped by multi-value string history (arrays were first introduced as a 
way to perform array-like operations on multi-value string columns).
   
   I certainly agree that it is weird that we can cast single-element arrays to 
scalars, but I think changing that is a separate issue, and maybe a discussion 
worth having, since multi-value dimensions (MVDs) are weird.
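
To make the casting behavior above concrete, here is a toy model in plain Java. It is not Druid's implementation: the class and method names are hypothetical, and it only mirrors the rule described here (a single-element array is cast as if it were its one element, and anything else that is not a number becomes null). It does not attempt to parse numeric strings. The input rows mirror the `c` values listed in the test comment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Toy model only, NOT Druid code. All names here are hypothetical and exist
 * purely to illustrate the single-element-array-to-scalar cast rule.
 */
public class SingleElementCastSketch
{
  /** Coerce a nested value to a double, or return null if this sketch cannot coerce it. */
  public static Double toDoubleOrNull(Object value)
  {
    if (value instanceof Number) {
      return ((Number) value).doubleValue();
    }
    if (value instanceof List) {
      List<?> list = (List<?>) value;
      // The behavior under discussion: a single-element array is cast
      // as if it were that element; any other array length yields null.
      return list.size() == 1 ? toDoubleOrNull(list.get(0)) : null;
    }
    // Objects, strings, and null/missing values all become null in this sketch.
    return null;
  }

  /** Sum the values that coerce to a double, skipping the ones that become null. */
  public static double sum(List<Object> values)
  {
    double total = 0;
    for (Object v : values) {
      Double d = toDoubleOrNull(v);
      if (d != null) {
        total += d;
      }
    }
    return total;
  }

  public static void main(String[] args)
  {
    List<Object> values = Arrays.<Object>asList(
        Arrays.asList(100),               // "c": [100]
        Arrays.asList("a", "b"),          // "c": ["a", "b"]
        null,                             // "c" missing
        Collections.singletonMap("a", 1), // "c": {"a": 1}
        "hello",                          // "c": "hello"
        12.3,                             // "c": 12.3
        null                              // "c": null
    );
    System.out.println(sum(values));
  }
}
```

In this toy model only the `[100]` and `12.3` rows contribute to the sum; whether the real query behaves the same way is exactly what the test under review is meant to pin down.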
   
   re: `json_query` vs `json_value`, arrays of primitives are themselves 
treated as a kind of primitive, so it makes sense to me to still handle them 
with `json_value`, since `json_value` is the only way to extract values from 
JSON into other Druid native types for use with non-JSON functions. `json_query` 
is a bit different in that it always returns a `COMPLEX<JSON>` type, which 
means it cannot be used directly with most other Druid expressions without 
further extracting primitive values from it. For example, you cannot use 
something like `array_contains(json_query(...))` because the JSON type is 
effectively opaque.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


