gianm commented on code in PR #18053:
URL: https://github.com/apache/druid/pull/18053#discussion_r2116198449


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -7263,4 +7263,127 @@ public void testToJsonString()
         RowSignature.builder().add("EXPR$0", ColumnType.STRING).build()
     );
   }
+
+  @Test
+  public void testSumPathWithArrays()
+  {
+    /*
+    "obj":{... "c": 100, ...}
+    "obj":{... "c": ["a", "b"], ...}
+    "obj":{...}
+    "obj":{... "c": {"a": 1}, ...},
+    "obj":{... "c": "hello", ...},
+    "obj":{... "c": 12.3, ...},
+    "obj":{... "c": null, ...},
+     */
+    testQuery(
+        "SELECT "
+        + "SUM(JSON_VALUE(obj, '$.c')) "
+        + "FROM druid.all_auto",
+        ImmutableList.of(
+            Druids.newTimeseriesQueryBuilder()
+                  .dataSource(DATA_SOURCE_ALL)
+                  .intervals(querySegmentSpec(Filtration.eternity()))
+                  .granularity(Granularities.ALL)
+                  .virtualColumns(new NestedFieldVirtualColumn("obj", "$.c", 
"v0", ColumnType.DOUBLE))
+                  .aggregators(aggregators(new 
DoubleSumAggregatorFactory("a0", "v0")))
+                  .context(QUERY_CONTEXT_DEFAULT)
+                  .build()
+        ),
+        ImmutableList.of(
+            new Object[]{112.3d}
+        ),
+        RowSignature.builder()
+                    .add("EXPR$0", ColumnType.DOUBLE)
+                    .build()
+    );
+  }
+
+  @Test
+  public void testCountPathWithArrays()
+  {
+    /*
+    "obj":{... "c": 100, ...}
+    "obj":{... "c": ["a", "b"], ...}
+    "obj":{...}
+    "obj":{... "c": {"a": 1}, ...},
+    "obj":{... "c": "hello", ...},
+    "obj":{... "c": 12.3, ...},
+    "obj":{... "c": null, ...},
+     */
+    // capturing existing behavior... the count should be 4 if it was counting 
all non-null literal values, but that
+    // would mean that the virtual column would need to plan as ARRAY<STRING> 
expected type instead of STRING
+    // ... you might notice there are actually 5 non-null obj.c values, 
however json_value only returns primitive

Review Comment:
   In `docs/querying/nested-columns.md`, it says:
   
   > `JSON_VALUE` only returns literal types. Any paths that reference JSON 
objects or array types return null.
   
   I think "literal" is meant to say "primitive"? In which case, this behavior 
is documented and correct, just perhaps unintuitive. The comment here should 
clarify that. Currently, to me, this comment "capturing existing behavior" 
reads like it suggests the existing behavior is incorrect.
   
   Btw, I do see some tests for `JSON_VALUE` with `RETURNING VARCHAR ARRAY`, 
which I think means the docs are out of date in some way. Do the docs need an 
update to say that `JSON_VALUE` can return arrays if you explicitly use 
`RETURNING`, & otherwise it won't?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to