domodwyer commented on a change in pull request #1539:
URL: https://github.com/apache/arrow-datafusion/pull/1539#discussion_r793978884
##########
File path: datafusion/tests/sql/aggregates.rs
##########
@@ -316,6 +316,95 @@ async fn csv_query_approx_count() -> Result<()> {
Ok(())
}
+// This test executes the APPROX_QUANTILE aggregation against the test data,
+// asserting the estimated quantiles are within ±5% of their actual values.
+//
+// Actual quantiles calculated with:
+//
+// ```r
+// read_csv("./testing/data/csv/aggregate_test_100.csv") |>
+// select_if(is.numeric) |>
+// summarise_all(~ quantile(., c(0.1, 0.5, 0.9)))
+// ```
+//
+// Giving:
+//
+// ```text
+//      c2    c3      c4          c5       c6    c7     c8          c9     c10   c11    c12
+//   <dbl> <dbl>   <dbl>       <dbl>    <dbl> <dbl>  <dbl>       <dbl>   <dbl> <dbl>  <dbl>
+// 1     1 -95.3 -22925. -1882606710 -7.25e18  18.9  2671.  472608672. 1.83e18 0.109 0.0714
+// 2     3  15.5    4599   377164262  1.13e18  134.  30634 2365817608. 9.30e18 0.491  0.551
+// 3     5  102.  25334. 1991374996.  7.37e18   231 57518. 3776538487. 1.61e19 0.834  0.946
+// ```
+//
+// Column `c12` is omitted because its small float values produce a large
+// relative error (~10%).
+#[tokio::test]
+async fn csv_query_approx_quantile() -> Result<()> {
Review comment:
Merge appears to always be called, so it is definitely _exercised_ - I
don't know if this is sufficient coverage though!
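
For reference, the ±5% check described in the test comment can be sketched as below. This is only an illustration: `within_relative_tolerance` is a hypothetical helper, not a DataFusion function, and the "estimated" values are made up rather than real query output. Only the reference quantiles (15.5 for `c3`, 0.0714 for `c12`) come from the table in the test comment.

```rust
/// Hypothetical helper: returns true when `estimate` lies within `tolerance`
/// (a fraction, e.g. 0.05 for 5%) of `actual`, measured relative to `actual`.
fn within_relative_tolerance(estimate: f64, actual: f64, tolerance: f64) -> bool {
    if actual == 0.0 {
        // Relative error is undefined at zero; fall back to an absolute check.
        return estimate.abs() <= tolerance;
    }
    ((estimate - actual) / actual).abs() <= tolerance
}

fn main() {
    // Reference median of c3 is 15.5; both estimates here are invented.
    assert!(within_relative_tolerance(15.9, 15.5, 0.05));
    assert!(!within_relative_tolerance(17.0, 15.5, 0.05));

    // With a small reference value (0.0714 for c12), an absolute error of
    // only 0.007 is already roughly 10% relative error, so the check fails.
    assert!(!within_relative_tolerance(0.0714 + 0.007, 0.0714, 0.05));
}
```

The last assertion illustrates why a fixed relative tolerance is harsh on columns with values near zero: a small absolute error in the estimate becomes a large relative one.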