yashmayya commented on code in PR #13835:
URL: https://github.com/apache/pinot/pull/13835#discussion_r1721714055
##########
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/StarTreeClusterIntegrationTest.java:
##########
@@ -234,6 +240,175 @@ private void testStarQuery(String starQuery, boolean verifyPlan)
starQuery, starResponse, referenceQuery, referenceResponse, _randomSeed));
}
+ @Test
+ public void testStarTreeWithDistinctCountHllConfigurations() throws Exception {
+ List<StarTreeIndexConfig> starTreeIndexConfigs = _tableConfig.getIndexingConfig().getStarTreeIndexConfigs();
+ StarTreeAggregationConfig aggregationConfig = new StarTreeAggregationConfig("OriginCityName", "DISTINCTCOUNTHLL",
+ Map.of(Constants.HLL_LOG2M_KEY, 4), null, null, null, null, null);
+
+ starTreeIndexConfigs.add(new StarTreeIndexConfig(Collections.singletonList("CRSDepTime"), null,
+ null, List.of(aggregationConfig), 1));
+ updateTableConfig(_tableConfig);
+
+ // Wait for table config to be updated
+ TestUtils.waitForCondition(
+ (aVoid) -> {
+ TableConfig tableConfig = getOfflineTableConfig();
+ return tableConfig.getIndexingConfig().getStarTreeIndexConfigs().size() == starTreeIndexConfigs.size();
+ }, 5_000L, "Failed to update table config"
+ );
+
+ reloadOfflineTable(DEFAULT_TABLE_NAME);
+
+ // Wait for the star-tree indexes to be built
+ TestUtils.waitForCondition(
+ (aVoid) -> {
+ JsonNode result;
+ try {
+ result = postQuery("EXPLAIN PLAN FOR SELECT DISTINCTCOUNTHLL(OriginCityName, 4) FROM mytable "
+ + "WHERE CRSDepTime = 35");
+ } catch (Exception e) {
+ throw new RuntimeException(e);
+ }
+ return result.toString().contains(FILTER_STARTREE_INDEX);
+ }, 1000L, 120_000L, "Failed to use star-tree index for query"
Review Comment:
Looks like that was indeed the issue: the test here had 12 segments with
~100 MB of data. I've moved it to a separate
`StarTreeFunctionParametersIntegrationTest` that uses a tiny table (1 segment
with just 100 rows), since the test doesn't depend on the specific data set
or its size. It now passes in CI even with a 10-second timeout. Locally it
passes consistently with a timeout as low as 100 ms, but I've gone with a
larger value so that it doesn't end up flaky in CI, given the previous
results.
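
For context on the interval/timeout trade-off described above, here is a minimal, self-contained sketch of a poll-until-true helper. It is not Pinot's `TestUtils.waitForCondition`; the class `PollingSketch`, the method `waitFor`, and its parameter order (check interval, then timeout) are hypothetical and chosen only to resemble the call in the diff. The point it illustrates is that a generous timeout costs nothing once the condition becomes true quickly, because the loop returns on the first successful check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Pinot.
public class PollingSketch {

  /**
   * Checks the condition every intervalMs until it returns true,
   * or throws IllegalStateException once timeoutMs has elapsed.
   */
  public static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs, String errorMessage)
      throws InterruptedException {
    // nanoTime is monotonic, so wall-clock adjustments do not affect the deadline.
    long deadlineNanos = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return; // Success: return immediately, however large the timeout is.
      }
      long remainingNanos = deadlineNanos - System.nanoTime();
      if (remainingNanos <= 0) {
        throw new IllegalStateException(errorMessage);
      }
      // Sleep for the check interval, but never past the deadline.
      long remainingMs = Math.max(1L, remainingNanos / 1_000_000L);
      Thread.sleep(Math.min(intervalMs, remainingMs));
    }
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime();
    // Stand-in for "the EXPLAIN plan mentions the star-tree filter":
    // this dummy condition becomes true roughly 50 ms after start.
    waitFor(() -> System.nanoTime() - start >= 50_000_000L, 10L, 10_000L,
        "Failed to use star-tree index for query");
    System.out.println("condition met");
  }
}
```

With a helper shaped like this, raising the timeout from 100 ms to 10 seconds only changes how long a failing run waits before giving up; a passing run still finishes as soon as the condition holds.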
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]