davecromberge commented on code in PR #17825:
URL: https://github.com/apache/pinot/pull/17825#discussion_r2911111265
##########
pinot-core/src/test/java/org/apache/pinot/core/segment/processing/framework/ReducerTest.java:
##########
@@ -570,4 +575,209 @@ public void testDedupWithSort()
assertEquals(fieldToValueMap.get("d2"), dValues[expectedDIndex][1]);
}
}
+
+ /**
+ * Test rollup with theta sketch metric to verify batch aggregation path.
+ * This test creates multiple rows per dimension key, each with a theta sketch,
+ * and verifies they are properly aggregated using batch aggregation.
+ */
+ @Test
+ public void testRollupWithThetaSketch()
+ throws Exception {
+ TableConfig tableConfig = new TableConfigBuilder(TableType.OFFLINE).setTableName("testTable").build();
+ Schema schema = new Schema.SchemaBuilder().setSchemaName("testTable")
+ .addSingleValueDimension("d", DataType.INT)
+ .addMetric("sketch", DataType.BYTES)
+ .build();
+ Pair<List<FieldSpec>, Integer> result = SegmentProcessorUtils.getFieldSpecs(schema, MergeType.ROLLUP, null);
+ GenericRowFileManager fileManager =
+ new GenericRowFileManager(FILE_MANAGER_OUTPUT_DIR, result.getLeft(), false, result.getRight());
+
+ GenericRowFileWriter fileWriter = fileManager.getFileWriter();
+ int numRecords = 100;
+ int numDimValues = 5;
+ // Track expected distinct values per dimension
+ Map<Integer, Set<Integer>> expectedDistincts = new TreeMap<>();
+ for (int i = 0; i < numDimValues; i++) {
+ expectedDistincts.put(i, new HashSet<>());
+ }
+
+ GenericRow row = new GenericRow();
+ for (int i = 0; i < numRecords; i++) {
+ row.clear();
+ int d = RANDOM.nextInt(numDimValues);
+ // Create a sketch with some random values
+ UpdateSketch updateSketch = UpdateSketch.builder().setNominalEntries(128).build();
+ int numValues = RANDOM.nextInt(10) + 1;
+ for (int j = 0; j < numValues; j++) {
+ int value = RANDOM.nextInt(1000);
+ updateSketch.update(value);
+ expectedDistincts.get(d).add(value);
+ }
Review Comment:
Nominal entries (128) is well above the ceiling of the random value count
(`RANDOM.nextInt(10) + 1`, so at most 10 updates per sketch). Each sketch will
therefore always be in exact mode, and no approximation will apply.
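
To illustrate the point, here is a minimal sketch using a toy K-minimum-values
structure. It is a simplified stand-in and not the DataSketches implementation:
the class `ToyKmvSketch`, its hash function, and the parameter `k` (playing the
role of nominal entries) are all assumptions made for this example. It only
shows the general behaviour that a sketch stays exact until it has seen more
distinct values than it is sized to retain.

```java
import java.util.TreeSet;

/** Toy K-minimum-values sketch (hypothetical; not the DataSketches code). */
final class ToyKmvSketch {
  private final int k; // plays the role of "nominal entries"
  // Retained hashes, ordered as unsigned 64-bit values.
  private final TreeSet<Long> minHashes = new TreeSet<>(Long::compareUnsigned);
  private boolean estimationMode = false;

  ToyKmvSketch(int k) {
    this.k = k;
  }

  // SplitMix64 finalizer: a bijective 64-bit mix, so distinct inputs never collide.
  private static long hash(long x) {
    x += 0x9E3779B97F4A7C15L;
    x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
    x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
    return x ^ (x >>> 31);
  }

  void update(long value) {
    minHashes.add(hash(value));
    if (minHashes.size() > k) {
      // More than k distinct hashes seen: drop the largest and start estimating.
      minHashes.pollLast();
      estimationMode = true;
    }
  }

  boolean isEstimationMode() {
    return estimationMode;
  }

  double getEstimate() {
    if (!estimationMode) {
      return minHashes.size(); // exact distinct count
    }
    // Map the k-th smallest hash into [0, 1) and apply the KMV estimator (k - 1) / theta.
    double theta = (minHashes.last() >>> 11) * 0x1.0p-53;
    return (k - 1) / theta;
  }

  public static void main(String[] args) {
    // Mirrors the test in the diff: at most 10 updates against a capacity of 128.
    ToyKmvSketch exact = new ToyKmvSketch(128);
    for (int v = 0; v < 10; v++) {
      exact.update(v);
    }
    System.out.println(exact.isEstimationMode() + " " + exact.getEstimate());

    // A smaller capacity (or many more distinct values) forces estimation mode.
    ToyKmvSketch approx = new ToyKmvSketch(16);
    for (int v = 0; v < 1000; v++) {
      approx.update(v);
    }
    System.out.println(approx.isEstimationMode() + " " + approx.getEstimate());
  }
}
```

In the test under review, lowering the nominal entries or feeding each sketch
many more distinct values would be two possible ways to exercise the
approximate path; which one fits best is left to the author.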