[GitHub] [incubator-pinot] npawar commented on a change in pull request #6021: List of partitioners in SegmentProcessorFramework

GitBox Tue, 15 Sep 2020 23:02:40 -0700


npawar commented on a change in pull request #6021:
URL: https://github.com/apache/incubator-pinot/pull/6021#discussion_r489182763




##########
File path: pinot-core/src/main/java/org/apache/pinot/core/segment/processing/framework/SegmentMapper.java
##########
@@ -100,8 +110,11 @@ public void map()
       }
 
       // Partitioning
-      // TODO: 2 step partitioner. 1) Apply custom partitioner 2) Apply table config partitioner. Combine both to get final partition.
-      String partition = _partitioner.getPartition(reusableRow);
+      int p = 0;
+      for (Partitioner partitioner : _partitioners) {
+        partitions[p++] = partitioner.getPartition(reusableRow);
+      }
+      String partition = StringUtil.join("_", partitions);

Review comment:
   Use case: say the data in the input segments is spread across 3 days. In the
resulting segments, we want to create a segment for each day. Additionally, we
want partitioning on some id column for query purposes.

   Partitioning by the time column is the first step. It doesn't affect segment
metadata or broker routing. It is used only by the framework, and its scope
ends with the framework: it merely helps create date-aligned input files for
the segment generation stage.
   Partitioning by the id column is the second step. This is for queries, and
it will be whatever is in the table config. Only this partition gets set in the
segment metadata, and even that happens during segment creation.
   See this comment and discussion:
https://github.com/apache/incubator-pinot/pull/5934#discussion_r486006754
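
   To make the two steps concrete, here is a minimal standalone sketch, not the
actual Pinot code. It assumes a row is a plain Map, the time column "ts" holds
epoch millis, and the id column "id" is partitioned by hash modulo; the class
name, method names, and column names are all hypothetical. It only mirrors the
join-with-"_" idea from the diff above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration only: none of these names are Pinot APIs.
final class TwoStepPartitionSketch {
  private static final long MILLIS_PER_DAY = 86_400_000L;

  // Step 1 (framework-only): days since epoch, so each day lands in its own file.
  // Step 2 (query-facing): hash of the id column modulo the partition count.
  static List<Function<Map<String, Object>, String>> partitioners(int numIdPartitions) {
    return Arrays.asList(
        row -> String.valueOf(
            Math.floorDiv(((Number) row.get("ts")).longValue(), MILLIS_PER_DAY)),
        row -> String.valueOf(
            Math.floorMod(row.get("id").hashCode(), numIdPartitions)));
  }

  // Applies every partitioner in order and joins the results with "_".
  static String combinedPartition(
      List<Function<Map<String, Object>, String>> partitioners, Map<String, Object> row) {
    return partitioners.stream()
        .map(p -> p.apply(row))
        .collect(Collectors.joining("_"));
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("ts", 1600218000000L); // 2020-09-16 01:00:00 UTC
    row.put("id", 42);
    System.out.println(combinedPartition(partitioners(4), row)); // prints "18521_2"
  }
}
```

   In this sketch the first component ("18521") only decides which date-aligned
file the row goes to; the second ("2") stands in for the table config partition,
which is the only one that would end up in segment metadata.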




