kmozaid commented on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044436343


   > Hi @kmozaid thanks for taking the time to make this contribution. Can you 
explain what problem this solves? Is it because you already have a partitioning 
and you want to maintain locality within partitions?
   
   Hi @richardstartin, we have a table that ingests data from multiple 
sources, all of which push to the same Kafka topic. Data is kept for 5 days 
in the realtime table and then moved to the offline table by a minion task. 
We want to keep data from each source in separate segments in the offline 
table. A column identifies the source. The `BoundedColumnValue` partition 
function makes it possible to keep data from different sources in their own 
partitioned segments. If we later need to backfill the data of just one 
source, we can do so because we know which segments belong to that source and 
can replace them with the backfill. The main use case is backfilling the data 
of a particular source. This was also discussed in a Slack thread: 
https://apache-pinot.slack.com/archives/CDRCA57FC/p1643286670255700
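
   To illustrate the idea, here is a rough sketch of a partition function 
keyed on a bounded set of column values. It is not the code in this PR: the 
class name `SourcePartitioner`, its methods, and the choice of partition 0 as 
a catch-all for unlisted values are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; names and the catch-all convention are assumptions.
final class SourcePartitioner {
    // Maps each configured source value to its own partition id (1..N).
    private final Map<String, Integer> partitionBySource = new HashMap<>();

    SourcePartitioner(List<String> knownSources) {
        for (String source : knownSources) {
            // Ids are assigned in configuration order; repeated values keep their first id.
            partitionBySource.putIfAbsent(source, partitionBySource.size() + 1);
        }
    }

    // Returns the partition for a source value, or 0 if the value is not configured.
    int getPartition(String source) {
        return partitionBySource.getOrDefault(source, 0);
    }

    // One partition per configured value, plus the catch-all partition 0.
    int getNumPartitions() {
        return partitionBySource.size() + 1;
    }
}
```

   With sources `sourceA` and `sourceB` configured, every row from `sourceB` 
lands in partition 2, so a backfill of `sourceB` only needs to replace the 
segments of that partition.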
   
   <img width="982" alt="image" 
src="https://user-images.githubusercontent.com/8354145/154681296-cf689529-26fc-4d1f-a983-70e18db4f4bc.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


