[GitHub] [pinot] mcvsubbu commented on issue #10996: Questions about Stream level consumption model

via GitHub Wed, 28 Jun 2023 16:14:49 -0700


mcvsubbu commented on issue #10996:
URL: https://github.com/apache/pinot/issues/10996#issuecomment-1612227779


   On your questions:
   Stream-level consumption basically involves a single server consuming all 
partitions of the stream. Therefore, there is no guarantee that all replicas 
of the table (in the realtime part) have the same data. Consequently, each 
replica independently "closes" the segment. Since the segments can have 
slightly different data, the segment name is different for each replica. Also, 
the "closed" segments are never uploaded to deep store; they are retained 
locally on the server. If a server goes down, it needs to start afresh (it 
can't copy segments from anywhere else).
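
   As a rough illustration of why replicas diverge, here is a minimal sketch, 
not Pinot code. It assumes each replica reads all partitions in its own 
arbitrary interleaving and closes a segment after a fixed number of rows. The 
function name, the `rows_per_segment` threshold, and the segment naming scheme 
are all hypothetical.

```python
import random


def consume_replica(replica_id, partitions, rows_per_segment, seed):
    """Simulate one server consuming every partition of a stream.

    Returns a list of (segment_name, rows) pairs. The naming scheme is a
    made-up placeholder that only shows names being replica-specific.
    """
    rng = random.Random(seed)  # stands in for nondeterministic arrival order
    cursors = {p: 0 for p in partitions}
    segments, current = [], []
    while any(cursors[p] < len(rows) for p, rows in partitions.items()):
        # Pick any partition that still has unread rows.
        ready = [p for p, rows in partitions.items() if cursors[p] < len(rows)]
        p = rng.choice(ready)
        current.append(partitions[p][cursors[p]])
        cursors[p] += 1
        if len(current) == rows_per_segment:
            # Each replica "closes" its segment on its own, under its own name.
            name = f"myTable__{replica_id}__{len(segments)}"
            segments.append((name, frozenset(current)))
            current = []
    return segments


if __name__ == "__main__":
    parts = {0: [f"p0-{i}" for i in range(4)], 1: [f"p1-{i}" for i in range(4)]}
    for replica, seed in (("serverA", 1), ("serverB", 2)):
        for name, rows in consume_replica(replica, parts, 4, seed):
            print(name, sorted(rows))
```

   Both replicas end up with the same rows overall, but the boundary between 
one segment and the next will typically fall in a different place on each 
server, so segments cannot be shared or swapped between replicas.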
   
   Each server registers with a different consumer ID. The consumer IDs were 
also stored somewhere in ZooKeeper (and there is code for that somewhere in 
there).
   
   There were a lot of ZooKeeper watches on the controller. When a replica 
closed a segment, it would mark that in the segment ZK metadata. The controller 
watch would trigger, and the controller would create a new segment.
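
   The watch-driven flow above can be sketched with an in-memory stand-in for 
ZooKeeper. This is only an illustration under assumptions: `MetadataStore`, 
`Controller`, the `/segments/<replica>/<seq>` paths, and the `IN_PROGRESS` / 
`DONE` states are hypothetical names, not Pinot's or ZooKeeper's actual API.

```python
class MetadataStore:
    """In-memory stand-in for segment ZK metadata (not a real ZooKeeper client)."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def watch(self, callback):
        # Register a callback fired on every write, loosely like a ZK watch.
        self._watchers.append(callback)

    def set(self, path, value):
        self._data[path] = value
        for callback in list(self._watchers):
            callback(path, value)

    def get(self, path):
        return self._data.get(path)


class Controller:
    """Hypothetical controller that reacts to segment-completion writes."""

    def __init__(self, store):
        self.store = store
        self.next_seq = {}
        store.watch(self.on_change)

    def start_replica(self, replica_id):
        self.next_seq[replica_id] = 0
        self._create_segment(replica_id)

    def _create_segment(self, replica_id):
        seq = self.next_seq[replica_id]
        self.next_seq[replica_id] = seq + 1
        self.store.set(f"/segments/{replica_id}/{seq}", "IN_PROGRESS")

    def on_change(self, path, value):
        # A replica marked its segment closed: create that replica's next one.
        if value == "DONE":
            replica_id = path.split("/")[2]
            self._create_segment(replica_id)


if __name__ == "__main__":
    store = MetadataStore()
    controller = Controller(store)
    controller.start_replica("serverA")
    controller.start_replica("serverB")
    store.set("/segments/serverA/0", "DONE")  # serverA closes its segment
    print(store.get("/segments/serverA/1"))
```

   Note that the controller has to track and react to every replica 
separately, which hints at how the number of watches grows with the number of 
servers.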
   
   All this was an operational nightmare, and that is the reason we built LLC. 


