mcvsubbu commented on issue #10996: URL: https://github.com/apache/pinot/issues/10996#issuecomment-1612227779
On your questions:

Stream-level consumption basically involves a single server consuming all partitions of the stream. Therefore there is no guarantee that all replicas of the table (in the realtime part) have the same data. Consequently, each replica independently "closes" its segment. Since the segments can have slightly different data, the segment name is different for each replica. Also, the "closed" segments are never uploaded to deep store; they are retained locally on the server. If the server goes down, it needs to start afresh (it can't copy segments from somewhere else).

Each server registers with a different consumer ID. The consumer IDs were also stored somewhere in ZooKeeper (and there is code somewhere for that). There were a lot of ZooKeeper watches on the controller. When a replica closed a segment, it would mark that in the segment ZK metadata. The controller watch would trigger, and the controller would create a new segment.

All this was an operational nightmare, and that is the reason we built LLC.
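
To make the replica divergence concrete, here is a rough toy model in Python. It is not Pinot code. It assumes each replica reads all partitions in its own interleaving order and closes a segment after a fixed number of rows; the `consume` function, the `consumerA`/`consumerB` IDs, and the `<consumer_id>_seg<n>` naming are all made up for illustration.

```python
def consume(stream, consumer_id, start_partition, rows_per_segment):
    """Toy replica: reads every partition round-robin and closes a segment
    every `rows_per_segment` rows. Hypothetical model, not Pinot's consumer."""
    partitions = sorted(stream)
    # Rotate the partition order so each replica interleaves differently
    # (stands in for the lack of ordering guarantees across partitions).
    start = partitions.index(start_partition)
    order = partitions[start:] + partitions[:start]
    cursors = {p: 0 for p in partitions}
    segments = {}
    current = []
    remaining = sum(len(msgs) for msgs in stream.values())
    while remaining:
        for p in order:
            if cursors[p] < len(stream[p]):
                current.append(stream[p][cursors[p]])
                cursors[p] += 1
                remaining -= 1
                if len(current) == rows_per_segment:
                    # Segment name embeds the consumer ID (hypothetical format).
                    segments[f"{consumer_id}_seg{len(segments)}"] = current
                    current = []
    if current:
        segments[f"{consumer_id}_seg{len(segments)}"] = current
    return segments


stream = {0: ["a0", "a1", "a2"], 1: ["b0", "b1", "b2"]}
replica_a = consume(stream, "consumerA", start_partition=0, rows_per_segment=3)
replica_b = consume(stream, "consumerB", start_partition=1, rows_per_segment=3)

# Same stream, same threshold, but the first closed segments hold different rows:
# replica_a["consumerA_seg0"] is ["a0", "b0", "a1"]
# replica_b["consumerB_seg0"] is ["b0", "a0", "b1"]
```

In this model both replicas eventually see every row, but their segment boundaries do not line up, so a segment from one replica cannot stand in for a segment from another. That is consistent with the per-replica segment names and the start-afresh behavior described above.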
