Mike Percy has posted comments on this change.

Change subject: Add a design doc for rpc retry/failover semantics
......................................................................

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2642/4/docs/design-docs/rpc-retry-and-failover.md
File docs/design-docs/rpc-retry-and-failover.md:

Line 178: Retry handling on the client side and retry rendez-vous logic will be implemented at the
Here is something tricky related to plumbing the concept of exactly-once semantics back to the client:

- What happens when we have a client batch that includes multiple tablet writes? This type of operation can partially succeed and partially time out. I think as stated the design works fine as long as the per-tablet automatic retries are happening within the client timeout, but...

- What happens if the client timeout occurs and we get a partial timeout? Some of the items succeeded and some timed out. Will we provide a mechanism through the API to be able to access the replay cache when manually retrying these writes after a backoff period? Imagine this happens if a tablet fails over and it takes a while for a new leader to be elected. How should an ingest pipeline like Flume or Kafka handle this? Without the replay cache, for inserts, they are simply forced to ignore "row already exists" errors. Can the replay cache help them avoid that behavior?

--
To view, visit http://gerrit.cloudera.org:8080/2642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idc2aa40486153b39724e1c9bd09c626b829274c6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
