FlorianHockmann commented on PR #1792:
URL: https://github.com/apache/tinkerpop/pull/1792#issuecomment-1239127692

   I agree that adding retry logic directly into the drivers isn't trivial. 
@kenhuuu mentions an important point that we have to consider if we want to add 
general retry logic:
   
   > A possible improvement might be to test if the error should be retried. 
E.g. don't retry for non-recoverable errors like incorrect password.
   
   Apart from non-recoverable errors, can't we also run into a situation where 
a traversal was already evaluated successfully on the server, but sending its 
results back to the driver failed due to a network problem? The driver must 
treat that traversal as failed because it never received a successful response 
from the server, but simply resending it can be problematic if the traversal 
modified the graph. For example, we shouldn't resend an `addV()` step, as that 
would create duplicates.
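
   The scenario can be reproduced with a minimal Python sketch. `FakeServer`, 
`submit`, and `naive_retry` are hypothetical names used only for illustration; 
they are not part of any TinkerPop driver. The fake server applies the mutation 
and then loses the response on the first attempt:

```python
class FakeServer:
    """Hypothetical stand-in for a Gremlin server; not a TinkerPop API."""

    def __init__(self, lose_first_response=True):
        self.vertices = []
        self._lose_next_response = lose_first_response

    def submit(self, label):
        # The mutation is applied on the server first ...
        self.vertices.append(label)
        # ... and only afterwards does sending the response fail.
        if self._lose_next_response:
            self._lose_next_response = False
            raise ConnectionError("response lost after evaluation")
        return len(self.vertices)


def naive_retry(server, label, max_attempts=3):
    """Resend on any connection error, without checking for mutations."""
    for attempt in range(max_attempts):
        try:
            return server.submit(label)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise


server = FakeServer()
naive_retry(server, "person")
print(server.vertices)  # ['person', 'person']
```

   The driver saw one failure and one success, yet the graph now holds two 
vertices.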
   
   [This 
article](https://devblogs.microsoft.com/azure-sql/configurable-retry-logic-for-microsoft-data-sqlclient/)
 explains some considerations that went into adding retry logic to 
Microsoft's .NET SQL client. That's also where I got the scenario with mutating 
traversals that I just described. They handle it by letting users configure 
which SQL commands should and should not be retried, so that mutating commands 
can be excluded from retries.
   
   I think we can add retry logic to the drivers, but we should make it 
configurable for users. This means that users should be able to configure:
   1. Whether they want to use our retry logic at all (so they can implement 
their own instead or use no retry logic).
   2. Which exceptions should be retried, e.g., transient network errors but 
not failures from the server, or only specific failures from the server but not 
a `FORBIDDEN` response.
   3. Which traversals should be retried, e.g., retry a `g.V().has()[...]` 
traversal, but not a mutating traversal.
   4. The number of retries, the time to wait between retries, and so on 
(exponential backoff with or without random jitter could be added, but doesn't 
have to be, especially in the first version, in my opinion).
   
   Number 3 is probably a lot easier to implement than letting users specify 
whether they don't want to retry specific Gremlin steps or mutating traversals 
in general.
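
   As a rough sketch of how the four options listed above could fit together, 
here is a small, driver-independent Python example. Everything in it 
(`RetryPolicy`, `is_read_only`, `submit_with_retry`, the `MUTATING_STEPS` 
list) is a hypothetical name made up for illustration and not an existing 
TinkerPop API. The string-based check for mutating steps is a deliberately 
crude assumption, and the exponential delay is optional:

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type

# Crude, assumed heuristic: treat a query as mutating if it contains one of
# these Gremlin steps. A real driver would need something more robust.
MUTATING_STEPS = ("addV(", "addE(", "property(", "drop(")


def is_read_only(query: str) -> bool:
    return not any(step in query for step in MUTATING_STEPS)


@dataclass
class RetryPolicy:
    """Hypothetical configuration object covering options 1 to 4."""
    enabled: bool = True                                   # option 1
    retryable_exceptions: Tuple[Type[BaseException], ...] = (ConnectionError,)  # option 2
    is_retryable: Callable[[str], bool] = is_read_only     # option 3
    max_retries: int = 3                                   # option 4
    base_delay: float = 0.1                                # seconds
    jitter: bool = False


def submit_with_retry(submit, query, policy, sleep=time.sleep):
    """Call submit(query), retrying only when the policy allows it."""
    attempt = 0
    while True:
        try:
            return submit(query)
        except policy.retryable_exceptions:
            # Exceptions not listed in the policy propagate without a retry.
            if (not policy.enabled
                    or not policy.is_retryable(query)
                    or attempt >= policy.max_retries):
                raise
            delay = policy.base_delay * (2 ** attempt)
            if policy.jitter:
                delay = random.uniform(0, delay)
            sleep(delay)
            attempt += 1


# Usage: a submit function that fails twice with a transient error.
calls = []

def flaky_submit(query):
    calls.append(query)
    if len(calls) < 3:
        raise ConnectionError("transient network error")
    return "ok"

policy = RetryPolicy()
result = submit_with_retry(flaky_submit, "g.V().has('name', 'x')", policy,
                           sleep=lambda seconds: None)
```

   With this policy the read-only traversal is retried until it succeeds, 
while a traversal containing `addV()` would fail on the first 
`ConnectionError` and leave the decision to the caller.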


