[ 
https://issues.apache.org/jira/browse/TINKERPOP-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634194#comment-16634194
 ] 

Florian Hockmann commented on TINKERPOP-2043:
---------------------------------------------

{quote}Would you agree that Gremlin.Net's 
ConnectionPool.CloseAndRemoveAllConnections should recover on its own and 
complete its Client.SubmitAsync call or do you expect the consumer of 
Gremlin.Net to provide some work-around? If so, how can the pool be 
re-initiated or reset from outside of the Gremlin.Net library?
{quote}
The {{ConnectionPool}} should definitely recover on its own. However, when an 
existing WebSocket connection is in a broken state and throws an exception 
while a query is being sent to the server, Gremlin.Net does not handle that 
exception. Instead, users should implement a retry policy for those failed 
queries and ideally fix the underlying problem that caused the exception in the 
first place.
 {{CloseAndRemoveAllConnections}} is called when a connection enters a broken 
state, on the assumption that the whole server was probably unreachable for 
some time, which would affect all connections in the {{ConnectionPool}}. The 
idea is simply to close all connections in this case rather than reuse the 
broken connections for new incoming requests, which would result in more 
exceptions for the user. After all connections are closed, new ones have to be 
created for new incoming queries, and these should be in a good state again 
(assuming that the server is reachable again). 
 Now, you say that you are getting multiple exceptions coming from 
{{CloseAndRemoveAllConnections}} in a short time frame. This is probably 
because a connection is taken out of the pool while it is in use, so it cannot 
be closed by {{CloseAndRemoveAllConnections}}. When it gets returned to the 
pool by {{AddConnectionIfOpen}} (which is directly before 
{{CloseAndRemoveAllConnections}} in the stack trace), that function calls 
{{CloseAndRemoveAllConnections}} again if the connection is no longer open. 
This can lead to a few exceptions, depending on the number of parallel queries 
you are executing.

So, do you see a lot more exceptions than the number of queries you are 
executing in parallel? That would mean that there is some other problem with 
the {{ConnectionPool}} that we need to address. If not, the {{ConnectionPool}} 
is working as expected and you simply have to employ a retry policy. (That 
said, improvements to the {{ConnectionPool}} such as the planned TINKERPOP-1775 
could further reduce the number of exceptions in this case.)
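
A retry policy of the kind mentioned above could look roughly like the 
following minimal sketch. The {{Retry}} class, its {{ExecuteAsync}} method and 
its parameters are hypothetical names for illustration and are not part of 
Gremlin.Net. The sketch assumes that every exception is transient and waits a 
linearly growing delay between attempts; a real policy would probably filter 
on specific exception types.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of Gremlin.Net.
public static class Retry
{
    // Runs the operation and retries it after a failure, up to maxAttempts
    // attempts in total. The exception of the last attempt is rethrown.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = baseDelay ?? TimeSpan.FromMilliseconds(200);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            // Assumption: any exception is treated as transient. The filter
            // lets the exception of the final attempt propagate unchanged.
            catch (Exception) when (attempt < maxAttempts)
            {
                // Linear backoff: baseDelay, 2 * baseDelay, 3 * baseDelay, ...
                var wait = TimeSpan.FromTicks(delay.Ticks * attempt);
                await Task.Delay(wait).ConfigureAwait(false);
            }
        }
    }
}
```

With such a helper, a query could be submitted as 
{{await Retry.ExecuteAsync(() => client.SubmitAsync<dynamic>(query))}}, where 
{{client}} is the shared {{GremlinClient}}. Note that a retried write may be 
executed twice if the first attempt already reached the server, so retrying is 
only safe for queries that are idempotent.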
  
{quote}In regards to the connection pool having this many connection (96), is 
it possible that Florian's WebSocketConnection (#925) fix may mitigate this?
{quote}
This could _solve_ your problem, as it silently swallows exceptions that occur 
while a connection is being closed, which is what is happening according to 
your stack trace. You can try it out, as [version 
3.4.0-rc2|https://www.nuget.org/packages/Gremlin.Net/3.4.0-rc2] contains the 
fix from #925.
 It would definitely be good to know whether this fix also solves your problem.

 
{quote}Quick high-level question: In a web api, should the Gremlin.Net 
connection be a singleton or should each connection be instantiated separately?
{quote}
Only one {{GremlinClient}} instance should be used, since that client has its 
own {{ConnectionPool}}, which is meant to be shared across many requests.

 

 

> CloseAndRemoveAllConnections bubbles up System.Net.Http.WinHttpException and 
> doesn't recover
> --------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2043
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2043
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: dotnet
>    Affects Versions: 3.4.0
>         Environment: Gremlin.Net 3.4.0-rc1
> Microsoft.NetCore.App v2.0
>            Reporter: Sami
>            Priority: Critical
>
> We have a .Net Core web api with an Azure Cosmos db backend.
> Our data access layer calls 
> _gremlinConnection.Client.SubmitAsync<dynamic>(query) using a singleton 
> gremlin connection. This has worked well except when the API is called with 
> many parallel web api requests. The endpoints that we provide make around 5 
> gremlin calls (some to read, some to write cosmos db edges). 
> During performance testing we noticed System.Net.Http.WinHttpException
> Looking into the issue, we found that NrConnections was at 96 and the stack 
> trace showed that Gremlin.Net called 
> ConnectionPool.CloseAndRemoveAllConnections. 
> We would expect that Gremlin.Net would recover after a call to 
> CloseAndRemoveAllConnections and finish its "Client.SubmitAsync" call but 
> instead it bubbles up the System.Net.Http.WinHttpException and immediately 
> following calls to our API failed with the same 
> System.Net.Http.WinHttpException.
>  
> Would you agree that Gremlin.Net's 
> ConnectionPool.CloseAndRemoveAllConnections should recover on its own and 
> +complete+ its Client.SubmitAsync call or do you expect the consumer of 
> Gremlin.Net to provide some work-around? If so, how can the pool be 
> re-initiated or reset from outside of the Gremlin.Net library?
>  
> Here is the stack trace. Please note that we are ready for a release and this 
> is holding us back. By the way, we do use Gremlin.Net 3.4 RC1 but 3.3.3 also 
> proved to have the same issue.
>  
> {code:java}
> // code placeholder
> ProjectX.Exceptions.PersistenceFailedException:
> at ProjectX.Repository.Cosmos.UnitOfWork+<DeletePersonAsync>d__32.MoveNext 
> (<...removed...>)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.GetResult 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> ProjectX.ModelAPI.Controllers.PersonsController+<RemovePerson>d__7.MoveNext 
> (<...removed...>)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker+<InvokeActionMethodAsync>d__12.MoveNext
>  (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker+<InvokeNextActionFilterAsync>d__10.MoveNext
>  (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow 
> (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next 
> (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> at 
> Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker+<InvokeInnerFilterAsync>d__14.MoveNext
>  (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker+<InvokeNextExceptionFilterAsync>d__23.MoveNext
>  (Microsoft.AspNetCore.Mvc.Core, Version=2.0.2.0, Culture=neutral, 
> PublicKeyToken=adb9793829ddae60)
> Inner exception System.Net.Http.WinHttpException handled at 
> ProjectX.Repository.Cosmos.UnitOfWork+<DeletePersonAsync>d__32.MoveNext:
> at System.Net.WebSockets.WinHttpWebSocket.InternalCloseAsync 
> (System.Net.WebSockets.Client, Version=4.1.0.0, Culture=neutral, 
> PublicKeyToken=b03f5f7f11d50a3a)
> at System.Net.WebSockets.WinHttpWebSocket+<CloseAsync>d__30.MoveNext 
> (System.Net.WebSockets.Client, Version=4.1.0.0, Culture=neutral, 
> PublicKeyToken=b03f5f7f11d50a3a)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Gremlin.Net.Driver.WebSocketConnection+<CloseAsync>d__4.MoveNext 
> (Gremlin.Net, Version=3.4.0.0, Culture=neutral, 
> PublicKeyToken=d2035e9aa387a711)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Gremlin.Net.Driver.Connection+<CloseAsync>d__10.MoveNext (Gremlin.Net, 
> Version=3.4.0.0, Culture=neutral, PublicKeyToken=d2035e9aa387a711)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Gremlin.Net.Driver.ConnectionPool+<TeardownAsync>d__16.MoveNext 
> (Gremlin.Net, Version=3.4.0.0, Culture=neutral, 
> PublicKeyToken=d2035e9aa387a711)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Gremlin.Net.Process.Utils.WaitUnwrap (Gremlin.Net, Version=3.4.0.0, 
> Culture=neutral, PublicKeyToken=d2035e9aa387a711)
> at Gremlin.Net.Driver.ConnectionPool.CloseAndRemoveAllConnections 
> (Gremlin.Net, Version=3.4.0.0, Culture=neutral, 
> PublicKeyToken=d2035e9aa387a711)
> at Gremlin.Net.Driver.ConnectionPool.AddConnectionIfOpen (Gremlin.Net, 
> Version=3.4.0.0, Culture=neutral, PublicKeyToken=d2035e9aa387a711)
> at Gremlin.Net.Driver.ProxyConnection.Dispose (Gremlin.Net, Version=3.4.0.0, 
> Culture=neutral, PublicKeyToken=d2035e9aa387a711)
> at Gremlin.Net.Driver.GremlinClient+<SubmitAsync>d__6`1.MoveNext 
> (Gremlin.Net, Version=3.4.0.0, Culture=neutral, 
> PublicKeyToken=d2035e9aa387a711)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at Gremlin.Net.Driver.GremlinClientExtensions+<SubmitAsync>d__4`1.MoveNext 
> (Gremlin.Net, Version=3.4.0.0, Culture=neutral, 
> PublicKeyToken=d2035e9aa387a711)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at ProjectX.Repository.Cosmos.Writer+<ExecuteGremlinCommand>d__7.MoveNext 
> (<...removed...>)
> at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess 
> (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at 
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
>  (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, 
> PublicKeyToken=7cec85d7bea7798e)
> at ProjectX.Repository.Cosmos.UnitOfWork+<DeletePersonAsync>d__32.MoveNext 
> (<...removed...>)
> {code}
> Thank you so much for looking into this.
> We're hoping for a reasonably quick fix or good work-around strategy.
> Sami Abbushi
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
