[ 
https://issues.apache.org/jira/browse/TINKERPOP-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830910#comment-16830910
 ] 

ASF GitHub Bot commented on TINKERPOP-2205:
-------------------------------------------

divijvaidya commented on pull request #1105: TINKERPOP-2205 Change connection 
management to single request per channel
URL: https://github.com/apache/tinkerpop/pull/1105
 
 
   https://issues.apache.org/jira/browse/TINKERPOP-2205
   
   The code in this pull request changes the server interaction mechanism of 
the Gremlin open source Java client. The new code addresses problems and 
shortcomings discussed in the linked conversation 
[[1]](https://lists.apache.org/thread.html/77728cb77d4eab90f15680595e653ffc6055b74db29cbd4dcd5f0339@%3Cdev.tinkerpop.apache.org%3E).
 More specifically, the problems addressed are as follows:
   1. Difficulty in configuring the client for optimum performance.
   2. Undocumented dependency of configuration parameters on each other.
   3. A bad request can impact other requests on the same channel.
   4. Host is marked as dead even if it is busy serving requests.
   5. No way to free up server resources if the client has stopped consuming 
results.
   6. Application code cannot differentiate between retriable and 
non-retriable exceptions.
   7. Keep alive is only sent when a query is executing, which means that a 
connection open for a very long time with no query being sent will be closed by 
the server.
   8. Race condition if the server response arrives before the result queue has 
been registered.
   9. Unpredictable behaviour if the server sends an exception followed by a 
genuine response for the same request.
   10. A concurrent hash map (tracking pending requests) is a point of 
contention amongst threads.
   ### Changes
   1. ResultSet can be closed.

       * This allows the client to tell the server to relinquish resources 
associated with this request. 
   2. Single request per connection. No channel multiplexing.

       * A rogue response (such as one that causes an IOException by exceeding 
the content length) no longer affects the other in-flight requests.

       * Each request has its own bandwidth.
   3. Removed the custom keep-alive logic and replaced it with Netty's 
IdleStateHandler.

       * Makes the client more robust.
   4. Deprecated InProcess and SimultaneousUsage configuration parameters.
       * Users now have to configure only a single parameter to set the 
concurrency of requests.
   5. Throw distinct exception types to the application code, which makes it 
easy to determine what can be retried and what cannot.

   6. Handle errors gracefully during WebSocket handshake.
       * Makes the client robust
   7. Close the websocket channel gracefully (with a close frame).
       * Server closes the channel gracefully on receiving the close frame.
   8. Use Epoll instead of NIO whenever possible.

       * Epoll provides better performance on Linux platforms.
   9. Run chooseConnection asynchronously on executor threads.

       * Increases thread utilization. In general, a lot of effort has been made 
to improve thread utilization.
   10. Make the client resilient to multiple responses from the server for the 
same request.
   11. Client operations do not rely on the UUID of the request provided by the 
server.
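
Changes 2 and 4 together imply that request concurrency is bounded by the number of connections. The following is a minimal, library-independent sketch of that idea and not the driver's actual code: `SingleRequestPool`, `maxConnections`, `submit`, and `idleConnections` are hypothetical names, and a `Semaphore` permit stands in for a WebSocket channel.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical sketch: every in-flight request holds exactly one "connection" (a permit). */
final class SingleRequestPool {
    private final Semaphore connections;

    SingleRequestPool(int maxConnections) {
        // The single concurrency knob: at most this many requests run at once.
        this.connections = new Semaphore(maxConnections);
    }

    /**
     * Borrows a connection, runs the request asynchronously, and returns the
     * connection when the request finishes. If no connection becomes free
     * within waitMillis, the returned future fails with a TimeoutException.
     */
    <T> CompletableFuture<T> submit(Supplier<T> request, long waitMillis) {
        CompletableFuture<T> failed = new CompletableFuture<>();
        try {
            if (!connections.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                failed.completeExceptionally(
                        new TimeoutException("no free connection within " + waitMillis + " ms"));
                return failed;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            failed.completeExceptionally(e);
            return failed;
        }
        // The permit is released whether the request succeeds or throws.
        return CompletableFuture.supplyAsync(request)
                .whenComplete((result, error) -> connections.release());
    }

    /** Number of connections not currently serving a request. */
    int idleConnections() {
        return connections.availablePermits();
    }
}
```

Because a request never shares its connection, a failure in one request cannot corrupt the state of another; the cost is that the pool size directly caps throughput.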
   
   ### Backward compatibility with 3.4.x/3.3.x
   **Application layer code** - The new client is backward compatible at the 
application layer: existing code requires no change unless it relies on 
specific types of exceptions thrown by the client.
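
Application code that does depend on exception types might separate retriable from non-retriable failures along the following lines. This is only a sketch: `RetriableException`, `NonRetriableException`, and `RetrySketch.call` are hypothetical stand-ins, not the exception classes this pull request introduces.

```java
import java.util.function.Supplier;

/** Hypothetical marker for failures that are safe to retry (for example, a connection timeout). */
class RetriableException extends RuntimeException {
    RetriableException(String message) { super(message); }
}

/** Hypothetical marker for failures that will not succeed on retry (for example, a malformed query). */
class NonRetriableException extends RuntimeException {
    NonRetriableException(String message) { super(message); }
}

final class RetrySketch {
    /** Runs the request, retrying only on RetriableException, for at most maxAttempts attempts. */
    static <T> T call(Supplier<T> request, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RetriableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.get();
            } catch (RetriableException e) {
                last = e; // transient failure: loop and try again
            }
            // Any other exception, including NonRetriableException, propagates immediately.
        }
        throw last; // all attempts failed with a retriable error
    }
}
```

A caller would wrap each submission, for example `RetrySketch.call(() -> runQuery(), 3)`, where `runQuery` is whatever application method issues the request.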


   
   **Channelizer** - Although the channelizer interface hasn’t changed, custom 
implementations of the channelizer will have to change their code to work with 
the new client.
   ### Limitations
   1. A client generating high TPS from a single machine will have to raise 
the OS limit on the maximum number of open files, because each connection 
consumes one file descriptor on Linux.
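
As a rough way to see how close a JVM is to that limit, the sketch below reads the descriptor counts exposed by the JDK's `com.sun.management.UnixOperatingSystemMXBean`. The class name `FileDescriptorHeadroom` is hypothetical, and the bean is only available on Unix-like platforms with JDKs that ship the `jdk.management` module, so the method returns an empty value elsewhere.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.OptionalLong;

final class FileDescriptorHeadroom {
    /** Returns how many more descriptors this JVM may open, or empty if the platform does not report it. */
    static OptionalLong remaining() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return OptionalLong.of(
                    unix.getMaxFileDescriptorCount() - unix.getOpenFileDescriptorCount());
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        OptionalLong headroom = remaining();
        System.out.println(headroom.isPresent()
                ? "Descriptors still available: " + headroom.getAsLong()
                : "Descriptor limits are not exposed on this platform");
    }
}
```

If the headroom is smaller than the intended number of concurrent requests, the per-process open-file limit needs to be raised before the client is started.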
   ### Benchmarks
   Benchmark code will be shared soon in this PR and results will be updated 
here. During preliminary testing, there was no difference in performance. This 
is because channels are re-used, so the only additional overhead is at 
bootstrap, when the client performs more WebSocket handshakes (because it 
opens more connections) than the older code did.
   ### Testing
   1. Added a new test suite. 
   2. All existing tests pass.
      * gremlin-driver: mvn clean install -DskipIntegrationTests=false
      * gremlin-server: mvn clean install -DskipIntegrationTests=false
   ### Post merge work
   1. Write a document describing how the client works.
   2. Add examples of efficient usage of client.
   3. Update change log.
   4. Update documentation.
   ### Future work
   1. Add a default retry strategy for timeouts while trying to obtain a 
connection.
   2. Add a strategy to remove a suspect host from the load balancer (without 
impacting existing requests).
 


> Use one connection per request for Java client
> ----------------------------------------------
>
>                 Key: TINKERPOP-2205
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2205
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: driver
>            Reporter: Divij Vaidya
>            Priority: Major
>
> This issue is a tracking item for the conversation in the mailing list 
> [[1]|https://lists.apache.org/thread.html/77728cb77d4eab90f15680595e653ffc6055b74db29cbd4dcd5f0339@%3Cdev.tinkerpop.apache.org%3E]
>  which highlights multiple problems and shortcomings in the existing Java 
> client and proposes a design change in the client connection pooling to 
> address the same. More specifically, the problems addressed are as follows:
>  # Difficulty in configuring the client for optimum performance.
>  # Undocumented dependency of configuration parameters on each other.
>  # A bad request can impact other requests on the same channel.
>  # Host is marked as dead even if it is busy serving requests.
>  # No way to free up server resources if the client has stopped consuming 
> results.
>  # Application code cannot differentiate between retriable and 
> non-retriable exceptions.
>  # Keep alive is only sent when a query is executing, which means that a 
> connection open for a very long time with no query being sent will be closed 
> by the server.
>  # Race condition if the server response arrives before the result queue has 
> been registered.
>  # Unpredictable behaviour if the server sends an exception followed by a 
> genuine response for the same request.
>  # A concurrent hash map (tracking pending requests) is a point of contention 
> amongst threads.
> [1]https://lists.apache.org/thread.html/77728cb77d4eab90f15680595e653ffc6055b74db29cbd4dcd5f0339@%3Cdev.tinkerpop.apache.org%3E



