[ 
https://issues.apache.org/jira/browse/TINKERPOP-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette updated TINKERPOP-2424:
----------------------------------------
    Description: 
Originally mentioned here:

https://groups.google.com/g/gremlin-users/c/I4HQC9JkzSo/m/fYfd5o0UAQAJ

and it's pretty easy to reproduce with an empty TinkerGraph in Gremlin Server with 
{{evaluationTimeout}} set to something large, using this script in the Gremlin 
Console with {{-Xmx512}}:

{code}
cluster = Cluster.open()
client = cluster.connect()
client.submit("g.addV().as('a').addE('self').iterate()")
rs = client.submit("g.V().emit().repeat(out()).valueMap(true).limit(10000000)");[]
iterator = rs.iterator();[]
x = 0
while(iterator.hasNext()) {
  x++
  if (x % 10000 == 0) {
    System.out.println(x + "-[" + rs.getAvailableItemCount() + "]-" + iterator.next());
  }
}
{code}

The {{LinkedBlockingQueue}} backing the {{ResultQueue}} is unbounded and can fill 
faster than it is consumed, so on a system with limited memory an OOME can 
result.
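For context, the unbounded-vs-bounded behavior is easy to demonstrate in isolation. The following is only an illustrative sketch (the class and helper method are made up, not driver code): {{offer()}} on an unbounded {{LinkedBlockingQueue}} always succeeds, so a fast producer can grow it without limit, while a bounded queue starts rejecting at capacity.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class QueueGrowthDemo {
    // Offer n items and report how many the queue accepted.
    static int fillViaOffer(LinkedBlockingQueue<Integer> q, int n) {
        int accepted = 0;
        for (int i = 0; i < n; i++) {
            if (q.offer(i)) accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Unbounded: offer() never fails, so the queue simply keeps growing
        // until the heap runs out -- the OOME risk described above.
        System.out.println(fillViaOffer(new LinkedBlockingQueue<>(), 1_000_000));
        // Bounded: offer() returns false once capacity is reached, which is
        // the signal a producer can use to apply backpressure.
        System.out.println(fillViaOffer(new LinkedBlockingQueue<>(100), 1_000_000));
    }
}
```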

While we tend to discourage iteration of large result sets (e.g. {{g.V()}}), I 
suppose we should do what we can to keep users out of OOME situations. I'm not 
sure of the best way to do this, but some simple experimentation showed that 
bounding the queue helps (tried with 100000), though it does require that the 
adding of new results block until more are consumed.

{code}
+++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/Connection.java
@@ -233,7 +233,7 @@ final class Connection {
 
                         cluster.executor().submit(() -> resultQueueSetup.completeExceptionally(f.cause()));
                     } else {
-                        final LinkedBlockingQueue<Result> resultLinkedBlockingQueue = new LinkedBlockingQueue<>();
+                        final LinkedBlockingQueue<Result> resultLinkedBlockingQueue = new LinkedBlockingQueue<>(100000);
                         final CompletableFuture<Void> readCompleted = new CompletableFuture<>();
 
                         readCompleted.whenCompleteAsync((v, t) -> {
diff --git a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
index 29a6453431..4b52ae1671 100644
--- a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
+++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ResultQueue.java
@@ -70,7 +70,7 @@ final class ResultQueue {
      * @param result a return value from the {@link Traversal} or script submitted for execution
      */
     public void add(final Result result) {
-        this.resultLinkedBlockingQueue.offer(result);
+        while(!this.resultLinkedBlockingQueue.offer(result)) {}
         tryDrainNextWaiting(false);
    }
{code}
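One note on the experimental patch: spinning on {{offer()}} burns CPU while the queue is full. {{LinkedBlockingQueue.put()}} achieves the same blocking without spinning, though parking the IO thread carries its own risks. A rough sketch of that variant (a standalone made-up class, not the actual driver {{ResultQueue}}):

```java
import java.util.concurrent.LinkedBlockingQueue;

// Standalone sketch only -- not the actual driver ResultQueue. It bounds the
// queue and uses put(), which parks the producer when the queue is full
// rather than busy-spinning on offer() as the experimental patch does.
final class BoundedResultQueueSketch<T> {
    private final LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>(100_000);

    public void add(final T result) {
        try {
            queue.put(result); // blocks until a consumer frees capacity
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // restore interrupt status
        }
    }

    public T poll() {
        return queue.poll(); // null when empty
    }
}
```

The trade-off is that blocking here could stall the thread delivering results from the network, so an alternative worth weighing would be to pause reads on the channel until the queue drains.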



> Reduce chance for OOME with large results to Java driver
> --------------------------------------------------------
>
>                 Key: TINKERPOP-2424
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2424
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: driver
>    Affects Versions: 3.4.8
>            Reporter: Stephen Mallette
>            Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
