[
https://issues.apache.org/jira/browse/TINKERPOP-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vikas Yadav updated TINKERPOP-2454:
-----------------------------------
Description:
We have created a REST API that executes a Gremlin query on JanusGraph and
returns the result in JSON format. The API works fine for small result sets, but
for large result sets, when we hit the API asynchronously, it fails with the
following error (max heap size {{-Xmx4g}}):
{quote}java.lang.OutOfMemoryError: GC overhead limit exceeded
{quote}
I am using curl with {{&}} to hit the API asynchronously:
{code}
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
{code}
Code to connect to JanusGraph:
{code:java}
Cluster cluster = Cluster.open(config);
Client client = cluster.connect();
ResultSet resultSet = client.submit(gremlin);
Iterator<Result> resultIterator = resultSet.iterator();
int count = 0;
while (resultIterator.hasNext()) {
    resultIterator.next();
    count++;
    // add to list -- commented out to rule the list out as the cause of the OOM error
}
{code}
Configurations:
{code:java}
config.setProperty("connectionPool.maxContentLength", "50000000");
config.setProperty("connectionPool.maxInProcessPerConnection", "30");
config.setProperty("connectionPool.maxSize", "30");
config.setProperty("connectionPool.minSize", "1");
config.setProperty("connectionPool.resultIterationBatchSize", "200");
{code}
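For reference, the same pool settings can also be expressed through the driver's fluent builder instead of string properties (a sketch assuming the 3.4.x {{Cluster.build()}} API; the contact point and port are placeholders):

```java
import org.apache.tinkerpop.gremlin.driver.Cluster;

// Equivalent connection-pool settings via the fluent builder.
// "HOST" and 8182 are placeholder values, not from the original report.
Cluster cluster = Cluster.build()
        .addContactPoint("HOST")
        .port(8182)
        .maxContentLength(50_000_000)
        .maxInProcessPerConnection(30)
        .maxConnectionPoolSize(30)
        .minConnectionPoolSize(1)
        .resultIterationBatchSize(200)
        .create();
```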
Gremlin driver:
{code}
org.apache.tinkerpop:gremlin-driver:3.4.6
{code}
The query returns around *17K records, roughly 80 MB in size.*
*How can we handle a large result set like a cursor, so that not all the data is
loaded into memory at once?*
Is there any configuration that I am missing?
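The cursor-style behaviour asked about above can be approximated on the client side by draining the iterator in fixed-size batches and releasing each batch before fetching the next, instead of collecting everything into one list. A minimal, self-contained sketch: the {{Iterator<String>}} here is a hypothetical stand-in for the driver's {{Iterator<Result>}}, since the real one requires a live Gremlin Server.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BatchDrain {
    // Drain an iterator in fixed-size batches so that at most
    // batchSize items are held in memory at any one time.
    static int drainInBatches(Iterator<String> results, int batchSize) {
        int total = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (results.hasNext()) {
            batch.add(results.next());
            if (batch.size() == batchSize) {
                total += batch.size();
                batch.clear();          // release this batch before fetching more
            }
        }
        total += batch.size();          // flush the final partial batch
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for Iterator<Result> from resultSet.iterator().
        Iterator<String> fake = IntStream.range(0, 17_000)
                .mapToObj(i -> "record-" + i)
                .collect(Collectors.toList())
                .iterator();
        System.out.println(drainInBatches(fake, 200));  // prints 17000
    }
}
```

With the real driver the same loop applies unchanged; {{resultIterationBatchSize}} only controls how many results the server sends per frame, so the client must still avoid buffering the whole result set itself.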
From profiling, it is clear that the gremlin driver is causing the issue, but I
am not sure how to fix it and release the memory.
Please let me know if you need more details.
Thanks.
> OOM error when running gremlin queries asynchronously with JAVA
> ---------------------------------------------------------------
>
> Key: TINKERPOP-2454
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2454
> Project: TinkerPop
> Issue Type: Bug
> Components: driver
> Affects Versions: 3.4.6, 3.4.8
> Reporter: Vikas Yadav
> Priority: Blocker
> Attachments: jc8rc.png, wUqam.png
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)