[ 
https://issues.apache.org/jira/browse/TINKERPOP-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644647#comment-17644647
 ] 

ASF GitHub Bot commented on TINKERPOP-2831:
-------------------------------------------

ministat commented on PR #1873:
URL: https://github.com/apache/tinkerpop/pull/1873#issuecomment-1342218361

   @cole-bq I'm interested in your exploration. Using a Java exception to 
stop the iteration is not recommended, because filling in the stack trace 
hurts performance. The current workaround is to throw an exception without a 
stack trace.
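   
   A minimal sketch of that workaround, assuming a hypothetical class name 
(`StacklessNoSuchElementException` is for illustration only and is not taken 
from this patch). It overrides `fillInStackTrace()` so that no stack walk 
happens, and reuses one shared instance so that nothing is allocated per 
throw:

```java
import java.util.NoSuchElementException;

// Hypothetical name, for illustration only.
final class StacklessNoSuchElementException extends NoSuchElementException {
    // One shared instance: nothing is allocated when iteration ends.
    private static final StacklessNoSuchElementException INSTANCE =
            new StacklessNoSuchElementException();

    private StacklessNoSuchElementException() {
        super("end of iteration");
    }

    static NoSuchElementException instance() {
        return INSTANCE;
    }

    // Skip the expensive stack walk; the recorded trace stays empty.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }
}
```

   An iterator would then `throw StacklessNoSuchElementException.instance();` 
instead of `new NoSuchElementException()`. The trade-off is that a shared 
instance cannot carry a per-call message or a useful trace, which only works 
if no caller inspects them.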
   
   I checked every place that catches NoSuchElementException. Several other 
places currently catch it to stop the traversal, and none of them uses the 
stack trace. They only want to break out of the loop or traversal, which is 
the case this patch addresses.
   
   Some other graph databases already build on TinkerPop and depend on this 
exception in their own logic, such as JanusGraph, mentioned in this issue. 
So if you prefer a different solution, please keep backward compatibility 
in mind.
   
   ```
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/TraversalUtil.java:45:
        } catch (final NoSuchElementException e) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/TraversalUtil.java:113:
        } catch (final NoSuchElementException e) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:156:
            } catch (final NoSuchElementException e) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Traversal.java:187:
        } catch (final NoSuchElementException ignored) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Traversal.java:212:
        } catch (final NoSuchElementException ignored) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Traversal.java:267:
        } catch (final NoSuchElementException ignore) {
   
src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Traversal.java:280:
        } catch (final NoSuchElementException ignore) {
   ```
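   
   To illustrate why a stackless subclass stays backward compatible with 
catch sites like these, here is a small sketch. All names in it 
(`QuietNoSuchElementException`, `DrainSketch`, `countTo`, `drain`) are 
hypothetical, and the loop only assumes the general shape "call `next()` 
until it throws", not the actual TinkerPop code:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stackless subclass; existing catch blocks still match it.
final class QuietNoSuchElementException extends NoSuchElementException {
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this; // no stack walk
    }
}

final class DrainSketch {
    private static final NoSuchElementException END = new QuietNoSuchElementException();

    // Hypothetical iterator that signals exhaustion by throwing the shared instance.
    static Iterator<Integer> countTo(final int limit) {
        return new Iterator<Integer>() {
            private int next = 1;

            @Override
            public boolean hasNext() {
                return next <= limit;
            }

            @Override
            public Integer next() {
                if (next > limit) throw END;
                return next++;
            }
        };
    }

    // Loops until the exception breaks out; the catch clause is unchanged
    // from what a caller written against NoSuchElementException would use.
    static List<Integer> drain(final Iterator<Integer> it) {
        final List<Integer> out = new ArrayList<>();
        try {
            while (true) {
                out.add(it.next());
            }
        } catch (final NoSuchElementException e) {
            // End of iteration; the stack trace is never inspected.
        }
        return out;
    }
}
```

   Because the thrown object is still a `NoSuchElementException`, a 
downstream project that catches that type would not need to change.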
   




> Throw NoSuchElementException frequently which slowers the performance
> ---------------------------------------------------------------------
>
>                 Key: TINKERPOP-2831
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2831
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.5.4
>            Reporter: Redriver
>            Priority: Major
>         Attachments: Screen Shot 2022-11-24 at 11.35.40.png
>
>
> When I run g.V().label().groupCount() on a huge graph (600m vertices + 6 
> billion edges), the JVM async profiler shows that NoSuchElementException 
> is a hotspot. That exception is only used to tell the caller that the 
> iteration has reached its end, so the stack trace information is never 
> used. In addition, creating a new exception every time is unnecessary.
> {code:java}
> java.lang.Throwable.fillInStackTrace(Native Method)
> java.lang.Throwable.fillInStackTrace(Throwable.java:783) => holding Monitor(java.util.NoSuchElementException@1860956919})
> java.lang.Throwable.<init>(Throwable.java:250)
> java.lang.Exception.<init>(Exception.java:54)
> java.lang.RuntimeException.<init>(RuntimeException.java:51)
> java.util.NoSuchElementException.<init>(NoSuchElementException.java:46)
> org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraphIterator.next(TinkerGraphIterator.java:63)
> org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.getOrCreateVertex(JanusGraphVertexDeserializer.java:192)
> org.janusgraph.hadoop.formats.util.JanusGraphVertexDeserializer.readHadoopVertex(JanusGraphVertexDeserializer.java:153)
> org.janusgraph.hadoop.formats.util.HadoopRecordReader.nextKeyValue(HadoopRecordReader.java:69)
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:230)
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:220)
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:348)
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
> org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
> org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
> org.apache.spark.scheduler.Task.run(Task.scala:121)
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:416)
> org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:422)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}



