[ https://issues.apache.org/jira/browse/TINKERPOP-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275054#comment-15275054 ]
ASF GitHub Bot commented on TINKERPOP-1295:
-------------------------------------------

GitHub user twilmes opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/307

    TINKERPOP-1295 Precompile ScriptInputFormat scripts once during initialization of ScriptRecordReader

    This update precompiles the input script once and then reads the input
    file using the compiled script instead of calling engine.eval() for
    every record. This should cut down on the time spent repeatedly
    eval-ing the input script.

    I ran a quick and dirty benchmark on my measly MacBook with
    SparkGraphComputer, 2 workers: `g.V().count()` on a test file with
    250,000 vertices and a simple `.groovy` script to read it in.

    Average of 10 runs
    -------------------------
    before (TP 3.2.0 - engine.eval): 14975.7 ms
    after (TP-1295 - w/ compiled script): 10163.6 ms

    `mvn clean install` success

    VOTE: +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1295

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/307.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #307

----
commit be6f8d79f606cc0a39dd45457e26328dd31e5636
Author: Ted Wilmes <twil...@gmail.com>
Date:   2016-05-06T19:54:54Z

    Precompile scripts during ScriptRecordReader initialization.
----

> Precompile ScriptInputFormat scripts once during initialization of ScriptRecordReader
> -------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1295
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1295
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop, io
>    Affects Versions: 3.2.0-incubating, 3.1.2-incubating
>            Reporter: Ted Wilmes
>            Assignee: Ted Wilmes
>         Attachments: intern.svg
>
>
> The {{ScriptRecordReader}} evals the input script on every {{nextKeyValue()}}
> call. I think we can cut down on script evaluation time by precompiling the
> input script once. This should speed up bulk loads. I've attached some
> profiling info showing a large chunk of time being spent on this eval during
> a recent test run.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
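
For readers unfamiliar with the compile-once pattern the issue describes, here is a minimal sketch of the general idea. It is an analogy in Python, not the TinkerPop patch: the actual change concerns a Groovy script run through a script engine inside ScriptRecordReader. The SCRIPT string, the "id:neighbor,neighbor" line format, and both function names are hypothetical and chosen only for illustration.

```python
import timeit

# Hypothetical stand-in for a user-supplied parse script. It turns a line
# such as "1:2,3" into (vertex_id, [neighbor_ids]). Not a TinkerPop format.
SCRIPT = (
    "(int(line.split(':')[0]), "
    "[int(n) for n in line.split(':')[1].split(',') if n])"
)


def read_with_eval(lines):
    """Evaluate the script source for every record (source is re-parsed each time)."""
    return [eval(SCRIPT, {"line": line}) for line in lines]


def read_precompiled(lines):
    """Compile the script once up front, then run the compiled code per record."""
    code = compile(SCRIPT, "<parse-script>", "eval")
    return [eval(code, {"line": line}) for line in lines]


if __name__ == "__main__":
    # Synthetic input; timings depend on the machine and are not comparable
    # to the benchmark figures quoted in the pull request.
    sample = [f"{i}:{i + 1},{i + 2}" for i in range(10_000)]
    for fn in (read_with_eval, read_precompiled):
        seconds = timeit.timeit(lambda: fn(sample), number=5)
        print(f"{fn.__name__}: {seconds:.3f} s for 5 passes")
```

Both functions return the same records; the only difference is whether the script source is parsed per record or once before the read loop. Under the assumption that parsing dominates for short scripts, the second form does less work per record.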