[ https://issues.apache.org/jira/browse/GIRAPH-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689514#comment-13689514 ]

Hudson commented on GIRAPH-683:
-------------------------------

Integrated in Giraph-trunk-Commit #1010 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1010/])
    GIRAPH-683: Jython for Computation (nitay) (Revision 8f89bd85a03a1fec25e21e334631931f69078040)

     Result = SUCCESS
nitay : 
http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=8f89bd85a03a1fec25e21e334631931f69078040
Files : 
* giraph-core/src/main/java/org/apache/giraph/conf/TypesHolder.java
* giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java
* giraph-core/src/test/java/org/apache/giraph/jython/TestJython.java
* giraph-core/src/main/java/org/apache/giraph/conf/StrConfOption.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphClasses.java
* giraph-examples/src/test/java/org/apache/giraph/TestBspBasic.java
* giraph-core/src/main/java/org/apache/giraph/utils/FileUtils.java
* giraph-core/src/main/java/org/apache/giraph/io/formats/IntIntNullTextInputFormat.java
* README
* giraph-core/src/main/java/org/apache/giraph/jython/JythonComputationFactory.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java
* giraph-core/src/main/java/org/apache/giraph/conf/ClassConfOption.java
* CHANGELOG
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
* giraph-core/src/main/java/org/apache/giraph/utils/ReflectionUtils.java
* giraph-examples/src/test/java/org/apache/giraph/TestComputationState.java
* giraph-core/src/main/java/org/apache/giraph/conf/LongConfOption.java
* giraph-core/src/main/java/org/apache/giraph/utils/DistributedCacheUtils.java
* giraph-core/src/main/java/org/apache/giraph/conf/AbstractConfOption.java
* giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java
* giraph-core/src/main/java/org/apache/giraph/graph/Language.java
* pom.xml
* giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java
* giraph-core/src/test/java/org/apache/giraph/BspCase.java
* giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java
* giraph-core/src/main/java/org/apache/giraph/conf/BooleanConfOption.java
* giraph-core/src/main/java/org/apache/giraph/conf/GiraphTypes.java
* giraph-examples/src/main/java/org/apache/giraph/examples/GeneratedVertexReader.java
* giraph-core/src/test/java/org/apache/giraph/utils/TestReflectionUtils.java
* giraph-core/src/main/java/org/apache/giraph/conf/ConfOptionType.java
* giraph-examples/src/test/java/org/apache/giraph/TestGraphPartitioner.java
* giraph-core/src/main/java/org/apache/giraph/graph/ComputationFactory.java
* giraph-core/src/main/java/org/apache/giraph/io/formats/LongLongNullTextInputFormat.java
* giraph-core/src/main/java/org/apache/giraph/benchmark/BenchmarkOption.java
* giraph-core/src/main/java/org/apache/giraph/jython/package-info.java
* giraph-core/src/main/java/org/apache/giraph/jython/JythonUtils.java
* giraph-core/src/main/java/org/apache/giraph/jython/DeployType.java
* giraph-core/pom.xml
* giraph-core/src/main/resources/org/apache/giraph/benchmark/page-rank.py
* giraph-core/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java
* giraph-core/src/main/java/org/apache/giraph/master/SuperstepClasses.java
* giraph-core/src/main/java/org/apache/giraph/conf/FloatConfOption.java
* giraph-core/src/main/java/org/apache/giraph/conf/AllOptions.java
* giraph-core/src/test/resources/org/apache/giraph/jython/count-edges.py
* giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java
* giraph-core/src/main/java/org/apache/giraph/conf/EnumConfOption.java
* giraph-core/src/main/java/org/apache/giraph/graph/Computation.java
* giraph-core/src/main/java/org/apache/giraph/conf/IntConfOption.java
* giraph-core/src/main/java/org/apache/giraph/graph/DefaultComputationFactory.java
* giraph-core/src/main/java/org/apache/giraph/job/GiraphConfigurationValidator.java

                
> Jython for Computation
> ----------------------
>
>                 Key: GIRAPH-683
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-683
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>
> Support for writing Computation code in Python. We add Jython bindings so 
> that the Python computation code can communicate back with the Java Giraph 
> classes.
> To make this work I had to change a few parts of Giraph:
> 1) The Jython computation is not known until we read the script and create a 
> Computation object for it at runtime. This has to be done on each worker 
> separately after the job has launched. Because of this, there is no 
> Computation class set at the beginning. I suspect other scripting languages 
> will have a similar issue. To fix this I created a ComputationFactory interface 
> which is responsible for creating the Computation, with a default that just 
> grabs the class from the Configuration and creates it.
> 2) I created a GiraphTypes class to hold the I,V,E,M1,M2 classes. There was a 
> lot of repetitive code around these things so centralizing it all in one 
> place made things a lot cleaner.
> 3) I added some more helpers like isDefaultValue() to our conf options. I 
> also added EnumConfOption.
> 4) The ReflectionUtils type inference was broken for interfaces. I fixed it 
> by putting in TypeTools, a library that does it better.
> 5) I added a TypesHolder interface (with the help of item 4) that people can 
> extend to describe the types used. Computation implements this. I use this 
> with Jython so that the user can provide something that describes the types 
> without requiring any methods.
> 6) Fixed GiraphConfigurationValidator to work with interfaces and cleaned it up.
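
The factory idea in item 1 can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not Giraph's actual code: the interfaces here are simplified stand-ins that only share names with the real ones, and the configuration is a plain Map with a made-up key. It shows a default factory that reads a class name from the configuration and instantiates it, next to a second factory that needs no class up front, which is the situation a script-backed computation is in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a computation; not Giraph's real signature.
interface Computation {
    String name();
}

// Hypothetical, simplified factory: builds a Computation from configuration at runtime.
interface ComputationFactory {
    Computation createComputation(Map<String, String> conf);
}

// Default behaviour described in item 1: grab the class from the configuration and create it.
final class DefaultComputationFactory implements ComputationFactory {
    // Made-up configuration key, used only for this sketch.
    static final String COMPUTATION_CLASS_KEY = "sketch.computation.class";

    @Override
    public Computation createComputation(Map<String, String> conf) {
        String className = conf.get(COMPUTATION_CLASS_KEY);
        if (className == null) {
            throw new IllegalStateException("No computation class configured");
        }
        try {
            Class<?> cls = Class.forName(className);
            return (Computation) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }
}

// A trivial computation so the default factory has something to instantiate.
final class EchoComputation implements Computation {
    @Override
    public String name() {
        return "echo";
    }
}

final class ComputationFactorySketch {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(DefaultComputationFactory.COMPUTATION_CLASS_KEY,
                 EchoComputation.class.getName());
        System.out.println(new DefaultComputationFactory().createComputation(conf).name());

        // A factory that needs no class in the configuration, standing in for the
        // script-backed case where the computation only exists after the script is read.
        ComputationFactory scripted = c -> () -> "script:" + c.get("sketch.script.path");
        conf.put("sketch.script.path", "count-edges.py");
        System.out.println(scripted.createComputation(conf).name());
    }
}
```

The point of the indirection is that callers only ever ask the factory for a Computation, so a worker can build one from a script after the job has launched without a class having been set beforehand.
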
> To use Jython, all the user has to do is call JythonUtils#init(...) somewhere 
> during initialization.
> I also added it to GiraphRunner. To use it that way, you give an HDFS path 
> to the Python file as the Computation. It takes a little more work because 
> you also need to supply the new options --typesHolder and --jythonClass.
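
As an illustration of what "describes types but without requiring any methods" (item 5) can look like, here is a small sketch. It is an assumption-laden example, not the TypesHolder from the patch: SketchTypesHolder, PageRankTypes and TypesResolver are hypothetical names, and the resolution uses plain java.lang.reflect rather than the TypeTools library mentioned in item 4. It only handles a class that implements the holder interface directly with concrete type arguments.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical marker interface: it declares five type parameters and no methods.
interface SketchTypesHolder<I, V, E, M1, M2> { }

// A user-supplied holder naming the vertex id, vertex value, edge value
// and two message types purely through its type arguments.
final class PageRankTypes
        implements SketchTypesHolder<Long, Double, Float, Double, Double> { }

final class TypesResolver {
    private TypesResolver() { }

    // Reads the concrete type arguments off a direct SketchTypesHolder implementation.
    static Class<?>[] resolve(Class<?> holderClass) {
        for (Type iface : holderClass.getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) iface;
                if (pt.getRawType() == SketchTypesHolder.class) {
                    Type[] args = pt.getActualTypeArguments();
                    Class<?>[] resolved = new Class<?>[args.length];
                    for (int i = 0; i < args.length; i++) {
                        if (!(args[i] instanceof Class)) {
                            throw new IllegalArgumentException(
                                "Type argument " + i + " is not a concrete class");
                        }
                        resolved[i] = (Class<?>) args[i];
                    }
                    return resolved;
                }
            }
        }
        throw new IllegalArgumentException(
            holderClass.getName() + " does not directly implement SketchTypesHolder");
    }

    public static void main(String[] args) {
        for (Class<?> c : resolve(PageRankTypes.class)) {
            System.out.println(c.getSimpleName());
        }
    }
}
```

A real resolver would also have to walk superclasses and inherited interfaces, which is the kind of case a dedicated type-resolution library is meant to cover.
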
> This patch contains our PageRank benchmark implementation in Jython. I added 
> an option (--jython) which chooses whether to run the default or the Jython 
> version.
> Here is the initial PageRankBenchmark comparison (200 workers, 1B vertices, 
> 200 edges per vertex):
> Java:
>   Total (milliseconds)            1,702,429  0  1,702,429
>   Superstep 3 (milliseconds)        316,844  0    316,844
>   Setup (milliseconds)               13,226  0     13,226
>   Shutdown (milliseconds)               113  0        113
>   Superstep 0 (milliseconds)        300,950  0    300,950
>   Superstep 4 (milliseconds)        318,627  0    318,627
>   Input superstep (milliseconds)    114,673  0    114,673
>   Superstep 5 (milliseconds)          7,898  0      7,898
>   Superstep 2 (milliseconds)        312,152  0    312,152
>   Superstep 1 (milliseconds)        317,942  0    317,942
> Jython:
>   Total (milliseconds)            2,123,228  0  2,123,228
>   Superstep 3 (milliseconds)        406,422  0    406,422
>   Setup (milliseconds)                7,159  0      7,159
>   Shutdown (milliseconds)               131  0        131
>   Superstep 0 (milliseconds)        347,732  0    347,732
>   Superstep 4 (milliseconds)        405,696  0    405,696
>   Input superstep (milliseconds)    112,645  0    112,645
>   Superstep 5 (milliseconds)         46,687  0     46,687
>   Superstep 2 (milliseconds)        410,349  0    410,349
>   Superstep 1 (milliseconds)        386,404  0    386,404
> That's a mere 25% overhead in total run time (2,123,228 ms vs. 1,702,429 ms).
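
The overhead figure can be rechecked from the totals quoted above: (2,123,228 - 1,702,429) / 1,702,429 is about 24.7%. The snippet below is just that arithmetic written out; OverheadCheck and overheadPercent are names made up for this example, and the inputs are the numbers from the two listings.

```java
final class OverheadCheck {
    private OverheadCheck() { }

    // Relative slowdown of the candidate over the baseline, in percent.
    static double overheadPercent(long baselineMs, long candidateMs) {
        return 100.0 * (candidateMs - baselineMs) / baselineMs;
    }

    public static void main(String[] args) {
        // Totals from the Java and Jython listings.
        System.out.printf("Total:       %.1f%%%n", overheadPercent(1_702_429L, 2_123_228L));
        // The same comparison for a single compute superstep (Superstep 3).
        System.out.printf("Superstep 3: %.1f%%%n", overheadPercent(316_844L, 406_422L));
    }
}
```

The total includes setup and input loading, which cost about the same in both runs, so individual compute supersteps show a somewhat larger gap than the overall figure.
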
> Take a look at the reviewboard for the latest patch: 
> https://reviews.apache.org/r/11709/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
