[ https://issues.apache.org/jira/browse/MRQL-98?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15591986#comment-15591986 ]
Hudson commented on MRQL-98:
----------------------------

SUCCESS: Integrated in Jenkins build mrql-master-snapshot #43 (See [https://builds.apache.org/job/mrql-master-snapshot/43/])
[MRQL-98] Improve Data Serialization in Spark Evaluation (fegaras: [https://git-wip-us.apache.org/repos/asf?p=incubator-mrql.git&a=commit&h=ab6bac73711b0d5fa5f05fa2c9b4c558c0576042])
* (edit) gen/src/main/java/org/apache/mrql/gen/VariableLeaf.java
* (edit) core/src/main/java/org/apache/mrql/MRContainer.java
* (edit) core/src/main/java/org/apache/mrql/MR_variable.java
* (edit) gen/src/main/java/org/apache/mrql/gen/Node.java
* (edit) core/src/main/java/org/apache/mrql/MR_bool.java
* (edit) core/src/main/java/org/apache/mrql/Lambda.java
* (edit) core/src/main/java/org/apache/mrql/MR_float.java
* (edit) core/src/main/java/org/apache/mrql/MR_int.java
* (edit) core/src/main/java/org/apache/mrql/MR_more_bsp_steps.java
* (edit) core/src/main/java/org/apache/mrql/Union.java
* (edit) spark/src/main/java/org/apache/mrql/MR_rdd.java
* (edit) core/src/main/java/org/apache/mrql/MR_long.java
* (edit) core/src/main/java/org/apache/mrql/Inv.java
* (edit) core/src/main/java/org/apache/mrql/MR_double.java
* (edit) spark/src/main/java/org/apache/mrql/RDDDataSource.java
* (edit) core/src/main/java/org/apache/mrql/MR_sync.java
* (edit) core/src/main/java/org/apache/mrql/MRData.java
* (edit) core/src/main/java/org/apache/mrql/MR_dataset.java
* (edit) core/src/main/java/org/apache/mrql/MR_byte.java
* (edit) flink/src/main/java/org/apache/mrql/FData.java
* (edit) flink/src/main/java/org/apache/mrql/MR_flink.java
* (edit) core/src/main/java/org/apache/mrql/Tuple.java
* (edit) core/src/main/java/org/apache/mrql/MR_char.java
* (edit) core/src/main/java/org/apache/mrql/Bag.java
* (edit) core/src/main/java/org/apache/mrql/MR_short.java
* (edit) core/src/main/java/org/apache/mrql/MR_string.java

> Improve Data Serialization in Spark Evaluation
> ----------------------------------------------
>
>                 Key: MRQL-98
>                 URL: https://issues.apache.org/jira/browse/MRQL-98
>             Project: MRQL
>          Issue Type: Improvement
>          Components: Run-Time/Spark
>    Affects Versions: 0.9.8
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Critical
>
> MRQL data (MRData) are serialized as Writable (for Hadoop Map-Reduce), Java
> Serializable (for Spark), and CopyableValue (for Flink). Until now, the Spark
> MRQL engine used a wrapper for MRData (called MRContainer) to serialize
> data using the Writable methods. Some data used in Spark mode, though, were
> left unwrapped, so Spark fell back to the default Java serialization, which
> was inefficient. With this patch, MRData becomes Serializable with custom
> serialization methods that are very efficient. My performance evaluation of
> the Pagerank query over 10 million links, run on a cluster with 16 cores,
> gives a 38% improvement compared to the old Spark evaluation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
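
For readers unfamiliar with the mechanism the issue describes, here is a minimal sketch of a Serializable class that routes Java serialization through Writable-style methods. It is not the code from the patch: `IntValue`, `CompactSerializationSketch`, and `roundTrip` are hypothetical names, and the `write`/`readFields` pair only mimics the shape of Hadoop's Writable interface using `java.io.DataOutput` and `java.io.DataInput`, so that the example needs nothing beyond the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CompactSerializationSketch {

    /** Hypothetical stand-in for an MRQL value; not the actual MR_int class. */
    static final class IntValue implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: excluded from default field serialization; written by hand below
        private transient int value;

        IntValue(int value) { this.value = value; }

        int get() { return value; }

        // Writable-style methods (same shape as Hadoop's Writable, assumed here)
        void write(DataOutput out) throws IOException { out.writeInt(value); }

        void readFields(DataInput in) throws IOException { value = in.readInt(); }

        // Hooks that Java serialization calls instead of its reflective default.
        // ObjectOutputStream is a DataOutput, so write() can be reused directly.
        private void writeObject(ObjectOutputStream out) throws IOException {
            write(out);
        }

        // ObjectInputStream is a DataInput, so readFields() can be reused directly.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            readFields(in);
        }
    }

    /** Serializes and deserializes an object in memory, as a framework would over the wire. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        IntValue copy = roundTrip(new IntValue(42));
        System.out.println(copy.get());
    }
}
```

With hooks like these, a value handed directly to Java serialization is still encoded by the hand-written methods, without a wrapper object. The sketch shows only the mechanism; it makes no claim about how much of the reported improvement comes from it.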