[jira] [Commented] (HAMA-990) GSoC'16: Apache Hama benchmark against Spark and Flink

Behroz Sikander (JIRA) Thu, 19 May 2016 19:24:06 -0700

    [ 
https://issues.apache.org/jira/browse/HAMA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292566#comment-15292566
 ]


Behroz Sikander commented on HAMA-990:
--------------------------------------

Ok. So, I am listing the basic steps (requirements) that the script would 
perform. I will refine them over time as I get more information.


0- MRQL needs Hama/Flink/Spark to be installed already, so we assume that 
they are installed and configured.

1- Download and extract the latest stable Apache MRQL release.

2- If the default configuration in mrql-env.sh [1] is fine, do nothing. 
Otherwise, update mrql-env.sh with the correct versions.

3- Read the command input (all | kmeans ...), prepare the input data for the 
selected algorithm(s), and place it in HDFS (e.g. mrql.bsp -dist -nodes 50 
RMAT.mrql 100000 1000000)

4- Execute the algorithm (mrql.bsp -dist -nodes 50 pagerank.mrql)

5- Dump the output

6- Repeat the same algorithm on the other platforms
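
To make steps 3, 4 and 6 more concrete, here is a rough sketch of how the 
script could build the MRQL command lines per platform. It only prints the 
commands and does not run anything. The mrql.bsp invocations are the ones from 
the steps above. The launcher names mrql.spark and mrql.flink, and the idea 
that they take the same -dist and -nodes flags, are my assumptions and need to 
be checked against the MRQL release. The function names are just placeholders.

```python
# Sketch only: builds the command lines for steps 3, 4 and 6, runs nothing.
# ASSUMPTION: mrql.spark and mrql.flink exist and accept the same
# -dist / -nodes flags as mrql.bsp. Only the mrql.bsp form is from the steps.
LAUNCHERS = {
    "hama": "mrql.bsp",
    "spark": "mrql.spark",   # assumed launcher name
    "flink": "mrql.flink",   # assumed launcher name
}


def build_commands(platform, nodes=50,
                   generator="RMAT.mrql",
                   generator_args=("100000", "1000000"),
                   query="pagerank.mrql"):
    """Return [data-generation argv, algorithm argv] for one platform."""
    base = [LAUNCHERS[platform], "-dist", "-nodes", str(nodes)]
    prepare = base + [generator] + list(generator_args)  # step 3
    execute = base + [query]                             # step 4
    return [prepare, execute]


def build_plan(platforms=("hama", "spark", "flink"), **kwargs):
    """Step 6: repeat the same two commands for every platform."""
    return {p: build_commands(p, **kwargs) for p in platforms}


if __name__ == "__main__":
    for platform, commands in build_plan().items():
        for argv in commands:
            print(platform + ": " + " ".join(argv))
```

Keeping the commands as argument lists should make it easy to hand them to a 
process runner later, and to add kmeans or other queries by changing the 
generator and query arguments.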

[1] https://github.com/apache/incubator-mrql/blob/master/conf/mrql-env.sh
[2] https://mrql.incubator.apache.org/getting_started.html

> GSoC'16: Apache Hama benchmark against Spark and Flink
> ------------------------------------------------------
>
>                 Key: HAMA-990
>                 URL: https://issues.apache.org/jira/browse/HAMA-990
>             Project: Hama
>          Issue Type: Documentation
>            Reporter: Behroz Sikander
>            Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
