[jira] [Created] (HAMA-983) Hama runner for DataFlow

Edward J. Yoon (JIRA) Sun, 14 Feb 2016 16:38:27 -0800

Edward J. Yoon created HAMA-983:
-----------------------------------

             Summary: Hama runner for DataFlow
                 Key: HAMA-983
                 URL: https://issues.apache.org/jira/browse/HAMA-983
             Project: Hama
          Issue Type: Bug
            Reporter: Edward J. Yoon



As you already know, Apache Beam provides a unified programming model for both 
batch and streaming inputs.

The APIs mostly deal with filtering and transforming data, so we'll need to 
implement a data processing runner for Hama, along the lines of 
https://github.com/dapurv5/MapReduce-BSP-Adapter/blob/master/src/main/java/org/apache/hama/mapreduce/examples/WordCount.java

Implementing a similarity join could also be fun. According to 
http://www.ruizhang.info/publications/TPDS2015-Heads_Join.pdf, Apache Hama 
clearly comes out ahead of Apache Hadoop and Apache Spark.

Since a similarity join consists of transformation, aggregation, and partition 
computations, I think it's possible to implement one using the Apache Beam APIs.
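
To make that decomposition concrete, here is a rough sketch in plain Java. It 
does not use the Beam or Hama APIs, and it is not the algorithm from the paper 
above: the class and method names are hypothetical, and Jaccard similarity over 
word tokens is only an assumed similarity measure chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical illustration only: a set-similarity self-join split into
// transformation, partition, and aggregation steps.
public class SimilarityJoinSketch {

    // Transformation: map a record to its set of lowercase word tokens.
    public static Set<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("\\s+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Partition: group record ids by token, so that only records sharing
    // at least one token end up in the same partition.
    public static Map<String, List<Integer>> partitionByToken(List<Set<String>> tokenSets) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int id = 0; id < tokenSets.size(); id++) {
            for (String token : tokenSets.get(id)) {
                index.computeIfAbsent(token, k -> new ArrayList<>()).add(id);
            }
        }
        return index;
    }

    // Aggregation: sum the shared-token counts per candidate pair, then keep
    // the pairs whose Jaccard similarity reaches the threshold.
    public static Map<String, Double> join(List<String> records, double threshold) {
        List<Set<String>> tokenSets = records.stream()
                .map(SimilarityJoinSketch::tokenize)
                .collect(Collectors.toList());

        Map<String, Integer> overlap = new TreeMap<>();
        for (List<Integer> ids : partitionByToken(tokenSets).values()) {
            for (int i = 0; i < ids.size(); i++) {
                for (int j = i + 1; j < ids.size(); j++) {
                    // ids are stored in ascending order, so the key is "low-high".
                    overlap.merge(ids.get(i) + "-" + ids.get(j), 1, Integer::sum);
                }
            }
        }

        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : overlap.entrySet()) {
            String[] pair = e.getKey().split("-");
            int a = Integer.parseInt(pair[0]);
            int b = Integer.parseInt(pair[1]);
            int shared = e.getValue();
            int union = tokenSets.get(a).size() + tokenSets.get(b).size() - shared;
            double jaccard = (double) shared / union;
            if (jaccard >= threshold) {
                result.put(e.getKey(), jaccard);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "apache hama bsp", "apache hama graph", "spark streaming");
        System.out.println(join(records, 0.5));
    }
}
```

In a Beam pipeline the three steps would presumably map onto per-element 
transforms, a group-by-key, and a per-key combine, but that mapping is an 
assumption here and would need to be checked against the actual runner design.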



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
