Edward J. Yoon created HAMA-983:
---
Summary: Hama runner for DataFlow
Key: HAMA-983
URL: https://issues.apache.org/jira/browse/HAMA-983
Project: Hama
Issue Type: Bug
Reporter: Edward J. Yoon
As you already know, Apache Beam provides a unified programming model for both
batch and streaming inputs.
The APIs are mostly concerned with filtering and transforming data, so
we'll need to implement a data processing runner, something like
https://github.com/dapurv5/MapReduce-BSP-Adapter/blob/master/src/main/java/org/apache/hama/mapreduce/examples/WordCount.java
Also, implementing similarity join could be fun. According to
http://www.ruizhang.info/publications/TPDS2015-Heads_Join.pdf, Apache Hama is
the clear winner over Apache Hadoop and Apache Spark.
Since a similarity join consists of transformation, aggregation, and partition
computations, I think it's possible to implement it using the Apache Beam APIs.
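
To make the three kinds of computation concrete, here is a rough sketch of a
set-similarity join in plain Python. It is only an illustration: it does not
use the Beam or Hama APIs, it is not the algorithm from the paper above, and
the function names, the whitespace tokenizer, and the choice of Jaccard
similarity are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def tokenize(records):
    # Transformation: turn each (id, text) record into id -> set of tokens.
    return {rid: frozenset(text.lower().split()) for rid, text in records}


def partition_by_token(token_sets):
    # Partition: group record ids by token, so that only records sharing
    # at least one token end up in the same group.
    index = defaultdict(set)
    for rid, tokens in token_sets.items():
        for tok in tokens:
            index[tok].add(rid)
    return index


def similarity_join(records, threshold):
    token_sets = tokenize(records)
    index = partition_by_token(token_sets)

    # Aggregation: count the shared tokens for every candidate pair.
    overlap = defaultdict(int)
    for rids in index.values():
        for a, b in combinations(sorted(rids), 2):
            overlap[(a, b)] += 1

    # Keep the pairs whose Jaccard similarity reaches the threshold.
    result = {}
    for (a, b), shared in overlap.items():
        union = len(token_sets[a]) + len(token_sets[b]) - shared
        score = shared / union
        if score >= threshold:
            result[(a, b)] = score
    return result


if __name__ == "__main__":
    docs = [(1, "apache hama bsp"), (2, "apache hama graph"), (3, "spark streaming")]
    print(similarity_join(docs, 0.5))
```

In a real runner each of these steps would presumably map to a transform in
the pipeline, with the per-token grouping acting as the shuffle between them.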
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)