Hadoop Abacus, a package for performing simple counting/aggregation
-------------------------------------------------------------------

                 Key: HADOOP-908
                 URL: https://issues.apache.org/jira/browse/HADOOP-908
             Project: Hadoop
          Issue Type: New Feature
          Components: contrib/streaming
            Reporter: Runping Qi


The Hadoop Abacus package is a specialization of the map/reduce framework
for performing various counting and aggregation tasks.
It offers functionality similar to Google's Sawzall.

Generally speaking, to implement an application using the Map/Reduce model,
the developer needs to implement Map and Reduce functions (and possibly a
Combine function).
However, for many applications related to counting and statistics computing,
these functions have very similar characteristics.
Abacus abstracts out the general patterns and provides a package implementing
those patterns.
In particular, the package provides a generic mapper class, a reducer class,
and a combiner class, along with a set of built-in value aggregators.
It also provides a generic utility class, ValueAggregatorJob, for creating
Abacus jobs.
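
To make the idea of a "value aggregator" concrete, here is a minimal
plain-Java sketch. It does not use the Abacus classes: the interface and class
names (SketchAggregator, LongSumSketch, UniqCountSketch, AggregatorSketchDemo)
are hypothetical, and the choice of a sum and a unique-value count as examples
is an assumption, since the description above does not list the built-in
aggregators.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface, not the Abacus API: an aggregator folds a stream
// of string values into a single report string.
interface SketchAggregator {
    void addNextValue(String value);
    String getReport();
}

// Sums values that parse as longs.
class LongSumSketch implements SketchAggregator {
    private long sum = 0;
    public void addNextValue(String value) { sum += Long.parseLong(value); }
    public String getReport() { return Long.toString(sum); }
}

// Counts the number of distinct values seen.
class UniqCountSketch implements SketchAggregator {
    private final Set<String> seen = new HashSet<>();
    public void addNextValue(String value) { seen.add(value); }
    public String getReport() { return Integer.toString(seen.size()); }
}

class AggregatorSketchDemo {
    // Feeds every value to the aggregator and returns its final report.
    static String aggregate(SketchAggregator agg, List<String> values) {
        for (String v : values) {
            agg.addNextValue(v);
        }
        return agg.getReport();
    }

    public static void main(String[] args) {
        System.out.println(aggregate(new LongSumSketch(), Arrays.asList("3", "4", "5")));
        System.out.println(aggregate(new UniqCountSketch(), Arrays.asList("x", "y", "x")));
    }
}
```

Because every aggregator exposes the same two operations, generic mapper,
combiner, and reducer code can drive any of them without knowing what is being
computed.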

To create an Abacus job, the user only needs to implement one plugin class
that specifies which aggregators to use and which values go to which
aggregators.
The mapper calls this class at runtime to generate aggregation ids and values.
The generic combiner and reducer then aggregate the values associated with the
same aggregation id.
Thus, it is much easier to create and run an Abacus job than a normal
map/reduce job. Since a built-in generic combiner is always used, values are
pre-aggregated on the map side, which keeps execution efficient.
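
The flow described above can be sketched in plain Java, run in-process
without Hadoop. This is an illustration under stated assumptions, not the
Abacus implementation: the names DescriptorSketch, WordStatsDescriptor, and
AbacusFlowSketch are hypothetical, and encoding the aggregator type as a
prefix of the aggregation id ("LongSum:", "LongMax:") is a convention chosen
for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plugin interface: the single class a user would write.
interface DescriptorSketch {
    // Returns (aggregation id, value) pairs for one input record.
    List<Map.Entry<String, String>> generateKeyValPairs(String record);
}

// Example plugin: count each word and track the longest line length.
class WordStatsDescriptor implements DescriptorSketch {
    public List<Map.Entry<String, String>> generateKeyValPairs(String record) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String word : record.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>("LongSum:" + word, "1"));
            }
        }
        out.add(new SimpleEntry<>("LongMax:lineLength",
                                  Integer.toString(record.length())));
        return out;
    }
}

class AbacusFlowSketch {
    // Generic combine step: merges two values according to the
    // aggregator type named in the id prefix (assumed convention).
    static String combine(String id, String a, String b) {
        long x = Long.parseLong(a);
        long y = Long.parseLong(b);
        if (id.startsWith("LongSum:")) return Long.toString(x + y);
        if (id.startsWith("LongMax:")) return Long.toString(Math.max(x, y));
        throw new IllegalArgumentException("unknown aggregator in id: " + id);
    }

    // Generic map step plus aggregation: calls the plugin for each record
    // and folds values that share an aggregation id.
    static Map<String, String> run(DescriptorSketch plugin, List<String> records) {
        Map<String, String> result = new TreeMap<>();
        for (String record : records) {
            for (Map.Entry<String, String> kv : plugin.generateKeyValPairs(record)) {
                result.merge(kv.getKey(), kv.getValue(),
                             (a, b) -> combine(kv.getKey(), a, b));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> r =
            run(new WordStatsDescriptor(), Arrays.asList("a b a", "b c"));
        // prints {LongMax:lineLength=5, LongSum:a=2, LongSum:b=2, LongSum:c=1}
        System.out.println(r);
    }
}
```

Only WordStatsDescriptor is application-specific here; the combine and run
steps stay the same for any plugin, which is the reuse the package aims for.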







        
