hbase (coprocessors & cell tags) used in hadoop-yarn

Vrushali Channapattan Mon, 21 Dec 2015 23:08:08 -0800

A group of us in the Hadoop community are working on YARN's next-gen
timeline service component (https://issues.apache.org/jira/browse/YARN-2928).
For each application that runs on a Hadoop cluster, it will store all of
the application stats, workflow metadata and container metrics information
in HBase tables (some plain HBase tables and some Phoenix-based ones).

We have been thinking about validating some of the implementation
approaches we are taking with HBase. It would be great to get some feedback
on the code and design from the HBase dev perspective.

Among other things, we are making use of cell tags in coprocessors for
summation, min and max operations on different versions of cells in a given
column, during reads as well as during flush and compaction operations.
Some relevant sub-JIRAs that deal with HBase coprocessors:
https://issues.apache.org/jira/browse/YARN-4062
https://issues.apache.org/jira/browse/YARN-3901
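
To make the idea concrete, here is a minimal plain-Java sketch of collapsing
several versions of a cell into one value according to an aggregation hint.
It is not the code from the JIRAs above and uses no HBase classes: AggOp
stands in for whatever the cell tag encodes, CellVersion stands in for one
version of a cell, and all names are hypothetical, chosen only for
illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of tag-driven aggregation. No HBase API is used here;
// every name in this class is hypothetical.
final class CellAggregationSketch {

    // Stand-in for the aggregation operation that a cell tag would carry.
    enum AggOp { SUM, MIN, MAX }

    // Stand-in for one version of a cell in a column.
    static final class CellVersion {
        final long timestamp;
        final long value;

        CellVersion(long timestamp, long value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Collapse all versions of one column into a single value, the way a
    // coprocessor might during a read, flush or compaction.
    static long aggregate(List<CellVersion> versions, AggOp op) {
        if (versions.isEmpty()) {
            throw new IllegalArgumentException("no cell versions to aggregate");
        }
        long result = versions.get(0).value;
        for (int i = 1; i < versions.size(); i++) {
            long v = versions.get(i).value;
            switch (op) {
                case SUM: result += v; break;
                case MIN: result = Math.min(result, v); break;
                case MAX: result = Math.max(result, v); break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<CellVersion> versions = Arrays.asList(
            new CellVersion(1L, 10L),
            new CellVersion(2L, 30L),
            new CellVersion(3L, 20L));
        System.out.println("sum=" + aggregate(versions, AggOp.SUM));
        System.out.println("min=" + aggregate(versions, AggOp.MIN));
        System.out.println("max=" + aggregate(versions, AggOp.MAX));
    }
}
```

In the real implementation the operation would be read from the tags on each
cell rather than passed in as a parameter; the sketch only shows the
per-column reduction itself.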

We have the schema documented, with example records, in the code as well as
in a PDF on the JIRA.

https://github.com/apache/hadoop/blob/feature-YARN-2928/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunTable.java#L34

https://github.com/apache/hadoop/blob/feature-YARN-2928/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityTable.java#L40

Schema jira (pdf attachment that describes the schema)
https://issues.apache.org/jira/browse/YARN-3411

We would appreciate any feedback or comments you have, and would be glad to
answer any questions in more depth.

thanks
Vrushali
