Evelyn Boland updated HBASE-28837:
----------------------------------
Description:
Goal:
Add a coprocessor to HBase that allows administrators to track high-level
statistics on the rows and cells in their HBase tables. Administrators can load
this coprocessor into their RegionServers to gain more visibility into the
shape of their data in HBase.
At my day job, we've used the statistics from this coprocessor to
automatically configure better-suited block sizes and smarter compaction
schedules for our fleet of nearly 200 HBase clusters.
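As a rough illustration of the kind of high-level row and cell statistics
described above, here is a minimal sketch in plain Java. It deliberately does
not use the HBase coprocessor API, and every name in it (RowStatsSketch,
recordRow, and the getters) is hypothetical rather than taken from the patch or
the design doc.

```java
import java.util.List;

/** Hypothetical accumulator for row and cell statistics; not the class from the patch. */
public class RowStatsSketch {
    private long rowCount;
    private long cellCount;
    private long totalBytes;
    private long largestRowBytes;
    private long largestCellBytes;

    /** Records one row, given the size in bytes of each of its cells. */
    public void recordRow(List<Integer> cellSizes) {
        long rowBytes = 0;
        for (int size : cellSizes) {
            rowBytes += size;
            largestCellBytes = Math.max(largestCellBytes, size);
        }
        rowCount++;
        cellCount += cellSizes.size();
        totalBytes += rowBytes;
        largestRowBytes = Math.max(largestRowBytes, rowBytes);
    }

    public long getRowCount() { return rowCount; }
    public long getCellCount() { return cellCount; }
    public long getLargestRowBytes() { return largestRowBytes; }
    public long getLargestCellBytes() { return largestCellBytes; }

    /** Average number of cells per row, or 0 if no rows were recorded. */
    public double averageCellsPerRow() {
        return rowCount == 0 ? 0.0 : (double) cellCount / rowCount;
    }

    /** Average row size in bytes, or 0 if no rows were recorded. */
    public double averageRowBytes() {
        return rowCount == 0 ? 0.0 : (double) totalBytes / rowCount;
    }
}
```

Numbers like the largest row size or the average row size are the sort of
input that could inform a block size choice. Where the real coprocessor
collects them is covered by the design doc, not by this sketch.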
Context:
Since HBase tables can store terabytes or even petabytes of data, HBase
administrators often have incomplete information about the data stored in their
HBase tables. Without a comprehensive understanding of the shape of their data,
it can be difficult for administrators to configure clusters for a desired
level of performance and/or reliability. Row statistics can close that gap by
giving administrators concrete numbers to tune against.
[Design doc|https://docs.google.com/document/d/1oaNAZUER5zO8yivmzRBVAMmL6r2cYiJn9YCbDe14LMw/edit#heading=h.nch5d72p27ex]
> Add row statistics coprocessor
> ------------------------------
>
> Key: HBASE-28837
> URL: https://issues.apache.org/jira/browse/HBASE-28837
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0, 3.0.0-beta-1
> Reporter: Evelyn Boland
> Assignee: Evelyn Boland
> Priority: Major
> Labels: pull-request-available