[
https://issues.apache.org/jira/browse/HBASE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096986#comment-13096986
]
Arun C Murthy commented on HBASE-4329:
--------------------------------------
bq. Why you say that? (I don't disagree but a list of why's would help figure
what the fit criteria for closing this issue are).
Stack, first up, I didn't mean to start a flame war - I'm sure you know that. :)
FWIW, talking to folks around, isolation and support for prioritization to
ensure a single user/application cannot *hog* an HBase cluster (or parts
thereof) is something I've heard raised as a concern. This dovetails very well
with our experience running both HDFS and MapReduce at scale, as a shared
resource.
Again, this isn't to claim it's a solved problem in Hadoop core, just something
we've focussed on, for a while now.
Hence, my thinking was we could use YARN as an intermediate solution. I
discussed this idea with Andrew at the Summit and he didn't give me the
impression that I was off my rocker; maybe he was just being polite and has a
great poker face!
Thanks for pointing me to HBASE-4120, that seems related - I wasn't aware. It's
a lot to digest, I'll try to spend some time on it. If the HBase community
decides to focus on the multi-tenancy/isolation problem (via HBASE-4120 etc.) -
great! We can close this discussion. If not, I'd like to brainstorm with you
guys for an intermediate solution.
It really depends where you guys want to focus your energies.
bq. Meantime, where I work, mapreduce is the problem (smile). We're messing
with cgroup containing mapreduce so it doesn't steal resources from hdfs (and
hbase).
I'm sure - MR needs more work, I'm painfully aware of this! :)
We plan to go the cgroups route sometime right after we ship 0.23; we could
share notes and ideas.
bq. You want us to get into the nextgen mr container because then there is one
place to go to do accounting?
The idea is that *iff* the HBase community wants to use this as an intermediate
solution, using the RM will ensure the resource usage of HBase is accounted for
w.r.t. the applications/queues/organizations etc.
> Use NextGen Hadoop to deploy HBase
> ----------------------------------
>
> Key: HBASE-4329
> URL: https://issues.apache.org/jira/browse/HBASE-4329
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Arun C Murthy
>
> Currently (circa 2011), with due respect, it's not practical to run shared,
> multi-tenant HBase clusters on the largest Hadoop installs (of 4000+ nodes).
> As an interim, I'd like to brainstorm using NextGen Hadoop (MAPREDUCE-279) to
> deploy HBase for focussed sets of applications/users/organizations. Thus, one
> could deploy a smaller instance of HBase (100s of nodes) in a large Hadoop
> cluster and use it for a set of applications.
> The other advantage is that the resource usage of HBase (master,
> region-server etc.) is accounted for in the overall utilization of the
> cluster, which could conceivably aid in resource tracking, capacity planning etc.
> ----
> Thoughts?