[ https://issues.apache.org/jira/browse/HBASE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018969#comment-13018969 ]
Jean-Daniel Cryans commented on HBASE-3767:
-------------------------------------------
bq. And if the number of region servers changes, are there repercussions?
Currently, once an HTable is created its ThreadPoolExecutor stays the same
size regardless of changes in the number of region servers. Caching it here has
the same behavior. Where it differs is when an HTable is created later, after
the number of region servers has changed. But running with fewer threads than
the total number of region servers is only less efficient in bulk load
situations where you need to insert into all of them at the same time (which I
believe isn't frequent when uploading, since you usually create the HTables up
front). That's the only repercussion I see, and it's still less bad than the
following:
bq. Thats better than doing getCurrentNrHRS. Maybe 2* number of processors
So the reason we use the number of RS is to be able to insert into all the
region servers at the same time in a bulk upload case. Using the number of CPUs
by itself isn't particularly useful, since uploading isn't CPU-intensive on the
client (it's just threads waiting on region servers). Also, the fact that you
usually have many HTables per JVM kinda defeats the purpose of limiting the
number of executors.
I personally like the fact that we try to learn how many RS there are in order
to tune the TPE; it's just that querying for it every time is rather expensive
and mostly useless. I still believe we should just cache it.
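To make the proposal concrete, here is a minimal sketch of what "cache it once per JVM" could look like. It is not the actual patch: RegionServerCountCache and its methods are hypothetical names, and the IntSupplier stands in for whatever call really asks ZK for the number of region servers.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical helper (not part of HBase): remembers the RS count for the whole JVM. */
final class RegionServerCountCache {
    private static final int UNSET = -1;
    private static final AtomicInteger cached = new AtomicInteger(UNSET);

    private RegionServerCountCache() {}

    /**
     * Returns the cached count, running the expensive lookup only the first time.
     * The supplier is an assumption standing in for the real ZK query.
     */
    static int get(IntSupplier expensiveLookup) {
        int n = cached.get();
        if (n == UNSET) {
            synchronized (RegionServerCountCache.class) {
                n = cached.get();
                if (n == UNSET) {
                    // Never size a pool at zero, even if no RS is reported.
                    n = Math.max(1, expensiveLookup.getAsInt());
                    cached.set(n);
                }
            }
        }
        return n;
    }

    /** Builds a fixed-size pool with one thread per region server known at first lookup. */
    static ThreadPoolExecutor newPool(IntSupplier expensiveLookup) {
        int size = get(expensiveLookup);
        return new ThreadPoolExecutor(size, size, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }
}
```

With something like this, every table created after the first one reuses the cached number, so a later change in the number of region servers is simply not seen, which is the trade-off described above.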
> Cache the number of RS in HTable
> --------------------------------
>
> Key: HBASE-3767
> URL: https://issues.apache.org/jira/browse/HBASE-3767
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.90.2
> Reporter: Jean-Daniel Cryans
> Fix For: 0.90.3
>
>
> When creating a new HTable we have to query ZK to learn about the number of
> region servers in the cluster. That is done for every single one of them. I
> think we should instead do it once per JVM and then reuse that number for all
> the others.