[jira] [Commented] (HBASE-5401) PerformanceEvaluation generates 10x the number of expected mappers

Yi Liang (JIRA) Tue, 20 Dec 2016 17:27:06 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765781#comment-15765781
 ]


Yi Liang commented on HBASE-5401:
---------------------------------

I have used this command and also encountered this issue. For example, when I 
run hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=m randomWrite n:

If we use --nomapred, this creates n threads (clients), and each thread writes 
m/n rows into HBase.
If we use the default MapReduce mode, this creates 10*n mappers, and each 
mapper puts m/(n*10) rows into HBase.
   I think the constant {code}static int TASKS_PER_CLIENT = 10{code} here is 
unnecessary:
   1. If users want more mappers, they can just increase the number of clients. 
With the *10 multiplier in place, users can only get 10, 20, 30... mappers for 
different numbers of clients, which is not flexible.
   2. TASKS_PER_CLIENT = 10 is hardcoded and invisible to users. Sometimes a 
user may want just 5 mappers for a job, but the current code will create 50 
mappers.
   3. When <nclients> = 5, it means 5 threads with --nomapred but 50 mappers 
with MapReduce, which is a little inconsistent. PS: I do not mean that a mapper 
is the same as a thread, but it is better to keep the two counts the same.
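
To make the arithmetic above concrete, here is a minimal sketch. It is not 
the PerformanceEvaluation source: the class MapperCountSketch and its methods 
are hypothetical names, the row count is an invented value, and the formulas 
simply follow the description in this comment (mappers = clients * 
TASKS_PER_CLIENT, rows per mapper = total rows / mappers).

```java
// Sketch only: models the arithmetic described in this comment.
// It does not call or reproduce the real PerformanceEvaluation code.
final class MapperCountSketch {
    // Mirrors the hardcoded constant quoted in the comment.
    static final int TASKS_PER_CLIENT = 10;

    // Number of map tasks for a given client count and per-client multiplier.
    static int mapperCount(int clients, int tasksPerClient) {
        return clients * tasksPerClient;
    }

    // Rows handled by each map task, assuming the total is split evenly.
    static long rowsPerMapper(long totalRows, int clients, int tasksPerClient) {
        return totalRows / mapperCount(clients, tasksPerClient);
    }

    public static void main(String[] args) {
        int clients = 5;              // the <nclients> argument
        long totalRows = 1_000_000L;  // hypothetical value for --rows

        // Current behaviour as described: 10 tasks per client.
        System.out.println("mappers (x10): "
            + mapperCount(clients, TASKS_PER_CLIENT));
        System.out.println("rows per mapper (x10): "
            + rowsPerMapper(totalRows, clients, TASKS_PER_CLIENT));

        // Proposed behaviour: one task per client, matching --nomapred threads.
        System.out.println("mappers (x1): " + mapperCount(clients, 1));
        System.out.println("rows per mapper (x1): "
            + rowsPerMapper(totalRows, clients, 1));
    }
}
```

With a multiplier of 1, the mapper count equals the client count, which is the 
same number of workers that --nomapred starts as threads.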

What do you guys think?

> PerformanceEvaluation generates 10x the number of expected mappers
> ------------------------------------------------------------------
>
>                 Key: HBASE-5401
>                 URL: https://issues.apache.org/jira/browse/HBASE-5401
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.0.0
>            Reporter: Oliver Meyn
>             Fix For: 2.0.0
>
>         Attachments: HBASE-5401-V1.patch
>
>
> With a command line like 'hbase org.apache.hadoop.hbase.PerformanceEvaluation 
> randomWrite 10' there are 100 mappers spawned, rather than the expected 10.  
> The culprit appears to be the outer loop in writeInputFile which sets up 10 
> splits for every "asked-for client".  I think the fix is just to remove that 
> outer loop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
