liyunzhang_intel created PIG-4937:
-------------------------------------

             Summary: Pigmix hangs when generating data after rows is set to 625000000 in test/perf/pigmix/conf/config.sh
                 Key: PIG-4937
                 URL: https://issues.apache.org/jira/browse/PIG-4937
             Project: Pig
          Issue Type: Bug
            Reporter: liyunzhang_intel


Using the default settings in test/perf/pigmix/conf/config.sh, I generate data with:
"ant -v -Dharness.hadoop.home=$HADOOP_HOME -Dhadoopversion=23 pigmix-deploy >ant.pigmix.deploy"
The job hangs, and the log stops at:
{code}
     [exec] Generating mapping file for column d:1:100000:z:5 into hdfs://bdpe41:8020/user/root/tmp/tmp-1056793210/tmp-786100428
     [exec] processed 99%.
     [exec] Generating input files into hdfs://bdpe41:8020/user/root/tmp/tmp-1056793210/tmp595036324
     [exec] Submit hadoop job...
     [exec] 16/06/25 23:06:32 INFO client.RMProxy: Connecting to ResourceManager at bdpe41/10.239.47.137:8032
     [exec] 16/06/25 23:06:32 INFO client.RMProxy: Connecting to ResourceManager at bdpe41/10.239.47.137:8032
     [exec] 16/06/25 23:06:32 INFO mapred.FileInputFormat: Total input paths to process : 90
     [exec] 16/06/25 23:06:32 INFO mapreduce.JobSubmitter: number of splits:90
     [exec] 16/06/25 23:06:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1466776148247_0034
     [exec] 16/06/25 23:06:33 INFO impl.YarnClientImpl: Submitted application application_1466776148247_0034
     [exec] 16/06/25 23:06:33 INFO mapreduce.Job: The url to track the job: http://bdpe41:8088/proxy/application_1466776148247_0034/
     [exec] 16/06/25 23:06:33 INFO mapreduce.Job: Running job: job_1466776148247_0034
     [exec] 16/06/25 23:06:38 INFO mapreduce.Job: Job job_1466776148247_0034 running in uber mode : false
     [exec] 16/06/25 23:06:38 INFO mapreduce.Job:  map 0% reduce 0%
     [exec] 16/06/25 23:06:53 INFO mapreduce.Job:  map 2% reduce 0%
     [exec] 16/06/25 23:06:59 INFO mapreduce.Job:  map 26% reduce 0%
     [exec] 16/06/25 23:07:00 INFO mapreduce.Job:  map 61% reduce 0%
     [exec] 16/06/25 23:07:02 INFO mapreduce.Job:  map 62% reduce 0%
     [exec] 16/06/25 23:07:03 INFO mapreduce.Job:  map 64% reduce 0%
     [exec] 16/06/25 23:07:04 INFO mapreduce.Job:  map 79% reduce 0%
     [exec] 16/06/25 23:07:05 INFO mapreduce.Job:  map 86% reduce 0%
     [exec] 16/06/25 23:07:06 INFO mapreduce.Job:  map 92% reduce 0%
{code}


When I set rows to 625000 in test/perf/pigmix/conf/config.sh, the test data is
generated successfully. So is the problem caused by limited resources (disk
size or something else)? My environment is a 3-node cluster; each node has one
disk of about 830G.
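
A rough sizing check may help narrow down the disk question. The sketch below only does arithmetic on the numbers in this report; avg_row_bytes and replication are hypothetical placeholders (not taken from PigMix or from this cluster) and should be replaced with values measured from the successful 625000-row run and the actual HDFS configuration.

```shell
#!/bin/sh
# Back-of-envelope estimate: could the generated data outgrow the cluster disks?
# ASSUMPTIONS (not from the report): avg_row_bytes and replication are
# placeholders; measure the real row size from the 625000-row run.
rows_small=625000          # row count that generated data successfully
rows_large=625000000       # row count at which generation hangs
avg_row_bytes=1000         # hypothetical average bytes per generated row
replication=3              # hypothetical; check dfs.replication on the cluster
nodes=3                    # from the report
disk_gb_per_node=830       # from the report (approximate)

# How much larger the failing run is than the successful one.
scale=$((rows_large / rows_small))
# Estimated size of one copy of the data, in GB (decimal).
raw_gb=$((rows_large * avg_row_bytes / 1000000000))
# Estimated on-disk footprint once HDFS replication is applied.
stored_gb=$((raw_gb * replication))
# Total raw disk across the cluster, ignoring OS and temp-space overhead.
capacity_gb=$((nodes * disk_gb_per_node))

echo "scale factor: ${scale}x"
echo "estimated raw size: ${raw_gb} GB, replicated: ${stored_gb} GB"
echo "cluster disk capacity: ${capacity_gb} GB"
```

If the replicated estimate comes close to the cluster capacity once real values are plugged in, disk space is a plausible suspect; intermediate data under the tmp directories shown in the log would add to the footprint as well.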




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
