Hi all:
  I'm currently using PigMix to test the performance of Pig on Spark (PIG-4937<https://issues.apache.org/jira/browse/PIG-4937>). The test data is 1 TB. After generating all the test data, I ran the first round of tests in MR mode.
The cluster has 8 nodes (each node has 40 cores and 60 GB of memory; 28 cores and 56 GB are assigned to the NodeManager on each node). The total resources for the cluster are 224 cores and 448 GB of memory.
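As a quick sanity check, the per-node and cluster-wide numbers work out as follows (a small Python sketch; the 56 GB figure is the same 57344 MB used in yarn-site.xml below):

# Sanity check of the cluster sizing described above.
nodes = 8
vcores_per_node = 28           # cores assigned to the NodeManager on each node
mem_per_node_mb = 56 * 1024    # 56 GB = 57344 MB (yarn.nodemanager.resource.memory-mb)

print(nodes * vcores_per_node)    # 224 cores in total
print(nodes * mem_per_node_mb)    # 458752 MB = 448 GB in total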

The relevant snippet of yarn-site.xml:
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>57344</value>
    <description>the amount of memory on the NodeManager in MB</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>28</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>57344</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
  </property>

The relevant snippet of mapred-site.xml:
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>820</value>
  </property>
  <property>
    <name>mapred.task.timeout</name>
    <value>1200000</value>
  </property>
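With these settings, each NodeManager should be able to run roughly the following number of containers at once (a rough sketch assuming the scheduler allocates purely by memory and vcores):

# Rough estimate of concurrent containers per node from the settings above.
nm_mem_mb = 57344      # yarn.nodemanager.resource.memory-mb
nm_vcores = 28         # yarn.nodemanager.resource.cpu-vcores
map_mem_mb = 2048      # mapreduce.map.memory.mb
reduce_mem_mb = 4096   # mapreduce.reduce.memory.mb

maps_per_node = min(nm_mem_mb // map_mem_mb, nm_vcores)        # 28 concurrent map containers
reduces_per_node = min(nm_mem_mb // reduce_mem_mb, nm_vcores)  # 14 concurrent reduce containers
print(maps_per_node, reduces_per_node)

The -Xmx values are kept at about 80% of the container sizes (1638/2048 and 3276/4096), which leaves some headroom for non-heap memory.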

The relevant snippet of hdfs-site.xml:
  <property>
    <name>dfs.blocksize</name>
    <value>1124217344</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.socket.timeout</name>
    <value>1200000</value>
  </property>
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>1200000</value>
  </property>

Below are the results of the last PigMix run in MR mode (L9, L10, L13, L14 and L17 failed). They show that the average time spent on one script is nearly 6 hours. I don't know whether running L1~L17 really needs that much time. Can anyone with PigMix experience share their configuration and expected results with me?



Script    MR (sec)
L1        21544
L2        20482
L3        21629
L4        20905
L5        20738
L6        24131
L7        21983
L8        24549
L9        6585 (Fail)
L10       22286 (Fail)
L11       21849
L12       21266
L13       11099 (Fail)
L14       43 (Fail)
L15       23808
L16       42889
L17       10 (Fail)

Kelly Zhang/Zhang,Liyun
Best Regards
