[
https://issues.apache.org/jira/browse/PIG-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245259#comment-15245259
]
liyunzhang_intel commented on PIG-4846:
---------------------------------------
[~xuefuz]: Thanks for your configuration.
spark.yarn.driver.memoryOverhead is a parameter for yarn-cluster mode.
Currently we don't support yarn-cluster mode for Pig on Spark (see PIG-4681),
so I did not add this parameter to pig.properties.
I added the following to conf/pig.properties.
Note that the unit of spark.yarn.executor.memoryOverhead is MB.
{code}
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://bdpe41:8020/spark-history-server
spark.executor.cores=4
spark.executor.memory=6553m
spark.yarn.executor.memoryOverhead=1638
spark.driver.memory=2048m
spark.executor.instances=7
{code}
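As a rough sanity check on these settings, the sketch below adds up the memory they imply. It assumes the usual Spark-on-YARN behaviour in which each executor container is requested as executor memory plus memory overhead; the variable names are illustrative only, and any rounding applied by the YARN scheduler is not modelled.

```python
# Values copied from the conf/pig.properties snippet above.
executor_memory_mb = 6553    # spark.executor.memory=6553m
memory_overhead_mb = 1638    # spark.yarn.executor.memoryOverhead (MB)
executor_instances = 7       # spark.executor.instances
executor_cores = 4           # spark.executor.cores

# Assumption: YARN is asked for heap + overhead per executor container.
container_mb = executor_memory_mb + memory_overhead_mb

# Totals across all executors (driver memory is not included).
total_executor_mb = container_mb * executor_instances
total_cores = executor_cores * executor_instances

# Overhead expressed as a fraction of the executor heap.
overhead_ratio = memory_overhead_mb / executor_memory_mb

print(f"per-executor container: {container_mb} MB")
print(f"all executors: {total_executor_mb} MB, {total_cores} cores")
print(f"overhead / heap: {overhead_ratio:.2f}")
```

With these numbers each executor container comes to 8191 MB, just under 8 GB, and the overhead is about a quarter of the heap.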
The new results in yarn-client mode are:
||Script||Before||Later||
|L_1|71|55|
|L_2|46|45|
|L_3|1746|337|
|L_4|49|48|
|L_5|1754|338|
|L_6|60|48|
|L_7|1002|210|
|L_8|46|46|
|L_9|65|51|
|L_10|67|51|
|L_11|196|65|
|L_12|56|54|
|L_13|1010|212|
|L_14|52|45|
|L_15|1018|210|
|L_16|1022|207|
|L_17|109|54|
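To put a number on the change, the sketch below computes the per-script ratio of the first column to the second from the table above. The figures are copied verbatim; the comment does not state their unit, so only ratios are derived, and the variable names are illustrative.

```python
# (before, later) pairs copied from the table above; the unit is not stated.
results = {
    "L_1": (71, 55),     "L_2": (46, 45),     "L_3": (1746, 337),
    "L_4": (49, 48),     "L_5": (1754, 338),  "L_6": (60, 48),
    "L_7": (1002, 210),  "L_8": (46, 46),     "L_9": (65, 51),
    "L_10": (67, 51),    "L_11": (196, 65),   "L_12": (56, 54),
    "L_13": (1010, 212), "L_14": (52, 45),    "L_15": (1018, 210),
    "L_16": (1022, 207), "L_17": (109, 54),
}

# Ratio before/later per script: values above 1 mean the new run was faster.
speedup = {name: before / later for name, (before, later) in results.items()}

# Script with the largest ratio, and the totals over all 17 scripts.
best = max(speedup, key=speedup.get)
total_before = sum(before for before, _ in results.values())
total_later = sum(later for _, later in results.values())

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
print(f"largest ratio: {best}")
print(f"overall: {total_before} -> {total_later} ({total_before / total_later:.2f}x)")
```

The scripts that were slowest before (L_3, L_5, L_7, L_13, L_15, L_16) improve by roughly a factor of five, while most of the short scripts change only slightly.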
The results show a big performance improvement over the previous run :).
Can you explain more about how to choose these settings?
> Use pigmix to test the performance of pig on spark
> --------------------------------------------------
>
> Key: PIG-4846
> URL: https://issues.apache.org/jira/browse/PIG-4846
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4846.patch, PIG-4846_1.patch
>
>
> We can compare the performance between mr and spark mode by pigmix.
> The introduction of pigmix is
> https://cwiki.apache.org/confluence/display/PIG/PigMix.
> PIG-4846.patch makes pigmix run with a specified exectype.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)