[ 
https://issues.apache.org/jira/browse/TEZ-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Authur Wang updated TEZ-4442:
-----------------------------
    Attachment: java heap2.png
                java heap1.png

> tez unable to control the memory size when UDF occupies 100MB memory 
> ---------------------------------------------------------------------
>
>                 Key: TEZ-4442
>                 URL: https://issues.apache.org/jira/browse/TEZ-4442
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.9.1
>         Environment: CDP7.1.7SP1
> tez 0.9.1
> hive 3.1.3
>  
>            Reporter: Authur Wang
>            Priority: Critical
>         Attachments: app.log, application_1659706606596_0047.log.gz, 
> hiveserver2.out, java heap1.png, java heap2.png, 
> spark-udf-0.0.1-SNAPSHOT.jar, spark-udf-src.zip
>
>
>           We have a UDF that loads about 5 million records into memory, 
> matches them against the user's input, and returns the result. Each input 
> record of the UDF produces exactly one output record.
>           Based on heap-dump analysis, this UDF occupies about 100 MB of 
> memory. It runs stably on Hive on MR, Hive on Spark, and native Spark, and 
> needs only about 4 GB of memory in those cases. With the Tez engine, however, 
> the task fails even after we raise the memory from 4 GB to 8 GB, and it still 
> fails with high probability at 12 GB. Why does the Tez engine need so much 
> more memory than MR and Spark? Is there a good tuning method to control the 
> amount of memory?
>  
>  
> The command is as follows:
> beeline -u 
> 'jdbc:hive2://bg21146.hadoop.com:10000/default;principal=hive/bg21146.hadoop....@bg.com'
>  --hiveconf tez.queue.name=root.000kjb.bdhmgmas_bas -e "
>  
> create temporary function get_card_rank as 
> 'com.unionpay.spark.udf.GenericUDFCupsCardMediaProc' using jar 
> 'hdfs:///user/lib/spark-udf-0.0.1-SNAPSHOT.jar';
>  
> set tez.am.log.level=debug;
> set tez.am.resource.memory.mb=8192;
> set hive.tez.container.size=8192;
> set tez.task.resource.memory.mb=2048;
> set tez.runtime.io.sort.mb=1200;
> set hive.auto.convert.join.noconditionaltask.size=500000000;
> set tez.runtime.unordered.output.buffer.size-mb=800;
> set tez.grouping.min-size=33554432;
> set tez.grouping.max-size=536870912;
> set hive.tez.auto.reducer.parallelism=true;
> set hive.tez.min.partition.factor=0.25;
> set hive.tez.max.partition.factor=2.0;
> set hive.exec.reducers.bytes.per.reducer=268435456;
> set mapreduce.map.memory.mb=4096;
> set ipc.maximum.response.length=1536000000;
>  
>  
> select
>  get_card_rank(ext_pri_acct_no) as ext_card_media_proc_md,
>  count(*)
> from bs_comdb.tmp_bscom_glhis_ct_settle_dtl_bas_swt a
> where a.hp_settle_dt = '20200910'
> group by get_card_rank(ext_pri_acct_no)
> ;
> "
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
