[ 
https://issues.apache.org/jira/browse/HIVE-27511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745011#comment-17745011
 ] 

Stamatis Zampetakis commented on HIVE-27511:
--------------------------------------------

Option 2 on the Tez side is exactly what I had in mind.

For option 1 (current patch), I still feel that the logic should live on the 
Tez side, although this doesn't mean that the output will go to the application 
logs. If the logging part goes somewhere inside {{TezUtils}}, it will still 
show up in the HS2 logs, I guess. I will elaborate a bit below.

The new method {{DagUtils.createPayloadFromWritable}} that is introduced in 
the PR definitely makes sense, but code-wise it would be a much better fit 
inside {{TezUtils}}, along with the other existing methods of the form 
{{createUserPayloadFrom}}.

Moreover, I don't think we really care about the top 10 but rather about any 
property that is bigger than a threshold, so sorting may be a bit redundant. 
To deactivate insights, we could simply pick a special value, e.g., -1 or any 
negative one, and thus avoid any potential perf hit from computing them.

In the current PR, printing the insights requires a previous call to 
{{TezUtils.createUserPayloadFromConf}} so it feels natural to put the logging 
logic inside there. If we don't require sorting to find the top-10 then the 
whole thing boils down to a simple if block probably inside 
{{TezUtils.populateConfProtoFromEntries}}.
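
To make the threshold idea concrete, here is a rough sketch in plain Java. It is 
not the code from the PR nor anything that exists in Tez: the class 
{{PayloadInsights}}, the method {{largeEntries}} and the {{thresholdBytes}} 
parameter are hypothetical names, and measuring a value by its UTF-8 length is 
my assumption. In Tez, the if block would presumably sit inside the loop that 
already walks the entries, so no extra pass would be needed.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Tez or Hive. */
final class PayloadInsights {

  private PayloadInsights() {
  }

  /**
   * Returns one message per entry whose value is larger than thresholdBytes.
   * A negative threshold deactivates the insights, so no entry is inspected.
   */
  static List<String> largeEntries(Iterable<Map.Entry<String, String>> entries,
      int thresholdBytes) {
    List<String> messages = new ArrayList<>();
    if (thresholdBytes < 0) {
      return messages; // insights deactivated, skip the loop entirely
    }
    for (Map.Entry<String, String> entry : entries) {
      String value = entry.getValue();
      if (value == null) {
        continue; // nothing to measure
      }
      // Assumption: the size of a property is the UTF-8 length of its value.
      int size = value.getBytes(StandardCharsets.UTF_8).length;
      if (size > thresholdBytes) {
        // No sorting and no top-N bookkeeping: report as we go.
        messages.add("'" + entry.getKey() + "' size: " + size);
      }
    }
    return messages;
  }
}
```

A caller would log each returned message, e.g. at INFO level. Since there is no 
sorting, the cost is a single comparison per property, and a negative threshold 
skips even that.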

Putting things in Tez would of course require a Tez upgrade so that Hive can 
benefit from these. I am OK merging the current PR as an interim solution, but I 
think we should log some follow-ups to remove the Hive-specific part (property, 
etc.) after the Tez upgrade.

[~abstractdog] Let me know how you prefer to move forward!

> Print Tez processor payload insights
> ------------------------------------
>
>                 Key: HIVE-27511
>                 URL: https://issues.apache.org/jira/browse/HIVE-27511
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>
> When investigating TezAM OOM issues, a common problem is a huge conf. In this 
> case we usually do the following:
> 1. ask for a heap dump
> 2. analyze the heap dump, and finally find a huge UserPayload, which is 
> binary :) and extremely hard to read
> 3. deserialize the UserPayload
> 4. look for configuration properties in the payload to blame
> 5. figure out how to reduce the payload
> I think with a handy configuration insight log message in HS2, we only need 
> to do step 5 (as steps 1 to 4 can be complicated).
> With the initial implementation I got this with e.g. ptf.q:
> {code}
> 2023-07-18T05:02:16,080  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: Payload size for Map 1: 82967 bytes, # of 
> keys in conf: 2653
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 1. 
> '/Users/laszlobodor/apache/hive/itests/qtest/target/tmp/localscratchdir/8e476599-63fd-4530-a007-71d92afced89/hive_2023-07-18_05-02-15_122_2130403664877939930-1/laszlobodor/_tez_scratch_dir/2d7f67a1-b7a3-413d-a027-51978f85e862/map.xml'
>  size: 2552
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 2. 'hive.conf.hidden.list' size: 526
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 3. 'hadoop.security.sensitive-config-keys' 
> size: 496
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 4. 'hive.serdes.using.metastore.for.schema' 
> size: 426
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 5. 'fs.s3a.aws.credentials.provider' size: 252
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 6. 'hive.exec.plan' size: 229
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 7. 'io.serializations' size: 180
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 8. 'hive.exec.post.hooks' size: 177
> 2023-07-18T05:02:16,084  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 9. '_hive_tez_tmp_dir' size: 173
> 2023-07-18T05:02:16,085  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 10. 
> 'dfs.webhdfs.acl.provider.permission.pattern' size: 154
> ...
> 2023-07-18T05:02:19,051  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: Payload size for Reducer 2: 90231 bytes, # of 
> keys in conf: 2658
> 2023-07-18T05:02:19,052  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 1. 
> '/Users/laszlobodor/apache/hive/itests/qtest/target/tmp/localscratchdir/8e476599-63fd-4530-a007-71d92afced89/hive_2023-07-18_05-02-18_172_1139677043401490299-1/laszlobodor/_tez_scratch_dir/040836dc-01a2-438b-99ff-148d1f5afd48/reduce.xml'
>  size: 9364
> 2023-07-18T05:02:19,052  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 2. 'hive.conf.hidden.list' size: 526
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 3. 'hadoop.security.sensitive-config-keys' 
> size: 496
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 4. 'hive.serdes.using.metastore.for.schema' 
> size: 426
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 5. 'columns.types' size: 377
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 6. 'columns' size: 367
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 7. 'fs.s3a.aws.credentials.provider' size: 252
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 8. 'hive.exec.plan' size: 229
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 9. 'io.serializations' size: 180
> 2023-07-18T05:02:19,053  INFO [8e476599-63fd-4530-a007-71d92afced89 Listener 
> at 0.0.0.0/55258] tez.DagUtils: 10. 'hive.exec.post.hooks' size: 177
> ...
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
