[
https://issues.apache.org/jira/browse/HIVE-27511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745011#comment-17745011
]
Stamatis Zampetakis commented on HIVE-27511:
--------------------------------------------
Option 2 on the Tez side is exactly what I had in mind.
For option 1 (current patch), I still feel that it should be on the Tez side,
although this doesn't mean that it will go to the application logs. If the
logging part goes somewhere inside {{TezUtils}} then I guess it will still
show up in the HS2 logs. I will elaborate a bit below.
The new method {{DagUtils.createPayloadFromWritable}} that is introduced in
the PR definitely makes sense but code-wise it would be a much better fit
inside {{TezUtils}} along with the other existing methods of the form
{{createUserPayloadFrom}}.
Moreover, I don't think we really care about the top 10, but rather about any
property that is bigger than a threshold, so sorting may be a bit redundant.
To deactivate the insights we could simply pick a special value, e.g., -1 or
any negative one, and thus avoid any potential perf hit from the logging.
In the current PR, printing the insights requires a previous call to
{{TezUtils.createUserPayloadFromConf}} so it feels natural to put the logging
logic inside there. If we don't require sorting to find the top-10 then the
whole thing boils down to a simple if block probably inside
{{TezUtils.populateConfProtoFromEntries}}.
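To illustrate the threshold idea, here is a minimal sketch in plain Java. It is
not the actual {{TezUtils}} code: the class name {{PayloadInsights}}, the method
{{findLargeProperties}}, and the choice to measure only the value in UTF-8 bytes
are all assumptions made for the example. It iterates over
{{Map.Entry<String, String>}} items (a Hadoop {{Configuration}} is iterable in
the same way), reports every property above the threshold without sorting, and
treats a negative threshold as "insights disabled".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Hive or Tez.
public class PayloadInsights {

  /**
   * Returns one message per property whose value is larger than the threshold.
   * A negative threshold disables the check, so no per-entry work is done.
   * Assumption: "size" means the UTF-8 byte length of the value.
   */
  public static List<String> findLargeProperties(
      Iterable<Map.Entry<String, String>> entries, int thresholdBytes) {
    List<String> messages = new ArrayList<>();
    if (thresholdBytes < 0) {
      return messages; // insights deactivated
    }
    for (Map.Entry<String, String> entry : entries) {
      int size = entry.getValue().getBytes(StandardCharsets.UTF_8).length;
      // The "simple if block": no sorting, no top-N bookkeeping.
      if (size > thresholdBytes) {
        messages.add("'" + entry.getKey() + "' size: " + size);
      }
    }
    return messages;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("hive.exec.plan", String.join("", Collections.nCopies(229, "x")));
    conf.put("io.serializations", String.join("", Collections.nCopies(180, "y")));
    conf.put("small.key", "1");

    // With a 200-byte threshold only 'hive.exec.plan' is reported.
    for (String message : findLargeProperties(conf.entrySet(), 200)) {
      System.out.println(message);
    }
  }
}
```

In a real patch the messages would go to the logger rather than to a list, and
the threshold would come from a configuration property whose name is still to
be decided.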
Putting things in Tez would of course require a Tez upgrade so that Hive can
benefit from these. I am OK merging the current PR as an interim solution but I
think we should log some follow-ups to remove the Hive-specific part (property
etc.) after the Tez upgrade.
[~abstractdog] Let me know how you prefer to move forward!
> Print Tez processor payload insights
> ------------------------------------
>
> Key: HIVE-27511
> URL: https://issues.apache.org/jira/browse/HIVE-27511
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
>
> When investigating TezAM OOM issues, a common problem is a huge conf. In this
> case we usually do the following:
> 1. ask for heapdump
> 2. analyze heapdump, and finally find huge UserPayload, which is a binary :)
> extremely hard to read
> 3. deserialize userpayload
> 4. look for configuration properties in payload to blame
> 5. figure out how to reduce the payload
> I think with a handy configuration insight log message in HS2, we would only
> need to do step 5 (as steps 1 to 4 can be complicated).
> With the initial implementation I got the following, e.g. for ptf.q:
> {code}
> 2023-07-18T05:02:16,080 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: Payload size for Map 1: 82967 bytes, # of
> keys in conf: 2653
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 1.
> '/Users/laszlobodor/apache/hive/itests/qtest/target/tmp/localscratchdir/8e476599-63fd-4530-a007-71d92afced89/hive_2023-07-18_05-02-15_122_2130403664877939930-1/laszlobodor/_tez_scratch_dir/2d7f67a1-b7a3-413d-a027-51978f85e862/map.xml'
> size: 2552
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 2. 'hive.conf.hidden.list' size: 526
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 3. 'hadoop.security.sensitive-config-keys'
> size: 496
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 4. 'hive.serdes.using.metastore.for.schema'
> size: 426
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 5. 'fs.s3a.aws.credentials.provider' size: 252
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 6. 'hive.exec.plan' size: 229
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 7. 'io.serializations' size: 180
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 8. 'hive.exec.post.hooks' size: 177
> 2023-07-18T05:02:16,084 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 9. '_hive_tez_tmp_dir' size: 173
> 2023-07-18T05:02:16,085 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 10.
> 'dfs.webhdfs.acl.provider.permission.pattern' size: 154
> ...
> 2023-07-18T05:02:19,051 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: Payload size for Reducer 2: 90231 bytes, # of
> keys in conf: 2658
> 2023-07-18T05:02:19,052 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 1.
> '/Users/laszlobodor/apache/hive/itests/qtest/target/tmp/localscratchdir/8e476599-63fd-4530-a007-71d92afced89/hive_2023-07-18_05-02-18_172_1139677043401490299-1/laszlobodor/_tez_scratch_dir/040836dc-01a2-438b-99ff-148d1f5afd48/reduce.xml'
> size: 9364
> 2023-07-18T05:02:19,052 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 2. 'hive.conf.hidden.list' size: 526
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 3. 'hadoop.security.sensitive-config-keys'
> size: 496
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 4. 'hive.serdes.using.metastore.for.schema'
> size: 426
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 5. 'columns.types' size: 377
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 6. 'columns' size: 367
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 7. 'fs.s3a.aws.credentials.provider' size: 252
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 8. 'hive.exec.plan' size: 229
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 9. 'io.serializations' size: 180
> 2023-07-18T05:02:19,053 INFO [8e476599-63fd-4530-a007-71d92afced89 Listener
> at 0.0.0.0/55258] tez.DagUtils: 10. 'hive.exec.post.hooks' size: 177
> ...
> {code}