[
https://issues.apache.org/jira/browse/TEZ-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875335#comment-17875335
]
Chenyu Zheng commented on TEZ-4577:
-----------------------------------
[~yigress]
If the maxItems of a new span is 1, the kvmeta of the new span will be very
small. PipelinedSorter::sort will then be triggered very frequently, which
makes the job slow. Am I right? If so, I think it needs to be fixed. Do you
have any plans to fix it?
In addition, I am curious: since the first span size is 16*1024*1024, why does
maxItems become 1? Can you add some logs to your problem application to print
the relevant calls to PipelinedSorter::sort?
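
To make the concern concrete, here is a minimal sketch of the arithmetic. It is
not the Tez code. It assumes, as the sample logs suggest, that a new span takes
its maxItems from the previous span's length, and that each record needs 16
bytes of metadata (the logs show reserved.metasize=16 for a one-item span). The
class SpanSizingSketch and its methods are hypothetical names used only for
illustration, and sizing from the remaining buffer is just one possible
alternative, not a proposed patch.

```java
public class SpanSizingSketch {
    // Assumed metadata bytes per record; taken from reserved.metasize=16 in the logs.
    static final int META_SIZE = 16;

    /** Number of sort calls if every span holds at most maxItems records. */
    static long sortCalls(long records, int maxItems) {
        return (records + maxItems - 1) / maxItems; // ceiling division
    }

    /** Assumed current behaviour: the next span inherits the previous span's length. */
    static int nextMaxItemsInherited(int previousLength) {
        return Math.max(1, previousLength);
    }

    /** Hypothetical alternative: size the next span from the remaining buffer. */
    static int nextMaxItemsFromRemaining(long remainingBytes, int perItem) {
        long fit = remainingBytes / (perItem + META_SIZE);
        return (int) Math.max(1L, Math.min((long) Integer.MAX_VALUE, fit));
    }

    public static void main(String[] args) {
        long records = 5307007L;       // record counter from the sample logs
        long remaining = 268396925L;   // reserved.remaining() from the sample logs
        int perItem = 139;             // perItem from the sample logs

        int inherited = nextMaxItemsInherited(1);
        int fromRemaining = nextMaxItemsFromRemaining(remaining, perItem);

        System.out.println("inherited maxItems      = " + inherited);
        System.out.println("sort calls (inherited)  = " + sortCalls(records, inherited));
        System.out.println("maxItems from remaining = " + fromRemaining);
        System.out.println("sort calls (remaining)  = " + sortCalls(records, fromRemaining));
    }
}
```

Under these assumptions, a span length of 1 is a fixed point: every later span
also holds one record, so the number of sort calls equals the number of
records, even though the logs show about 268 MB still free in the buffer.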
> SortSpan could be created real small, resulting in eventual job failure
> -----------------------------------------------------------------------
>
> Key: TEZ-4577
> URL: https://issues.apache.org/jira/browse/TEZ-4577
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.10.4
> Reporter: Yi Zhang
> Priority: Major
>
> We ran into an overflow issue as in TEZ-4542. With TEZ-4542 applied, the job
> then ran into an issue of very small sort spans (one record per span in this
> case), and it eventually failed due to a timeout.
> From the sample logs, it looks like once the constructor
>
> SortSpan(ByteBuffer source, int maxItems, int perItem, RawComparator
> comparator)
>
> gets into a situation of maxItems=1, it persists with maxItems=1.
>
> (As a side issue, the logging in this situation becomes huge.)
>
> sample logs:
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: Span260.length = 1, perItem = 139
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: reserved.remaining()=268396925, reserved.metasize=16
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: New Span261.length = 1, perItem = 139, counter:5307003
> 2024-08-19 19:02:28,157 [INFO] [Sorter {scope_302 -> scope_308} #1|#1]
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=260,
> length=1, time=0
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: Span261.length = 1, perItem = 128
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: reserved.remaining()=268396781, reserved.metasize=16
> 2024-08-19 19:02:28,157 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: New Span262.length = 1, perItem = 128, counter:5307004
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #0|#0]
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=261,
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: Span262.length = 1, perItem = 145
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: reserved.remaining()=268396620, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: New Span263.length = 1, perItem = 145, counter:5307005
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #1|#1]
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=262,
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: Span263.length = 1, perItem = 139
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: reserved.remaining()=268396465, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: New Span264.length = 1, perItem = 139, counter:5307006
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #0|#0]
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=263,
> length=1, time=0
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: Span264.length = 1, perItem = 129
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: reserved.remaining()=268396320, reserved.metasize=16
> 2024-08-19 19:02:28,158 [INFO] [TezChild] |impl.PipelinedSorter|: scope-302
> -> scope-308: New Span265.length = 1, perItem = 129, counter:5307007
> 2024-08-19 19:02:28,158 [INFO] [Sorter {scope_302 -> scope_308} #1|#1]
> |impl.PipelinedSorter|: scope-302 -> scope-308: done sorting span=264,
> length=1, time=0
>