[
https://issues.apache.org/jira/browse/HBASE-28068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769987#comment-17769987
]
Hudson commented on HBASE-28068:
--------------------------------
Results for branch branch-3
[build #56 on
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/56/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/56/General_20Nightly_20Build_20Report/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/56/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/56/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Add hbase.normalizer.merge.merge_request_max_number_of_regions property to
> limit max number of regions in a merge request for merge normalization
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-28068
> URL: https://issues.apache.org/jira/browse/HBASE-28068
> Project: HBase
> Issue Type: Improvement
> Components: Normalizer
> Affects Versions: 2.4.0, 2.5.0, 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
> Reporter: Ravi Kishore Valeti
> Assignee: Rahul Kumar
> Priority: Minor
> Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1, 4.0.0-alpha-1
>
>
> In our production environment, while investigating an issue, we observed that
> the Normalizer had scheduled a single merge procedure on an RS, passing it
> 27K+ empty regions of a table to merge (a failed copy table job had left
> these 27K+ empty regions behind).
> The procedure got stuck, and the procedure framework eventually bailed out
> after ~40 mins. This recurred on each normalizer run until we deleted the
> table manually.
> Logs
> Normalizer triggers a merge procedure
> normalizer.RegionNormalizerWorker - NormalizationTarget[regionInfo=\{ENCODED
> => 6e8606335a62f6bafceb017dc7edfdf5, NAME => 'TEST.TEST_TABLE,XXXX.',
> STARTKEY => 'XXXX', ENDKEY => 'YYYY'},{*}regionSizeMb=0{*}],
> NormalizationTarget[regionInfo=\{ENCODED => 79607df308d7618e632abe8a12c1bf6b,
> NAME => 'TEST.TEST_TABLE,XXXX', STARTKEY => 'XXYY', ENDKEY =>
> 'YYZZ'},{*}regionSizeMb=0]{*}]] resulting in *pid 21968356*
> Procedure immediately gets stuck
> procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run
> time 12.4850 sec
> Finally fails after ~40 mins
> procedure2.ProcedureExecutor - Worker *stuck* PEWorker-56(pid=21968356), run
> time *40 mins, 58.055 sec*
> Bails out with RuntimeException
> procedure2.ProcedureExecutor - force=false
> java.lang.UnsupportedOperationException: pid=21968356,
> state=FAILED:MERGE_TABLE_REGIONS_UPDATE_META, locked=true,
> exception=java.lang.{*}RuntimeException via CODE-BUG: Uncaught runtime
> exception{*}: pid=21968356, state=RUNNABLE:MERGE_TABLE_REGIONS_UPDATE_META,
> locked=true; MergeTableRegionsProcedure table=TEST.TEST_TABLEXXXX,
> {*}regions={*}{*}[269a1b168af497cce9ba6d3d581568f2{*}
> .
> .
> .
> .
> *27K+ regions printed here]*
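
For illustration only, here is a minimal Java sketch of the idea behind the proposed hbase.normalizer.merge.merge_request_max_number_of_regions property: rather than handing all merge candidates to one procedure, split them into contiguous requests of bounded size. This is not the HBase implementation. The class name, the method name, and the limit of 3 used in main are hypothetical, and regions are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeBatchSketch {

    /**
     * Hypothetical helper: splits an ordered list of merge candidates into
     * contiguous requests holding at most maxRegions entries each.
     */
    public static <T> List<List<T>> splitIntoMergeRequests(List<T> candidates, int maxRegions) {
        if (maxRegions < 2) {
            throw new IllegalArgumentException("a merge request needs at least 2 regions");
        }
        List<List<T>> requests = new ArrayList<>();
        for (int start = 0; start < candidates.size(); start += maxRegions) {
            int end = Math.min(start + maxRegions, candidates.size());
            // A single leftover region cannot be merged on its own, so skip it.
            if (end - start >= 2) {
                requests.add(new ArrayList<>(candidates.subList(start, end)));
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            regions.add("region-" + i);
        }
        // Assumed limit of 3 regions per request, chosen only for this example.
        List<List<String>> requests = splitIntoMergeRequests(regions, 3);
        for (List<String> request : requests) {
            System.out.println("merge request: " + request);
        }
    }
}
```

Taking contiguous slices keeps each request made of neighbouring entries from the input list, which matters if the candidates are ordered by key range. A real normalizer would presumably pick up the skipped leftover region on a later run.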