[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Reviewed-on: http://gerrit.cloudera.org:8080/9469 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M testdata/bin/create-load-data.sh 1 file changed, 14 insertions(+), 8 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 5 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 22:54:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2076/ -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 19:16:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. Patch Set 4: Code-Review+2 Thanks for continuing to try and fix this persistent issue -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 04:54:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 )
Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated
..
Patch Set 4:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/9469/2/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:
http://gerrit.cloudera.org:8080/#/c/9469/2/testdata/bin/create-load-data.sh@464
PS2, Line 464:
> I just recalled that there shouldn't be a space because those are 2 argumen
Interesting, I wasn't aware of this. It looks like this depends on the leading whitespace of the following line. Alternatively you could use a heredoc, but the current solution works for me.
--
To view, visit http://gerrit.cloudera.org:8080/9469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915
Gerrit-Change-Number: 9469
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Thu, 08 Mar 2018 23:00:02 +
Gerrit-HasComments: Yes
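The quoted reply above is truncated in the archive, so the exact line under discussion is not visible. As a rough illustration of the behaviour Lars describes, and assuming the comment concerns a backslash line continuation between two quoted arguments, the sketch below shows how the indentation of the continuation line decides whether bash sees one word or two. The helper `count_args` and the message strings are made up for the example; they are not taken from create-load-data.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Backslash-newline is removed by the shell. The indentation on the next
# line is what then separates the two quoted strings into two arguments.
two=$(count_args "There are under-replicated blocks."\
  "Restarting HDFS.")

# With no indentation on the continuation line, the two quoted strings
# end up adjacent and are joined into a single argument.
one=$(count_args "There are under-replicated blocks."\
"Restarting HDFS.")

echo "indented continuation:   $two argument(s)"
echo "unindented continuation: $one argument(s)"

# The heredoc alternative mentioned in the review: line breaks are kept
# literally, so no continuation characters are needed.
msg=$(cat <<EOF
There are under-replicated blocks.
Restarting HDFS to try to resolve the issue.
EOF
)
echo "$msg"
```

A heredoc avoids the whitespace subtlety entirely, at the cost of a few extra lines in the script.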
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Tianyi Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 --- M testdata/bin/create-load-data.sh 1 file changed, 14 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/9469/4 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 )
Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated
..
Patch Set 3:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/9469/3/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:
http://gerrit.cloudera.org:8080/#/c/9469/3/testdata/bin/create-load-data.sh@462
PS3, Line 462: if [[ "$RESTART_COUNT" -eq "$MAX_RETRIES" ]] ; then
We only enter the loop body when RESTART_COUNT < MAX_RETRIES, so this condition can never be satisfied.
http://gerrit.cloudera.org:8080/#/c/9469/3/testdata/bin/create-load-data.sh@469
PS3, Line 469: ${IMPALA_HOME}/testdata/bin/run-mini-dfs.sh
Let's echo a message that there are underreplicated blocks and that we will restart HDFS to try to resolve that issue.
--
To view, visit http://gerrit.cloudera.org:8080/9469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915
Gerrit-Change-Number: 9469
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Tue, 06 Mar 2018 21:41:58 +
Gerrit-HasComments: Yes
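To make the first comment above concrete, here is a small self-contained sketch. It assumes a C-style for loop bounded by `RESTART_COUNT < MAX_RETRIES`; the actual loop in patch set 3 is not shown in this thread, so the loop shape and both function names are illustrative only. It counts how often an in-loop equality check fires and shows where the counter does reach `MAX_RETRIES`.

```shell
#!/usr/bin/env bash
MAX_RETRIES=3

# Counts how many times the in-loop check "RESTART_COUNT == MAX_RETRIES" fires.
# The loop guard only admits values 0 .. MAX_RETRIES-1, so the branch is dead.
count_in_loop_hits() {
  local hits=0
  local RESTART_COUNT
  for ((RESTART_COUNT = 0; RESTART_COUNT < MAX_RETRIES; RESTART_COUNT++)); do
    if [[ "$RESTART_COUNT" -eq "$MAX_RETRIES" ]]; then
      hits=$((hits + 1))
    fi
  done
  echo "$hits"
}

# Prints the counter value once the loop has run to completion. Only here,
# after the loop, does RESTART_COUNT equal MAX_RETRIES.
counter_after_loop() {
  local RESTART_COUNT
  for ((RESTART_COUNT = 0; RESTART_COUNT < MAX_RETRIES; RESTART_COUNT++)); do
    :  # placeholder for "check replication, restart HDFS on failure"
  done
  echo "$RESTART_COUNT"
}

echo "in-loop hits: $(count_in_loop_hits)"
echo "counter after loop: $(counter_after_loop)"
```

Under these assumptions, an "out of retries" check belongs after the loop (or the loop body must be able to see the final iteration in some other way), which is consistent with the reviewer's observation.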
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 )
Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated
..
Patch Set 3: Code-Review+1
(1 comment)
http://gerrit.cloudera.org:8080/#/c/9469/2/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:
http://gerrit.cloudera.org:8080/#/c/9469/2/testdata/bin/create-load-data.sh@460
PS2, Line 460: return
> I changed it to a for loop and renamed FAIL_COUNT into RESTART_COUNT. I did
You could also call fsck once before entering the loop. I don't feel strongly about it though.
--
To view, visit http://gerrit.cloudera.org:8080/9469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915
Gerrit-Change-Number: 9469
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Mon, 05 Mar 2018 21:37:52 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Tianyi Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 --- M testdata/bin/create-load-data.sh 1 file changed, 12 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/9469/3 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Tianyi Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 --- M testdata/bin/create-load-data.sh 1 file changed, 11 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/9469/2 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 )
Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated
..
Patch Set 1:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/9469/1/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:
http://gerrit.cloudera.org:8080/#/c/9469/1/testdata/bin/create-load-data.sh@454
PS1, Line 454: true
I think it would be easier to read if we kept the loop condition here, e.g. while [[ $FAIL_COUNT -lt $MAX_RETRIES ]]; do ... done and then after the loop check if [[ $FAIL_COUNT -eq $MAX_RETRIES ]]; then exit 1; fi
http://gerrit.cloudera.org:8080/#/c/9469/1/testdata/bin/create-load-data.sh@459
PS1, Line 459: r
Is this the return on success?
http://gerrit.cloudera.org:8080/#/c/9469/1/testdata/bin/create-load-data.sh@462
PS1, Line 462: 6
Can you move this into a variable MAX_RETRIES and re-use that inside the error message?
--
To view, visit http://gerrit.cloudera.org:8080/9469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915
Gerrit-Change-Number: 9469
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Lars Volker
Gerrit-Comment-Date: Fri, 02 Mar 2018 23:04:20 +
Gerrit-HasComments: Yes
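The loop shape suggested in the first comment above could look roughly like the sketch below. It is not the code that was merged: `blocks_fully_replicated` and `restart_hdfs` are hypothetical stand-ins (the real script checks HDFS and restarts the mini cluster), and the simulated check simply succeeds on its third call so the example can run anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real replication check:
# succeeds once it has been called ATTEMPTS_NEEDED times.
ATTEMPTS_NEEDED=3
CALLS=0
blocks_fully_replicated() {
  CALLS=$((CALLS + 1))
  [[ $CALLS -ge $ATTEMPTS_NEEDED ]]
}

# Hypothetical stand-in for restarting HDFS.
restart_hdfs() {
  echo "Under-replicated blocks found, restarting HDFS (simulated)"
}

# Keeps the retry bound in the loop condition and checks for exhaustion
# after the loop, as suggested in the review comment.
wait_for_replication() {
  local MAX_RETRIES=$1
  local FAIL_COUNT=0
  while [[ $FAIL_COUNT -lt $MAX_RETRIES ]]; do
    if blocks_fully_replicated; then
      break
    fi
    FAIL_COUNT=$((FAIL_COUNT + 1))
    restart_hdfs
  done
  if [[ $FAIL_COUNT -eq $MAX_RETRIES ]]; then
    echo "Blocks still under-replicated after $MAX_RETRIES restarts"
    return 1
  fi
  return 0
}

# Example call with the retry limit passed in rather than hard-coded.
wait_for_replication 6 && echo "All blocks replicated"
```

Passing the limit as a parameter also covers the third comment: the same value drives both the loop bound and the error message.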
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 --- M testdata/bin/create-load-data.sh 1 file changed, 9 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/9469/1 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang