[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads In order to use HDFS hedged reads, the hdfsPread API must be used instead of the hdfsRead() call. Adds a flag to use hdfsPread: --use_hdfs_pread Testing: * Running existing tests with this flag enabled. * Cluster testing with HDFS hedged reads enabled via the HDFS client config. * Manually tested setting the 'max_chunk_size' to a small value to force multiple iterations of the while loop which would only normally happen on S3. Tested reading lineitem was OK. Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Reviewed-on: http://gerrit.cloudera.org:8080/5635 Reviewed-by: Matthew JacobsTested-by: Impala Public Jenkins --- M be/src/runtime/disk-io-mgr-scan-range.cc 1 file changed, 13 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 3: Code-Review+2 rebase carry +2, needed 0ee6d19 - (asf-gerrit/master) IMPALA-4742: Change "{}".format() to "{0}".format() for Py 2.6 -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/671/ -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Huaisi Xu has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: > > The HDFS-5776 JIRA mentions that it's implemented for the pread() > > code path, but it's not that clear. If you look at the code it's > > very clear that it only kicks in on the pread() code path: > > > > https://github.com/apache/hadoop/search?utf8=%E2%9C%93=isHedgedReadsEnabled > > I may be wrong again (as always), hdfspread is not pread. from > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c#L1426, > you can see hdfspread() invokes read(), and hdfsread() calls read() > as well, and in turn read() calls pread() as you can see in the > link you just quoted. So this new gflag is essentially ignored? > What did I miss here? I missed. it calls a different signature. hdfspread calls read(long position, byte[] buffer, int offset, int length). thanks for pointing it out.. -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Huaisi Xu has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: > The HDFS-5776 JIRA mentions that it's implemented for the pread() > code path, but it's not that clear. If you look at the code it's > very clear that it only kicks in on the pread() code path: > > https://github.com/apache/hadoop/search?utf8=%E2%9C%93=isHedgedReadsEnabled I may be wrong again (as always), hdfspread is not pread. from https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c#L1426, you can see hdfspread() invokes read(), and hdfsread() calls read() as well, and in turn read() calls pread() as you can see in the link you just quoted. So this new gflag is essentially ignored? What did I miss here? -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/669/ -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Dan Hecht has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Huaisi Xu has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5635/2/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS2, Line 408: hdfsPread could you point to the reference that we need hdfsPread in order to make use of hdfs hedged read? from HDFS-5776 looks like it is automatically enabled for Impala if the HDFS client is configured so? -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5635/1/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS1, Line 406: buffer > Don't we need to add *bytes_read to buffer like below? Done -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5635/1/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS1, Line 406: buffer Don't we need to add *bytes_read to buffer like below? -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. Patch Set 1: Did some quick tests locally to make sure HDFS still works; currently running jenkins test w/ this enabled by default. -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/5635 Change subject: IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads .. IMPALA-4740: Add option to use hdfsPread() for HDFS hedged reads In order to use HDFS hedged reads, the hdfsPread API must be used instead of the hdfsRead() call. Adds a flag to use hdfsPread: --use_hdfs_pread Testing: Running existing tests with this flag enabled. Cluster testing with HDFS hedged reads enabled via the HDFS client config. Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 --- M be/src/runtime/disk-io-mgr-scan-range.cc 1 file changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5635/1 -- To view, visit http://gerrit.cloudera.org:8080/5635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iecc8b12aa20cbfe08f4ef6a08a191e49709d9525 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs