[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 13 Mar 2018 02:40:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. IMPALA-6638: Reduce file handle cache lock contention FileHandleCache::OpenFileHandle() currently holds the lock while opening a file handle. This lengthens the duration holding the lock considerably, causing contention when a lot of file handles are being opened (i.e. when the cache is cold). This changes FileHandleCache::OpenFileHandle() drops the lock while opening the file handle, then reacquires it to add the file handle to the cache. When running a simple select on a table with 46801 Parquet files, this fixes a small performance overhead for the cold cache case compared to having the cache off. All results are averaged over 5 runs: File handle cache off: 7.19s File handle cache cold (no patch): 7.92s File handle cache cold (with patch): 7.20s When running a select that accesses and aggregates 4 columns from the same Parquet table (thus requiring more scan ranges for the same file), the fix reduces contention, particularly for the case with 8 IO threads per disk: 1 IO thread per disk: Without patch: 9.59s With patch: 8.11s 8 IO threads per disk: Without patch: 14.2s With patch: 8.17s The patch has no impact on the performance when the cache is hot. Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Reviewed-on: http://gerrit.cloudera.org:8080/9576 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M be/src/runtime/io/handle-cache.inline.h 1 file changed, 37 insertions(+), 24 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2089/ -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 23:02:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 4: Code-Review+2 Rebase, carry +2 -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 23:02:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 3: Code-Review+2 Thanks, the new comments make it clearer why we're doing what we're doing. -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 22:10:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h File be/src/runtime/io/handle-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@128 PS2, Line 128: This only happens if there is no entry for this file or all the : // existing file handles are in use. > That was true while holding the lock above but is no longer necessarily tru Changed this to insert in the front and updated this comment about why. Also added a comment when iterating above to explain why iterating in order is important. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@130 PS2, Line 130: matter whether we insert this entry before or after or in between the : // existing entries for this file > why is the previous sentence needed in order for this to be okay? This was poorly worded. What I meant is that not much time has passed since we saw that there wasn't any entry available. Any ordering decision is mostly unimportant under that circumstance. I thought about it some more, and we are very slightly better off explicitly specifying the ordering. It is mostly an edge case, but inserting at the front solves it. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@134 PS2, Line 134: FileHandleEntry entry(new_fh, p.lru_list); : typename MapType::iterator new_it = p.cache.emplace(*fname, std::move(entry)); > I would remove the constructor and move the arguments to emplace: emplace(* I appreciate the comment, but I try to avoid style changes in adjacent code. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@137 PS2, Line 137: ++p.size; : if (p.size > p.capacity) EvictHandles(p); > I would prefer to move this before the creation of the new entry, to make i I appreciate the comment, but I try to avoid style changes in adjacent code. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@139 PS2, Line 139: DCHECK(!new_elem->in_use); > This DCHECK could be probably removed. This could go either way. We are relying on the constructor setting in_use to false. I figure a DCHECK doesn't hurt. -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 21:51:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Hello Csaba Ringhofer, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9576 to look at the new patch set (#3). Change subject: IMPALA-6638: Reduce file handle cache lock contention .. IMPALA-6638: Reduce file handle cache lock contention FileHandleCache::OpenFileHandle() currently holds the lock while opening a file handle. This lengthens the duration holding the lock considerably, causing contention when a lot of file handles are being opened (i.e. when the cache is cold). This changes FileHandleCache::OpenFileHandle() drops the lock while opening the file handle, then reacquires it to add the file handle to the cache. When running a simple select on a table with 46801 Parquet files, this fixes a small performance overhead for the cold cache case compared to having the cache off. All results are averaged over 5 runs: File handle cache off: 7.19s File handle cache cold (no patch): 7.92s File handle cache cold (with patch): 7.20s When running a select that accesses and aggregates 4 columns from the same Parquet table (thus requiring more scan ranges for the same file), the fix reduces contention, particularly for the case with 8 IO threads per disk: 1 IO thread per disk: Without patch: 9.59s With patch: 8.11s 8 IO threads per disk: Without patch: 14.2s With patch: 8.17s The patch has no impact on the performance when the cache is hot. Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d --- M be/src/runtime/io/handle-cache.inline.h 1 file changed, 37 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9576/3 -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h File be/src/runtime/io/handle-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@128 PS2, Line 128: This only happens if there is no entry for this file or all the : // existing file handles are in use. That was true while holding the lock above but is no longer necessarily true since we no longer hold the lock -- there might now be an unused entry for this file. So how can we rely on that? http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@130 PS2, Line 130: matter whether we insert this entry before or after or in between the : // existing entries for this file why is the previous sentence needed in order for this to be okay? -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 19:28:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 2: Code-Review+1 (3 comments) Thanks for the explanation! I have left some optional comments, but the code is ok for me as it is. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h File be/src/runtime/io/handle-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@134 PS2, Line 134: FileHandleEntry entry(new_fh, p.lru_list); : typename MapType::iterator new_it = p.cache.emplace(*fname, std::move(entry)); I would remove the constructor and move the arguments to emplace: emplace(*fname, new_fh, p.lru_list) or emplace(*fname, FileHandleEntry(new_fh, p.lru_list)) http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@137 PS2, Line 137: ++p.size; : if (p.size > p.capacity) EvictHandles(p); I would prefer to move this before the creation of the new entry, to make it clear that EvictHandles() can not have any effect on it. This would also make it a little little bit faster, because evicted elements would be removed from a smaller map. http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@139 PS2, Line 139: DCHECK(!new_elem->in_use); This DCHECK could be probably removed. -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 18:56:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 1: (1 comment) > (1 comment) > > This change can increase the number of parallel calls to the > namenode by FileHandleCache, which was limited by the number of > partitions before. I do not know if this can cause problems or not, > but I would prefer to mention it as a possible side effect, or > limit the number of parallel hdfsOpenFile calls somehow. Good point. This code executes in the disk IO threads, and there are a limited number of disk IO threads (1/spinning disk, 8/SSD, etc). This can be larger than the number of partitions for the cache, but it is limited. The reason that I consider it ok to increase the parallelism is that we already use this level of parallelism when the file handle cache is off. When the cache is off, we use DiskIoMgr::GetExclusiveHdfsFileHandle(), which doesn't need to get any lock. This is fixing up the file handle cache so that it can be as fast as when the cache is off. http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h File be/src/runtime/io/handle-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h@129 PS1, Line 129: pair range = : p.cache.equal_range(*fname); : FileHandleEntry entry(new_fh, p.lru_list); : typename MapType::iterator new_it = p.cache.emplace_hint(range.second, : *fname, std::move(entry)); > Does the hint logic make a difference anymore? I think that it would be bet Good point, that is simpler/better. Done. -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 12 Mar 2018 17:24:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Hello Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9576 to look at the new patch set (#2). Change subject: IMPALA-6638: Reduce file handle cache lock contention .. IMPALA-6638: Reduce file handle cache lock contention FileHandleCache::OpenFileHandle() currently holds the lock while opening a file handle. This lengthens the duration holding the lock considerably, causing contention when a lot of file handles are being opened (i.e. when the cache is cold). This changes FileHandleCache::OpenFileHandle() drops the lock while opening the file handle, then reacquires it to add the file handle to the cache. When running a simple select on a table with 46801 Parquet files, this fixes a small performance overhead for the cold cache case compared to having the cache off. All results are averaged over 5 runs: File handle cache off: 7.19s File handle cache cold (no patch): 7.92s File handle cache cold (with patch): 7.20s When running a select that accesses and aggregates 4 columns from the same Parquet table (thus requiring more scan ranges for the same file), the fix reduces contention, particularly for the case with 8 IO threads per disk: 1 IO thread per disk: Without patch: 9.59s With patch: 8.11s 8 IO threads per disk: Without patch: 14.2s With patch: 8.17s The patch has no impact on the performance when the cache is hot. Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d --- M be/src/runtime/io/handle-cache.inline.h 1 file changed, 30 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9576/2 -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9576 ) Change subject: IMPALA-6638: Reduce file handle cache lock contention .. Patch Set 1: (1 comment) This change can increase the number of parallel calls to the namenode by FileHandleCache, which was limited by the number of partitions before. I do not know if this can cause problems or not, but I would prefer to mention it as a possible side effect, or limit the number of parallel hdfsOpenFile calls somehow. http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h File be/src/runtime/io/handle-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h@129 PS1, Line 129: pair range = : p.cache.equal_range(*fname); : FileHandleEntry entry(new_fh, p.lru_list); : typename MapType::iterator new_it = p.cache.emplace_hint(range.second, : *fname, std::move(entry)); Does the hint logic make a difference anymore? I think that it would be better to replace emplace_hint() with emplace() and remove equal_range(). -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Comment-Date: Mon, 12 Mar 2018 16:11:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9576 Change subject: IMPALA-6638: Reduce file handle cache lock contention .. IMPALA-6638: Reduce file handle cache lock contention FileHandleCache::OpenFileHandle() currently holds the lock while opening a file handle. This lengthens the duration holding the lock considerably, causing contention when a lot of file handles are being opened (i.e. when the cache is cold). This changes FileHandleCache::OpenFileHandle() drops the lock while opening the file handle, then reacquires it to add the file handle to the cache. Performance tests confirm that this fixes contention when the file handle cache is cold. It does not change performance when the cache is hot. Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d --- M be/src/runtime/io/handle-cache.inline.h 1 file changed, 28 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9576/1 -- To view, visit http://gerrit.cloudera.org:8080/9576 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d Gerrit-Change-Number: 9576 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell