[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Reviewed-on: http://gerrit.cloudera.org:8080/8438 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 43 insertions(+), 7 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 21:38:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144 PS5, Line 144: in > if? Done -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 18:15:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Hello anujphadke, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8438 to look at the new patch set (#6). Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 43 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/6 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1431/ -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 18:16:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 6: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 18:15:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144 PS5, Line 144: in if? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 18:10:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Hello anujphadke, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8438 to look at the new patch set (#5). Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 43 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/5 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90 PS4, Line 90: /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is true. > Maybe briefly explain why we have it: the character is needed after the byt Done http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144 PS4, Line 144: virtual > I'm not following why we need a second virtual method. Is the wrapper meant We don't, that was my mistake. Removed the virtual. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 17:40:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90 PS4, Line 90: /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is true. Maybe briefly explain why we have it: the character is needed after the byte buffer is returned, or whatever. http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144 PS4, Line 144: virtual I'm not following why we need a second virtual method. Is the wrapper meant to always apply to all subclasses? If so, it shouldn't need (or want it) to be virtual right? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 03 Nov 2017 16:52:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Hello anujphadke, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8438 to look at the new patch set (#4). Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 41 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/4 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (2 comments) The newest patch is my attempt at a minimal fix that addresses the edge case we discussed and works around the problem that FillByteBuffer() is overridable. http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.h File be/src/exec/hdfs-text-scanner.h: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.h@143 PS2, Line 143: virtual Status FillByteBuffer(MemPool* pool, bool* eosr, int num_bytes = 0) > *sigh* I found out that this is virtual because the text scanner plugin mec I don't think I want to change the extension mechanism as part of this bug fix - filed IMPALA-6146. http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > I couldn't see an obvious code path that would take us there. Maybe some sp I couldn't come up with a test that would cause this, but I added logic to avoid reading byte_buffer_last_byte_ when it wasn't initialised to a valid value. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Thu, 02 Nov 2017 20:12:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Hello anujphadke, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8438 to look at the new patch set (#3). Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 41 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/3 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > Oops, didn't read line 712 ;) I couldn't see an obvious code path that would take us there. Maybe some specially-constructed compressed data that expanded to zero bytes? Will think some more. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 21:27:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > The byte_buffer_read_size_ check above prevents using a stale value from a Oops, didn't read line 712 ;) That does seem like a possibility, and seems worth addressing. But is there a way we can exercise that? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 21:20:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > given that this doesn't happen when readsize is 0, is it possible to use a The byte_buffer_read_size_ check above prevents using a stale value from a previous byte buffer. There's a related scenario where I can't convince myself that it is or isn't possible: if the last FillByteBuffer() call filled zero bytes but the call before that had a split delimiter at the end of the buffer, we could potentially ignore the split delimiter, even though it was the last thing we read from the range. We could fix that by tracking whether *any* bytes were read - if so, it means that byte_buffer_last_byte_ is valid and is the last byte read. What do you think? Should I try to more explicitly prevent that scenario? I don't think this change makes it more or less likely to hit. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 21:10:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ given that this doesn't happen when readsize is 0, is it possible to use a stale value here? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 20:49:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has removed Taras Bobrovytsky from this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Removed reviewer Taras Bobrovytsky. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 17:29:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: Could you have a look, Taras? I know you recently did a little work on memory management in the text scanners. Hopefully the bug/fix should be fairly clear-cut. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 16:58:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG@9 PS2, Line 9: The bug was that the buffer pointed to by byte_buffer_ could be freed by > Do you mean byte_buffer_ptr_? Yeah I will fix in the next iteration. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 16:16:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 ) Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG@9 PS2, Line 9: The bug was that the buffer pointed to by byte_buffer_ could be freed by Do you mean byte_buffer_ptr_? -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 01 Nov 2017 08:52:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8438 Change subject: IMPALA-6137: fix text scanner split delim mem mgmt .. IMPALA-6137: fix text scanner split delim mem mgmt The bug was that the buffer pointed to by byte_buffer_ could be freed by ReleaseCompletedResources() before CheckForSplitDelimiter() was called. The simple fix is to copy out the single byte that is needed each time the buffer is filled. Testing: Ran exhaustive query tests under ASAN with --disable_mem_pools=true. Before the change test_text_split_delimiters reliably caused an ASAN failure when run with --disable_mem_pools=true. We should get this coverage automatically once the I/O mgr switches to using the buffer pool, which uses ASAN poisoning on freed buffers. Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h 2 files changed, 10 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/1 -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong