[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed
by ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Reviewed-on: http://gerrit.cloudera.org:8080/8438
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 43 insertions(+), 7 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 21:38:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144
PS5, Line 144: in
> if?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:15:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Tim Armstrong (Code Review)
Hello anujphadke, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8438

to look at the new patch set (#6).

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed
by ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 43 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/6
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1431/


--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:16:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 6: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:15:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/5/be/src/exec/hdfs-text-scanner.h@144
PS5, Line 144: in
if?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 18:10:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Tim Armstrong (Code Review)
Hello anujphadke, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8438

to look at the new patch set (#5).

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed
by ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 43 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/5
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90
PS4, Line 90:   /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is 
true.
> Maybe briefly explain why we have it: the character is needed after the byt
Done


http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144
PS4, Line 144: virtual
> I'm not following why we need a second virtual method. Is the wrapper meant
We don't, that was my mistake. Removed the virtual.



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 17:40:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@90
PS4, Line 90:   /// CheckForSplitDelimiter(). Valid if 'byte_buffer_filled_' is 
true.
Maybe briefly explain why we have it: the character is needed after the byte 
buffer is returned, or whatever.


http://gerrit.cloudera.org:8080/#/c/8438/4/be/src/exec/hdfs-text-scanner.h@144
PS4, Line 144: virtual
I'm not following why we need a second virtual method. Is the wrapper meant to 
always apply to all subclasses? If so, it shouldn't need (or want it) to be 
virtual right?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 03 Nov 2017 16:52:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-02 Thread Tim Armstrong (Code Review)
Hello anujphadke, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8438

to look at the new patch set (#4).

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ptr_ could be freed
by ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 41 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/4
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(2 comments)

The newest patch is my attempt at a minimal fix that addresses the edge case we 
discussed and works around the problem that FillByteBuffer() is overridable.

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.h@143
PS2, Line 143:   virtual Status FillByteBuffer(MemPool* pool, bool* eosr, int 
num_bytes = 0)
> *sigh* I found out that this is virtual because the text scanner plugin mec
I don't think I want to change the extension mechanism as part of this bug fix 
- filed IMPALA-6146.


http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> I couldn't see an obvious code path that would take us there. Maybe some sp
I couldn't come up with a test that would cause this, but I added logic to 
avoid reading byte_buffer_last_byte_ when it wasn't initialised to a valid 
value.



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 02 Nov 2017 20:12:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-02 Thread Tim Armstrong (Code Review)
Hello anujphadke, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8438

to look at the new patch set (#3).

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ could be freed by
ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 41 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/3
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> Oops, didn't read line 712 ;)
I couldn't see an obvious code path that would take us there. Maybe some 
specially-constructed compressed data that expanded to zero bytes? Will think 
some more.



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:27:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> The byte_buffer_read_size_ check above prevents using a stale value from a
Oops, didn't read line 712 ;)

That does seem like a possibility, and seems worth addressing. But is there a 
way we can exercise that?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:20:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> given that this doesn't happen when readsize is 0, is it possible to use a
The byte_buffer_read_size_ check above prevents using a stale value from a 
previous byte buffer.

There's a related scenario where I can't convince myself that it is or isn't 
possible: if the last FillByteBuffer() call filled zero bytes but the call 
before that had a split delimiter at the end of the buffer, we could 
potentially ignore the split delimiter, even though it was the last thing we 
read from the range. We could fix that by tracking whether *any* bytes were 
read - if so, it means that byte_buffer_last_byte_ is valid and is the last 
byte read.

What do you think? Should I try to more explicitly prevent that scenario? I 
don't think this change makes it more or less likely to hit.



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:10:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
given that this doesn't happen when readsize is 0, is it possible to use a 
stale value here?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 20:49:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has removed Taras Bobrovytsky from this change.  ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Removed reviewer Taras Bobrovytsky.
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread anujphadke (Code Review)
anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 17:29:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

Could you have a look, Taras? I know you recently did a little work on memory 
management in the text scanners. Hopefully the bug/fix should be fairly 
clear-cut.


--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 16:58:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG@9
PS2, Line 9: The bug was that the buffer pointed to by byte_buffer_ could be 
freed by
> Do you mean byte_buffer_ptr_?
Yeah I will fix in the next iteration.



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 16:16:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread anujphadke (Code Review)
anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8438/2//COMMIT_MSG@9
PS2, Line 9: The bug was that the buffer pointed to by byte_buffer_ could be 
freed by
Do you mean byte_buffer_ptr_?



--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 01 Nov 2017 08:52:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt

2017-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8438


Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
..

IMPALA-6137: fix text scanner split delim mem mgmt

The bug was that the buffer pointed to by byte_buffer_ could be freed by
ReleaseCompletedResources() before CheckForSplitDelimiter() was called.

The simple fix is to copy out the single byte that is needed each time
the buffer is filled.

Testing:
Ran exhaustive query tests under ASAN with --disable_mem_pools=true.

Before the change test_text_split_delimiters reliably caused an ASAN
failure when run with --disable_mem_pools=true. We should get this
coverage automatically once the I/O mgr switches to using the buffer
pool, which uses ASAN poisoning on freed buffers.

Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
2 files changed, 10 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8438/1
--
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong