Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 )
Change subject: IMPALA-6137: fix text scanner split delim mem mgmt ...................................................................... Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717 PS2, Line 717: byte_buffer_last_byte_ > given that this doesn't happen when readsize is 0, is it possible to use a The byte_buffer_read_size_ check above prevents using a stale value from a previous byte buffer. There's a related scenario where I can't convince myself that it is or isn't possible: if the last FillByteBuffer() call filled zero bytes but the call before that had a split delimiter at the end of the buffer, we could potentially ignore the split delimiter, even though it was the last thing we read from the range. We could fix that by tracking whether *any* bytes were read - if so, it means that byte_buffer_last_byte_ is valid and is the last byte read. What do you think? Should I try to more explicitly prevent that scenario? I don't think this change makes it more or less likely to hit. -- To view, visit http://gerrit.cloudera.org:8080/8438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c Gerrit-Change-Number: 8438 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: anujphadke <apha...@cloudera.com> Gerrit-Comment-Date: Wed, 01 Nov 2017 21:10:37 +0000 Gerrit-HasComments: Yes