Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-1578: fix text scanner to handle "\r\n" delimiters split across blocks
......................................................................

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2803/4/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 629: byte_buffer_ptr_ += 1;
> why is this needed now? was this case broken too (i.e. \r\n at a buffer bou
It happened to work because of another bug I found where the SSE path in DelimitedTextParser::FindFirstInstance() wouldn't report \r delimiters. Since we use 8 MB buffer sizes by default and this is called at the beginning of a buffer, the SSE code would completely consume the buffer so we would report no tuple found and then correctly find the \n in the next buffer. I'm working on a patch for that fix now (still need to file a JIRA too).

--
To view, visit http://gerrit.cloudera.org:8080/2803
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id42b441674bb21517ad2788b99942a4b5dc55420
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
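
For illustration only, a minimal C++ sketch of the boundary case discussed in the comment: a "\r\n" pair whose '\r' is the last byte of one buffer and whose '\n' is the first byte of the next. RowSplitter, Feed, Finish, and rows are hypothetical names, not Impala's DelimitedTextParser or HdfsTextScanner code, and the plain scalar loop does not model the SSE path. It assumes "\n", "\r", and "\r\n" all terminate a row.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not Impala code): splits a byte stream that arrives in
// separate buffers into rows, treating "\n", "\r", and "\r\n" as delimiters.
class RowSplitter {
 public:
  // Consumes one buffer; completed rows are appended to rows().
  void Feed(const std::string& buf) {
    assert(!finished_);  // Feed() must not be called after Finish().
    size_t i = 0;
    // If the previous buffer ended in '\r', a leading '\n' here is the second
    // half of a split "\r\n" and is skipped instead of producing an empty row.
    if (pending_cr_ && !buf.empty()) {
      if (buf[0] == '\n') i = 1;
      pending_cr_ = false;
    }
    for (; i < buf.size(); ++i) {
      const char c = buf[i];
      if (c != '\n' && c != '\r') {
        current_.push_back(c);
        continue;
      }
      // Any delimiter byte ends the current row.
      rows_.push_back(current_);
      current_.clear();
      if (c == '\r') {
        if (i + 1 < buf.size()) {
          if (buf[i + 1] == '\n') ++i;  // "\r\n" fully inside this buffer.
        } else {
          pending_cr_ = true;  // '\r' is the last byte; resolve on next Feed().
        }
      }
    }
  }

  // Emits a trailing row that has no delimiter, if any.
  void Finish() {
    if (!current_.empty()) {
      rows_.push_back(current_);
      current_.clear();
    }
    pending_cr_ = false;
    finished_ = true;
  }

  const std::vector<std::string>& rows() const { return rows_; }

 private:
  std::vector<std::string> rows_;
  std::string current_;
  bool pending_cr_ = false;
  bool finished_ = false;
};
```

Feeding "a,b\r" and then "\nc,d\r\n" to this sketch yields the rows "a,b" and "c,d" with no empty row in between; the pending_cr_ flag is what carries the split delimiter across the buffer boundary.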
