Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-1578: fix text scanner to handle "\r\n" delimiters split 
across blocks
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2803/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 613: If so, the tuple after it is considered
        :     // part of the next scan range
> what if this isn't the last buffer of this scan range? i.e. why don't we ha
Good catch. The non-eosr case (a buffer that is not the last one in the 
scan range) needs to be handled differently.


http://gerrit.cloudera.org:8080/#/c/2803/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

Line 441: "/
> get_fs_path()
Done


Line 456:     assert result.data == ['abc', 'de', 'fg', 'hij', 'klm', 'no']
> could we also test cases where the \r\n span a buffer but not a scan range?
Done. The new test initially passed with or without this fix because of 
another bug, which I'll post a patch for soon. For now I'll post this 
updated patch, and you can imagine that it works as expected :)
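
To make the buffer-boundary case concrete, here is a minimal Python sketch 
of the idea. It is not the code in hdfs-text-scanner.cc. The function name 
split_rows, the sample data, and the assumption that "\r", "\n" and "\r\n" 
each end a row are hypothetical and for illustration only. The sketch 
remembers whether the previous byte was "\r", so a "\n" arriving at the 
start of the next buffer is treated as the second half of the same delimiter 
rather than as an extra empty row.

```python
CR, LF = 0x0D, 0x0A


def split_rows(buffers):
    """Yield rows (as bytes) from an iterable of byte buffers.

    Assumption for this sketch: a row ends at "\r", "\n", or "\r\n".
    A "\r" that ends one buffer followed by a "\n" that starts the next
    buffer counts as a single delimiter.
    """
    row = bytearray()
    prev_was_cr = False  # carried across buffer boundaries
    for buf in buffers:
        for b in buf:
            if b == LF and prev_was_cr:
                # Second half of "\r\n": the row was already emitted at "\r".
                prev_was_cr = False
                continue
            prev_was_cr = (b == CR)
            if b == CR or b == LF:
                yield bytes(row)
                row.clear()
            else:
                row.append(b)
    if row:
        # Trailing row with no terminating delimiter.
        yield bytes(row)


# Hypothetical sample data, split into two buffers at every possible offset,
# including the offset that falls between "\r" and "\n".
data = b"abc\r\nde\r\nfg\r\nhij\r\nklm\r\nno\r\n"
results = [
    list(split_rows([data[:cut], data[cut:]]))
    for cut in range(len(data) + 1)
]
```

Every entry in results should be the same list of six rows regardless of 
where the cut falls. A splitter that forgets the pending "\r" at a buffer 
boundary would instead produce an extra empty row for the cuts that land 
inside a "\r\n" pair.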


-- 
To view, visit http://gerrit.cloudera.org:8080/2803
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id42b441674bb21517ad2788b99942a4b5dc55420
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Skye Wanderman-Milne <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
