Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9525 )
Change subject: IMPALA-6389: Make '\0' delimited text files work ...................................................................... Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser-test.cc File be/src/exec/delimited-text-parser-test.cc: http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser-test.cc@160 PS2, Line 160: TupleDelimitedTextParser nul_delim_parser(NUM_COLS, 0, is_materialized_col, NUL_DELIM); should we also test nul tuple delim with a column delim? if not, why is that case not interesting? http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h File be/src/exec/delimited-text-parser.h: http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@41 PS2, Line 41: thme them http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@55 PS2, Line 55: char field_delim_ = '\0', char collection_item_delim = '^', : char escape_char = '\0'); do these default argument values still have special meanings? can the delims be equal or do they need to be distinct from each other? please update the comment above to indicate anything interesting. does it still make sense to have these default arguments? http://gerrit.cloudera.org:8080/#/c/9525/2/be/src/exec/delimited-text-parser.h@171 PS2, Line 171: bool IsFieldOrCollectionItemDelimiter(char c) { comment for this -- it's surprisingly non-trivial to understand what this is (e.g. what does the field_delim_ == tuple_delim_ case mean? are they allowed to be equal?) http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h File be/src/exec/delimited-text-parser.inline.h: http://gerrit.cloudera.org:8080/#/c/9525/1/be/src/exec/delimited-text-parser.inline.h@163 PS1, Line 163: tuple_delim_ || (tuple_delim_ > Subtly, no. Any delimiter of '\0' will not be included in the xmm_delim_se I don't follow that. When DELIMITED_TUPLES is false, the comment says tuple_delim_ can be set to anything. Also, isn't the point to handle '\0' as a delimiter, so why is that not included in the search field? -- To view, visit http://gerrit.cloudera.org:8080/9525 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8 Gerrit-Change-Number: 9525 Gerrit-PatchSet: 2 Gerrit-Owner: Zach Amsden <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-Comment-Date: Wed, 14 Mar 2018 00:41:41 +0000 Gerrit-HasComments: Yes
