bkietz commented on code in PR #39892:
URL: https://github.com/apache/arrow/pull/39892#discussion_r1476224449
##########
cpp/src/arrow/csv/parser_test.cc:
##########
@@ -376,6 +383,19 @@ TEST(BlockParser, TruncatedData) {
}
}
+TEST(BlockParser, TruncatedDataViews) {
+ // If non-last block is truncated, parsing stops
+ BlockParser parser(ParseOptions::Defaults(), /*num_cols=*/3);
+ AssertParsePartial(parser, Views({"a,b,", "c\n"}), 0);
+ // (XXX should we guarantee this one below?)
Review Comment:
I don't think so; BlockParser::Parse's docstring states that only the last
block is allowed to be truncated. That suggests that both these calls should
raise an error rather than return. IIUC, we expect the chunker to ensure that
the parser need not handle partial blocks. I think instead of unit testing the
parser in an erroneous condition, we should test the reader and ensure that an
error is raised and the parser never enters that condition.
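
For illustration only, here is a toy sketch of the precondition described above: every block except the last must end on a row boundary. This is not Arrow code. `CheckResult` and `CheckOnlyLastBlockTruncated` are hypothetical names (Arrow itself would report errors through `arrow::Status`), and the check assumes a trailing newline marks a row boundary, ignoring quoting.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical outcome type for this sketch; not part of Arrow.
struct CheckResult {
  bool ok;
  std::string message;
};

// Toy precondition check: only the last block may be truncated.
// Simplifying assumption: a non-empty block ends on a row boundary
// exactly when its final character is '\n' (quoted newlines are ignored).
CheckResult CheckOnlyLastBlockTruncated(
    const std::vector<std::string_view>& blocks) {
  for (std::size_t i = 0; i + 1 < blocks.size(); ++i) {
    const std::string_view block = blocks[i];
    if (!block.empty() && block.back() != '\n') {
      return {false, "block " + std::to_string(i) +
                         " is truncated but is not the last block"};
    }
  }
  return {true, ""};
}
```

Under these assumptions, both inputs from the test above (`{"a,b,", "c\n"}` and `{"a,b,c\nd,", "e,f\n"}`) would be rejected before reaching the parser, while `{"a,b,c\n", "d,e"}` would pass because only its last block is truncated.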
##########
cpp/src/arrow/csv/parser_test.cc:
##########
@@ -376,6 +383,19 @@ TEST(BlockParser, TruncatedData) {
}
}
+TEST(BlockParser, TruncatedDataViews) {
+ // If non-last block is truncated, parsing stops
+ BlockParser parser(ParseOptions::Defaults(), /*num_cols=*/3);
+ AssertParsePartial(parser, Views({"a,b,", "c\n"}), 0);
+ // (XXX should we guarantee this one below?)
+ AssertParsePartial(parser, Views({"a,b,c\nd,", "e,f\n"}), 6);
+
+ // More sophisticated: non-last block ends on some newline inside a quoted string
Review Comment:
o_O