n3world commented on a change in pull request #10255:
URL: https://github.com/apache/arrow/pull/10255#discussion_r644463979
##########
File path: cpp/src/arrow/csv/chunker.cc
##########
@@ -234,6 +235,38 @@ class LexingBoundaryFinder : public BoundaryFinder {
return Status::OK();
}
+ Result<uint64_t> FindN(util::string_view partial, util::string_view block,
+ uint64_t count, int64_t* out_pos) override {
+ Lexer<quoting, escaping> lexer(options_);
+ uint64_t found = 0;
+ const char* data = block.data();
+ const char* const data_end = block.data() + block.size();
+
+ const char* line_end;
+ if (partial.size()) {
Review comment:
Yes because the lexer cannot be called with an empty block of data. It
asserts that it is passed data and doesn't handle when `data == data_end`
##########
File path: cpp/src/arrow/csv/chunker_test.cc
##########
@@ -261,5 +291,43 @@ TEST_P(BaseChunkerTest, EscapingNewline) {
}
}
+TEST_P(BaseChunkerTest, ParseSkip) {
+ {
+ auto csv = MakeCSVData({"ab,c,\n", "def,,gh\n", ",ij,kl\n"});
+ ASSERT_NO_FATAL_FAILURE(AssertSkip(csv, 1, 0, 15));
Review comment:
It can be a problem for some tests which might cause a segfault or have
bad behavior if test execution fails after an ASSERT fails, I know this is not
one of those types of tests. Normally ASSERT failures halt test execution for
non halting checks it is standard to use the `EXPECT_` macros
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]