Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18536 )
Change subject: IMPALA-10465: Use IGNORE variant of Kudu write operations ...................................................................... Patch Set 5: (8 comments) Thank you Alexey for your feedback! I'm replying to some that we might want to discuss more. http://gerrit.cloudera.org:8080/#/c/18536/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18536/5//COMMIT_MSG@22 PS5, Line 22: 9 iterations > Just curious: why such number? Did the 10nth iteration fail for some reaso No particular reason. Just pick an odd number out of habit to get exact median number. http://gerrit.cloudera.org:8080/#/c/18536/5//COMMIT_MSG@34 PS5, Line 34: Additionally, this patch adds 'workload_iterations' option in : single_node_perf_run.py to loop a workload and modify DDL_CRUD_PATTERN : in query_executor.py to not not trigger EXPLAIN in regular CREATE : query (non CTAS). > Why not to separate that part in its own changelist for clarity? The kudu-ignore.test that I craft require these changes in Impala perf script. I'm OK to drop kudu-ignore.test and the related script changes to reduce the patch scope. http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc@254 PS5, Line 254: if (sink_action_ == TSinkAction::INSERT) { : return table_->NewInsertIgnore(); : } else if (sink_action_ == TSinkAction::UPDATE) { : return table_->NewUpdateIgnore(); : } else if (sink_action_ == TSinkAction::UPSERT) { : return table_->NewUpsert(); : } else { : DCHECK(sink_action_ == TSinkAction::DELETE) : << "Sink type not supported: " << sink_action_; : return table_->NewDeleteIgnore(); : } > Just curios: I used to see this is done usually via switch statements, not This is implemented with 'if/else' just to maintain consistency with the existing KuduTableSink::NewWriteOp(). http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc@354 PS5, Line 354: + metrics.GetMetric("delete_ignore_errors"); > Does it make sense to add a DCHECK to verify that ignored_errors >= total_i Make sense. Will do. http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc@356 PS5, Line 356: // Since ignore_conflicts_ is true, there should be no Kudu error message. : DCHECK(session_->CountPendingErrors() == 0); > Why? What if some other than duplicate/absent key error happened? My assumption is that XXX_IGNORE ops will not return any error message back to client, regardless of the root cause. Is that incorrect? http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc@360 PS5, Line 360: if (is_transactional_) { : // Return general status error to abort transaction. : return Status("Kudu reported write operation errors during transaction."); : } > That's an interesting twist: with this, Impala unconditionally aborts multi This is to maintain previous behavior. In line 382, Impala immediately return error to abort transaction. With XXX_IGNORE ops usage, I think we should still abort the transaction even although the error message is discarded. Will clarify this as comment. http://gerrit.cloudera.org:8080/#/c/18536/5/be/src/exec/kudu-table-sink.cc@364 PS5, Line 364: return Status::OK(); > What if there were some UPSERT operation errors in addition? Is it safe to Impala does not mix write operation in single KuduTableSink instance lifetime. The sink_action_ will either be INSERT[_IGNORE], UPDATE[_IGNORE], or UPSERT and remain unchanged. Will add comment about this. http://gerrit.cloudera.org:8080/#/c/18536/5/fe/src/main/java/org/apache/impala/planner/KuduTableSink.java File fe/src/main/java/org/apache/impala/planner/KuduTableSink.java: http://gerrit.cloudera.org:8080/#/c/18536/5/fe/src/main/java/org/apache/impala/planner/KuduTableSink.java@75 PS5, Line 75: supportsIgnoreOperations_ = client.supportsIgnoreOperations(); > Is there a way to find out what mode KuduTableSink uses later on during its I think we can put an ExecOption info in Impala query profile about this. -- To view, visit http://gerrit.cloudera.org:8080/18536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8da7c41d61b0888378b390b8b643238433eb3b52 Gerrit-Change-Number: 18536 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Fri, 20 May 2022 21:36:46 +0000 Gerrit-HasComments: Yes
