[
https://issues.apache.org/jira/browse/KUDU-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Henke updated KUDU-2812:
------------------------------
Summary: Problem with error reporting in kudu-backup (was: Problem with
error reporting in kudu-spark and kudu-backup)
> Problem with error reporting in kudu-backup
> -------------------------------------------
>
> Key: KUDU-2812
> URL: https://issues.apache.org/jira/browse/KUDU-2812
> Project: Kudu
> Issue Type: Bug
> Components: backup, spark
> Affects Versions: 1.9.0
> Reporter: Will Berkeley
> Priority: Major
> Labels: backup
> Fix For: 1.10.0
>
>
> In KuduRestore.scala we have code like
> {noformat}
> // Fail the task if there are any errors.
> val errorCount = session.getPendingErrors.getRowErrors.length
> if (errorCount > 0) {
>   val errors =
>     session.getPendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>   throw new RuntimeException(
>     s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
> }
> {noformat}
> There's similar code in KuduContext.scala:
> {noformat}
> val errorCount = pendingErrors.getRowErrors.length
> if (errorCount > 0) {
>   val errors =
>     pendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>   throw new RuntimeException(
>     s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
> }
> {noformat}
> I've seen the former report a nonzero error count but fail to print any
> sample errors. Taking a single reference to
> {{session.getPendingErrors.getRowErrors}} and using it throughout fixes this,
> so there seems to be a TOCTOU (time-of-check to time-of-use) problem,
> probably because multiple batches can be in flight at once.
> The latter is most likely vulnerable to this as well.
> This issue made diagnosing KUDU-2809 harder.
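The "take one reference and use it throughout" fix described above can be sketched as follows. This is a minimal illustration and not the actual patch: `FakeSession`, `recordError`, `reportTwice`, and `reportOnce` are hypothetical names, and the stand-in simply assumes that each read of the pending errors can return a different set than the previous read (modelled here by clearing the buffer on read). It does not use or describe the real Kudu client API.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a write session. ASSUMPTION: each call to
// getPendingErrors hands over the buffered errors and clears the buffer,
// so two consecutive reads can disagree. This is not the Kudu client API.
final class FakeSession {
  private val buffer = mutable.ArrayBuffer.empty[String]

  def recordError(status: String): Unit = synchronized { buffer += status }

  def getPendingErrors: Array[String] = synchronized {
    val drained = buffer.toArray
    buffer.clear()
    drained
  }
}

object SingleSnapshotDemo {
  // Pattern from the description: the count and the samples come from
  // two separate reads, so they may describe different sets of errors.
  def reportTwice(session: FakeSession): Option[String] = {
    val errorCount = session.getPendingErrors.length
    if (errorCount > 0) {
      val errors = session.getPendingErrors.take(5).mkString(", ")
      Some(s"failed to write $errorCount rows; sample errors: $errors")
    } else None
  }

  // Suggested pattern: read once, then derive both the count and the
  // samples from that single snapshot.
  def reportOnce(session: FakeSession): Option[String] = {
    val rowErrors = session.getPendingErrors
    val errorCount = rowErrors.length
    if (errorCount > 0) {
      val errors = rowErrors.take(5).mkString(", ")
      Some(s"failed to write $errorCount rows; sample errors: $errors")
    } else None
  }

  def main(args: Array[String]): Unit = {
    val a = new FakeSession
    a.recordError("row 1 rejected")
    a.recordError("row 2 rejected")
    println(reportTwice(a)) // count is reported, samples are empty

    val b = new FakeSession
    b.recordError("row 1 rejected")
    b.recordError("row 2 rejected")
    println(reportOnce(b)) // count and samples agree
  }
}
```

Under that assumption the two-read version reproduces the symptom (a nonzero count with no sample errors), while the single-snapshot version keeps the count and the samples consistent regardless of what happens to the session between reads.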
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)