[ https://issues.apache.org/jira/browse/KUDU-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Will Berkeley updated KUDU-2812:
--------------------------------
    Summary: Problem with error reporting in kudu-spark and kudu-backup  (was: TOCTOU problem with error reporting in kudu-spark and kudu-backup)

> Problem with error reporting in kudu-spark and kudu-backup
> ----------------------------------------------------------
>
>                 Key: KUDU-2812
>                 URL: https://issues.apache.org/jira/browse/KUDU-2812
>             Project: Kudu
>          Issue Type: Bug
>          Components: backup, spark
>    Affects Versions: 1.9.0
>            Reporter: Will Berkeley
>            Priority: Major
>             Fix For: 1.10.0
>
>
> In KuduRestore.scala we have code like
> {noformat}
>       // Fail the task if there are any errors.
>       val errorCount = session.getPendingErrors.getRowErrors.length
>       if (errorCount > 0) {
>         val errors = session.getPendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>         throw new RuntimeException(
>           s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
>       }
> {noformat}
> There's similar code in KuduContext.scala:
> {noformat}
>     val errorCount = pendingErrors.getRowErrors.length
>     if (errorCount > 0) {
>       val errors = pendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>       throw new RuntimeException(
>         s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
>     }
> {noformat}
> I've seen the former fail to print any sample errors. Taking a single reference to
> {{session.getPendingErrors.getRowErrors}} and using it throughout fixes this,
> so it seems like there's some TOCTOU problem that can occur, probably because
> multiple batches can be in flight at once.
> The latter is most likely vulnerable to this as well.
> This issue made diagnosing KUDU-2809 harder.
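
The fix described in the report, taking one reference to the row errors and using it for both the count and the sample, might look like the sketch below. `FakeRowError`, `FakePendingErrors`, `FakeSession` and `ErrorReporting.failureMessage` are hypothetical stand-ins written for this example and are not Kudu client classes. The stub assumes, for illustration only, that a second read of the pending errors does not see what the first read saw.

```scala
// Hypothetical stand-ins for the client types; not the real Kudu API.
final case class FakeRowError(status: String) {
  def getErrorStatus: String = status
}

final class FakePendingErrors(errors: Array[FakeRowError]) {
  def getRowErrors: Array[FakeRowError] = errors
}

// Assumption for illustration: each call hands back whatever is pending at
// that moment, so a later call may not return what an earlier call returned.
final class FakeSession(initial: Seq[FakeRowError]) {
  private var pending: Vector[FakeRowError] = initial.toVector

  def getPendingErrors: FakePendingErrors = {
    val snapshot = pending.toArray
    pending = Vector.empty
    new FakePendingErrors(snapshot)
  }
}

object ErrorReporting {
  // Read the row errors exactly once, then derive both the count and the
  // sample from that same array so the two can never disagree.
  def failureMessage(session: FakeSession): Option[String] = {
    val rowErrors = session.getPendingErrors.getRowErrors
    if (rowErrors.isEmpty) {
      None
    } else {
      val sample = rowErrors.take(5).map(_.getErrorStatus).mkString("; ")
      Some(
        s"failed to write ${rowErrors.length} rows from DataFrame to Kudu; " +
          s"sample errors: $sample")
    }
  }
}
```

Applied to the code quoted in the report, the same pattern would amount to binding `session.getPendingErrors.getRowErrors` to a single `val` and computing both `errorCount` and `errors` from it. The explicit `"; "` separator passed to `mkString` is an addition of this sketch, not part of the quoted code.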