Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10941 )

Change subject: WIP Add error handling to KuduBackup DONT_BUILD KuduBackup will 
attempt to backup every table. If any one table fails the entire job will fail.
......................................................................


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG@7
PS2, Line 7: WIP Add error handling to KuduBackup
Nit: This subject is really long. Shows up as 4 lines on gerrit.


http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG@21
PS2, Line 21: - if 'overwrite' flag is specified, overwrite the 'metadata'
I think we can keep it simple and let users manually delete files they don't 
want for now.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala@5
PS2, Line 5: class BackupStatus(val tableName: String, val isSuccess: Boolean) {
Since their isn't a ton of custom functionality perhaps using Scala's built in 
Try directly is good enough and more lightweight.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@39
PS2, Line 39:   def main(args: Array[String]): Unit = {
Is there a reason you moved main to the top of the class? I think we should try 
to keep the layout between KuduBackup and KuduRestore similar since the 
processes are so similar.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@56
PS2, Line 56:       throw new Exception(
Nit. Use a more specific exception.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@57
PS2, Line 57:         s"Failed backing up at least one table. See previous 
error messages for details. Failed tables: ${failures.mkString(",")}")
Instead of "at least one". Include the count that failed out of the total 
submitted.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@70
PS2, Line 70:         // @TODO: do incremental backup goes here
I am not sure this is how we will handle incremental. I am thinking we would 
pass an argument requesting incremental vs full. In the case of a full backup I 
think we can rely on Sparks SaveMode semantics to detect existing data. That 
covers more cases because any data existing, not just the metadata thats 
successful, we should likely fail on.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@74
PS2, Line 74:                                  context <- Try(new 
KuduContext(options.kuduMasterAddresses, session.sparkContext));
Why did this context get move here? I think it can be shared for all uses in 
the application.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@87
PS2, Line 87:               case Success(_) => log.error(s"Removed partial 
backup directory: $tablePath")
Maybe info level


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@96
PS2, Line 96:   private def writeMetadata(options: KuduBackupOptions, session: 
SparkSession, tablePath: Path, table: KuduTable) = {
Nit: add explicit return type

Could this method just be merged into writeTableMetadata?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@127
PS2, Line 127:     val conf = session.sparkContext.hadoopConfiguration
Now that we do this 3 line preamble to get the filesystem and hpath a few 
times, maybe we should move these methods to a utility?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@135
PS2, Line 135:   private def getTableMetadata(options: KuduBackupOptions, 
session: SparkSession, tablePath: Path, table: KuduTable) = Try {
What is this for? It looks like it writes the metadata.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@66
PS2, Line 66:           BackupSuccess(t)
Should the be RestoreSuccess/RestoreFailure? Since their isn't a ton of custom 
functionality perhaps using Scala's built in Try directly is good enough and 
more lightweight.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@79
PS2, Line 79:   def readTableMetadata(path: Path, session: SparkSession): 
Try[TableMetadataPB] = Try {
Nit: Explicit return type.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@91
PS2, Line 91:   private def getTable(metadata: TableMetadataPB, options: 
KuduRestoreOptions, context: KuduContext): Try[KuduTable] = Try {
Nit: Explicit return type

Maybe getOrCreateTable to be more descriptive?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
File 
java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@85
PS2, Line 85:         val backupRows = kuduContext.kuduRDD(ss.sparkContext, 
s"$tableName").collect
nit: accidental indent


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@140
PS2, Line 140:   test("Backup Failure When No Connection to Kudu") {
This could fail because the path is not found. Need to validate why it failed.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@147
PS2, Line 147:   test("Restore Throws Exception When Path Not Found") {
Need to fill this out.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@151
PS2, Line 151:   test("Restore Throws Exception When Table Already Exists") {
Need to fill this out.



--
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 2
Gerrit-Owner: Tony Foerster <anthonymfoers...@gmail.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <anthonymfoers...@gmail.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 20:11:34 +0000
Gerrit-HasComments: Yes

Reply via email to