Grant Henke has posted comments on this change. ( )

Change subject: Kudu Backup/Restore Spark Jobs

Patch Set 16:

Commit Message:
PS14, Line 9: Adds a rough base implementation of Kudu backup and restore
> Nit: please reformat the commit message so it adheres to our style guide's
PS14, Line 9:
> Nit: todos
PS14, Line 13:
> Nit: as written, you should combine these two sentences: "in any spark comp
PS14, Line 13:
> In the "per table" directory, perhaps? It's a directory per table, right?
PS14, Line 15: These jobs, as annotated, should be considered private, unstable,
> The second sentence is a fragment; consider joining it to the first: "
File java/kudu-backup/src/main/protobuf/backup.proto:
PS12, Line 28:
> Can you move TableMetadataPB to the bottom?
File java/kudu-backup/src/main/protobuf/backup.proto:
PS14, Line 22: syntax = "proto3";
> The default in proto3 is for each field to be optional, right?
Explicit "optional" keyword are disallowed in proto3 syntax, as fields are
optional by default; required fields are no longer supported.

They also removed field presence logic for primitive value fields and default 
values. This is why I use StringValue below.
PS14, Line 40: differentiate be
> Nit: differentiate.
PS14, Line 58: }
> Is this a column name? An index? An ID?
PS14, Line 59:
> Is this value encoded? Decoded?
PS14, Line 68:     int32 seed = 3;
> Column names? Indexes? IDs?
PS14, Line 73:     RangePartitionMetadataPB rangePartitions = 2;
> Same.
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:
PS14, Line 66: prevents
> Nit: prevents
PS12, Line 41:
> Still to do?
Yeah, because I am not sure what our final limitations are going to be and 
right now that isn't a public "concept" I will put a todo comment.
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala:
PS12, Line 38: sc
> What I meant was, the KuduContext argument to this function is named 'kuduC
Oh, that's actually a pain, because it collides with a variable in the RDD 
class we are extending.

I will note, ideally we would just use the context included in kuduContext, but 
it has serialization issues, so we can't.
PS12, Line 87: 
> Nit: indentation.
PS12, Line 87:
> Still to do.
PS12, Line 91:
> Still to do.
PS12, Line 91:
> Not really understanding the purpose of this wrapper. Doc?
PS12, Line 97:   * that takes the job partitions and task context and expects 
an Iterator[Row] in return.
> Still to do.
This matches the KuduRDD in SparkContext. We may eventually merge some 
functionality back in so I would like to leave it the same for ease of 
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:
PS14, Line 43: ins't
> Nit: isn't
PS12, Line 47:
> OK, can you reference that constant here then?
I want the test to break if the constant changes.
File java/kudu-client/src/main/java/org/apache/kudu/
PS14, Line 194:
> Nit: should add a space before this.

To view, visit
To unsubscribe, visit

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Gerrit-Change-Number: 10375
Gerrit-PatchSet: 16
Gerrit-Owner: Grant Henke <>
Gerrit-Reviewer: Adar Dembo <>
Gerrit-Reviewer: Grant Henke <>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <>
Gerrit-Reviewer: Todd Lipcon <>
Gerrit-Comment-Date: Wed, 13 Jun 2018 19:59:36 +0000
Gerrit-HasComments: Yes

Reply via email to