Hello Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10375

to look at the new patch set (#18).

Change subject: Kudu Backup/Restore Spark Jobs
......................................................................

Kudu Backup/Restore Spark Jobs

Adds a rough base implementation of Kudu backup and restore Spark jobs. There are many TODOs marking gaps, additional testing, and details still to be finished. However, these base jobs work and are in a functional state that can be committed and iterated on as we build up and improve our backup functionality. These jobs, as annotated, should be considered private, unstable, and experimental.

The backup job can output the data of one or more tables to any Spark-compatible path in any Spark-compatible format, the defaults being HDFS and Parquet. Each table’s data is written in a subdirectory of the provided path. The subdirectory’s name is the URL-encoded table name. Additionally, in each table’s directory a JSON metadata file is written with the metadata needed to recreate the exported table when restoring.

The restore job can read the generated data and metadata, create “restore” tables with a matching schema, and reload the data.

The job arguments are a work in progress and will likely be enhanced and simplified as we find what is useful and what isn’t through performance and functional testing. More documentation will be generated when the jobs are ready for general use.
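
The output layout described in the commit message (one subdirectory per table, named with the URL-encoded table name, holding a JSON metadata file) can be sketched roughly as follows. This is a minimal illustration in Java, not the code in this patch: the class `BackupLayoutSketch`, its methods, and the metadata file name `metadata.json` are hypothetical, and using `java.net.URLEncoder` with UTF-8 is an assumption about how the encoding might be done.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper illustrating the directory layout described in the
// commit message. None of these names come from the patch itself.
class BackupLayoutSketch {

    // Assumed file name; the commit message only says "a json metadata file".
    static final String METADATA_FILE = "metadata.json";

    // Returns <rootPath>/<URL-encoded table name>.
    static String tableDir(String rootPath, String tableName) {
        try {
            String encoded = URLEncoder.encode(tableName, "UTF-8");
            // Avoid a doubled slash when the root already ends with one.
            String root = rootPath.endsWith("/")
                ? rootPath.substring(0, rootPath.length() - 1)
                : rootPath;
            return root + "/" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Returns the path of the per-table metadata file inside the table directory.
    static String metadataPath(String rootPath, String tableName) {
        return tableDir(rootPath, tableName) + "/" + METADATA_FILE;
    }

    public static void main(String[] args) {
        System.out.println(tableDir("hdfs:///kudu-backups", "impala::default.orders"));
        System.out.println(metadataPath("hdfs:///kudu-backups", "impala::default.orders"));
    }
}
```

Under these assumptions, a table named `impala::default.orders` backed up under `hdfs:///kudu-backups` would map to `hdfs:///kudu-backups/impala%3A%3Adefault.orders`, since `URLEncoder` percent-encodes `:` and leaves `.` unchanged.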
Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
---
M java/gradle/dependencies.gradle
A java/kudu-backup/build.gradle
A java/kudu-backup/pom.xml
A java/kudu-backup/src/main/protobuf/backup.proto
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestoreOptions.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
A java/kudu-backup/src/test/resources/log4j.properties
A java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-client/src/main/java/org/apache/kudu/Type.java
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M java/pom.xml
M java/settings.gradle
18 files changed, 1,594 insertions(+), 7 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/10375/18
--
To view, visit http://gerrit.cloudera.org:8080/10375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Gerrit-Change-Number: 10375
Gerrit-PatchSet: 18
Gerrit-Owner: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>