Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16031 )
Change subject: KUDU-1802: Avoid call to master when deserializing scan tokens ...................................................................... Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/16031/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16031/5//COMMIT_MSG@23 PS5, Line 23: This doesn’t avoid the need for a call to the master in the case : of writing data to Kudu > It also doesn't populate the meta cache, right? We still need to send GetTa Right, this is specifically avoiding the schema metadata currently. http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala: http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala@60 PS5, Line 60: This shouldn't be required. > nit: it may not be obvious from the laid out solution why this is required. Done http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-client/src/main/java/org/apache/kudu/client/ProtobufHelper.java File java/kudu-client/src/main/java/org/apache/kudu/client/ProtobufHelper.java: http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-client/src/main/java/org/apache/kudu/client/ProtobufHelper.java@232 PS5, Line 232: partitionSchema.getHashBucketSchemas()) { : Common.PartitionSchemaPB.HashBucketSchemaPB.Builder hbsBuilder = : Common.PartitionSchemaPB.HashBucketSchemaPB.newBuilder() : .addAllColumns(idsToPb(hashBucketSchema.getColumnIds())) : .setNumBuckets(hashBucketSchema.getNumBuckets()) : .setSeed(hashBucketSchema.getSeed()); : builder.addHashBucketSchemas(hbsBuilder.build()); : } : : Common.PartitionSchemaPB.RangeSchemaPB rangeSchemaPB = : Common.PartitionSchemaPB.RangeSchemaPB.newBuilder() : .addAllColumns(idsToPb(partitionSchema.getRangeSchema().getColumnIds())) : .build(); > nit: typical newline alignment in Java code is to 4 spaces, not 8, though r Done http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestScanToken.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestScanToken.java: http://gerrit.cloudera.org:8080/#/c/16031/5/java/kudu-client/src/test/java/org/apache/kudu/client/TestScanToken.java@489 PS5, Line 489: // TODO(ghenke): This shouldn't be required. Instead we should use snapshot scans : // and Kudu should use the schema at the given snapshot time when scanning or we : // should leverage column ids in the projection and predicates for scans : // (similar to how we use the table id to handle table renames). > Should this be a jira? Seems high-hanging enough that a TODO might not do t yeah, definitely. I meant to file one, thanks for pointing it out. http://gerrit.cloudera.org:8080/#/c/16031/5/src/kudu/client/client.proto File src/kudu/client/client.proto: http://gerrit.cloudera.org:8080/#/c/16031/5/src/kudu/client/client.proto@28 PS5, Line 28: message TableMetadataPB { > nit: add some docs explaining how this can be used, maybe adding some color I add the color below in the scan token given this can be used more generically for future use too. I will add some context here though. http://gerrit.cloudera.org:8080/#/c/16031/5/src/kudu/client/scan_token-internal.cc File src/kudu/client/scan_token-internal.cc: http://gerrit.cloudera.org:8080/#/c/16031/5/src/kudu/client/scan_token-internal.cc@128 PS5, Line 128: table.reset(new KuduTable(client->shared_from_this(), message.table_name(), message.table_id(), : metadata.num_replicas(), kuduSchema, partition_schema, extra_configs)); > This doesn't clear the meta cache as we would in OpenTable() -- is that imp This comment is also on `AsyncKuduClient.getTableSchema`. That comment was added in https://github.com/apache/kudu/commit/c1f78ffb59af28c0f453cb8485ed946465f19f43 and looks to specifically address issues around newly created tables cached state. I don't think it is an issue for scan tokens. http://gerrit.cloudera.org:8080/#/c/16031/5/src/kudu/client/scan_token-internal.cc@266 PS5, Line 266: pb.set_table_id(table->id()); : pb.set_table_name(table->name()); > Since these are optional when including table metadata, is there a reason t Done -- To view, visit http://gerrit.cloudera.org:8080/16031 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c1b8392de37dd5e8b7bd8b78a21603ff8b1d1b Gerrit-Change-Number: 16031 Gerrit-PatchSet: 5 Gerrit-Owner: Grant Henke <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Tue, 09 Jun 2020 03:41:14 +0000 Gerrit-HasComments: Yes
