ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/14107 )
Change subject: KUDU-2921: Exposing the table statistics to spark relation. ...................................................................... Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java File java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java: PS2: > Copyright header. Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java@47 PS2, Line 47: Master.TableIdentifierPB.newBuilder().setTableName(name).build(); > Nit: indent Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java File java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java: PS2: > Copyright header. Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java@28 PS2, Line 28: > Should mention that this is pre-replication (i.e. based on the cumulative o Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java File java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java: PS2: > Copyright header. Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java@38 PS2, Line 38: long liveRowCount) { > Should be getLiveRowCount Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java: http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java@802 PS2, Line 802: assertTrue(currentStatistics.getLiveRowCount() >= prevStatistics.getLiveRowCount()); : assertTrue(currentStatistics.getLiveRowCount() <= i + 1); > Break this up into two separate asserts so that if one condition fails, it' Done http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@263 PS2, Line 263: > What's the purpose of this scalar if its value is 1? This sizeFactor is use to counteract the compression of disk size. Here the value is experimental and it do nothing with value 1. I try to use this to make sure the sizeInBytes below is overestimate. Because the sizeInBytes is encoded size so it is hard to guarantee overestimate maybe it is better to add it as a config. http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@267 PS2, Line 267: > Why is this needed? Done http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.h@636 PS2, Line 636: If 'user' is provided, checks if the user is : // authorized to get such statistics. > This was copy-pasted from ListTables but isn't correct for GetTableStatisti Done, and according to HaoHao's comment I add new authorization check. http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.cc@271 PS2, Line 271: DEFINE_bool(mock_table_metrics_for_testing, false, : "Whether to enable mock table metrics for testing."); : TAG_FLAG(mock_table_metrics_for_testing, runtime); : TAG_FLAG(mock_table_metrics_for_testing, unsafe); : : DEFINE_int64(on_disk_size_for_testing, 0, : "Mock the on disk size of metrics for testing."); : TAG_FLAG(on_disk_size_for_testing, runtime); : TAG_FLAG(on_disk_size_for_testing, unsafe); : : DEFINE_int64(live_row_count_for_testing, 0, : "Mock the live row count of metrics for testing."); : TAG_FLAG(live_row_count_for_testing, runtime); : TAG_FLAG(live_row_count_for_testing, unsafe); > If these are for tests, tag them as hidden so they don't show up in docs. According to the comment about Flag tags, it seems tagged as unsafe will exclude the flag automatically even if it wasn't marked as 'hidden'. Is that unsafe enough? -- To view, visit http://gerrit.cloudera.org:8080/14107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7742a76708f989b0ccc8ba417f3390013e260175 Gerrit-Change-Number: 14107 Gerrit-PatchSet: 3 Gerrit-Owner: ZhangYao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: ZhangYao <[email protected]> Gerrit-Comment-Date: Wed, 28 Aug 2019 03:32:46 +0000 Gerrit-HasComments: Yes
