ZhangYao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14107 )

Change subject: KUDU-2921: Exposing the table statistics to spark relation.
......................................................................


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java:

PS2:
> Copyright header.
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsRequest.java@47
PS2, Line 47:             
Master.TableIdentifierPB.newBuilder().setTableName(name).build();
> Nit: indent
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java:

PS2:
> Copyright header.
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/GetTableStatisticsResponse.java@28
PS2, Line 28:
> Should mention that this is pre-replication (i.e. based on the cumulative o
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java:

PS2:
> Copyright header.
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTableStatistics.java@38
PS2, Line 38:                       long liveRowCount) {
> Should be getLiveRowCount
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java:

http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduTable.java@802
PS2, Line 802:       assertTrue(currentStatistics.getLiveRowCount() >= 
prevStatistics.getLiveRowCount());
             :       assertTrue(currentStatistics.getLiveRowCount() <= i + 1);
> Break this up into two separate asserts so that if one condition fails, it'
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@263
PS2, Line 263:
> What's the purpose of this scalar if its value is 1?
This sizeFactor is use to counteract the compression of disk size. Here the 
value is experimental and it do nothing with value 1. I try to use this to make 
sure the sizeInBytes below is overestimate.

Because the sizeInBytes is encoded size so it is hard to guarantee overestimate 
maybe it is better to add it as a config.


http://gerrit.cloudera.org:8080/#/c/14107/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@267
PS2, Line 267:
> Why is this needed?
Done


http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.h@636
PS2, Line 636: If 'user' is provided, checks if the user is
             :   // authorized to get such statistics.
> This was copy-pasted from ListTables but isn't correct for GetTableStatisti
Done, and according to HaoHao's comment I add new authorization check.


http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14107/2/src/kudu/master/catalog_manager.cc@271
PS2, Line 271: DEFINE_bool(mock_table_metrics_for_testing, false,
             :             "Whether to enable mock table metrics for testing.");
             : TAG_FLAG(mock_table_metrics_for_testing, runtime);
             : TAG_FLAG(mock_table_metrics_for_testing, unsafe);
             :
             : DEFINE_int64(on_disk_size_for_testing, 0,
             :              "Mock the on disk size of metrics for testing.");
             : TAG_FLAG(on_disk_size_for_testing, runtime);
             : TAG_FLAG(on_disk_size_for_testing, unsafe);
             :
             : DEFINE_int64(live_row_count_for_testing, 0,
             :              "Mock the live row count of metrics for testing.");
             : TAG_FLAG(live_row_count_for_testing, runtime);
             : TAG_FLAG(live_row_count_for_testing, unsafe);
> If these are for tests, tag them as hidden so they don't show up in docs.
According to the comment about Flag tags, it seems tagged as unsafe will 
exclude the flag automatically even if it wasn't marked as 'hidden'. Is that 
unsafe enough?



--
To view, visit http://gerrit.cloudera.org:8080/14107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7742a76708f989b0ccc8ba417f3390013e260175
Gerrit-Change-Number: 14107
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: ZhangYao <[email protected]>
Gerrit-Comment-Date: Wed, 28 Aug 2019 03:32:46 +0000
Gerrit-HasComments: Yes

Reply via email to