amaliujia commented on code in PR #38938:
URL: https://github.com/apache/spark/pull/38938#discussion_r1041314131
##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -404,6 +405,18 @@ message StatSummary {
repeated string statistics = 2;
}
+// Computes basic statistics for numeric and string columns, including count, mean, stddev, min,
+// and max. If no columns are given, this function computes statistics for all numerical or
+// string columns.
+// It will invoke 'Dataset.describe' to compute the results.
+message StatDescribe {
+ // (Required) The input relation.
+ Relation input = 1;
+
+ // Columns to compute statistics on.
+ repeated string cols = 2;
Review Comment:
Can you follow
https://github.com/apache/spark/blob/master/connector/connect/docs/adding-proto-messages.md
to mark whether this field is required or optional?
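For illustration, a minimal sketch of what the annotated field might look like, following the `(Required)` prefix already used on `input` in this diff. Marking `cols` as `(Optional)` is an assumption inferred from the message comment ("If no columns are given..."), and the empty `Relation` message is only a placeholder so the snippet stands alone:

```protobuf
syntax = "proto3";

package spark.connect;

// Placeholder for the real Relation message (assumption, only here so
// that this sketch is self-contained).
message Relation {}

message StatDescribe {
  // (Required) The input relation.
  Relation input = 1;

  // (Optional) Columns to compute statistics on.
  // Assumed optional: when empty, statistics are computed for all
  // numerical or string columns.
  repeated string cols = 2;
}
```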
##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -404,6 +405,18 @@ message StatSummary {
repeated string statistics = 2;
}
+// Computes basic statistics for numeric and string columns, including count, mean, stddev, min,
+// and max. If no columns are given, this function computes statistics for all numerical or
+// string columns.
+// It will invoke 'Dataset.describe' to compute the results.
Review Comment:
`It will invoke 'Dataset.describe' to compute the results.`
I tend to avoid saying this, as it exposes implementation details, while the
comment here actually serves as proto documentation. (E.g., we might switch the
implementation to something else but forget to update this comment, causing an
inconsistency.)
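One possible wording, as a sketch only: the message comment with the implementation sentence dropped, so it describes behavior rather than the backing call. The fields are omitted here and assumed to stay as in the diff:

```protobuf
syntax = "proto3";

package spark.connect;

// Computes basic statistics for numeric and string columns, including count, mean, stddev, min,
// and max. If no columns are given, this function computes statistics for all numerical or
// string columns.
message StatDescribe {
  // Fields omitted in this sketch; they are assumed unchanged from the diff.
}
```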
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]