[GitHub] [spark] cloud-fan commented on a diff in pull request #38193: [SPARK-39375][CONNECT][FOLLOW-UP] Refactor Read to UnresolvedRelation

GitBox Tue, 11 Oct 2022 05:57:08 -0700


cloud-fan commented on code in PR #38193:
URL: https://github.com/apache/spark/pull/38193#discussion_r992291218



##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -31,7 +31,7 @@ option java_package = "org.apache.spark.connect.proto";
 message Relation {
   RelationCommon common = 1;
   oneof rel_type {
-    Read read = 2;
+    UnresolvedRelation unresolved_relation = 2;

Review Comment:
   @hvanhovell, yes, we are talking about how to describe the scan operation. 
IIUC there are two options on the table:
   1. Have a single `Read` operation with many fields: `tableName`, 
`userSpecifiedSchema`, `dataSourceName`, `options`. If we are scanning a table, 
only `tableName` is set; if we are scanning a data source, all fields except 
`tableName` are set.
   2. Have an `UnresolvedRelation` and an `UnresolvedDataSource` (we can pick 
better names, this is just to match Catalyst). `UnresolvedRelation` only has a 
`tableName` field, while `UnresolvedDataSource` has several fields: 
`userSpecifiedSchema`, `dataSourceName`, `options`.
   
   I'm not familiar with the protobuf design philosophy. Does it prefer option 1 
or 2?
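
   To make the comparison concrete, here is a rough sketch of the two shapes. 
The field numbers, the field types (for example, the schema as a plain string), 
the snake_case spellings, and the `unresolved_data_source` field are 
illustrative assumptions, not taken from this PR:

```protobuf
syntax = "proto3";

// Option 1 (sketch): one message covers both kinds of scan.
// Table scan: only table_name is set.
// Data source scan: every field except table_name is set.
message Read {
  string table_name = 1;
  string user_specified_schema = 2;  // assumed representation
  string data_source_name = 3;
  map<string, string> options = 4;
}

// Option 2 (sketch): one message per kind of scan.
message UnresolvedRelation {
  string table_name = 1;
}

message UnresolvedDataSource {
  string user_specified_schema = 1;  // assumed representation
  string data_source_name = 2;
  map<string, string> options = 3;
}

// With option 2, the enclosing oneof selects the kind of scan,
// so no message carries fields that are meaningless for its case.
message Relation {
  oneof rel_type {
    UnresolvedRelation unresolved_relation = 2;
    UnresolvedDataSource unresolved_data_source = 3;  // hypothetical field
  }
}
```

   In option 1 the "which fields are valid together" rule lives only in 
comments and in server-side validation, whereas in option 2 the schema itself 
encodes it.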



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

