[GitHub] [spark] amaliujia commented on a diff in pull request #38193: [SPARK-39375][CONNECT][FOLLOW-UP] Refactor Read to UnresolvedRelation

GitBox Mon, 10 Oct 2022 15:31:26 -0700


amaliujia commented on code in PR #38193:
URL: https://github.com/apache/spark/pull/38193#discussion_r991673158



##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -31,7 +31,7 @@ option java_package = "org.apache.spark.connect.proto";
 message Relation {
   RelationCommon common = 1;
   oneof rel_type {
-    Read read = 2;
+    UnresolvedRelation unresolved_relation = 2;

Review Comment:
   Per the review suggestions on https://github.com/apache/spark/pull/38086, we 
plan to decouple the unresolved relation from the data source (e.g. by putting 
them in different messages). cc @cloud-fan 
   
   A data source needs its own plan with its own fields. In that case, the 
current `Read` would represent only an unresolved relation. 
   
   In Spark, I don't think Relation and DataSource have ever been the same 
concept. A relation has an identifier that the analyzer needs to resolve. A 
data source has a format, path, schema, and other configuration. IIRC, relation 
and data source do not share the same fields.
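   
   To illustrate the split, here is a rough sketch of how the two concepts 
could look as separate messages. This is only an assumption about the shape, 
not the proposed definition: the package name, the `DataSource` message, and 
every field name and number below are hypothetical.

```protobuf
syntax = "proto3";

// Hypothetical package for illustration; not the real spark.connect package.
package sketch;

// A relation is only a name. The analyzer resolves it against the catalog.
message UnresolvedRelation {
  // Assumed shape: multi-part identifier such as ["catalog", "db", "table"].
  repeated string parts = 1;
}

// A data source carries what is needed to read it directly,
// with no identifier for the analyzer to look up.
message DataSource {
  string format = 1;                // e.g. "parquet"
  repeated string paths = 2;        // one or more input paths
  string schema = 3;                // assumed: schema given as a DDL string
  map<string, string> options = 4;  // other reader configuration
}
```

   The two messages have no field in common, which is the reason for keeping 
them apart rather than folding both into a single `Read`.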



