cloud-fan commented on code in PR #38193:
URL: https://github.com/apache/spark/pull/38193#discussion_r992079265
##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -31,7 +31,7 @@ option java_package = "org.apache.spark.connect.proto";
message Relation {
RelationCommon common = 1;
oneof rel_type {
- Read read = 2;
+ UnresolvedRelation unresolved_relation = 2;
Review Comment:
I left a comment in
https://github.com/apache/spark/pull/38193#discussion_r991722481 but let me
also reply here.
Yes, we want the query plan proto definition to follow the basics of
relational algebra / the SQL language. However, data source reading is not
really a SQL concept, but more of a "configuration". To read a data source in
SQL, we need to do
```
CREATE TABLE t USING my_source OPTIONS (...)
SELECT * FROM t
```
We must register the data source as a table and then read it. We can't read
a data source directly in SQL.
The DataFrame API, however, provides a way to read a data source directly:
`spark.read.format("my_source").option(...).load()`. This means we can't say
that `a read is a sink the same way you use it in the from statement`. Reading
a data source directly needs a dedicated query plan.
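To make the idea concrete, a dedicated plan node for direct data source reads could carry the format and options as plain configuration. This is only a hypothetical sketch, not the actual Spark Connect definition; the message and field names are illustrative:

```protobuf
// Hypothetical: a plan node for reading a data source directly,
// mirroring spark.read.format(...).option(...).load().
message DataSourceRead {
  string format = 1;               // e.g. "my_source"
  map<string, string> options = 2; // reader options, e.g. path, header
}
```

Such a node captures the "configuration" nature of a direct read, as opposed to `UnresolvedRelation`, which names a registered table to be resolved by the analyzer.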
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]