cloud-fan commented on code in PR #38193:
URL: https://github.com/apache/spark/pull/38193#discussion_r992079265
##########
connector/connect/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -31,7 +31,7 @@ option java_package = "org.apache.spark.connect.proto";
message Relation {
RelationCommon common = 1;
oneof rel_type {
- Read read = 2;
+ UnresolvedRelation unresolved_relation = 2;
Review Comment:
I left a comment in
https://github.com/apache/spark/pull/38193#discussion_r991722481 but let me
also reply here.
Yes, we want the query plan proto definition to follow the basics of
relational algebra / the SQL language. However, data source reading is not
really a SQL concept, but more of a "configuration". To read a data source in
SQL, we need to do
```
CREATE TABLE t USING my_source OPTIONS (...)
SELECT * FROM t
```
We must register the data source as a table and then read it. We can't read
a data source directly in SQL.
The DataFrame API, however, provides a way to read a data source directly:
`spark.read.format("my_source").option(...).load()`. This means we can't say
that `a read is a sink the same way you use it in the from statement`. Reading
a data source directly needs a dedicated query plan.
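To make the idea concrete, a dedicated plan node for direct data source reads could carry the format and options as plain configuration. This is only a hypothetical sketch, not the actual Spark Connect definition; the message and field names are illustrative:

```protobuf
// Hypothetical: a plan node for reading a data source directly,
// mirroring spark.read.format(...).option(...).load().
message DataSourceRead {
  string format = 1;               // e.g. "my_source"
  map<string, string> options = 2; // reader options, e.g. path, header
}
```

Such a node captures the "configuration" nature of a direct read, as opposed to `UnresolvedRelation`, which names a registered table to be resolved by the analyzer.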
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]