hvanhovell commented on code in PR #40997:
URL: https://github.com/apache/spark/pull/40997#discussion_r1254940165
##########
connector/connect/common/src/main/protobuf/spark/connect/relations.proto:
##########
@@ -225,6 +225,16 @@ message Join {
JOIN_TYPE_LEFT_SEMI = 6;
JOIN_TYPE_CROSS = 7;
}
+
+ // (Optional) Only used by joinWith. Set the left and right join data types.
+ optional JoinDataType join_data_type = 6;
+
+ message JoinDataType {
Review Comment:
This is probably the right thing to do for the wrong reason. The problem
that you seem to be solving is dealing with structs, you need to know when you
should wrap the input in a struct. However this could easily be solved by
looking at the left & right input schemas (if more than 1 column make this a
struct).
The problem that you are actually solving is that when you have applied a
single value encoder to a dataset with multiple columns. The single encoder
will bind to the first column. JoinWith should not project the remaining
columns. This is not something we can infer from the serverside schema.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]