Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20647#discussion_r170087573
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -107,17 +106,24 @@ case class DataSourceV2Relation(
}
/**
- * A specialization of DataSourceV2Relation with the streaming bit set to true. Otherwise identical
- * to the non-streaming relation.
+ * A specialization of [[DataSourceV2Relation]] with the streaming bit set to true.
+ *
+ * Note that, this plan has a mutable reader, so Spark won't apply operator push-down for this plan,
+ * to avoid making the plan mutable. We should consolidate this plan and [[DataSourceV2Relation]]
+ * after we figure out how to apply operator push-down for streaming data sources.
--- End diff --
I agree that a diagram would really help us follow what's happening and the
assumptions that are going into these proposals.
I'd also like to see this discussion happen on the dev list, where more
people can participate. The streaming API for v2 wasn't really discussed there
(unless I missed it), and given these challenges I think we should go back and
have a design discussion on it. This PR probably isn't the right place to
get into these details.
---