Looking at the Sink in 2.0, there is a warning (added in SPARK-16020 without
much detail) that says "Note: You cannot apply any operators on `data`
except consuming it (e.g., `collect/foreach`)." I'm wondering whether this
restriction is worded too broadly. Provided that we consume the data in a
blocking fashion, could we apply some other transformations beforehand? Or
is there a better way to get functionality equivalent to foreachRDD with
the Structured Streaming API?
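
For concreteness, here is a minimal sketch of the kind of sink the warning
applies to. It assumes the internal
`org.apache.spark.sql.execution.streaming.Sink` trait as it appears in
Spark 2.0 and needs Spark on the classpath. `PrintingSink` is a hypothetical
name used only for illustration, and the sketch does nothing beyond the
consume-only usage the note permits.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.execution.streaming.Sink

// Hypothetical sink for illustration; not part of Spark.
class PrintingSink extends Sink {
  // Called once per micro-batch with that batch's rows.
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    // Consume `data` directly, as the note permits: collect() blocks
    // until the batch has been materialized on the driver.
    val rows = data.collect()
    rows.foreach(row => println(s"batch $batchId: $row"))
    // The open question: would a transformation such as a filter or
    // map on `data` before the blocking collect() also be safe here?
  }
}
```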

On somewhat of a tangent: would it make sense to mark transformations on
Datasets that are not supported for streaming use (e.g. toJSON)?

Cheers,

Holden :)
-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau