[
https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-31376:
----------------------------------
Affects Version/s: (was: 3.0.0)
3.1.0
> Non-global sort support for structured streaming
> ------------------------------------------------
>
> Key: SPARK-31376
> URL: https://issues.apache.org/jira/browse/SPARK-31376
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.1.0
> Reporter: Adam Binford
> Priority: Minor
>
> Currently, all sorting is disallowed with structured streaming queries. Not
> allowing global sorting makes sense, but could non-global sorting (i.e.
> sortWithinPartitions) be allowed? I'm running into this with an external
> source I'm using, but not sure if this would be useful to file sources as
> well. I have to foreachBatch so that I can do a sortWithinPartitions.
> Two main questions:
> * Does a local sort cause issues with any exactly-once guarantees streaming
> queries provides? I can't say I know or understand how these semantics work.
> Or are there other issues I can't think of this would cause?
> * Is the change as simple as changing the unsupported operations check to
> only look for global sorts instead of all sorts?
> I have built a version that simply changes the unsupported check to only
> disallow global sorts and it seems to be working. Anything I'm missing or is
> it this simple?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]