[ 
https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077821#comment-17077821
 ] 

Jungtaek Lim commented on SPARK-31376:
--------------------------------------

Btw, it would be even better if you started the discussion/proposal on the 
dev@ mailing list (or users@) first, so that a discussion like the comments 
here reaches the wider community and we all get a better understanding. I 
might be wrong, and someone there can correct me. There's less chance of that 
on a JIRA issue.

> Non-global sort support for structured streaming
> ------------------------------------------------
>
>                 Key: SPARK-31376
>                 URL: https://issues.apache.org/jira/browse/SPARK-31376
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Adam Binford
>            Priority: Minor
>
> Currently, all sorting is disallowed in structured streaming queries. Not 
> allowing global sorting makes sense, but could non-global sorting (i.e. 
> sortWithinPartitions) be allowed? I'm running into this with an external 
> source I'm using, but I'm not sure whether this would be useful for file 
> sources as well. As a workaround, I have to use foreachBatch so that I can 
> do a sortWithinPartitions.
>
> Two main questions:
>  * Does a local sort cause issues with any of the exactly-once guarantees 
> that streaming queries provide? I can't say I know or understand how these 
> semantics work. Or are there other issues this would cause that I can't 
> think of?
>  * Is the change as simple as changing the unsupported operations check to 
> only look for global sorts instead of all sorts?
>
> I have built a version that simply changes the unsupported check to only 
> disallow global sorts, and it seems to be working. Am I missing anything, 
> or is it this simple?
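
The distinction the description relies on (a local, per-partition sort versus a global sort) can be sketched in plain Python. This is only an illustrative model, not Spark code: the function names, the micro-batch data, and the way the global sort splits rows back into partitions of the original sizes are all assumptions made for the example. It does not address the exactly-once question raised above.

```python
from typing import List

Partition = List[int]


def sort_within_partitions(partitions: List[Partition]) -> List[Partition]:
    """Local sort: each partition is ordered on its own; no rows move between partitions."""
    return [sorted(p) for p in partitions]


def global_sort(partitions: List[Partition]) -> List[Partition]:
    """Global sort: all rows are ordered together, then split back into
    partitions of the original sizes (a simplified stand-in for a shuffle)."""
    all_rows = sorted(row for p in partitions for row in p)
    out: List[Partition] = []
    start = 0
    for p in partitions:
        out.append(all_rows[start:start + len(p)])
        start += len(p)
    return out


# Two hypothetical micro-batches, each made of two partitions.
micro_batches = [
    [[5, 1, 9], [4, 8, 2]],
    [[7, 3], [6, 0]],
]

# Applying the sort per micro-batch mirrors the foreachBatch workaround:
# each batch is handled independently of the others.
local_results = [sort_within_partitions(b) for b in micro_batches]
global_results = [global_sort(b) for b in micro_batches]
```

In this model the local sort never needs rows from another partition, while the global sort has to look at every row in the batch before it can emit anything.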



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
