[jira] [Comment Edited] (SPARK-48338) Sql Scripting support for Spark SQL

Dongjoon Hyun (Jira) Sun, 07 Jun 2026 08:37:09 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-48338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086736#comment-18086736
 ]


Dongjoon Hyun edited comment on SPARK-48338 at 6/7/26 3:36 PM:
---------------------------------------------------------------

1. No, but, AFAIK, there is no technical way to block people from mistakenly 
creating a subtask. Even for closed umbrella JIRA issues, I guess a person 
can make a mistake.

bq. shall we allow people to create more subtasks on a resolved umbrella ticket?

2. No, if you want to resolve the umbrella JIRA issue, you need to clean up 
first by moving the open subtasks somewhere else. Typically, I believe the next 
Spark version's umbrella JIRA issue can be a placeholder until a new feature 
umbrella JIRA issue is created.

bq. can we resolve an umbrella ticket when there are still open subtasks but 
non-blocking?

3. For JIRA issue management, it's always okay to resolve this as 4.1.0, at that 
time or now. What I expected is that the resolution should be based on the 
status of the subtasks. So, we are good with 4.1.0 because I cleaned them up.

4. For release notes issues: actually, it matters to me because it means the 
feature developers didn't cooperate with the community release process at all. 
As we see in this JIRA issue, the Apache Spark 4.1 release manager (me) didn't 
get proper collaboration. Specifically, there was no `releasenotes` label on 
this JIRA issue and no reply to the questions at that time.

bq. For release notes: this does not matter.

I sincerely wanted to help and collaborate on this feature at that time, and I 
still do.


> Sql Scripting support for Spark SQL
> -----------------------------------
>
>                 Key: SPARK-48338
>                 URL: https://issues.apache.org/jira/browse/SPARK-48338
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Aleksandar Tomic
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>         Attachments: Screenshot 2026-06-04 at 21.12.59.png, Sql Scripting - OSS.odt, [Design Doc] Sql Scripting - OSS.pdf
>
>
> The design doc for this feature is attached.
> *High-level example of a SQL script:*
> {code:sql}
> BEGIN
>   DECLARE c INT = 10;
>   WHILE c > 0 DO
>     INSERT INTO tscript VALUES (c);
>     SET c = c - 1;
>   END WHILE;
> END{code}
> *High-level motivation behind this feature:*
> SQL Scripting gives customers the ability to develop complex ETL and analysis 
> entirely in SQL. Until now, customers have had to write verbose SQL 
> statements or combine SQL and Python to write business logic efficiently. 
> Customers coming from another system have to choose whether or not to 
> migrate to PySpark. Some customers end up not using Spark because of this 
> gap. SQL Scripting is a key milestone towards enabling SQL practitioners to 
> write sophisticated queries without needing PySpark. Further, SQL 
> Scripting is a necessary step towards support for SQL Stored Procedures and, 
> along with SQL Variables (released) and Temp Tables (in progress), will allow 
> for more seamless data warehouse migrations.
> *Work items classification:*
>  * M0 - basic support
>  * M1 - features and changes required to enable SQL Scripting by default
>  * M2 - follow-up improvements and additional functionality that is 
> non-fundamental and should not block M1
>  * M3 - potential future improvements that need investigation



