[
https://issues.apache.org/jira/browse/SPARK-50005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890353#comment-17890353
]
loong edited comment on SPARK-50005 at 10/17/24 8:00 AM:
---------------------------------------------------------
SparkSQL throws an exception when the output path of a write would overwrite
one of the query's input paths. The check is the validation method
'verifyNotReadPath()' in ddl.scala.
SparkSQL detects the simple scenario, such as:
{code:sql}
insert overwrite table output_t select * from output_t;
{code}
However, SparkSQL does not detect more complex scenarios where the read of the
output table is hidden inside a filter condition (a subquery in the WHERE
clause), such as:
{code:sql}
insert overwrite table output_t
select * from input_t ta
where not exists (select tb.id from output_t tb where tb.id = ta.id);

insert overwrite table output_t
select * from input_t ta
where ta.id in (select id from output_t);

insert overwrite table output_t
select * from input_t ta
where ta.id = (select max(tb.id) from output_t tb where tb.id = ta.id);
{code}
In the scenarios above, the validation does not fire. SparkSQL fails with
'java.io.FileNotFoundException: File does not exist', which can be confusing
because the message does not say that the query reads the table it overwrites.
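
One possible way to picture the gap is a toy model in Python. This is not Spark's code: the `Node` class, the `read_paths` and `overwrites_input` helpers, and the `/warehouse/...` paths are all hypothetical, and the assumption being illustrated is that the check only walks the plan's direct children and never looks into plans embedded in filter expressions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Toy plan node (hypothetical); `path` is set only for table reads."""
    name: str
    path: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    # Plans embedded in this node's expressions, e.g. an IN/EXISTS subquery.
    subqueries: List["Node"] = field(default_factory=list)


def read_paths(plan: Node, include_subqueries: bool) -> Iterator[str]:
    """Yield every table path read by `plan`."""
    if plan.path is not None:
        yield plan.path
    nested = plan.children + (plan.subqueries if include_subqueries else [])
    for node in nested:
        yield from read_paths(node, include_subqueries)


def overwrites_input(plan: Node, output_path: str, include_subqueries: bool) -> bool:
    """Return True if the query reads the path it is about to overwrite."""
    return output_path in read_paths(plan, include_subqueries)


# select * from output_t
simple = Node("Project", children=[Node("Relation", path="/warehouse/output_t")])

# select * from input_t ta where ta.id in (select id from output_t)
hidden = Node(
    "Filter",
    children=[Node("Relation", path="/warehouse/input_t")],
    subqueries=[
        Node("Project", children=[Node("Relation", path="/warehouse/output_t")])
    ],
)

out = "/warehouse/output_t"
print(overwrites_input(simple, out, include_subqueries=False))
print(overwrites_input(hidden, out, include_subqueries=False))
print(overwrites_input(hidden, out, include_subqueries=True))
```

Under that assumption, a children-only walk catches the simple statement but misses the subquery case, while a walk that also descends into embedded subquery plans catches both.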
> "Throws exception if outputPath tries to overwrite inputpath." But some
> scenes are not recognized.
> --------------------------------------------------------------------------------------------------
>
> Key: SPARK-50005
> URL: https://issues.apache.org/jira/browse/SPARK-50005
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.8, 3.5.3
> Reporter: loong
> Priority: Major
>