[ 
https://issues.apache.org/jira/browse/SPARK-50005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890353#comment-17890353
 ] 

loong edited comment on SPARK-50005 at 10/17/24 8:01 AM:
---------------------------------------------------------

SparkSQL throws an exception if the output path of an insert overwrite is also 
one of the paths the query reads from. The validation is done by the method 
'verifyNotReadPath()' in ddl.scala. SparkSQL detects the simple scenario, such as:
{code:sql}
insert overwrite table output_t select * from output_t;
{code}
However, SparkSQL does not detect more complex scenarios where the read of the 
output table is hidden in a subquery inside a filter condition, such as:
{code:sql}
insert overwrite table output_t select * from input_t ta
where not exists (select tb.id from output_t tb where tb.id = ta.id);
insert overwrite table output_t select * from input_t ta
where ta.id in (select id from output_t);
insert overwrite table output_t select * from input_t ta
where ta.id < (select max(tb.id) from output_t tb where tb.id = ta.id);
{code}
In the scenarios above, the validation does not fire. SparkSQL instead fails 
with 'java.io.FileNotFoundException: File does not exist', which can be 
confusing.
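
To illustrate how a read of the output table can escape such a check, here is a 
minimal toy model in Python. It is only a sketch: the Plan class, the helper 
names and the /warehouse paths are hypothetical, and it assumes the check walks 
only the child operators of the plan and not the subquery plans nested inside 
filter expressions. It is not Spark's actual code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Toy logical plan node (hypothetical, not a Spark class)."""
    name: str
    path: str = ""  # non-empty only for table scans
    children: List["Plan"] = field(default_factory=list)    # child operators
    subqueries: List["Plan"] = field(default_factory=list)  # plans nested in expressions


def scanned_paths(plan: Plan, include_subqueries: bool) -> List[str]:
    """Collect the path of every table scan reachable from `plan`."""
    paths = [plan.path] if plan.path else []
    for child in plan.children:
        paths += scanned_paths(child, include_subqueries)
    if include_subqueries:
        # Also descend into plans that live inside expressions (IN, EXISTS, scalar).
        for sub in plan.subqueries:
            paths += scanned_paths(sub, include_subqueries)
    return paths


def reads_output_path(plan: Plan, output_path: str, include_subqueries: bool) -> bool:
    """True if the query plan scans the path that is about to be overwritten."""
    return output_path in scanned_paths(plan, include_subqueries)


OUTPUT = "/warehouse/output_t"  # assumed location of output_t

# select * from output_t
simple = Plan("Project", children=[Plan("Scan", path=OUTPUT)])

# select * from input_t ta where ta.id in (select id from output_t)
nested = Plan(
    "Project",
    children=[
        Plan(
            "Filter",
            children=[Plan("Scan", path="/warehouse/input_t")],
            subqueries=[Plan("Project", children=[Plan("Scan", path=OUTPUT)])],
        )
    ],
)

print(reads_output_path(simple, OUTPUT, include_subqueries=False))
print(reads_output_path(nested, OUTPUT, include_subqueries=False))
print(reads_output_path(nested, OUTPUT, include_subqueries=True))
```

In this model the simple query is caught either way, while the nested query is 
only caught when the traversal also descends into the subquery plans.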


was (Author: JIRAUSER299610):
SparkSQL will throw exception if outputPath tries to overwrite inputpath. You 
can see the specific validation method 'verifyNotReadPath()' in the ddl.scala. 
SparkSQL can identify simple scenario such as:
{code:sql}
insert overwrite table output_t select * from output_t;
{code}
However,SparkSQL cannot identify more complex scenarios where the query is 
hidden within filter conditions, such as:
{code:sql}
insert overwrite table output_t select * from input_t ta where not 
exists(select tb.id from output_t tb where tb.id = ta.id);
insert overwrite table output_t select * from input_t ta where ta.id in (select 
id from output_t );
insert overwrite table output_t select * from input_t ta where ta.id = (select 
max(tb.id) from output_t tb where tb.id=ta.id);
{code}
In these scenarios above, SparkSQL throws an exception with the message 
'java.io.FileNotFoundException: File does not exist' which can be confusing.

> "Throws exception if outputPath tries to overwrite inputpath." But some 
> scenes are not recognized.
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-50005
>                 URL: https://issues.apache.org/jira/browse/SPARK-50005
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.8, 3.5.3
>            Reporter: loong
>            Priority: Major
>




