[ 
https://issues.apache.org/jira/browse/SPARK-42550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693103#comment-17693103
 ] 

Yuming Wang commented on SPARK-42550:
-------------------------------------

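I can't reproduce this issue. The failed INSERT OVERWRITE leaves the table directory in place (see the dfs -ls output at the end):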
{noformat}
spark-sql> select version();
3.2.3 b53c341e0fefbb33d115ab630369a18765b7763d
Time taken: 0.271 seconds, Fetched 1 row(s)
spark-sql> create database test;
23/02/24 17:41:40 WARN ObjectStore: Failed to get database test, returning NoSuchObjectException
Time taken: 0.161 seconds
spark-sql> CREATE TABLE IF NOT EXISTS test.spark32_overwrite(amt1 int) STORED AS ORC;
23/02/24 17:41:40 WARN HiveMetaStore: Location: file:/Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/spark32_overwrite specified for non-external table:spark32_overwrite
Time taken: 0.172 seconds
spark-sql>
         > INSERT OVERWRITE TABLE test.spark32_overwrite select 128;
Time taken: 0.475 seconds
spark-sql>
         > CREATE TABLE IF NOT EXISTS test.spark32_overwrite2(amt1 long) STORED AS ORC;
23/02/24 17:41:41 WARN HiveMetaStore: Location: file:/Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/spark32_overwrite2 specified for non-external table:spark32_overwrite2
Time taken: 0.175 seconds
spark-sql>
         > INSERT OVERWRITE TABLE test.spark32_overwrite2 select 6000044164;
Time taken: 0.647 seconds
spark-sql>
         > INSERT OVERWRITE TABLE test.spark32_overwrite select amt1 from test.spark32_overwrite2;
23/02/24 17:41:42 ERROR Utils: Aborting task
java.lang.ArithmeticException: Casting 6000044164 to int causes overflow
        at org.apache.spark.sql.errors.QueryExecutionErrors$.castingCauseOverflowError(QueryExecutionErrors.scala:91)
******

spark-sql>
         > select * from test.spark32_overwrite;
23/02/24 17:41:44 ERROR Executor: Exception in task 0.0 in stage 8.0 (TID 8)
java.io.FileNotFoundException:
File file:/Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/spark32_overwrite/part-00000-c0d09f8e-5abd-4743-b4f4-9447dc6d1381-c000.snappy.orc does not exist

******

spark-sql> dfs -ls /Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/;
Found 2 items
drwxr-xr-x   - yumwang 110302528         64 2023-02-24 17:41 /Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/spark32_overwrite
drwxr-xr-x   - yumwang 110302528        192 2023-02-24 17:41 /Users/yumwang/Downloads/spark-3.2.3-bin-hadoop3.2/spark-warehouse/test.db/spark32_overwrite2
spark-sql>
{noformat}
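
For reference, the ArithmeticException in the transcript follows from the column types: an int column holds a 32-bit signed integer, and 6000044164 is outside that range. A minimal Python sketch of the range check, for illustration only (the helper name fits_in_int is hypothetical and is not a Spark API):

```python
# Bounds of a 32-bit signed integer, the range of a Spark SQL int column.
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def fits_in_int(value: int) -> bool:
    """Return True if value can be stored in a 32-bit signed integer."""
    return INT_MIN <= value <= INT_MAX


# The two literals used in the transcript above.
print(fits_in_int(128))         # True: fits in an int column
print(fits_in_int(6000044164))  # False: needs a long (64-bit) column
```

This only illustrates why the cast is rejected; it says nothing about how Spark handles the files of the target table when the write fails.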

> table directory will be lost on HDFS when `INSERT OVERWRITE` fails
> -----------------------------------------------------------------
>
>                 Key: SPARK-42550
>                 URL: https://issues.apache.org/jira/browse/SPARK-42550
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.3
>         Environment: spark 3.2.3 / HDP 3.1.4
>            Reporter: kevinshin
>            Priority: Critical
>         Attachments: image-2023-02-24-15-21-55-273.png, 
> image-2023-02-24-15-23-32-977.png, image-2023-02-24-15-25-57-770.png
>
>
> When an `INSERT OVERWRITE TABLE` statement fails during execution, the
> table's directory is deleted. This does not happen in Spark 3.2.1.
> For example:
> CREATE TABLE IF NOT EXISTS test.spark32_overwrite(amt1 int) STORED AS ORC;
> INSERT OVERWRITE TABLE test.spark32_overwrite select 128;
> CREATE TABLE IF NOT EXISTS test.spark32_overwrite2(amt1 long) STORED AS ORC;
> INSERT OVERWRITE TABLE test.spark32_overwrite2 select 6000044164;
> INSERT OVERWRITE TABLE test.spark32_overwrite select amt1 from test.spark32_overwrite2;    -- this fails with a casting overflow exception
> And then:
> select * from test.spark32_overwrite;
> fails with:
> java.io.FileNotFoundException
> !image-2023-02-24-15-21-55-273.png!
> The table's directory is lost. Use the `hdfs dfs -ls` command to check:
> !image-2023-02-24-15-23-32-977.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
