[ https://issues.apache.org/jira/browse/SPARK-34055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Gekk updated SPARK-34055:
-------------------------------
    Description: 
Here is an example that reproduces the issue:
{code:sql}
spark-sql> create table tbl2 (col int, part int) partitioned by (part);
spark-sql> insert into tbl2 partition (part=0) select 0;
spark-sql> cache table tbl2;
spark-sql> select * from tbl2;
0       0
spark-sql> show table extended like 'tbl2' partition (part = 0);
default tbl2    false   Partition Values: [part=0]
Location: file:/Users/maximgekk/proj/add-partition-refresh-cache-2/spark-warehouse/tbl2/part=0
...
{code}
Create the data for a new partition by copying the existing one:
{code}
cp -r \
  /Users/maximgekk/proj/add-partition-refresh-cache-2/spark-warehouse/tbl2/part=0 \
  /Users/maximgekk/proj/add-partition-refresh-cache-2/spark-warehouse/tbl2/part=1
{code}
Add the new partition and query the table:
{code}
spark-sql> alter table tbl2 add partition (part = 1) location '/Users/maximgekk/proj/add-partition-refresh-cache-2/spark-warehouse/tbl2/part=1';
spark-sql> select * from tbl2;
0       0
{code}
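The query returns only the old data: the row from the newly added partition part=1 is missing because the cached table was not refreshed.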
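
A possible manual workaround, sketched here and not part of the original report: it assumes that REFRESH TABLE invalidates the cached entries of tbl2 so that the next query rescans all partitions. This has not been verified against the affected versions.
{code:sql}
-- Assumption: REFRESH TABLE invalidates the stale cache entries for tbl2.
spark-sql> refresh table tbl2;
-- If the refresh behaves as assumed, this query should also see partition part=1.
spark-sql> select * from tbl2;
{code}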

  was:
Here is an example that reproduces the issue:
{code:sql}
spark-sql> create table tbl (col int, part int) using parquet partitioned by (part);
spark-sql> insert into tbl partition (part=0) select 0;
spark-sql> cache table tbl;
spark-sql> select * from tbl;
0       0
spark-sql> show table extended like 'tbl' partition(part=0);
default tbl     false   Partition Values: [part=0]
Location: file:/Users/maximgekk/proj/recover-partitions-refresh-cache/spark-warehouse/tbl/part=0
...
{code}
Create the data for a new partition by copying the existing one:
{code}
cp -r \
  /Users/maximgekk/proj/recover-partitions-refresh-cache/spark-warehouse/tbl/part=0 \
  /Users/maximgekk/proj/recover-partitions-refresh-cache/spark-warehouse/tbl/part=1
{code}
Recover the partitions and query the table:
{code}
spark-sql> alter table tbl recover partitions;
spark-sql> select * from tbl;
0       0
{code}
The query returns only the old data: the row from the recovered partition part=1 is missing.


> ALTER TABLE .. ADD PARTITION doesn't refresh cache
> --------------------------------------------------
>
>                 Key: SPARK-34055
>                 URL: https://issues.apache.org/jira/browse/SPARK-34055
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1, 3.1.0, 3.2.0
>            Reporter: Maxim Gekk
>            Assignee: Apache Spark
>            Priority: Major
>              Labels: correctness
>             Fix For: 3.2.0
>
>


