[GitHub] [spark] MaxGekk opened a new pull request #31006: [SPARK-33950][SQL][3.1] Refresh cache in v1 `ALTER TABLE .. DROP PARTITION`

GitBox Mon, 04 Jan 2021 00:58:03 -0800


MaxGekk opened a new pull request #31006:
URL: https://github.com/apache/spark/pull/31006



   ### What changes were proposed in this pull request?
   Invoke `refreshTable()` from `AlterTableDropPartitionCommand.run()` after the
   partitions are dropped. In particular, this invalidates the cache associated
   with the modified table.
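
The pattern can be illustrated with a toy model. This is a minimal sketch, not Spark code: `ToyCatalog`, its methods, and `run_example` are hypothetical names invented for illustration, and the toy rebuilds the cached snapshot eagerly, which is a simplification. It only shows why a command that mutates a cached table has to refresh the cache afterwards.

```python
class ToyCatalog:
    """Hypothetical in-memory catalog with a table cache (not Spark's implementation)."""

    def __init__(self):
        self.partitions = {}  # table name -> {partition value -> list of rows}
        self.cache = {}       # table name -> cached snapshot of all rows

    def insert(self, table, part, row):
        self.partitions.setdefault(table, {}).setdefault(part, []).append(row)

    def scan(self, table):
        # Read rows straight from the partitions, bypassing the cache.
        parts = self.partitions.get(table, {})
        return [row for part in sorted(parts) for row in parts[part]]

    def cache_table(self, table):
        self.cache[table] = self.scan(table)

    def select_all(self, table):
        # Serve from the cache when the table is cached, otherwise scan.
        if table in self.cache:
            return list(self.cache[table])
        return self.scan(table)

    def refresh_table(self, table):
        # Rebuild the cached snapshot, but only if the table was cached.
        if table in self.cache:
            self.cache[table] = self.scan(table)

    def drop_partition(self, table, part, refresh=True):
        self.partitions[table].pop(part, None)
        if refresh:
            # The step this toy is meant to highlight: refresh after mutating.
            self.refresh_table(table)


def run_example(refresh):
    catalog = ToyCatalog()
    catalog.insert("tbl1", 0, (0, 0))
    catalog.insert("tbl1", 1, (1, 1))
    catalog.cache_table("tbl1")
    catalog.drop_partition("tbl1", 0, refresh=refresh)
    return catalog.select_all("tbl1")
```

In this toy, `run_example(refresh=False)` still returns the dropped row from the stale snapshot, while `run_example(refresh=True)` returns only `(1, 1)`.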
   
   ### Why are the changes needed?
   This fixes the issue illustrated by the following example:
   ```sql
   spark-sql> CREATE TABLE tbl1 (col0 int, part0 int) USING parquet PARTITIONED BY (part0);
   spark-sql> INSERT INTO tbl1 PARTITION (part0=0) SELECT 0;
   spark-sql> INSERT INTO tbl1 PARTITION (part0=1) SELECT 1;
   spark-sql> CACHE TABLE tbl1;
   spark-sql> SELECT * FROM tbl1;
   0    0
   1    1
   spark-sql> ALTER TABLE tbl1 DROP PARTITION (part0=0);
   spark-sql> SELECT * FROM tbl1;
   0    0
   1    1
   ```
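   The last query must not return `0    0`, because that row was deleted by the
   preceding `ALTER TABLE .. DROP PARTITION` command. The stale row is served
   from the table's cache, which was not refreshed.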
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. With this change, the last query in the example above returns only the remaining row:
   ```sql
   ...
   spark-sql> ALTER TABLE tbl1 DROP PARTITION (part0=0);
   spark-sql> SELECT * FROM tbl1;
   1    1
   ```
   
   ### How was this patch tested?
   By running the affected test suites:
   ```
   $ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *CachedTableSuite"
   ```
   
   Authored-by: Max Gekk <[email protected]>
   Signed-off-by: Wenchen Fan <[email protected]>
   (cherry picked from commit 67195d0d977caa5a458e8a609c434205f9b54d1b)
   Signed-off-by: Max Gekk <[email protected]>


