This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 6ea4b5f  [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
6ea4b5f is described below

commit 6ea4b5fda7fd32f78e204e3de466fdc07e47ee89
Author: Max Gekk <max.g...@gmail.com>
AuthorDate: Mon Feb 22 04:32:09 2021 +0000

    [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views
    
    ### What changes were proposed in this pull request?
    Update public docs of SQL commands about altering cached tables/views. For instance:
    <img width="869" alt="Screenshot 2021-02-08 at 15 11 48" src="https://user-images.githubusercontent.com/1580697/107217940-fd3b8980-6a1f-11eb-98b9-9b2e3fe7f4ef.png">
    
    ### Why are the changes needed?
    To inform users about commands behavior in altering cached tables or views.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    By running the command below and manually checking the docs:
    ```
    $ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch
    ```
    
    Closes #31524 from MaxGekk/doc-cmd-caching.
    
    Authored-by: Max Gekk <max.g...@gmail.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 docs/sql-ref-syntax-ddl-alter-table.md                         | 10 ++++++++++
 docs/sql-ref-syntax-ddl-alter-view.md                          |  2 ++
 docs/sql-ref-syntax-ddl-drop-table.md                          |  2 ++
 docs/sql-ref-syntax-ddl-repair-table.md                        |  2 ++
 docs/sql-ref-syntax-ddl-truncate-table.md                      |  2 ++
 docs/sql-ref-syntax-dml-load.md                                |  2 ++
 .../main/scala/org/apache/spark/sql/internal/CatalogImpl.scala |  2 +-
 7 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md
index e4d73f3..6fe1405 100644
--- a/docs/sql-ref-syntax-ddl-alter-table.md
+++ b/docs/sql-ref-syntax-ddl-alter-table.md
@@ -27,6 +27,10 @@ license: |
 
 `ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. The table rename command cannot be used to move a table between databases, only to rename a table within the same database.
 
+If the table is cached, the commands clear cached data of the table. The cache will be lazily filled when the next time the table is accessed. Additionally:
+  * the table rename command uncaches all table's dependents such as views that refer to the table. The dependents should be cached again explicitly.
+  * the partition rename command clears caches of all table dependents while keeping them as cached. So, their caches will be lazily filled when the next time they are accessed.
+
 #### Syntax
 
 ```sql
@@ -103,6 +107,8 @@ ALTER TABLE table_identifier { ALTER | CHANGE } [ COLUMN ] col_spec alterColumnA
 
 `ALTER TABLE ADD` statement adds partition to the partitioned table.
 
+If the table is cached, the command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ##### Syntax
 
 ```sql
@@ -128,6 +134,8 @@ ALTER TABLE table_identifier ADD [IF NOT EXISTS]
 
 `ALTER TABLE DROP` statement drops the partition of the table.
 
+If the table is cached, the command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ##### Syntax
 
 ```sql
@@ -187,6 +195,8 @@ ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name
 `ALTER TABLE SET` command can also be used for changing the file location and file format for 
 existing tables. 
 
+If the table is cached, the `ALTER TABLE .. SET LOCATION` command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ##### Syntax
 
 ```sql
diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md
index a34e77d..d69f246 100644
--- a/docs/sql-ref-syntax-ddl-alter-view.md
+++ b/docs/sql-ref-syntax-ddl-alter-view.md
@@ -28,6 +28,8 @@ the name of a view to a different name, set and unset the metadata of the view b
 Renames the existing view. If the new view name already exists in the source database, a `TableAlreadyExistsException` is thrown. This operation
 does not support moving the views across databases.
 
+If the view is cached, the command clears cached data of the view and all its dependents that refer to it. View's cache will be lazily filled when the next time the view is accessed. The command leaves view's dependents as uncached.
+
 #### Syntax
 ```sql
 ALTER VIEW view_identifier RENAME TO view_identifier
diff --git a/docs/sql-ref-syntax-ddl-drop-table.md b/docs/sql-ref-syntax-ddl-drop-table.md
index a15a992..6c115fd 100644
--- a/docs/sql-ref-syntax-ddl-drop-table.md
+++ b/docs/sql-ref-syntax-ddl-drop-table.md
@@ -26,6 +26,8 @@ if the table is not `EXTERNAL` table. If the table is not present it throws an e
 
 In case of an external table, only the associated metadata information is removed from the metastore database.
 
+If the table is cached, the command uncaches the table and all its dependents.
+
 ### Syntax
 
 ```sql
diff --git a/docs/sql-ref-syntax-ddl-repair-table.md b/docs/sql-ref-syntax-ddl-repair-table.md
index c2ef0a7..3614512 100644
--- a/docs/sql-ref-syntax-ddl-repair-table.md
+++ b/docs/sql-ref-syntax-ddl-repair-table.md
@@ -23,6 +23,8 @@ license: |
 
 `MSCK REPAIR TABLE` recovers all the partitions in the directory of a table and updates the Hive metastore. When creating a table using `PARTITIONED BY` clause, partitions are generated and registered in the Hive metastore. However, if the partitioned table is created from existing data, partitions are not registered automatically in the Hive metastore. User needs to run `MSCK REPAIR TABLE` to register the partitions. `MSCK REPAIR TABLE` on a non-existent table or a table without partiti [...]
 
+If the table is cached, the command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ### Syntax
 
 ```sql
diff --git a/docs/sql-ref-syntax-ddl-truncate-table.md b/docs/sql-ref-syntax-ddl-truncate-table.md
index 6139814..3bc4d7a 100644
--- a/docs/sql-ref-syntax-ddl-truncate-table.md
+++ b/docs/sql-ref-syntax-ddl-truncate-table.md
@@ -25,6 +25,8 @@ The `TRUNCATE TABLE` statement removes all the rows from a table or partition(s)
 or an external/temporary table. In order to truncate multiple partitions at once, the user can specify the partitions 
 in `partition_spec`. If no `partition_spec` is specified it will remove all partitions in the table.
 
+If the table is cached, the command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ### Syntax
 
 ```sql
diff --git a/docs/sql-ref-syntax-dml-load.md b/docs/sql-ref-syntax-dml-load.md
index 9381b42..08922b8 100644
--- a/docs/sql-ref-syntax-dml-load.md
+++ b/docs/sql-ref-syntax-dml-load.md
@@ -23,6 +23,8 @@ license: |
 
 `LOAD DATA` statement loads the data into a Hive serde table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the t [...]
 
+If the table is cached, the command clears cached data of the table and all its dependents that refer to it. The cache will be lazily filled when the next time the table or the dependents are accessed.
+
 ### Syntax
 
 ```sql
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala b/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
index 145daaf..884a389 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala
@@ -552,7 +552,7 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {
     // Re-caches the logical plan of the relation.
     // Note this is a no-op for the relation itself if it's not cached, but will clear all
     // caches referencing this relation. If this relation is cached as an InMemoryRelation,
-    // this will clear the relation cache and caches of all its dependants.
+    // this will clear the relation cache and caches of all its dependents.
     relation match {
       case SubqueryAlias(_, relationPlan) =>
         sparkSession.sharedState.cacheManager.recacheByPlan(sparkSession, relationPlan)
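
Illustrative note (not part of the commit): the documented behavior could be observed with a small local experiment along the lines of the sketch below. It assumes a local Spark 3.x session with the default catalog; the table `sales`, the view `big_sales`, and the object name are hypothetical and chosen only for illustration. The sketch caches a partitioned table and a dependent temporary view, runs `ALTER TABLE ... ADD PARTITION`, and prints whether both are still marked as cached. Per the doc text added above, both should remain cached with their data refilled lazily on the next access, but no output is asserted here.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: names below are illustrative, not taken from the commit.
object CachedTableAlterSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cached-table-alter-sketch")
      .getOrCreate()

    // Start from a clean state so the sketch can be re-run.
    spark.sql("DROP VIEW IF EXISTS big_sales")
    spark.sql("DROP TABLE IF EXISTS sales")

    // A partitioned table and a temporary view that depends on it.
    spark.sql(
      "CREATE TABLE sales (id INT, amount DOUBLE, region STRING) " +
        "USING parquet PARTITIONED BY (region)")
    spark.sql("INSERT INTO sales VALUES (1, 150.0, 'us')")
    spark.sql("CREATE TEMPORARY VIEW big_sales AS SELECT * FROM sales WHERE amount > 100")

    // Cache both the table and its dependent view.
    spark.sql("CACHE TABLE sales")
    spark.sql("CACHE TABLE big_sales")

    // One of the commands covered by the doc change.
    spark.sql("ALTER TABLE sales ADD PARTITION (region = 'eu')")

    // Check whether the table and its dependent are still registered as cached.
    val tableCached = spark.catalog.isCached("sales")
    val viewCached = spark.catalog.isCached("big_sales")
    println("sales cached after ADD PARTITION: " + tableCached)
    println("big_sales cached after ADD PARTITION: " + viewCached)

    // Accessing the view is what would trigger the lazy refill of its cache.
    spark.sql("SELECT count(*) FROM big_sales").show()

    // Clean up.
    spark.sql("DROP VIEW IF EXISTS big_sales")
    spark.sql("DROP TABLE IF EXISTS sales")
    spark.stop()
  }
}
```

Swapping the `ALTER TABLE ... ADD PARTITION` statement for one of the other commands touched by this commit (for example `TRUNCATE TABLE`) would exercise the corresponding doc note in the same way. Commands that uncache dependents, such as a table rename, leave a dependent view pointing at a name that no longer exists, so checking that view afterwards needs extra care.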


