[GitHub] [hudi] JoshuaZhuCN opened a new issue, #9418: [SUPPORT] Hudi table does not support Spark SQL's cache table syntax

via GitHub Thu, 10 Aug 2023 09:24:12 -0700


JoshuaZhuCN opened a new issue, #9418:
URL: https://github.com/apache/hudi/issues/9418


   **A cached view created with Spark SQL's ```CACHE TABLE``` syntax does not 
use the cached data when queried; instead, Spark re-reads the Hudi table data 
according to the view's definition.**
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Create a cached view using the SQL below:
   ```CACHE TABLE spark_view OPTIONS ('storageLevel'='MEMORY_AND_DISK_SER') AS SELECT * FROM hudi_table```
   
   
![image](https://github.com/apache/hudi/assets/62231347/bfd01c24-c661-4e45-b533-1bf8331a4b7a)
   
   2. Check the Storage tab in the Spark UI:
   
![image](https://github.com/apache/hudi/assets/62231347/43a76ea7-6ba1-489b-bb83-6f0ac1a6b473)
   
   3. Query the cached view:
   ```SELECT * FROM spark_view LIMIT 1```
   
   
![image](https://github.com/apache/hudi/assets/62231347/af83c612-f8ac-4fc7-95a0-77a3048fc455)
   
   4. Inspect the DAG and the query plan: the query still reads from the Hudi table instead of the in-memory cache.
   
![image](https://github.com/apache/hudi/assets/62231347/1d6011a2-f0d8-497e-ae01-85090b13e021)
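
For anyone reproducing this without the screenshots, the check in the last step can be sketched as a small text test on the plan. Spark shows a cached relation in a plan as `InMemoryTableScan` / `InMemoryRelation`, so a plan that lacks both markers is not reading from the cache. The helper name `plan_uses_cache` and the two abbreviated plan strings below are hypothetical illustrations, not output captured from this issue.

```python
# Markers Spark prints in a plan when a query reads from a cached relation.
CACHE_MARKERS = ("InMemoryTableScan", "InMemoryRelation")


def plan_uses_cache(plan_text: str) -> bool:
    """Return True if the plan text mentions a cached (in-memory) relation."""
    return any(marker in plan_text for marker in CACHE_MARKERS)


# Hypothetical, abbreviated plan snippets for illustration only
# (not the actual plans from this issue).
cached_plan = """
== Physical Plan ==
CollectLimit 1
+- InMemoryTableScan [id#0, name#1]
      +- InMemoryRelation [id#0, name#1], StorageLevel(disk, memory, 1 replicas)
"""
uncached_plan = """
== Physical Plan ==
CollectLimit 1
+- FileScan parquet default.hudi_table[id#0, name#1]
"""

print(plan_uses_cache(cached_plan))    # True
print(plan_uses_cache(uncached_plan))  # False
```

In a live session, the plan text could be obtained by running `EXPLAIN SELECT * FROM spark_view LIMIT 1` in Spark SQL and passing the resulting string to the helper.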
   
   
   **For comparison, the following figure shows the query plan after caching a Hive table:**
   
   
![image](https://github.com/apache/hudi/assets/62231347/bb2bbf51-3152-4558-aac6-d3381bcbba9c)
   
   **Environment Description**
   
   * Hudi version : 0.12.1
   
   * Spark version : 3.1.3
   
   * Hive version : 3.1.0
   
   * Hadoop version : 3.1.3
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no


