packyan opened a new issue, #2857:
URL: https://github.com/apache/incubator-kyuubi/issues/2857

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the 
[issues](https://github.com/apache/incubator-kyuubi/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What would you like to be improved?
   
   My scenario is to use Kyuubi as a replacement for HiveServer2. When Kyuubi 
and Hive are used side by side, the same table may be modified by two 
different execution engines at the same time. Because Spark SQL caches table 
metadata, the SparkSQLEngine does not see changes to the table in time.
   
   For example, a Kyuubi user first queries a table through the Spark SQL 
engine, then a Hive user inserts some records into it. When the Kyuubi user 
queries the table again, the new records do not appear.
   Another example: a Kyuubi user queries a table, then a Hive user truncates 
it. When the Kyuubi user queries the table again, Spark SQL throws an 
exception like the following:
   ```text
   It is possible the underlying files have been updated. You can explicitly invalidate
   the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by
   recreating the Dataset/DataFrame involved.
   ```
   
   ### How should we improve?
   
   Although there are many ways to solve this problem, such as refreshing the 
table before querying it or setting 
`spark.sql.filesourceTableRelationCacheSize` to zero when opening a Kyuubi 
session, these methods are not user friendly.
   We should provide a configuration to toggle the `TableRelationCache` 
feature. In many scenarios, a Kyuubi admin may choose to turn off the 
Spark SQL table relation cache to reduce user complaints.
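   
   A minimal sketch of the cache-size workaround as it can be applied today, 
assuming the setting is placed in `kyuubi-defaults.conf` and that Kyuubi 
forwards `spark.*` entries to each Spark engine it launches. The file location 
is an assumption about the deployment. As far as I know this is a static 
Spark SQL option, so it likely has to be in place when the engine starts 
rather than changed inside a running session.
   ```properties
   # kyuubi-defaults.conf (assumed location; adjust to your deployment)
   # Set the file source table relation cache size to zero for new engines,
   # so table metadata is re-read instead of served from the cache.
   spark.sql.filesourceTableRelationCacheSize=0
   ```
   This trades extra metadata lookups on every query for up-to-date results, 
which is why a dedicated, documented toggle would be easier for admins.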
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

