Github user windpiger commented on the issue:

    https://github.com/apache/spark/pull/17001
  
    Yes, it is for HiveExternalCatalog.
    I found this logic while working on this [PR](https://github.com/apache/spark/pull/16996).
    
    > The hive.metastore.warehouse.dir in sparkConf still takes effect in Spark; it is not useless.
      The reason is as follows:
      1. When we run Spark with Hive enabled, it creates a SharedState.
      2. When the SharedState is created, it creates a HiveExternalCatalog:
    https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala#L85
      3. When the HiveExternalCatalog is created, it creates a HiveClientImpl through HiveUtils:
    https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L65
      4. When the HiveClientImpl is created, it calls SessionState.start(state), which creates the default database using the hive.metastore.warehouse.dir from the hiveConf built in HiveClientImpl:
    https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L189

    The hiveConf in HiveClientImpl is built from hadoopConf and sparkConf, and sparkConf overwrites the value of any key that also appears in hadoopConf. So the default database is actually created with the hive.metastore.warehouse.dir from sparkConf. If we do not overwrite that value in sparkConf in SharedState, the default database location will not be the warehouse path we expect. So sparkContext.conf.set("hive.metastore.warehouse.dir", sparkWarehouseDir) should be retained here.
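
The precedence argument above can be illustrated with a minimal sketch. It uses plain Scala Maps as stand-ins for hadoopConf and sparkConf, not the real Spark or Hive classes; `ConfPrecedenceSketch`, `mergeConfs`, and all paths are hypothetical names chosen for illustration, and the merge only assumes the "sparkConf wins on key collisions" behavior described above.

```scala
// Hypothetical stand-in for how a hiveConf could be assembled from two sources.
// Plain Maps are used here; this is not Spark's HiveClientImpl code.
object ConfPrecedenceSketch {
  val WarehouseKey = "hive.metastore.warehouse.dir"

  // `++` keeps the right-hand value on duplicate keys, so sparkConf entries
  // override hadoopConf entries with the same key.
  def mergeConfs(
      hadoopConf: Map[String, String],
      sparkConf: Map[String, String]): Map[String, String] =
    hadoopConf ++ sparkConf

  // Mimics the overwrite done in SharedState: force the key in sparkConf
  // to the resolved warehouse directory before the merge happens.
  def withWarehouseDir(
      sparkConf: Map[String, String],
      sparkWarehouseDir: String): Map[String, String] =
    sparkConf + (WarehouseKey -> sparkWarehouseDir)

  def main(args: Array[String]): Unit = {
    val hadoopConf = Map(WarehouseKey -> "/user/hive/warehouse")
    val sparkConf = Map(WarehouseKey -> "/stale/location")
    val sparkWarehouseDir = "/tmp/spark-warehouse"

    // Without the overwrite, the stale sparkConf value reaches the merged conf.
    println(mergeConfs(hadoopConf, sparkConf)(WarehouseKey))
    // With the overwrite, the merged conf points at the warehouse path.
    println(mergeConfs(hadoopConf, withWarehouseDir(sparkConf, sparkWarehouseDir))(WarehouseKey))
  }
}
```

In this sketch, dropping the `withWarehouseDir` step leaves whatever value sparkConf already held in the merged configuration, which is the situation the comment warns about.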
    
    **We can also see that the default database is not created in SharedState: the condition here is false, so the create-database logic is never hit. The database has already been created by the time HiveClientImpl is initialized.
    https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala#L96**
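
The "create only if missing" guard can be sketched roughly as below. This is an assumption-laden illustration, not the code in SharedState: `InMemoryCatalogSketch` and `DefaultDatabaseGuardSketch` are hypothetical names, and the in-memory map merely stands in for a catalog that already holds the default database after client initialization.

```scala
import scala.collection.mutable

// Hypothetical in-memory catalog; not Spark's ExternalCatalog API.
final class InMemoryCatalogSketch {
  // Maps database name to its location.
  private val databases = mutable.Map.empty[String, String]

  def databaseExists(name: String): Boolean = databases.contains(name)

  def createDatabase(name: String, location: String): Unit =
    databases(name) = location

  def location(name: String): Option[String] = databases.get(name)
}

object DefaultDatabaseGuardSketch {
  // Creates "default" only when it is missing.
  // Returns true if this call created the database, false if it already existed.
  def ensureDefault(catalog: InMemoryCatalogSketch, location: String): Boolean =
    if (!catalog.databaseExists("default")) {
      catalog.createDatabase("default", location)
      true
    } else {
      false
    }
}
```

If the database was already created earlier with a different location, the guard leaves that location untouched, which is why the value used at client initialization is the one that matters.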

