Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3895#discussion_r26449025
  
    --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala ---
    @@ -297,7 +297,7 @@ private[hive] object HiveShim {
       def getStatsSetupConstRawDataSize = StatsSetupConst.RAW_DATA_SIZE
     
       def createDefaultDBIfNeeded(context: HiveContext) = {
    -    context.runSqlHive("CREATE DATABASE default")
    +    context.runSqlHive("CREATE DATABASE IF NOT EXISTS default")
    --- End diff --
    
    This is a bit tricky to explain. When initializing a `TestHiveContext`, the following things happen:
    
    1. `HiveContext.hiveconf` is initialized (notice that the metastore and warehouse paths point to whatever is configured in `hive-site.xml`, or to the default locations).
    2. `HiveContext.sessionState` is initialized.
    3. `TestHiveContext.configure()` is called; the metastore and warehouse paths now point to temporary directories used for testing purposes, and no `default` database is defined there yet.
    4. `HiveShim.createDefaultDBIfNeeded()` is called to create the `default` database in the temporary directories.
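    The four steps above can be sketched as a minimal, hypothetical Scala program. The object and method names below (`InitOrderSketch`, `configureForTesting`, the example paths) are simplified illustrations of the ordering problem, not Spark's actual internals:
    
    ```scala
    // Hypothetical sketch of the TestHiveContext initialization order.
    // It models the metastore as a set of database paths to show why
    // "CREATE DATABASE IF NOT EXISTS" is the safe call in step 4.
    object InitOrderSketch {
      var metastorePath = "/user/hive/metastore" // from hive-site.xml or defaults
      var databases = Set.empty[String]
    
      // Step 1: hiveconf still points at the original (non-test) locations.
      def initHiveConf(): Unit =
        println(s"hiveconf -> $metastorePath")
    
      // Step 2: session state may create `default` in the ORIGINAL metastore.
      def initSessionState(): Unit =
        databases += s"$metastorePath/default"
    
      // Step 3: repoint the metastore/warehouse to temporary test directories;
      // the new location has no `default` database yet.
      def configureForTesting(): Unit =
        metastorePath = "/tmp/test-metastore"
    
      // Step 4: IF NOT EXISTS makes this safe either way -- it creates
      // `default` when missing and is a no-op when it already exists.
      def createDefaultDBIfNeeded(): Unit = {
        val db = s"$metastorePath/default"
        if (!databases.contains(db)) databases += db
      }
    
      def main(args: Array[String]): Unit = {
        initHiveConf()
        initSessionState()
        configureForTesting()
        createDefaultDBIfNeeded()
        assert(databases.contains("/tmp/test-metastore/default"))
        println("default database exists in the test metastore")
      }
    }
    ```
    
    The sketch also shows why a plain `CREATE DATABASE default` can fail: if step 2 already registered `default` somewhere, an unconditional create in step 4 would raise an "already exists" error instead of being a no-op.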
    
    As Michael [commented] [1], the `createDefaultDBIfNeeded` method is more of a hack to work around this initialization disorder. That's why I opened baishuo/spark#2 against this PR branch: that PR fixes the root cause of the initialization disorder.
    
    [1]: https://github.com/apache/spark/pull/3895#issuecomment-80642173

