Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/4382#issuecomment-75908281
  
    Hey @chenghao-intel, thanks for working on this. AFAIK this is a pain point 
for many Spark SQL users who would like to put HiveThriftServer2 into 
production.  I also discussed this with @marmbrus recently.
    
    As we've discussed offline, instead of changing `CacheManager` and 
`FunctionRegistry` to global instances, it would be preferable to add a 
`SQLSession` class and move per-session objects (configurations, temporary 
functions, etc.) into it. To be more specific:
    
    1. Add a new `SQLSession` class, which is responsible for maintaining all 
per-session objects, like configurations, temporary functions, etc.
    2. Add a `session` field of type `SQLSession` to `SQLContext`, and override 
it in `HiveContext` to also hold Hive-specific per-session objects, like the 
Hive client, Hive session state, etc.
    3. Add the following session-specific methods to `SQLContext`:
    
       - `createSession: SQLSession`
       - `currentSession: SQLSession`
       - `setSession(session: SQLSession): Unit`
       - `closeSession(session: SQLSession)`
    
       These methods should be `private[sql]` as they are subject to change.  
For now we can just mimic Hive's behavior, for example by using thread-local 
instances, just as Hive's session state does.  (You may think of the 
`SQLSession` object within `HiveContext` as a thin wrapper around 
`SessionState` together with other per-session components.)
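    
    To make the proposal a bit more concrete, here is a minimal sketch of the 
four methods backed by a thread-local. It is only an illustration under 
assumptions: `SessionSupport` is a hypothetical stand-in for `SQLContext`, the 
fields of `SQLSession` are invented for the example, and the `private[sql]` 
modifiers are omitted so the snippet compiles outside the `sql` package.

```scala
import scala.collection.mutable

// Hypothetical per-session state holder; the two fields are illustrative only.
class SQLSession {
  val conf: mutable.Map[String, String] = mutable.Map.empty
  val tempFunctions: mutable.Map[String, Seq[Any] => Any] = mutable.Map.empty
}

// Hypothetical stand-in for SQLContext, reduced to the session methods above.
class SessionSupport {
  // Each thread sees its own current session, created lazily on first access.
  private val tlSession = new ThreadLocal[SQLSession] {
    override def initialValue(): SQLSession = createSession()
  }

  def createSession(): SQLSession = new SQLSession

  def currentSession: SQLSession = tlSession.get()

  def setSession(session: SQLSession): Unit = tlSession.set(session)

  def closeSession(session: SQLSession): Unit = {
    // Detach only if the closed session is the one bound to this thread.
    if (tlSession.get() eq session) tlSession.remove()
  }
}
```

    With this shape, a setting written through `currentSession.conf` on one 
thread is not visible from another thread unless that thread calls 
`setSession` with the same `SQLSession` instance.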
    
    The benefits of this approach are:
    
    1. In the long run, we'd like to move `HiveContext` out of the main 
framework and make Hive a separate data source. With the above approach, it's 
more natural to build a separate Spark SQL server with multi-user support 
without depending on Hive-specific code.
    2. Making components like `CacheManager` global objects is not 
test-friendly. With global state, it's basically impossible to run Spark SQL 
tests in parallel.

