Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/4382#issuecomment-75908281
Hey @chenghao-intel, thanks for working on this, AFAIK this is a pain point
for many Spark SQL users who would like to put HiveThriftServer2 into
production. Also had a discussion with @marmbrus about this recently.
As we've discussed offline, instead of changing `CacheManager` and
`FunctionRegistry` to global instances, adding a `SQLSession` and moving
per-session objects (configurations, temporary functions, etc.) to it could be
preferable. To be more specific:
1. Add a new `SQLSession` class, which is responsible for maintaining all
per-session objects, like configurations, temporary functions, etc.
2. Add a `session` field of type `SQLSession` to `SQLContext`, and override
it in `HiveContext` to hold Hive-specific per-session objects, like the
Hive client, Hive session state, etc.
3. Add the following session-specific methods to `SQLContext`:
- `createSession: SQLSession`
- `currentSession: SQLSession`
- `setSession(session: SQLSession): Unit`
- `closeSession(session: SQLSession): Unit`
These methods should be `private[sql]` as they are subject to change.
For now we can just mimic Hive's behavior, for example by using thread-local
instances the way Hive session state does. (You may see the
`SQLSession` object within `HiveContext` as a thin wrapper of `SessionState`
together with other per-session components.)
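To make the shape of this concrete, here is a rough sketch of the four methods backed by a thread-local, under a few assumptions: `SessionAwareContext` is a hypothetical stand-in for `SQLContext` (not the real class), the fields on `SQLSession` are illustrative only, and the methods are left public because `private[sql]` only compiles inside the `sql` package.

```scala
import scala.collection.mutable

// Hypothetical per-session state holder; the field names are illustrative.
class SQLSession {
  val conf: mutable.Map[String, String] = mutable.Map.empty
  val tempFunctions: mutable.Map[String, Seq[Any] => Any] = mutable.Map.empty
}

// Stand-in for SQLContext, reduced to the session-handling methods.
class SessionAwareContext {
  // Session used by any thread that has not explicitly set one.
  private val defaultSession = new SQLSession

  // One "current" session per thread, similar in spirit to how Hive
  // tracks its session state per thread.
  private val tlSession = new ThreadLocal[SQLSession] {
    override def initialValue(): SQLSession = defaultSession
  }

  def createSession(): SQLSession = new SQLSession

  def currentSession: SQLSession = tlSession.get()

  def setSession(session: SQLSession): Unit = tlSession.set(session)

  // Detach the session from the calling thread if it is the current one;
  // that thread then falls back to the default session.
  def closeSession(session: SQLSession): Unit =
    if (tlSession.get() eq session) tlSession.remove()
}
```

With something like this, a server thread handling one client would call `setSession(createSession())` on connect and `closeSession(currentSession)` on disconnect, and settings changed through `currentSession.conf` would not be visible to threads bound to other sessions.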
The benefits of this approach are:
1. In the long run, we'd like to move `HiveContext` out of the main
framework and make Hive a separate data source. With the above approach, it's
more natural to build a separate Spark SQL server with multi-user support
without depending on Hive-specific code.
2. Making components like `CacheManager` global objects is not
test-friendly. With global state, it's basically impossible to run Spark SQL
tests in parallel.