pan3793 commented on PR #52706:
URL: https://github.com/apache/spark/pull/52706#issuecomment-3441217757
> Please make `beeline` work in the existing class path by default. New code
> path should be applied additionally by configuration or environment variables.

@dongjoon-hyun In general, I agree with your concerns and proposal, but I
think we can have a different default behavior, for the following reasons:
1. Technically, BeeLine does NOT use Spark classes.
Spark integrates the vanilla Hive BeeLine without modification; its
dependency list can be found at
[Maven Central](https://mvnrepository.com/artifact/org.apache.hive/hive-beeline/2.3.10).
Excluding some classic Spark jars should NOT be risky.
2. To avoid surprising users, BeeLine should work with Connect Server
out of the box, which means we should tune the classpath automatically.
3. If we want to achieve point 2 and also keep `beeline` working with the
existing classpath by default, we need a mechanism to determine which service
BeeLine is going to connect to, which raises two problems:
   1. We can parse the args in `SparkClassCommandBuilder` to detect the
   Connect service if the user provides the JDBC URL directly on the command
   line, e.g., `beeline -u 'jdbc:sc://xxxx'`, but this means we need to process
   `BeeLine` args in Spark Launcher, which introduces additional complexity and
   is not worth it IMO.
   2. BeeLine also allows users to run `!connect <jdbc-url>` to connect to a
   DBMS in interactive mode (after starting the CLI). In this case, we have no
   chance to change the classpath dynamically.
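
To make the first sub-point concrete, here is a minimal sketch of the kind of argument scanning it implies. It is written as a shell function purely for illustration and is not the `SparkClassCommandBuilder` change itself; the function name `detect_beeline_target` and the labels `connect` / `classic` are hypothetical, and it assumes the URL is passed via `-u`.

```shell
# Hypothetical helper (not part of Spark): guess the target service from the
# BeeLine command-line args. Prints "connect" if a `-u jdbc:sc://...` URL is
# present, and "classic" otherwise.
detect_beeline_target() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "-u" ] && [ "$#" -ge 2 ]; then
      case "$2" in
        jdbc:sc://*) echo "connect"; return 0 ;;
      esac
    fi
    shift
  done
  # No Connect URL on the command line. A URL entered later through
  # `!connect <jdbc-url>` is invisible at this point.
  echo "classic"
}

detect_beeline_target -u 'jdbc:sc://localhost:15002'
detect_beeline_target -u 'jdbc:hive2://localhost:10000'
detect_beeline_target
```

Even this small sketch has to know BeeLine's option syntax, and the last call shows the gap from the second sub-point: when no URL is given up front, the launcher has nothing to inspect.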
Given the above reasons, I think we can change the classpath as proposed by
this PR by default, and add an internal switch (i.e., an env var
`SPARK_BEELINE_CLASSIC`, kept until at least 5.x) as a backdoor that lets
the user switch back to the original classpath if something goes wrong.
Or, if we want to be very conservative, we can provide a switch (e.g., an env
var `SPARK_BEELINE_CONNECT`) that the user must set explicitly before using
BeeLine to connect to Connect Server. TBH, I think this hurts user experience.
```
$ SPARK_BEELINE_CONNECT=1 bin/beeline -u jdbc:sc://localhost:15002
```
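
The two alternatives differ only in which classpath is the default. A rough sketch of that difference follows, assuming that any non-empty value enables a switch (the comment above does not specify this); the function names and the `connect` / `classic` labels are hypothetical and do not come from the PR.

```shell
# Option A (proposed default): use the Connect-friendly classpath unless the
# user opts out via SPARK_BEELINE_CLASSIC.
beeline_mode_default_connect() {
  if [ -n "${SPARK_BEELINE_CLASSIC:-}" ]; then
    echo "classic"
  else
    echo "connect"
  fi
}

# Option B (conservative): keep the existing classpath unless the user opts in
# via SPARK_BEELINE_CONNECT.
beeline_mode_default_classic() {
  if [ -n "${SPARK_BEELINE_CONNECT:-}" ]; then
    echo "connect"
  else
    echo "classic"
  fi
}
```

Under option A a user connecting to Connect Server sets nothing, while under option B the same user must know about the variable beforehand.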
also cc @LuciferYang
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]