HyukjinKwon commented on PR #39928: URL: https://github.com/apache/spark/pull/39928#issuecomment-1428473402
> it feels a bit odd to me to have a script in the sbin that user has to manually specify the location of the connect jars.
> What is the plan for distribution, are connect jars going to be in the main distribution jars/ directory so it won't be needed at that point?

Yes, the eventual plan is to move all the connect jars into the main distribution's `jars/` directory around the next release (Apache Spark 3.5.0). For now, the Spark Connect project is kept separate as an external project. It is located in `connector/`, which was previously `external/` (see also https://github.com/apache/spark/pull/35874; I think we might actually have to revisit the top-level name here).

For a bit more context, there are a couple of longer-term plans, such as replacing Py4J with Spark Connect (so we can block arbitrary JVM access from the Python side for security purposes), and I personally am thinking about replacing the Thrift server in the far future (and not using Hive's Thrift server). I plan to send another email explaining the whole context to the dev mailing list right after the Spark 3.4 release.
