oulenz commented on issue #23715: [SPARK-26803][PYTHON] Add sbin subdirectory to pyspark
URL: https://github.com/apache/spark/pull/23715#issuecomment-459981150

> The Spark UI is always available while running. You _can_ run the history server if you want, with local mode, though I think that's rare; you just need the actual Spark distribution.

The Spark UI disappears as soon as your app run is complete, so in practice it's really impractical for evaluation.

> I'm surprised it just works from the pip packaging, but that makes this a reasonable idea, to package the scripts that happen to work. They're small. I'm still not sure it's a good idea, as it's not on purpose and there is a well-established and intended way to run Spark, which is to get the Spark distro.

You say that that's the "well-established and intended way to run Spark", but both the [download page](http://spark.apache.org/downloads.html) and the [quickstart guide](http://spark.apache.org/docs/latest/quick-start.html) explicitly offer the option of installing pyspark as a pip package. I suspect that most Python developers will jump at that option, because it's a one-line installation, because it fits their workflow, and because they are working with conda or virtualenv environments.

I think there is a communication issue here from Spark to its users: on the one hand, Spark offers a pip package that can in essence do everything locally, but on the other hand you seem to be saying that, actually, that package is not the right way to run Spark.
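(For context, a minimal sketch of the workflow being discussed: running a local pyspark app with event logging enabled so its UI can later be viewed in the history server. This is not code from the PR; the app name and the `/tmp/spark-events` path are illustrative, and the final shell command assumes the `sbin` scripts this PR proposes to package are on hand.)

```python
from pyspark.sql import SparkSession

# Local pyspark app that writes event logs so a history server can
# reconstruct its UI after the run finishes. The log directory is an
# arbitrary example and must exist before the app starts.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("history-server-demo")
    .config("spark.eventLog.enabled", "true")
    .config("spark.eventLog.dir", "file:///tmp/spark-events")
    .getOrCreate()
)

spark.range(1000).selectExpr("sum(id)").show()
spark.stop()

# Afterwards, assuming the sbin scripts are available alongside the pip
# install, something like the following would serve the logged UI:
#   SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=file:///tmp/spark-events" \
#     sbin/start-history-server.sh
```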
