It looks like PySpark can't start a JVM in the backend.  How did you set up Java and Spark on your machine?  Some suggestions that may help solve your issue:

1. Use OpenJDK instead of the Apple JDK, since Spark was developed against
   OpenJDK, not Apple's.  You can install OpenJDK with Homebrew (I don't
   see any reason to use Apple's JDK unless you are on the latest Macs;
   see my question below).  A minimal sketch of wiring this up in PySpark
   follows after this list.
2. Download the Spark tarball directly from Spark's web site, deploy it,
   and run Spark's bundled examples from the command line to verify your
   environment before integrating with PyCharm.
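
For example, a minimal sketch of point (1) might look like the snippet
below.  The JAVA_HOME path and the "gateway-check" app name are only
placeholders based on a typical Homebrew install -- adjust them to wherever
your OpenJDK actually lives:

import os

# Point PySpark at an OpenJDK install *before* the SparkSession is created.
# Hypothetical Homebrew path -- replace with your actual JDK location.
os.environ["JAVA_HOME"] = "/usr/local/opt/openjdk@11"
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark.sql import SparkSession

# If the Java gateway starts correctly, this prints the Spark version
# instead of raising "Java gateway process exited before sending its port number".
spark = SparkSession.builder.master("local[2]").appName("gateway-check").getOrCreate()
print(spark.version)
spark.stop()

If that snippet still fails with the same exception outside of PyCharm, the
problem is most likely the Java setup itself rather than the IDE.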

My question to the group:  has anyone had any luck with Apple's JDK when running Spark or other applications (performance-wise)? Is that the one with native libs for the M1 chipset?

-- ND


On 8/17/21 1:56 AM, karan alang wrote:

Hello Experts,

I'm trying to run spark-submit on my MacBook Pro (from the command line or via PyCharm), and it gives the following error:

Exception: Java gateway process exited before sending its port number

I've tried setting the environment variables in the program (based on recommendations from people on the internet), but the problem remains.

Any pointers on how to resolve this issue?

# explicitly setting environment variables
os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/applejdk-11.0.7.10.1.jdk/Contents/Home" os.environ["PYTHONPATH"] = "/usr/local/Cellar/apache-spark/3.1.2/libexec//python/lib/py4j-0.10.4-src.zip:/usr/local/Cellar/apache-spark/3.1.2/libexec//python/:"
os.environ["PYSPARK_SUBMIT_ARGS"]="--master local[2] pyspark-shell"

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile     pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script   File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/karanalang/Documents/Technology/StructuredStreamin_Udemy/Spark-Streaming-In-Python-master/00-HelloSparkSQL/HelloSparkSQL.py", line 12, in <module>
    spark = SparkSession.builder.master("local[*]").getOrCreate()
  File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/Users/karanalang/.conda/envs/PythonLeetcode/lib/python3.9/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway     raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
