Try:

import os
os.environ['PYSPARK_PYTHON'] = "python path"
os.environ['SPARK_HOME'] = "SPARK path"
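
Since pyspark was installed with pip, one common cause of the "Python worker failed to connect back" error on Windows is that the worker processes pick up a different Python than the driver. A minimal sketch of the idea (sys.executable is used here instead of a hard-coded path; setting SPARK_HOME is usually not needed for a pip install):

import os
import sys

# Make the PySpark workers use the same interpreter as the driver
os.environ['PYSPARK_PYTHON'] = sys.executable

from pyspark import SparkContext

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take: {a.take(2)}")

Note that the environment variables have to be set before the SparkContext is created, otherwise the workers are launched with the old settings.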






On 2022-09-20 at 17:51, yogita bhardwaj <yogita.bhard...@iktara.ai> wrote:


 
I have installed pyspark using pip.
I'm getting an error while running the following code:

from pyspark import SparkContext

sc = SparkContext()
a = sc.parallelize([1, 2, 3, 4])
print(f"a_take: {a.take(2)}")
 
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DESKTOP-DR2QC97.mshome.net executor driver): org.apache.spark.SparkException: Python worker failed to connect back.
                at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:189)
                at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:109)
                at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:124)
                at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:164)
                at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
                at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
                at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
 
Can anyone please help me resolve this issue?
