Hi Mich,

Thanks for the response. I am running it through the CLI (on the cluster).
Since this will be a scheduled job, I do not want to activate the environment manually. The job should automatically pick up the virtual environment's Python. For that I saw the 3 variables I mentioned; I think pointing some of them at the environment's Python binary will make the job run from the venv:

PYTHONPATH
PYSPARK_DRIVER_PYTHON
PYSPARK_PYTHON

Also, should they be set in spark-env.sh or in the .bashrc file? What is the difference between spark-env.sh and .bashrc?

Thanks
Rajat

On Sun, Jan 17, 2021 at 10:32 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Rajat,
>
> Are you running this through an IDE like PyCharm or on the CLI?
>
> If you already have a Python virtual environment, then just activate it.
>
> The only env variable you need to set is PYTHONPATH, which you can export
> in your startup shell script (.bashrc etc.).
>
> Once you are in the virtual environment, you run:
>
> $SPARK_HOME/bin/spark-submit <python_file.py>
>
> Alternatively, you can chmod +x <python_file.py> and add the following
> line to the top of the file:
>
> #!/usr/bin/env python3
>
> and then you can run it as:
>
> ./<python_file.py>
>
> HTH
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> On Sun, 17 Jan 2021 at 13:41, rajat kumar <kumar.rajat20...@gmail.com>
> wrote:
>
>> Hello,
>>
>> Can anyone confirm here please?
>>
>> Regards
>> Rajat
>>
>> On Sat, Jan 16, 2021 at 11:46 PM rajat kumar <kumar.rajat20...@gmail.com>
>> wrote:
>>
>>> Hey Users,
>>>
>>> I want to run a Spark job from a virtual environment using Python.
>>>
>>> Please note I am creating the virtual env (using python3 -m venv env).
>>>
>>> I see that there are 3 variables for Python which we have to set:
>>> PYTHONPATH
>>> PYSPARK_DRIVER_PYTHON
>>> PYSPARK_PYTHON
>>>
>>> I have 2 doubts:
>>> 1. If I want to use the virtual env, do I need to point all of these
>>> variables at the virtual environment's Python?
>>> 2. Should I set these variables in spark-env.sh, or should I set them
>>> using export statements?
>>>
>>> Regards
>>> Rajat
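Putting the two replies together, here is a minimal sketch of a scheduler-friendly wrapper script: instead of sourcing bin/activate, it points PySpark directly at the venv's interpreter. The paths VENV_HOME="$HOME/env" and the script name my_job.py are placeholder assumptions, not values from this thread; the spark-submit call is left commented so the wrapper can be adapted first.

```shell
#!/usr/bin/env bash
# Sketch: run a Spark job from a venv without manual activation.
# PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON are read by spark-submit to
# pick the executor-side and driver-side Python interpreters.
# VENV_HOME and my_job.py are hypothetical placeholders -- adjust them.

VENV_HOME="$HOME/env"   # e.g. created earlier with: python3 -m venv env

export PYSPARK_PYTHON="$VENV_HOME/bin/python"         # Python for executors
export PYSPARK_DRIVER_PYTHON="$VENV_HOME/bin/python"  # Python for the driver

echo "Submitting with PYSPARK_PYTHON=$PYSPARK_PYTHON"

# Then submit as usual (uncomment in your real wrapper):
# "$SPARK_HOME/bin/spark-submit" my_job.py
```

On the second doubt: spark-env.sh is sourced by Spark's own launch scripts, so variables set there apply to Spark jobs on that machine regardless of which user or shell starts them, while .bashrc is per-user shell initialization and is typically not sourced by non-interactive scheduler shells such as cron. For a scheduled job, exporting the variables in the wrapper script itself (as above) or in spark-env.sh is usually the more reliable choice.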