Re: ImportError: No module named numpy
Issue has been fixed. After lots of R&D I finally found a pretty simple thing causing this problem: it was a permission issue on the Python libraries. The user I was logged in as did not have enough permission to read/execute the following Python library paths:

/usr/lib/python2.7/site-packages/
/usr/lib64/python2.7/

The above paths should have read/execute permission for the user executing the python/pyspark program.

Thanks everyone for your help with this. Appreciated!

Regards

On Sun, Jun 5, 2016 at 12:04 AM, Daniel Rodriguez wrote:

> Like people have said, you need numpy on all the nodes of the cluster. The
> easiest way in my opinion is to use Anaconda:
> https://www.continuum.io/downloads but that can get tricky to manage on
> multiple nodes if you don't have some configuration management skills.
>
> How are you deploying the Spark cluster? If you are using Cloudera, I
> recommend using the Anaconda Parcel:
> http://blog.cloudera.com/blog/2016/02/making-python-on-apache-hadoop-easier-with-anaconda-and-cdh/
>
> On 4 Jun 2016, at 11:13, Gourav Sengupta wrote:
>
> Hi,
>
> I think the solution is quite simple. Just download Anaconda (if you pay
> for the licensed version, you will eventually feel like being in heaven
> when you move to CI and CD and live in a world where you have a data
> product actually running in real life).
>
> Then start the pyspark program by including the following:
>
> PYSPARK_PYTHON=<<installation>>/anaconda2/bin/python2.7 PATH=$PATH:<<installation>>/anaconda/bin <>/pyspark
>
> :)
>
> In case you are using it on EMR, the solution is a bit tricky. Just let me
> know in case you want any further help.
>
> Regards,
> Gourav Sengupta
>
> On Thu, Jun 2, 2016 at 7:59 PM, Eike von Seggern <
> eike.segg...@sevenval.com> wrote:
>
>> Hi,
>>
>> are you using Spark on one machine or many?
>>
>> If on many, are you sure numpy is correctly installed on all machines?
>>
>> To check that the environment is set up correctly, you can try something
>> like:
>>
>> import os
>> pythonpaths = sc.range(10).map(
>>     lambda i: os.environ.get("PYTHONPATH")).collect()
>> print(pythonpaths)
>>
>> HTH
>>
>> Eike
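The permission fix described above can be sketched as shell commands. This is only a sketch: the site-packages paths are the ones from the report, and the `sparkuser` account name is a placeholder for whatever user actually runs pyspark.

```shell
# Paths taken from the report above; adjust for your distribution.
# o+rX grants read to all users, and execute (traverse) on directories only.
chmod -R o+rX /usr/lib/python2.7/site-packages/
chmod -R o+rX /usr/lib64/python2.7/

# Verify as the user that runs pyspark ("sparkuser" is a placeholder):
sudo -u sparkuser python -c "import numpy; print(numpy.__version__)"
```

If the last command prints a version instead of ImportError, the executors running as that user should be able to import numpy as well.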
Re: ImportError: No module named numpy
You should set both PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON to the path of your Python interpreter.

2016-06-02 20:32 GMT+07:00 Bhupendra Mishra :

> did not resolve. :(
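A minimal spark-env.sh fragment following this advice might look as below. The interpreter path is a placeholder; point both variables at an interpreter that can actually import numpy.

```shell
# In $SPARK_HOME/conf/spark-env.sh (copy spark-env.sh.template if needed).
# PYSPARK_PYTHON is used by the worker processes,
# PYSPARK_DRIVER_PYTHON by the driver; both must resolve to an
# interpreter with numpy installed.
export PYSPARK_PYTHON=/usr/bin/python
export PYSPARK_DRIVER_PYTHON=/usr/bin/python
```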
Re: ImportError: No module named numpy
Did not resolve. :(

On Thu, Jun 2, 2016 at 3:01 PM, Sergio Fernández wrote:
Re: ImportError: No module named numpy
On Thu, Jun 2, 2016 at 9:59 AM, Bhupendra Mishra wrote:
>
> and i have already exported the environment variable in spark-env.sh as
> follows.. error still there: ImportError: No module named numpy
>
> export PYSPARK_PYTHON=/usr/bin/python
>

According to the documentation at
http://spark.apache.org/docs/latest/configuration.html#environment-variables
the PYSPARK_PYTHON environment variable is for pointing to the Python
interpreter binary.

If you check the programming guide
https://spark.apache.org/docs/0.9.0/python-programming-guide.html#installing-and-configuring-pyspark
it says you need to add your custom path to PYTHONPATH (the script
automatically adds bin/pyspark there).

So typically on Linux you would need to add the following (assuming you
installed numpy there):

export PYTHONPATH=$PYTHONPATH:/usr/lib/python2.7/dist-packages

Hope that helps.

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co
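Beyond exporting PYTHONPATH, you can probe numpy itself on every executor from the driver. A sketch: `probe` is an illustrative helper name, and the commented lines assume a live SparkContext `sc`.

```python
import os

def probe(_):
    # Runs on an executor: report its PYTHONPATH and whether numpy imports.
    try:
        import numpy
        version = numpy.__version__
    except ImportError:
        version = None
    return [(os.environ.get("PYTHONPATH"), version)]

# With a live SparkContext `sc`, run one probe per partition:
# print(sc.range(10).mapPartitions(probe).collect())
```

Any `(path, None)` pair in the output identifies an executor whose interpreter cannot import numpy.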
Re: ImportError: No module named numpy
It's RHEL, and I have already exported the environment variable in spark-env.sh as follows.. the error is still there: ImportError: No module named numpy

export PYSPARK_PYTHON=/usr/bin/python

thanks

On Thu, Jun 2, 2016 at 12:04 AM, Julio Antonio Soto de Vicente <
ju...@esbet.es> wrote:

> Try adding to spark-env.sh (renaming it if you still have it with
> .template at the end):
>
> PYSPARK_PYTHON=/path/to/your/bin/python
>
> Where your bin/python is your actual Python environment with NumPy
> installed.
Re: ImportError: No module named numpy
Try adding to spark-env.sh (renaming it if you still have it with .template at the end):

PYSPARK_PYTHON=/path/to/your/bin/python

Where your bin/python is your actual Python environment with NumPy installed.

> On 1 Jun 2016, at 20:16, Bhupendra Mishra wrote:
>
> I have numpy installed, but where should I set up PYTHONPATH?
Re: ImportError: No module named numpy
I have numpy installed, but where should I set up PYTHONPATH?

On Wed, Jun 1, 2016 at 11:39 PM, Sergio Fernández wrote:

> sudo pip install numpy
Re: ImportError: No module named numpy
sudo pip install numpy

On Wed, Jun 1, 2016 at 5:56 PM, Bhupendra Mishra <
bhupendra.mis...@gmail.com> wrote:

> Thanks.
> How can this be resolved?

--
Sergio Fernández
Partner Technology Manager
Redlink GmbH
m: +43 6602747925
e: sergio.fernan...@redlink.co
w: http://redlink.co
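`sudo pip install numpy` only fixes the machine it runs on; on a cluster the same install has to happen on every worker node. A minimal sketch, assuming passwordless SSH and a hypothetical workers.txt listing one worker hostname per line:

```shell
# workers.txt is a hypothetical file: one worker hostname per line.
while read -r host; do
  ssh "$host" 'sudo pip install numpy'
done < workers.txt
```

For anything beyond a handful of nodes, a configuration management tool (or the Anaconda parcel mentioned elsewhere in this thread) is less error-prone than ad-hoc SSH loops.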
Re: ImportError: No module named numpy
Thanks.
How can this be resolved?

On Wed, Jun 1, 2016 at 9:02 PM, Holden Karau wrote:

> Generally this means numpy isn't installed on the system, or your
> PYTHONPATH has somehow gotten pointed somewhere odd.
Re: ImportError: No module named numpy
Generally this means numpy isn't installed on the system, or your PYTHONPATH has somehow gotten pointed somewhere odd.

On Wed, Jun 1, 2016 at 8:31 AM, Bhupendra Mishra <
bhupendra.mis...@gmail.com> wrote:

> If anyone can please help me with the following error:
>
> File
> "/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py",
> line 25, in
>
> ImportError: No module named numpy
>
> Thanks in advance!

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
ImportError: No module named numpy
If anyone can please help me with the following error:

File
"/opt/mapr/spark/spark-1.6.1/python/lib/pyspark.zip/pyspark/mllib/__init__.py",
line 25, in

ImportError: No module named numpy

Thanks in advance!