OK, so the normal install is working. Now, to fix your issue, we need to understand how `sc.install_pypi_package` works and, mainly, how it calls `pip`. We need to make sure that it calls the right pip (the system `pip3` in your case).
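
One way to check this could be to run something like the snippet below from the same PySpark session in which you call `sc.install_pypi_package`. It is only a sketch: it assumes that the interpreter of that session (`sys.executable`) is the one whose pip ends up being used, which I have not verified for EMR. The helper names (`pip_version_line`, `supported_manylinux_tags`) are made up for this example; the version thresholds are the ones quoted further down in this thread.

```python
import re
import subprocess
import sys

# Minimum pip versions per wheel tag, as quoted in this thread.
MANYLINUX_MIN_PIP = {"manylinux2010": (19, 0), "manylinux2014": (19, 3)}


def pip_version_line(python_executable):
    """Return what `<python> -m pip --version` prints for this interpreter."""
    result = subprocess.run(
        [python_executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # If pip is missing, the error message is on stderr instead of stdout.
    return (result.stdout or result.stderr).strip()


def supported_manylinux_tags(version_line):
    """List the manylinux tags that this pip version is recent enough for."""
    match = re.match(r"pip (\d+)\.(\d+)", version_line)
    if match is None:
        return []
    version = (int(match.group(1)), int(match.group(2)))
    return [tag for tag, minimum in MANYLINUX_MIN_PIP.items() if version >= minimum]


if __name__ == "__main__":
    line = pip_version_line(sys.executable)
    print("interpreter:", sys.executable)
    print("pip:", line)
    print("manylinux tags this pip should accept:", supported_manylinux_tags(line))
```

If the interpreter printed here is not `/usr/bin/python3`, or the pip version is still the old one, that would explain why the upgrade of the system `pip3` has no effect on `sc.install_pypi_package`.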

On Fri, 22 Jan 2021 at 14:39, Bertrand B. <bertrand25...@gmail.com> wrote:

> Thank you Guillaume for your help,
>
> I am using (running on AWS EMR-6.2):
>
> pip3 --version
> pip 9.0.3 from /usr/lib/python3.7/site-packages (python 3.7)
>
> pip3 install scikit-learn
>
> Collecting scikit-learn
> Using cached https://files.pythonhosted.org/packages/f4/7b/d415b0c89babf23dcd8ee631015f043e2d76795edd9c7359d6e63257464b/scikit-learn-0.24.1.tar.gz
> Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib64/python3.7/site-packages (from scikit-learn)
> Collecting scipy>=0.19.1 (from scikit-learn)
> Using cached https://files.pythonhosted.org/packages/58/9d/8296d8211318d690119eba6d293b7a149c1c51c945342dd4c3816f79e1ba/scipy-1.6.0-cp37-cp37m-manylinux1_x86_64.whl
> Requirement already satisfied: joblib>=0.11 in /usr/local/lib64/python3.7/site-packages (from scikit-learn)
> Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn)
> Installing collected packages: scipy, scikit-learn
> Running setup.py install for scikit-learn ... error
> Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-93pagltp/scikit-learn/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-0ulalx36-record/install-record.txt --single-version-externally-managed --compile:
> Partial import of sklearn during the build process.
> Traceback (most recent call last):
>   File "/mnt/tmp/pip-build-93pagltp/scikit-learn/sklearn/_build_utils/__init__.py", line 27, in _check_cython_version
>     import Cython
> ModuleNotFoundError: No module named 'Cython'
>
> Upgrading pip to 20.3.3:
>
> sudo pip3 install --upgrade pip
> sudo ln -s /usr/local/bin/pip3 /usr/bin/pip3
>
> pip3 --version
> pip 20.3.3 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
>
> lets me install from the whl file:
>
> pip3 install scikit-learn
> Collecting scikit-learn
> Downloading scikit_learn-0.24.1-cp37-cp37m-manylinux2010_x86_64.whl (22.3 MB)
>
> However, using the API sc.install_pypi_package("scikit-learn") still uses
> the tar file instead of the whl file (even after the pip upgrade).
>
> Collecting scikit-learn
> Using cached https://files.pythonhosted.org/packages/f4/7b/d415b0c89babf23dcd8ee631015f043e2d76795edd9c7359d6e63257464b/scikit-learn-0.24.1.tar.gz
>
> Thanks for your help,
>
> Cheers,
>
> Bertrand
>
> On Fri, 22 Jan 2021 at 04:13, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:
>
>> @Bertrand Could you tell us which version of `pip` you use? (You need
>> pip >= 19.0 for manylinux2010 and pip >= 19.3 for manylinux2014.)
>>
>> On Fri, 22 Jan 2021 at 09:49, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:
>>
>>> We might experience an issue with PyPI not selecting the manylinux2010
>>> wheel: https://github.com/scikit-learn/scikit-learn/issues/19233
>>> We have to check but we will probably shortly upload manylinux1 wheels
>>> that should resolve the issue.
>>>
>>> I am curious if fetching the wheel by hand and installing via `pip`
>>> would be a workaround (not practical for automated usage though).
>>>
>>> On Thu, 21 Jan 2021 at 00:34, The Helmbolds via scikit-learn <scikit-learn@python.org> wrote:
>>>
>>>> Use the Anaconda Python installation.
>>>>
>>>> "You won't find the right answers if you don't ask the right questions!"
>>>> (Robert Helmbold, 2013)
>>>>
>>>> On Wednesday, January 20, 2021, 04:16:15 PM MST, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:
>>>>
>>>> Basically it gets the tar with the source and recompiles instead of using
>>>> the wheel. Could you force an install from PyPI without using the cached
>>>> file?
>>>>
>>>> We pushed wheels yesterday for 0.24.1 as well so it should not get the
>>>> 0.24.0 version.
>>>>
>>>> For 0.23.2, you can see that it used the wheel (.whl).
>>>>
>>>> Sent from my phone - sorry to be brief and potential misspell.
>>>> *From:* bertrand25...@gmail.com
>>>> *Sent:* 20 January 2021 23:21
>>>> *To:* scikit-learn@python.org
>>>> *Reply to:* scikit-learn@python.org
>>>> *Subject:* [scikit-learn] scikit-learn 0.24 installation fails with ModuleNotFoundError: No module named 'scipy'
>>>>
>>>> To whom it may concern,
>>>>
>>>> I am trying to install scikit-learn in a PySpark job using the
>>>> install_pypi_package PySpark API but the install fails with:
>>>>
>>>> sc.install_pypi_package("scikit-learn")
>>>>
>>>> Collecting scikit-learn
>>>> Using cached https://files.pythonhosted.org/packages/db/e2/9c0bde5f81394b627f623557690536b12017b84988a4a1f98ec826edab9e/scikit-learn-0.24.0.tar.gz
>>>> Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib64/python3.7/site-packages (from scikit-learn)
>>>> Collecting scipy>=0.19.1 (from scikit-learn)
>>>> Using cached https://files.pythonhosted.org/packages/58/9d/8296d8211318d690119eba6d293b7a149c1c51c945342dd4c3816f79e1ba/scipy-1.6.0-cp37-cp37m-manylinux1_x86_64.whl
>>>> Requirement already satisfied: joblib>=0.11 in /usr/local/lib64/python3.7/site-packages (from scikit-learn)
>>>> Collecting threadpoolctl>=2.0.0 (from scikit-learn)
>>>> Using cached https://files.pythonhosted.org/packages/f7/12/ec3f2e203afa394a149911729357aa48affc59c20e2c1c8297a60f33f133/threadpoolctl-2.1.0-py3-none-any.whl
>>>> Building wheels for collected packages: scikit-learn
>>>> Running setup.py bdist_wheel for scikit-learn: started
>>>> Running setup.py bdist_wheel for scikit-learn: finished with status 'error'
>>>> Complete output from command /tmp/1611000009300-0/bin/python -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-phc6p6gl/scikit-learn/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpry3gf9r0pip-wheel- --python-tag cp37:
>>>> Partial import of sklearn during the build process.
>>>> Traceback (most recent call last):
>>>>   File "/mnt/tmp/pip-build-phc6p6gl/scikit-learn/setup.py", line 201, in check_package_status
>>>>     module = importlib.import_module(package)
>>>>   File "/tmp/1611000009300-0/lib64/python3.7/importlib/__init__.py", line 127, in import_module
>>>>     return _bootstrap._gcd_import(name[level:], package, level)
>>>>   File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
>>>>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>>>>   File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
>>>> ModuleNotFoundError: No module named 'scipy'
>>>> Traceback (most recent call last):
>>>>   File "<string>", line 1, in <module>
>>>>   File "/mnt/tmp/pip-build-phc6p6gl/scikit-learn/setup.py", line 306, in <module>
>>>>     setup_package()
>>>>   File "/mnt/tmp/pip-build-phc6p6gl/scikit-learn/setup.py", line 294, in setup_package
>>>>     check_package_status('scipy', min_deps.SCIPY_MIN_VERSION)
>>>>   File "/mnt/tmp/pip-build-phc6p6gl/scikit-learn/setup.py", line 227, in check_package_status
>>>>     .format(package, req_str, instructions))
>>>> ImportError: scipy is not installed.
>>>> scikit-learn requires scipy >= 0.19.1.
>>>>
>>>> I do not encounter this error with scikit-learn 0.23.2:
>>>>
>>>> sc.install_pypi_package("scikit-learn==0.23.2")
>>>>
>>>> Collecting scikit-learn==0.23.2
>>>> Using cached https://files.pythonhosted.org/packages/f4/cb/64623369f348e9bfb29ff898a57ac7c91ed4921f228e9726546614d63ccb/scikit_learn-0.23.2-cp37-cp37m-manylinux1_x86_64.whl
>>>> Requirement already satisfied: scipy>=0.19.1 in /mnt/tmp/1611000009300-0/lib/python3.7/site-packages (from scikit-learn==0.23.2)
>>>> Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib64/python3.7/site-packages (from scikit-learn==0.23.2)
>>>> Requirement already satisfied: joblib>=0.11 in /usr/local/lib64/python3.7/site-packages (from scikit-learn==0.23.2)
>>>> Requirement already satisfied: threadpoolctl>=2.0.0 in /mnt/tmp/1611000009300-0/lib/python3.7/site-packages (from scikit-learn==0.23.2)
>>>> Installing collected packages: scikit-learn
>>>> Successfully installed scikit-learn-0.23.2
>>>>
>>>> Could you please help me understand why the scikit-learn 0.24
>>>> installation fails?
>>>>
>>>> Thank you for your help,
>>>>
>>>> Bertrand
>>>> _______________________________________________
>>>> scikit-learn mailing list
>>>> scikit-learn@python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>
>>> --
>>> Guillaume Lemaitre
>>> Scikit-learn @ Inria Foundation
>>> https://glemaitre.github.io/

--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/