OK, this is what I am getting:

$ /tmp/pythonvenv/bin/pip install pandas
The directory '/home/zeppelin/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
The directory '/home/zeppelin/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pandas
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Could not find a version that satisfies the requirement pandas (from versions: )
No matching distribution found for pandas
Could not fetch URL https://pypi.python.org/simple/pandas/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.python.org', port=443): Max retries exceeded with url: /simple/pandas/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)) - skipping

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 2:54 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

Just find pip in your Python 3.6 folder and run it using the full path, e.g.

/tmp/Python-3.6.5/pip install pandas

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Friday, June 8, 2018 at 12:47 PM:

Sorry for the stupid question.

How can I use pip? Zeppelin will run pip through the shell interpreter, but my system global Python is 2.6…

thanks

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 1:45 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

pip should be available under your Python 3.6.5; you can use that to install pandas.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Friday, June 8, 2018 at 11:40 AM:

Hi Jeff,

Thank you very much for your quick response.

My Zeppelin is deployed using HDP (Hortonworks platform), so I already have Spark/YARN integration, and I am using zeppelin.pyspark.python to tell pyspark to run Python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine, but the OS is CentOS 6 (the system Python environment is 2.6), hence pip is not available.

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

First, I would suggest using Python 2.7 or Python 3.x, because Spark 2.x has dropped support for Python 2.6.
Second, you need to configure PYSPARK_PYTHON in the Spark interpreter setting to point to the Python that you installed.
(I don't know what you mean when you say that you can't install pandas system wide.) Do you mean you are not root and don't have permission to install Python packages?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Friday, June 8, 2018 at 9:26 AM:

Dear Zeppelin community,

I am trying to load pandas into my Zeppelin %spark2.pyspark interpreter. The system I am using is CentOS 6 with Python 2.6, so I can’t install pandas system wide through pip as suggested in the documentation.

What can I do if I want to add modules into the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: manuel...@garvan.org.au

NOTICE
Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.
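
Note on the pip failure quoted at the top of this thread: every retry reports that "the SSL module is not available", which points to the interpreter behind /tmp/pythonvenv rather than to pandas or PyPI. A plausible cause, not confirmed anywhere in the thread, is that Python 3.6.5 was compiled from source on a machine without the OpenSSL development headers, so the ssl module was never built. The sketch below is one way to check this before retrying pip. The helper name ssl_status is hypothetical, and the path /tmp/pythonvenv/bin/python is an assumption inferred from the pip path above.

```python
import subprocess
import sys


def ssl_status(python_executable):
    """Report whether `import ssl` works in the given interpreter.

    Returns (ok, detail): on success, detail is the OpenSSL version string;
    on failure, it is the last line of the error output.
    """
    probe = "import ssl; print(ssl.OPENSSL_VERSION)"
    try:
        result = subprocess.run(
            [python_executable, "-c", probe],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError as exc:
        # The interpreter path does not exist or is not executable.
        return False, "could not start interpreter: {}".format(exc)
    if result.returncode == 0:
        return True, result.stdout.strip()
    lines = result.stderr.strip().splitlines()
    return False, lines[-1] if lines else "unknown error"


# Example call (the path is an assumption, adjust it to the real interpreter):
# ok, detail = ssl_status("/tmp/pythonvenv/bin/python")
# print(ok, detail)
```

If the check reports that ssl is missing, pip cannot reach any HTTPS index from that interpreter, so the interpreter itself would need to be rebuilt or replaced with one that includes ssl before `pip install pandas` can succeed.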