[ https://issues.apache.org/jira/browse/SPARK-23600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-23600:
---------------------------------
    Priority: Major  (was: Critical)
    Fix Version/s:     (was: 2.3.0)

Let's not set the fix version, which is something we usually set.

> conda_panda_example test fails to import panda lib with Spark 2.3
> -----------------------------------------------------------------
>
>                 Key: SPARK-23600
>                 URL: https://issues.apache.org/jira/browse/SPARK-23600
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.0
>         Environment: ambari-server --version 2.7.0.2-64
> HDP-3.0.0.2-132
>            Reporter: Supreeth Sharma
>            Priority: Major
>
> With Spark 2.3, the conda panda test fails to import pandas.
> Python version: Python 2.7.5
> 1) Create the requirements file.
> virtual_env_type : Native
> {code:java}
> packaging==16.8
> panda==0.3.1
> pyparsing==2.1.10
> requests==2.13.0
> six==1.10.0
> numpy==1.12.0
> pandas==0.19.2
> python-dateutil==2.6.0
> pytz==2016.10
> {code}
> virtual_env_type : conda
> {code:java}
> mkl=2017.0.1=0
> numpy=1.12.0=py27_0
> openssl=1.0.2k=0
> pandas=0.19.2=np112py27_1
> pip=9.0.1=py27_1
> python=2.7.13=0
> python-dateutil=2.6.0=py27_0
> pytz=2016.10=py27_0
> readline=6.2=2
> setuptools=27.2.0=py27_0
> six=1.10.0=py27_0
> sqlite=3.13.0=0
> tk=8.5.18=0
> wheel=0.29.0=py27_0
> zlib=1.2.8=3
> {code}
> 2) Run the conda panda test.
> {code:java}
> spark-submit --master yarn-client \
>   --jars /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.3.0.0.2-132.jar \
>   --conf spark.pyspark.virtualenv.enabled=true \
>   --conf spark.pyspark.virtualenv.type=native \
>   --conf spark.pyspark.virtualenv.requirements=/tmp/requirements.txt \
>   --conf spark.pyspark.virtualenv.bin.path=/usr/bin/virtualenv \
>   /hwqe/hadoopqe/tests/spark/data/conda_panda_example.py 2>&1 | tee /tmp/1/Spark_clientLogs/pyenv_conda_panda_example_native_yarn-client.log
> {code}
> 3) The application fails to import pandas.
> {code:java}
> 2018-03-05 13:43:31,493|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|18/03/05 13:43:31 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
> 2018-03-05 13:43:31,527|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|Traceback (most recent call last):
> 2018-03-05 13:43:31,527|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|File "/hwqe/hadoopqe/tests/spark/data/conda_panda_example.py", line 5, in <module>
> 2018-03-05 13:43:31,528|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|import pandas as pd
> 2018-03-05 13:43:31,528|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|ImportError: No module named pandas
> 2018-03-05 13:43:31,547|INFO|MainThread|machine.py:167 - run()||GUID=a3cb88f7-bf55-4d9e-9cfe-3e44eae3a72b|18/03/05 13:43:31 INFO BlockManagerMasterEndpoint: Registering block manager ctr-e138-1518143905142-67599-01-000005.hwx.site:44861 with 366.3 MB RAM, BlockManagerId(2, ctr-e138-1518143905142-67599-01-000005.hwx.site, 44861, None)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
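
The traceback in the report points at the top-level `import pandas as pd` on line 5 of the script. One way to narrow this kind of failure down is to print which interpreter is actually running the script and whether the expected modules resolve in it. The following is only a minimal diagnostic sketch, not part of the reported test: `probe_module` and `describe_environment` are hypothetical helper names introduced here for illustration, and the sketch makes no assumption about how the `spark.pyspark.virtualenv.*` settings behave.

```python
import importlib
import sys


def probe_module(name):
    """Try to import `name`; return (found, detail) instead of raising."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # detail is the error message, e.g. "No module named pandas"
        return False, str(exc)
    # detail is the file the module was loaded from, when it has one
    return True, getattr(module, "__file__", "<built-in>")


def describe_environment(names):
    """Collect the interpreter path, its version, and import status per module."""
    report = {
        "executable": sys.executable,
        "version": sys.version.split()[0],
    }
    for name in names:
        report[name] = probe_module(name)
    return report


if __name__ == "__main__":
    # Hypothetical usage: run this with the same interpreter as the failing script.
    for key, value in sorted(describe_environment(["pandas", "numpy"]).items()):
        print("{}: {}".format(key, value))
```

If the printed `executable` is the system Python rather than the interpreter inside the virtualenv built from the requirements file, that would be consistent with the `ImportError` above, though it would not by itself explain why that interpreter was chosen.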