Davies Liu created SPARK-4348:
---------------------------------

             Summary: pyspark.mllib.random conflicts with random module
                 Key: SPARK-4348
                 URL: https://issues.apache.org/jira/browse/SPARK-4348
             Project: Spark
          Issue Type: Bug
          Components: MLlib, PySpark
    Affects Versions: 1.1.0, 1.2.0
            Reporter: Davies Liu
            Priority: Blocker


There are conflicts in two cases:

1. The random module is used by pyspark.mllib.feature. If the first entry of 
sys.path is not '', the hack in pyspark/__init__.py fails to fix the 
conflict.

2. When running the tests in mllib/xxx.py, the '' entry must be popped from 
sys.path before importing anything, or the tests will fail.

The first case is not fully fixed for users. It still causes problems in some 
situations, such as:

{code}
>>> import sys
>>> sys.path.insert(0, PATH_OF_MODULE)
>>> import pyspark
>>> # using Word2Vec will now fail
{code}
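
The shadowing behind both cases can be reproduced outside Spark. The following is a minimal sketch, not Spark code: the temporary directory `fake_mllib` stands in for pyspark/mllib, and the helper `import_random_with_path_head` is a hypothetical name introduced only for this illustration. It shows that once a directory containing a random.py is the first entry of sys.path, a plain `import random` resolves to that file instead of the standard library module.

```python
import importlib
import os
import sys
import tempfile


def import_random_with_path_head(directory):
    """Import `random` from scratch with `directory` as sys.path[0].

    Hypothetical helper for illustration; the previous module cache entry
    and sys.path are restored before returning.
    """
    saved_module = sys.modules.pop("random", None)
    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    try:
        return importlib.import_module("random")
    finally:
        sys.path.remove(directory)
        sys.modules.pop("random", None)
        if saved_module is not None:
            sys.modules["random"] = saved_module


with tempfile.TemporaryDirectory() as fake_mllib:
    # Stand-in for pyspark/mllib/random.py (assumption: contents are irrelevant).
    with open(os.path.join(fake_mllib, "random.py"), "w") as f:
        f.write("SHADOW = True\n")

    module = import_random_with_path_head(fake_mllib)
    # The local file was picked up ...
    shadowed = getattr(module, "SHADOW", False)
    # ... so the standard library API is missing from what was imported.
    has_stdlib_api = hasattr(module, "randint")
```

Any code that then expects the standard library functions on `random` fails with an AttributeError, which is the kind of failure described above.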

I'd like to rename mllib/random.py to mllib/_random.py, and then in mllib/__init__.py:

{code}
import pyspark.mllib._random as random
{code}
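
A rough sketch of how the rename could behave, under stated assumptions: `fakemllib` is a hypothetical throwaway package standing in for pyspark.mllib, and `IS_PACKAGE_HELPER` is a marker invented for the check. The package directory is deliberately placed first on sys.path, as happens when a script inside the package is run directly. Only the top-level `random` entry is dropped from the module cache, so modules the standard library has already loaded stay cached.

```python
import importlib
import os
import sys
import tempfile


def build_fake_package(root):
    """Create root/fakemllib with _random.py re-exported as `random`."""
    pkg_dir = os.path.join(root, "fakemllib")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "_random.py"), "w") as f:
        f.write("IS_PACKAGE_HELPER = True\n")
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        # Same pattern as the proposed line in mllib/__init__.py.
        f.write("import fakemllib._random as random\n")
    return pkg_dir


with tempfile.TemporaryDirectory() as root:
    pkg_dir = build_fake_package(root)
    saved_random = sys.modules.pop("random", None)
    # Worst case: the package directory itself is the first path entry.
    sys.path[:0] = [pkg_dir, root]
    importlib.invalidate_caches()
    try:
        top_level = importlib.import_module("random")
        package = importlib.import_module("fakemllib")
        # No random.py exists in the package directory any more, so the
        # top-level name resolves to the standard library module.
        top_level_is_stdlib = hasattr(top_level, "randint")
        # The helper is still reachable under its old public name.
        helper_is_exposed = getattr(package.random, "IS_PACKAGE_HELPER", False)
    finally:
        del sys.path[:2]
        sys.modules.pop("fakemllib", None)
        sys.modules.pop("fakemllib._random", None)
        if saved_random is not None:
            sys.modules["random"] = saved_random
```

With this layout, callers keep using the package attribute `random`, while a bare `import random` no longer depends on the order of sys.path.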


cc [~mengxr] [~dorx]


