[ https://issues.apache.org/jira/browse/SPARK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207199#comment-14207199 ]
Doris Xin commented on SPARK-4348:
----------------------------------

I fully support this. It took a lot of hacking just to override the default random module in Python, and it wasn't clear whether the override was the ideal solution.

> pyspark.mllib.random conflicts with random module
> -------------------------------------------------
>
>                 Key: SPARK-4348
>                 URL: https://issues.apache.org/jira/browse/SPARK-4348
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Davies Liu
>            Priority: Blocker
>
> There are conflicts in two cases:
> 1. The random module is used by pyspark.mllib.feature. If the first entry of
> sys.path is not '', the hack in pyspark/__init__.py will fail to fix the
> conflict.
> 2. When running tests in mllib/xxx.py, the '' entry should be popped from
> sys.path before importing anything, or the tests will fail.
> The first case is not fully fixed for users; it will cause problems in some
> cases, such as:
> {code}
> >>> import sys
> >>> sys.path.insert(0, PATH_OF_MODULE)
> >>> import pyspark
> >>> # use Word2Vec will fail
> {code}
> I'd like to rename mllib/random.py as mllib/_random.py, then in
> mllib/__init__.py
> {code}
> import pyspark.mllib._random as random
> {code}
> cc [~mengxr] [~dorx]
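
For illustration only, here is a minimal sketch of the rename-and-alias idea proposed in the issue. It builds a throwaway package in a temporary directory; the names `toy_mllib` and `uniform_list` are hypothetical stand-ins and are not part of Spark. It also uses a relative import in `__init__.py` rather than the absolute form quoted above, which is an assumption made to keep the toy package self-contained.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical toy package standing in for pyspark.mllib; not Spark code.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "toy_mllib")
os.mkdir(pkg_dir)

# The implementation lives in _random.py, so no file named random.py
# sits inside the package to shadow the standard library module.
with open(os.path.join(pkg_dir, "_random.py"), "w") as f:
    f.write(textwrap.dedent("""
        import random as _stdlib_random

        def uniform_list(n, seed):
            rng = _stdlib_random.Random(seed)
            return [rng.random() for _ in range(n)]
    """))

# The package re-exports the private module under the public name "random".
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from . import _random as random\n")

# Put the package's parent directory first on sys.path, similar to the
# failing case described in the issue.
sys.path.insert(0, root)
importlib.invalidate_caches()

import random      # still the standard library module
import toy_mllib   # the toy package

values = toy_mllib.random.uniform_list(3, seed=42)
stdlib_untouched = hasattr(random, "shuffle")
public_is_private = toy_mllib.random.__name__ == "toy_mllib._random"
```

In this sketch, callers still write `toy_mllib.random`, while a plain `import random` continues to resolve to the standard library regardless of the order of `sys.path`. Whether the same holds for every PySpark entry point would need to be checked against the actual code.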