[ 
https://issues.apache.org/jira/browse/SPARK-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359404#comment-14359404
 ] 

Nicholas Chammas commented on SPARK-6282:
-----------------------------------------

This shouldn't be related to boto. "_winreg" is the Python 2 module for 
accessing the Windows registry, so it is strange to see it imported on a 
Linux machine.

Please give us more details about your cluster setup, where you are running the 
driver from, and so on. Also, does the error still occur if you use numpy's 
implementation of {{random}}?

> Strange Python import error when using random() in a lambda function
> --------------------------------------------------------------------
>
>                 Key: SPARK-6282
>                 URL: https://issues.apache.org/jira/browse/SPARK-6282
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>         Environment: Kubuntu 14.04, Python 2.7.6
>            Reporter: Pavel Laskov
>            Priority: Minor
>
> Consider the example Python code below:
>     from random import random
>     from pyspark.context import SparkContext
>     from xval_mllib import read_csv_file_as_list
>
>     if __name__ == "__main__":
>         sc = SparkContext(appName="Random() bug test")
>         data = sc.parallelize(read_csv_file_as_list('data/malfease-xp.csv'))
>         #data = sc.parallelize([1, 2, 3, 4, 5], 2)
>         d = data.map(lambda x: (random(), x))
>         print d.first()
> The data is read from a large CSV file. Running this code results in a Python 
> import error:
> ImportError: No module named _winreg
> If I use 'import random' and call 'random.random()' in the lambda function, no 
> error occurs. Also, no error occurs with either kind of import statement for 
> a small artificial data set like the one shown in the commented-out line.
> The full error trace, the source of the CSV-reading function 
> ('read_csv_file_as_list' is my own), and a sample dataset (the original 
> dataset is about 8M large) can be provided.
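
The workaround described in the report (importing the module and calling 
random.random() through it) can be sketched without a cluster as follows. 
This is only a minimal sketch, not the reporter's code: the name 
tag_with_random and the sample list are hypothetical, and the built-in map 
stands in for the RDD's map.

```python
import random


def tag_with_random(x):
    # Look up random.random() through the module at call time, rather than
    # capturing the bare function via `from random import random`.
    return (random.random(), x)


# Hypothetical stand-in for the RDD contents. With PySpark, the same function
# would be passed to the RDD instead, e.g. data.map(tag_with_random).
sample = [1, 2, 3, 4, 5]
tagged = list(map(tag_with_random, sample))
```

Each element of tagged is a (float, value) pair with the float drawn from 
[0.0, 1.0). Whether this form also avoids the error on the large dataset is 
what the report states, not something this sketch verifies.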



