[jira] [Commented] (SPARK-6282) Strange Python import error when using random() in a lambda function

Joseph K. Bradley (JIRA) Wed, 11 Mar 2015 11:40:04 -0700

    [ https://issues.apache.org/jira/browse/SPARK-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357364#comment-14357364 ]


Joseph K. Bradley commented on SPARK-6282:
------------------------------------------

Do you know where "_winreg" appears in the code you're running?  Is it being 
brought in by the read_csv_file_as_list method or its containing package?
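
One rough way to check is to scan the project's Python sources for the string. The sketch below is only an illustration: the helper name find_references and the choice of root directory are hypothetical, and it only looks at .py files under the directory you give it, so it will not see references inside the standard library or compiled extensions unless you point it there. (_winreg is the Python 2 name of the Windows registry module, so on Kubuntu any import of it is expected to fail.)

```python
import io
import os


def find_references(root, needle="_winreg"):
    """Return (path, line_number, line) for every line in a .py file under
    `root` that contains `needle`. Hypothetical helper for illustration."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # io.open behaves the same on Python 2 and 3; undecodable bytes
            # are replaced so one odd file does not abort the scan.
            with io.open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, 1):
                    if needle in line:
                        hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumption: run from the directory that contains xval_mllib.
    for path, number, line in find_references("."):
        print("%s:%d: %s" % (path, number, line))
```

If this prints nothing for the project directory, the reference is probably coming from a third-party or standard-library module rather than from read_csv_file_as_list itself.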

> Strange Python import error when using random() in a lambda function
> --------------------------------------------------------------------
>
>                 Key: SPARK-6282
>                 URL: https://issues.apache.org/jira/browse/SPARK-6282
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>         Environment: Kubuntu 14.04, Python 2.7.6
>            Reporter: Pavel Laskov
>            Priority: Minor
>
> Consider the following example Python code:
>     from random import random
>     from pyspark.context import SparkContext
>     from xval_mllib import read_csv_file_as_list
>
>     if __name__ == "__main__":
>         sc = SparkContext(appName="Random() bug test")
>         data = sc.parallelize(read_csv_file_as_list('data/malfease-xp.csv'))
>         #data = sc.parallelize([1, 2, 3, 4, 5], 2)
>         d = data.map(lambda x: (random(), x))
>         print d.first()
> Data is read from a large CSV file. Running this code results in a Python 
> import error:
>     ImportError: No module named _winreg
> If I use 'import random' and call 'random.random()' in the lambda function, no 
> error occurs. No error occurs either, with both kinds of import statement, for 
> a small artificial data set like the one in the commented-out line.
> I can provide the full error trace, the source of the CSV-reading code (the 
> function 'read_csv_file_as_list' is my own), and a sample dataset (the original 
> dataset is about 8M in size).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

