Maciej Szymkiewicz created SPARK-11281:
------------------------------------------

             Summary: Issue with creating and collecting DataFrame using environments
                 Key: SPARK-11281
                 URL: https://issues.apache.org/jira/browse/SPARK-11281
             Project: Spark
          Issue Type: Bug
          Components: SparkR
    Affects Versions: 1.6.0
         Environment: R 3.2.2, Spark built from master at 487d409e71767c76399217a07af8de1bb0da7aa8
            Reporter: Maciej Szymkiewicz


It is not possible to access a Map field created from an environment. Assuming a local data frame is created as follows:

{code}
ldf <- data.frame(row.names=1:2)
ldf$x <- c(as.environment(list(a=1, b=2)), as.environment(list(c=3)))
str(ldf)
## 'data.frame':        2 obs. of  1 variable:
##  $ x:List of 2
##   ..$ :<environment: 0x35c94d8> 
##   ..$ :<environment: 0x35c7ac0> 

get("a", ldf$x[[1]])
## [1] 1

get("c", ldf$x[[2]])
## [1] 3
{code}

It is possible to create a Spark data frame:

{code}
sdf <- createDataFrame(sqlContext, ldf)
printSchema(sdf)

## root
##  |-- x: array (nullable = true)
##  |    |-- element: map (containsNull = true)
##  |    |    |-- key: string
##  |    |    |-- value: double (valueContainsNull = true)
{code}

but it throws:

{code}
java.lang.IllegalArgumentException: Invalid array type e
{code}

on collect / head. 

The problem seems to be specific to environments; it cannot be reproduced when the Map comes, for example, from a Cassandra table.
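A possible workaround (an untested sketch, not a fix) is to convert each environment back to a named list before calling {{createDataFrame}}, so the Map column is built from plain lists rather than environments:

{code}
# Hypothetical workaround: replace each environment in the column with a
# named list via as.list(), then create the Spark DataFrame as before.
ldf$x <- lapply(ldf$x, as.list)
sdf <- createDataFrame(sqlContext, ldf)
head(sdf)
{code}

Note that {{as.list}} on an environment does not guarantee element order, which may matter if the Map key order is significant downstream.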



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
