Maciej Szymkiewicz created SPARK-11281:
------------------------------------------
Summary: Issue with creating and collecting DataFrame using environments
Key: SPARK-11281
URL: https://issues.apache.org/jira/browse/SPARK-11281
Project: Spark
Issue Type: Bug
Components: SparkR
Affects Versions: 1.6.0
Environment: R 3.2.2, Spark build from master 487d409e71767c76399217a07af8de1bb0da7aa8
Reporter: Maciej Szymkiewicz

It is not possible to access a Map field created from an environment. Assuming a local data frame is created as follows:

{code}
ldf <- data.frame(row.names=1:2)
ldf$x <- c(as.environment(list(a=1, b=2)), as.environment(list(c=3)))

str(ldf)
## 'data.frame': 2 obs. of 1 variable:
##  $ x:List of 2
##   ..$ :<environment: 0x35c94d8>
##   ..$ :<environment: 0x35c7ac0>

get("a", ldf$x[[1]])
## [1] 1

get("c", ldf$x[[2]])
## [1] 3
{code}

it is possible to create a Spark data frame:

{code}
sdf <- createDataFrame(sqlContext, ldf)

printSchema(sdf)
## root
##  |-- x: array (nullable = true)
##  |    |-- element: map (containsNull = true)
##  |    |    |-- key: string
##  |    |    |-- value: double (valueContainsNull = true)
{code}

but collect / head throws:

{code}
java.lang.IllegalArgumentException: Invalid array type e
{code}

The problem seems to be specific to environments and cannot be reproduced when the Map comes, for example, from a Cassandra table.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
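As a possible workaround sketch (an assumption on my part, not verified against this Spark build): since `get()` works on both environments and named lists, converting each environment to a plain named list with base R's `as.list()` before building the local data frame sidesteps the environment serialization path entirely:

{code}
# Hypothetical workaround: convert each environment to a named list
# before constructing the local data frame. as.list() on an environment
# returns a named list of its bindings (base R behavior).
envs <- list(as.environment(list(a = 1, b = 2)),
             as.environment(list(c = 3)))

ldf <- data.frame(row.names = 1:2)
ldf$x <- lapply(envs, as.list)   # plain named lists instead of environments

# Values remain accessible the same way as in the original report:
ldf$x[[1]][["a"]]
## [1] 1
ldf$x[[2]][["c"]]
## [1] 3
{code}

Whether `createDataFrame` then infers the same `map<string,double>` element schema for named lists is an assumption that would need to be confirmed; the conversion itself is plain base R.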