Sylvain Zimmer created SPARK-16700:
--------------------------------------

             Summary: StructType doesn't accept Python dicts anymore
                 Key: SPARK-16700
                 URL: https://issues.apache.org/jira/browse/SPARK-16700
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.0
            Reporter: Sylvain Zimmer

Hello,

I found this issue while testing my codebase with 2.0.0-rc5.

StructType in Spark 1.6.2 accepts the Python <dict> type, which is very handy. 2.0.0-rc5 does not and throws an error.

I don't know if this was intended, but I'd advocate for this behaviour to remain the same. MapType is probably wasteful when your key names never change, and switching to Python tuples would be cumbersome.

Here is a minimal script to reproduce the issue:

{code:python}
from pyspark import SparkContext
from pyspark.sql import types as SparkTypes
from pyspark.sql import SQLContext

sc = SparkContext()
sqlc = SQLContext(sc)

struct_schema = SparkTypes.StructType([
    SparkTypes.StructField("id", SparkTypes.LongType())
])

rdd = sc.parallelize([{"id": 0}, {"id": 1}])

df = sqlc.createDataFrame(rdd, struct_schema)

print df.collect()

# 1.6.2 prints [Row(id=0), Row(id=1)]

# 2.0.0-rc5 raises TypeError: StructType can not accept object {'id': 0} in type <type 'dict'>
{code}

Thanks!
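
For anyone who needs a stopgap in the meantime, here is a minimal sketch of the tuple conversion mentioned above. It is plain Python with no Spark dependency; the helper name `dict_to_tuple` is hypothetical (it is not part of PySpark), and the field list is hard-coded to mirror the schema in the script above.

```python
def dict_to_tuple(record, field_names):
    """Return the values of `record` as a tuple ordered by `field_names`.

    Keys missing from `record` become None.
    """
    return tuple(record.get(name) for name in field_names)


# Assumption: the field order matches the StructType from the report,
# which has a single "id" field.
field_names = ["id"]
records = [{"id": 0}, {"id": 1}]

rows = [dict_to_tuple(r, field_names) for r in records]
print(rows)  # [(0,), (1,)]
```

With Spark, the same helper could presumably be applied as `rdd.map(lambda d: dict_to_tuple(d, struct_schema.names))` before calling `createDataFrame`. I have not verified this against 2.0.0-rc5, and it illustrates why having to do this for every dict-based RDD would be cumbersome.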