[jira] [Updated] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-08-25 Thread Yin Huai (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated SPARK-16700:
-
Fix Version/s: 2.0.1

> StructType doesn't accept Python dicts anymore
> --
>
> Key: SPARK-16700
> URL: https://issues.apache.org/jira/browse/SPARK-16700
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.0.0
> Reporter: Sylvain Zimmer
> Assignee: Davies Liu
> Labels: releasenotes
> Fix For: 2.0.1, 2.1.0
>
>
> Hello,
> I found this issue while testing my codebase with 2.0.0-rc5.
> StructType in Spark 1.6.2 accepts the Python dict type, which is very 
> handy. 2.0.0-rc5 does not and throws an error.
> I don't know if this was intended but I'd advocate for this behaviour to 
> remain the same. MapType is probably wasteful when your key names never 
> change and switching to Python tuples would be cumbersome.
> Here is a minimal script to reproduce the issue: 
> {code}
> from pyspark import SparkContext
> from pyspark.sql import types as SparkTypes
> from pyspark.sql import SQLContext
> sc = SparkContext()
> sqlc = SQLContext(sc)
> struct_schema = SparkTypes.StructType([
>     SparkTypes.StructField("id", SparkTypes.LongType())
> ])
> rdd = sc.parallelize([{"id": 0}, {"id": 1}])
> df = sqlc.createDataFrame(rdd, struct_schema)
> print(df.collect())
> # 1.6.2 prints [Row(id=0), Row(id=1)]
> # 2.0.0-rc5 raises TypeError: StructType can not accept object {'id': 0} in 
> # type <type 'dict'>
> {code}
> Thanks!
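
For readers hitting this on 2.0.0, one possible workaround until a fixed release is available is to convert each dict to a tuple ordered by the schema's field names before calling createDataFrame. This is a minimal sketch, not the fix committed for this issue: `dict_to_tuple` is a hypothetical helper, and the Spark calls are left in comments (assuming `sc`, `sqlc` and `struct_schema` from the script above) so the snippet runs without a cluster.

```python
# Hypothetical helper (not part of PySpark): order a dict's values by the
# schema's field names so each record becomes a plain tuple.
def dict_to_tuple(record, field_names):
    # Keys missing from the record become None.
    return tuple(record.get(name) for name in field_names)


records = [{"id": 0}, {"id": 1}]
field_names = ["id"]  # assumed to match struct_schema.names in the script above
rows = [dict_to_tuple(r, field_names) for r in records]
print(rows)  # [(0,), (1,)]

# With Spark available, the same mapping could be applied to the RDD
# (assumption: tuples in schema order are accepted by StructType):
#   names = struct_schema.names
#   rdd = sc.parallelize(records).map(lambda r: dict_to_tuple(r, names))
#   df = sqlc.createDataFrame(rdd, struct_schema)
```

The helper keeps the fixed key names in one place, which avoids hand-writing tuples at every call site.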



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-08-16 Thread Josh Rosen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen updated SPARK-16700:
---
Labels: releasenotes  (was: )

> StructType doesn't accept Python dicts anymore
> --
>
> Key: SPARK-16700
> URL: https://issues.apache.org/jira/browse/SPARK-16700
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.0.0
> Reporter: Sylvain Zimmer
> Assignee: Davies Liu
> Labels: releasenotes
> Fix For: 2.1.0

[jira] [Updated] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-07-24 Thread Sylvain Zimmer (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Zimmer updated SPARK-16700:
---
Component/s: (was: Spark Core)

> StructType doesn't accept Python dicts anymore
> --
>
> Key: SPARK-16700
> URL: https://issues.apache.org/jira/browse/SPARK-16700
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.0.0
> Reporter: Sylvain Zimmer

[jira] [Updated] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-07-24 Thread Sylvain Zimmer (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Zimmer updated SPARK-16700:
---
Description: 
Hello,

I found this issue while testing my codebase with 2.0.0-rc5.

StructType in Spark 1.6.2 accepts the Python dict type, which is very handy. 
2.0.0-rc5 does not and throws an error.

I don't know if this was intended but I'd advocate for this behaviour to remain 
the same. MapType is probably wasteful when your key names never change and 
switching to Python tuples would be cumbersome.

Here is a minimal script to reproduce the issue: 

{code}
from pyspark import SparkContext
from pyspark.sql import types as SparkTypes
from pyspark.sql import SQLContext


sc = SparkContext()
sqlc = SQLContext(sc)

struct_schema = SparkTypes.StructType([
    SparkTypes.StructField("id", SparkTypes.LongType())
])

rdd = sc.parallelize([{"id": 0}, {"id": 1}])

df = sqlc.createDataFrame(rdd, struct_schema)

print(df.collect())

# 1.6.2 prints [Row(id=0), Row(id=1)]

# 2.0.0-rc5 raises TypeError: StructType can not accept object {'id': 0} in 
# type <type 'dict'>

{code}

Thanks!

  was:
Hello,

I found this issue while testing my codebase with 2.0.0-rc5.

StructType in Spark 1.6.2 accepts the Python dict type, which is very handy. 
2.0.0-rc5 does not and throws an error.

I don't know if this was intended but I'd advocate for this behaviour to remain 
the same. MapType is probably wasteful when your key names never change and 
switching to Python tuples would be cumbersome.

Here is a minimal script to reproduce the issue: 

{code:python}
from pyspark import SparkContext
from pyspark.sql import types as SparkTypes
from pyspark.sql import SQLContext


sc = SparkContext()
sqlc = SQLContext(sc)

struct_schema = SparkTypes.StructType([
    SparkTypes.StructField("id", SparkTypes.LongType())
])

rdd = sc.parallelize([{"id": 0}, {"id": 1}])

df = sqlc.createDataFrame(rdd, struct_schema)

print(df.collect())

# 1.6.2 prints [Row(id=0), Row(id=1)]

# 2.0.0-rc5 raises TypeError: StructType can not accept object {'id': 0} in 
# type <type 'dict'>

{code}

Thanks!


> StructType doesn't accept Python dicts anymore
> --
>
> Key: SPARK-16700
> URL: https://issues.apache.org/jira/browse/SPARK-16700
> Project: Spark
> Issue Type: Bug
> Components: PySpark, Spark Core
> Affects Versions: 2.0.0
> Reporter: Sylvain Zimmer

[jira] [Updated] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-07-24 Thread Sylvain Zimmer (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Zimmer updated SPARK-16700:
---
Component/s: PySpark

> StructType doesn't accept Python dicts anymore
> --
>
> Key: SPARK-16700
> URL: https://issues.apache.org/jira/browse/SPARK-16700
> Project: Spark
> Issue Type: Bug
> Components: PySpark, Spark Core
> Affects Versions: 2.0.0
> Reporter: Sylvain Zimmer