[ 
https://issues.apache.org/jira/browse/SPARK-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-16204:
---------------------------------
    Labels: bulk-closed  (was: )

> Row() interface
> ---------------
>
>                 Key: SPARK-16204
>                 URL: https://issues.apache.org/jira/browse/SPARK-16204
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.0.0
>            Reporter: Max Moroz
>            Priority: Trivial
>              Labels: bulk-closed
>
> Row('a', 'b') creates a Row-like class, which is slightly unexpected. To 
> create an actual Row, one needs Row(field1 = 'a', field2 = 'b'). Of course, 
> Row('a', 'b')('a', 'b') does create a row.
> I understand the logic; it's similar to namedtuple. But there's a difference 
> in that namedtuple *only* creates classes, while Row creates both Row-like 
> classes and record-like instances. 
> Wouldn't it be possible to do something slightly safer? For example, 
> expose the class-creation interface through something else, like a 
> global function, or a Row class method, or a brand new class like RowFactory? 
> Overloading __init__ to create both records and classes seems 
> unnecessarily dangerous.
> In addition, the classes created by Row('a', 'b') allow creation of invalid 
> classes (where the field names are not strings); it would be better to catch 
> that early rather than let it happen silently and then fail (like when 
> someone tries to print(Row('a', 42))).
> And finally, key in Row(field1 = 'a', field2 = 'b') seems to search through 
> the values instead of the keys, as promised in the documentation, at least in 
> 1.6.1 (admittedly the docs only mention it in 2.0.0, but I thought it was not 
> a change between the versions?).
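
The separation the reporter asks for can be sketched in plain Python without touching PySpark. The example below is only an illustration under stated assumptions: `row_factory`, `Record`, `Person`, and `alice` are hypothetical names that do not exist in PySpark, and `collections.namedtuple` stands in for a tuple-based record type. It shows class creation kept apart from record creation, field names validated at class-creation time, and why plain tuple membership checks values rather than field names.

```python
from collections import namedtuple

# namedtuple only ever builds a class; records come from calling that class.
Point = namedtuple("Point", ["x", "y"])
origin = Point(0, 0)


def row_factory(*field_names):
    """Hypothetical helper (not part of PySpark): build a record class.

    Field names are validated here, so a bad name fails at class-creation
    time instead of later, e.g. when a record is printed.
    """
    for name in field_names:
        if not isinstance(name, str):
            raise TypeError("field names must be strings, got %r" % (name,))
    return namedtuple("Record", field_names)


Person = row_factory("name", "age")   # class creation is its own step
alice = Person("Alice", 11)           # record creation is another

# A non-string field name is rejected immediately.
try:
    row_factory("a", 42)
    rejected = False
except TypeError:
    rejected = True

# Plain tuple membership looks at values, so a tuple-based record must
# consult its field names explicitly to answer "is this key present?".
value_found = "Alice" in alice                  # matches a value
name_found_in_values = "name" in alice          # field names are not values
name_found_in_fields = "name" in alice._fields  # explicit lookup by field name
```

With this split, a call such as `row_factory("a", "b")` can only ever return a class, so a class cannot be mistaken for a record. Whether PySpark should adopt a similar function, class method, or separate factory class is the open question raised in the issue.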



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)