[ 
https://issues.apache.org/jira/browse/SPARK-35337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin updated SPARK-35337:
----------------------------------
    Summary: pandas API on Spark: Separate basic operations into data type 
based structures  (was: pandas APIs on Spark: Separate basic operations into 
data type based structures)

> pandas API on Spark: Separate basic operations into data type based structures
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-35337
>                 URL: https://issues.apache.org/jira/browse/SPARK-35337
>             Project: Spark
>          Issue Type: Umbrella
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Xinrong Meng
>            Assignee: Xinrong Meng
>            Priority: Major
>
> Currently, each basic operation is defined in a single function that handles 
> all data types, so it is difficult to extend or change its behavior for a 
> particular data type. For example, the binary operation Series + Series 
> behaves differently depending on the data type: it adds numerical operands, 
> concatenates string operands, etc. These differences are implemented with 
> if-else branches inside the function, which makes the logic messy and 
> difficult to maintain or reuse.
> We should provide an infrastructure to manage the differences in these 
> operations.
> Please refer to [pandas APIs on Spark: Separate basic operations into data 
> type based 
> structures|https://docs.google.com/document/d/12MS6xK0hETYmrcl5b9pX5lgV4FmGVfpmcSKq--_oQlc/edit?usp=sharing]
>  for details.
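
To illustrate the idea described above, here is a minimal sketch in plain Python of moving per-type behavior out of one if-else function and into one class per data type. It is not the design from the linked document and not Spark code: the names TypeOps, NumericOps, StringOps, ops_for and series_add are all hypothetical, and plain lists stand in for Series.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class TypeOps(ABC):
    """Hypothetical base class: one subclass per data type."""

    @abstractmethod
    def add(self, left: List[Any], right: List[Any]) -> List[Any]:
        ...


class NumericOps(TypeOps):
    def add(self, left, right):
        # Numerical operands: element-wise arithmetic addition.
        return [a + b for a, b in zip(left, right)]


class StringOps(TypeOps):
    def add(self, left, right):
        # String operands: element-wise concatenation; reject non-strings.
        if not all(isinstance(b, str) for b in right):
            raise TypeError("string addition requires string operands")
        return [a + b for a, b in zip(left, right)]


def ops_for(values):
    """Pick the operations class from the element type (hypothetical lookup)."""
    if all(isinstance(v, str) for v in values):
        return StringOps()
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return NumericOps()
    raise TypeError("unsupported data type")


def series_add(left, right):
    # The public operation only dispatches; no per-type if-else lives here.
    return ops_for(left).add(left, right)
```

With this layout, supporting another data type would mean adding a new TypeOps subclass and one lookup rule, rather than editing every operation's if-else chain.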



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
