[
https://issues.apache.org/jira/browse/SPARK-35337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Takuya Ueshin updated SPARK-35337:
----------------------------------
Summary: pandas API on Spark: Separate basic operations into data type
based structures (was: pandas APIs on Spark: Separate basic operations into
data type based structures)
> pandas API on Spark: Separate basic operations into data type based structures
> ------------------------------------------------------------------------------
>
> Key: SPARK-35337
> URL: https://issues.apache.org/jira/browse/SPARK-35337
> Project: Spark
> Issue Type: Umbrella
> Components: PySpark
> Affects Versions: 3.2.0
> Reporter: Xinrong Meng
> Assignee: Xinrong Meng
> Priority: Major
>
> Currently, each basic operation is defined in a single function that handles
> every data type, so it is difficult to extend or change its behavior for a
> particular data type. For example, the binary operation Series + Series
> behaves differently depending on the data type: it adds numeric operands,
> concatenates string operands, and so on. These differences are implemented
> with if-else branches inside the function, which makes the logic messy and
> difficult to maintain or reuse.
> We should provide an infrastructure to manage the differences in these
> operations.
> Please refer to [pandas APIs on Spark: Separate basic operations into data
> type based
> structures|https://docs.google.com/document/d/12MS6xK0hETYmrcl5b9pX5lgV4FmGVfpmcSKq--_oQlc/edit?usp=sharing]
> for details.
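
The description above can be illustrated with a minimal sketch of per-data-type
dispatch. This is not the design from the linked document: the names TypeOps,
NumericOps, StringOps, ops_for and series_add are hypothetical, and plain
Python lists stand in for Series so the example runs without Spark.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class TypeOps(ABC):
    """Hypothetical base class: one subclass per data type."""

    @abstractmethod
    def add(self, left: List[Any], right: List[Any]) -> List[Any]:
        ...


class NumericOps(TypeOps):
    def add(self, left: List[Any], right: List[Any]) -> List[Any]:
        # Numeric operands: element-wise addition.
        return [a + b for a, b in zip(left, right)]


class StringOps(TypeOps):
    def add(self, left: List[Any], right: List[Any]) -> List[Any]:
        # String operands: element-wise concatenation.
        if not all(isinstance(v, str) for v in right):
            raise TypeError("string addition requires string operands")
        return [a + b for a, b in zip(left, right)]


def ops_for(values: List[Any]) -> TypeOps:
    """Pick the operations object from the data type of the values.

    This lookup is the only place that inspects the type, replacing the
    if-else branches that would otherwise sit inside every operation.
    """
    if values and all(isinstance(v, str) for v in values):
        return StringOps()
    if values and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    ):
        return NumericOps()
    raise TypeError("unsupported data type in this sketch")


def series_add(left: List[Any], right: List[Any]) -> List[Any]:
    # The operation itself no longer branches on the data type.
    return ops_for(left).add(left, right)


print(series_add([1, 2], [3, 4]))          # [4, 6]
print(series_add(["a", "b"], ["c", "d"]))  # ['ac', 'bd']
```

Under these assumptions, supporting another data type means adding one
subclass and one lookup rule, without touching the existing operations.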