[ https://issues.apache.org/jira/browse/BEAM-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hulette updated BEAM-11931:
---------------------------------
Resolution: Won't Fix
Status: Resolved (was: Open)
This is obsolete. str.split(expand=True) won't be supported anyway because it
produces non-deferred column names: the columns of the result depend on the
data.
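
To illustrate the point about column names: with expand=True the result has one column per token in the longest row, so the labels are only known once the data has been seen. Below is a minimal pure-Python sketch of that dependency. The helper `expanded_column_labels` is hypothetical and only models the behaviour; it is not part of pandas or Beam.

```python
def expanded_column_labels(values, sep=None):
    """Model the column labels that split(expand=True) would produce.

    Hypothetical helper: one integer label per token in the longest row.
    Non-string entries (e.g. NaN) contribute no tokens.
    """
    widths = [len(v.split(sep)) for v in values if isinstance(v, str)]
    return list(range(max(widths, default=0)))


# Same operation, different data, different set of columns.
short_labels = expanded_column_labels(["a b", "c"])
long_labels = expanded_column_labels(["a b c d", "e"])
print(short_labels)  # [0, 1]
print(long_labels)   # [0, 1, 2, 3]
```

Since the labels cannot be computed from the operation alone, a deferred DataFrame has no way to declare them before execution.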
> str.split(expand=True) doesn't correctly produce None
> -----------------------------------------------------
>
> Key: BEAM-11931
> URL: https://issues.apache.org/jira/browse/BEAM-11931
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Brian Hulette
> Priority: P3
> Labels: dataframe-api
>
> series.str.split(expand=True) and rsplit(expand=True) usually produce None
> for missing values:
> {code}
> >>> s.str.split(expand=True)
>                                                0     1     2        3         4
> 0                                           this    is     a  regular  sentence
> 1  https://docs.python.org/3/tutorial/index.html  None  None     None      None
> 2                                            NaN   NaN   NaN      NaN       NaN
> {code}
> NaNs are only produced for invalid inputs, such as the NaN entry in row 2.
> Our implementation instead populates missing values with NaN, because they
> are added in the final pd.concat.
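
The difference described in the issue can be modelled without pandas. The sketch below is an approximation under stated assumptions, not Beam's actual code: `split_expand` and `concat_fill` are hypothetical helpers, and it is assumed that the pieces being concatenated were split separately and therefore have different widths.

```python
import math

NAN = float("nan")


def split_expand(values, sep=None):
    """Model of the pandas behaviour quoted above (not pandas itself)."""
    rows = [v.split(sep) if isinstance(v, str) else None for v in values]
    width = max((len(r) for r in rows if r is not None), default=1)
    out = []
    for r in rows:
        if r is None:
            out.append([NAN] * width)  # invalid input: NaN in every column
        else:
            out.append(r + [None] * (width - len(r)))  # short row: pad with None
    return out


def concat_fill(chunks):
    """Model of concatenating separately split pieces: gaps become NaN."""
    width = max(len(row) for chunk in chunks for row in chunk)
    return [row + [NAN] * (width - len(row)) for chunk in chunks for row in chunk]


url = "https://docs.python.org/3/tutorial/index.html"
sentence = "this is a regular sentence"

# Splitting everything at once pads the short row with None.
whole = split_expand([sentence, url, NAN])
print(whole[1])  # URL followed by four None values

# Splitting each piece on its own and concatenating pads with NaN instead.
parts = concat_fill([split_expand([sentence]), split_expand([url])])
print(parts[1])  # URL followed by four NaN values
```

In this model the two results have the same shape and differ only in the filler value, which is the mismatch the issue reports.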