SasanAhmadi commented on issue #16919:
URL: https://github.com/apache/airflow/issues/16919#issuecomment-877572458
I am proposing a change to the `_fix_int_dtypes` method, as below:
```
import numpy as np
import pandas as pd


def _fix_int_dtypes(df: pd.DataFrame) -> None:
    """Mutate DataFrame to set dtypes for int columns containing NaN values."""
    for col in df:
        if "float" in df[col].dtype.name and df[col].hasnans:
            # inspect values to determine if dtype of non-null values is
            # int or float
            notna_series = df[col].dropna().values
            if np.equal(notna_series, notna_series.astype(int)).all():
                # set to dtype that retains integers and supports NaNs
                df[col] = np.where(df[col].isnull(), None, df[col])
                df[col] = df[col].astype(pd.Int64Dtype())
            elif np.isclose(notna_series, notna_series.astype(int)).all():
                # values are only approximately integral, so keep them
                # as nullable floats
                df[col] = np.where(df[col].isnull(), None, df[col])
                df[col] = df[col].astype(pd.Float64Dtype())
```
This way, it correctly checks whether the non-null values are integers or
floating-point numbers and then casts the column to the matching nullable type.
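
To illustrate the check on a small frame, here is a minimal sketch. It is not the Airflow code itself: the helper name `is_integer_valued` and the column names `ids` and `ratios` are made up for this example, and it assumes a pandas version that ships `pd.Int64Dtype`.

```python
import numpy as np
import pandas as pd


def is_integer_valued(series: pd.Series) -> bool:
    """Return True if every non-null value equals its integer cast.

    Hypothetical helper for illustration; it mirrors the np.equal
    check used in the proposed _fix_int_dtypes.
    """
    values = series.dropna().to_numpy()
    return bool(np.equal(values, values.astype(int)).all())


# Sample data (assumed for the example): both columns are float64
# only because they contain NaN.
df = pd.DataFrame(
    {
        "ids": [1.0, 2.0, np.nan],
        "ratios": [0.5, 1.25, np.nan],
    }
)

ids_are_whole = is_integer_valued(df["ids"])
ratios_are_whole = is_integer_valued(df["ratios"])

# Only the integer-valued column is cast to the nullable integer dtype;
# the missing value is kept as <NA> instead of forcing floats.
converted_ids = df["ids"].astype(pd.Int64Dtype())
```

With this data, only `ids` passes the check, so `ratios` would be left as a regular float column.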