Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18945
Thanks for clarifying @HyukjinKwon, I see what you mean now. Since pandas
will iterate over `self.collect()` anyway, I don't think your solution would
impact performance at all, right? So your way might be better, but it is
slightly more complicated.
Just to sum things up - @logannc does this still meet your requirements?
Instead of having the `strict = True` option we do the following:
```
for each nullable int32 column:
    if there are null values:
        change column type to float32
    else:
        change column type to int32
```
I'm also guessing we will have the same problem with nullable ShortType -
maybe others?
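
To make the rule above concrete, here is a minimal sketch in plain Python. It is not the code in the PR: the function name `choose_int32_dtypes` is hypothetical, and the rows are plain dicts standing in for whatever `self.collect()` returns. It only shows the decision made in a single pass over the rows.

```python
def choose_int32_dtypes(rows, nullable_int32_columns):
    """Pick a dtype name for each nullable int32 column.

    rows: iterable of dicts mapping column name to value (an assumed
        stand-in for the collected rows; None marks a null).
    nullable_int32_columns: names of the columns to inspect.
    """
    has_null = {name: False for name in nullable_int32_columns}
    # Single pass over the rows, recording which columns contain a null.
    for row in rows:
        for name in nullable_int32_columns:
            if row[name] is None:
                has_null[name] = True
    # Columns with nulls fall back to float32, the rest stay int32.
    return {
        name: "float32" if found else "int32"
        for name, found in has_null.items()
    }


rows = [{"a": 1, "b": None}, {"a": 2, "b": 3}]
print(choose_int32_dtypes(rows, ["a", "b"]))
```

The same check would presumably carry over to other nullable integer types such as ShortType by passing in a different target dtype, but that is an assumption and not something settled in this thread.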