GitHub user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/17226
> Spark 2.0+ with PivotFirst gives a NPE when one of the pivot column
> values is null. The main thing fixed in this PR.
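For context, a minimal sketch of the kind of query being discussed: a pivot whose pivot column contains a null, aggregated with `sum` so the PivotFirst path can be selected. The DataFrame contents and names below are illustrative assumptions, not taken from the PR, and whether this exact snippet triggers the NPE depends on the Spark version in use.

```scala
import org.apache.spark.sql.SparkSession

object PivotNullSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-null-sketch")
      .getOrCreate()
    import spark.implicits._

    // One row has a null in the pivot column ("key").
    val df = Seq(
      ("2017", Option("a"), 1),
      ("2017", Option.empty[String], 2), // null pivot value
      ("2018", Option("a"), 3)
    ).toDF("year", "key", "value")

    // Pivoting on a column that contains a null, with a sum aggregate,
    // is the scenario the quoted comment reports as hitting PivotFirst
    // and throwing a NullPointerException on Spark 2.0+.
    df.groupBy("year").pivot("key").sum("value").show()

    spark.stop()
  }
}
```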
I meant to say it is not fully fixed: it no longer throws the NPE, but it
now introduces a regression.
Why don't we fix the NPE and resolve the regression first, and then add the
optimization for it?
I think we have two options:
- Deal with this case in both the optimization path and the non-optimization
path; the regression/inconsistency this introduces between them should be a
separate JIRA. In this case, let me close mine.
- Fall back to the non-optimization path and leave a JIRA to add this to the
optimized path. In this case, I think this PR should be closed.