GitHub user Stibbons commented on the issue:
https://github.com/apache/spark/pull/14918
Indeed, I don't think it will be feasible to propagate the generator up to
the JVM. It would be nice, because when we have the schema there is no need to
iterate several times over the complete sequence. But at least with this patch
we can avoid one (no schema) or two (when a schema is provided) useless
iterations over the whole sequence when using a generator. With this
optimization it is still a good thing (Python is not so good, sadly). I can try
to do some performance evaluation, but not before next week.
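
To illustrate the general idea of avoiding extra passes over a generator, here
is a minimal sketch. It is not the code from this patch: `rows`, `infer_types`
and `single_pass` are hypothetical names made up for the example, and the
"schema" is only a tuple of Python type names taken from the first row.

```python
from itertools import chain


def rows():
    # Hypothetical data source. A generator can only be consumed once,
    # so every extra pass would force us to materialize it first.
    for i in range(5):
        yield (i, str(i))


def infer_types(row):
    # Hypothetical stand-in for schema inference: field types of one row.
    return tuple(type(field).__name__ for field in row)


def single_pass(source):
    """Peek at the first row to infer types, then stream the rest once."""
    it = iter(source)
    try:
        first = next(it)
    except StopIteration:
        # Empty input: nothing to infer, nothing to stream.
        return (), iter(())
    schema = infer_types(first)
    # Put the peeked row back in front so no row is lost or read twice.
    return schema, chain([first], it)


schema, stream = single_pass(rows())
data = list(stream)  # the only full traversal of the generator
```

The point of the sketch is only that peeking and chaining keeps the input lazy,
whereas calling `list()` up front and then looping again for inference and
conversion walks the whole sequence several times.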