GitHub user srowen commented on the issue:
https://github.com/apache/spark/pull/14256
I believe SPARK-16613 is a different issue.
You reported a `StackOverflowError`, and indeed the existing `pipe` overloads
just call themselves; I can't figure out why. It was introduced in
https://github.com/apache/spark/commit/279bd4aa5fddbabdb0383a3f6f0fc8d91780e092
and, unless I'm totally missing something, it's a small but bad error: each
call recurses until the stack overflows. They need to call the main `pipe`
overload instead.
The cleanup of the `PipedRDD` constructors also dropped the `tokenize` call.
These simpler `pipe` overloads still need to invoke it.
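
To make the two problems concrete, here is a minimal sketch of the shape I have in mind. It is not the actual Spark code: `PipeSketch` is a hypothetical stand-in for `RDD`, the signatures are simplified, and the "main" overload only returns the argument list it would launch rather than building a `PipedRDD`.

```scala
import java.util.StringTokenizer
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for RDD; names and signatures are simplified assumptions.
object PipeSketch {

  // Break a command line into arguments using StringTokenizer's default
  // whitespace delimiters.
  def tokenize(command: String): Seq[String] = {
    val buf = new ArrayBuffer[String]
    val tok = new StringTokenizer(command)
    while (tok.hasMoreTokens) {
      buf += tok.nextToken()
    }
    buf.toList
  }

  // "Main" overload: takes the already tokenized command.
  // Here it simply returns the arguments it would run.
  def pipe(command: Seq[String], env: Map[String, String]): Seq[String] =
    command

  // Broken shape (do not do this): the overload calls itself and never
  // reaches the main overload, so it recurses until the stack overflows.
  //   def pipe(command: String): Seq[String] = pipe(command)

  // Fixed shape: tokenize, then delegate to the main overload.
  def pipe(command: String): Seq[String] =
    pipe(tokenize(command), Map.empty[String, String])

  def pipe(command: String, env: Map[String, String]): Seq[String] =
    pipe(tokenize(command), env)
}
```

With this shape, `PipeSketch.pipe("grep -i foo")` reaches the main overload with three separate arguments instead of recursing.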
This is certainly my fault as I was reviewing and suggested some cleanup
that ultimately led to losing this functionality.
(Also, I don't really like using `StringTokenizer` instead of just splitting
on whitespace, but that's probably not the thing to deal with now.)
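
For reference, a small standalone comparison of the two approaches. `TokenizeCompare` and its method names are made up for illustration. The two agree on simple input, but a plain `split("\\s+")` yields an empty first element when the string starts with whitespace, so a swap would need a `trim` or a filter. Neither approach understands shell-style quoting.

```scala
import java.util.StringTokenizer
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper object, for comparison only.
object TokenizeCompare {

  // StringTokenizer with its default delimiters skips leading,
  // trailing, and repeated whitespace.
  def viaTokenizer(s: String): List[String] = {
    val buf = new ArrayBuffer[String]
    val tok = new StringTokenizer(s)
    while (tok.hasMoreTokens) {
      buf += tok.nextToken()
    }
    buf.toList
  }

  // Plain regex split on runs of whitespace; leading whitespace
  // produces an empty first element.
  def viaSplit(s: String): List[String] =
    s.split("\\s+").toList
}
```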