GitHub user marmbrus commented on the issue:
https://github.com/apache/spark/pull/15197
Thanks for working on this, it does seem like it could be useful. I'm not
sure if this should go into Spark or into a separate package. It really
depends on how many people want this feature.
Regardless, a few high-level comments on this PR:
- Check out the [contributing to Spark
guide](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
Patches need to have tests and follow the style guide.
- I would not define a new `HttpDataFormat` interface. Instead I would
mandate that the input is a single string column (similar to what we do for
`df.write.text`). Users can use all of the existing DataFrame/Dataset
operations to convert their data into a string.
- It would be good to write up a short design on JIRA and debate there. A
few things that I can think of off the top of my head:
  - Should we support HTTPS too?
  - Do we need to set any headers (e.g. the batch id)?
- We'd also need to add docs for this feature.
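
To make the string-input and header questions above concrete, here is a rough sketch of what a sink might send for one batch. It is plain Python using only the standard library, not Spark code and not what this PR implements. The function name `build_batch_request`, the `X-Batch-Id` header, the newline-joined body, and the example URL are all assumptions for illustration only.

```python
from urllib.request import Request


def build_batch_request(url, rows, batch_id):
    """Build (but do not send) a POST request for one batch of string rows.

    `rows` stands in for the values of the single string column; any
    formatting (JSON, CSV, ...) is assumed to have been done upstream.
    """
    # One row per line; the body layout is an assumption, not a spec.
    body = "\n".join(rows).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "X-Batch-Id": str(batch_id),  # hypothetical header name
        },
        method="POST",
    )


# The same code path works for http:// and https:// URLs.
req = build_batch_request(
    "https://example.com/ingest",  # placeholder endpoint
    ['{"id": 1}', '{"id": 2}'],
    7,
)
```

Because the rows are already strings, the sink itself needs no format-specific interface. Whether a batch id header is actually useful to receivers is one of the things the design discussion would need to settle.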