I agree that we can’t keep adding saveAs methods to the core API, partly because
it will get unwieldy to maintain and partly because each storage system will
bring in lots of dependencies. Instead, we can have helper classes in separate
modules, one for each storage system. There’s some discussion on this at
https://spark-project.atlassian.net/browse/SPARK-1127.
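
To make the idea concrete, here is a rough Python sketch of one generic write
path with storage-specific helpers plugged into it. Everything in it is
hypothetical: Sink, InMemorySink and save_to are illustrative names, not Spark
APIs, and a plain list of lists stands in for an RDD's partitions. In Scala the
abstract class would more naturally be a trait.

```python
from abc import ABC, abstractmethod
from typing import Generic, Iterable, List, TypeVar

T = TypeVar("T")


class Sink(ABC, Generic[T]):
    """Hypothetical write target; an assumption, not part of Spark's API."""

    @abstractmethod
    def write_partition(self, index: int, records: Iterable[T]) -> None:
        """Persist the records of one partition."""


class InMemorySink(Sink[T]):
    """Toy sink standing in for a storage-specific helper class."""

    def __init__(self) -> None:
        self.written: List[T] = []

    def write_partition(self, index: int, records: Iterable[T]) -> None:
        # A real helper would open a connection or file here instead.
        self.written.extend(records)


def save_to(partitions: Iterable[Iterable[T]], sink: Sink[T]) -> None:
    """Single generic entry point in place of one saveAs method per system.

    `partitions` is an iterable of iterables used as a stand-in for an RDD.
    """
    for index, records in enumerate(partitions):
        sink.write_partition(index, records)


sink = InMemorySink()
save_to([[1, 2], [3]], sink)
```

A real helper for a given storage system would live in its own module and pull
in its own dependencies, so the core API would only need the one entry point.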
Matei
On Mar 11, 2014, at 9:06 AM, Koert Kuipers wrote:
> I find the current design to write RDDs to disk (or a database, etc.) kind of
> ugly. It will lead to a proliferation of saveAs methods. A better abstraction
> would be nice (perhaps a Sink trait to write to).
>