That would be great to add. Right now it would be easy to change it to use another Hadoop FileSystem implementation at the very least (I think you can just pass the URL for that), but for Cassandra you’d have to use a different InputFormat or some direct Cassandra access API.

Matei

On Jan 28, 2014, at 5:02 PM, Evan Chan <e...@ooyala.com> wrote:

> By the way, is there any plan to make a pluggable backend for
> checkpointing? We might be interested in writing a, for example,
> Cassandra backend.
>
> On Sat, Jan 25, 2014 at 9:49 PM, Xia, Junluan <junluan....@intel.com> wrote:
>> Hi all
>>
>> The description about this Bug submitted by Matei is as following
>>
>>
>> The tipping point seems to be around 50. We should fix this by checkpointing
>> the RDDs every 10-20 iterations to break the lineage chain, but
>> checkpointing currently requires HDFS installed, which not all users will
>> have.
>>
>> We might also be able to fix DAGScheduler to not be recursive.
>>
>>
>> regards,
>> Andrew
>>
>
> --
> Evan Chan
> Staff Engineer
> e...@ooyala.com
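
For anyone following the thread, here is a rough toy sketch of the two fixes mentioned in the quoted bug description: breaking the lineage chain every few iterations, and walking the chain without recursion. It is plain Python, not Spark code. `Node`, `build_chain`, `depth_recursive` and `depth_iterative` are hypothetical names made up for illustration, and dropping the parent link only stands in for what a real checkpoint would do. The depth at which it fails depends on Python's recursion limit and says nothing about the "around 50" figure observed in Spark.

```python
import sys


class Node:
    """One step in a lineage chain; `parent` is the step it was derived from."""

    def __init__(self, parent=None):
        self.parent = parent


def depth_recursive(node):
    # Recursive walk: uses one stack frame per ancestor, so a long
    # enough chain exhausts the stack.
    if node is None:
        return 0
    return 1 + depth_recursive(node.parent)


def depth_iterative(node):
    # Same result computed with a loop, so stack use stays constant.
    depth = 0
    while node is not None:
        depth += 1
        node = node.parent
    return depth


def build_chain(iterations, checkpoint_every=None):
    """Build a chain with one node per iteration.

    If checkpoint_every is set, the parent link is dropped every that many
    iterations, standing in for a checkpoint that truncates the lineage.
    """
    node = None
    for i in range(1, iterations + 1):
        node = Node(node)
        if checkpoint_every and i % checkpoint_every == 0:
            node.parent = None  # lineage now starts at the "checkpoint"
    return node


if __name__ == "__main__":
    n = sys.getrecursionlimit() + 100
    long_chain = build_chain(n)
    try:
        depth_recursive(long_chain)
    except RecursionError:
        print("recursive walk failed on a chain of", n)
    print("iterative walk:", depth_iterative(long_chain))
    short_chain = build_chain(n, checkpoint_every=10)
    print("recursive walk after truncation:", depth_recursive(short_chain))
```

Either change on its own is enough in this toy: truncating every 10 iterations keeps the chain at most 10 deep, and the loop version handles any depth.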