2015-07-20 23:29 GMT-07:00 Matei Zaharia <matei.zaha...@gmail.com>:

> I agree with this -- basically, to build on Reynold's point, you should be
> able to get almost the same performance by implementing either the Hadoop
> FileSystem API or the Spark Data Source API over Ignite in the right way.
> This would let people save data persistently in Ignite in addition to using
> it for caching, and it would provide a global namespace, optionally a
> schema, etc. You can still provide data locality, short-circuit reads, etc
> with these APIs.
>

Absolutely agree.

In fact, Ignite already provides a shared RDD implementation that is
essentially a live view of Ignite cache data. This implementation adheres to
the Spark RDD API and also exposes the data to Spark SQL/DataFrames. More
information can be found here:
http://ignite.incubator.apache.org/features/igniterdd.html
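
To make that concrete, here is a rough sketch of what using the shared RDD
looks like. This assumes the ignite-spark module is on the classpath; the
cache name "sharedWords" is illustrative, and exact constructor signatures
vary between Ignite releases, so treat this as a sketch rather than a
copy-paste recipe:

```scala
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.IgniteContext
import org.apache.spark.{SparkConf, SparkContext}

object IgniteRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ignite-rdd-demo"))

    // IgniteContext wraps the SparkContext and a closure producing the
    // Ignite node configuration (default config here for brevity).
    val ic = new IgniteContext(sc, () => new IgniteConfiguration())

    // IgniteRDD is a live view over the named Ignite cache; the cache
    // name "sharedWords" is an assumption for this example.
    val sharedRdd = ic.fromCache[String, Int]("sharedWords")

    // Writes go straight into the cache and are immediately visible to
    // other Spark jobs sharing the same Ignite cluster.
    sharedRdd.savePairs(sc.parallelize(Seq("a" -> 1, "b" -> 2)))

    // Reads see the current cache contents, not a snapshot taken at
    // RDD creation time.
    println(sharedRdd.count())
  }
}
```

The point of the design is that the RDD outlives any single Spark job: state
written by one application can be read by another without going to disk.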

Also, Ignite's in-memory file system (IGFS) is compliant with the Hadoop
FileSystem API and can transparently replace HDFS if needed. Plugging it into
Spark should be fairly easy. More information can be found here:
http://ignite.incubator.apache.org/features/igfs.html
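
Since IGFS speaks the Hadoop FileSystem API, "plugging it into Spark" mostly
amounts to registering the file-system implementation and using igfs:// URIs.
A minimal sketch, assuming the ignite-hadoop module is on the classpath and
an Ignite node is running locally (the authority "igfs@localhost" and the
paths are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IgfsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("igfs-demo"))

    // Register the IGFS implementation for the igfs:// scheme; this can
    // equivalently live in core-site.xml.
    sc.hadoopConfiguration.set(
      "fs.igfs.impl",
      "org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem")

    // Because IGFS implements the Hadoop FileSystem API, ordinary Spark
    // I/O works unchanged -- only the URI scheme differs from HDFS.
    val lines = sc.textFile("igfs://igfs@localhost/data/input.txt")
    lines.saveAsTextFile("igfs://igfs@localhost/data/output")
  }
}
```

This is exactly the "implement the Hadoop FileSystem API over Ignite" path
Matei describes above, so locality and short-circuit reads stay available to
Spark without any Spark-side changes.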

--Alexey
