Hi,

I want to open up a discussion about adding GridGain Hadoop Accelerator as a feature of Apache Ignite.

As some of you may know, Hadoop Accelerator is now offered as part of the GridGain open source edition. It is built on top of the Ignite In-Memory Data Fabric technology and provides plug-n-play acceleration for Hadoop. It has also recently been integrated with Apache BigTop.

The acceleration is achieved by providing the following Hadoop components in memory:

- IgniteFS, an in-memory Hadoop-compliant file system, which natively plugs into Hadoop and is built on top of the Ignite data grid.
- Ignite MapReduce, a very fast Hadoop MapReduce implementation, which is built on top of the Ignite computation framework.

I anticipate that some questions will arise around how Ignite Hadoop Accelerator differs from Apache Spark. The reality is that they are very different. One of the main differences is that Ignite Hadoop Accelerator will accelerate existing Hadoop MapReduce computations that run natively on Hadoop, while Spark essentially takes you off Hadoop MapReduce and into its own DSL. Additionally, because of the fast MapReduce implementation, Ignite Hadoop Accelerator will also accelerate native Hive queries, while Spark provides its own SQL engine.

More information about Hadoop Accelerator can be found here: http://hadoop.gridgain.org/

Please let me know your thoughts.
