OK, I think you are missing this part
<https://apacheignite-fs.readme.io/docs/file-system#section-high-availability-igfs-client>:
even though your Spark compute nodes have no local Ignite node running, they
still need the Ignite configuration and the Ignite dependencies on Hadoop's
classpath. To make your Spark nodes connect to the remote IGFS cluster, do the
following:
- Copy your Ignite configuration file to all Spark nodes. Make sure the IGFS
TCP endpoint in it is configured to point to any remote Ignite node (replace
myIgfs with your Ignite Hadoop file system name):
<property name="name" value="myIgfs"/>
<property name="ipcEndpointConfiguration">
<bean class="org.apache.ignite.igfs.IgfsIpcEndpointConfiguration">
<property name="type" value="TCP" />
<property name="host" value="__IGNITE_NODE_IP__" />
<property name="port" value="10500" />
</bean>
</property>
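
For orientation, here is a minimal sketch of where that snippet sits in the
Ignite Spring config. Only the name and ipcEndpointConfiguration parts come
from the snippet above; the surrounding IgniteConfiguration /
fileSystemConfiguration wrapper is the usual structure, shown here just for
context:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="fileSystemConfiguration">
        <list>
            <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
                <!-- IGFS name referenced from the igfs:// URI -->
                <property name="name" value="myIgfs"/>
                <!-- TCP endpoint the Hadoop/Spark client connects to -->
                <property name="ipcEndpointConfiguration">
                    <bean class="org.apache.ignite.igfs.IgfsIpcEndpointConfiguration">
                        <property name="type" value="TCP"/>
                        <property name="host" value="__IGNITE_NODE_IP__"/>
                        <property name="port" value="10500"/>
                    </bean>
                </property>
            </bean>
        </list>
    </property>
</bean>
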
- Update Hadoop's core-site.xml so Hadoop knows the IGFS cluster is remote and
where its configuration lives (again, replace myIgfs with your Ignite Hadoop
file system name):
<!-- Indicate Ignite is remote (do not start an embedded Ignite node) -->
<property>
    <name>fs.igfs.myIgfs@.endpoint.no_embed</name>
    <value>true</value>
</property>
<!-- Remote Ignite cluster: path to the Ignite configuration file -->
<property>
    <name>fs.igfs.myIgfs@.config_path</name>
    <value>PATH-TO-YOUR-IGNITE-CONFIG.xml</value>
</property>
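
Also make sure core-site.xml maps the igfs:// scheme to Ignite's Hadoop file
system classes, in case it does not already. A minimal sketch using the
standard Ignite Hadoop Accelerator class names, to sit next to the properties
above:

<property>
    <name>fs.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
</property>
<property>
    <name>fs.AbstractFileSystem.igfs.impl</name>
    <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
</property>
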
- Hadoop must be able to parse the Ignite config and initialize the IGFS
client. Make sure you have the following modules on Hadoop's CLASSPATH:
- ignite-core-???.jar
- cache-api-???.jar
- ignite-hadoop-???.jar
- asm-all-???.jar
- ignite-spring-???.jar
- spring-core-???.jar
- spring-beans-???.jar
- spring-context-???.jar
- spring-expression-???.jar
Now you can use the Hadoop URI "igfs://myIgfs@/" on the Spark nodes (replace
myIgfs with your Ignite Hadoop file system name). Ignite will load balance the
client connections.
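
Optionally, if you also want plain (schemeless) paths on the Spark nodes to
resolve to IGFS, you can point the default file system at the same URI in
core-site.xml. This is just an illustration based on the URI above; skip it if
HDFS should stay your default file system:

<!-- Optional: make IGFS the default file system -->
<property>
    <name>fs.defaultFS</name>
    <value>igfs://myIgfs@/</value>
</property>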