Thanks again Alexey, you are right: I hadn't noticed the IGFS HA client. I'll give it a try; the embedded client shouldn't add too much overhead if I use instance types that are big enough.
Thanks again for all your help!

Greetings,
Juan

On Thu, Dec 7, 2017 at 6:41 AM, Alexey Kukushkin <[email protected]> wrote:

> Just reviewed the Remote IGFS TCP client implementation - you will get load
> balancing if you use the "igfs://myIgfs@host:port/" format, but you will not
> get high availability. That means your connection would fail if the "host"
> goes down. So you have two options:
>
> 1. High availability IGFS client
>    <https://apacheignite-fs.readme.io/docs/file-system#section-high-availability-igfs-client>
>    (I specified how to configure it in the previous post)
>    Pros: provides both high availability and load balancing. Your
>    compute nodes will continue working as long as at least one Ignite
>    node is running.
>    Cons: it will start an embedded Ignite client node inside your Spark JVM.
> 2. Remote IGFS TCP client: use the "igfs://myIgfs@host:port/" HDFS URI
>    format
>    Pros: still provides load balancing among IGFS nodes. Lightweight
>    communication with the cluster: does not start any embedded Ignite nodes.
>    Cons: no high availability. The IGFS connection is lost if the "host"
>    goes down.
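
To make the single-endpoint limitation of option 2 concrete, here is a minimal sketch that splits a URI of the "igfs://myIgfs@host:port/" form into its parts, using only the Python standard library. It does not talk to Ignite at all. The helper name `describe_igfs_uri`, the host name `ignite-node-1`, and the port `10500` are illustrative assumptions, not values from this thread, and reading `myIgfs` as the IGFS name is an interpretation of the format quoted above.

```python
from urllib.parse import urlsplit


def describe_igfs_uri(uri):
    """Split an igfs:// URI into its components (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme != "igfs":
        raise ValueError(f"not an IGFS URI: {uri!r}")
    return {
        "igfs_name": parts.username,  # the text before '@'
        "host": parts.hostname,       # exactly one endpoint is named here
        "port": parts.port,
        "path": parts.path or "/",
    }


# Host and port below are made-up example values.
info = describe_igfs_uri("igfs://myIgfs@ignite-node-1:10500/")
print(info)
```

The URI names only one host, which matches the point made above: the remote TCP client depends on that one endpoint, so the connection fails if that host goes down.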
