Hi,

We're starting to build an analytics framework for our wellness service. While our data is not yet Big, we'd like a framework that will scale as needed, and Spark seems to be the best option around.
I'm new to Hadoop and Spark, and I'm having difficulty figuring out how to use Spark with MongoDB. Apparently the mongo-hadoop connector (https://github.com/mongodb/mongo-hadoop) can also be used with Spark, but I haven't figured out how. I've run through the Spark tutorials and have set up a single-machine Hadoop system with the MongoDB connector, following the instructions at http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ and http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-hadoop/

Could someone give some instructions or pointers on how to configure and use the mongo-hadoop connector with Spark? I haven't been able to find any documentation on this.

Thanks.

Best regards,
Sampo N.
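P.S. For reference, here is my best guess at what this might look like, pieced together from the Spark API docs and the connector's MapReduce examples. The config key, class names, and the "wellness.events" database/collection are my assumptions, so they may well be wrong — corrections welcome:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

object MongoSparkGuess {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MongoSparkGuess"))

    // Point the input format at a collection, as in the connector's
    // MapReduce examples; "wellness.events" is a made-up example here.
    val mongoConfig = new Configuration()
    mongoConfig.set("mongo.input.uri",
      "mongodb://localhost:27017/wellness.events")

    // newAPIHadoopRDD accepts any new-API Hadoop InputFormat, so
    // presumably MongoInputFormat can be plugged in here too.
    val documents = sc.newAPIHadoopRDD(
      mongoConfig,
      classOf[MongoInputFormat],
      classOf[Object],      // key: the document's ObjectId (I think)
      classOf[BSONObject])  // value: the document itself

    println(s"Read ${documents.count()} documents from MongoDB")
    sc.stop()
  }
}
```

Is this roughly the right approach, or is there a dedicated Spark API in the connector that I've missed?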
