Hi Everyone,
I'm working on getting a MongoDB Source working for crunch as a holiday
project. Luckily there is already a MongoInputFormat provided by the
mongo-hadoop project. I tried to follow the example of the JDBC input in
crunch-contrib, but I couldn't quite get things working. I'm inheriting from
FileTableSourceImpl and creating my own FormatBundle, since MongoInputFormat
extends InputFormat as opposed to FileInputFormat, which is what most of the
crunch source classes expect. Anyway, I can't seem to get the darn thing to
work. I have a feeling that CrunchInputs and friends expect the Source to be
backed by some kind of file in HDFS, and in the case of the MongoInputFormat
it's a connection over the network to a mongo server, so something isn't quite
working. Any pointers to which interfaces and classes I should base my
implementation on, or which methods I should override, would be much
appreciated.
Thanks!
-Danny