Oh yeah, I always forget about HBase. I'll take a look there. Thanks, Josh!

Date: Thu, 25 Dec 2014 05:13:28 -0800
Subject: Re: MongoDB Input Format
From: [email protected]
To: [email protected]

Hey Danny,

Maybe look at the sources we use for reading from HBase tables in crunch-hbase? I agree that extending one of the File Source impl classes probably isn't the right thing to do.

J

On Dec 24, 2014 8:13 PM, "Danny Morgan" <[email protected]> wrote:

Hi Everyone,

I'm working on getting a MongoDB Source working for Crunch as a holiday project. Luckily, there is already a MongoInputFormat provided by the mongo-hadoop project. I tried to follow the example of the JDBC input in crunch-contrib, but I couldn't quite get things working.

I'm inheriting from FileTableSourceImpl and creating my own FormatBundle, since MongoInputFormat extends InputFormat rather than FileInputFormat, which is what most of the Crunch source classes expect. Anyway, I can't seem to get the darn thing to work. I have a feeling that CrunchInputs and friends expect the Source to be backed by some kind of file in HDFS, whereas MongoInputFormat reads over a network connection to a Mongo server, so something isn't quite working.

Any pointers to which interfaces and classes I should base my implementation on, or which methods I should override, would be much appreciated.

Thanks!
-Danny
