Hey all, I want to consult with you Hadoopers about a Map/Reduce application I want to build.
I want to build a map/reduce job that reads files from HDFS, performs some transformation on the file lines, and stores the results in several partitions, depending on the source of the file or its data. I want this application to be as configurable as possible, so I designed interfaces to Parse, Decorate, and Partition (on HDFS) the data. I want to be able to configure different data flows, each with its own parsers, decorators, and partitioners, using a config file.

Do you think you would use such an application? Does it fit an open-source project?

Now, I have some technical questions: I was thinking of using reflection to load all the classes I need, according to the configuration, during the Mapper's setup process. Do you think this is a good idea? Also, is there a way to pass objects or interfaces to the Mapper from the Job declaration?

Thanks,
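P.S. To make the reflection idea concrete, here is a rough, self-contained sketch of the kind of loading I have in mind. The interface and class names (`Parser`, `UpperCaseParser`, `flow.parser.class`) are hypothetical placeholders; in the real job the class name would come from the Hadoop `Configuration` inside `Mapper.setup()` rather than from a `Properties` object:

```java
import java.util.Properties;

// Hypothetical pluggable interface; Decorator and Partitioner would look similar.
interface Parser {
    String parse(String line);
}

// One hypothetical implementation that a config file could select.
class UpperCaseParser implements Parser {
    public String parse(String line) {
        return line.toUpperCase();
    }
}

public class ReflectiveLoadDemo {
    // Load the configured Parser by class name via reflection.
    // In a real Mapper.setup() this would read
    // context.getConfiguration().get("flow.parser.class") instead,
    // or use Hadoop's ReflectionUtils.newInstance(), which does
    // essentially the same thing under the hood.
    public static Parser loadParser(Properties conf) throws Exception {
        String className = conf.getProperty("flow.parser.class");
        return (Parser) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty("flow.parser.class", "UpperCaseParser");
        Parser p = loadParser(conf);
        System.out.println(p.parse("hello hdfs"));
    }
}
```

Regarding the second question: as far as I know you cannot ship live objects from the driver to the Mapper, only strings via the `Configuration` (e.g. `conf.set("flow.parser.class", MyParser.class.getName())` in the Job declaration), which is why I'm leaning on reflection on the Mapper side.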
