Re: DataCreator

Ted Dunning Wed, 16 Feb 2011 07:19:25 -0800

Sounds like Pig.  Or Cascading.  Or Hive.

Seriously, isn't this already available?


On Wed, Feb 16, 2011 at 7:06 AM, Guy Doulberg <[email protected]> wrote:

>
> Hey all,
> I want to consult with you Hadoopers about a Map/Reduce application I want
> to build.
>
> I want to build a Map/Reduce job that reads files from HDFS, performs some
> sort of transformation on the file lines, and stores them in several
> partitions depending on the source of the file or its data.
>
> I want this application to be as configurable as possible, so I designed
> interfaces to parse, decorate and partition (on HDFS) the data.
>
> I want to be able to configure different data flows, with different
> parsers, decorators and partitioners, using a config file.
>
> Do you think you would use such an application? Would it fit as an
> open-source project?
>
> Now, I have some technical questions:
> I was thinking of using reflection to load all the classes I would need,
> according to the configuration, during the setup process of the Mapper.
> Do you think it is a good idea?
>
> Is there a way to send the Mapper objects or interfaces from the Job
> declaration?
>
>
>
>  Thanks,
>
>
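
For the reflection question quoted above, here is a minimal sketch of the
idea, not a definitive implementation. It assumes the flow is described by
a comma-separated list of class names under a made-up key,
"flow.decorators". The Decorator interface, the two sample decorators and
the class name ReflectiveFlow are all hypothetical, and java.util.Properties
stands in for the Hadoop job Configuration so the example runs without
Hadoop on the classpath. In a real job the same lookup would presumably sit
in the Mapper's setup(Context) method and read from
context.getConfiguration(). The job configuration is what gets shipped to
the tasks, so passing class names through it is the usual substitute for
handing live objects to the Mapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ReflectiveFlow {

    /** Hypothetical stand-in for the "Decorate" interface in the post. */
    public interface Decorator {
        String decorate(String line);
    }

    /** Sample decorator: strips leading and trailing whitespace. */
    public static class TrimDecorator implements Decorator {
        public String decorate(String line) { return line.trim(); }
    }

    /** Sample decorator: upper-cases the line. */
    public static class UpperCaseDecorator implements Decorator {
        public String decorate(String line) { return line.toUpperCase(); }
    }

    /**
     * Reads class names from the (assumed) "flow.decorators" key and
     * instantiates each one through its no-argument constructor.
     */
    public static List<Decorator> loadDecorators(Properties conf)
            throws ReflectiveOperationException {
        List<Decorator> result = new ArrayList<>();
        String names = conf.getProperty("flow.decorators", "");
        for (String name : names.split(",")) {
            name = name.trim();
            if (name.isEmpty()) {
                continue; // tolerate an empty or missing setting
            }
            // asSubclass fails fast if the class is not a Decorator.
            Class<? extends Decorator> cls =
                    Class.forName(name).asSubclass(Decorator.class);
            result.add(cls.getDeclaredConstructor().newInstance());
        }
        return result;
    }

    /** Applies the decorators to one line, in configuration order. */
    public static String apply(List<Decorator> decorators, String line) {
        for (Decorator d : decorators) {
            line = d.decorate(line);
        }
        return line;
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty("flow.decorators",
                TrimDecorator.class.getName() + ","
                        + UpperCaseDecorator.class.getName());
        System.out.println(apply(loadDecorators(conf), "  some line  "));
    }
}
```

Loading the classes once per task, rather than once per record, keeps the
reflection cost out of the map() path. Each configured class needs an
accessible no-argument constructor for this to work.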
