Hi Lahiru,

> I think we have pretty much this functionality done in a similar way to
> what you are explaining. I have added the code into trunk and will
> provide some test classes and will update the scheduler to return the
> HadoopProvider.
Yes. The Hadoop provider that you have committed more or less does the same
thing that I was planning to do :-). I believe we can make the following two
important improvements on top of that:

1. Add support for handling chains of jobs. This is different from having
   individual jobs orchestrated at the workflow level.
2. Add support for asynchronous job execution, which I believe is a
   must-have for long-running, data-intensive MapReduce jobs.

> I am +1 on enabling these APIs for use by other components, but do you
> think actual users would have a concern about the underlying library we
> use for MapReduce jobs? I am not quite confident about the way people are
> using these. But anyhow it's nice to have support for these.

They are not MapReduce frameworks. Sector/Sphere is a completely different
execution framework for data-intensive computing. Hyracks is another
data-intensive computing framework that also supports MapReduce. The idea is
to compare their performance and see which is better.

Thanks,
Danushka
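P.S. To make points (1) and (2) concrete, here is a rough conceptual sketch
in plain Java. This is just an illustration using `CompletableFuture`, not
the actual provider code or the Hadoop API; `runJob`, the job names, and the
paths are all made up. A real provider would submit the job (e.g. without
blocking on completion) and poll its status instead of computing locally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class JobChainSketch {
    // Hypothetical stand-in for submitting one MapReduce job and
    // returning its output path once it finishes. A real provider
    // would hand the work to Hadoop rather than compute it here.
    static String runJob(String name, String inputPath) {
        return inputPath + "/" + name + "-out";
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // (2) Asynchronous execution: the caller is not blocked while
        // the chain of dependent jobs runs in the background.
        CompletableFuture<String> chain = CompletableFuture
                .supplyAsync(() -> runJob("stage1", "/data/in"), pool)
                // (1) Chaining: stage2 consumes stage1's output path,
                // so the dependency lives in the provider, not the
                // workflow level.
                .thenApplyAsync(out1 -> runJob("stage2", out1), pool);

        // The caller can do other work here, then collect the result.
        System.out.println(chain.get()); // /data/in/stage1-out/stage2-out
        pool.shutdown();
    }
}
```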
