Amazon hadoop distribution

José Luis Larroque Sun, 18 Oct 2015 17:11:42 -0700

Thanks for your answer Anders.

- The amount of data that I'm going to manipulate is about the size of
Wikipedia (I will use a dump).
- I already have the basics of Hadoop (I hope): I have a local multi-node
cluster set up and I have already run some algorithms on it.
- Because the amount of data is significant, I believe that I should use
several nodes.


Maybe another thing to consider is that I'm running Giraph on top of the
selected Hadoop distribution/EC2.

Bye!
Jose

2015-10-18 18:53 GMT-03:00 Anders Nielsen <[email protected]>:

> Dear Jose,
>
> It will help people answer your question if you specify your goals:
>
> - If you are doing it to learn how to USE a running Hadoop, then go for
> one of the prebuilt distributions (Amazon or MapR).
> - If you are doing it to learn more about setting up and administering
> Hadoop, then you are better off setting everything up from scratch on EC2.
> - Do you need to run on many nodes, or just one node to test some
> MapReduce scripts on a small data set?
>
> Regards,
>
> Anders
>
>
>
>
> On Sun, Oct 18, 2015 at 10:03 PM, José Luis Larroque <
> [email protected]> wrote:
>
>> Hi all !
>>
>> I have started to use Hadoop with AWS, and a big question has come up.
>>
>> I'm using a MapR distribution for Hadoop 2.4.0 in AWS. I have already
>> tried some trivial examples, and before moving forward I have one question.
>>
>> What is the best option for using Hadoop on AWS?
>> - Build it from scratch on an EC2 instance
>> - Use the MapR distribution of Hadoop
>> - Use the Amazon distribution of Hadoop
>>
>> Sorry if my question is too broad.
>>
>> Bye!
>> Jose
>>
>

Re: Use of hadoop in AWS - Build it from scratch on a EC2 instance / MapR hadoop distribution / Amazon hadoop distribution
