Hi,

Warning: I am a newbie myself. Please find my answers inline.
Good luck,
Olivier

On 11 January 2013 10:29, John Lilley <[email protected]> wrote:

> We are somewhat new to Hadoop and are looking to run some experiments
> with HDFS, Pig, and HBase.
>
> With that in mind, I have a few questions:
>
> What is the easiest (preferably free) Hadoop distro to get started with?
> Cloudera?

Cloudera is probably easy. I went with the Hortonworks distribution and used
their HMC (Hortonworks Management Console). It is a web UI that installs the
components you select on your behalf, and it also sets up monitoring
(Ganglia + Nagios). HMC is based on Ambari (an Apache project). You can find
installation instructions at:
http://hortonworks.com/hdp11-hmc-quick-start-guide/

> What host OS distro/release is recommended?

CentOS 6 / RHEL 6 seems to be a good choice.

> What is the easiest environment to get started with? Amazon EC2? Is
> there anyone offering virtual/hosted prebuilt Hadoop instances?

I installed it on EC2. It worked like a charm.

> Where would we find some “big data” files that people have used for
> testing purposes?

The documentation includes a MapReduce tutorial. You can take any text files
and run the wordcount example on them:
http://hadoop.apache.org/docs/r0.20.2/mapred_tutorial.html

> Feel free to RTFM me to the right place ;-)
>
> Thanks,
> john
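
P.S. In case it helps before you have a cluster running: the wordcount example mentioned above boils down to a map step that emits (word, 1) pairs and a reduce step that sums them per word. Here is a minimal local sketch of that idea in plain Python. It is not the Hadoop API and not the tutorial's code; the function names (map_phase, reduce_phase, word_count) are my own, purely for illustration, and it assumes words are simply separated by whitespace.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reduce_phase(pairs):
    """Sum the counts emitted for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines):
    """Chain the two phases, as a local stand-in for a MapReduce job."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    # Hypothetical sample input; any text file read line by line would do.
    sample = ["Hello Hadoop", "hello HDFS"]
    for word, total in sorted(word_count(sample).items()):
        print(word, total)
```

On a real cluster the framework does the grouping between the two steps for you; here a dictionary plays that role.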
