Hi daemeon: Actually, for most folks who want to use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started. Also, we have Docker containers, so you don't even *need* a VM to run a 4 or 5 node Hadoop cluster.

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up

Then vagrant destroy when you're done.

This to me is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc. It's also easy to turn the simple single-node Bigtop VM into a multinode one, just by modifying the Vagrantfile.

On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle <[email protected]> wrote:

> What you want as a sandbox depends on what you are trying to learn.
>
> If you are trying to learn to code in e.g PigLatin, Sqooz, or similar, all
> of the suggestions (perhaps excluding BigTop due to its setup complexities)
> are great. Laptop? perhaps but laptop's are really kind of infuriatingly
> slow (because of the hardware - you pay a price for a 30-45watt average
> heating bill). A laptop is an OK place to start if it is e.g. an i5 or i7
> with lots of memory. What do you think of the thought that you will pretty
> quickly graduate to wanting a small'ish desktop for your sandbox?
>
> A simple, single node, Hadoop instance will let you learn many things. The
> next level of complexity comes when you are attempting to deal with data
> whose processing needs to be split up, so you can learn about how to split
> data in Mapping, reduce the splits via reduce jobs, etc. For that, you
> could get a windows desktop box or e.g. RedHat/CentOS and use
> virtualization. Something like a 4 core i5 with 32gb of memory, running 3
> or for some things 4, vm's. You could load e.g. hortonworks into each of
> the vm's and practice setting up a 3/4 way cluster. Throw in 2-3 1tb drives
> off of eBay and you can have a lot of learning.
>
> *.......“The race is not to the swift,nor the battle to the strong,but to
> those who can see it coming and jump aside.” - Hunter Thompson*
> *Daemeon*
>
> On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <[email protected]> wrote:
>
>> you can try the pivotal vm as well.
>>
>> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
>>
>> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <[email protected]>
>> wrote:
>>
>>> Tim,
>>> download Sandbox from http://hortonworks/com
>>> You will have everything needed in a small VM instance which will run on
>>> your home desktop.
>>>
>>> *Thank you!*
>>>
>>> *Sincerely,*
>>>
>>> *Leonid Fedotov*
>>>
>>> Systems Architect - Professional Services
>>>
>>> [email protected]
>>>
>>> office: +1 855 846 7866 ext 292
>>>
>>> mobile: +1 650 430 1673
>>>
>>> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <[email protected]>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>> I want to setup an environment where I can teach myself hadoop.
>>>> Usually the way I'll handle this is to grab a machine off the Amazon free
>>>> tier and setup whatever software I want.
>>>>
>>>> However I realize that Hadoop is a memory intensive, big data solution.
>>>> So what I'm wondering is, would a t2.micro instance be sufficient for
>>>> setting up a cluster of hadoop nodes with the intention of learning it? To
>>>> keep things running longer in the free tier I would either setup however
>>>> many nodes as I want and keep them stopped when I'm not actively using
>>>> them. Or just setup a few nodes with a few different accounts (with a
>>>> different gmail address for each one.. easy enough to do).
>>>>
>>>> Failing that, what are some other free/cheap solutions for setting up a
>>>> hadoop learning environment?
>>>>
>>>> Thanks,
>>>> Tim
>>>>
>>>> --
>>>> GPG me!!
>>>>
>>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>>
>>>
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity
>>> to which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender immediately
>>> and delete it from your system. Thank You.
>>
>

--
jay vyas
