Thanks Nick, appreciate your inputs on this.
On Thu, Mar 13, 2014 at 12:51 PM, Martin, Nick <[email protected]> wrote:

> Start here:
> http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support
>
> The list of things you might consider before picking a distribution is
> quite likely limited only by one's imagination. So, start with the basics
> like hosted vs. in-house, what your use case(s) cover, etc. Basically,
> anything you'd consider when looking at a new technology solution to
> address a need at your organization. If that doesn't get you to a list of
> things you need to consider, then do a search for something akin to
> "choosing a Hadoop distribution" and maybe that'll spark some thoughts.
>
> Best of luck, happy researching!
>
> *From:* [email protected]
> *Sent:* Thursday, March 13, 2014 10:22 AM
> *To:* user
> *Subject:* Re: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution
>
> Thank you Martin. I will make sure that I do not post vendor-specific
> questions on this forum.
>
> But since I am starting out with Hadoop, I wanted to learn what the key
> things are that we have to keep in mind while deciding which
> distribution to take: open source Hadoop, MapR M7, Hortonworks HDP, or
> Cloudera CDH.
>
> If I can get a very brief idea of the factors that one should consider,
> it would certainly be very helpful to me.
>
> Thanks Again, Andy.
>
> On Thu, Mar 13, 2014 at 10:17 AM, Martin, Nick <[email protected]> wrote:
>
> Hi Andy,
>
> Generally speaking, the folks participating on this list avoid questions
> of distribution preference. There are, perhaps obviously, both minor and
> significant differences in distributions that you should research and
> evaluate to find the best fit for your organization's strategy. Asking the
> members of this list to publicly advocate one distribution over another
> is outside the scope of our collective purpose here, in my opinion. Upon
> thorough review of the topic history of this list you'll doubtless find the
> questions and responses are almost always distribution agnostic, which is
> how things should be with a community like this.
>
> No matter which distribution you choose, said distribution will assuredly
> have ample documentation regarding cluster configuration readily available
> via a quick search from your web browser. Further, the two distributions
> you mention below also have several methods by which you can ask their
> experts specific questions related to configuring their solutions in your
> environment (forums, separate lists, Google groups, etc.).
>
> *From:* [email protected]
> *Sent:* Thursday, March 13, 2014 9:58 AM
> *To:* user
> *Subject:* Hortonworks HDP 2 sandbox or Cloudera CDH Distribution
>
> Hello Team,
>
> I am initiating a POC to see the value of having Hadoop in our
> architecture. After discussing my current scenario with experts here, I
> think it would be better for me to start with a sandbox version rather
> than an actual distribution, from a POC point of view.
>
> My query here is how to decide which sandbox version to use, Hortonworks
> or Cloudera. My goal is to get started as soon as possible and not spend
> most of the time on the configuration part of the equation.
>
> Also, from the online research that I have done, it appears that Cloudera
> Impala is more efficient and provides near-real-time ad-hoc query
> capabilities. Based on that, I am thinking of going with the Cloudera
> sandbox distribution and wanted to hear the experts' opinions before
> moving in that direction.
>
> Also, if I am going with the sandbox approach, what kind of cluster
> configuration can I have, meaning how many slave and master nodes will
> the sandbox support?
>
> Pardon my questions if they sound too basic.
>
> Thanks again, Andy.
