Functional programming paradigms, the map/reduce pattern, and to a lesser 
extent distributed and parallel processing in general are subjects that most 
quasi-technical management does not understand well.  Further, the notion of 
using commodity machines that are guaranteed to be unreliable as a means of 
achieving high performance and high scalability is essentially 
counterintuitive.  Even after I refer newcomers to what I regard as the seminal 
papers on these topics (the papers written by Ghemawat, Dean et al. at Google, 
and yes, I know it all started with LISP, but my management wasn't even alive 
in the 1970s, although I was :-)), people steeped in a long tradition of 
"shared everything" database architectures still don't quite get it.  I spend 
considerable amounts of time in what amounts to management de-programming:  No, 
MySQL can't do this, and Oracle can't either, except with Oracle it will cost 
you a lot more to find that out.  
   
  Hadoop, hTable, PIG and the like offer adopters a competitive edge which, in 
my mind, is so great that list participants may not wish their company 
identities to be known.  On the other hand, a good list of "Company X is 
solving general problem Y using N nodes of configuration K" would go a long way 
toward advancing the "cause" of this technology.  Any success stories, even 
those carefully disguised to protect identity, product, and process, are 
extremely helpful.
   
  In our case I am planning to deploy Hadoop to process substantial quantities 
of data generated by our application's users.  Our current plan is to deploy on 
a 32 or 64 node cluster, with machines which contain:  4 cores, 4 GB memory, 
2 TB local disk (JBOD).  A 32 node implementation has 64 TB of raw disk, which 
with replication set to 3 will yield 18-20 TB of usable space.  We are also 
actively experimenting with EC2, as a substantially larger grid of EC2 machines 
may yield acceptable performance at a price far lower than building out and 
maintaining our own grid.   
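
  As a rough check on the capacity figure above, here is a minimal Python 
sketch of the arithmetic.  The function name and the reserved_fraction 
parameter are illustrative assumptions, not anything from Hadoop itself; the 
10% value is only a guess at the disk set aside for the OS and intermediate 
map output, chosen to show how 64 TB raw can land in the 18-20 TB range.

```python
def usable_capacity_tb(nodes, disk_tb_per_node, replication, reserved_fraction=0.0):
    """Estimate usable storage in TB for a replicated cluster.

    reserved_fraction is a hypothetical share of raw disk kept out of the
    distributed file system (OS, logs, intermediate job output).
    """
    raw_tb = nodes * disk_tb_per_node
    available_tb = raw_tb * (1.0 - reserved_fraction)
    # Every block is stored `replication` times, so divide by that factor.
    return available_tb / replication


if __name__ == "__main__":
    for nodes in (32, 64):
        ideal = usable_capacity_tb(nodes, 2, 3)
        with_reserve = usable_capacity_tb(nodes, 2, 3, reserved_fraction=0.10)
        print(f"{nodes} nodes: {ideal:.1f} TB ideal, "
              f"{with_reserve:.1f} TB with 10% reserved (assumed)")
```

  With no reserve, 32 nodes come out at roughly 21 TB; any realistic reserve 
pulls that down toward the 18-20 TB quoted above.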
   
  Prototyping to date has yielded results that I can only describe as 
astounding :-).
   
  More stories most welcome....
  C G
  
Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
  
This is exactly the reason I am proposing to create a testimonial
wiki page for hadoop.

See HADOOP-1754.
https://issues.apache.org/jira/browse/HADOOP-1754
As C G says, the company name might not always be relevant, as in "powered 
by Hadoop", but applications, problems, and tasks are.
--Konstantin

Eric Baldeschwieler wrote:

> Responses to the list welcome. I know of several companies not on 
> that list that are using it.
>
> It would be great to hear from you guys.
>
> E14
>
> On Sep 4, 2007, at 6:59 AM, C G wrote:
>
>> All:
>>
>> I am interested in hearing any success stories around deploying 
>> Hadoop in a commercial/non-academic environment. My interest is 
>> mostly around generating collateral for justifying our own 
>> deployment of Hadoop. Any stories would be great...if you can't 
>> name your company, if you could at least describe the application 
>> and/or business that would be incredibly useful.
>>
>> Thanks very much,
>> C G



       