Hi Jim,

> BTW: Cloudera's next release is going to be based on 0.20, and
> they will either include HBase as alpha software, or put us
> in their supported stack, depending on the reaction from our
> community.
What does that mean, "depending on the reaction from our community"?

> If we do this, Cloudera has volunteered to run that
> script on EC2 on a ~100 node cluster to burn it in. (They have
> some arrangement with Amazon) and they have volunteered to run
> the test on a "big" cluster for us.

I think HBase, and frankly Hadoop as well, could use a reasonably scaled performance, reliability, and fault tolerance automated test platform. (See "Re: scanner is returning everything in parent region plus one of the daughters?")

Think of it as expanding Hudson to a cluster of several nodes hosted with community resources, perhaps on EC2, running some suite once per day, or perhaps triggered by a project once it reaches a certain milestone. Each project could be allocated a budget in terms of hours/month, with time limits of hours/day or similar. ~10 nodes seems reasonably affordable, with ~100 used on occasion, the difference being daily versus weekly runs, or weekly versus monthly.

Stepping back from blue sky, I wonder if HBase on its own can pool resources to run such a reasonably scaled performance, reliability, and fault tolerance automated test at least twice a week. 10 extra large EC2 instances running 5 hours per day is about $300/month.

- Andy

________________________________
From: Jim Kellerman (POWERSET) <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Sunday, June 14, 2009 3:03:01 AM
Subject: You guys rocked the house this week!

I have received nothing but compliments in all my schmoozing this week. Although I was mostly absent from 0.20, it is 0.20 that has everyone excited. Congrats, and great work guys.

However, we still have to deliver on 0.20. It has to be rock solid, or the buzz will turn against us.
Friday, I was at Cloudera along with Doug C, Eric14, Owen O'Malley, Arun Murthy, Alan Gates (Pig), a guy from Hive (whose name I can't remember at the moment), Dhruba (Facebook) and, of course, the Cloudera guys (Todd Lipcon, Jeff Hammerbacher, Christophe, Amr, etc.)

The day went something like this:

1. 1st exercise: write (on a post-it) 5 things you like about Hadoop and 5 things you don't (most people submitted more than 10), followed by discussion.

2. 2nd exercise: write (on a post-it) features that you'd like to see in the short term in Hadoop.
   - We had submissions that were truly short term and some that were truly "blue sky". These were divided into categories: Map/Reduce, HDFS, Build/Test, Core (including Avro).
   - We then split up into separate sessions. I attended HDFS. (The session leaders are supposed to send in notes from their sessions, and as soon as I get them, I will post them.)
   - The biggest issue from HDFS was append (actually flush/sync), and not just from me; there were about 7 votes for it (just "append"), whereas my votes were like "flush/sync in 0.21" and HADOOP-4379 in 0.20.x.

3. Third session: blue sky. Not much happened here because everyone was kind of burned out at this point.

Important points (for HBase):

1. We need to deliver a rock-solid 0.20 release or we will lose all the credibility that we gained this week.

   BTW: Cloudera's next release is going to be based on 0.20, and they will either include HBase as alpha software, or put us in their supported stack, depending on the reaction from our community. And despite the fact that their revenue stream depends on the Hadoop community, I got the feeling that they are getting pressured to have a version of HBase (not so much on 0.18, but more on 0.20). They have a '$' interest in seeing us succeed.

2. Once we get 0.20 out, we need to focus on beating the sh*t out of the HADOOP-4379 patch for 0.20. Once we think it is solid, we need to create a script that randomly fails region servers and datanodes.
   If we do this, Cloudera has volunteered to run that script on EC2 on a ~100 node cluster to burn it in. (They have some arrangement with Amazon) and they have volunteered to run the test on a "big" cluster for us. They will run it for several days if necessary to prove that it works.

   We need to be sure HADOOP-4379 is solid, which could lead to getting 4379 into Hadoop 0.20.x if so. Dhruba, who led the HDFS breakout session, will do what it takes to fix issues around his current patch, provided we give him feedback. However, if we don't do #1 above, it won't matter.

   We also need to verify that:
   -- Master failover works.
   -- Region server failover works.

3. After this week, both Pig and Hive are excited about using HBase as a source and a sink for the map-reduce jobs that they spawn. They have both come to realize that we are becoming more important in the Hadoop community, and are willing to devote resources to make their stuff work with HBase. (They will look bad if the other supports HBase and they do not - although to be fair, there was no data store available before this that met their needs either.)

So keep up the great work and MAKE SURE 0.20 IS ROCK SOLID STABLE!!!

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
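For what it's worth, the "randomly fail region servers and datanodes" script Jim describes could be sketched roughly as below. This is a minimal, hypothetical fault-injection loop, not existing tooling: the host names, the daemon class names (HRegionServer, DataNode), and the ssh/pkill kill mechanism are all assumptions to adapt to a real cluster.

```python
#!/usr/bin/env python
# Sketch of a fault-injection script: periodically kill one randomly
# chosen HBase region server or HDFS datanode. All names here are
# illustrative assumptions, not part of any shipped HBase tooling.
import random
import subprocess
import time

# Hypothetical cluster inventory: (host, daemon main class name).
TARGETS = [
    ("slave1", "HRegionServer"),
    ("slave1", "DataNode"),
    ("slave2", "HRegionServer"),
    ("slave2", "DataNode"),
]

def build_kill_command(host, daemon):
    """Return an ssh command that kills the named daemon on the host.

    `pkill -f` matches the daemon's main class anywhere on the java
    command line of the target process.
    """
    return ["ssh", host, "pkill", "-f", daemon]

def chaos_loop(iterations, interval_seconds, run=subprocess.call):
    """Kill one randomly chosen daemon per interval, `iterations` times.

    `run` is injectable so the loop can be dry-run or tested without
    actually touching a cluster.
    """
    for _ in range(iterations):
        host, daemon = random.choice(TARGETS)
        print("killing %s on %s" % (daemon, host))
        run(build_kill_command(host, daemon))
        time.sleep(interval_seconds)

if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    chaos_loop(iterations=3, interval_seconds=0, run=print)
```

On a real burn-in run you would drop the dry-run `run=print`, stretch the interval (e.g. one kill every few minutes for several days, as Jim suggests), and pair it with the daemons' own supervision/restart scripts so the cluster keeps limping along between failures.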
