Re: hicc with HBase

Eric Yang Thu, 04 Nov 2010 17:22:54 -0700

For aggregation, I am betting my money on Pig+HBase.  The Pig team has
recently completed HBase 0.20.6 integration, so Pig can now load data
from and store data to HBase.  This enables us to write PigLatin
functions to slice and dice data.  The new feature is available in Pig
trunk (0.8), hence I am waiting for a Pig release before incorporating
aggregation tool chains into Chukwa.  You are welcome to start
experimenting with Pig+HBase.

See:

https://issues.apache.org/jira/browse/PIG-1205
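
As a rough illustration of the kind of roll-up such an aggregation job
would perform, here is a minimal sketch.  It is plain Python, not
PigLatin, and it does not touch HBase.  The function name, the
5-minute bucket size and the sample data are all assumptions made for
the example, not anything taken from Chukwa or Pig.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds=300):
    """Average (timestamp, value) samples into fixed-width time buckets.

    samples: iterable of (epoch_seconds, value) pairs for one metric.
    Returns a list of (bucket_start, mean_value) pairs sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        bucket = ts - (ts % bucket_seconds)  # align to the bucket start
        sums[bucket] += value
        counts[bucket] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]


# Hypothetical samples for a single metric (values are made up).
raw = [(0, 10.0), (60, 20.0), (120, 30.0), (300, 40.0), (360, 60.0)]
rollup = downsample(raw)
```

In a Pig job the same idea would be expressed as a group-by on the
bucketed timestamp followed by an average, with the result stored back
to HBase.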

To polish HICC related issues:

1) The only things that need to change for full integration are:
remove the JDBC stuff, and change the *.descriptor files to use the
metrics REST API for graphing.

2) The export function is currently a browser (client side) operation.  We
don't store PNG files on the server, and I don't plan to create a static
image repository in HICC.  Hence, there is no static URL for getting
the images.  Adding one is possible, and trivial, with server side
graphing libraries.

3) Aggregation is going to be a series of Pig jobs; this is waiting for the Pig 0.8 release.

4) I am not good at writing documentation.  Help is always welcome.

Once we have a set of Pig scripts to replace the current outdated
aggregation scripts, we should look at Mahout to see if it is useful
for implementing AI algorithms to detect cluster failure.  Signs of a
cluster crash appear many minutes before the crash itself.  My team had
an intern who applied a one-class SVM classification algorithm to
predict Hadoop failure.  This was done at small scale, with training on
a single machine.  The same algorithm could be implemented using Mahout
to refine Hadoop error prediction.

regards,
Eric

On Thu, Nov 4, 2010 at 4:10 PM, Ariel Rabkin <[email protected]> wrote:
> Hi all.
>
> Want to report back on some preliminary efforts here at Berkeley to
> use HICC+HBase.
>
> So the good news is, it works. We're able to get data from adaptors,
> to collectors, into HBase, and then draw graphs of it.
> Many thanks to Eric for helping us debug.
>
>  A lot of the pain involved disentangling us from HBase 0.89.  That
> now seems to be mostly done.
>
> Now the rough edges.
>
> 1) The graphs-from-HBase don't seem to be at all integrated with the
> rest of HICC; it's a separate jsp.
> 2) Lots of graphical rough edges. E.g., the export button leaves
> incredible gunk in the address bar, without producing a usable URL.
> 3) No aggregates.
> 4) No documentation.
>
> Eric, what was the strategy you had in mind for aggregates, and
> integrating with the rest of HICC?
> This has now become a priority for the lab, so we have manpower to
> throw at it, but want to make optimal use of it.
>
> --Ari
>
> --
> Ari Rabkin [email protected]
> UC Berkeley Computer Science Department
>
