[jira] [Commented] (CHUKWA-680) Pattern recognition of Hadoop generated metrics

Otis Gospodnetic (JIRA) Fri, 27 Dec 2013 06:34:59 -0800

    [ https://issues.apache.org/jira/browse/CHUKWA-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857504#comment-13857504 ]


Otis Gospodnetic commented on CHUKWA-680:
-----------------------------------------

Thanks, Michael. Regarding point 3, I see this in the paper:
{quote}
The predictions generated for the current day along with any corrections on
false alarms made by the administrator are fed into libsvm engine to generate
an updated model. The updated model will be used for interpreting next day's
metrics to generate predictions. These steps are automated.
{quote}

Does that essentially translate to: if an email arrives saying "cluster
unhealthy" and the administrator "corrects" it as a false alarm, then that
model is taken and used as the healthy/typical model tomorrow?

Or is there something more sophisticated involved that really *corrects* the
existing model, i.e. something that feeds the human's correction into the
existing model and teaches it through that correction, without either
(a) doing what I wrote above (using the latest model as the new
"healthy/typical model") or (b) explicitly retraining and building a whole
new model?
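
To make the "explicit retraining" reading concrete, here is a minimal sketch
of what I imagine the daily data preparation could look like. This is only my
guess, not something taken from the paper: the function names, the sample
keys, and the +1 = healthy / -1 = unhealthy label convention are all
hypothetical. The only thing I am sure of is libsvm's sparse input format
("label index:value ..." with ascending 1-based indices).

```python
from typing import Dict, List, Tuple

# One training sample: (label, {feature_index: value}).
# Assumed convention (hypothetical): +1 = healthy, -1 = unhealthy.
Sample = Tuple[int, Dict[int, float]]


def apply_corrections(predictions: Dict[str, int],
                      corrections: Dict[str, int]) -> Dict[str, int]:
    """Return final labels; administrator corrections override predictions."""
    final = dict(predictions)
    final.update(corrections)
    return final


def build_training_set(history: List[Sample],
                       day_features: Dict[str, Dict[int, float]],
                       predictions: Dict[str, int],
                       corrections: Dict[str, int]) -> List[Sample]:
    """Append today's (corrected) samples to the accumulated training data."""
    labels = apply_corrections(predictions, corrections)
    todays = [(labels[key], day_features[key]) for key in sorted(labels)]
    return history + todays


def to_libsvm_line(label: int, features: Dict[int, float]) -> str:
    """Format one sample as 'label idx:val ...' with ascending indices."""
    parts = [f"{idx}:{features[idx]:g}" for idx in sorted(features)]
    return " ".join([str(label)] + parts)
```

Under this reading, the lines produced by to_libsvm_line would be written to a
file and a whole new model trained from it each day, so the correction only
changes the labels in the training data rather than adjusting the existing
model in place.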


> Pattern recognition of Hadoop generated metrics
> -----------------------------------------------
>
>                 Key: CHUKWA-680
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-680
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>         Environment: IBM InfoSphere BigInsights Enterprise
>            Reporter: michael yu
>            Assignee: michael yu
>            Priority: Minor
>              Labels: GSoC, GSoC2013
>         Attachments: Yu, Michael et al-project-report-draft.pdf
>
>   Original Estimate: 2,760h
>  Remaining Estimate: 2,760h
>
> Charles Lin and I are working on our IBM SJSU master's project on "Pattern 
> recognition of Hadoop generated metrics".
> The purpose of the project is to use libsvm to predict the health of the 
> cluster.
> The scope of the project includes:
> 1) gathering a large-scale data set of metrics for healthy and unhealthy 
> clusters
> 2) using #1 and libsvm to generate a training model
> 3) periodically collecting metrics and comparing them against the training 
> model using libsvm to predict the cluster health
>    a) if unhealthy, send an email notification to the system administrator
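
For reference, step 3 in the description above could be sketched roughly as
follows. This is an illustration under my own assumptions, not the project's
code: `predict` stands in for whatever wraps the libsvm model, and the metric
names, label values, and sender address are hypothetical.

```python
from email.message import EmailMessage
from typing import Callable, Dict, Optional

# Assumed label convention (hypothetical): +1 = healthy, -1 = unhealthy.
HEALTHY, UNHEALTHY = 1, -1


def check_cluster(metrics: Dict[str, float],
                  predict: Callable[[Dict[str, float]], int],
                  admin_address: str,
                  sender: str = "chukwa@example.org") -> Optional[EmailMessage]:
    """Classify one metrics snapshot; build an alert only if unhealthy."""
    if predict(metrics) != UNHEALTHY:
        return None
    msg = EmailMessage()
    msg["Subject"] = "Cluster unhealthy"
    msg["From"] = sender
    msg["To"] = admin_address
    body = "\n".join(f"{name}: {value}"
                     for name, value in sorted(metrics.items()))
    msg.set_content("The model flagged the latest metrics as unhealthy:\n"
                    + body)
    return msg
```

The returned message could then be handed to smtplib's SMTP.send_message; the
function itself only builds the message, so the check can be exercised without
a mail server.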



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
