[ 
https://issues.apache.org/jira/browse/CHUKWA-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857313#comment-13857313
 ] 

michael yu commented on CHUKWA-680:
-----------------------------------

Sure thing.

# Each cluster will have its own training model.
# You are correct.  It is more along the lines of typical vs. atypical.
# If the workload changes and the existing training model has never seen that 
kind of data, the SVM engine will most likely predict that it is "atypical".  
At that point, a notification will be sent to any registered email addresses.  
The user can correct that "atypical" data point if it actually is "typical".  
If this is done, the model will be retrained.
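
To make the flow above concrete, here is a rough Python sketch of the predict / 
notify / correct / retrain loop, with one model per cluster.  It is not the 
project's code.  The names ClusterModel, check_metrics and correct_as_typical 
are hypothetical, the notify callback stands in for the email step, and a 
simple distance threshold stands in for the libsvm classifier so that the 
example has no dependencies.  The metric values are made up.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Callable, List, Sequence, Tuple

TYPICAL, ATYPICAL = 1, -1


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two metric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class ClusterModel:
    """Per-cluster model (hypothetical).

    Stand-in for the SVM: a point is "typical" if it lies within `radius`
    of at least one sample the model was trained on.
    """
    radius: float
    typical_samples: List[Tuple[float, ...]] = field(default_factory=list)

    def train(self, samples: Sequence[Sequence[float]]) -> None:
        # Retraining simply replaces the stored samples in this sketch.
        self.typical_samples = [tuple(s) for s in samples]

    def predict(self, point: Sequence[float]) -> int:
        near = any(_distance(point, s) <= self.radius
                   for s in self.typical_samples)
        return TYPICAL if near else ATYPICAL


def check_metrics(model: ClusterModel, point: Sequence[float],
                  notify: Callable[[str], None]) -> int:
    """Classify one metrics sample and notify if it looks atypical."""
    label = model.predict(point)
    if label == ATYPICAL:
        notify("atypical metrics: %s" % (tuple(point),))
    return label


def correct_as_typical(model: ClusterModel, point: Sequence[float]) -> None:
    """User feedback: the flagged point was actually typical, so retrain."""
    model.train(model.typical_samples + [tuple(point)])


if __name__ == "__main__":
    sent: List[str] = []                       # stands in for outgoing email
    model = ClusterModel(radius=10.0)
    model.train([(20.0, 30.0), (25.0, 35.0)])  # made-up (cpu %, disk %) samples

    new_workload = (80.0, 70.0)                # a workload never seen before
    check_metrics(model, new_workload, sent.append)  # flagged, one notification
    correct_as_typical(model, new_workload)          # user says it is typical
    check_metrics(model, new_workload, sent.append)  # accepted after retraining
```

In this sketch the correction only ever adds "typical" points, which matches 
the case described above.  A real libsvm-based version would retrain on the 
full labelled data set instead of a list of stored samples.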

> Pattern recognition of Hadoop generated metrics
> -----------------------------------------------
>
>                 Key: CHUKWA-680
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-680
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>         Environment: IBM InfoSphere BigInsights Enterprise
>            Reporter: michael yu
>            Assignee: michael yu
>            Priority: Minor
>              Labels: GSoC, GSoC2013
>         Attachments: Yu, Michael et al-project-report-draft.pdf
>
>   Original Estimate: 2,760h
>  Remaining Estimate: 2,760h
>
> Charles Lin and I are working on our IBM SJSU masters project on "Pattern 
> recognition of Hadoop generated metrics".
> The purpose of the project is to use libsvm to predict the health of the 
> cluster.
> The scope of the project includes:
> 1) gathering a large-scale data set of metrics for healthy and unhealthy 
> clusters
> 2) using #1 and libsvm to generate a training model
> 3) periodically collecting metrics and comparing them against the training 
> model using libsvm to predict the cluster health
>    a) if unhealthy, sending an email notification to the system administrator



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
