Re: [Architecture] ML Model Summary Illustration and Comparison

Nirmal Fernando Sun, 12 Jul 2015 23:35:14 -0700

Why can't we use Within Set Sum of Squared Errors (WSSSE) as a measure for
comparing clustering models?
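
For reference, WSSSE is the sum, over all points, of the squared Euclidean
distance from each point to its nearest cluster center. Below is a minimal
sketch in plain Python; the function name `wssse` and the sample points and
centers are illustrative assumptions, not taken from the ML code base. As far
as I know, Spark MLlib exposes the same quantity for k-means through
`KMeansModel.computeCost`.

```python
def wssse(points, centers):
    """Within Set Sum of Squared Errors for a fixed set of cluster centers.

    points  -- iterable of equal-length numeric tuples (hypothetical sample data)
    centers -- iterable of cluster centers with the same dimensionality
    """
    total = 0.0
    for p in points:
        # Squared Euclidean distance from p to its nearest center.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centers
        )
    return total


# Illustrative data: two tight clusters around x = 0 and x = 10.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
print(wssse(points, centers))
```

One caveat: WSSSE generally decreases as the number of clusters grows, so it
is only a fair comparison between models built on the same dataset, with the
same feature scaling and the same number of clusters.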



On Fri, May 15, 2015 at 4:34 PM, CD Athuraliya <[email protected]> wrote:

> Hi all,
>
> We have implemented model comparison for classification and numerical
> prediction with the following measures.
>
>    - Binary and multiclass classification - Accuracy
>    - Numerical prediction - Mean squared error
>
> We are currently working on a sorted view of models according to their
> accuracy/MSE. This release will not support cross comparison for clustering
> algorithms.
>
> Thanks,
> CD
>
> On Tue, May 5, 2015 at 5:41 PM, CD Athuraliya <[email protected]> wrote:
>
>> Hi all,
>>
>> Which chart types and implementations are we going to proceed with for
>> alpha? We will be able to finalize the comparison and summary views with them.
>>
>> Thanks,
>> CD
>>
>> On Fri, May 1, 2015 at 9:39 AM, Supun Sethunga <[email protected]> wrote:
>>
>>> Hi Nirmal,
>>>
>>> During the last discussion, we decided to show a numerical value
>>> (Accuracy / Std error) next to each model in the model listing view to
>>> illustrate its performance, so that the user can get an overall idea at a
>>> glance, and to have the ROC comparison on a separate page. I think we
>>> still need to figure out where the latter would fit in the UI
>>> navigation.
>>>
>>> Thanks,
>>> Supun
>>>
>>> On Thu, Apr 30, 2015 at 6:51 PM, Nirmal Fernando <[email protected]>
>>> wrote:
>>>
>>>> Thanks for summarizing, Supun. Did we think about how we are going to
>>>> create the cross-model comparison view?
>>>>
>>>> On Thu, Apr 30, 2015 at 8:33 AM, Supun Sethunga <[email protected]>
>>>> wrote:
>>>>
>>>>> [-strategy@, +architecture@]
>>>>>
>>>>> On Thu, Apr 30, 2015 at 5:58 PM, Srinath Perera <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> should go to arch@
>>>>>>
>>>>>> On Thu, Apr 30, 2015 at 6:28 AM, Srinath Perera <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Supun!! this looks good.
>>>>>>>
>>>>>>> --Srinath
>>>>>>>
>>>>>>> On Thu, Apr 30, 2015 at 6:25 AM, Supun Sethunga <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Following is the breakdown of the Model Summary illustrations that
>>>>>>>> can be supported by ML at the moment. Initiating this thread to
>>>>>>>> finalize what we can and cannot support in the initial release.
>>>>>>>> Blue colored ones are yet to be implemented.
>>>>>>>>
>>>>>>>>    - Numerical Prediction
>>>>>>>>       - Standard Error [1]
>>>>>>>>       - Residual Plot [2]
>>>>>>>>       - Feature Importance (*Graph containing weights assigned to
>>>>>>>>       each of the features in the model*)
>>>>>>>>
>>>>>>>>
>>>>>>>>    - Classification
>>>>>>>>       - Binary
>>>>>>>>          - ROC [3]
>>>>>>>>          - AUC
>>>>>>>>          - Confusion Matrix (*Available in Spark as a static
>>>>>>>>          metric. But if this were calculated manually, it could be
>>>>>>>>          made interactive, so that the user can find the optimal
>>>>>>>>          threshold*)
>>>>>>>>          - Accuracy
>>>>>>>>          - Feature Importance
>>>>>>>>       - Multi-Class
>>>>>>>>          - Confusion Matrix (*Available in Spark*)
>>>>>>>>          - Accuracy
>>>>>>>>          - Feature Importance
>>>>>>>>
>>>>>>>>
>>>>>>>>    - Clustering
>>>>>>>>       - Scatter plot with clustered points
>>>>>>>>
>>>>>>>>
>>>>>>>> *Cross-comparing Models*
>>>>>>>>
>>>>>>>> As you can see, the major limitation we have when cross-comparing
>>>>>>>> models within a project is that different categories have different
>>>>>>>> summary statistics/plots, and hence we cannot compare two models
>>>>>>>> from two different categories.
>>>>>>>>
>>>>>>>> Following are the possibilities:
>>>>>>>>
>>>>>>>>    - ROC can be used to compare Binary classification models.
>>>>>>>>    - Cobweb (a radar chart) can be used to compare Multi-Class
>>>>>>>>    classification models. (This is a possible alternative to ROC
>>>>>>>>    in the multi-class case, but the drawback is that the graph
>>>>>>>>    becomes very unclear when the models have a large number of
>>>>>>>>    features.) [4] [5]
>>>>>>>>    - Accuracy can be used to compare all classification models.
>>>>>>>>
>>>>>>>> Please add if I've missed anything.
>>>>>>>>
>>>>>>>> *Ref:*
>>>>>>>> [1] http://onlinestatbook.com/2/regression/accuracy.html
>>>>>>>> [2] http://stattrek.com/regression/residual-analysis.aspx
>>>>>>>> [3] http://www.sciencedirect.com/science/article/pii/S016786550500303X
>>>>>>>> [4] http://www.academia.edu/2519022/Visualization_and_analysis_of_classifiers_performance_in_multi-class_medical_data
>>>>>>>> [5] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.8450&rep=rep1&type=pdf
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Supun
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Supun Sethunga*
>>>>>>>> Software Engineer
>>>>>>>> WSO2, Inc.
>>>>>>>> http://wso2.com/
>>>>>>>> lean | enterprise | middleware
>>>>>>>> Mobile : +94 716546324
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ============================
>>>>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>>>>> Site: http://people.apache.org/~hemapani/
>>>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>>>> Phone: 0772360902
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> *CD Athuraliya*
>> Software Engineer
>> WSO2, Inc.
>> lean . enterprise . middleware
>> Mobile: +94 716288847 <94716288847>
>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>> <https://twitter.com/cdathuraliya> | Blog
>> <http://cdathuraliya.tumblr.com/>
>>
>
>



-- 

Thanks & regards,
Nirmal

Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/

_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
