[ 
https://issues.apache.org/jira/browse/MADLIB-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Iyer updated MADLIB-998:
------------------------------
    Description: 
Add a class weight parameter that assigns weights to specific dependent variable 
values. This is useful for data with unbalanced classes, i.e. situations where 
one class has (far) fewer data points than the other class(es). 

The general format will be similar to that in scikit-learn, described below: 

class_weight: Sets the weight for the positive and negative classes. If not 
given, all classes are set to have weight one.

If class_weight = balanced, values of y are automatically weighted in inverse 
proportion to class frequencies in the input data, i.e. the weights are set to 
n_samples / (n_classes * bincount ( y )).
Alternatively, class_weight can be a mapping that gives the weight for each class.
E.g., for dependent variable values 'a' and 'b', the class_weight can be
{a: 2, b: 3}. Each 'a' tuple's y value would then be multiplied by 2 and
each 'b' tuple's y value by 3.
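
The three cases above (no class_weight, 'balanced', and an explicit mapping)
can be sketched in plain Python. This is only an illustration of the
arithmetic, not MADlib code: the function names balanced_class_weights and
resolve_class_weights are hypothetical, and the choice to give weight one to
any class missing from a mapping is an assumption that the description does
not specify.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight per class: n_samples / (n_classes * count of that class)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def resolve_class_weights(y, class_weight=None):
    """Hypothetical helper that maps a class_weight setting to per-class weights."""
    counts = Counter(y)
    if class_weight is None:
        # Not given: every class has weight one.
        return {label: 1.0 for label in counts}
    if class_weight == "balanced":
        return balanced_class_weights(y)
    # Explicit mapping; classes absent from it default to 1.0 (assumption).
    return {label: float(class_weight.get(label, 1.0)) for label in counts}


labels = ["a", "a", "a", "b"]
balanced = resolve_class_weights(labels, "balanced")
mapped = resolve_class_weights(labels, {"a": 2, "b": 3})
default = resolve_class_weights(labels)
```

With three 'a' rows and one 'b' row, the balanced setting gives the rarer
class 'b' the larger weight, since 4 / (2 * 1) exceeds 4 / (2 * 3).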

For regression, the class weights are always one.

'class_weight' will be part of the optional 'params' argument. 

  was:
Add a class weight parameter that assigns weights to specific dependent variable 
values. This is useful for data with unbalanced classes, i.e. situations where 
one class has (far) fewer data points than the other class(es). 

The general format will be similar to that in scikit-learn, described below: 

class_weight: Sets the weight for the positive and negative classes. If not 
given, all classes are set to have weight one.

If class_weight = balanced, values of y are automatically weighted in inverse 
proportion to class frequencies in the input data, i.e. the weights are set to 
n_samples / (n_classes * bincount(y)).
Alternatively, class_weight can be a mapping that gives the weight for each class.
E.g., for dependent variable values 'a' and 'b', the class_weight can be
{a: 2, b: 3}. Each 'a' tuple's y value would then be multiplied by 2 and
each 'b' tuple's y value by 3.

For regression, the class weights are always one.

'class_weight' will be part of the optional 'params' argument. 


> Class weights for SVM
> ---------------------
>
>                 Key: MADLIB-998
>                 URL: https://issues.apache.org/jira/browse/MADLIB-998
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Support Vector Machines
>            Reporter: Rahul Iyer
>
> Add a class weight parameter that assigns weights to specific dependent 
> variable values. This is useful for data with unbalanced classes, i.e. 
> situations where one class has (far) fewer data points than the other class(es). 
> The general format will be similar to that in scikit-learn, described below: 
> class_weight: Sets the weight for the positive and negative classes. If not 
> given, all classes are set to have weight one.
> If class_weight = balanced, values of y are automatically weighted in 
> inverse proportion to class frequencies in the input data, i.e. the 
> weights are set to n_samples / (n_classes * bincount ( y )).
> Alternatively, class_weight can be a mapping that gives the weight for each 
> class.
> E.g., for dependent variable values 'a' and 'b', the class_weight can be
> {a: 2, b: 3}. Each 'a' tuple's y value would then be multiplied by 2 and
> each 'b' tuple's y value by 3.
> For regression, the class weights are always one.
> 'class_weight' will be part of the optional 'params' argument. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
