Re: Does MLLib has attribute importance?

2015-06-18 Thread Debasish Das
Running L1 regularization and picking the nonzero coefficients gives a good
estimate of the interesting features as well...
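
A rough illustration of that idea, in plain Python rather than MLlib code: fit an L1-regularized (lasso) linear model and keep the features whose coefficients are not driven to exactly zero. The coordinate-descent solver, the toy data, the penalty `lam=0.1`, and all function names below are assumptions made for this sketch, not anything from Spark.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, num_iters=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(num_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Toy data (hypothetical): y = 2 * x0 + 0.05 * x2, and x1 is irrelevant.
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [2.05, 1.95, -2.05, -1.95]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, coef in enumerate(weights) if coef != 0.0]
print(weights, selected)
```

On this toy data the penalty zeroes out both the irrelevant feature and the very weak one, so only feature 0 is selected. How aggressive the selection is depends on the penalty strength, which would have to be tuned on real data.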
On Jun 17, 2015 4:51 PM, Xiangrui Meng men...@gmail.com wrote:

 We don't have it in MLlib. The closest would be the ChiSqSelector,
 which works for categorical data. -Xiangrui

 On Thu, Jun 11, 2015 at 4:33 PM, Ruslan Dautkhanov dautkha...@gmail.com
 wrote:
  What would be closest equivalent in MLLib to Oracle Data Miner's Attribute
  Importance mining function?
 
 
 http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/feature_extr.htm#i1005920
 
  Attribute importance is a supervised function that ranks attributes
  according to their significance in predicting a target.
 
 
  Best regards,
  Ruslan Dautkhanov

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: Does MLLib has attribute importance?

2015-06-18 Thread Xiangrui Meng
ChiSqSelector takes an RDD of labeled points, where the label is the
target. See
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala#L120
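
To make the "label is the target" point concrete, here is a minimal plain-Python sketch of the underlying idea: compute a Pearson chi-squared statistic between each categorical feature and the label, then rank the features by it. This is not MLlib's implementation; the function names and the toy data are assumptions made for illustration.

```python
from collections import Counter


def chi_squared(feature_values, labels):
    """Pearson chi-squared statistic of one categorical feature against the label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for lab, lab_count in label_totals.items():
            expected = f_count * lab_count / n  # expected count under independence
            stat += (observed[(f, lab)] - expected) ** 2 / expected
    return stat


def rank_features(rows, labels):
    """Return (feature indices sorted by descending statistic, per-feature statistics)."""
    num_features = len(rows[0])
    scores = [
        chi_squared([row[j] for row in rows], labels) for j in range(num_features)
    ]
    order = sorted(range(num_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): feature 0 matches the label, feature 1 is independent of it.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
rows = [(0, 1), (0, 0), (1, 1), (1, 0), (0, 1), (1, 0), (0, 0), (1, 1)]

order, scores = rank_features(rows, labels)
print(order, scores)
```

The ranking is supervised in the sense asked about: the label enters every statistic, so a feature that is independent of the target scores zero no matter how it is distributed.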

On Wed, Jun 17, 2015 at 10:22 PM, Ruslan Dautkhanov
dautkha...@gmail.com wrote:
 Thank you Xiangrui.

 Oracle's attribute importance mining function has a target variable.
 Attribute importance is a supervised function that ranks attributes
 according to their significance in predicting a target.
 MLlib's ChiSqSelector does not have a target variable.




 --
 Ruslan Dautkhanov







Re: Does MLLib has attribute importance?

2015-06-18 Thread Ruslan Dautkhanov
Got it. Thanks!



-- 
Ruslan Dautkhanov

On Thu, Jun 18, 2015 at 1:02 PM, Xiangrui Meng men...@gmail.com wrote:

 ChiSqSelector takes an RDD of labeled points, where the label is the
 target. See
 https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala#L120

 
 



Re: Does MLLib has attribute importance?

2015-06-17 Thread Xiangrui Meng
We don't have it in MLlib. The closest would be the ChiSqSelector,
which works for categorical data. -Xiangrui

On Thu, Jun 11, 2015 at 4:33 PM, Ruslan Dautkhanov dautkha...@gmail.com wrote:
 What would be closest equivalent in MLLib to Oracle Data Miner's Attribute
 Importance mining function?

 http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/feature_extr.htm#i1005920

 Attribute importance is a supervised function that ranks attributes
 according to their significance in predicting a target.


 Best regards,
 Ruslan Dautkhanov




Re: Does MLLib has attribute importance?

2015-06-17 Thread Ruslan Dautkhanov
Thank you Xiangrui.

Oracle's attribute importance mining function has a target variable.
Attribute importance is a supervised function that ranks attributes
according to their significance in predicting a target.
MLlib's ChiSqSelector does not have a target variable.




-- 
Ruslan Dautkhanov

On Wed, Jun 17, 2015 at 5:50 PM, Xiangrui Meng men...@gmail.com wrote:

 We don't have it in MLlib. The closest would be the ChiSqSelector,
 which works for categorical data. -Xiangrui
