[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2016-01-21 Thread Daniel Darabos (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110628#comment-15110628
 ] 

Daniel Darabos commented on SPARK-2309:
---

https://github.com/apache/spark/blob/v1.6.0/docs/ml-classification-regression.md#logistic-regression
 still says:

> The current implementation of logistic regression in spark.ml only supports 
> binary classes. Support for multiclass regression will be added in the future.

That can be removed now, right?

> Generalize the binary logistic regression into multinomial logistic regression
> --
>
> Key: SPARK-2309
> URL: https://issues.apache.org/jira/browse/SPARK-2309
> Project: Spark
>  Issue Type: New Feature
>  Components: MLlib
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Critical
> Fix For: 1.3.0
>
>
> Currently, there is no multi-class classifier in mllib. Logistic regression 
> can be extended to multinomial one straightforwardly. 
> The following formula will be implemented. 
> http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-12 Thread christian sommeregger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952869#comment-14952869
 ] 

christian sommeregger commented on SPARK-2309:
--

Sorry everyone! I got confused by the different terminologies out there. (The 
model on SlideShare is of course implemented correctly.)
I was talking about a conditional multinomial logit.

So instead of 
U_is = X_s * w_i
(U_is denotes the utility of item i in choice situation s; the features X_s are 
constant across alternatives, and the weights w_i are item-specific)

we would use:
U_is = X_si * w
(the weights are the same across alternatives, but the features can be distinct 
for each item and can differ for each s)

(See section 6.3.3 in http://data.princeton.edu/wws509/notes/c6s3.html)

I would be happy to contribute some code. Do you think this would be an 
interesting extension? Should I create a new ticket for this?
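
For reference, the two utility specifications in this comment can be written side by side as choice probabilities. This is only a sketch, assuming the usual logit (softmax) link over the alternatives; the symbols P, j, K, and K_s are not in the comment and are introduced here for illustration.

```latex
% Multinomial logit: features X_s shared across alternatives, item-specific weights w_i
U_{is} = X_s \cdot w_i, \qquad
P(i \mid s) = \frac{\exp(X_s \cdot w_i)}{\sum_{j=1}^{K} \exp(X_s \cdot w_j)}

% Conditional logit: item-specific features X_{si}, one shared weight vector w
U_{is} = X_{si} \cdot w, \qquad
P(i \mid s) = \frac{\exp(X_{si} \cdot w)}{\sum_{j=1}^{K_s} \exp(X_{sj} \cdot w)}
```

In the second form the number of alternatives K_s can differ from one choice situation to the next, because the weights no longer depend on the item.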




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-12 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953024#comment-14953024
 ] 

Sean Owen commented on SPARK-2309:
--

Hm, I'm not sure I've seen a formulation like that before. Typically you have a 
set of input features and K output classes, and you learn K one-vs-all 
classification boundaries. That means the same features in each case, but 
different coefficients for each class. That's what your reference says too, but 
in the "Multinomial logit" section, which is what we're talking about here, no?

I'm actually a little confused by 
http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 on reviewing; does 
the notation change on the third line or am I missing a key step? x shouldn't 
be specific to the output class k; w should be.
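
To make the "same features, different coefficients per class" point concrete, here is a minimal sketch in plain Python. It uses the softmax form of multinomial logistic regression rather than K separately trained one-vs-all classifiers, and it is not taken from the Spark code; the function name `class_probabilities` and all numbers are hypothetical.

```python
import math

def class_probabilities(x, weights):
    """Return P(class k | x) for each row w_k of `weights` (softmax of x . w_k)."""
    # The same feature vector x is scored against every class-specific w_k.
    scores = [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in weights]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical numbers: 2 features, K = 3 classes.
x = [1.0, 2.0]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.0]]
probs = class_probabilities(x, weights)
```

Only `weights` carries a class index; `x` is passed in once and reused for every class.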




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-12 Thread christian sommeregger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953136#comment-14953136
 ] 

christian sommeregger commented on SPARK-2309:
--

Yes, I think the k should be dropped in x_k on slide 25 in line 2.

Regarding my source: the current multinomial model in Spark corresponds to 
section 6.3.2. However, I think that something like 6.3.3 or 6.3.4 would be 
extremely useful. 




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-12 Thread DB Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953530#comment-14953530
 ] 

DB Tsai commented on SPARK-2309:


I think the priority will be porting MLOR into the Spark ML framework first, 
and then we can think about the extension. 




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-12 Thread DB Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953520#comment-14953520
 ] 

DB Tsai commented on SPARK-2309:


That was a typo. In line 3, x should be x_k. Several people already sent me 
email and asked the same question. I'll update the slide. Thanks.




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-10 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951744#comment-14951744
 ] 

Sean Owen commented on SPARK-2309:
--

Yeah, I also don't get it. In multinomial LR you still have the same features 
for every output class. The slide you show just shows a loss function computed 
over the loss for each of the N classes, not just 1. But the features are the 
same. Implicitly, if an example is in class k then it's not in the other 
classes.
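
The loss described here can be sketched as a sum of negative log-probabilities of the observed class, with every class scored on the same feature vector. This is an illustrative plain-Python version under that assumption, not the MLlib implementation; `multinomial_log_loss` and the sample data are hypothetical.

```python
import math

def multinomial_log_loss(examples, weights):
    """Sum of -log P(label | x) over (label, x) pairs, one weight vector per class."""
    loss = 0.0
    for label, x in examples:
        # Same features x for every class; only the weight vector changes.
        scores = [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in weights]
        top = max(scores)
        # log of the softmax normalizer, computed stably
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        # Belonging to class `label` implicitly means not belonging to the others.
        loss += log_norm - scores[label]
    return loss

# Hypothetical data: 2 features, 3 classes, labels are 0-based class indices.
examples = [(0, [1.0, 0.0]), (2, [0.0, 1.0])]
zero_weights = [[0.0, 0.0] for _ in range(3)]
loss = multinomial_log_loss(examples, zero_weights)
```

With all-zero weights every class is equally likely, so each example contributes log(3) to the loss.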




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-09 Thread DB Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951438#comment-14951438
 ] 

DB Tsai commented on SPARK-2309:


I don't quite get you; can you elaborate? I'm pretty sure that the 
implementation in Spark MLlib is the same as on the slide, and that's standard 
multinomial LoR. You can check the test code, which shows that the result 
matches R.




[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2015-10-09 Thread christian sommeregger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950208#comment-14950208
 ] 

christian sommeregger commented on SPARK-2309:
--

Hey everybody! 
After inspecting the code on GitHub, I believe that we have not really 
implemented the standard multinomial problem from 
http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25
but a model that covers a set of binary choices with item-specific weights, 
which is a slightly different thing. 

For a true multinomial setup, each row in the training data needs to contain 
all items (K = number of choices) that were available in a specific choice 
situation.
The current labelled point object, however, has just a choice flag plus the 
respective features of one item in each row:

e.g.: Labelled point (K=1)
0 | ()  
1 | ()  
3 | () 
3 | () 
0 | ()  


For the model on http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 we 
would rather need the following structure. 

e.g.: Always three Items in the choice set (K=3)
Choice Indicator | Item1Features | Item2Features | Item3Features
1 | () |  () |  () 
3 | () |  () |  ()* 
3 | () |  () |  ()*
 
e.g.: Flexible number of items in the choice set (K varies)
8 | () |  () |  () |  () |  () |  () |  () |  ()* |  () 
2 | () |  ()* |  () 
3 | () |  () |  ()* |  () 
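
A rough sketch of how such choice-set rows could be represented and scored, in plain Python. Everything here is an assumption for illustration: the (chosen, items) tuple layout, the function name `choice_probabilities`, the single shared weight vector, and all feature values are made up and do not correspond to any existing Spark class.

```python
import math

def choice_probabilities(items, w):
    """Softmax of (item features . w) over the items available in one choice situation."""
    utilities = [sum(f * wi for f, wi in zip(features, w)) for features in items]
    top = max(utilities)  # subtract the max utility for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Each row: (1-based index of the chosen item, one feature vector per available item).
# The number of items may differ from row to row.
situations = [
    (2, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    (1, [[2.0, 0.5], [0.5, 2.0]]),
]
w = [0.3, -0.1]  # hypothetical weight vector shared by all items

# Log-likelihood of the observed choices under these weights.
log_likelihood = sum(
    math.log(choice_probabilities(items, w)[chosen - 1])
    for chosen, items in situations
)
```

Because the weights are shared across items, the same `w` can score a row with three alternatives and a row with two.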





[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression

2014-07-15 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063163#comment-14063163
 ] 

Xiangrui Meng commented on SPARK-2309:
--

PR: https://github.com/apache/spark/pull/1379



