[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier

2017-02-02 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849959#comment-15849959
 ] 

Tommaso Teofili commented on LUCENE-7274:
-

+1 thanks [~caomanhdat].

> Add LogisticRegressionDocumentClassifier
> 
>
> Key: LUCENE-7274
> URL: https://issues.apache.org/jira/browse/LUCENE-7274
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/classification
> Reporter: Cao Manh Dat
> Assignee: Tommaso Teofili
> Attachments: LUCENE-7274.patch
>
>
> Add LogisticRegressionDocumentClassifier for Lucene.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier

2017-02-02 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849629#comment-15849629
 ] 

Cao Manh Dat commented on LUCENE-7274:
--

[~teofili] After reviewing the patch, I'm afraid we should close this issue 
as Won't Fix.

All classifiers in the classification module are lazy learning methods that 
rely on Lucene to classify documents quickly, so they don't have any 
pre-trained model. Logistic regression, on the other hand, is an eager 
learning method, so it needs a pre-trained model to classify documents. But 
the patch does not provide an API to train a logistic regression model, so it 
will be hard for users to use {{LogisticRegressionDocumentClassifier}}.
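
To illustrate what "eager" means here, the following is a minimal sketch of the 
training step a logistic regression classifier needs before it can classify 
anything. It is not taken from LUCENE-7274.patch: the class name 
{{EagerLogisticTrainerSketch}}, its method names, and the choice of plain batch 
gradient descent are all assumptions made for illustration, and it uses no 
Lucene API.

```java
// Hypothetical sketch, not part of LUCENE-7274.patch and not a Lucene API.
final class EagerLogisticTrainerSketch {

  // Eager step: learn weights up front. w[0] is the bias, w[j + 1] pairs with feature j.
  static double[] train(double[][] x, boolean[] y, int epochs, double learningRate) {
    int n = x[0].length;
    double[] w = new double[n + 1];
    for (int e = 0; e < epochs; e++) {
      double[] grad = new double[n + 1];
      for (int r = 0; r < x.length; r++) {
        // Prediction error for this example under the current weights.
        double err = sigmoid(score(w, x[r])) - (y[r] ? 1.0 : 0.0);
        grad[0] += err;
        for (int j = 0; j < n; j++) {
          grad[j + 1] += err * x[r][j];
        }
      }
      // Batch gradient descent update on the averaged gradient.
      for (int j = 0; j <= n; j++) {
        w[j] -= learningRate * grad[j] / x.length;
      }
    }
    return w;
  }

  // Linear score: bias plus weighted sum of the features.
  static double score(double[] w, double[] features) {
    double z = w[0];
    for (int j = 0; j < features.length; j++) {
      z += w[j + 1] * features[j];
    }
    return z;
  }

  static double sigmoid(double z) {
    return 1.0 / (1.0 + Math.exp(-z));
  }

  // Classification only needs the trained weights, not the training data.
  static boolean classify(double[] w, double[] features) {
    return sigmoid(score(w, features)) >= 0.5;
  }

  public static void main(String[] args) {
    double[][] x = {{0}, {1}, {3}, {4}};
    boolean[] y = {false, false, true, true};
    double[] w = train(x, y, 5000, 0.5);
    System.out.println("x=0 -> " + classify(w, new double[] {0}));
    System.out.println("x=4 -> " + classify(w, new double[] {4}));
  }
}
```

A lazy method such as k-nearest neighbours skips {{train}} entirely and consults 
the index at classification time, which is why the existing classifiers need no 
stored model.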

BTW, SOLR-8492 and SOLR-9252 provide an API for training a model. The trained 
model is stored as a document in a Lucene index. Reusing it would make Lucene 
depend on how Solr constructs that model, and I don't think that is a good 
idea.

So I think we can close this issue and create another issue like "a unified API 
for eager learning methods in the classification module".





[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier

2017-01-23 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834091#comment-15834091
 ] 

Cao Manh Dat commented on LUCENE-7274:
--

[~teofili] Sure, I will take a look at the above points soon!




[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier

2017-01-23 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834076#comment-15834076
 ] 

Tommaso Teofili commented on LUCENE-7274:
-

[~caomanhdat] would you have time to look into the above points?




[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier

2016-11-16 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669850#comment-15669850
 ] 

Tommaso Teofili commented on LUCENE-7274:
-

Hi [~caomanhdat], thanks for your patch.
A couple of comments:
- I think it'd be good if we could make it a {{LogisticRegressionClassifier}} 
and then extend it into a {{LogisticRegressionDocumentClassifier}} (like for 
{{KNearestNeighborClassifier}}).
- IIUTC this implementation assumes each feature is stored in a separate field 
and the weights are computed externally as a _double[]_; can this work, for 
example, with Solr's capabilities to store AI models?
- Regarding the labels, wouldn't it be better to declare the classifier as a 
{{Classifier}} (it's a binary classifier in the end)?
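
As a rough picture of the design discussed in these points, here is a minimal 
sketch of a binary logistic regression scorer that reads one numeric feature 
per named field and takes externally computed weights as a _double[]_. It is 
not the code from LUCENE-7274.patch: the class name 
{{LogisticRegressionSketch}}, the {{Map}}-based document stand-in, the bias at 
index 0, and the 0.5 threshold are all assumptions, and no Lucene API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the class from LUCENE-7274.patch and not a Lucene API.
final class LogisticRegressionSketch {
  private final String[] featureFields; // one field name per feature
  private final double[] weights;       // weights[0] is the bias; weights[i + 1] pairs with featureFields[i]

  LogisticRegressionSketch(String[] featureFields, double[] weights) {
    if (weights.length != featureFields.length + 1) {
      throw new IllegalArgumentException("expected one weight per field plus a bias");
    }
    this.featureFields = featureFields.clone();
    this.weights = weights.clone();
  }

  // Probability of the positive class; a missing field counts as 0.0 (an assumption).
  double probability(Map<String, Double> doc) {
    double z = weights[0];
    for (int i = 0; i < featureFields.length; i++) {
      z += weights[i + 1] * doc.getOrDefault(featureFields[i], 0.0);
    }
    return 1.0 / (1.0 + Math.exp(-z));
  }

  // Binary decision, which is why a boolean label type fits this classifier.
  boolean classify(Map<String, Double> doc) {
    return probability(doc) >= 0.5;
  }

  public static void main(String[] args) {
    LogisticRegressionSketch lr = new LogisticRegressionSketch(
        new String[] {"a", "b"}, new double[] {0.0, 1.0, -1.0});
    Map<String, Double> doc = new HashMap<>();
    doc.put("a", 2.0);
    System.out.println(lr.probability(doc) + " -> " + lr.classify(doc));
  }
}
```

Because the weights arrive as a plain array, where they are trained and stored 
(for example in a model document, as asked above) stays outside the classifier.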
