[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier
[ https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849959#comment-15849959 ]

Tommaso Teofili commented on LUCENE-7274:

+1 thanks [~caomanhdat].

> Add LogisticRegressionDocumentClassifier
>
> Key: LUCENE-7274
> URL: https://issues.apache.org/jira/browse/LUCENE-7274
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/classification
> Reporter: Cao Manh Dat
> Assignee: Tommaso Teofili
> Attachments: LUCENE-7274.patch
>
> Add LogisticRegressionDocumentClassifier for Lucene.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier
[ https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849629#comment-15849629 ]

Cao Manh Dat commented on LUCENE-7274:

[~teofili] After reviewing the patch, I'm afraid we should close this issue as "Won't Fix". All classifiers in the classification module are lazy learning methods that rely on Lucene to classify documents quickly; they don't have any pre-trained model. Logistic regression, on the other hand, is an eager learning method, so it needs a pre-trained model to classify documents. The patch does not provide an API to train a logistic regression model, so it will be hard for users to use {{LogisticRegressionDocumentClassifier}}.

BTW, SOLR-8492 and SOLR-9252 provide an API for training a model. The trained model is stored as a document in the Lucene index. Reusing that would make Lucene depend on how Solr constructs the model, which I don't think is a good idea.

So I think we can close this issue and create another one, such as "a unified API for eager learning methods in the classification module".
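To illustrate the eager-learning point above, here is a minimal sketch of the training step that the patch is said to lack. It assumes plain stochastic gradient descent over in-memory feature vectors; the class name {{LogisticRegressionTrainerSketch}} and its methods are hypothetical and are not part of LUCENE-7274.patch, Lucene, or Solr.

```java
/** Hypothetical trainer sketch; not part of LUCENE-7274.patch or any Lucene API. */
final class LogisticRegressionTrainerSketch {

  private LogisticRegressionTrainerSketch() {}

  /**
   * Fits weights with stochastic gradient descent on the log-loss.
   * Returns an array of length n + 1: n feature weights followed by the bias.
   */
  static double[] train(double[][] features, boolean[] labels, int epochs, double learningRate) {
    if (features.length == 0 || features.length != labels.length) {
      throw new IllegalArgumentException("need one label per training row, and at least one row");
    }
    int n = features[0].length;
    double[] w = new double[n + 1]; // all zeros initially; w[n] is the bias
    for (int epoch = 0; epoch < epochs; epoch++) {
      for (int i = 0; i < features.length; i++) {
        // error = target (1 or 0) minus the predicted probability
        double error = (labels[i] ? 1.0 : 0.0) - predict(w, features[i]);
        for (int j = 0; j < n; j++) {
          w[j] += learningRate * error * features[i][j];
        }
        w[n] += learningRate * error; // bias update
      }
    }
    return w;
  }

  /** Probability of the positive label for one feature vector. */
  static double predict(double[] w, double[] x) {
    double z = w[w.length - 1]; // start from the bias
    for (int j = 0; j < x.length; j++) {
      z += w[j] * x[j];
    }
    return 1.0 / (1.0 + Math.exp(-z));
  }
}
```

A lazy classifier needs no such step because it consults the index at classification time; an eager one has to run something like {{train}} first and persist the resulting weights somewhere.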
[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier
[ https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834091#comment-15834091 ]

Cao Manh Dat commented on LUCENE-7274:

[~teofili] Sure, I will take a look at the above points soon!
[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier
[ https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834076#comment-15834076 ]

Tommaso Teofili commented on LUCENE-7274:

[~caomanhdat] would you have time to look into the above points?
[jira] [Commented] (LUCENE-7274) Add LogisticRegressionDocumentClassifier
[ https://issues.apache.org/jira/browse/LUCENE-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669850#comment-15669850 ]

Tommaso Teofili commented on LUCENE-7274:

Hi [~caomanhdat], thanks for your patch. A couple of comments:
- I think it'd be good if we could make it a {{LogisticRegressionClassifier}} and then extend it into a {{LogisticRegressionDocumentClassifier}} (as is done for {{KNearestNeighbourClassifier}}).
- IIUC this implementation assumes each feature is stored in a separate field and the weights are computed externally as a _double[]_. Can this work, for example, with Solr's capabilities to store AI models?
- Regarding the labels, wouldn't it be better to declare the classifier as a {{Classifier<Boolean>}} (it's a binary classifier in the end)?
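For readers following the review comments above, the scoring side they describe (externally computed _double[]_ weights, one per feature, and a binary outcome) can be sketched roughly as follows. This is an illustration only: {{LogisticRegressionScorerSketch}}, the separate bias term, and the 0.5 decision threshold are assumptions, not the code in LUCENE-7274.patch, and reading feature values from index fields is left out.

```java
import java.util.Objects;

/** Hypothetical scoring sketch; not the class from LUCENE-7274.patch. */
final class LogisticRegressionScorerSketch {
  private final double[] weights; // one weight per feature, computed externally
  private final double bias;      // assumed separate intercept term

  LogisticRegressionScorerSketch(double[] weights, double bias) {
    this.weights = Objects.requireNonNull(weights, "weights").clone();
    this.bias = bias;
  }

  /** Returns P(label = true | features) via the logistic (sigmoid) function. */
  double probability(double[] features) {
    if (features.length != weights.length) {
      throw new IllegalArgumentException(
          "expected " + weights.length + " features, got " + features.length);
    }
    double z = bias;
    for (int i = 0; i < weights.length; i++) {
      z += weights[i] * features[i];
    }
    return 1.0 / (1.0 + Math.exp(-z));
  }

  /** Binary decision: true when the probability is at least 0.5 (assumed threshold). */
  boolean classify(double[] features) {
    return probability(features) >= 0.5;
  }
}
```

Because the result is a plain boolean, a classifier built this way would naturally assign a Boolean label, which is the point of the last review comment.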