[jira] [Resolved] (OPENNLP-574) Integrate the Apache Mahout classifiers
[ https://issues.apache.org/jira/browse/OPENNLP-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved OPENNLP-574.
----------------------------------
    Resolution: Won't Fix
    Fix Version/s: 1.7.1

> Integrate the Apache Mahout classifiers
> ---------------------------------------
>
> Key: OPENNLP-574
> URL: https://issues.apache.org/jira/browse/OPENNLP-574
> Project: OpenNLP
> Issue Type: Improvement
> Reporter: Joern Kottmann
> Assignee: Joern Kottmann
> Priority: Minor
> Fix For: 1.7.1
>
> Apache Mahout implements Logistic Regression and HMM classifiers which
> could be used by the OpenNLP components. As soon as the machine learning is
> pluggable in OpenNLP, a sandbox component should be added which can integrate
> the Mahout classifiers.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OPENNLP-820) parser is mistagging quotes
[ https://issues.apache.org/jira/browse/OPENNLP-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann closed OPENNLP-820.
---------------------------------
    Resolution: Won't Fix

Looks like a mistake the model made and not a bug. Please re-open if that is not the case.

> parser is mistagging quotes
> ---------------------------
>
> Key: OPENNLP-820
> URL: https://issues.apache.org/jira/browse/OPENNLP-820
> Project: OpenNLP
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.6.0
> Reporter: Steven Owens
> Labels: english, newbie
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The parser is mistagging quotes (both single and double) with the default
> English model. I notice it most on opening quotes, but it also happens with
> closing quotes.
> ex. (TOP (NP (NP-S-NP (NN "))(ADVP-C-NP (RB Here))(. ?)(. ")))
> Both double quotes should be labeled '' (two single quotes).
> The same sentence labeled with the part-of-speech tagger using the default
> English model:
> "__`` Here_RB ?_. "_''
[jira] [Closed] (OPENNLP-748) Replace hard coded paths in the GazetteerIndexer.main with arguments
[ https://issues.apache.org/jira/browse/OPENNLP-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann closed OPENNLP-748.
---------------------------------
    Resolution: Fixed

This change was done.

> Replace hard coded paths in the GazetteerIndexer.main with arguments
> --------------------------------------------------------------------
>
> Key: OPENNLP-748
> URL: https://issues.apache.org/jira/browse/OPENNLP-748
> Project: OpenNLP
> Issue Type: Improvement
> Components: Entity Linker
> Reporter: Joern Kottmann
> Priority: Minor
>
> The GazetteerIndexer has a main method which builds the Lucene index based on
> a couple of input files and writes the index to an output path. All those
> paths are hard coded. To be able to use this tool without adapting the code
> it would be much better if all paths can be passed in via command line
> arguments.
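A minimal sketch of the idea described in OPENNLP-748, not the actual GazetteerIndexer code. The class name IndexerArgs, the argument order (input files first, output directory last) and the usage text are assumptions made for illustration only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for the paths an indexer needs; not part of OpenNLP. */
final class IndexerArgs {
  final List<Path> inputFiles;
  final Path indexDir;

  private IndexerArgs(List<Path> inputFiles, Path indexDir) {
    this.inputFiles = Collections.unmodifiableList(inputFiles);
    this.indexDir = indexDir;
  }

  /**
   * Assumed convention: every argument except the last is an input file,
   * and the last argument is the directory the index is written to.
   */
  static IndexerArgs parse(String[] args) {
    if (args.length < 2) {
      throw new IllegalArgumentException(
          "Usage: <inputFile>... <indexOutputDir>");
    }
    List<Path> inputs = new ArrayList<>();
    for (int i = 0; i < args.length - 1; i++) {
      inputs.add(Paths.get(args[i]));
    }
    return new IndexerArgs(inputs, Paths.get(args[args.length - 1]));
  }
}
```

A main method would then call IndexerArgs.parse(args) once and hand the resulting paths to the indexing code, so no path has to be edited in the source.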
[jira] [Commented] (OPENNLP-166) Remove slack parameter from GISModel
[ https://issues.apache.org/jira/browse/OPENNLP-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826624#comment-15826624 ]

ASF GitHub Bot commented on OPENNLP-166:
----------------------------------------

GitHub user kottmann opened a pull request:

    https://github.com/apache/opennlp/pull/72

    OPENNLP-166: Remove or deprecate slack parameter

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kottmann/opennlp OPENNLP-166

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/opennlp/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #72

commit 969b63c1e91849a4013554170967930ae0b923c9
Author: Jörn Kottmann
Date: 2017-01-17T19:05:00Z

    OPENNLP-166: Remove or deprecate slack parameter

> Remove slack parameter from GISModel
> ------------------------------------
>
> Key: OPENNLP-166
> URL: https://issues.apache.org/jira/browse/OPENNLP-166
> Project: OpenNLP
> Issue Type: Improvement
> Components: Machine Learning
> Reporter: Joern Kottmann
> Assignee: Joern Kottmann
> Priority: Minor
> Fix For: 1.7.1
>
> The support for a slack parameter inside the model should be removed from the
> GISModel class. Only old models which were trained prior to maxent 3.0 can
> have such a slack parameter or correction constant. The training code since
> maxent 3.0 always sets the correction constant to 1 to support backward
> compatibility with old models.
> Since the removal will break backward compatibility, it should be aligned with
> the opennlp-ml redesign.
> The support to train models with a slack parameter has been completely
> removed in OPENNLP-165.
[jira] [Created] (OPENNLP-947) Organize imports according to new order
Joern Kottmann created OPENNLP-947:
-----------------------------------

    Summary: Organize imports according to new order
    Key: OPENNLP-947
    URL: https://issues.apache.org/jira/browse/OPENNLP-947
    Project: OpenNLP
    Issue Type: Improvement
    Reporter: Joern Kottmann
    Priority: Trivial
    Fix For: 1.7.1

It would be nice to do this for the code base and enforce it via checkstyle.
We can tell people to make sure their IDE is configured correctly and everything will be fine.
[jira] [Updated] (OPENNLP-166) Remove slack parameter from GISModel
[ https://issues.apache.org/jira/browse/OPENNLP-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann updated OPENNLP-166:
----------------------------------
    Fix Version/s: (was: 1.7.2)
                   1.7.1

> Remove slack parameter from GISModel
> ------------------------------------
>
> Key: OPENNLP-166
> URL: https://issues.apache.org/jira/browse/OPENNLP-166
> Project: OpenNLP
> Issue Type: Improvement
> Components: Machine Learning
> Reporter: Joern Kottmann
> Priority: Minor
> Fix For: 1.7.1
>
> The support for a slack parameter inside the model should be removed from the
> GISModel class. Only old models which were trained prior to maxent 3.0 can
> have such a slack parameter or correction constant. The training code since
> maxent 3.0 always sets the correction constant to 1 to support backward
> compatibility with old models.
> Since the removal will break backward compatibility, it should be aligned with
> the opennlp-ml redesign.
> The support to train models with a slack parameter has been completely
> removed in OPENNLP-165.
[jira] [Commented] (OPENNLP-675) Absence of logging and usage of System.out
[ https://issues.apache.org/jira/browse/OPENNLP-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826494#comment-15826494 ]

ASF GitHub Bot commented on OPENNLP-675:
----------------------------------------

GitHub user kottmann opened a pull request:

    https://github.com/apache/opennlp/pull/71

    OPENNLP-675 Use verbose param to control printing to stdout

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kottmann/opennlp OPENNLP-675

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/opennlp/pull/71.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #71

commit 98b2d5e12d9b671c26d8ad9dc6fc3d5dc3798470
Author: Jörn Kottmann
Date: 2017-01-17T17:53:37Z

    OPENNLP-675 Use verbose param to control printing to stdout

> Absence of logging and usage of System.out
> ------------------------------------------
>
> Key: OPENNLP-675
> URL: https://issues.apache.org/jira/browse/OPENNLP-675
> Project: OpenNLP
> Issue Type: New Feature
> Components: Sentence Detector, Tokenizer
> Reporter: Eugene Prystupa
> Fix For: 1.7.1
>
> There seems to be no concept of logging used by the libraries. Instead
> System.out.println is hard-coded in many places where emitting debug
> information through a logging framework would do.
> This makes it awkward to use the modules integrated into a different
> application (as it spams our logs or console).
> Is the usage of System.out in core classes (like GISTrainer) by choice? Or is
> it simply technical debt? I am happy to work on it and provide a patch if
> this is technical debt.
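The pull request title mentions a verbose parameter that controls printing to stdout. The following is only a sketch of that general pattern under stated assumptions: VerboseReporter and its display method are hypothetical names and do not reflect the actual change in pull request 71.

```java
import java.io.PrintStream;

/** Hypothetical helper showing output gated by a verbose flag; not OpenNLP code. */
final class VerboseReporter {
  private final boolean verbose;
  private final PrintStream out;

  /** The stream is injected so callers can pass System.out or redirect it elsewhere. */
  VerboseReporter(boolean verbose, PrintStream out) {
    this.verbose = verbose;
    this.out = out;
  }

  /** Prints the message only when verbose output was requested. */
  void display(String message) {
    if (verbose) {
      out.println(message);
    }
  }
}
```

An embedding application would construct the reporter with verbose set to false to keep its own console and logs clean, while a command line tool could pass true and System.out.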
[jira] [Closed] (OPENNLP-919) Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
[ https://issues.apache.org/jira/browse/OPENNLP-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann closed OPENNLP-919.
---------------------------------
    Resolution: Fixed

> Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
> ------------------------------------------------------------------------------
>
> Key: OPENNLP-919
> URL: https://issues.apache.org/jira/browse/OPENNLP-919
> Project: OpenNLP
> Issue Type: Improvement
> Reporter: Joern Kottmann
> Priority: Minor
> Fix For: 1.7.1
>
> The warning should either be fixed or, if it is a false alarm, be suppressed.
[jira] [Commented] (OPENNLP-919) Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
[ https://issues.apache.org/jira/browse/OPENNLP-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826412#comment-15826412 ]

ASF GitHub Bot commented on OPENNLP-919:
----------------------------------------

GitHub user kottmann opened a pull request:

    https://github.com/apache/opennlp/pull/70

    OPENNLP-919: Remove type variable from varargs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kottmann/opennlp OPENNLP-919

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/opennlp/pull/70.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #70

commit c75bce14266b727c0bd022929d3ee10bc3062704
Author: Jörn Kottmann
Date: 2017-01-17T17:17:36Z

    OPENNLP-919: Remove type variable from varargs

> Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
> ------------------------------------------------------------------------------
>
> Key: OPENNLP-919
> URL: https://issues.apache.org/jira/browse/OPENNLP-919
> Project: OpenNLP
> Issue Type: Improvement
> Reporter: Joern Kottmann
> Priority: Minor
> Fix For: 1.7.1
>
> The warning should either be fixed or, if it is a false alarm, be suppressed.
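For readers unfamiliar with the warning, here is a small illustration of the two options the issue names (suppress or fix). The class and method names are made up for this example and are not taken from the OpenNLP code base or from pull request 70.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; not OpenNLP code. */
final class VarargsExamples {
  private VarargsExamples() {}

  /**
   * A varargs parameter whose element type is a type variable. Without the
   * annotation, javac reports a possible heap pollution warning at this
   * declaration. The body only reads from the array, so @SafeVarargs
   * (permitted on static and final methods) is a legitimate suppression.
   */
  @SafeVarargs
  static <T> List<T> listOf(T... items) {
    return new ArrayList<>(Arrays.asList(items));
  }

  /**
   * The other option: no type variable in the varargs parameter, so the
   * warning does not arise, at the cost of a less precise element type.
   */
  static List<Object> listOfObjects(Object... items) {
    return new ArrayList<>(Arrays.asList(items));
  }
}
```

Which option is preferable depends on whether callers need the precise element type; the pull request title suggests the second route was taken here.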
[jira] [Commented] (OPENNLP-936) Add thread safe versions of some tools (ME sentence detection, tokenization, pos tagging)
[ https://issues.apache.org/jira/browse/OPENNLP-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826267#comment-15826267 ]

ASF GitHub Bot commented on OPENNLP-936:
----------------------------------------

GitHub user twgoetz opened a pull request:

    https://github.com/apache/opennlp/pull/69

    OPENNLP-936: Add thread-safe versions of POSTaggerME, SentenceDetecto…

    …rME and TokenizerME. Include test case as well.

    I'm open to changing the names of the classes, if anybody has a better idea.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/twgoetz/opennlp opennlp-936

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/opennlp/pull/69.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #69

commit 1916d9458f68edbec573be8c6b6a214c9f91b91e
Author: Thilo Goetz
Date: 2017-01-17T15:40:46Z

    OPENNLP-936: Add thread-safe versions of POSTaggerME, SentenceDetectorME and TokenizerME. Include test case as well.

> Add thread safe versions of some tools (ME sentence detection, tokenization,
> pos tagging)
> ----------------------------------------------------------------------------
>
> Key: OPENNLP-936
> URL: https://issues.apache.org/jira/browse/OPENNLP-936
> Project: OpenNLP
> Issue Type: Improvement
> Components: POS Tagger
> Affects Versions: 1.7.1
> Reporter: Thilo Goetz
> Priority: Minor
> Fix For: 1.7.1
>
> As discussed on the mailing list, add thread safe versions of maximum entropy
> sentence detection, tokenization and pos tagging.
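One common way to make a stateful tool usable from several threads is to give each thread its own instance behind a shared facade. The sketch below shows that pattern with a stand-in interface; SimpleTokenizer and PerThreadTokenizer are hypothetical names, and the message above does not say whether pull request 69 uses this approach.

```java
import java.util.function.Supplier;

/** Stand-in for a tool that is not safe to share between threads; not an OpenNLP interface. */
interface SimpleTokenizer {
  String[] tokenize(String text);
}

/** Shares one facade between threads while every thread works on its own delegate. */
final class PerThreadTokenizer implements SimpleTokenizer {
  private final ThreadLocal<SimpleTokenizer> delegate;

  /** The factory is called at most once per thread, on that thread's first use. */
  PerThreadTokenizer(Supplier<SimpleTokenizer> factory) {
    this.delegate = ThreadLocal.withInitial(factory);
  }

  @Override
  public String[] tokenize(String text) {
    return delegate.get().tokenize(text);
  }
}
```

The trade-off is memory: each thread holds its own delegate, so the expensive shared part (for example a loaded model) should be passed into the factory rather than recreated per thread.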
[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site
[ https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated OPENNLP-6:
--------------------------------
    Attachment: OpenNLP-koji-1.png

> Create a new OpenNLP project logo for the site
> ----------------------------------------------
>
> Key: OPENNLP-6
> URL: https://issues.apache.org/jira/browse/OPENNLP-6
> Project: OpenNLP
> Issue Type: Improvement
> Components: Website
> Reporter: Joern Kottmann
> Priority: Minor
> Labels: help-wanted
> Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png,
> kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP-koji-1.png,
> OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png
>
> The current logo has not been changed for a long time (if ever).
> Let's create a new, fresh-looking logo.
[jira] [Created] (OPENNLP-946) GISTrainer should extend AbstractEventTrainer
Joern Kottmann created OPENNLP-946:
-----------------------------------

    Summary: GISTrainer should extend AbstractEventTrainer
    Key: OPENNLP-946
    URL: https://issues.apache.org/jira/browse/OPENNLP-946
    Project: OpenNLP
    Issue Type: Improvement
    Components: Machine Learning
    Reporter: Joern Kottmann
    Priority: Minor
    Fix For: 1.7.2

This should be refactored to make it fit into the ml framework. Currently GIS only fits in and then calls GISTrainer.
Let's do the following:
- GISTrainer will extend AbstractEventTrainer
- GIS will be deprecated
[jira] [Updated] (OPENNLP-904) Adding new functionality to know all possible lemmas given a word and pos tag pair
[ https://issues.apache.org/jira/browse/OPENNLP-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo Agerri updated OPENNLP-904:
----------------------------------
    Fix Version/s: (was: 1.7.1)
                   1.7.2

> Adding new functionality to know all possible lemmas given a word and pos tag
> pair
> -----------------------------------------------------------------------------
>
> Key: OPENNLP-904
> URL: https://issues.apache.org/jira/browse/OPENNLP-904
> Project: OpenNLP
> Issue Type: New Feature
> Components: Lemmatizer
> Reporter: Rodrigo Agerri
> Assignee: Rodrigo Agerri
> Priority: Minor
> Fix For: 1.7.2
>
> Currently the various lemmatizers (DictionaryLemmatizer, LemmatizerME and
> MorfologikLemmatizer) do not allow obtaining all possible lemmas given a word
> and postag pair. This functionality is useful and should be added.
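To make the request concrete, here is a rough sketch of a lookup that returns every lemma recorded for a word and POS tag pair instead of a single one. MultiLemmaDictionary and its methods are hypothetical and are not the API of DictionaryLemmatizer, LemmatizerME or MorfologikLemmatizer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory dictionary keyed by (word, POS tag); not OpenNLP code. */
final class MultiLemmaDictionary {
  private final Map<String, List<String>> lemmas = new HashMap<>();

  /** Joins word and tag with a tab, assuming neither contains a tab character. */
  private static String key(String word, String posTag) {
    return word + "\t" + posTag;
  }

  /** Records one more lemma for the pair, keeping insertion order. */
  void add(String word, String posTag, String lemma) {
    lemmas.computeIfAbsent(key(word, posTag), k -> new ArrayList<>()).add(lemma);
  }

  /** Returns every lemma recorded for the pair, or an empty list if the pair is unknown. */
  List<String> lemmatize(String word, String posTag) {
    List<String> found = lemmas.get(key(word, posTag));
    return found == null ? Collections.emptyList() : Collections.unmodifiableList(found);
  }
}
```

Returning a list leaves the choice among ambiguous lemmas to the caller, which is the capability the issue asks for.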