[jira] [Resolved] (OPENNLP-574) Integrate the Apache Mahout classifiers

2017-01-17 Thread Suneel Marthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suneel Marthi resolved OPENNLP-574.
---
   Resolution: Won't Fix
Fix Version/s: 1.7.1

> Integrate the Apache Mahout classifiers 
> 
>
> Key: OPENNLP-574
> URL: https://issues.apache.org/jira/browse/OPENNLP-574
> Project: OpenNLP
>  Issue Type: Improvement
>Reporter: Joern Kottmann
>Assignee: Joern Kottmann
>Priority: Minor
> Fix For: 1.7.1
>
>
> Apache Mahout implements Logistic Regression and HMM classifiers which 
> could be used by the OpenNLP components. As soon as the machine learning is 
> pluggable in OpenNLP, a sandbox component should be added which can integrate 
> the Mahout classifiers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (OPENNLP-820) parser is mistagging quotes

2017-01-17 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann closed OPENNLP-820.
--
Resolution: Won't Fix

Looks like a mistake the model made and not a bug. Please re-open if that is 
not the case.

> parser is mistagging quotes
> ---
>
> Key: OPENNLP-820
> URL: https://issues.apache.org/jira/browse/OPENNLP-820
> Project: OpenNLP
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.6.0
>Reporter: Steven Owens
>  Labels: english, newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The parser is mistagging quotes (both single and double) with the default 
> English model. I notice it most on opening quotes, but it also happens to 
> closing quotes. 
> Example: (TOP (NP (NP-S-NP (NN "))(ADVP-C-NP (RB Here))(. ?)(. ")))  Both 
> double quotes should be labeled '' (two single quotes).
> The same sentence labeled with the part-of-speech tagger using the default 
> English model: "__`` Here_RB ?_. "_''





[jira] [Closed] (OPENNLP-748) Replace hard coded paths in the GazetteerIndexer.main with arguments

2017-01-17 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann closed OPENNLP-748.
--
Resolution: Fixed

This change has been made.

> Replace hard coded paths in the GazetteerIndexer.main with arguments
> 
>
> Key: OPENNLP-748
> URL: https://issues.apache.org/jira/browse/OPENNLP-748
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Entity Linker
>Reporter: Joern Kottmann
>Priority: Minor
>
> The GazetteerIndexer has a main method which builds the Lucene index based on 
> a couple of input files and writes the index to an output path. All those 
> paths are hard coded. To be able to use this tool without adapting the code 
> it would be much better if all paths can be passed in via command line 
> arguments.
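
A minimal sketch of what passing the paths on the command line could look like. It is not the actual GazetteerIndexer code: the class name IndexerArgs and the argument layout (one or more input files followed by the output index directory) are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the paths an indexer needs (not the real GazetteerIndexer).
final class IndexerArgs {
  final List<Path> inputFiles;
  final Path indexDir;

  private IndexerArgs(List<Path> inputFiles, Path indexDir) {
    this.inputFiles = inputFiles;
    this.indexDir = indexDir;
  }

  // Assumed layout: <input-file>... <output-index-dir>, with at least one input file.
  static IndexerArgs parse(String[] args) {
    if (args.length < 2) {
      throw new IllegalArgumentException(
          "Usage: <input-file>... <output-index-dir>");
    }
    List<Path> inputs = new ArrayList<>();
    // Every argument except the last one is treated as an input file.
    for (int i = 0; i < args.length - 1; i++) {
      inputs.add(Paths.get(args[i]));
    }
    // The last argument is the directory the index is written to.
    return new IndexerArgs(inputs, Paths.get(args[args.length - 1]));
  }
}
```

A main method would call IndexerArgs.parse(args) first and print the usage message on failure, so no path has to be edited in the source.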





[jira] [Commented] (OPENNLP-166) Remove slack parameter from GISModel

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826624#comment-15826624
 ] 

ASF GitHub Bot commented on OPENNLP-166:


GitHub user kottmann opened a pull request:

https://github.com/apache/opennlp/pull/72

OPENNLP-166: Remove or deprecate slack parameter



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kottmann/opennlp OPENNLP-166

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/opennlp/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #72


commit 969b63c1e91849a4013554170967930ae0b923c9
Author: Jörn Kottmann 
Date:   2017-01-17T19:05:00Z

OPENNLP-166: Remove or deprecate slack parameter




> Remove slack parameter from GISModel
> 
>
> Key: OPENNLP-166
> URL: https://issues.apache.org/jira/browse/OPENNLP-166
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Machine Learning
>Reporter: Joern Kottmann
>Assignee: Joern Kottmann
>Priority: Minor
> Fix For: 1.7.1
>
>
> The support for a slack parameter inside the model should be removed from the 
> GISModel class. Only old models which have been trained prior to maxent 3.0 can 
> have such a slack parameter or correction constant. The training code since 
> maxent 3.0 always sets the correction constant to 1 to support backward 
> compatibility with old models.
> Since the removal will break backward compatibility it should be aligned with 
> the opennlp-ml redesign.
> The support to train models with a slack parameter has been completely 
> removed in OPENNLP-165.





[jira] [Created] (OPENNLP-947) Organize imports according to new order

2017-01-17 Thread Joern Kottmann (JIRA)
Joern Kottmann created OPENNLP-947:
--

 Summary: Organize imports according to new order
 Key: OPENNLP-947
 URL: https://issues.apache.org/jira/browse/OPENNLP-947
 Project: OpenNLP
  Issue Type: Improvement
Reporter: Joern Kottmann
Priority: Trivial
 Fix For: 1.7.1


It would be nice to do this for the code base and enforce it via checkstyle. We 
can tell people to make sure their IDE is configured correctly and everything 
will be fine.





[jira] [Updated] (OPENNLP-166) Remove slack parameter from GISModel

2017-01-17 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann updated OPENNLP-166:
---
Fix Version/s: (was: 1.7.2)
   1.7.1

> Remove slack parameter from GISModel
> 
>
> Key: OPENNLP-166
> URL: https://issues.apache.org/jira/browse/OPENNLP-166
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Machine Learning
>Reporter: Joern Kottmann
>Priority: Minor
> Fix For: 1.7.1
>
>
> The support for a slack parameter inside the model should be removed from the 
> GISModel class. Only old models which have been trained prior to maxent 3.0 can 
> have such a slack parameter or correction constant. The training code since 
> maxent 3.0 always sets the correction constant to 1 to support backward 
> compatibility with old models.
> Since the removal will break backward compatibility it should be aligned with 
> the opennlp-ml redesign.
> The support to train models with a slack parameter has been completely 
> removed in OPENNLP-165.





[jira] [Commented] (OPENNLP-675) Absence of logging and usage of System.out

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826494#comment-15826494
 ] 

ASF GitHub Bot commented on OPENNLP-675:


GitHub user kottmann opened a pull request:

https://github.com/apache/opennlp/pull/71

OPENNLP-675 Use verbose param to control printing to stdout



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kottmann/opennlp OPENNLP-675

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/opennlp/pull/71.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #71


commit 98b2d5e12d9b671c26d8ad9dc6fc3d5dc3798470
Author: Jörn Kottmann 
Date:   2017-01-17T17:53:37Z

OPENNLP-675 Use verbose param to control printing to stdout




> Absence of logging and usage of System.out
> --
>
> Key: OPENNLP-675
> URL: https://issues.apache.org/jira/browse/OPENNLP-675
> Project: OpenNLP
>  Issue Type: New Feature
>  Components: Sentence Detector, Tokenizer
>Reporter: Eugene Prystupa
> Fix For: 1.7.1
>
>
> There seems to be no concept of logging used by the libraries. Instead, 
> System.out.println is hard-coded in many places where a logging framework 
> should be used to emit debug information.
> This makes it awkward to use the modules integrated into a different 
> application (as it spams our logs or console). 
> Is the usage of System.out in core classes (like GISTrainer) by choice? Or is 
> it simply technical debt? I am happy to work on it and provide a patch if 
> this is technical debt.
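
A minimal sketch of the idea named in the pull request title, gating progress output behind a verbose flag. The class VerboseReporter, its constructor parameters and the display method are hypothetical and are not taken from the pull request or from GISTrainer.

```java
import java.io.PrintStream;

// Hypothetical helper: prints progress messages only when verbose is enabled.
final class VerboseReporter {
  private final boolean verbose;
  private final PrintStream out;

  VerboseReporter(boolean verbose, PrintStream out) {
    this.verbose = verbose;
    this.out = out;
  }

  // Writes the message to the configured stream, or does nothing if not verbose.
  void display(String message) {
    if (verbose) {
      out.print(message);
    }
  }
}
```

An application embedding the library would construct it with verbose set to false, while a command line tool could pass true and System.out to keep the current behavior.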





[jira] [Closed] (OPENNLP-919) Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning

2017-01-17 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann closed OPENNLP-919.
--
Resolution: Fixed

> Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
> --
>
> Key: OPENNLP-919
> URL: https://issues.apache.org/jira/browse/OPENNLP-919
> Project: OpenNLP
>  Issue Type: Improvement
>Reporter: Joern Kottmann
>Priority: Minor
> Fix For: 1.7.1
>
>
> The warning should either be fixed or, if it is a false alarm, suppressed.





[jira] [Commented] (OPENNLP-919) Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826412#comment-15826412
 ] 

ASF GitHub Bot commented on OPENNLP-919:


GitHub user kottmann opened a pull request:

https://github.com/apache/opennlp/pull/70

OPENNLP-919: Remove type variable from varargs



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kottmann/opennlp OPENNLP-919

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/opennlp/pull/70.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #70


commit c75bce14266b727c0bd022929d3ee10bc3062704
Author: Jörn Kottmann 
Date:   2017-01-17T17:17:36Z

OPENNLP-919: Remove type variable from varargs




> Fix/Suppresse "Possible heap pollution from parameterized vararg type" warning
> --
>
> Key: OPENNLP-919
> URL: https://issues.apache.org/jira/browse/OPENNLP-919
> Project: OpenNLP
>  Issue Type: Improvement
>Reporter: Joern Kottmann
>Priority: Minor
> Fix For: 1.7.1
>
>
> The warning should either be fixed or, if it is a false alarm, suppressed.
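
For context, a small sketch of the two options the description mentions. The class and method names are made up for illustration and are not from the OpenNLP code base. The first method keeps the generic varargs parameter and suppresses the warning with @SafeVarargs. The second avoids a type variable in the varargs parameter, which seems to be the direction the pull request title describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical examples; not code from OpenNLP.
final class VarargsExamples {

  // A generic varargs parameter triggers "Possible heap pollution from
  // parameterized vararg type" at the declaration. @SafeVarargs is permitted
  // here because the method is static, and it is justified because the method
  // only reads the array and never stores into it or exposes it.
  @SafeVarargs
  static <T> List<T> listOf(T... items) {
    return new ArrayList<>(Arrays.asList(items));
  }

  // No type variable in the varargs parameter, so the compiler has nothing
  // to warn about and no annotation is needed.
  static List<String> listOfStrings(String... items) {
    return new ArrayList<>(Arrays.asList(items));
  }
}
```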





[jira] [Commented] (OPENNLP-936) Add thread safe versions of some tools (ME sentence detection, tokenization, pos tagging)

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826267#comment-15826267
 ] 

ASF GitHub Bot commented on OPENNLP-936:


GitHub user twgoetz opened a pull request:

https://github.com/apache/opennlp/pull/69

OPENNLP-936: Add thread-safe versions of POSTaggerME, SentenceDetecto…

…rME and TokenizerME. Include test case as well.

I'm open to changing the names of the classes, if anybody has a better idea.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twgoetz/opennlp opennlp-936

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/opennlp/pull/69.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #69


commit 1916d9458f68edbec573be8c6b6a214c9f91b91e
Author: Thilo Goetz 
Date:   2017-01-17T15:40:46Z

OPENNLP-936: Add thread-safe versions of POSTaggerME, SentenceDetectorME 
and TokenizerME. Include test case as well.




> Add thread safe versions of some tools (ME sentence detection, tokenization, 
> pos tagging)
> -
>
> Key: OPENNLP-936
> URL: https://issues.apache.org/jira/browse/OPENNLP-936
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: POS Tagger
>Affects Versions: 1.7.1
>Reporter: Thilo Goetz
>Priority: Minor
> Fix For: 1.7.1
>
>
> As discussed on the mailing list, add thread safe versions of maximum entropy 
> sentence detection, tokenization and pos tagging.
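
One common way to make a non-thread-safe tool usable from several threads is to give each thread its own instance. The sketch below shows that pattern under stated assumptions: the Tagger interface and the ThreadLocalTagger class are hypothetical stand-ins and do not claim to match the classes added in the pull request.

```java
import java.util.function.Supplier;

// Hypothetical minimal interface standing in for a non-thread-safe tool.
interface Tagger {
  String[] tag(String[] tokens);
}

// Thread-safe facade: each calling thread lazily gets its own Tagger instance.
final class ThreadLocalTagger implements Tagger {
  private final ThreadLocal<Tagger> perThread;

  // The factory is invoked once per thread, on that thread's first call to tag.
  ThreadLocalTagger(Supplier<? extends Tagger> factory) {
    this.perThread = ThreadLocal.withInitial(factory);
  }

  @Override
  public String[] tag(String[] tokens) {
    return perThread.get().tag(tokens);
  }
}
```

The trade-off is memory: every thread that calls tag keeps its own instance for as long as the thread lives, which matters with large thread pools.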





[jira] [Updated] (OPENNLP-6) Create a new OpenNLP project logo for the site

2017-01-17 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated OPENNLP-6:
-
Attachment: OpenNLP-koji-1.png

> Create a new OpenNLP project logo for the site
> --
>
> Key: OPENNLP-6
> URL: https://issues.apache.org/jira/browse/OPENNLP-6
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Website
>Reporter: Joern Kottmann
>Priority: Minor
>  Labels: help-wanted
> Attachments: kinow-opennlp-1.png, kinow-opennlp-2.png, 
> kinow-opennlp-3.png, kinow-opennlp-3-variations.png, OpenNLP-koji-1.png, 
> OpenNLP-koji-1.png, OpenNLP-koji-2.png, OpenNLP.png, opennlp-variations.png
>
>
> The current logo has not been changed for a long time (if ever). 
> Let's create a new, fresh-looking logo.





[jira] [Created] (OPENNLP-946) GISTrainer should extend AbstractEventTrainer

2017-01-17 Thread Joern Kottmann (JIRA)
Joern Kottmann created OPENNLP-946:
--

 Summary: GISTrainer should extend AbstractEventTrainer
 Key: OPENNLP-946
 URL: https://issues.apache.org/jira/browse/OPENNLP-946
 Project: OpenNLP
  Issue Type: Improvement
  Components: Machine Learning
Reporter: Joern Kottmann
Priority: Minor
 Fix For: 1.7.2


This should be refactored to make it fit into the ml framework. Currently, only 
GIS fits in, and it then calls GISTrainer.

Let's do the following:
- GISTrainer will extend AbstractEventTrainer
- GIS will be deprecated





[jira] [Updated] (OPENNLP-904) Adding new functionality to know all possible lemmas given a word and pos tag pair

2017-01-17 Thread Rodrigo Agerri (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Agerri updated OPENNLP-904:
---
Fix Version/s: (was: 1.7.1)
   1.7.2

> Adding new functionality to know all possible lemmas given a word and pos tag 
> pair
> --
>
> Key: OPENNLP-904
> URL: https://issues.apache.org/jira/browse/OPENNLP-904
> Project: OpenNLP
>  Issue Type: New Feature
>  Components: Lemmatizer
>Reporter: Rodrigo Agerri
>Assignee: Rodrigo Agerri
>Priority: Minor
> Fix For: 1.7.2
>
>
> Currently the various lemmatizers (DictionaryLemmatizer, LemmatizerME and 
> MorfologikLemmatizer) do not allow obtaining all possible lemmas for a given 
> word and POS tag pair. This functionality is useful and should be added.
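
To illustrate the requested behavior, here is a sketch of a dictionary lookup that returns every lemma stored for a word and POS tag pair instead of a single one. The class MultiLemmaDictionary and its methods are hypothetical and do not reflect the API of the existing lemmatizers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical dictionary mapping a (word, POS tag) pair to all known lemmas.
final class MultiLemmaDictionary {
  private final Map<String, List<String>> lemmas = new HashMap<>();

  // Tab-separated key; assumes neither the word nor the tag contains a tab.
  private static String key(String word, String posTag) {
    return word + "\t" + posTag;
  }

  // Registers one more lemma for the pair, keeping insertion order.
  void add(String word, String posTag, String lemma) {
    lemmas.computeIfAbsent(key(word, posTag), k -> new ArrayList<>()).add(lemma);
  }

  // Returns all lemmas for the pair, or an empty list if the pair is unknown.
  List<String> lemmatize(String word, String posTag) {
    List<String> found = lemmas.get(key(word, posTag));
    if (found == null) {
      return Collections.<String>emptyList();
    }
    return Collections.unmodifiableList(found);
  }
}
```

For example, after adding the lemmas "axe" and "axis" for the pair ("axes", "NNS"), a lookup for that pair returns both, and the caller decides how to disambiguate.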


