[jira] [Created] (OPENNLP-977) TrainerFactory uses Deprecated methods

2017-02-06 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)
Russ, Daniel (NIH/CIT) [E] created OPENNLP-977:
--

 Summary: TrainerFactory uses Deprecated methods
 Key: OPENNLP-977
 URL: https://issues.apache.org/jira/browse/OPENNLP-977
 Project: OpenNLP
  Issue Type: Bug
  Components: Machine Learning
Affects Versions: 1.7.2
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
 Fix For: 1.7.3


getEventTrainer/getEventModelSequenceTrainer use maps instead of 
TrainingParameters.  EventModelSequenceTrainer also uses maps instead of 
TrainingParameters.  This is not an outward-facing interface for most users.  
All init(Map, Map) methods should be deprecated in 
favor of init(TrainingParameters, Map).
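
A minimal sketch of the deprecation described above, assuming the map-based init simply forwards to the parameters-based one. SketchParams and SketchTrainer are hypothetical stand-ins, not the OpenNLP classes, and the String-to-String map types and the "Iterations" key are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a TrainingParameters-style class; not the OpenNLP class.
class SketchParams {
  private final Map<String, String> settings = new HashMap<>();

  SketchParams() {}

  // Copies a legacy settings map so old call sites can be adapted.
  SketchParams(Map<String, String> legacy) {
    settings.putAll(legacy);
  }

  void put(String key, String value) {
    settings.put(key, value);
  }

  String get(String key, String defaultValue) {
    return settings.getOrDefault(key, defaultValue);
  }
}

// Hypothetical trainer showing one way to deprecate the map-based overload.
class SketchTrainer {
  private SketchParams params;
  private Map<String, String> reportMap;

  /** Preferred entry point: takes the parameters object. */
  void init(SketchParams params, Map<String, String> reportMap) {
    this.params = params;
    this.reportMap = reportMap;
  }

  /** Legacy entry point, kept for source compatibility; it only forwards. */
  @Deprecated
  void init(Map<String, String> trainParams, Map<String, String> reportMap) {
    init(new SketchParams(trainParams), reportMap);
  }

  // Reads a setting through the parameters object (assumed key name).
  int iterations() {
    return Integer.parseInt(params.get("Iterations", "100"));
  }
}
```

Forwarding keeps a single code path, so existing callers keep working and only see a deprecation warning.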



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (OPENNLP-859) Cannot get entities from trained model using DictionaryFeatureGenerator

2017-01-27 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843482#comment-15843482
 ] 

Russ, Daniel (NIH/CIT) [E] commented on OPENNLP-859:


Hi Damiano,
   If you capitalize the "M" in Maria, does it detect "Maria" as a name?
Daniel
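
One possible reason capitalization could matter, offered as a guess rather than a diagnosis: the quoted code below uses token-class features, which describe the shape of a token, so "maria" and "Maria" may produce different features. The sketch is a simplified, hypothetical shape function; it is not OpenNLP's implementation, and the labels "lc", "ic", "ac" and "other" are chosen here for illustration.

```java
import java.util.Locale;

// Hypothetical, simplified token-shape feature; OpenNLP's generators are more detailed.
class TokenShapeSketch {
  static String shape(String token) {
    if (token.isEmpty()) {
      return "other";
    }
    boolean allLetters = token.chars().allMatch(Character::isLetter);
    if (!allLetters) {
      return "other";
    }
    if (token.equals(token.toLowerCase(Locale.ROOT))) {
      return "lc"; // all lowercase
    }
    if (token.equals(token.toUpperCase(Locale.ROOT))) {
      return "ac"; // all capitals
    }
    if (Character.isUpperCase(token.charAt(0))) {
      return "ic"; // initial capital, mixed case after it
    }
    return "other";
  }
}
```

Under this sketch, shape("maria") and shape("Maria") differ, so a model trained mostly on capitalized names could weigh the two tokens differently even when the dictionary feature fires for both.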

> Cannot get entities from trained model using DictionaryFeatureGenerator 
> 
>
> Key: OPENNLP-859
> URL: https://issues.apache.org/jira/browse/OPENNLP-859
> Project: OpenNLP
>  Issue Type: Question
>  Components: Name Finder
>Affects Versions: 1.6.0
> Environment: ubuntu 16.04 java 8
>Reporter: Damiano Porta
>
> Hello,
> I have created the following training data.
> {code:title=train.txt|borderStyle=solid}
> Ciao mi chiamo  Damiano  ed abito a Roma  .
> il mio indirizzo è via del  Corso  nella provincia di Roma 
> .
> il mio cap è lo 00144 nella capitale e e il mio nome è   john 
>  .
> Abito a Roma in via tar dei tali 10 ,  Mario  è il mio 
> amico .
> Oggi ho incontrato  giovanni  e siamo andati a giocare a 
> calcio .
> {code}
> And then this code:
> {code:title=test.java|borderStyle=solid}
> Charset charset = Charset.forName("UTF-8");
> ObjectStream<String> lineStream =
>     new PlainTextByLineStream(new FileInputStream("/home/damiano/person.train"), charset);
> ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
> TokenNameFinderModel model;
> Dictionary dictionary = new Dictionary();
> dictionary.put(new StringList(new String[]{"giovanni"}));
> dictionary.put(new StringList(new String[]{"maria"}));
> dictionary.put(new StringList(new String[]{"luca"}));
>
> BufferedOutputStream aa = null;
>
> AdaptiveFeatureGenerator featureGenerator = new CachedFeatureGenerator(
>     new AdaptiveFeatureGenerator[]{
>         new WindowFeatureGenerator(new TokenFeatureGenerator(), 2, 2),
>         new WindowFeatureGenerator(new TokenClassFeatureGenerator(true), 2, 2),
>         new OutcomePriorFeatureGenerator(),
>         new PreviousMapFeatureGenerator(),
>         new BigramNameFeatureGenerator(),
>         new SentenceFeatureGenerator(true, false),
>         new DictionaryFeatureGenerator("person", dictionary)
>     });
> try {
>     model = NameFinderME.train("it", "person", sampleStream,
>         TrainingParameters.defaultParams(),
>         featureGenerator, Collections.emptyMap());
> }
> finally {
>     sampleStream.close();
> }
> // Save trained model
> try (BufferedOutputStream modelOut = new BufferedOutputStream(
>         new FileOutputStream("/home/damiano/it-person-custom.bin"))) {
>     model.serialize(modelOut);
> }
>
> // Read the trained model
> try (InputStream modelIn = new FileInputStream("/home/damiano/it-person-custom.bin")) {
>     TokenNameFinderModel nerModel = new TokenNameFinderModel(modelIn);
>     NameFinderME nameFinder = new NameFinderME(nerModel,
>         featureGenerator, NameFinderME.DEFAULT_BEAM_SIZE);
>
>     String sentence[] = new String[]{
>         "Ciao", "mi", "chiamo", "Damiano", "e", "sono", "di", "Roma", "."
>     };
>
>     Span nameSpans[] = nameFinder.find(sentence);
>
>     System.out.println(Arrays.toString(Span.spansToStrings(nameSpans, sentence)));
> }
> {code}
> When I try 
> {code}
> "Ciao", "mi", "chiamo", "Damiano", "e", "sono", "di", "Roma", "."
> {code}
> it correctly detects "Damiano" as PERSON, but if I change it to:
> {code}
> "Ciao", "mi", "chiamo", "maria", "e", "sono", "di", "Roma", "."
> {code}
> it does not detect "maria" as PERSON, even though I added "maria" to the 
> dictionary, so it should find it. Why not?
> Thanks!





[jira] [Created] (OPENNLP-971) Remove static training methods from GIS

2017-01-27 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)
Russ, Daniel (NIH/CIT) [E] created OPENNLP-971:
--

 Summary: Remove static training methods from GIS
 Key: OPENNLP-971
 URL: https://issues.apache.org/jira/browse/OPENNLP-971
 Project: OpenNLP
  Issue Type: Improvement
  Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
 Fix For: 1.7.3


The pluggable TrainingParameters class has been implemented, so there is no 
reason to call the static train methods on GIS.  They should be deprecated in 
1.7.3 and removed in a later version.
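
As a rough illustration of that migration path, the sketch below keeps a static convenience method for one release, marks it deprecated, and forwards it to an instance configured through parameters. SketchGisTrainer, trainModel, and the "Iterations" and "Cutoff" keys are hypothetical and do not reproduce the GIS API; the plain Map stands in for a parameters object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trainer; names and parameter keys are illustrative only.
class SketchGisTrainer {
  private int iterations = 100;
  private int cutoff = 5;

  // Instance configuration driven by a parameters map.
  void init(Map<String, String> params) {
    iterations = Integer.parseInt(params.getOrDefault("Iterations", "100"));
    cutoff = Integer.parseInt(params.getOrDefault("Cutoff", "5"));
  }

  // Stand-in for real training: returns a description of the run.
  String train(String[] events) {
    return "trained on " + events.length + " events, iterations=" + iterations
        + ", cutoff=" + cutoff;
  }

  /** Old static entry point: deprecated now, to be removed in a later version. */
  @Deprecated
  static String trainModel(String[] events, int iterations, int cutoff) {
    Map<String, String> params = new HashMap<>();
    params.put("Iterations", Integer.toString(iterations));
    params.put("Cutoff", Integer.toString(cutoff));
    SketchGisTrainer trainer = new SketchGisTrainer();
    trainer.init(params);
    return trainer.train(events);
  }
}
```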





[jira] [Created] (OPENNLP-969) Trainers should have a Constructor that takes a TrainingParameters

2017-01-27 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)
Russ, Daniel (NIH/CIT) [E] created OPENNLP-969:
--

 Summary: Trainers should have a Constructor that takes a 
TrainingParameters
 Key: OPENNLP-969
 URL: https://issues.apache.org/jira/browse/OPENNLP-969
 Project: OpenNLP
  Issue Type: Improvement
  Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
 Fix For: 1.7.2


Every time we construct a trainer, we construct it and then call init. Add an 
optional constructor that takes a TrainingParameters and calls the init method. 
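
A minimal sketch of the proposed convenience constructor, under the assumption that it simply delegates to init. ConstructThenInitTrainer is a hypothetical class, and the plain Map and the "Algorithm" key stand in for a TrainingParameters object and its settings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trainer; the Map stands in for a TrainingParameters object.
class ConstructThenInitTrainer {
  private Map<String, String> params = new HashMap<>();
  private boolean initialized;

  // Existing style: construct first, call init(...) separately.
  ConstructThenInitTrainer() {}

  // Proposed style: construct and initialize in one step.
  ConstructThenInitTrainer(Map<String, String> params) {
    init(params);
  }

  // final, so a subclass cannot override a method invoked from the constructor.
  final void init(Map<String, String> params) {
    this.params = new HashMap<>(params);
    this.initialized = true;
  }

  boolean isInitialized() {
    return initialized;
  }

  String algorithm() {
    return params.getOrDefault("Algorithm", "MAXENT");
  }
}
```

Keeping the no-argument constructor leaves existing construct-then-init call sites untouched.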





[jira] [Created] (OPENNLP-963) Both AbstractTrainer and AbstractEventTrainer defined a reportMap

2017-01-25 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)
Russ, Daniel (NIH/CIT) [E] created OPENNLP-963:
--

 Summary: Both AbstractTrainer and AbstractEventTrainer defined a 
reportMap
 Key: OPENNLP-963
 URL: https://issues.apache.org/jira/browse/OPENNLP-963
 Project: OpenNLP
  Issue Type: Bug
  Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor


Both AbstractEventTrainer and AbstractTrainer hold references to different 
reportMap instances. This can cause a NullPointerException when addToReport is 
called. 
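
The sketch below shows one way two same-named fields can lead to this kind of failure in Java: a subclass field hides the parent's, init assigns only the parent's copy, and the subclass then dereferences its own null field. The class names are hypothetical and the structure is an assumption about the general pattern, not a copy of the OpenNLP code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class that owns and initializes a report map.
class BaseTrainerSketch {
  protected Map<String, String> reportMap;

  void init(Map<String, String> reportMap) {
    this.reportMap = reportMap; // assigns the field declared in this class
  }
}

// The subclass declares a second field with the same name, hiding the parent's.
class EventTrainerSketch extends BaseTrainerSketch {
  protected Map<String, String> reportMap; // never assigned by init()

  void addToReport(String key, String value) {
    reportMap.put(key, value); // resolves to the subclass field, still null
  }
}

class ReportMapDemo {
  // Returns true when addToReport fails with a NullPointerException.
  static boolean triggersNpe() {
    EventTrainerSketch trainer = new EventTrainerSketch();
    trainer.init(new HashMap<>());
    try {
      trainer.addToReport("Algorithm", "MAXENT");
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }
}
```

Removing the duplicate declaration from the subclass, so both classes share the parent's field, avoids the problem in this sketch.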





[jira] [Commented] (OPENNLP-927) Merge TrainingParameters and PluggableParameters

2017-01-18 Thread Russ, Daniel (NIH/CIT) [E] (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829061#comment-15829061
 ] 

Russ, Daniel (NIH/CIT) [E] commented on OPENNLP-927:





> Merge TrainingParameters and PluggableParameters
> 
>
> Key: OPENNLP-927
> URL: https://issues.apache.org/jira/browse/OPENNLP-927
> Project: OpenNLP
>  Issue Type: New Feature
>  Components: Machine Learning
>Affects Versions: 1.7.0
>Reporter: Daniel Russ
>Assignee: Daniel Russ
>Priority: Minor
> Fix For: 1.7.1
>
>
> The PluggableParameters class was added to pull out the 
> get(Int/String/Boolean)Parameters() methods from the AbstractTrainer.  Merge 
> the functionality of the PluggableParameters into the TrainingParameters.
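
A rough sketch of what a merged parameters class with typed getters might look like. MergedParamsSketch and its method signatures are modeled on the description above and are assumptions, not the OpenNLP API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical merged parameters class with typed getters and defaults.
class MergedParamsSketch {
  private final Map<String, String> settings = new HashMap<>();

  void put(String key, String value) {
    settings.put(key, value);
  }

  String getStringParameter(String key, String defaultValue) {
    return settings.getOrDefault(key, defaultValue);
  }

  int getIntParameter(String key, int defaultValue) {
    String value = settings.get(key);
    return value == null ? defaultValue : Integer.parseInt(value);
  }

  boolean getBooleanParameter(String key, boolean defaultValue) {
    String value = settings.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }
}
```

With the getters on the parameters class itself, a trainer can read its settings without inheriting helper methods from an abstract base class.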


