[jira] [Created] (OPENNLP-977) TrainerFactory uses Deprecated methods
Russ, Daniel (NIH/CIT) [E] created OPENNLP-977:

Summary: TrainerFactory uses Deprecated methods
Key: OPENNLP-977
URL: https://issues.apache.org/jira/browse/OPENNLP-977
Project: OpenNLP
Issue Type: Bug
Components: Machine Learning
Affects Versions: 1.7.2
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
Fix For: 1.7.3

getEventTrainer/getEventModelSequenceTrainer use maps instead of TrainingParameters. EventModelSequenceTrainer also uses maps instead of TrainingParameters. This interface is not outward-facing for most users. All init(Map, Map) methods should be deprecated in favor of init(TrainingParameters, Map).

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
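The deprecation proposed in OPENNLP-977 can be sketched as an overload pair in which the map-based init forwards to the typed one. This is only an illustration under stated assumptions: SketchParameters and SketchTrainer are hypothetical stand-ins, not the OpenNLP classes, and the "Algorithm" key with its "MAXENT" default is chosen purely for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a typed parameters object; NOT the real TrainingParameters.
class SketchParameters {
    private final Map<String, String> settings = new HashMap<>();

    SketchParameters() {
    }

    // Copies the entries of a legacy map so old callers can be adapted.
    SketchParameters(Map<String, String> legacySettings) {
        settings.putAll(legacySettings);
    }

    void put(String key, String value) {
        settings.put(key, value);
    }

    String get(String key, String defaultValue) {
        return settings.getOrDefault(key, defaultValue);
    }
}

// Hypothetical trainer that shows the deprecation pattern only.
class SketchTrainer {
    private SketchParameters parameters;
    private Map<String, String> reportMap;

    /** Preferred entry point: typed parameters plus a report map. */
    void init(SketchParameters trainParams, Map<String, String> reportMap) {
        this.parameters = trainParams;
        this.reportMap = reportMap;
    }

    /** Legacy entry point: kept for compatibility, forwards to the typed overload. */
    @Deprecated
    void init(Map<String, String> trainParams, Map<String, String> reportMap) {
        init(new SketchParameters(trainParams), reportMap);
    }

    // The "Algorithm" key and "MAXENT" default are illustrative assumptions.
    String algorithm() {
        return parameters.get("Algorithm", "MAXENT");
    }
}
```

Because the deprecated overload only delegates, existing callers keep working while new code moves to the typed signature.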
[jira] [Commented] (OPENNLP-859) Cannot get entities from trained model using DictionaryFeatureGenerator
[ https://issues.apache.org/jira/browse/OPENNLP-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843482#comment-15843482 ]

Russ, Daniel (NIH/CIT) [E] commented on OPENNLP-859:

Hi Damiano,

If you capitalize the "M" in Maria, does it detect "Maria" as a name?

Daniel

> Cannot get entities from trained model using DictionaryFeatureGenerator
>
> Key: OPENNLP-859
> URL: https://issues.apache.org/jira/browse/OPENNLP-859
> Project: OpenNLP
> Issue Type: Question
> Components: Name Finder
> Affects Versions: 1.6.0
> Environment: ubuntu 16.04 java 8
> Reporter: Damiano Porta
>
> Hello,
> I have created the following training data.
> {code:title=train.txt|borderStyle=solid}
> Ciao mi chiamo Damiano ed abito a Roma .
> il mio indirizzo è via del Corso nella provincia di Roma .
> il mio cap è lo 00144 nella capitale e e il mio nome è john .
> Abito a Roma in via tar dei tali 10 , Mario è il mio amico .
> Oggi ho incontrato giovanni e siamo andati a giocare a calcio .
> {code}
> And then this code:
> {code:title=test.java|borderStyle=solid}
> Charset charset = Charset.forName("UTF-8");
> ObjectStream<String> lineStream =
>     new PlainTextByLineStream(new FileInputStream("/home/damiano/person.train"), charset);
> ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
> TokenNameFinderModel model;
> Dictionary dictionary = new Dictionary();
> dictionary.put(new StringList(new String[]{"giovanni"}));
> dictionary.put(new StringList(new String[]{"maria"}));
> dictionary.put(new StringList(new String[]{"luca"}));
>
> BufferedOutputStream aa = null;
>
> AdaptiveFeatureGenerator featureGenerator = new CachedFeatureGenerator(
>     new AdaptiveFeatureGenerator[]{
>         new WindowFeatureGenerator(new TokenFeatureGenerator(), 2, 2),
>         new WindowFeatureGenerator(new TokenClassFeatureGenerator(true), 2, 2),
>         new OutcomePriorFeatureGenerator(),
>         new PreviousMapFeatureGenerator(),
>         new BigramNameFeatureGenerator(),
>         new SentenceFeatureGenerator(true, false),
>         new DictionaryFeatureGenerator("person", dictionary)
>     });
> try {
>     model = NameFinderME.train("it", "person", sampleStream,
>         TrainingParameters.defaultParams(),
>         featureGenerator, Collections.emptyMap());
> }
> finally {
>     sampleStream.close();
> }
> // Save trained model
> try (BufferedOutputStream modelOut = new BufferedOutputStream(
>         new FileOutputStream("/home/damiano/it-person-custom.bin"))) {
>     model.serialize(modelOut);
> }
>
> // Read the trained model
> try (InputStream modelIn = new FileInputStream("/home/damiano/it-person-custom.bin")) {
>     TokenNameFinderModel nerModel = new TokenNameFinderModel(modelIn);
>     NameFinderME nameFinder = new NameFinderME(nerModel,
>         featureGenerator, NameFinderME.DEFAULT_BEAM_SIZE);
>
>     String sentence[] = new String[]{
>         "Ciao", "mi", "chiamo", "Damiano", "e", "sono", "di", "Roma", "."
>     };
>
>     Span nameSpans[] = nameFinder.find(sentence);
>
>     System.out.println(Arrays.toString(Span.spansToStrings(nameSpans, sentence)));
> }
> {code}
> When I try
> {code}
> "Ciao", "mi", "chiamo", "Damiano", "e", "sono", "di", "Roma", "."
> {code}
> it correctly detects "Damiano" as PERSON, but if I change it to:
> {code}
> "Ciao", "mi", "chiamo", "maria", "e", "sono", "di", "Roma", "."
> {code}
> it does not detect "maria" as PERSON, even though I added "maria" to the dictionary, so
> it should find it. Why not?
> Thanks!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OPENNLP-971) Remove static training methods from GIS
Russ, Daniel (NIH/CIT) [E] created OPENNLP-971:

Summary: Remove static training methods from GIS
Key: OPENNLP-971
URL: https://issues.apache.org/jira/browse/OPENNLP-971
Project: OpenNLP
Issue Type: Improvement
Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
Fix For: 1.7.3

The pluggable TrainingParameters has been implemented, so there is no longer a reason to call the static train methods on GIS. They should be deprecated in 1.7.3 and removed in a later version.
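One way to picture the migration in OPENNLP-971 is a deprecated static method that simply builds a parameter-configured trainer instance and delegates to it. The sketch below is an assumption-laden illustration: SketchGisTrainer, trainModel, the "Iterations" key, and the string returned in place of a model are all hypothetical and do not reproduce the GIS API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trainer; the class, method names and the "Iterations" key are assumptions.
class SketchGisTrainer {
    private final Map<String, String> params;

    SketchGisTrainer(Map<String, String> params) {
        this.params = new HashMap<>(params);
    }

    /** Instance method: all configuration comes from the parameters given at construction. */
    String train(String[] events) {
        int iterations = Integer.parseInt(params.getOrDefault("Iterations", "100"));
        // A string stands in for the trained model in this sketch.
        return "model(events=" + events.length + ", iterations=" + iterations + ")";
    }

    /** Legacy static entry point: deprecated first, removable in a later release. */
    @Deprecated
    static String trainModel(String[] events, int iterations) {
        Map<String, String> params = new HashMap<>();
        params.put("Iterations", Integer.toString(iterations));
        return new SketchGisTrainer(params).train(events);
    }
}
```

Keeping the static method as a thin wrapper for one release gives callers time to switch before it is removed.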
[jira] [Created] (OPENNLP-969) Trainers should have a Constructor that takes a TrainingParameters
Russ, Daniel (NIH/CIT) [E] created OPENNLP-969:

Summary: Trainers should have a Constructor that takes a TrainingParameters
Key: OPENNLP-969
URL: https://issues.apache.org/jira/browse/OPENNLP-969
Project: OpenNLP
Issue Type: Improvement
Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor
Fix For: 1.7.2

Every time we create a trainer, we construct it and then call init. Add an optional constructor that takes a TrainingParameters and calls the init method.
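The convenience constructor requested in OPENNLP-969 might look roughly like the following. SketchEventTrainer is a hypothetical class, a plain map stands in for TrainingParameters, and the "Cutoff" key with a default of 5 is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trainer, not an OpenNLP class. A Map stands in for TrainingParameters.
class SketchEventTrainer {
    private final Map<String, String> params = new HashMap<>();
    private boolean initialized;

    /** Existing style: construct first, call init(...) afterwards. */
    SketchEventTrainer() {
    }

    /** Optional convenience constructor: construct and initialise in one step. */
    SketchEventTrainer(Map<String, String> params) {
        init(params);
    }

    // final, so the constructor never calls a method that a subclass has overridden.
    final void init(Map<String, String> params) {
        this.params.clear();
        this.params.putAll(params);
        this.initialized = true;
    }

    boolean isInitialized() {
        return initialized;
    }

    // The "Cutoff" key and its default of 5 are illustrative assumptions.
    int cutoff() {
        return Integer.parseInt(params.getOrDefault("Cutoff", "5"));
    }
}
```

In this sketch init is final because calling an overridable method from a constructor can run subclass code before the subclass is fully constructed; a real trainer hierarchy would need to weigh that against existing overrides.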
[jira] [Created] (OPENNLP-963) Both AbstractTrainer and AbstractEventTrainer defined a reportMap
Russ, Daniel (NIH/CIT) [E] created OPENNLP-963:

Summary: Both AbstractTrainer and AbstractEventTrainer defined a reportMap
Key: OPENNLP-963
URL: https://issues.apache.org/jira/browse/OPENNLP-963
Project: OpenNLP
Issue Type: Bug
Components: Machine Learning
Affects Versions: 1.7.1
Reporter: Russ, Daniel (NIH/CIT) [E]
Priority: Minor

AbstractEventTrainer and AbstractTrainer each hold a reference to a different reportMap. This can cause a NullPointerException when addToReport is called.
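One plausible way the duplicate reportMap described in OPENNLP-963 leads to a NullPointerException is field hiding: the subclass redeclares the field, init assigns only the parent's copy, and the subclass reads its own unassigned copy. The classes below are hypothetical and only reproduce that mechanism; they are not the actual AbstractTrainer or AbstractEventTrainer code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class that owns the report map.
class BaseTrainerSketch {
    protected Map<String, String> reportMap;

    void init(Map<String, String> reportMap) {
        this.reportMap = reportMap;
    }

    void addToReport(String key, String value) {
        reportMap.put(key, value);
    }
}

// Buggy variant: redeclares the field, which hides the parent's field.
class ShadowingTrainerSketch extends BaseTrainerSketch {
    protected Map<String, String> reportMap; // never assigned: init() sets the parent's field

    @Override
    void addToReport(String key, String value) {
        reportMap.put(key, value); // reads the hidden, still-null field and throws
    }
}

// Fixed variant: no second field, so every access goes to the single map in the base class.
class FixedTrainerSketch extends BaseTrainerSketch {
    void reportTrainerType() {
        addToReport("Trainer", "Event");
    }
}
```

Removing the redeclared field leaves a single reportMap in the hierarchy, so init and addToReport always agree on which map they use.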
[jira] [Commented] (OPENNLP-927) Merge TrainingParameters and PluggableParameters
[ https://issues.apache.org/jira/browse/OPENNLP-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829061#comment-15829061 ]

Russ, Daniel (NIH/CIT) [E] commented on OPENNLP-927:

> Merge TrainingParameters and PluggableParameters
>
> Key: OPENNLP-927
> URL: https://issues.apache.org/jira/browse/OPENNLP-927
> Project: OpenNLP
> Issue Type: New Feature
> Components: Machine Learning
> Affects Versions: 1.7.0
> Reporter: Daniel Russ
> Assignee: Daniel Russ
> Priority: Minor
> Fix For: 1.7.1
>
> The PluggableParameters class was added to pull out the
> get(Int/String/Boolean)Parameters() methods from the AbstractTrainer. Merge
> the functionality of the PluggableParameters into the TrainingParameters.
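The merge described in OPENNLP-927 amounts to putting the typed getters directly on the parameters object instead of on a separate helper. A minimal sketch follows, assuming a string-backed settings map; MergedParametersSketch and its method signatures are hypothetical and should not be read as the actual TrainingParameters API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameters class with the typed getters folded in.
class MergedParametersSketch {
    private final Map<String, String> settings = new HashMap<>();

    void put(String key, String value) {
        settings.put(key, value);
    }

    String getStringParameter(String key, String defaultValue) {
        return settings.getOrDefault(key, defaultValue);
    }

    int getIntParameter(String key, int defaultValue) {
        String value = settings.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    boolean getBooleanParameter(String key, boolean defaultValue) {
        String value = settings.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }
}
```

With the getters on the parameters object, a trainer can read its settings without inheriting lookup helpers from a base class.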