[GitHub] incubator-hivemall pull request #93: [WIP][HIVEMALL-126] Maximum Entropy Mod...

helenahm Tue, 01 Aug 2017 00:18:58 -0700

Github user helenahm commented on a diff in the pull request:

    https://github.com/apache/incubator-hivemall/pull/93#discussion_r130533347
  
    --- Diff: core/pom.xml ---
    @@ -103,6 +103,12 @@
                        <version>${guava.version}</version>
                        <scope>provided</scope>
                </dependency>
    +           <dependency>
    +                   <groupId>opennlp</groupId>
    +                   <artifactId>maxent</artifactId>
    +                   <version>3.0.0</version>
    --- End diff --
    
    In general I totally agree. I think it would be good to move to another
    version of maxent in a few steps.
    
    1. The code I have re-used is that of GISTrainer. It more or less updates
    the weights in a matrix, where the matrix is Hivemall's matrix. Everything
    else just follows your class structure. I have checked that the resulting
    models are the same, and I have also confirmed that the resulting model
    makes sense on my own data. So the resulting weights must be correct. Can
    we say that training is correct and accept the current version as the
    correct and functioning one?
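
    For reference, a weight update of this kind can be sketched roughly as
    below. This is only an illustration of generalized iterative scaling
    under simplifying assumptions, not the code in this pull request and not
    OpenNLP's GISTrainer: it uses a plain double[][] in place of Hivemall's
    matrix, binary predicates, no correction feature, and the names GisSketch,
    gisIteration and eval are hypothetical.

```java
final class GisSketch {

    /**
     * One GIS pass over the training events (illustrative only).
     * events[i]   : indices of the predicates active in event i
     * outcomes[i] : index of the observed outcome of event i
     * weights     : [predicate][outcome] matrix, updated in place
     * c           : GIS constant, assumed here to be the maximum number of
     *               active predicates in any event
     */
    static void gisIteration(int[][] events, int[] outcomes, int numOutcomes,
                             double[][] weights, double c) {
        int numPreds = weights.length;
        double[][] observed = new double[numPreds][numOutcomes];
        double[][] expected = new double[numPreds][numOutcomes];

        for (int i = 0; i < events.length; i++) {
            double[] probs = eval(events[i], weights, numOutcomes);
            for (int p : events[i]) {
                // Empirical count of (predicate, outcome) pairs.
                observed[p][outcomes[i]] += 1.0;
                // Count expected under the current model.
                for (int o = 0; o < numOutcomes; o++) {
                    expected[p][o] += probs[o];
                }
            }
        }

        // Update only the pairs that were actually seen in training.
        for (int p = 0; p < numPreds; p++) {
            for (int o = 0; o < numOutcomes; o++) {
                if (observed[p][o] > 0.0 && expected[p][o] > 0.0) {
                    weights[p][o] += Math.log(observed[p][o] / expected[p][o]) / c;
                }
            }
        }
    }

    /** Outcome distribution for a set of active predicates (softmax of summed weights). */
    static double[] eval(int[] preds, double[][] weights, int numOutcomes) {
        double[] scores = new double[numOutcomes];
        for (int p : preds) {
            for (int o = 0; o < numOutcomes; o++) {
                scores[o] += weights[p][o];
            }
        }
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        for (int o = 0; o < numOutcomes; o++) {
            scores[o] = Math.exp(scores[o] - max); // subtract max for numerical stability
            sum += scores[o];
        }
        for (int o = 0; o < numOutcomes; o++) {
            scores[o] /= sum;
        }
        return scores;
    }
}
```

    Repeating gisIteration until the weights stop changing gives the trained
    matrix, and comparing that matrix entry by entry with the one produced by
    a reference trainer is one way to check that two implementations agree.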
    
    2. After that there are a few options:
    We could try to rewrite the code so that it accepts the newest version of
    OpenNLP maxent and all following versions. I guess that would require
    changes in OpenNLP maxent too, but perhaps that is better than manually
    altering GISTrainer every time something is updated, and both projects
    would benefit from such a collaboration.
    
    If not, perhaps for Hivemall as a project we may consider rewriting
    iterative scaling from scratch to make it efficient for Hivemall, perhaps
    using the tricks OpenNLP uses to make the code more efficient, and making
    sure that the resulting weights are comparable, but without aiming to be
    able to plug in a new OpenNLP jar each time a new version appears.
    
    What do you think?
    
    Regards,
    Elena.



