Hey Joern,

Sure, you can find the model data links here, along with our evaluation of them.

http://irds.usc.edu/SentimentAnalysisParser/datasets.html

There are other evaluations here:

http://irds.usc.edu/SentimentAnalysisParser/models.html

The HT provider review I cannot contribute at this time and I question its broad
applicability since it’s related to human trafficking. In addition we are still 
working
on publishing our analysis & evaluation of it which is why I removed it from the
commit.

Cheers,
Chris





On 6/29/17, 7:36 AM, "Joern Kottmann" <[email protected]> wrote:

    Which data sets did you use to evaluate this?
    I was looking for a bit more than a sample file to train it.
    
    I noticed that you checked in stanford and netflix models.
    
    The stanford data set is probably this one:
    http://ai.stanford.edu/~amaas/data/sentiment/
    
    Do you have a link for the netflix data?
    
    Jörn
    
    On Thu, Jun 29, 2017 at 4:00 PM, Chris Mattmann <[email protected]> wrote:
    > Absolutely you can find it here:
    >
    > 
opennlp-tools/src/test/resources/opennlp/tools/sentiment/sample_train_categ 
(for categorical /multi-class)
    > 
opennlp-tools/src/test/resources/opennlp/tools/sentiment/sample_train_categ2 
(for categorical/multi-class)
    >
    > We can also do similar files where instead of multi-class, we just use 
pos/neg as the label.
    >
    > Cheers,
    > Chris
    >
    >
    >
    >
    >
    > On 6/29/17, 2:35 AM, "Joern Kottmann" <[email protected]> wrote:
    >
    >     Hello Chris,
    >
    >     could you please point me to files I can use to train the sentiment
    >     component? I am currently looking again through the code and would
    >     like to train it myself.
    >
    >     Jörn
    >
    >     On Tue, Jun 27, 2017 at 4:59 PM, Dan Russ <[email protected]> wrote:
    >     > Hi All,
    >     >    First, let me take a share of blame for the comment Chris 
mentioned.  I believe I said something like the pull request was X revision 
behind and Y revisions ahead.  It was not meant to be rude, it was meant to say 
it is hard to review code when it is so different from the current code base. I 
am very excited that sentiment analysis is going to be added to OpenNLP, but I 
have not had time to play with it. If I were to say “great job” before I have 
add a chance to look at it, it would be flattery not honest praise.
    >     >
    >     >   Let’s clean up the merge.  I agree with Chris that scalability 
and perfection should not be our initial goal.  Let’s get something, and we can 
decide how to optimize later (even if it require a complete rewrite).  
Perfection is the enemy of the good.
    >     >
    >     >   Finally, because of Chris’ comments it is hard to thank Ana and 
Chris without sounding insincere.  But I’ll try, thank you Chris and Ana.  I 
hope we can get beyond this and that Chris and Ana will continue to improve the 
performance of the sentiment analysis tool and happily remain part of the 
OpenNLP family.  It is also a good time to toss a big thank you to all of the 
committers, users, and PMC member.  I use OpenNLP almost everyday.  Your work 
is extremely valuable to me.
    >     >
    >     > Thank you,
    >     > Daniel
    >     >
    >     >> On Jun 27, 2017, at 10:25 AM, Chris Mattmann <[email protected]> 
wrote:
    >     >>
    >     >> Hi everyone,
    >     >>
    >     >> I spoke with Joern in Slack. Some of his concerns are:
    >     >>
    >     >> 1. This was done with a Merge commit and apparently they squash 
and rebase.
    >     >> [would be helpful to see some pointer on this for documentation, 
thus far I
    >     >> haven’t found any]
    >     >> 2. Apparently we literally need to ask others for +1 votes and 
record them
    >     >> before committing? I thought since Ana and I are committers aren 
were +1,
    >     >> and since Joern had been providing feedback (the last of which was 
to add
    >     >> tests, which we did) that he would be +1 as well (I guess he is 
not, and I guess
    >     >> formally we need to do a +1 vote even still)
    >     >> 3. There was concern about scalability of the code.
    >     >> 4. There are thoughts that the code was not perfect yet (even 
though it works
    >     >> fine in the MEMEX project for Ana and I)
    >     >>
    >     >> So, Joern has opened up a revert PR.
    >     >>
    >     >> I suppose I should state I find this process extremely heavyweight 
and unwelcoming.
    >     >> To me, there should be a modicum of trust for committers, but I 
feel like even as a
    >     >> committer, I am operating as a “contributor” to the project. 
Committer means that
    >     >> there is trust to modify the source code base. Of the issues 
above, the only one I see
    >     >> as a moderate snafu was #1, and frankly if there are some 
instructions that show me
    >     >> how to do squashing and rebasing *first* I will try to do that in 
the future since I am
    >     >> not a GIt expert.
    >     >>
    >     >> That said, I must state I feel pretty put off by Apache OpenNLP. 
This originated as a GSoC
    >     >> effort, and we have worked pretty consistently on this over the 
last year. We used a
    >     >> separate GitHub project to get started, kept Joern involved as 
another mentor, even
    >     >> provided access and commit writes to that GitHub repository for a 
long time, so this
    >     >> code was developed in the open. Joern even created a branch in 
ApacheOpenNLP in the code and I suppose
    >     >> I should have gone and worked on that branch first since master is 
apparently so
    >     >> pristine that even an Apache veteran like me can’t get something 
in to it without
    >     >> making a whole bunch of (what are IMO minor issues, and what are 
IMO heavyweight
    >     >> “community” issues).
    >     >>
    >     >> I am concerned from a community point of view that the first 
comment wasn’t “Great
    >     >> job Chris, you got Sentiment Analysis into Apache, *but* I have 
these concerns 1-4 above”.
    >     >> It was “The PR was merged wrong in ways 1-4 and I’m going to 
revert it.”
    >     >>
    >     >> That’s pretty off-putting to someone who is semi-new like me and 
like Ana.
    >     >>
    >     >> Anyways, go ahead and revert it. Sorry to have caused any issues.
    >     >>
    >     >> Chris
    >     >>
    >     >>
    >     >>
    >     >> On 6/27/17, 7:06 AM, "Chris Mattmann" <[email protected]> wrote:
    >     >>
    >     >>    Hi Joern,
    >     >>
    >     >>    I’m confused. Why did you revert my commit?
    >     >>
    >     >>    Every one of those check points you put on the PR was checked?
    >     >>    We have been discussing this for months, you have seen the
    >     >>    code for months, Ana and I have worked diligently on the code
    >     >>    in plain view of everyone.
    >     >>
    >     >>    Please explain.
    >     >>
    >     >>    Chris
    >     >>
    >     >>
    >     >>
    >     >>
    >     >>    On 6/27/17, 1:23 AM, "kottmann" <[email protected]> wrote:
    >     >>
    >     >>        GitHub user kottmann opened a pull request:
    >     >>
    >     >>            https://github.com/apache/opennlp/pull/238
    >     >>
    >     >>            Revert merging of sentiment work, no consent to merge it
    >     >>
    >     >>            Thank you for contributing to Apache OpenNLP.
    >     >>
    >     >>            In order to streamline the review of the contribution 
we ask you
    >     >>            to ensure the following steps have been taken:
    >     >>
    >     >>            ### For all changes:
    >     >>            - [ ] Is there a JIRA ticket associated with this PR? 
Is it referenced
    >     >>                 in the commit message?
    >     >>
    >     >>            - [ ] Does your PR title start with OPENNLP-XXXX where 
XXXX is the JIRA number you are trying to resolve? Pay particular attention to 
the hyphen "-" character.
    >     >>
    >     >>            - [ ] Has your PR been rebased against the latest 
commit within the target branch (typically master)?
    >     >>
    >     >>            - [ ] Is your initial contribution a single, squashed 
commit?
    >     >>
    >     >>            ### For code changes:
    >     >>            - [ ] Have you ensured that the full suite of tests is 
executed via mvn clean install at the root opennlp folder?
    >     >>            - [ ] Have you written or updated unit tests to verify 
your changes?
    >     >>            - [ ] If adding new dependencies to the code, are these 
dependencies licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
    >     >>            - [ ] If applicable, have you updated the LICENSE file, 
including the main LICENSE file in opennlp folder?
    >     >>            - [ ] If applicable, have you updated the NOTICE file, 
including the main NOTICE file found in opennlp folder?
    >     >>
    >     >>            ### For documentation related changes:
    >     >>            - [ ] Have you ensured that format looks appropriate 
for the output in which it is rendered?
    >     >>
    >     >>            ### Note:
    >     >>            Please ensure that once the PR is submitted, you check 
travis-ci for build issues and submit an update to your PR as soon as possible.
    >     >>
    >     >>
    >     >>        You can merge this pull request into a Git repository by 
running:
    >     >>
    >     >>            $ git pull https://github.com/kottmann/opennlp 
revert_sentiment
    >     >>
    >     >>        Alternatively you can review and apply these changes as the 
patch at:
    >     >>
    >     >>            https://github.com/apache/opennlp/pull/238.patch
    >     >>
    >     >>        To close this pull request, make a commit to your 
master/trunk branch
    >     >>        with (at least) the following in the commit message:
    >     >>
    >     >>            This closes #238
    >     >>
    >     >>        ----
    >     >>        commit 123222eb34724bae793e9d6d22e202c0aee0aa45
    >     >>        Author: Jörn Kottmann <[email protected]>
    >     >>        Date:   2017-06-27T08:19:19Z
    >     >>
    >     >>            Revert merging of sentiment work, no consent to merge it
    >     >>
    >     >>        ----
    >     >>
    >     >>
    >     >>        ---
    >     >>        If your project is set up for it, you can reply to this 
email and have your
    >     >>        reply appear on GitHub as well. If your project does not 
have this feature
    >     >>        enabled and wishes so, or if the feature is enabled but not 
working, please
    >     >>        contact infrastructure at [email protected] or file 
a JIRA ticket
    >     >>        with INFRA.
    >     >>        ---
    >     >>
    >     >>
    >     >>
    >     >>
    >     >>
    >     >>
    >     >
    >
    >
    >
    


Reply via email to