[ 
https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968806#comment-15968806
 ] 

ASF GitHub Bot commented on TIKA-2016:
--------------------------------------

thammegowda opened a new pull request #169: TIKA-2016 Sentiment Analysis Parser Contributed by amensiko and thammegowda
URL: https://github.com/apache/tika/pull/169
 
 
   This PR is a refresh of work started by @amensiko and left unmerged in #127 
   
   What is supported:
   - Sentiment Analysis is powered by OpenNLP
   - The model path can be changed to any compatible OpenNLP MaxEnt model (to change it, update the Tika config XML)
      - It can be an HTTP URL (caching is not supported at the moment, so use this for testing only)
      - It can be a file known to the class loader (for Hadoop/Spark users)
      - It can be a file on the file system (for standalone users)
      - By default, it retrieves a model from USC DataScience's GitHub repo, trained on a Netflix dataset of positive and negative sentiments
   
   **Test case included?** yes - check out SentimentParserTest
   **Example Tika Conf included?** yes - check out the config file used by the test case.
   
   ## How to test?
   
   ```bash
   $ mvn clean package
   $ echo "What a wonderful thought it is that some of the best days of our lives havent happened yet" > test.sent
   $ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar \
            --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp.xml \
            -m test.sent
   # Output 
   Content-Length: 91
   Content-Type: application/sentiment
   Sentiment: positive
   X-Parsed-By: org.apache.tika.parser.CompositeParser
   X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
   resourceName: test.sent
   ```
   
   Note: `Sentiment: positive` was added to metadata
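
To check the result in a script rather than by eye, the `Sentiment` line can be filtered out of the metadata listing. The following is only a minimal sketch: `extract_sentiment` and `sample_output` are hypothetical names that are not part of this PR, and the sample text simply copies lines from the output shown above instead of invoking tika-app.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): read "Key: value" metadata lines
# on stdin and print the value of the first "Sentiment:" line, if any.
extract_sentiment() {
  sed -n 's/^[[:space:]]*Sentiment:[[:space:]]*//p' | head -n 1
}

# Sample metadata copied from the output above; in practice this would be
# the stdout of the `java -jar ... -m test.sent` command.
sample_output='Content-Length: 91
Content-Type: application/sentiment
Sentiment: positive
resourceName: test.sent'

printf '%s\n' "$sample_output" | extract_sentiment   # prints: positive
```

With a real run, the output of the `java -jar ... -m test.sent` command could be piped into `extract_sentiment` in the same way.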
   
   Closes #127 and TIKA-2016
 


> A parser that combines Apache OpenNLP and Apache Tika and provides facilities 
> for automatically deriving sentiment from text.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2016
>                 URL: https://issues.apache.org/jira/browse/TIKA-2016
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Anastasija Mensikova
>            Assignee: Chris A. Mattmann
>              Labels: analysis, gsoc2016, memex, parser, sentiment
>             Fix For: 1.15
>
>
> A new project that implements a parser that uses Apache OpenNLP and Apache 
> Tika to perform Sentiment Analysis.



