[ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968806#comment-15968806 ]
ASF GitHub Bot commented on TIKA-2016:
--------------------------------------
thammegowda opened a new pull request #169: TIKA-2016 Sentiment Analysis Parser Contributed by amensiko and thammegowda
URL: https://github.com/apache/tika/pull/169
This PR is a refresh of the work started by @amensiko and left unmerged in #127.
What is supported:
- Sentiment analysis is powered by OpenNLP.
- The model path can be changed to any compatible OpenNLP MaxEnt model (to change it, update the Tika config XML). The path can be:
  - an HTTP URL (caching is not supported at the moment, so use it for testing only)
  - a file known to the class loader (for Hadoop/Spark users)
  - a file on the file system (for standalone users)
- By default, it retrieves a model from USC DataScience's GitHub repo, trained on a Netflix dataset of positive and negative sentiments.
**Test case included?** Yes, check out SentimentParserTest.
**Example Tika conf included?** Yes, check out the config file used by the test case.
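The three model-path forms supported by this PR (HTTP URL, class-loader resource, file on the file system) could be told apart roughly as in the sketch below. This is only an illustration and not the SentimentParser's actual code: the names `ModelLocator`, `Source`, `classify` and `open` are hypothetical, and the order of the checks (URL scheme first, then class loader, then file system) is an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not the SentimentParser's actual code.
final class ModelLocator {

    enum Source { HTTP_URL, CLASSPATH, FILE_SYSTEM, NOT_FOUND }

    // Assumed check order: URL scheme first, then class loader, then file system.
    static Source classify(String modelPath) {
        if (modelPath.startsWith("http://") || modelPath.startsWith("https://")) {
            return Source.HTTP_URL;
        }
        if (ModelLocator.class.getClassLoader().getResource(modelPath) != null) {
            return Source.CLASSPATH;
        }
        if (Files.isRegularFile(Paths.get(modelPath))) {
            return Source.FILE_SYSTEM;
        }
        return Source.NOT_FOUND;
    }

    // Opens the model bytes from wherever classify() found them.
    static InputStream open(String modelPath) throws IOException {
        switch (classify(modelPath)) {
            case HTTP_URL:
                // No caching: every call downloads the model again.
                return URI.create(modelPath).toURL().openStream();
            case CLASSPATH:
                return ModelLocator.class.getClassLoader().getResourceAsStream(modelPath);
            case FILE_SYSTEM:
                return Files.newInputStream(Paths.get(modelPath));
            default:
                throw new FileNotFoundException("Model not found: " + modelPath);
        }
    }
}
```

Because the HTTP branch re-downloads the model on every call, it mirrors the caveat that URLs are suitable for testing only.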
## How to test?
```bash
$ mvn clean package
$ echo "What a wonderful thought it is that some of the best days of our lives havent happened yet" > test.sent
$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar \
    --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp.xml \
    -m test.sent
# Output
Content-Length: 91
Content-Type: application/sentiment
Sentiment: positive
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: test.sent
```
Note: `Sentiment: positive` was added to metadata
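For anyone scripting against the `-m` output, the sketch below shows one way to pull the `Sentiment` value out of the `Name: value` lines. It assumes the output format shown in the test run; the class name `MetadataOutput` and its `parse` method are hypothetical and not part of Tika.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Tika.
final class MetadataOutput {

    // Parses "Name: value" lines. Repeated names (e.g. X-Parsed-By) keep all values.
    static Map<String, List<String>> parse(String output) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (String line : output.split("\\R")) {
            int sep = line.indexOf(": ");
            if (sep <= 0) {
                continue; // skip blank lines and lines without a name
            }
            fields.computeIfAbsent(line.substring(0, sep), k -> new ArrayList<>())
                  .add(line.substring(sep + 2));
        }
        return fields;
    }
}
```

Calling `MetadataOutput.parse(output).get("Sentiment")` on the output above would yield a one-element list containing `positive`.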
Closes #127 and TIKA-2016
> A parser that combines Apache OpenNLP and Apache Tika and provides facilities
> for automatically deriving sentiment from text.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-2016
> URL: https://issues.apache.org/jira/browse/TIKA-2016
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Anastasija Mensikova
> Assignee: Chris A. Mattmann
> Labels: analysis, gsoc2016, memex, parser, sentiment
> Fix For: 1.15
>
>
> A new project that implements a parser that uses Apache OpenNLP and Apache
> Tika to perform Sentiment Analysis.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)