[
https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995572#comment-15995572
]
ASF GitHub Bot commented on TIKA-2016:
--------------------------------------
chrismattmann commented on issue #169: TIKA-2016 Sentiment Analysis Parser
Contributed by amensiko and thammegowda
URL: https://github.com/apache/tika/pull/169#issuecomment-299022223
## Build passes:
```
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Tika 1.15-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ tika ---
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika ---
[INFO]
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ tika ---
[INFO]
[INFO] --- forbiddenapis:2.2:check (default) @ tika ---
[INFO] Skipping execution for packaging "pom"
[INFO]
[INFO] --- forbiddenapis:2.2:testCheck (default) @ tika ---
[INFO] Skipping execution for packaging "pom"
[INFO]
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ tika ---
[INFO] Installing /Users/mattmann/git/tika-gh/pom.xml to /Users/mattmann/.m2/repository/org/apache/tika/tika/1.15-SNAPSHOT/tika-1.15-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................................. SUCCESS [ 1.047 s]
[INFO] Apache Tika core ................................... SUCCESS [ 22.697 s]
[INFO] Apache Tika parsers ................................ SUCCESS [03:23 min]
[INFO] Apache Tika XMP .................................... SUCCESS [ 1.649 s]
[INFO] Apache Tika serialization .......................... SUCCESS [ 1.436 s]
[INFO] Apache Tika batch .................................. SUCCESS [01:50 min]
[INFO] Apache Tika language detection ..................... SUCCESS [ 3.730 s]
[INFO] Apache Tika application ............................ SUCCESS [ 32.442 s]
[INFO] Apache Tika OSGi bundle ............................ SUCCESS [ 17.231 s]
[INFO] Apache Tika translate .............................. SUCCESS [ 1.825 s]
[INFO] Apache Tika server ................................. SUCCESS [ 35.426 s]
[INFO] Apache Tika examples ............................... SUCCESS [ 9.609 s]
[INFO] Apache Tika Java-7 Components ...................... SUCCESS [ 1.738 s]
[INFO] Apache Tika eval ................................... SUCCESS [ 24.480 s]
[INFO] Apache Tika ........................................ SUCCESS [ 0.022 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 07:48 min
[INFO] Finished at: 2017-05-03T13:03:29-07:00
[INFO] Final Memory: 164M/1530M
[INFO] ------------------------------------------------------------------------
LMC-053601:tika-gh mattmann$
```
I also tried it myself on the following files:
`sample.sent`
```
Man I'm so tired of battling against OSGI!
```
`sample2.sent`
```
Whatever, I need some cooling off time!
```
# Binary sentiment
```
LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar \
> --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp.xml \
> -m sample.sent
WARN JBIG2ImageReader not loaded. jbig2 files will be ignored
INFO Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/en-netflix-sentiment.bin
Content-Length: 43
Content-Type: application/sentiment
Sentiment: negative
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: sample.sent
LMC-053601:tika-gh mattmann$
```
# Categorical (multi-class sentiment)
Switching to `sample2.sent`:
```
LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp-cat.xml -m sample2.sent
WARN JBIG2ImageReader not loaded. jbig2 files will be ignored
INFO Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/ht-sentiment-categ.bin
Content-Length: 39
Content-Type: application/sentiment
Sentiment: angry
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: sample2.sent
LMC-053601:tika-gh mattmann$
```
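For anyone who wants to script checks like the two runs above, the `Sentiment` field can be pulled out of the `-m` metadata output with a small shell filter. This is only a minimal sketch: the function name `extract_sentiment` is hypothetical, and it assumes the metadata is printed as `Key: value` lines exactly as in the output shown above.

```shell
# Hypothetical helper (not part of Tika): print the value of the
# "Sentiment" metadata field from `tika-app -m` output read on stdin.
# Assumes "Key: value" lines as in the runs above; WARN/INFO log lines
# and the model URL line do not match and are skipped.
extract_sentiment() {
  awk -F': ' '$1 == "Sentiment" { print $2; exit }'
}

# Example using metadata lines copied from the binary-sentiment run above
printf '%s\n' \
  'Content-Length: 43' \
  'Content-Type: application/sentiment' \
  'Sentiment: negative' \
  'resourceName: sample.sent' | extract_sentiment
```

In a real run the filter would be fed by piping the `java -jar ... -m sample.sent` command shown above into `extract_sentiment`.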
> A parser that combines Apache OpenNLP and Apache Tika and provides facilities
> for automatically deriving sentiment from text.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-2016
> URL: https://issues.apache.org/jira/browse/TIKA-2016
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Reporter: Anastasija Mensikova
> Assignee: Chris A. Mattmann
> Labels: analysis, gsoc2016, memex, parser, sentiment
> Fix For: 1.15
>
>
> A new project that implements a parser that uses Apache OpenNLP and Apache
> Tika to perform Sentiment Analysis.