[
https://issues.apache.org/jira/browse/TIKA-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992696#comment-14992696
]
ASF GitHub Bot commented on TIKA-1787:
--------------------------------------
GitHub user TaichiHo opened a pull request:
https://github.com/apache/tika/pull/62
fix for TIKA-1787 contributed by Yueheng He
The build succeeds with Java 1.8.0_65.
To see the effect, create a text file like the following.
```
Good afternoon Rajat Raina, how are you today? Hi, I am Tom Brady. I go to
school at Stanford University, which is located in California.
```
Save it as test.ner and feed it to Tika:
```
java -classpath tika-app/target/tika-app-1.12-SNAPSHOT.jar \
  org.apache.tika.cli.TikaCLI -m test.ner
```
The result should look like this:
```
Content-Length: 137
Content-Type: application/stanford-ner
LOCATION: [California]
ORGANIZATION: [Stanford University]
PERSON: [Rajat Raina, Tom Brady]
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.stanfordNer.StanfordNerParser
resourceName: test.ner
```
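The listing above shows each entity type as a metadata key holding a list of values. As a rough, self-contained sketch of that shape only (this is not the code in the pull request, and it does not call Stanford NER or Tika), the hypothetical helper `groupByType` below collects already-recognized (text, type) pairs into a sorted, multi-valued map. The class name, the method name, and the hard-coded entity pairs are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mimics the multi-valued metadata layout shown above.
class EntityMetadataSketch {

    // Hypothetical helper: each input element is {entityText, entityType}.
    // Returns a map from entity type to the distinct entity texts,
    // in first-seen order, with keys sorted alphabetically.
    static Map<String, List<String>> groupByType(List<String[]> entities) {
        Map<String, List<String>> metadata = new TreeMap<>();
        for (String[] entity : entities) {
            String text = entity[0];
            String type = entity[1];
            List<String> values =
                metadata.computeIfAbsent(type, k -> new ArrayList<>());
            if (!values.contains(text)) {
                values.add(text); // skip repeated mentions of the same entity
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Assumed recognizer output for the sample text in this message.
        List<String[]> entities = Arrays.asList(
            new String[] {"Rajat Raina", "PERSON"},
            new String[] {"Tom Brady", "PERSON"},
            new String[] {"Stanford University", "ORGANIZATION"},
            new String[] {"California", "LOCATION"});

        // Print one "TYPE: [values]" line per entity type.
        groupByType(entities)
            .forEach((type, values) -> System.out.println(type + ": " + values));
    }
}
```

Running `main` prints the three entity lines in the same `KEY: [values]` form as the LOCATION, ORGANIZATION, and PERSON lines above; how the actual parser populates the metadata may differ.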
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/TaichiHo/tika TIKA-1787
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tika/pull/62.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #62
----
commit b94331ece262bb8d8408dda7b22b6dc0bb69557e
Author: Taichi <[email protected]>
Date: 2015-11-05T22:47:22Z
fix for TIKA-1787 contributed by Yueheng He
----
> Include Stanford Name Entity Recognition in Tika
> ------------------------------------------------
>
> Key: TIKA-1787
> URL: https://issues.apache.org/jira/browse/TIKA-1787
> Project: Tika
> Issue Type: Improvement
> Components: mime, parser
> Affects Versions: 1.12
> Environment: Java 1.8, Mac OSX 10.11
> Reporter: Yueheng He
> Labels: features, newbie, test
> Fix For: 1.12
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Using the Stanford Named Entity Recognizer, Tika will be able to extract named
> entities such as PERSON, ORGANIZATION, and LOCATION from the given text. The
> extracted named entities will be added to the metadata.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)