[ 
https://issues.apache.org/jira/browse/TIKA-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992696#comment-14992696
 ] 

ASF GitHub Bot commented on TIKA-1787:
--------------------------------------

GitHub user TaichiHo opened a pull request:

    https://github.com/apache/tika/pull/62

    fix for TIKA-1787 contributed by Yueheng He

    Builds successfully with Java 1.8.0_65.
    To see the effect, create a text file like the following:
    ```
    Good afternoon Rajat Raina, how are you today? Hi, I am Tom Brady. I go to school at Stanford University, which is located in California.
    ```
    Save it as test.ner and pass it to Tika:
    ```
    java -classpath tika-app/target/tika-app-1.12-SNAPSHOT.jar org.apache.tika.cli.TikaCLI -m test.ner
    ```
    The result should look like this:
    ```
    Content-Length: 137
    Content-Type: application/stanford-ner
    LOCATION: [California]
    ORGANIZATION: [Stanford University]
    PERSON: [Rajat Raina, Tom Brady]
    X-Parsed-By: org.apache.tika.parser.DefaultParser
    X-Parsed-By: org.apache.tika.parser.stanfordNer.StanfordNerParser
    resourceName: test.ner
    ```
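For anyone who wants to reproduce the input file exactly, the following is a minimal sketch in Java that writes the sample sentence to test.ner as a single line with no trailing newline. The class name `WriteNerSample` and the helper `writeSample` are made up for illustration and are not part of the pull request. Written this way, the file is 137 bytes, which matches the Content-Length shown in the output above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tika or of the pull request.
public class WriteNerSample {
    // The sample sentence from the pull request, kept on a single line.
    static final String SAMPLE =
        "Good afternoon Rajat Raina, how are you today? Hi, I am Tom Brady. "
        + "I go to school at Stanford University, which is located in California.";

    // Writes the sample without a trailing newline and returns the file size in bytes.
    static long writeSample(Path target) throws IOException {
        Files.write(target, SAMPLE.getBytes(StandardCharsets.UTF_8));
        return Files.size(target);
    }

    public static void main(String[] args) throws IOException {
        long size = writeSample(Paths.get("test.ner"));
        System.out.println("Wrote test.ner (" + size + " bytes)");
    }
}
```

After running it, the TikaCLI command above can be pointed at the generated test.ner.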

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TaichiHo/tika TIKA-1787

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/62.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #62
    
----
commit b94331ece262bb8d8408dda7b22b6dc0bb69557e
Author: Taichi <[email protected]>
Date:   2015-11-05T22:47:22Z

    fix for TIKA-1787 contributed by Yueheng He

----


> Include Stanford Name Entity Recognition in Tika
> ------------------------------------------------
>
>                 Key: TIKA-1787
>                 URL: https://issues.apache.org/jira/browse/TIKA-1787
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime, parser
>    Affects Versions: 1.12
>         Environment: Java 1.8, Mac OSX 10.11
>            Reporter: Yueheng He
>              Labels: features, newbie, test
>             Fix For: 1.12
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Using Stanford Named Entity Recognition, Tika will be able to extract named 
> entities such as PERSON, ORGANIZATION, and LOCATION from the given text. The 
> extracted named entities will be added to the metadata.
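
As a rough illustration of the idea described in the issue, the sketch below groups recognized entities by type into a multi-valued metadata map. It is not the code from the pull request and does not call the Stanford NER or Tika APIs. The `Entity` class, the `toMetadata` method, and the choice to drop duplicate values are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and behavior are assumptions, not the PR's implementation.
public class EntityMetadataSketch {
    // Stand-in for one recognizer result: an entity type and the matched text.
    static final class Entity {
        final String type;
        final String text;

        Entity(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Groups entity texts under their type. Keys are sorted alphabetically,
    // values keep first-seen order, and repeated values are dropped (an assumption).
    static Map<String, List<String>> toMetadata(List<Entity> entities) {
        Map<String, List<String>> metadata = new TreeMap<>();
        for (Entity e : entities) {
            List<String> values = metadata.computeIfAbsent(e.type, k -> new ArrayList<>());
            if (!values.contains(e.text)) {
                values.add(e.text);
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Entities taken from the expected output in the pull request description.
        List<Entity> found = new ArrayList<>();
        found.add(new Entity("PERSON", "Rajat Raina"));
        found.add(new Entity("PERSON", "Tom Brady"));
        found.add(new Entity("ORGANIZATION", "Stanford University"));
        found.add(new Entity("LOCATION", "California"));

        // Prints one "TYPE: [value, ...]" line per entity type.
        toMetadata(found).forEach((type, values) -> System.out.println(type + ": " + values));
    }
}
```

With the four entities above, the program prints the LOCATION, ORGANIZATION, and PERSON lines in the same shape as the TikaCLI output quoted in the pull request.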



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
