[ https://issues.apache.org/jira/browse/TIKA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175188#comment-15175188 ]

Hudson commented on TIKA-1876:
------------------------------

UNSTABLE: Integrated in tika-trunk-jdk1.7 #917 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/917/])
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev a13369b098bea09421e35023c131adc092dcb6e4)
* tika-parsers/src/test/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniserTest.java
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
* tika-parsers/src/main/resources/org/apache/tika/parser/ner/nltk/NLTKServer.properties
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev 7ebe007ec03088449f67619ef1e6cb564178b14b)
* tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java
* tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* tika-server/src/main/java/org/apache/tika/server/RichTextContentHandler.java
* tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XWPFListManager.java
* tika-parsers/src/main/java/org/apache/tika/parser/ner/NERecogniser.java
* tika-core/src/main/java/org/apache/tika/mime/MimeType.java
* CHANGES.txt
* tika-server/src/main/java/org/apache/tika/server/resource/UnpackerResource.java
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev c809690ec87ffa600018dbc5eee6d6756645adb0)
* .gitignore
* tika-parsers/src/main/resources/org/apache/tika/parser/ner/nltk/NLTKServer.properties
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
* tika-parsers/src/test/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniserTest.java
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev 3a7e24c9a5d77ae41bde0c2106233a2064c5e707)
* .gitignore
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev 114d0ff24bd04395852012a3382d50c3e906e6db)
* tika-parsers/pom.xml
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev cdb684d9c1b0ebb01a783180f07417760fa04d6f)
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
Fix for TIKA-1876 Integrate Natural Language Toolkit (NLTK) into Tika (mattmann: rev 3fbc03cead1c54bd023a19e52e31609b51926d7d)
* CHANGES.txt


> Integrate Natural Language Toolkit (NLTK) into Tika to perform Named Entity Recognition
> ---------------------------------------------------------------------------------------
>
>                 Key: TIKA-1876
>                 URL: https://issues.apache.org/jira/browse/TIKA-1876
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.13
>            Reporter: Manali Shah
>            Assignee: Chris A. Mattmann
>             Fix For: 1.13
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi all,
> Apache Tika already performs Named Entity Recognition (NER) using Apache
> OpenNLP and Stanford CoreNLP. The Natural Language Toolkit (NLTK) is another
> open source library, written in Python, and I believe it would be a great
> idea to integrate it with Tika.
> NLTK can extract named entities as well as classify them. For this purpose,
> Prof. Chris Mattmann and I have published NLTKRest, a Python module
> installable via pip/setuptools that exposes NLTK as a REST service.
> I have tested Tika together with NLTKRest in my local repository
> and will soon submit a pull request.
> Link to the REST server: https://github.com/manalishah/NLTKRest



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
