[ https://issues.apache.org/jira/browse/TIKA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175188#comment-15175188 ]
Hudson commented on TIKA-1876:
------------------------------
UNSTABLE: Integrated in tika-trunk-jdk1.7 #917 (See
[https://builds.apache.org/job/tika-trunk-jdk1.7/917/])
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
a13369b098bea09421e35023c131adc092dcb6e4)
* tika-parsers/src/test/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniserTest.java
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
* tika-parsers/src/main/resources/org/apache/tika/parser/ner/nltk/NLTKServer.properties
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
7ebe007ec03088449f67619ef1e6cb564178b14b)
* tika-server/src/main/java/org/apache/tika/server/resource/TikaResource.java
* tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* tika-server/src/main/java/org/apache/tika/server/RichTextContentHandler.java
* tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XWPFListManager.java
* tika-parsers/src/main/java/org/apache/tika/parser/ner/NERecogniser.java
* tika-core/src/main/java/org/apache/tika/mime/MimeType.java
* CHANGES.txt
* tika-server/src/main/java/org/apache/tika/server/resource/UnpackerResource.java
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
c809690ec87ffa600018dbc5eee6d6756645adb0)
* .gitignore
* tika-parsers/src/main/resources/org/apache/tika/parser/ner/nltk/NLTKServer.properties
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
* tika-parsers/src/test/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniserTest.java
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
3a7e24c9a5d77ae41bde0c2106233a2064c5e707)
* .gitignore
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
114d0ff24bd04395852012a3382d50c3e906e6db)
* tika-parsers/pom.xml
fix for TIKA-1876 contributed by manalishah (manalishah.91: rev
cdb684d9c1b0ebb01a783180f07417760fa04d6f)
* tika-parsers/src/main/java/org/apache/tika/parser/ner/nltk/NLTKNERecogniser.java
Fix for TIKA-1876 Integrate Natural Language Toolkit (NLTK) into Tika
(mattmann: rev 3fbc03cead1c54bd023a19e52e31609b51926d7d)
* CHANGES.txt
> Integrate Natural Language Toolkit (NLTK) into Tika to perform Named Entity
> Recognition
> ---------------------------------------------------------------------------------------
>
> Key: TIKA-1876
> URL: https://issues.apache.org/jira/browse/TIKA-1876
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.13
> Reporter: Manali Shah
> Assignee: Chris A. Mattmann
> Fix For: 1.13
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Hi all,
> Apache Tika already performs Named Entity Recognition (NER) using Apache
> OpenNLP and Stanford CoreNLP. The Natural Language Toolkit (NLTK) is another
> open source library, written in Python, and I believe it would be a great
> idea to integrate it with Tika.
> NLTK can extract named entities as well as classify them. For this purpose,
> Prof. Chris Mattmann and I have published NLTKRest, a Python module,
> installable via pip/setuptools, that exposes NLTK as a REST service.
> I have tested Tika together with NLTKRest in my local repository and will
> soon submit a pull request.
> Link to the REST server: https://github.com/manalishah/NLTKRest