[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065485#comment-15065485 ]

ASF GitHub Bot commented on TIKA-1815:
--------------------------------------

GitHub user thammegowda opened a pull request:

    https://github.com/apache/tika/pull/66

    Fix for TIKA-1815 contributed by Thamme Gowda

    + Outputting the text content to XMLDocumentHandler

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/thammegowda/tika fix-TIKA-1815

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #66
    
----
commit e96da2bc28d5eef81d034e39eb05099ed5d38ac1
Author: Thamme Gowda <[email protected]>
Date:   2015-10-30T21:47:45Z

    Add NamedEntityParser
    
    Add OpenNLPNERecogniser as default

commit a720507a1c1906a501470a7d5c5cec335412fcd3
Author: Thamme Gowda <[email protected]>
Date:   2015-10-30T22:16:11Z

    Set charset for converting text to stream

commit 6b1a20e681a5d319886464ec147967c876b7e60d
Author: Thamme Gowda <[email protected]>
Date:   2015-10-31T04:23:43Z

    Automated OpenNLP NER model downloader

commit e381ea88ebd2bb8f5adfe36d710acfce673e30aa
Author: Thamme Gowda <[email protected]>
Date:   2015-11-04T00:31:40Z

    using a secondary parser to convert non-text streams

commit ea7871bd4afae7d18e500ffc285e58afd08f5e86
Author: Thamme Gowda <[email protected]>
Date:   2015-11-08T07:36:48Z

    Add regex based NER

commit 084985b3612438e9ca7107fecdffd67757d04d10
Author: Thamme Gowda <[email protected]>
Date:   2015-11-08T07:38:17Z

    Add CoreNLP NER with runtime binding

commit e4d74218ece77143d1e5245a3ef64ddf5578c310
Author: Thamme Gowda <[email protected]>
Date:   2015-11-08T23:41:15Z

    Added support for chaining NER implementations

commit 7e6b43c83ec6cdd35ea258f52c0110ba986c82b3
Author: Thamme Gowda <[email protected]>
Date:   2015-11-09T05:58:58Z

    charset specified

commit caba68773a287752dea43f3366e6d4309fde861c
Author: Thamme Gowda <[email protected]>
Date:   2015-11-10T01:34:04Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 08b916790b279cda0201f2529ca58646dea4b2f9
Author: Thamme Gowda <[email protected]>
Date:   2015-11-10T19:06:29Z

    Resolved Code formatting issues
    
    + Removed star imports
    + Removed dead code / commented code
    + Added License header to missing files

commit e07ac630d54cc79d9a7bfc9ac82332474d07434b
Author: Thamme Gowda <[email protected]>
Date:   2015-11-16T09:05:07Z

    Add missing doc strings, fix code formatting issues

commit 96d4d7cc29d4bcd8ac0cf7a595c39b6ed64d4d19
Author: Thamme Gowda <[email protected]>
Date:   2015-11-18T03:03:41Z

    Fix: build phase for model downloader

commit 6d0b121b8b321e8a31257fc608bb001d3fe7afb5
Author: Thamme Gowda <[email protected]>
Date:   2015-12-11T14:33:36Z

    Merge branch 'trunk' of github.com:apache/tika into trunk

commit 66d3a10ffabf1f54cff384ce1c7325c2a3c16279
Author: Thamme Gowda <[email protected]>
Date:   2015-12-19T18:59:26Z

    Fix : TIKA-1815 by Thamme Gowda N.
    
    1. Writing text content to XMLContentHandler
    2. Added RegexNERParser to Default parser chain

----


> Text content from parser is empty when NamedEntityParser is enabled
> -------------------------------------------------------------------
>
>                 Key: TIKA-1815
>                 URL: https://issues.apache.org/jira/browse/TIKA-1815
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Thamme Gowda N
>             Fix For: 1.12
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When the NamedEntityParser is enabled, Tika#parseToString() and the other
> parse() methods produce an empty string.
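
The symptom in the issue above is what you would expect from a parser that reads the text for analysis but never emits it as SAX character events, so the caller's content handler collects nothing. Below is a minimal sketch of that mechanism using only the JDK's SAX classes, not Tika's API. The class EmptyTextSketch, its methods, and the forwardText flag are hypothetical names for illustration; this is not the code from the pull request.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EmptyTextSketch {

    /** Collects SAX character events, as a text-extracting handler would. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        String text() {
            return text.toString();
        }
    }

    /**
     * Hypothetical stand-in for a parser that analyses the content.
     * With forwardText == false it only opens and closes the document,
     * so the handler never sees any text (the reported symptom).
     * With forwardText == true it also writes the text to the handler.
     */
    static void parse(String content, ContentHandler handler, boolean forwardText)
            throws SAXException {
        handler.startDocument();
        if (forwardText) {
            char[] chars = content.toCharArray();
            handler.characters(chars, 0, chars.length);
        }
        handler.endDocument();
    }

    /** Runs the stand-in parser and returns whatever text the handler collected. */
    static String extract(String content, boolean forwardText) {
        TextCollector collector = new TextCollector();
        try {
            parse(content, collector, forwardText);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text();
    }

    public static void main(String[] args) {
        String content = "Sample text mentioning Los Angeles.";
        System.out.println("without forwarding: [" + extract(content, false) + "]");
        System.out.println("with forwarding:    [" + extract(content, true) + "]");
    }
}
```

In this sketch the only difference between the empty and non-empty result is the characters() call, which is the kind of change the final commit message describes as writing the text content to the handler.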



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
