[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065485#comment-15065485 ]
ASF GitHub Bot commented on TIKA-1815:
--------------------------------------
GitHub user thammegowda opened a pull request:
https://github.com/apache/tika/pull/66
Fix for TIKA-1815 contributed by Thamme Gowda
+ Outputting the text content to XMLDocumentHandler
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/thammegowda/tika fix-TIKA-1815
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tika/pull/66.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #66
----
commit e96da2bc28d5eef81d034e39eb05099ed5d38ac1
Author: Thamme Gowda <[email protected]>
Date: 2015-10-30T21:47:45Z
Add NamedEntityParser
Add OpenNLPNERecogniser as default
commit a720507a1c1906a501470a7d5c5cec335412fcd3
Author: Thamme Gowda <[email protected]>
Date: 2015-10-30T22:16:11Z
Set charset for converting text to stream
commit 6b1a20e681a5d319886464ec147967c876b7e60d
Author: Thamme Gowda <[email protected]>
Date: 2015-10-31T04:23:43Z
Automated OpenNLP NER model downloader
commit e381ea88ebd2bb8f5adfe36d710acfce673e30aa
Author: Thamme Gowda <[email protected]>
Date: 2015-11-04T00:31:40Z
using a secondary parser to convert non-text streams
commit ea7871bd4afae7d18e500ffc285e58afd08f5e86
Author: Thamme Gowda <[email protected]>
Date: 2015-11-08T07:36:48Z
Add regex based NER
commit 084985b3612438e9ca7107fecdffd67757d04d10
Author: Thamme Gowda <[email protected]>
Date: 2015-11-08T07:38:17Z
Add CoreNLP NER with runtime binding
commit e4d74218ece77143d1e5245a3ef64ddf5578c310
Author: Thamme Gowda <[email protected]>
Date: 2015-11-08T23:41:15Z
Added support for chaining NER implementations
commit 7e6b43c83ec6cdd35ea258f52c0110ba986c82b3
Author: Thamme Gowda <[email protected]>
Date: 2015-11-09T05:58:58Z
charset specified
commit caba68773a287752dea43f3366e6d4309fde861c
Author: Thamme Gowda <[email protected]>
Date: 2015-11-10T01:34:04Z
Merge branch 'trunk' of github.com:apache/tika into trunk
commit 08b916790b279cda0201f2529ca58646dea4b2f9
Author: Thamme Gowda <[email protected]>
Date: 2015-11-10T19:06:29Z
Resolved Code formatting issues
+ Removed star imports
+ Removed dead code / commented code
+ Added License header to missing files
commit e07ac630d54cc79d9a7bfc9ac82332474d07434b
Author: Thamme Gowda <[email protected]>
Date: 2015-11-16T09:05:07Z
Add missing doc strings, fix code formatting issues
commit 96d4d7cc29d4bcd8ac0cf7a595c39b6ed64d4d19
Author: Thamme Gowda <[email protected]>
Date: 2015-11-18T03:03:41Z
Fix: build phase for model downloader
commit 6d0b121b8b321e8a31257fc608bb001d3fe7afb5
Author: Thamme Gowda <[email protected]>
Date: 2015-12-11T14:33:36Z
Merge branch 'trunk' of github.com:apache/tika into trunk
commit 66d3a10ffabf1f54cff384ce1c7325c2a3c16279
Author: Thamme Gowda <[email protected]>
Date: 2015-12-19T18:59:26Z
Fix : TIKA-1815 by Thamme Gowda N.
1. Writing text content to XMLContentHandler
2. Added RegexNERParser to Default parser chain
----
> Text content from parser is empty when NamedEntityParser is enabled
> -------------------------------------------------------------------
>
> Key: TIKA-1815
> URL: https://issues.apache.org/jira/browse/TIKA-1815
> Project: Tika
> Issue Type: Bug
> Components: parser
> Reporter: Thamme Gowda N
> Fix For: 1.12
>
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> When the NamedEntityParser is enabled, Tika#parseToString() and the other
> parse() methods produce an empty string.
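
The symptom above, together with the fix summary ("Writing text content to XMLContentHandler"), suggests the text was analysed but never forwarded to the SAX content handler. The following is a minimal sketch of that failure mode using only the JDK's SAX classes. It is not Tika's code: TextCollector, analyze, parseWithoutForwarding and parseWithForwarding are hypothetical names made up for illustration.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TextForwardingSketch {

    /** Collects character events, similar in spirit to a body-text handler. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public String toString() {
            return text.toString();
        }
    }

    /** Hypothetical stand-in for entity recognition: counts whitespace-separated tokens. */
    static int analyze(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Analyses the text but emits no character events, so the handler stays empty. */
    static void parseWithoutForwarding(String text, ContentHandler handler) throws SAXException {
        handler.startDocument();
        analyze(text);
        handler.endDocument();
    }

    /** Analyses the text and also forwards it to the handler as character events. */
    static void parseWithForwarding(String text, ContentHandler handler) throws SAXException {
        handler.startDocument();
        analyze(text);
        char[] chars = text.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        TextCollector broken = new TextCollector();
        parseWithoutForwarding("Apache Tika parses documents", broken);
        System.out.println("without forwarding: [" + broken + "]");

        TextCollector fixed = new TextCollector();
        parseWithForwarding("Apache Tika parses documents", fixed);
        System.out.println("with forwarding: [" + fixed + "]");
    }
}
```

In this sketch the only difference between the two methods is the characters() call. Any caller that builds its result string from handler events sees empty text when that call is missing.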
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)