[ https://issues.apache.org/jira/browse/STREAMS-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309171#comment-15309171 ]
Steve Blackmon commented on STREAMS-51:
---------------------------------------
A semi-working version of this module exists in a branch and is worth wrapping
up and merging, IMO. It can scrape HTML or other Tika document types to turn a
URL into a structured document, then populate actor.name, content, published,
and other Streams fields based on what's in the page.
> Complete, test, and document tika processor
> -------------------------------------------
>
> Key: STREAMS-51
> URL: https://issues.apache.org/jira/browse/STREAMS-51
> Project: Streams
> Issue Type: Story
> Reporter: Steve Blackmon
>
> Complete, test, and document tika processor
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)