Alright, got Any23 working. I can probably use it as the primary extractor and have Tika / Boilerpipe as a fallback. I'll experiment. If I come up with anything, I'll implement it in my project first, test it out, then file some issues and push it back to streams.
On 5/13/14, 9:21 PM, "Steve Blackmon" <[email protected]> wrote:

>No objections, that will be a great feature. Apache Any23 may be of
>interest since it contains a growing catalog of common microformats.
>
>Steve Blackmon
>[email protected]
>
>
>On Tue, May 13, 2014 at 12:39 PM, Matthew Hager [W2O Digital]
><[email protected]> wrote:
>> Team,
>>
>> I am looking to expand streams-processor-urls to include pulling
>>content from the page, determining the type of content that is, and
>>extracting as much meta data as possible from the page. Would anyone
>>have any objections to that being placed in the same package as there
>>will be a lot of 'overlap' between helper functions and dependencies.
>>
>> If not, I'll create the stories and start working on it.
>>
>> Thank you for your time!
>>
>> Thanks!
>> Matthew
