On 2010-08-15 06:54, Ken Krugler wrote:
For what it's worth, I just committed some patches to Tika that should
improve Tika's ability to extract HTML outlinks (in img and frame
elements, at least). Support for iframe should be coming soon :)
This is in 0.8-SNAPSHOT, and there's one troubling
[ https://issues.apache.org/jira/browse/NUTCH-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898706#action_12898706 ]
Chris A. Mattmann commented on NUTCH-887:
-
bq. Ah, good - I missed that, I need to