[ 
https://issues.apache.org/jira/browse/NUTCH-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207778#comment-16207778
 ] 

ASF GitHub Bot commented on NUTCH-2443:
---------------------------------------

jorgelbg opened a new pull request #230: NUTCH-2443 add source tag to the 
parse-html and parse-tika outlink ex…
URL: https://github.com/apache/nutch/pull/230
 
 
   Add support for the `video`/`source` tags in the outlink extractor of the 
`parse-html` and `parse-tika` plugins. 
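
As a rough illustration of the idea (this is not the code in the pull request, and it does not use any Nutch API), the sketch below walks a DOM tree and collects the `src` attribute of every `video` and `source` element, resolving it against the page URL. The class name `VideoOutlinkSketch`, the method `extractVideoLinks` and the sample URLs are hypothetical. It assumes well-formed XHTML input so that the JDK's XML parser can be used.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical stand-alone sketch; not part of Nutch.
public class VideoOutlinkSketch {

  // Returns absolute URLs found in the src attribute of <video> and <source> elements.
  static List<String> extractVideoLinks(String xhtml, String baseUrl) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
    List<String> links = new ArrayList<>();
    collect(doc.getDocumentElement(), URI.create(baseUrl), links);
    return links;
  }

  // Depth-first walk over the DOM tree, in document order.
  private static void collect(Node node, URI base, List<String> links) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
      String name = node.getNodeName().toLowerCase(Locale.ROOT);
      if (name.equals("video") || name.equals("source")) {
        // getAttribute returns an empty string when the attribute is absent.
        String src = ((Element) node).getAttribute("src");
        if (!src.isEmpty()) {
          links.add(base.resolve(src).toString());
        }
      }
    }
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      collect(children.item(i), base, links);
    }
  }

  public static void main(String[] args) throws Exception {
    String page = "<html><body><video src=\"intro.mp4\">"
        + "<source src=\"/media/intro.webm\"/></video></body></html>";
    for (String link : extractVideoLinks(page, "http://example.com/docs/index.html")) {
      System.out.println(link);
    }
  }
}
```

Note that this sketch collects every `source` element regardless of its parent, so it would also pick up `source` children of `audio` or `picture`; whether the plugins should restrict this is a separate design choice.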
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Extract links from the video tag with the parse-html plugin
> -----------------------------------------------------------
>
>                 Key: NUTCH-2443
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2443
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser, plugin
>    Affects Versions: 1.13
>            Reporter: Jorge Luis Betancourt Gonzalez
>            Assignee: Jorge Luis Betancourt Gonzalez
>            Priority: Minor
>             Fix For: 1.14
>
>
> At the moment the {{parse-html}} plugin extracts links from the tags {{a}}, 
> {{area}}, {{form}} (configurable), {{frame}}, {{iframe}}, {{script}}, 
> {{link}} and {{img}}. Since we already allow extracting links to binary files 
> (images), extracting links from the {{video}} tag should be supported as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
