[ https://issues.apache.org/jira/browse/NUTCH-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181079#comment-14181079 ]
Sebastian Nagel commented on NUTCH-1644:
----------------------------------------

What's the exact objective?
# extract embedded content (single elements, attributes) from DOM tree, or
# transform XML to readable and indexable text

> Should have a parser that uses xpath
> ------------------------------------
>
>                 Key: NUTCH-1644
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1644
>             Project: Nutch
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 2.2.1
>            Reporter: cihad güzel
>            Assignee: Lewis John McGibbney
>              Labels: parser, xpath
>             Fix For: 2.3
>
>         Attachments: NUTCH-1644.patch
>
>
> May want to parse some url via xpath. May be blog or news web sites. Should
> be a plugin using xpath parse.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)