[ https://issues.apache.org/jira/browse/CONNECTORS-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733194#comment-16733194 ]
Karl Wright commented on CONNECTORS-1557:
-----------------------------------------
We really cannot support two slightly different HTML extractors, so I'm
uncomfortable committing this as-is, unless it's structured as a
backwards-compatible extension of the existing extractor. Therefore, can you
explain in detail what you did, and what specific functional changes you made?
Thanks.
> HTML Tag extractor
> ------------------
>
> Key: CONNECTORS-1557
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1557
> Project: ManifoldCF
> Issue Type: New Feature
> Affects Versions: ManifoldCF 2.11
> Reporter: Donald Van den Driessche
> Assignee: Karl Wright
> Priority: Major
> Attachments: html-tag-extraction-connector.zip
>
>
> I wrote an HTML Tag extractor, based on the HTML Extractor.
> I needed to extract specific HTML tags and transfer them to their own field
> in my output repository.
> Input
> * Englobing tag (CSS selector)
> * Blacklist (CSS selector)
> * Fieldmapping (CSS selector)
> * Strip HTML
> Process
> * Retrieve Englobing tag
> * Remove blacklist
> * Map each CSS selector in Fieldmapping to its field (as an array if there
> are multiple matches), stripping HTML if requested
> * Return the Englobing tag minus the blacklist as the output (content),
> stripping HTML if requested
> How can I best deliver the source code?
>
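For reference while discussing the functional changes, the Process steps quoted above can be sketched roughly as follows. This is only an illustration and not the code from the attached zip: it is written in Python with the standard library rather than as a ManifoldCF connector, and the names extract, Node and TreeBuilder are hypothetical. It also assumes well-formed HTML and supports only three selector forms (tag, .class, #id), where a real implementation would use a full CSS selector engine.

```python
from html.parser import HTMLParser

VOID = {"br", "hr", "img", "input", "link", "meta"}  # tags with no end tag


class Node:
    """Minimal element node; children are Node objects or text strings."""

    def __init__(self, tag, attrs, parent=None):
        self.tag, self.attrs, self.parent = tag, dict(attrs), parent
        self.children = []

    def matches(self, selector):
        # Assumption: only "tag", ".class" and "#id" selectors are supported.
        if selector.startswith("."):
            return selector[1:] in (self.attrs.get("class") or "").split()
        if selector.startswith("#"):
            return self.attrs.get("id") == selector[1:]
        return self.tag == selector

    def select(self, selector):
        """Return all descendants matching the selector, in document order."""
        found = []
        for child in self.children:
            if isinstance(child, Node):
                if child.matches(selector):
                    found.append(child)
                found.extend(child.select(selector))
        return found

    def text(self):
        """Text content with markup stripped and whitespace collapsed."""
        parts = (c.text() if isinstance(c, Node) else c for c in self.children)
        return " ".join("".join(parts).split())

    def html(self):
        """Re-serialized markup (attributes are dropped in this sketch)."""
        inner = "".join(c.html() if isinstance(c, Node) else c
                        for c in self.children)
        return f"<{self.tag}>{inner}</{self.tag}>"


class TreeBuilder(HTMLParser):
    """Builds a Node tree; mismatched end tags are simply ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("root", [])
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID:
            self.current = node

    def handle_endtag(self, tag):
        if self.current.tag == tag and self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.children.append(data)


def extract(html, englobing, blacklist, field_mapping, strip_html=True):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    # 1. Retrieve the Englobing tag (first match only, by assumption).
    scopes = builder.root.select(englobing)
    if not scopes:
        return {"content": "", "fields": {}}
    scope = scopes[0]
    # 2. Remove every blacklisted element inside it.
    for selector in blacklist:
        for node in scope.select(selector):
            node.parent.children.remove(node)
    # 3. Map each field selector; a list only when there are several matches.
    render = Node.text if strip_html else Node.html
    fields = {}
    for field, selector in field_mapping.items():
        values = [render(n) for n in scope.select(selector)]
        if values:
            fields[field] = values[0] if len(values) == 1 else values
    # 4. The Englobing tag minus the blacklist becomes the content.
    return {"content": render(scope), "fields": fields}


SAMPLE = """<html><body>
<div id="main">
<h1>Title</h1>
<p class="ad">Buy now</p>
<p>First</p>
<p>Second</p>
</div>
<div id="footer">Footer</div>
</body></html>"""

result = extract(SAMPLE, "#main", [".ad"], {"title": "h1", "paragraphs": "p"})
print(result["content"])  # Title First Second
print(result["fields"])   # {'title': 'Title', 'paragraphs': ['First', 'Second']}
```

Because the blacklist is removed before the field mapping runs, the "Buy now" paragraph appears in neither the content nor the paragraphs field. Whether that ordering matches the submitted connector is an assumption based on the order of the Process list.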