[
https://issues.apache.org/jira/browse/NUTCH-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106802#comment-16106802
]
Kaidul Islam edited comment on NUTCH-2389 at 7/31/17 5:08 AM:
--------------------------------------------------------------
Hi [~lewismc], you're welcome :) I was able to build successfully using {{ant
clean runtime}}, but the Jenkins build is now reported as failed. According to
the failure log, the failures occurred mainly in the unit tests. I've made some
changes and would like to verify that the Jenkins build passes before I send a
PR. How can I run the Jenkins build checks before opening a PR?
was (Author: kaidul):
Hi [~lewismc], you're welcome :) I was able to build successfully using {{ant
clean runtime}}, but the Jenkins build is now reported as failed. After reading
the failure log, I've made some changes and would like to verify that the
Jenkins build passes before I send a PR. How can I run the Jenkins build checks
before opening a PR?
> Precise data parsing using Jsoup CSS selectors
> ----------------------------------------------
>
> Key: NUTCH-2389
> URL: https://issues.apache.org/jira/browse/NUTCH-2389
> Project: Nutch
> Issue Type: New Feature
> Components: parser
> Affects Versions: 2.3
> Reporter: Kaidul Islam
> Assignee: Kaidul Islam
> Fix For: 2.4
>
> Original Estimate: 0.05h
> Remaining Estimate: 0.05h
>
> As far as I know, neither Nutch 1.x nor 2.x currently has a feature to
> extract/parse exact content for specific websites. I've developed a plugin,
> {{parse-jsoup}}, using Jsoup for my current project to extract precise content
> for site-specific crawling using a detailed XML configuration (field name,
> CSS selector, attribute, extraction rules, data type, default value, etc.).
> Please let me know if this feature seems relevant and is not already present
> in Nutch. I also plan to port it to Nutch 1.x.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)