Kaidul Islam created NUTCH-2389:
-----------------------------------

             Summary: Precise data parsing using Jsoup CSS selectors
                 Key: NUTCH-2389
                 URL: https://issues.apache.org/jira/browse/NUTCH-2389
             Project: Nutch
          Issue Type: New Feature
          Components: parser
    Affects Versions: 2.3
            Reporter: Kaidul Islam
            Assignee: Kaidul Islam
             Fix For: 2.4


Currently, Nutch 1.x and 2.x have no feature for extracting/parsing exact content
from specific websites. I've developed a plugin for my current project that uses
Jsoup CSS selectors to extract precise content for site-specific crawling.

Please let me know if this feature seems relevant and is not already present in
Nutch. I also plan to port it to Nutch 1.x.
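
To give a rough idea of what selector-based extraction looks like, here is a
minimal standalone sketch using Jsoup. It is not the plugin's code: the class
name, the extract() helper, the sample HTML, and the "div.article > p" selector
are all made up for illustration, and it assumes the Jsoup jar is on the
classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Nutch or of the proposed plugin.
public class SelectorExtractSketch {

    // Returns the text of every element matching cssQuery, in document order.
    static List<String> extract(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        List<String> out = new ArrayList<>();
        for (Element el : doc.select(cssQuery)) {
            out.add(el.text());
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up page: a navigation block plus the article body we care about.
        String html = "<html><body>"
            + "<div id=\"nav\"><p>Home</p></div>"
            + "<div class=\"article\"><h1>Title</h1><p>First.</p><p>Second.</p></div>"
            + "</body></html>";

        // Only paragraphs that are direct children of div.article are selected;
        // the navigation paragraph and the heading are skipped.
        System.out.println(extract(html, "div.article > p"));
    }
}
```

In a site-specific crawl, the selector would presumably come from per-site
configuration rather than being hard-coded as it is here.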



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
