lewismc opened a new pull request, #1380:
URL: https://github.com/apache/bigtop/pull/1380

   ### Description of PR
   [BIGTOP-284](https://issues.apache.org/jira/browse/BIGTOP-284) seeks to 
introduce [Apache Nutch](https://nutch.apache.org) smoke tests into the Bigtop 
ecosystem. I commented on the original ticket way back in 2011 and never did 
anything about it. This PR seeks to address that. 
   Nutch is a highly extensible, highly scalable, mature, production-ready Web 
crawler which enables fine-grained configuration and accommodates a wide 
variety of data acquisition tasks. Nutch relies on Apache Hadoop data 
structures and is great for batch processing large data volumes via 
MapReduce jobs, but it can also be tailored to smaller jobs.
   
   ### How was this patch tested?
   Testing is ongoing. The goal is for the Nutch community to test this patch 
and hopefully update this thread with feedback. More details to follow.
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'BIGTOP-3638. Your PR title ...')?
   - [X] Make sure that newly added files do not have any licensing issues. 
When in doubt refer to https://www.apache.org/licenses/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
