Hi Lewis, thanks! Since Common Crawl runs Nutch on a Bigtop cluster, any deeper integration into the Bigtop ecosystem is very welcome. The smoke tests are probably the most useful part: every time Bigtop is updated, it takes a while to verify that Nutch and all its plugins are running smoothly.
But I have no good idea about packaging. All the Bigtop packages are infrastructure, providing core components or services. The way Nutch is used and deployed on a Hadoop cluster is entirely in "user space": the jar and configuration files are specific to a particular Nutch job setup, and their classpath does not interfere with that of other jobs. "Nutch server" would be easier to ship as a Bigtop package, but Nutch server was never designed to run on a cluster, only in local mode. ~Sebastian

