CLASSIFICATION: UNCLASSIFIED

Working through the tutorial for Nutch v1.

urls/seed.txt contains:

    https://the.website.mil/inside/

regex-urlfilter.txt contains edits...

    # accept anything else
    #+.
    # limit to the.website.mil
    +^https://([a-z0-9]*\.)the.website.mil/inside

Yet nothing gets populated in the crawl db...

    bin/nutch inject crawl/crawldb urls
    Injector: starting at 2016-07-21 07:32:02
    Injector: crawlDb: crawl/crawldb
    Injector: urlDir: urls
    Injector: Converting injected urls to crawl db entries.
    Injector: Total number of urls rejected by filters: 1
    Injector: Total number of urls after normalization: 0
    Injector: Merging injected urls into crawl db.
    Injector: overwrite: false
    Injector: update: false
    Injector: URLs merged: 0
    Injector: Total new urls injected: 0

Thanks,
Kris

~~~~~~~~~~~~~~~~~~~~~~~~~~
Kris T. Musshorn
FileMaker Developer - Contractor - Catapult Technology Inc.
US Army Research Lab
Aberdeen Proving Ground
Application Management & Development Branch
410-278-7251
[email protected]
~~~~~~~~~~~~~~~~~~~~~~~~~~
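A quick way to see why the injector reports one rejected URL is to test the accept rule against the seed URL outside of Nutch. The sketch below is only an approximation: it uses Python's `re` module as a stand-in for the Java regex engine Nutch uses, and it assumes the filter applies the pattern with "find" semantics (a match anywhere in the URL, anchored here by `^`). The `RELAXED` pattern and the `accepts` helper are hypothetical names introduced for illustration; they are not part of Nutch.

```python
import re

# Pattern copied from the accept rule in regex-urlfilter.txt,
# without the leading "+" that marks it as an accept rule.
ORIGINAL = r"^https://([a-z0-9]*\.)the.website.mil/inside"

# Hypothetical variant (an assumption, not from the original post):
# the trailing "*" lets the subdomain group repeat zero or more times.
RELAXED = r"^https://([a-z0-9]*\.)*the.website.mil/inside"

# Seed URL from urls/seed.txt.
SEED = "https://the.website.mil/inside/"


def accepts(pattern: str, url: str) -> bool:
    """Return True if the pattern is found in the URL (search semantics)."""
    return re.search(pattern, url) is not None


print(accepts(ORIGINAL, SEED))  # False
print(accepts(RELAXED, SEED))   # True
```

The original group `([a-z0-9]*\.)` is mandatory, so the rule only matches hosts with one extra label in front of `the.website.mil` (for example a `www.` prefix). The seed URL has no such label, and with the catch-all `+.` rule commented out no other rule accepts it, which would be consistent with the "rejected by filters: 1" line in the log.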

