Hi Ryan,

you could have a look at the plugin scoring-depth. It tracks the depth (the number of links away from one of the seeds) of each crawled page and could be modified to also write the parents (maybe only the first) into the CrawlDatum metadata.
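To illustrate the idea only: the sketch below does not use any Nutch API. A plain map stands in for a per-URL metadata entry holding the first parent, and the class and method names (ParentTrace, addSeed, recordLink, pathToSeed) are made up for this example. Once every page stores its first parent, following those entries leads back to a seed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in, not Nutch code: the map plays the role of a
// "first parent" entry that a modified plugin might keep per URL.
class ParentTrace {
    private final Set<String> seeds = new HashSet<>();
    private final Map<String, String> firstParent = new HashMap<>();

    // Seeds have no parent; links pointing back to them are ignored.
    void addSeed(String url) {
        seeds.add(url);
    }

    // Called once per discovered outlink; only the first parent is kept.
    void recordLink(String parentUrl, String childUrl) {
        if (!seeds.contains(childUrl)) {
            firstParent.putIfAbsent(childUrl, parentUrl);
        }
    }

    // Follows the stored parents from url until no parent is recorded.
    // The contains() check guards against cycles in the recorded links.
    List<String> pathToSeed(String url) {
        List<String> path = new ArrayList<>();
        String current = url;
        while (current != null && !path.contains(current)) {
            path.add(current);
            current = firstParent.get(current);
        }
        return path;
    }
}
```

Calling pathToSeed on a page returns the page itself first, then its first parent, and so on, ending at the seed if the chain of recorded parents is complete.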
Best,
Sebastian

On 4/9/19 9:08 PM, Ryan Suarez wrote:
> Greetings,
>
> We are running nutch v1.5 with SOLR v7.3.1
>
> I would like to determine how a specific site was crawled. What were
> the parent links that the nutch crawler followed all the way back to
> the root?
>
> Could someone let me know what is the best way to accomplish this?
>
> regards,
> Ryan
>

