Hi, I am using Nutch to generate a small dataset of web pages, on which I plan to run a focused crawler later.
I did a test crawl and I have the 'segments' folder built up. Now I need to get the exact HTML pages it fetched from the seed URL(s). Is it possible to create a dataset this way? If so, how do I get those HTML pages? Thanks a lot!

