I want to loop through URLs which have been crawled / indexed.

I have a (known) subset of URLs for which I want to get the (raw) content.

If I know the segment, I can do something like this:
      String segName = "20100817162607";
      String url = "http://adomain.com/awebappOfInterest/someContent.do";

      HitDetails detail = new HitDetails(segName, url);
      Configuration conf = NutchConfiguration.create();

      NutchBean bean = new NutchBean(conf);

      byte[] contentBytes = bean.getContent(detail);
      for (byte b : contentBytes)
      {
         System.out.print((char)b);
      }
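
One caveat with the loop above: casting each byte to char only round-trips
single-byte characters. A minimal sketch of decoding the whole array at once,
assuming the stored content is UTF-8 text (the class name ContentDecoder and
the sample bytes are made up for illustration; the real charset would have to
come from the page's Content-Type):

```java
import java.nio.charset.StandardCharsets;

public class ContentDecoder {
    // Decode the raw content bytes in one step.
    // ASSUMPTION: the content is UTF-8; substitute the page's actual charset.
    static String decode(byte[] contentBytes) {
        return new String(contentBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the byte[] returned by bean.getContent(detail).
        byte[] sample = "<html>caf\u00e9</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(sample));
    }
}
```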

My question is: given a known URL, how can I find which segment it is in? Is
there something in the API that takes a URL and returns the name of the
segment it is found in?

regards,
-henry

