The segment is recorded as a field in the Lucene index. One easy way to do this
would be to:
- do a Lucene search for the URL,
- read the "segment" field from the resulting Lucene Document, and
- call SegmentReader with this value as the segment argument.
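
As a rough illustration of those three steps, here is a minimal sketch in plain Java. It deliberately does not call Lucene or Nutch: the Map is only a stand-in for the index, and the class name SegmentLookup, the methods index/findSegment/readerArgs, and the "crawl/segments" directory are all hypothetical names made up for this example. It also assumes that the "segment" field holds the segment's directory name rather than a full path, and that SegmentReader -get takes the segment first and the URL second; check the usage message of your Nutch version before relying on either.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Sketch of the url -> segment -> SegmentReader flow.
 * The Map below is only a stand-in for the Lucene index.
 */
final class SegmentLookup {
    // Stand-in for the index: url -> stored value of the "segment" field.
    private final Map<String, String> segmentByUrl = new HashMap<>();
    // Hypothetical directory that holds all segment directories.
    private final String segmentsDir;

    SegmentLookup(String segmentsDir) {
        this.segmentsDir = segmentsDir;
    }

    // Mimics indexing one document with its "url" and "segment" fields.
    void index(String url, String segmentName) {
        segmentByUrl.put(url, segmentName);
    }

    // Steps 1 and 2: look the url up and read back its "segment" value.
    Optional<String> findSegment(String url) {
        return Optional.ofNullable(segmentByUrl.get(url));
    }

    // Step 3: build the arguments for "SegmentReader -get <segment> <url>"
    // (argument order is an assumption, see the note above).
    Optional<List<String>> readerArgs(String url) {
        return findSegment(url)
                .map(name -> List.of("-get", segmentsDir + "/" + name, url));
    }

    public static void main(String[] args) {
        SegmentLookup lookup = new SegmentLookup("crawl/segments");
        lookup.index("http://example.com/", "20070830063000");
        System.out.println(lookup.readerArgs("http://example.com/"));
        System.out.println(lookup.readerArgs("http://example.com/missing"));
    }
}
```

In a real setup, findSegment would be replaced by a search on the url field of the Nutch index, and the empty result corresponds to a URL that was never indexed.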

Bart Robeyns
Panoptic




-----Original Message-----
From: Carl Cerecke [mailto:[EMAIL PROTECTED]
Sent: Thu 8/30/2007 6:30
To: [email protected]
Subject: Getting page information given the URL
 
Hi,

How do I get the page information from whichever segment it is in, given 
a URL?

I'm basically looking for a class to use from the command-line which, 
given an index and a url, returns me the information for that url from 
whichever segment it is in. Similar to SegmentReader -get, but without 
having to specify the segment.

This seems like it should be relatively simple to do, but it has evaded 
me thus far...

Is the best approach to merge all the segments (hundreds of them) into 
one big segment? Would this work? What would the performance be like for 
this approach?

Cheers,
Carl.
