OK. If that is the case, I guess I will have to go with the index-basic plugin modification.

-Ronny

-----Original Message-----
From: Sami Siren [mailto:[EMAIL PROTECTED] 
Sent: 20 June 2007 09:50
To: nutch-user@lucene.apache.org
Subject: Re: Lucene client and nutch index

Doğacan Güney wrote:
> On 6/20/07, Naess, Ronny <[EMAIL PROTECTED]> wrote:
>> Thanks, Doğacan.
>>
>> Thanks for the clarification concerning the content setting.
>>
>> The index-basic plugin modification is okay, but is it possible to 
>> access the segment data containing content from a lucene client?
> 
> It is possible if you are OK with adding the hadoop jar to your Lucene
> client. Take a look at FetchedSegments; it shows how to access
> content in a segment. Basically, content is a set of MapFiles (all
> part-*'s under content are MapFiles). When you want to access the
> content of a URL, you first apply a hash function to find out under
> which part it is stored, then get it with a MapFile.get(). This may
> sound difficult, but it is actually very easy. I would definitely
> suggest reading FetchedSegments.java, especially getContent and getEntry.
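
The lookup described above (hash the URL to pick a part, then do a keyed get on that part) can be sketched roughly as follows. This is only an illustration under stated assumptions and does not use the real Hadoop classes: PartLookupSketch and its methods are hypothetical names, each part is a sorted in-memory map standing in for a MapFile, and the hash and modulo steps are assumed to mirror what Hadoop's default hash partitioner does for text keys. For real segment data, FetchedSegments.java remains the reference.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a segment's content directory: each "part" is a
// sorted map playing the role of one part-* MapFile.
final class PartLookupSketch {

    // Byte-wise hash; assumed to mirror the hash Hadoop applies to text keys.
    static int hashBytes(byte[] bytes) {
        int hash = 1;
        for (byte b : bytes) {
            hash = 31 * hash + b;
        }
        return hash;
    }

    // Mask off the sign bit so the result is a valid index, then take the
    // remainder by the number of parts.
    static int partitionFor(String url, int numParts) {
        int hash = hashBytes(url.getBytes(StandardCharsets.UTF_8));
        return (hash & Integer.MAX_VALUE) % numParts;
    }

    private final List<TreeMap<String, String>> parts = new ArrayList<>();

    PartLookupSketch(int numParts) {
        for (int i = 0; i < numParts; i++) {
            parts.add(new TreeMap<>());
        }
    }

    // Writing and reading use the same partition function, so a reader only
    // has to open the one part that can contain the key.
    void put(String url, String content) {
        parts.get(partitionFor(url, parts.size())).put(url, content);
    }

    // Returns the stored content, or null if the URL is not in its part.
    String get(String url) {
        return parts.get(partitionFor(url, parts.size())).get(url);
    }

    public static void main(String[] args) {
        PartLookupSketch segment = new PartLookupSketch(4);
        segment.put("http://example.com/", "<html>example</html>");
        System.out.println(segment.get("http://example.com/"));
        System.out.println(segment.get("http://example.org/missing"));
    }
}
```

The point of the design is that the reader must use the same partition function and the same number of parts as the job that wrote the segment; otherwise the lookup goes to the wrong part and finds nothing.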

Note, however, that using Hadoop functionality may require Java 1.5.

--
 Sami Siren
