Hi Stefan,

thanks for your answer. I've already implemented this kind of solution, and it
fits perfectly into my wrapping framework. The performance loss is acceptable
because it occurs only at build time. I used the SegmentReader to read the
compressed content and write it into a Lucene index. If anybody else runs into
this problem (although I think I am the only coder still working in the stone
age :D), don't hesitate to ask me.
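
For anyone who finds this thread later, the build-time copy step has roughly
the shape below. This is only an illustrative sketch, not the code from my
framework: Record, IndexSink and MemorySink are hypothetical stand-ins so the
example runs on its own. In a real setup the iterator would wrap Nutch's
SegmentReader and the sink would build a Lucene Document and hand it to an
IndexWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch only: all nested types are hypothetical stand-ins, not Nutch or Lucene classes. */
class SegmentCopy {

    /** One segment entry: the URL plus its already-decompressed parse output. */
    static final class Record {
        final String url;
        final String title;
        final String text;

        Record(String url, String title, String text) {
            this.url = url;
            this.title = title;
            this.text = text;
        }
    }

    /** Stand-in for the target index; a real one would wrap a Lucene IndexWriter. */
    interface IndexSink {
        void add(Record r);
        void close();
    }

    /** In-memory sink so the sketch can run without any index on disk. */
    static final class MemorySink implements IndexSink {
        final List<Record> docs = new ArrayList<Record>();
        boolean closed = false;

        public void add(Record r) { docs.add(r); }
        public void close() { closed = true; }
    }

    /**
     * Copies every record that has parsed text from the source into the sink.
     * The sink is always closed, even if reading fails part-way through.
     * Returns the number of records written.
     */
    static int copy(Iterator<Record> source, IndexSink sink) {
        int copied = 0;
        try {
            while (source.hasNext()) {
                Record r = source.next();
                if (r.text == null || r.text.isEmpty()) {
                    continue; // nothing to index for this entry
                }
                sink.add(r);
                copied++;
            }
        } finally {
            sink.close();
        }
        return copied;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the contents of a segment.
        List<Record> segment = Arrays.asList(
            new Record("http://a.example/", "A", "alpha"),
            new Record("http://b.example/", "B", ""),
            new Record("http://c.example/", "C", "gamma"));
        MemorySink sink = new MemorySink();
        int n = copy(segment.iterator(), sink);
        System.out.println("copied " + n + " of " + segment.size() + " records");
    }
}
```

Keeping the reading side and the writing side behind small interfaces is what
let this fit into a wrapping framework: the existing search code only ever
sees the plain index that comes out at the end.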

Regards,
Fabian


-----Original Message-----
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 30 May 2006 23:48
To: [email protected]
Subject: Re: nutch compressing huge content data


Hi Fabian,
wow, Nutch 0.6 is really old school. :-)
However, the simplest thing you can do is write a class that
reads the data from a segment (parsed text and data) and writes it
into your own index.
That should be simple if you know how to write to a Lucene index.
HTH
Stefan



On 23.05.2006 at 15:31, Kraemer, Fabian wrote:

> Hi.
>
> I use lucene 1.4.3 and nutch 0.6. I have a working implementation  
> of lucene, searching over several indices. All the data is  
> generated directly from the db, not by a crawler. The search  
> request can go over multiple indices with boolean clauses for each  
> index.
>
> I want to use Nutch only for crawling and indexing, not for search
> (which is already implemented). The problem is that Nutch seems to
> compress the data in several fields of a document. I want neither to
> use Nutch's search mechanism nor to touch my working search
> implementation.
>
> I have two questions:
>
> 1) How can I stop Nutch from compressing the data in a field?
> 2) Will this "uncompressed" index be equivalent to an index produced by
> a Lucene (1.4.3) IndexWriter?
>
> Thanks for your help,
>
> Fabian
>
>



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
