Re: [Nutch-general] How does Nutch-0.7.2 data upgrade to 0.8?

Ken Krugler Wed, 23 Aug 2006 10:23:54 -0700

>That's really sad news for me. I will have to spend a lot of time fetching it
>all again.


If it's all just HTML, then you could do a quick hack in 0.8 to
"fetch" the pages from your 0.7 crawl, using a modified fetcher. You
wouldn't have all of the header info, but if everything is text/html
then you might be OK.

-- Ken


>Andrzej Bialecki wrote:
>>
>>  King Kong wrote:
>>>  I have fetched about 3 GB of pages with Nutch-0.7.2.
>>>  Now I want to move them to Nutch-0.8. How can I do it?
>>>  
>>
>>  Unfortunately, the data is not portable between these versions. The only
>>  thing you could do to preserve your webdb is to dump it into a text
>>  file, and then inject into a 0.8 crawldb. As for the segments, you will
>>  have to refetch them.
>>
>>  --
>>  Best regards,
>>  Andrzej Bialecki     <><
>>   ___. ___ ___ ___ _ _   __________________________________
>>  [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
>>  ___|||__||  \|  ||  |  Embedded Unix, System Integration
>>  http://www.sigram.com  Contact: info at sigram dot com
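
For the webdb part of Andrzej's suggestion (dump to a text file, then inject into a 0.8 crawldb), a rough sketch of the conversion step might look like the following. It assumes the 0.7 text dump lists each page's address on a line starting with "URL:" and that the 0.8 injector reads plain text files with one URL per line; check both against your own dump and version before relying on it. The function names and file paths are made up for illustration and are not part of Nutch.

```python
# Sketch only: turn a text dump of a 0.7 webdb into a seed list for 0.8.
# ASSUMPTION: each page record in the dump has a line like "URL: http://...".
# ASSUMPTION: the 0.8 injector accepts a text file with one URL per line.

URL_LABEL = "URL:"  # assumed label; adjust if your dump looks different


def extract_urls(lines):
    """Yield each distinct URL found in the dump, in first-seen order."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line.startswith(URL_LABEL):
            continue  # skip every other field of the page record
        url = line[len(URL_LABEL):].strip()
        if url and url not in seen:
            seen.add(url)
            yield url


def write_seed_file(dump_path, seed_path):
    """Read the dump at dump_path, write one URL per line to seed_path.

    Returns the number of URLs written.
    """
    count = 0
    with open(dump_path, encoding="utf-8", errors="replace") as src, \
            open(seed_path, "w", encoding="utf-8") as dst:
        for url in extract_urls(src):
            dst.write(url + "\n")
            count += 1
    return count
```

As far as I recall, the dump would come from something like "bin/nutch readdb <db> -dumppageurl" on the 0.7 side, and the resulting file would go into a directory passed to "bin/nutch inject <crawldb> <url_dir>" on the 0.8 side; confirm both against the usage output of your own install. Only the URLs carry over this way, so scores and fetch times start fresh.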

-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"

