>> you could do a quick hack in 0.8 to "fetch" the pages from your
>> 0.7 crawl, using a modified fetcher.
>
> What do you mean? Do I have to modify the fetcher code myself?

Yes, you'd have to modify the 0.8 fetcher code yourself (or rather,
create your own plug-in) so that it uses a Nutch 0.7 search setup to
get at all of the previously fetched content.
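
A rough sketch of the idea, written in Python only to keep it short. A real Nutch plug-in would be Java, and none of the names below are Nutch APIs. `OldCrawlStore` is a hypothetical stand-in for whatever lookup you build on top of the 0.7 search setup. The "fetch" step asks it for the stored bytes instead of going to the network and, since the original headers are not available, assumes text/html for every page:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class FetchedPage:
    url: str
    content: bytes
    content_type: str  # assumed value: the original HTTP headers are not available


class OldCrawlStore:
    """Hypothetical stand-in for a lookup against the 0.7 search setup."""

    def __init__(self, pages: Dict[str, bytes]) -> None:
        self._pages = pages

    def lookup(self, url: str) -> Optional[bytes]:
        # Return the previously fetched bytes, or None if the URL is unknown.
        return self._pages.get(url)


def fetch_from_old_crawl(
    urls: Iterable[str], store: OldCrawlStore
) -> Tuple[List[FetchedPage], List[str]]:
    """Return (pages recovered from the old crawl, URLs that were not found)."""
    pages: List[FetchedPage] = []
    missing: List[str] = []
    for url in urls:
        content = store.lookup(url)
        if content is None:
            missing.append(url)  # these would still need a real fetch
        else:
            pages.append(FetchedPage(url, content, "text/html"))
    return pages, missing
```

Anything that ends up in the second list was not in the old crawl and would still have to be fetched over the network.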

-- Ken


>Ken Krugler wrote:
>>
>>> That's really sad news for me. I'll have to spend a lot of time
>>> fetching it again.
>>
>>  If it's only just HTML, then you could do a quick hack in 0.8 to
>>  "fetch" the pages from your 0.7 crawl, using a modified fetcher. You
>>  wouldn't have all of the header info, but if everything is text/html
>>  then you might be OK.
>>
>>  -- Ken
>>
>>
>>>Andrzej Bialecki wrote:
>>>>
>>>>   King Kong wrote:
>>>>>   I have fetched about 3 GB of pages with Nutch 0.7.2.
>>>>>   Now I want to move them to Nutch 0.8. How can I do it?
>>>>> 
>>>>
>>>>   Unfortunately, the data is not portable between these versions. The
>>>>   only thing you could do to preserve your webdb is to dump it into a
>>>>   text file, and then inject into a 0.8 crawldb. As for the segments,
>>>>   you will have to refetch them.
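
The dump-and-inject step could be sketched roughly as below. The 0.8 injector reads plain text files with one URL per line, so the work is mostly turning the 0.7 text dump into such a list. The `URL: ` line prefix is an assumption about what the dump looks like, not a documented format, and the function names are made up for this example; check your own dump and adjust the prefix.

```python
from typing import Iterable, List

# Assumed prefix of the lines that carry a URL in the text dump.
URL_PREFIX = "URL: "


def extract_seed_urls(dump_lines: Iterable[str]) -> List[str]:
    """Collect unique URLs from the dump, keeping first-seen order."""
    seen = set()
    urls: List[str] = []
    for line in dump_lines:
        line = line.strip()
        if not line.startswith(URL_PREFIX):
            continue  # skip every line that does not carry a URL
        url = line[len(URL_PREFIX):].strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_seed_file(urls: Iterable[str], path: str) -> None:
    """Write one URL per line, ready to be placed in a seed directory."""
    with open(path, "w", encoding="utf-8") as out:
        for url in urls:
            out.write(url + "\n")
```

The resulting file would go into a directory that is then handed to the 0.8 inject command along with the new crawldb.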
>>>>
>>>>   --
>>>>   Best regards,
>>>>   Andrzej Bialecki     <><
>>>>    ___. ___ ___ ___ _ _   __________________________________
>>>>   [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
>>>>   ___|||__||  \|  ||  |  Embedded Unix, System Integration
>>>>   http://www.sigram.com  Contact: info at sigram dot com
>>
>>  --
>>  Ken Krugler
>>  Krugle, Inc.
>>  +1 530-210-6378
>>  "Find Code, Find Answers"
>>
>>
>
>--
>View this message in context: 
>http://www.nabble.com/How-does-Nutch-0.7.2-data-upgrade-to-0.8--tf2151013.html#a5949225
>Sent from the Nutch - User forum at Nabble.com.


-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
