>> you could do a quick hack in 0.8 to
>> "fetch" the pages from your 0.7 crawl, using a modified fetcher.
What do you mean? Do I have to modify the fetcher code myself?
Yes, you'd have to modify the 0.8 fetcher code (or rather create your
own plug-in) so that it uses a Nutch 0.7 search setup to get at all of
the previously fetched content.
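
A rough sketch of the idea, in plain Python rather than actual Nutch code. Every name here (FetchedPage, make_replay_fetcher, old_crawl) is made up for illustration, and the dict stands in for whatever lookup your 0.7 setup provides. The point is only that the "fetch" step reads the stored body by URL and fills in text/html as the content type, because the original HTTP headers are not available.

```python
# Illustrative only: none of these names are Nutch APIs.
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class FetchedPage:
    url: str
    content: bytes
    # Assumed content type: the original response headers are not available.
    content_type: str = "text/html"
    # Left empty for the same reason.
    headers: Dict[str, str] = field(default_factory=dict)


def make_replay_fetcher(lookup: Callable[[str], Optional[bytes]]):
    """Return a fetch(url) function that reads old crawl data, not the network."""

    def fetch(url: str) -> Optional[FetchedPage]:
        body = lookup(url)
        if body is None:
            # Not in the old crawl; the caller could fall back to a real fetch.
            return None
        return FetchedPage(url=url, content=body)

    return fetch


# Stand-in for the 0.7 data: a dict from URL to the stored page bytes.
old_crawl = {"http://example.com/": b"<html><body>hello</body></html>"}
fetch = make_replay_fetcher(old_crawl.get)
page = fetch("http://example.com/")
```

Anything that is not text/html (PDF, images, and so on) would be mislabelled by this approach, which is why it only makes sense for an HTML-only crawl.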
-- Ken
Ken Krugler wrote:
>> It's really sad news for me. I will have to spend a lot of time
>> fetching it again.
If it's only just HTML, then you could do a quick hack in 0.8 to
"fetch" the pages from your 0.7 crawl, using a modified fetcher. You
wouldn't have all of the header info, but if everything is text/html
then you might be OK.
-- Ken
Andrzej Bialecki wrote:
King Kong wrote:
I have fetched about 3 GB of pages with Nutch 0.7.2.
Now I want to move them to Nutch 0.8. How can I do it?
Unfortunately, the data is not portable between these versions. The
only thing you could do to preserve your webdb is to dump it into a
text file, and then inject into a 0.8 crawldb. As for the segments, you
will have to refetch them.
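
For the dump-and-inject step, a small script can turn the text dump into a seed list. The 0.8 injector reads plain-text files with one URL per line. This sketch assumes the 0.7 dump has one line per page starting with "URL: "; that prefix is an assumption, so check your own dump and adjust it. The function names and file paths are placeholders.

```python
def extract_urls(dump_lines, prefix="URL: "):
    """Collect unique URLs from dump lines, keeping first-seen order.

    The "URL: " prefix is an assumption about the dump format.
    """
    seen = set()
    urls = []
    for line in dump_lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue
        url = line[len(prefix):].strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_seed_file(dump_path, seed_path):
    """Write one URL per line to seed_path and return how many were written."""
    with open(dump_path, encoding="utf-8", errors="replace") as src:
        urls = extract_urls(src)
    with open(seed_path, "w", encoding="utf-8") as dst:
        for url in urls:
            dst.write(url + "\n")
    return len(urls)
```

Put the resulting file in its own directory and pass that directory to the 0.8 inject command. Only the URLs survive this way; fetch times, scores and link data from the old webdb are not carried over.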
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
--
View this message in context:
http://www.nabble.com/How-does-Nutch-0.7.2-data-upgrade-to-0.8--tf2151013.html#a5949225
Sent from the Nutch - User forum at Nabble.com.