>> you could do a quick hack in 0.8 to "fetch" the pages from your 0.7
>> crawl, using a modified fetcher.

What do you mean? Do I have to modify the fetcher code myself?

Ken Krugler wrote:
>
>>It's really a sad news for me. I must spend a lot of time on fetching it
>>again.
>
> If it's only just HTML, then you could do a quick hack in 0.8 to
> "fetch" the pages from your 0.7 crawl, using a modified fetcher. You
> wouldn't have all of the header info, but if everything is text/html
> then you might be OK.
>
> -- Ken
>
>
>>Andrzej Bialecki wrote:
>>>
>>> King Kong wrote:
>>>> I had fetched about 3Gbytes pages in Nutch-0.7.2 .
>>>> Now, I want to move it to Nutch-0.8, How can I do it ?
>>>>
>>>
>>> Unfortunately, the data is not portable between these versions. The only
>>> thing you could do to preserve your webdb is to dump it into a text
>>> file, and then inject into a 0.8 crawldb. As for the segments, you will
>>> have to refetch them.
>>>
>>> --
>>> Best regards,
>>> Andrzej Bialecki <><
>>> ___. ___ ___ ___ _ _ __________________________________
>>> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
>>> ___|||__|| \| || | Embedded Unix, System Integration
>>> http://www.sigram.com Contact: info at sigram dot com
>
> --
> Ken Krugler
> Krugle, Inc.
> +1 530-210-6378
> "Find Code, Find Answers"
>

--
View this message in context: http://www.nabble.com/How-does-Nutch-0.7.2-data-upgrade-to-0.8--tf2151013.html#a5949225
Sent from the Nutch - User forum at Nabble.com.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
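
For the webdb half of Andrzej's suggestion quoted above (dump the 0.7 webdb to a text file, then inject into a 0.8 crawldb), the dump usually needs to be reduced to a plain list of URLs, one per line, before it can be used as a seed file. Here is a minimal sketch of that step. It assumes the text dump contains one line per page of the form "URL: http://..."; that format, the file names, and the helper names `extract_urls` and `write_seed_file` are assumptions for illustration, so check them against your actual dump. It does nothing for the segments, which still have to be refetched (or "fetched" with a modified fetcher, as Ken describes).

```python
import re

# Assumed dump format: each page record has a line like
#   URL: http://www.example.com/
# Adjust the pattern if your dump looks different.
URL_LINE = re.compile(r"^\s*URL:\s*(\S+)")


def extract_urls(dump_lines):
    """Return the unique URLs found in the dump, in first-seen order."""
    seen = set()
    urls = []
    for line in dump_lines:
        match = URL_LINE.match(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            urls.append(match.group(1))
    return urls


def write_seed_file(dump_path, seed_path):
    """Read a text dump and write one URL per line to seed_path."""
    with open(dump_path, encoding="utf-8", errors="replace") as dump:
        urls = extract_urls(dump)
    with open(seed_path, "w", encoding="utf-8") as seed:
        for url in urls:
            seed.write(url + "\n")
    return len(urls)
```

Hypothetical usage: call `write_seed_file("webdb-dump.txt", "urls/seed.txt")` on the dump file, then point the 0.8 inject step at the `urls` directory. The exact dump and inject commands differ between releases, so take them from the usage message that `bin/nutch` prints rather than from this note.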
