Hi, I'm trying to copy the Nutch database.
It seems to be enough to list all pages by MD5 and fetch all links of those pages. I open a reader on the db directory, create a new db directory, and open a writer for it. When I copy the whole database in one pass, there isn't enough disk space to merge the temp file for big databases, although it works for small ones.

So I tried to do it in pieces: close the writer after a certain number of pages and reopen it again. The pages come through fine, but now there aren't enough links in the new db. The more pages and links I process in one round, the more links end up in the new db.

Can somebody help me with this? Is there a possibility to avoid it?

Regards,
Jakob
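
P.S. For reference, here is a rough sketch of the one-pass copy loop described above. The class and method names (IWebDBReader.pages(), getLinks(MD5Hash), IWebDBWriter.addPage()/addLink(), WebDBWriter.createWebDB()) are assumed from the 0.x WebDB API and may not match your Nutch version exactly; in older releases the package is net.nutch.db instead of org.apache.nutch.db.

import java.io.File;
import java.util.Enumeration;

import org.apache.nutch.db.IWebDBReader;
import org.apache.nutch.db.IWebDBWriter;
import org.apache.nutch.db.Link;
import org.apache.nutch.db.Page;
import org.apache.nutch.db.WebDBReader;
import org.apache.nutch.db.WebDBWriter;
import org.apache.nutch.fs.NutchFileSystem;

public class CopyWebDB {

  // Copies every page and its outlinks from oldDb to newDb in a single pass.
  // NOTE: the WebDB API calls used here are assumptions based on the 0.x
  // codebase and may need adjusting for your version.
  public static void copy(NutchFileSystem fs, File oldDb, File newDb) throws Exception {
    IWebDBReader reader = new WebDBReader(fs, oldDb);
    WebDBWriter.createWebDB(fs, newDb);            // create the empty target db
    IWebDBWriter writer = new WebDBWriter(fs, newDb);
    try {
      // walk all pages, keyed by content MD5
      for (Enumeration e = reader.pages(); e.hasMoreElements(); ) {
        Page page = (Page) e.nextElement();
        writer.addPage(page);
        // fetch the links of this page via its MD5 and copy them as well
        Link[] links = reader.getLinks(page.getMD5());
        for (int i = 0; i < links.length; i++) {
          writer.addLink(links[i]);
        }
      }
    } finally {
      writer.close();   // this close triggers the big temp-file merge
      reader.close();
    }
  }
}

The piecewise variant simply closes the writer and reopens it every N pages inside that loop; the pages survive, but links added in later chunks seem to get lost.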
