I am running into some problems.

I have 8 segments, each with approximately 250K URLs (~2 million in total). I am trying to merge them into one.

But the merge takes forever: it had been running for about 3 days before I stopped it. It had also used 904 GB in the /tmp directory.

The machine it is running on has dual quad-core Intel 2.8 GHz CPUs and 24 GB of RAM. CPU utilization stays at about 20%.

Any ideas? I went through the Nutch configs and didn't see anything that looked like it would give this task more memory, workers, etc.

Any help would be greatly appreciated.

Thank you,

-John




John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: j...@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com
