Hi, I found the problem. Because of a setting in nutch-site.xml it was parsing almost everything, and that is why the VM memory limit was exceeded.
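I can't say which exact property it was in your file, but if the content limits in nutch-site.xml are set to -1 (unlimited), Nutch will fetch and keep arbitrarily large documents and the parse step can run out of heap. A minimal sketch of capping them (http.content.limit and file.content.limit are standard Nutch 0.9 properties; the 65536 values shown are just the defaults, adjust to taste):

<configuration>
  <!-- Limit how many bytes of each fetched document are kept and parsed.
       A value of -1 means unlimited, which can exhaust the task heap on huge pages. -->
  <property>
    <name>http.content.limit</name>
    <value>65536</value>
  </property>
  <property>
    <name>file.content.limit</name>
    <value>65536</value>
  </property>
</configuration>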
Uygar BAYAR wrote:
> Hi,
> We have a 4-machine cluster (dual-core 3.20GHz CPU, 2GB RAM, 400GB disk per node). We use Nutch 0.9 and Hadoop 0.13.1 and are trying to crawl the web (60K sites) to a depth of 5. When we reached the parse of the 4th segment, every machine gave java.lang.OutOfMemoryError: Requested array size exceeds VM limit. Our segment size:
> crawled/segments/20071002163239  3472754178
> I tried several map/reduce configurations and nothing changed (400-50; 300-15; 50-15; 100-15; 200-35). I also set the heap size in hadoop-env and the nutch script to 2000M.
>
> --
> View this message in context: http://www.nabble.com/java.lang.OutOfMemoryError%3A-Requested-array-size-exceeds-VM-limit-tf4562352.html#a13040990
> Sent from the Hadoop Users mailing list archive at Nabble.com.
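One more thing worth checking, though this is only a guess since I don't know your exact configuration: HADOOP_HEAPSIZE in hadoop-env.sh only applies to the Hadoop daemons, while the map/reduce child JVMs that actually run the parse take their heap from mapred.child.java.opts, which if I remember correctly defaults to something small like -Xmx200m. Raising it in hadoop-site.xml, roughly like this (the 1024m figure is just an example, not a recommendation for your hardware):

<property>
  <!-- Heap for the map/reduce task JVMs; HADOOP_HEAPSIZE does not apply to them. -->
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>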
