Increase JAVA_HEAP_MAX in bin/nutch, and also reduce timeLimitFetch in
bin/crawl. I hit the Java heap size issue too, and it looks like the data
fetched within timeLimitFetch is kept in memory. If you run many threads
with a long timeLimitFetch, JAVA_HEAP_MAX may not be enough, even if you
assign more than 10G to it. (I assigned 12G and it still failed!)
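As a rough sketch, the two changes above amount to editing one variable in each script (the exact default values vary between Nutch versions, so treat the numbers below as illustrative):

```shell
# In bin/nutch: raise the JVM heap ceiling passed to the java command
# (the shipped default is often around -Xmx1000m).
JAVA_HEAP_MAX="-Xmx4000m"

# In bin/crawl: shorten the per-round fetch time limit, in minutes,
# so less fetched data accumulates in memory before the round ends.
timeLimitFetch=30
```

These are plain shell variable assignments inside the existing scripts, not new flags; bin/crawl passes timeLimitFetch down to the fetcher as the fetcher.timelimit.mins property.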

On Mon, Feb 16, 2015 at 3:35 PM, Siddharth Mahendra Dasani <[email protected]>
wrote:

> Did you change the Java heap size? How do we change that?
> On Feb 16, 2015 3:34 PM, "Renxia Wang" <[email protected]> wrote:
>
>> The log file under the logs folder generally provides more info. This issue
>> may be caused by a small Java heap size.
>>
>> On Mon, Feb 16, 2015 at 3:28 PM, Siddharth Mahendra Dasani <
>> [email protected]> wrote:
>>
>>> The logs are the ones printed to the console.
>>>
>>>
>>> On Mon, Feb 16, 2015 at 3:26 PM, Siddharth Mahendra Dasani <
>>> [email protected]> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Feb 16, 2015 at 2:24 PM, Renxia Wang <[email protected]> wrote:
>>>>
>>>>> Hi Siddharth,
>>>>>
>>>>> Can you provide the log file for this failure?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Renxia
>>>>>
>>>>> On Mon, Feb 16, 2015 at 1:39 PM, Siddharth Mahendra Dasani <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>  I am a student in Professor Mattman's CSCI572 class. I crawled using
>>>>>> Nutch with all the required parameters set. It ran for about 40 minutes
>>>>>> and then threw a java.io.IOException, after which the job failed. Is
>>>>>> anyone else facing this issue, and if so, how did you resolve it?
>>>>>>
>>>>>> Regards,
>>>>>> Siddharth Dasani.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
