Actually, that answer was meant for Talat. Sorry for the confusion.
On 12/29/2013 02:31 AM, Law-Firms-In.com wrote:
> Hi Yazdi,
>
> thank you for your reply.
>
> I am currently running on a single server.
>
> In my hbase (hbase-0.90.6/conf) folder I see the following files:
>
> hadoop-metrics.properties
> hbase-env.sh
> hbase-site.xml
> log4j.properties
> regionservers
>
> I don't see the files you asked for. Please let me know what info might
> help you pinpoint the problem.
>
> Greetings,
>
> Domi
>
>
> On 12/28/2013 10:42 PM, Talat Uyarer wrote:
>> Hi,
>>
>> May I ask: are you running in local mode with one server? If not,
>> could you share your mapred-site.xml, or the job.xml options, along
>> with details of your cluster infrastructure?
>>
>> Thanks
>> Talat
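For reference, a minimal mapred-site.xml sketch of the kind of options Talat is asking about. The values below are illustrative placeholders only, not a recommendation; the actual file (if any) lives in the Hadoop conf directory:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative Hadoop 1.x-era properties; real values depend on the cluster. -->
  <property>
    <name>mapred.job.tracker</name>
    <!-- The default "local" means single-JVM local mode; host:port means a real cluster. -->
    <value>local</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <!-- Per-task JVM heap when running on a cluster. -->
    <value>-Xmx1024m</value>
  </property>
</configuration>
```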
>> On 28 Dec 2013 21:39, "Law-Firms-In.com" <[email protected]>
>> wrote:
>>
>>> Hello all,
>>>
>>> I wanted to inquire about the general performance of Nutch. I have seen
>>> this page here
>>> (http://digitalpebble.blogspot.cz/2013/09/nutch-fight-17-vs-221.html)
>>> where one iteration takes
>>>
>>> 78 minutes
>>>
>>> with 3M urls in the database, 5K urls per iteration, and 100 urls/host.
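As a quick sanity check on those benchmark figures (my arithmetic, not from the blog post): 5K urls fetched per 78-minute iteration works out to roughly 64 urls per minute.

```python
# Rough throughput implied by the benchmark numbers quoted above.
urls_per_iteration = 5_000
minutes_per_iteration = 78

urls_per_minute = urls_per_iteration / minutes_per_iteration
print(f"{urls_per_minute:.1f} urls/minute")  # roughly 64 urls/minute
```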
>>>
>>> My setup is the same as in that test, but currently with only
>>> around 70k urls in the database.
>>>
>>> The fetch/parse steps go very quickly, but the generate/update steps
>>> both take _forever_. One run takes about 12 hours, and by far the most
>>> time is spent in update, followed by generate.
>>>
>>> Is there ANYTHING I can do to speed up the process? I have a strong
>>> dedicated server with 52GB RAM. One thing I notice is that during
>>> generate/update ALL available RAM is used (Mem: 52438M total,
>>> 52267M used, 170M free, 191M buffers).
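One knob that may be relevant to the memory symptom above (an assumption on my part, not something confirmed in this thread): when running in local mode, the bin/nutch launcher reads the NUTCH_HEAPSIZE environment variable (a value in MB) to size the JVM it starts, so setting it explicitly caps how much heap the generate/update jobs can claim. The value below is purely illustrative:

```shell
# Illustrative only: cap the heap of the local-mode Nutch JVM.
# bin/nutch reads NUTCH_HEAPSIZE (in MB) when building its java command line.
export NUTCH_HEAPSIZE=8000   # e.g. 8 GB rather than letting the JVM grow unbounded
echo "NUTCH_HEAPSIZE=${NUTCH_HEAPSIZE}"
```

Note that on Linux, "used" memory in top also includes page cache, so a full-looking Mem line is not by itself proof that the JVM is the culprit.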
>>>
>>> I am thankful for any help/feedback!
>>>
>>> Domi
>>>
>>>
>>>