I have a lot of space left on the partition that holds /tmp. I don't
have a separate partition for /tmp; it is just a directory on /, which
has close to 1.3 terabytes free:

Filesystem            Size  Used Avail Use% Mounted on
                      1.4T   55G  1.3T   5% /
tmpfs                 3.8G     0  3.8G   0% /lib/init/rw
varrun                3.8G  120K  3.8G   1% /var/run
varlock               3.8G     0  3.8G   0% /var/lock
udev                  3.8G  152K  3.8G   1% /dev
tmpfs                 3.8G     0  3.8G   0% /dev/shm
lrm                   3.8G  2.5M  3.8G   1%
/lib/modules/2.6.28-15-server/volatile
/dev/sda5             228M   29M  187M  14% /boot
/dev/sr0              388K  388K     0 100% /media/cdrom0

I also noticed that the /tmp/hadoop-root directory is 6.8 GB.
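To see what is actually consuming that space, something along these lines can break the usage down (the path is taken from the report above; /tmp/hadoop-root is the default hadoop.tmp.dir when running as root, so adjust it if your configuration differs):

```shell
# Break down disk usage under Hadoop's local temp dir, largest last.
# Old spill/merge files from the running job usually dominate here.
du -sh /tmp/hadoop-root/* 2>/dev/null | sort -h
```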

I have attached below the jstack output of the process that is doing the update.

2009-11-02 19:11:54
Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.2-b01 mixed mode):

"Attach Listener" daemon prio=10 tid=0x0000000041bb1000 nid=0xd3b
waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Comm thread for attempt_local_0001_r_000000_0" daemon prio=10
tid=0x00007f3ff4002800 nid=0x6b8f waiting on condition
[0x00007f4000e97000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:403)
        at java.lang.Thread.run(Thread.java:619)

"Thread-12" prio=10 tid=0x0000000041b37800 nid=0x25f3 runnable
[0x00007f4000f98000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.Byte.hashCode(Byte.java:394)
        at 
java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:882)
        at 
org.apache.hadoop.io.AbstractMapWritable.addToMap(AbstractMapWritable.java:78)
        - locked <0x00007f47ef4d9310> (a org.apache.hadoop.io.MapWritable)
        at 
org.apache.hadoop.io.AbstractMapWritable.<init>(AbstractMapWritable.java:128)
        at org.apache.hadoop.io.MapWritable.<init>(MapWritable.java:42)
        at org.apache.hadoop.io.MapWritable.<init>(MapWritable.java:52)
        at org.apache.nutch.crawl.CrawlDatum.set(CrawlDatum.java:321)
        at org.apache.nutch.crawl.CrawlDbReducer.reduce(CrawlDbReducer.java:73)
        at org.apache.nutch.crawl.CrawlDbReducer.reduce(CrawlDbReducer.java:35)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
        at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)

"Low Memory Detector" daemon prio=10 tid=0x00007f3ffc004000 nid=0x25d0
runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00007f3ffc001000 nid=0x25cf
waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00000000417be800 nid=0x25ce
waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00000000417bc800 nid=0x25cd
runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x000000004179e000 nid=0x25cc in
Object.wait() [0x00007f40016f7000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f400f63e6c0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00007f400f63e6c0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x0000000041797000 nid=0x25cb
in Object.wait() [0x00007f40017f8000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f400f63e6f8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00007f400f63e6f8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x0000000041734000 nid=0x25c5 waiting on condition
[0x00007f49d75c2000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1152)
        at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:94)
        at org.apache.nutch.crawl.CrawlDb.run(CrawlDb.java:189)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.CrawlDb.main(CrawlDb.java:150)

"VM Thread" prio=10 tid=0x0000000041790000 nid=0x25ca runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000004173e000
nid=0x25c6 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000041740000
nid=0x25c7 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000041742000
nid=0x25c8 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000041744000
nid=0x25c9 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f3ffc006800 nid=0x25d1
waiting on condition

JNI global references: 907



Any help with this would be much appreciated.
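For reference, the dump above was captured along the lines Julien suggested; a rough sketch (the grep pattern for the Nutch main class is an assumption, so check the actual jps output first):

```shell
# Find the JVM running the CrawlDb update and take a few stack dumps
# 30 seconds apart. If every dump shows the same frames in Thread-12,
# the job is stuck rather than merely slow.
pid=$(jps -l | grep CrawlDb | awk '{print $1}')
for i in 1 2 3; do
    jstack "$pid" > "jstack-$i.txt"
    sleep 30
done
```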

On Mon, Nov 2, 2009 at 3:56 PM, Julien Nioche
<lists.digitalpeb...@gmail.com> wrote:
> Hi again
>
>
>> I know the process is not stuck, and that it is running, because I
>> turned on the Hadoop logs and I can see entries being written to them.
>> I'm not sure how to check whether the task is completely stuck, though.
>>
>
> run jps to identify the process id, then run *jstack <pid>* several
> times to see if it is blocked at the same place
>
> how much space do you have left on the partition where /tmp is mounted?
>
> J.
>
>
>
>> Below is a sample of the log as I'm sending this email. The job has
>> been in the updatedb phase for the last 19 days, and it keeps
>> generating debug output similar to the following.
>>
>> Has anyone else run into this issue before?
>>
>>
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Creating group
>> org.apache.hadoop.mapred.Task$FileSystemCounter with bundle
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Adding LOCAL_READ
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Adding LOCAL_WRITE
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Creating group
>> org.apache.hadoop.mapred.Task$Counter with bundle
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Adding
>> COMBINE_OUTPUT_RECORDS
>> 2009-11-02 13:34:21,112 DEBUG mapred.Counters - Adding MAP_INPUT_RECORDS
>> 2009-11-02 13:34:21,113 DEBUG mapred.Counters - Adding MAP_OUTPUT_BYTES
>> 2009-11-02 13:34:21,113 DEBUG mapred.Counters - Adding MAP_INPUT_BYTES
>> 2009-11-02 13:34:21,113 DEBUG mapred.Counters - Adding MAP_OUTPUT_RECORDS
>> 2009-11-02 13:34:21,113 DEBUG mapred.Counters - Adding
>> COMBINE_INPUT_RECORDS
>> 2009-11-02 13:34:21,643 INFO  mapred.JobClient -  map 93% reduce 0%
>> 2009-11-02 13:34:22,121 INFO  mapred.MapTask - Spilling map output:
>> record full = true
>> 2009-11-02 13:34:22,121 INFO  mapred.MapTask - bufstart = 10420198;
>> bufend = 13893589; bufvoid = 99614720
>> 2009-11-02 13:34:22,121 INFO  mapred.MapTask - kvstart = 131070; kvend
>> = 65533; length = 327680
>> 2009-11-02 13:34:22,427 INFO  mapred.MapTask - Finished spill 3
>> 2009-11-02 13:34:23,301 INFO  mapred.MapTask - Starting flush of map output
>> 2009-11-02 13:34:23,384 INFO  mapred.MapTask - Finished spill 4
>> 2009-11-02 13:34:23,390 DEBUG mapred.MapTask -
>> MapId=attempt_local_0001_m_000003_0 Reducer=0Spill =0(0,224, 228)
>> 2009-11-02 13:34:23,390 DEBUG mapred.MapTask -
>> MapId=attempt_local_0001_m_000003_0 Reducer=0Spill =1(0,242, 246)
>> 2009-11-02 13:34:23,390 DEBUG mapred.MapTask -
>> MapId=attempt_local_0001_m_000003_0 Reducer=0Spill =2(0,242, 246)
>> 2009-11-02 13:34:23,390 DEBUG mapred.MapTask -
>> MapId=attempt_local_0001_m_000003_0 Reducer=0Spill =3(0,242, 246)
>> 2009-11-02 13:34:23,390 DEBUG mapred.MapTask -
>> MapId=attempt_local_0001_m_000003_0 Reducer=0Spill =4(0,242, 246)
>> 2009-11-02 13:34:23,390 INFO  mapred.Merger - Merging 5 sorted segments
>> 2009-11-02 13:34:23,392 INFO  mapred.Merger - Down to the last
>> merge-pass, with 5 segments left of total size: 1192 bytes
>> 2009-11-02 13:34:23,393 INFO  mapred.MapTask - Index: (0, 354, 358)
>> 2009-11-02 13:34:23,394 INFO  mapred.TaskRunner -
>> Task:attempt_local_0001_m_000003_0 is done. And is in the process of
>> commiting
>> 2009-11-02 13:34:23,395 DEBUG mapred.TaskRunner -
>> attempt_local_0001_m_000003_0 Progress/ping thread exiting since it
>> got interrupted
>> 2009-11-02 13:34:23,395 INFO  mapred.LocalJobRunner -
>>
>> file:/opt/tsweb/nutch-1.0/newHyperseekCrawl/db/current/part-00000/data:100663296+33554432
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Creating group
>> org.apache.hadoop.mapred.Task$FileSystemCounter with bundle
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding LOCAL_READ
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding LOCAL_WRITE
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Creating group
>> org.apache.hadoop.mapred.Task$Counter with bundle
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding
>> COMBINE_OUTPUT_RECORDS
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding MAP_INPUT_RECORDS
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding MAP_OUTPUT_BYTES
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding MAP_INPUT_BYTES
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding MAP_OUTPUT_RECORDS
>> 2009-11-02 13:34:23,396 DEBUG mapred.Counters - Adding
>> COMBINE_INPUT_RECORDS
>> 2009-11-02 13:34:23,397 INFO  mapred.TaskRunner - Task
>> 'attempt_local_0001_m_000003_0' done.
>> 2009-11-02 13:34:23,397 DEBUG mapred.SortedRanges - currentIndex 0   0:0
>> 2009-11-02 13:34:23,397 DEBUG conf.Configuration -
>> java.io.IOException: config(config)
>>        at
>> org.apache.hadoop.conf.Configuration.<init>(Configuration.java:192)
>>        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:139)
>>        at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
>>
>> 2009-11-02 13:34:23,398 DEBUG mapred.MapTask - Writing local split to
>> /tmp/hadoop-root/mapred/local/localRunner/split.dta
>> 2009-11-02 13:34:23,451 DEBUG mapred.TaskRunner -
>> attempt_local_0001_m_000004_0 Progress/ping thread started
>> 2009-11-02 13:34:23,452 INFO  mapred.MapTask - numReduceTasks: 1
>> 2009-11-02 13:34:23,453 INFO  mapred.MapTask - io.sort.mb = 100
>> 2009-11-02 13:34:23,644 INFO  mapred.JobClient -  map 100% reduce 0%
>>
>> Mathan
>> On Mon, Nov 2, 2009 at 4:11 AM, Andrzej Bialecki <a...@getopt.org> wrote:
>> > Kalaimathan Mahenthiran wrote:
>> >>
>> >> I forgot to add the detail...
>> >>
>> >> The segment I'm trying to run updatedb on has 1.3 million URLs
>> >> fetched and 1.08 million URLs parsed.
>> >>
>> >> Any help related to this would be appreciated...
>> >>
>> >>
>> >> On Sun, Nov 1, 2009 at 11:53 PM, Kalaimathan Mahenthiran
>> >> <matha...@gmail.com> wrote:
>> >>>
>> >>> hi everyone
>> >>>
>> >>> I'm using Nutch 1.0. I have fetched successfully and am currently
>> >>> on the updatedb step. It is taking very long and I don't know why.
>> >>> I have a new machine with a quad-core processor and 8 GB of RAM.
>> >>>
>> >>> I believe this system is really good in terms of processing power,
>> >>> so I don't think that is the problem here. I noticed that almost
>> >>> all the RAM, close to 7.7 GB, is being used by the updatedb
>> >>> process, and the machine is becoming really slow.
>> >>>
>> >>> The updatedb process has been running continually for the last 19
>> >>> days with the message "merging segment data into db". Does anyone
>> >>> know why it's taking so long? Is there any configuration setting I
>> >>> can change to speed up the updatedb process?
>> >
>> > First, this process normally takes just a few minutes, depending on the
>> > hardware, and not several days - so something is wrong.
>> >
>> > * do you run this in "local" or pseudo-distributed mode (i.e.
>> > running a real jobtracker and tasktracker)? Try the
>> > pseudo-distributed mode, because then you can monitor the progress
>> > in the web UI.
>> >
>> > * how many reduce tasks do you have? with large updates it helps if
>> > you run more than 1 reducer, to split the final sorting.
>> >
>> > * if the task appears to be completely stuck, please generate a
>> > thread dump (kill -SIGQUIT) and see where it's stuck. This could be
>> > related to urlfilter-regex or urlnormalizer-regex - you can identify
>> > if these are problematic by removing them from the config and
>> > re-running the operation.
>> >
>> > * minor issue - when specifying the path names of segments and
>> > crawldb, do NOT append the trailing slash - it's not harmful in this
>> > particular case, but you could have a nasty surprise when doing e.g.
>> > copy / mv operations ...
>> >
>> > --
>> > Best regards,
>> > Andrzej Bialecki     <><
>> >  ___. ___ ___ ___ _ _   __________________________________
>> > [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
>> > ___|||__||  \|  ||  |  Embedded Unix, System Integration
>> > http://www.sigram.com  Contact: info at sigram dot com
>> >
>> >
>>
>
>
>
> --
> DigitalPebble Ltd
> http://www.digitalpebble.com
>
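To follow up on Andrzej's more-than-one-reducer advice: since CrawlDb goes through ToolRunner (visible in the "main" thread's stack above), the generic -D option should be accepted on the command line. A hedged sketch, with the crawldb and segment paths as placeholders; note that the local job runner always uses a single reducer, so this only takes effect in (pseudo-)distributed mode:

```shell
# Run updatedb with 4 reduce tasks (the value is an example; roughly
# one per core is a common starting point). Paths are placeholders.
bin/nutch updatedb -Dmapred.reduce.tasks=4 crawl/crawldb crawl/segments/SEGMENT
```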
