Ryusuke,

I've simplified my test approach by simply copying a single 10 byte file
repeatedly.  When doing so, I get consistent NILFS log output at the
time of hang.  Examination of /proc/sysreq-trigger also shows the copy
process to be in the same state as before (at sys_write).

Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_construct: begin
(mode=0x1)                              
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
LSEG BEGIN                            
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
GC INODE STAGE                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
FILE STAGE                            
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
IFILE STAGE                           
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called
(ino=6, stage-count=3)                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
called (ino=6)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
done (ino=6)                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers:
called (ino=6)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done
(err=0)                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
CP STAGE                              
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called
(ino=4, stage-count=4)                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
called (ino=4)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
done (ino=4)                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers:
called (ino=4)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done
(err=0)                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
SUFILE STAGE                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called
(ino=5, stage-count=5)                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
called (ino=5)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
done (ino=5)                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers:
called (ino=5)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done
(err=0)                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
DAT STAGE                             
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called
(ino=3, stage-count=6)                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
called (ino=3)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers:
done (ino=3)                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers:
called (ino=3)                        
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done
(err=0)                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
SR STAGE                              
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: add
a super root block                   
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: **
LSEG END                              
Jun  9 12:55:14 kernel: ========= NILFS SEGMENT INFORMATION ========

Jun  9 12:55:14 kernel: full segment: segnum=101, start=206848,
end=208895                                    
Jun  9 12:55:14 kernel: partial segment: start=208788, rest=108

Jun  9 12:55:14 kernel: ------------------ SUMMARY -----------------

Jun  9 12:55:14 kernel: nfinfo     = 4 (number of files)

Jun  9 12:55:14 kernel: nblocks    = 22 (number of blocks)

Jun  9 12:55:14 kernel: sumbytes   = 408 (size of summary in bytes)

Jun  9 12:55:14 kernel: nsumblk    = 1 (number of summary blocks)

Jun  9 12:55:14 kernel: flags      = LOGBGN|LOGEND|SR

Jun  9 12:55:14 kernel: ============================================

Jun  9 12:55:14 kernel: NILFS(segment)
nilfs_segctor_update_payload_blocknr: called                           
Jun  9 12:55:14 kernel: NILFS(segment)
nilfs_segctor_update_payload_blocknr: done                             
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_file_bmap:
called                                
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_file_bmap:
done                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checkpoint:
called                               
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checkpoint:
done                                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_segusage:
called                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_segusage:
done                                    
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checksums:
called                                
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checksums:
done                                  
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitting
summary blocks                          
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_alloc_seg_bio: allocated
bio (max_vecs=64)                       
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitting
normal blocks (index=1)                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_submit_seg_bio: submitting
bio (start_sector=1670304, size=90112)
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitted a
segment (err=0, pseg_start=208788, #re)
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_wait: called nbio=1

Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_wait: wait completed

Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_complete_write:
completing segment (flags=0x7)           
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_complete_write:
completed a segment having a super root )
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_do_construct:
submitted all segments                     
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_construct: end
(stage=9)                                 
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_notify: complete
requests from seq=2 to seq=1            
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_transaction_unlock: task
c08c0820 unlocked segment semaphore     
Jun  9 12:55:14 kernel: NILFS(segment) nilfs_segctor_thread: sequence:
req=1, done=1, state=0 
  
Bill

-----Original Message-----
From: Dunphy, Bill 
Sent: Monday, June 08, 2009 2:31 PM
To: 'Ryusuke Konishi'; [email protected]
Subject: RE: [NILFS users] [SPAM] Write hang on ARM based target

Ryusuke,

Thanks for the prompt reply.

>Hi!
>On Mon, 8 Jun 2009 09:46:52 -0600, "Dunphy, Bill" wrote:
>> Greetings,
>>  
>> I'm attempting to get NILFS up and running on a Marvell 
>> DB-88F6281-BP-A
>> (ARM) and have experienced a consistent hang during data transfer.  
>> This target board is running a 2.6.22.18 kernel and I've tested it 
>> against NILFS versions 2.0.12, 2.0.14 and the latest git pull 
>> (6/4/09).  I've eliminated the garbage collector and and have enabled

>> full debugging for log capture.  The failure is not immediate or at a

>> fixed point - typically in the 1-2GB neighborhood of writing variably
sized files.
>> This same exercise (a simple recursive local copy) works flawlessly 
>> on my host Ubuntu i86 machine as well as on the target with an ext3 
>> file system.
>>  
>> Is there a specific kernel feature that I might be missing and need 
>> to enable?  A kernel version sensitivity?  Is this log information 
>> something you would be interested in seeing?  If so, how verbose 
>> would you like it or better still, how would you like me to collect
it?
>
>Thanks for reporting.
>
>What will happen if you trigger SysRq when you meet the hang?
>
> # echo t > /proc/sysrq-trigger
>
>I don't know whether the arm kernel codes is supporting sysrq, but we
may get the stack dump of >the nilfs task hung at some lock.  If it
works, it would help us to narrow down the problem.

Here's the stack trace of the copy process which has entered a permanent
disk sleep state:

cp            D C002CAA4     0   339    333 (NOTLB)
[<c002c580>] (schedule+0x0/0x600) from [<c002d54c>]
(schedule_timeout+0x98/0xc4) [<c002d4b4>] (schedule_timeout+0x0/0xc4)
from [<c002cc28>] (io_schedule_timeout+0x34/0x58)
 r7:00000001 r6:dfe6fc64 r5:dfd7bac0 r4:c03fb3a0 [<c002cbf4>]
(io_schedule_timeout+0x0/0x58) from [<c014b834>]
(congestion_wait+0x70/0x94)
 r4:c03df664
[<c014b7c4>] (congestion_wait+0x0/0x94) from [<c0146b2c>]
(balance_dirty_pages_ratelimited_nr+0x12c/0x1c0)
 r8:00000000 r7:cf142660 r6:00000600 r5:c04038c4 r4:ffffffce
[<c0146a00>] (balance_dirty_pages_ratelimited_nr+0x0/0x1c0) from
[<c0142860>] (generic_file_buffered_write+0x570/0x670)
[<c01422f0>] (generic_file_buffered_write+0x0/0x670) from [<c0142db0>]
(__generic_file_aio_write_nolock+0x450/0x4a0)
[<c0142960>] (__generic_file_aio_write_nolock+0x0/0x4a0) from
[<c0142e78>] (generic_file_aio_write+0x78/0xf4)
[<c0142e00>] (generic_file_aio_write+0x0/0xf4) from [<c0160430>]
(do_sync_write+0xb8/0x108) [<c0160378>] (do_sync_write+0x0/0x108) from
[<c0160d18>] (vfs_write+0xb8/0x148) [<c0160c60>] (vfs_write+0x0/0x148)
from [<c01612dc>] (sys_write+0x44/0x70)
 r7:00000549 r6:dead0bc0 r5:00000000 r4:00009000 [<c0161298>]
(sys_write+0x0/0x70) from [<c0027400>] (ret_fast_syscall+0x0/0x2c)
 r8:c00275a8 r7:00000004 r6:0000000d r5:bee23f50 r4:00000549

>Also, you can know the progress of segment writer by specifying the 
>following debug options,
>
> # echo "-vvv segment -vv seginfo" > /proc/fs/nilfs2/debug_option
>
>This feature is only available on the nilfs2 out-of-tree module built
with CONFIG_NILFS_DEBUG=y >option.

Here's the tail of the syslog with segment and seginfo logging enhanced
as requested:

NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=159
NILFS(segment) nilfs_segctor_thread: woke
NILFS(segment) nilfs_segctor_thread: sequence: req=159, done=158,
state=0
NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes)
NILFS(segment) nilfs_segctor_notify: complete requests from seq=159 to
seq=159
NILFS(segment) nilfs_segctor_wakeup: wakeup task=dfe6fb44 seq=159
NILFS(segment) nilfs_transaction_unlock: task c0909820 unlocked segment
semaphore
NILFS(segment) nilfs_segctor_thread: sequence: req=159, done=159,
state=0
NILFS(segment) nilfs_segctor_sync: done task=dfd7bac0 seq=159 err=0
NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment
semaphore
NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=160
NILFS(segment) nilfs_segctor_thread: woke
NILFS(segment) nilfs_segctor_thread: sequence: req=160, done=159,
state=0
NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes)
NILFS(segment) nilfs_segctor_notify: comion_unlock: task dfd7bac0
unlocked segment semaphore
NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment
semaphore

// Bunch of repeated lock, lock, unlock entries...

NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment
semaphore
NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=703
NILFS(segment) nilfs_segctor_thread: woke
NILFS(segment) nilfs_segctor_thread: sequence: req=703, done=702,
state=0
NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes)
NILFS(segment) nilfs_segctor_notify: complete requests from seq=703 to
seq=703
NILFS(segment) nilfs_segctor_wakeup: wakeup task=dfe6fb44 seq=703
NILFS(segment) nilfs_transaction_unlock: task c0909820 unlocked segment
semaphore
NILFS(segment) nilfs_segctor_thread: sequence: req=703, done=703,
state=0
NILFS(segment) nilfs_segctor_sync: done task=dfd7bac0 seq=703 err=0
NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment
semaphore

// Bunch of repeated lock, lock, unlock entries...

NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment
semaphore
NILFS(segment) nilfs_transaction_lock: locked
NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment
semaphore

// Game over.

As you can see, I've cut out some repeated sections to trim down the
output.  If you need more log info beyond this tail, let me know how
much and how you would like me to pass it to you.

Bill 
_______________________________________________
users mailing list
[email protected]
https://www.nilfs.org/mailman/listinfo/users

Reply via email to