Ryusuke, I've simplified my test approach by simply copying a single 10 byte file repeatedly. When doing so, I get consistent NILFS log output at the time of hang. Examination of /proc/sysreq-trigger also shows the copy process to be in the same state as before (at sys_write).
Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_construct: begin (mode=0x1) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** LSEG BEGIN Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** GC INODE STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** FILE STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** IFILE STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called (ino=6, stage-count=3) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: called (ino=6) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: done (ino=6) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers: called (ino=6) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done (err=0) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** CP STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called (ino=4, stage-count=4) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: called (ino=4) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: done (ino=4) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers: called (ino=4) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done (err=0) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** SUFILE STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called (ino=5, stage-count=5) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: called (ino=5) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: done (ino=5) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers: called (ino=5) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done (err=0) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** DAT STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: called (ino=3, stage-count=6) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: called (ino=3) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_data_buffers: done (ino=3) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_lookup_dirty_node_buffers: called (ino=3) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_scan_file: done (err=0) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** SR STAGE Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: add a super root block Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_collect_blocks: ** LSEG END Jun 9 12:55:14 kernel: ========= NILFS SEGMENT INFORMATION ======== Jun 9 12:55:14 kernel: full segment: segnum=101, start=206848, end=208895 Jun 9 12:55:14 kernel: partial segment: start=208788, rest=108 Jun 9 12:55:14 kernel: ------------------ SUMMARY ----------------- Jun 9 12:55:14 kernel: nfinfo = 4 (number of files) Jun 9 12:55:14 kernel: nblocks = 22 (number of blocks) Jun 9 12:55:14 kernel: sumbytes = 408 (size of summary in bytes) Jun 9 12:55:14 kernel: nsumblk = 1 (number of summary blocks) Jun 9 12:55:14 kernel: flags = LOGBGN|LOGEND|SR Jun 9 12:55:14 kernel: ============================================ Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_payload_blocknr: called Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_payload_blocknr: done Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_file_bmap: called Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_file_bmap: done Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checkpoint: called Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checkpoint: done Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_segusage: called Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_update_segusage: done Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checksums: called Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_fill_in_checksums: done Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitting summary blocks Jun 9 12:55:14 kernel: NILFS(segment) nilfs_alloc_seg_bio: allocated bio (max_vecs=64) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitting normal blocks (index=1) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_submit_seg_bio: submitting bio (start_sector=1670304, size=90112) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_write: submitted a segment (err=0, pseg_start=208788, #re) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_wait: called nbio=1 Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segbuf_wait: wait completed Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_complete_write: completing segment (flags=0x7) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_complete_write: completed a segment having a super root ) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_do_construct: submitted all segments Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_construct: end (stage=9) Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_notify: complete requests from seq=2 to seq=1 Jun 9 12:55:14 kernel: NILFS(segment) nilfs_transaction_unlock: task c08c0820 unlocked segment semaphore Jun 9 12:55:14 kernel: NILFS(segment) nilfs_segctor_thread: sequence: req=1, done=1, state=0 Bill -----Original Message----- From: Dunphy, Bill Sent: Monday, June 08, 2009 2:31 PM To: 'Ryusuke Konishi'; [email protected] Subject: RE: [NILFS users] [SPAM] Write hang on ARM based target Ryusuke, Thanks for the prompt reply. >Hi! >On Mon, 8 Jun 2009 09:46:52 -0600, "Dunphy, Bill" wrote: >> Greetings, >> >> I'm attempting to get NILFS up and running on a Marvell >> DB-88F6281-BP-A >> (ARM) and have experienced a consistent hang during data transfer. >> This target board is running a 2.6.22.18 kernel and I've tested it >> against NILFS versions 2.0.12, 2.0.14 and the latest git pull >> (6/4/09). I've eliminated the garbage collector and and have enabled >> full debugging for log capture. The failure is not immediate or at a >> fixed point - typically in the 1-2GB neighborhood of writing variably sized files. >> This same exercise (a simple recursive local copy) works flawlessly >> on my host Ubuntu i86 machine as well as on the target with an ext3 >> file system. >> >> Is there a specific kernel feature that I might be missing and need >> to enable? A kernel version sensitivity? Is this log information >> something you would be interested in seeing? If so, how verbose >> would you like it or better still, how would you like me to collect it? > >Thanks for reporting. > >What will happen if you trigger SysRq when you meet the hang? > > # echo t > /proc/sysrq-trigger > >I don't know whether the arm kernel codes is supporting sysrq, but we may get the stack dump of >the nilfs task hung at some lock. If it works, it would help us to narrow down the problem. Here's the stack trace of the copy process which has entered a permanent disk sleep state: cp D C002CAA4 0 339 333 (NOTLB) [<c002c580>] (schedule+0x0/0x600) from [<c002d54c>] (schedule_timeout+0x98/0xc4) [<c002d4b4>] (schedule_timeout+0x0/0xc4) from [<c002cc28>] (io_schedule_timeout+0x34/0x58) r7:00000001 r6:dfe6fc64 r5:dfd7bac0 r4:c03fb3a0 [<c002cbf4>] (io_schedule_timeout+0x0/0x58) from [<c014b834>] (congestion_wait+0x70/0x94) r4:c03df664 [<c014b7c4>] (congestion_wait+0x0/0x94) from [<c0146b2c>] (balance_dirty_pages_ratelimited_nr+0x12c/0x1c0) r8:00000000 r7:cf142660 r6:00000600 r5:c04038c4 r4:ffffffce [<c0146a00>] (balance_dirty_pages_ratelimited_nr+0x0/0x1c0) from [<c0142860>] (generic_file_buffered_write+0x570/0x670) [<c01422f0>] (generic_file_buffered_write+0x0/0x670) from [<c0142db0>] (__generic_file_aio_write_nolock+0x450/0x4a0) [<c0142960>] (__generic_file_aio_write_nolock+0x0/0x4a0) from [<c0142e78>] (generic_file_aio_write+0x78/0xf4) [<c0142e00>] (generic_file_aio_write+0x0/0xf4) from [<c0160430>] (do_sync_write+0xb8/0x108) [<c0160378>] (do_sync_write+0x0/0x108) from [<c0160d18>] (vfs_write+0xb8/0x148) [<c0160c60>] (vfs_write+0x0/0x148) from [<c01612dc>] (sys_write+0x44/0x70) r7:00000549 r6:dead0bc0 r5:00000000 r4:00009000 [<c0161298>] (sys_write+0x0/0x70) from [<c0027400>] (ret_fast_syscall+0x0/0x2c) r8:c00275a8 r7:00000004 r6:0000000d r5:bee23f50 r4:00000549 >Also, you can know the progress of segment writer by specifying the >following debug options, > > # echo "-vvv segment -vv seginfo" > /proc/fs/nilfs2/debug_option > >This feature is only available on the nilfs2 out-of-tree module built with CONFIG_NILFS_DEBUG=y >option. Here's the tail of the syslog with segment and seginfo logging enhanced as requested: NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=159 NILFS(segment) nilfs_segctor_thread: woke NILFS(segment) nilfs_segctor_thread: sequence: req=159, done=158, state=0 NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes) NILFS(segment) nilfs_segctor_notify: complete requests from seq=159 to seq=159 NILFS(segment) nilfs_segctor_wakeup: wakeup task=dfe6fb44 seq=159 NILFS(segment) nilfs_transaction_unlock: task c0909820 unlocked segment semaphore NILFS(segment) nilfs_segctor_thread: sequence: req=159, done=159, state=0 NILFS(segment) nilfs_segctor_sync: done task=dfd7bac0 seq=159 err=0 NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment semaphore NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=160 NILFS(segment) nilfs_segctor_thread: woke NILFS(segment) nilfs_segctor_thread: sequence: req=160, done=159, state=0 NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes) NILFS(segment) nilfs_segctor_notify: comion_unlock: task dfd7bac0 unlocked segment semaphore NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment semaphore // Bunch of repeated lock, lock, unlock entries... NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment semaphore NILFS(segment) nilfs_segctor_sync: start task=dfd7bac0 seq=703 NILFS(segment) nilfs_segctor_thread: woke NILFS(segment) nilfs_segctor_thread: sequence: req=703, done=702, state=0 NILFS(segment) nilfs_transaction_lock: task c0909820 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_segctor_confirm: Skipped construction (no changes) NILFS(segment) nilfs_segctor_notify: complete requests from seq=703 to seq=703 NILFS(segment) nilfs_segctor_wakeup: wakeup task=dfe6fb44 seq=703 NILFS(segment) nilfs_transaction_unlock: task c0909820 unlocked segment semaphore NILFS(segment) nilfs_segctor_thread: sequence: req=703, done=703, state=0 NILFS(segment) nilfs_segctor_sync: done task=dfd7bac0 seq=703 err=0 NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment semaphore // Bunch of repeated lock, lock, unlock entries... NILFS(segment) nilfs_transaction_lock: task dfd7bac0 locking segment semaphore NILFS(segment) nilfs_transaction_lock: locked NILFS(segment) nilfs_transaction_unlock: task dfd7bac0 unlocked segment semaphore // Game over. As you can see, I've cut out some repeated sections to trim down the output. If you need more log info beyond this tail, let me know how much and how you would like me to pass it to you. Bill _______________________________________________ users mailing list [email protected] https://www.nilfs.org/mailman/listinfo/users
