On Mon, Jul 21, 2008 at 8:05 PM, Sage Weil <[EMAIL PROTECTED]> wrote:
> On Mon, 21 Jul 2008, Padraig O'Sullivan wrote:
>> Everything goes fine until I attempt to create a few files and
>> directories using csyn as suggested at the end of the document on
>> creating a small test cluster. The csyn command just hangs and never
>> returns. The output I get is:
>>
>> # ./csyn --syn makedirs 1 1 1 --syn walk
>> starting csyn at 0.0.0.0:36175/7874/0
>> mounting and starting 1 syn client(s)
>> waiting for client(s) to finish
>>
>> When I do an strace of the csyn process, I can see that it is waiting
>> on the futex system call. Does anyone have ideas as to why this is
>> happening? I'd like to be able to debug this further and understand
>> exactly what's happening. Any suggestions on how to debug this or what
>> log files (I have debugging enabled for all of the Ceph daemons) etc.
>> to look at would be great.
>
> All of csyn's interactions with the system are over the network, so
> stracing it won't tell you too much. Try passing --debug_client 10 and
> --debug_ms 1 to see what messages it is passing and whether it
> successfully "mounts" the fs.
>
> My guess is the osds are having trouble getting initialized. You can do
>
> ./cmonctl mds stat
> ./cmonctl pg stat
>
> to tell you whether the MDS successfully started, and whether the
> placements groups came online (they should all be 'active+clean', but if
> the osds are having trouble initializing they'll be stuck at 'creating').
>
> Let me know how it goes, and feel free to ping me in #ceph on oftc.net.
>
> sage
>
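Thanks for the pointers. For the record, this is roughly how I'm now invoking csyn with the extra debugging enabled (my assumption being that the two debug options are simply appended to the command line alongside the --syn options):

# ./csyn --debug_client 10 --debug_ms 1 --syn makedirs 1 1 1 --syn walk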
You were right about the osds having trouble initializing. The output I'm getting from the cmonctl commands looks like what you expected:

# ./cmonctl mds stat
mon0 <- 'mds stat'
mon0 -> 'e2: 1 nodes: 1 up:creating' (0)
# ./cmonctl pg stat
mon0 <- 'pg stat'
mon0 -> 'v5: 1152 pgs: 1152 creating; 0 KB used, 0 KB / 0 KB free' (0)
#

So there is definitely something up with btrfs here (I guess btrfs is far from stable so I can't complain) since I have had no issues when setting up a cluster with block devices.

I have debugging enabled on the osd daemons and have been having a look through their log files. I have 2 storage nodes, and in the log file for the first one (after some initialization messages) I keep seeing the following lines repeated over and over:

080721.054648.450368 osd0 3 calc_priors_during 2x0.0 [2,3) =
080721.054648.450396 osd0 3 mkpg 2x0.0 e2 h ec=0 les=0 3/3/3 : querying priors
080721.054648.450412 osd0 3 try_create_pg 2x0.0 - creating now
080721.054648.450420 osd0 3 _create_lock_pg 2x0.0
080721.054648.450976 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] write_log
080721.054648.451080 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] write_log to [0,0)
080721.054648.451090 osd0 3 _create_lock_new_pg pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0]
080721.054648.451113 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] peer. acting is [0], prior_set is
080721.054648.451129 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] have prior_set info. peers_complete_thru 0'0
080721.054648.451146 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] newest_update 0'0 (me)
080721.054648.451153 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] oldest_update 0'0
080721.054648.451734 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0] peers_complete_thru 0'0
080721.054648.451782 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active] write_log
080721.054648.451846 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active] write_log to [0,0)
080721.054648.451871 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active] clean_up_local
080721.054648.451888 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active] activate - complete
080721.054648.451927 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active] finish_recovery
080721.054648.451943 osd0 3 pg[2x0.0( empty ec=3 les=3 3/3/3) r=0 pct 0'0 active clean] purge_strays
080721.054648.452038 osd0 3 calc_priors_during 2x0.1 [2,3) =
080721.054648.452046 osd0 3 mkpg 2x0.1 e2 h ec=0 les=0 3/3/3 : querying priors
080721.054648.452053 osd0 3 try_create_pg 2x0.1 - creating now
080721.054648.452058 osd0 3 _create_lock_pg 2x0.1

while in the log file for the second storage node, I see the following lines repeated for some time:

080721.053755.682731 osd1 4 calc_priors_during 2x0.0 [2,4) = 0
080721.053755.683295 osd1 4 mkpg 2x0.0 e2 h ec=0 les=0 4/4/4 : querying priors 0
080721.053755.683321 osd1 4 try_create_pg 2x0.0 - waiting for priors 0
080721.053755.683427 osd1 4 calc_priors_during 2x0.1 [2,4) = 0
080721.053755.683435 osd1 4 mkpg 2x0.1 e2 h ec=0 les=0 4/4/4 : querying priors 0
080721.053755.683442 osd1 4 try_create_pg 2x0.1 - waiting for priors 0

Then those messages stop and I begin seeing this repeated:

080721.053755.754045 osd1 4 try_create_pg 4x2.7f - waiting for priors 0
080721.053755.754275 osd1 4 do_queries querying osd0 on 585 PGs
080721.053755.754438 osd1 4 kick_pg_split_queue
080721.053755.754517 osd1 4 wait_for_no_ops - none
080721.053755.754525 osd1 4 handle_osd_map epochs [1,4], i have 4
080721.053755.754539 osd1 4 handle_osd_map already had incremental map epoch 1
080721.053755.754565 osd1 4 .. it is 196 bytes
080721.053755.754583 osd1 4 handle_osd_map already had incremental map epoch 2
080721.053755.754613 osd1 4 .. it is 587 bytes
080721.053755.754625 osd1 4 handle_osd_map already had incremental map epoch 3
080721.053755.754646 osd1 4 .. it is 120 bytes
080721.053755.754658 osd1 4 handle_osd_map already had incremental map epoch 4
080721.053755.754678 osd1 4 .. it is 120 bytes
080721.053755.754685 osd1 4 write_superblock sb(fsid 5657c7053faf5f22.5caaf2297d8eaf77 osd1 e4 [1,4])
080721.053756.601799 osd1 4 heartbeat: stat(080721.053756.601331 rdlat=0 / 0 fshedin=0)
080721.053757.702019 osd1 4 heartbeat: stat(080721.053757.701902 rdlat=0 / 0 fshedin=0)
080721.053758.402729 osd1 4 heartbeat: stat(080721.053758.402609 rdlat=0 / 0 fshedin=0)
080721.053759.598931 osd1 4 do_mon_report
080721.053759.598953 osd1 4 send_alive up_thru currently 0 want 0
080721.053759.598977 osd1 4 send_pg_stats
080721.053759.803331 osd1 4 heartbeat: stat(080721.053759.803197 rdlat=0 / 0 fshedin=0)
080721.053800.404611 osd1 4 heartbeat: stat(080721.053800.404157 rdlat=0 / 0 fshedin=0)
080721.053801.105346 osd1 4 heartbeat: stat(080721.053801.105230 rdlat=0 / 0 fshedin=0)
080721.053802.306048 osd1 4 heartbeat: stat(080721.053802.305933 rdlat=0 / 0 fshedin=0)
080721.053802.806434 osd1 4 heartbeat: stat(080721.053802.806326 rdlat=0 / 0 fshedin=0)
080721.053804.206704 osd1 4 heartbeat: stat(080721.053804.206541 rdlat=0 / 0 fshedin=0)
080721.053804.599602 osd1 4 do_mon_report
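For what it's worth, a quick-and-dirty way I've been summarizing the two logs (I'm just calling the files osd0.log and osd1.log here; substitute wherever your osd logs live, and note the counts are only rough since the messages repeat):

# grep -c 'creating now' osd0.log
# grep -c 'waiting for priors' osd1.log

At least that makes it easy to see that osd0 keeps reporting 'creating now' for its PGs while osd1 stays stuck at 'waiting for priors'.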
I'm not really sure what these messages in the log files mean, as I'm not that familiar with Ceph internals yet. Do you have any pointers or ideas (based on this limited information) about what's going on here, or any suggestions on other log files to look at?

I've read through the USENIX paper describing the architecture, and I'm starting to go through the source code to gain a better understanding of how everything works, which I hope will also help me understand what the messages in the log files mean (or at least where they are coming from). I'm taking a grad course in HPC in the fall which involves a significant project, and I was hoping to do a project with Ceph, as I find it pretty interesting and I'd like to be able to contribute to an open source project. I know this is quite off-topic now, but are the projects suggested for Google's Summer of Code still available for someone to work on?

-Padraig