This is the info file contents. Is there another file you would want to see for the config?

type=2
count=2
status=1
sub_count=2
stripe_count=1
replica_count=2
disperse_count=0
redundancy_count=0
version=3
transport-type=0
volume-id=98c258e6-ae9e-4407-8f25-7e3f7700e100
username=removed just cause
password=removed just cause
op-version=3
client-op-version=3
quota-version=0
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
diagnostics.count-fop-hits=on
diagnostics.latency-measurement=on
performance.readdir-ahead=on
brick-0=media1-be:-gluster-brick1-gluster_volume_0
brick-1=media2-be:-gluster-brick1-gluster_volume_0
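For reference, I haven't applied any of the VM-workload tunings to this volume; apart from what's shown above it's basically at defaults. If it's worth trying, I could apply the options usually bundled in the "virt" group profile, something along these lines (the volume name is a placeholder for mine, and this is just a sketch of what I'd try, not a known fix):

  gluster volume set <volname> performance.quick-read off
  gluster volume set <volname> performance.read-ahead off
  gluster volume set <volname> performance.io-cache off
  gluster volume set <volname> performance.stat-prefetch off
  gluster volume set <volname> cluster.eager-lock enable
  gluster volume set <volname> network.remote-dio enable

or simply "gluster volume set <volname> group virt" if the groups file ships with the packages here.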
Here are some log entries from etc-glusterfs-glusterd.vol.log:

The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2016-10-06 20:10:14.963402] and [2016-10-06 20:12:11.979684]
[2016-10-06 20:12:14.980203] I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2016-10-06 20:13:50.993490] W [socket.c:596:__socket_rwv] 0-nfs: readv on /var/run/gluster/360710d59bc4799f8c8a6374936d2b1b.socket failed (Invalid argument)

I can provide any specific details you would like to see. Last night I tried one more time, and it appeared to work OK with 1 VM running under VMware, but as soon as I had 3 running the targets became unresponsive. I believe the gluster volume is fine, but for whatever reason the iSCSI target daemon seems to be having issues. Here is from the messages file:

Oct 5 23:13:00 media2 kernel: MODE SENSE: unimplemented page/subpage: 0x1c/0x02
Oct 5 23:13:00 media2 kernel: MODE SENSE: unimplemented page/subpage: 0x1c/0x02
Oct 5 23:13:35 media2 kernel: iSCSI/iqn.1998-01.com.vmware:vmware4-0941d552: Unsupported SCSI Opcode 0x4d, sending CHECK_CONDITION.
Oct 5 23:13:35 media2 kernel: iSCSI/iqn.1998-01.com.vmware:vmware4-0941d552: Unsupported SCSI Opcode 0x4d, sending CHECK_CONDITION.

And here are some more VMware iSCSI errors:

2016-10-06T20:22:11.496Z cpu2:32825)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x89 (0x412e808532c0, 32801) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:11.635Z cpu2:32787)ScsiDeviceIO: 2338: Cmd(0x412e808532c0) 0x89, CmdSN 0x4f05 from world 32801 to dev "naa.6001405c0d86944f3d2468d80c7d1
2016-10-06T20:22:11.635Z cpu3:35532)Fil3: 15389: Max timeout retries exceeded for caller Fil3_FileIO (status 'Timeout')
2016-10-06T20:22:11.635Z cpu2:196414)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3928064 gen 25 stampUS 49571997650 uuid 57f5c142-45632d75
2016-10-06T20:22:11.635Z cpu3:35532)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3928064 gen 25 stampUS 49571997650 uuid 57f5c142-45632d75-
2016-10-06T20:22:11.635Z cpu0:32799)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:11.635Z cpu0:32799)ScsiDeviceIO: 2325: Cmd(0x412e80848580) 0x28, CmdSN 0x4f06 from world 32799 to dev "naa.6001405c0d86944f3d2468d80c7d1
2016-10-06T20:22:11.773Z cpu0:32843)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:11.916Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:12.000Z cpu2:33431)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410987bf0800 network resource pool netsched.pools.persist.iscsi associa
2016-10-06T20:22:12.000Z cpu2:33431)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x410987bf0800 network tracker id 16 tracker.iSCSI.172.16.1.40 associated
2016-10-06T20:22:12.056Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:12.194Z cpu0:35549)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x412e80848580, 32799) to dev "naa.6001405c0d86944f3d2468d80c7d1540" on
2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba38:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
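Since diagnostics.latency-measurement is already on, next time it hangs I can also capture per-fop latency on the gluster side and the LIO state on the target node; roughly like this (volume and backstore names are placeholders for mine, so treat it as a sketch):

  # gluster side: per-brick fop latency and counts
  gluster volume profile <volname> start
  gluster volume profile <volname> info

  # LIO side: current target tree and fileio backstore attributes
  targetcli ls
  targetcli /backstores/fileio/<backstore> get attribute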
2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess [ISID: 00023d000004 TARGET: iqn.2016-09.iscsi.gluster:shared TPGT:
2016-10-06T20:22:12.253Z cpu2:33431)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 172.16.1.53:49959 R: 172.16.1.40:3260]

Is it that the gluster overhead is just killing LIO/the target?

thanks,
Mike

On Thu, Oct 6, 2016 at 12:22 PM, Vijay Bellur <[email protected]> wrote:
> Hi Mike,
>
> Can you please share your gluster volume configuration?
>
> Also, do you notice anything in the client logs on the node where the
> fileio backstore is configured?
>
> Thanks,
> Vijay
>
> On Wed, Oct 5, 2016 at 8:56 PM, Michael Ciccarelli <[email protected]> wrote:
> > So I have a fairly basic setup using glusterfs between 2 nodes. The
> > nodes have 10-gig connections, and the bricks reside on SSD LVM LUNs:
> >
> > Brick1: media1-be:/gluster/brick1/gluster_volume_0
> > Brick2: media2-be:/gluster/brick1/gluster_volume_0
> >
> > On this volume I have a LIO iSCSI target with 1 fileio backstore that
> > is shared out to VMware ESXi hosts. The volume is around 900 gig, and
> > the fileio store is around 850 gig:
> >
> > -rw-r--r-- 1 root root 912680550400 Oct 5 20:47 iscsi.disk.3
> >
> > I set the WWN to be the same on both nodes so the ESXi hosts see them
> > as 2 paths to the same target; I believe this is what I want. The issue
> > I'm seeing is that while the I/O wait is low, CPU usage is high with
> > only 3 VMs running on only 1 of the ESX servers.
> >
> > This is media2-be:
> >
> >   PID USER  PR NI    VIRT   RES  SHR S  %CPU %MEM      TIME+ COMMAND
> >  1474 root  20  0 1396620 37912 5980 S 135.0  0.1  157:01.84 glusterfsd
> >  1469 root  20  0  747996 13724 5424 S   2.0  0.0    1:10.59 glusterfs
> >
> > And this morning it seemed like I had to restart the LIO service on
> > media1-be, as VMware was seeing timeout issues. I'm seeing errors like
> > this on the VMware ESX servers:
> >
> > 2016-10-06T00:51:41.100Z cpu0:32785)WARNING: ScsiDeviceIO: 1223: Device
> > naa.600140501ce79002e724ebdb66a6756d performance has deteriorated. I/O
> > latency increased from average value of 33420 microseconds to 732696
> > microseconds.
> >
> > Are there any special settings needed to get gluster + LIO + VMware
> > working? Has anyone gotten this to work well enough to be stable? What
> > am I missing?
> >
> > thanks,
> > Mike
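For completeness, here's roughly how the fileio backstore and the shared WWN are set up on each node. The mount point and backstore name below are placeholders (the image file actually lives on the fuse-mounted gluster volume), and the fileio_N index under configfs may differ, so treat this as a sketch of the procedure rather than a copy of my exact commands:

  # fileio backstore on top of the fuse-mounted gluster volume,
  # with write-back caching off (write-through)
  targetcli /backstores/fileio create <backstore> /mnt/gluster/iscsi.disk.3 write_back=false

  # force the unit serial (and hence the NAA WWN that ESXi sees) to the
  # same value on both nodes so they appear as 2 paths to the same LUN
  echo "<same-serial-on-both-nodes>" > \
      /sys/kernel/config/target/core/fileio_0/<backstore>/wwn/vpd_unit_serial

  targetcli saveconfig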
