On Saturday, July 16, 2016, Oliver Dzombic <[email protected]> wrote:
> Hi Jake,
>
> thank you very much, both were needed: MTU and VAAI deactivated (I hope
> that won't interfere with vMotion or other features).
>
> I have now changed the MTU of the vmkernel port and the vSwitch. That
> solved this problem.

Try turning VAAI back on at some point.

> So I could make an ext4 filesystem and mount it.
>
> Running
>
> dd if=/dev/zero of=/mnt/8G_test bs=4k count=2M conv=fdatasync
>
> something is strange to me:
>
> The network gets a straight 1 Gbit (the maximum connection) of iSCSI
> bandwidth, but inside the VM I can only see 40-50 MB/s.
>
> The replication size is 2, so it would be easy to say 1/2 of 1 Gbit =
> 500 Mbit = 40-50 MB/s.
>
> But shouldn't this reduction happen inside the Ceph cluster, which runs
> on a 10G network?
>
> I mean, the data hit the Ceph iSCSI server at 1 Gbit. From there tgt
> hands it to RBD internally, and it is duplicated (over the 10G cluster
> network) before the ACK is sent back to iSCSI. Since the cluster
> duplicates it internally via 10G, shouldn't my expected bandwidth
> inside the VM be higher than half of the maximum speed?
>
> Is this a wrong understanding of the mechanism?

The delay is most likely just having to wait for 2 disks to actually do
the write.

> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:[email protected]
>
> Address:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402, Amtsgericht Hanau
> Managing Director: Oliver Dzombic
>
> Tax No.: 35 236 3622 1
> VAT ID: DE274086107
>
> On 16.07.2016 at 02:18, Jake Young wrote:
> > I had some odd issues like that due to an MTU mismatch.
> >
> > Keep in mind that the vSwitch and the vmkernel port have independent
> > MTU settings. Verify that you can ping with large packets, without
> > fragmentation, between your host and the iSCSI target.
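[Editor's note: a small sketch of the payload arithmetic behind that
fragmentation test. The interface name vmk1 and the target IP in the
comment are placeholders, not from the original mails.]

```python
# Largest ICMP payload that fits in a given MTU without fragmentation:
# the MTU must also carry a 20-byte IPv4 header and an 8-byte ICMP header.
IP_HEADER = 20
ICMP_HEADER = 8

def max_ping_payload(mtu: int) -> int:
    return mtu - IP_HEADER - ICMP_HEADER

# On an ESXi host you would then test with vmkping, e.g. (placeholders):
#   vmkping -I vmk1 -d -s 8972 <iscsi-target-ip>
# -d sets "don't fragment", -s sets the payload size.
print(max_ping_payload(1500))  # standard frames -> 1472
print(max_ping_payload(9000))  # jumbo frames    -> 8972
```

If the large ping fails while a default-size ping works, some device in
the path (vSwitch, vmkernel port, or physical switch) still has the
smaller MTU.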
> > If that's not it, you can try disabling the VAAI options to see if
> > one of them is causing issues. I haven't used ESXi 6.0 yet.
> >
> > Jake
> >
> > On Friday, July 15, 2016, Oliver Dzombic <[email protected]> wrote:
> >
> >     Hi,
> >
> >     I am currently trying this out.
> >
> >     My tgt config:
> >
> >     # cat tgtd.conf
> >     # The default config file
> >     include /etc/tgt/targets.conf
> >
> >     # Config files from other packages etc.
> >     include /etc/tgt/conf.d/*.conf
> >
> >     nr_iothreads=128
> >
> >     -----
> >
> >     # cat iqn.2016-07.tgt.esxi-test.conf
> >     <target iqn.2016-07.tgt.esxi-test>
> >         initiator-address ALL
> >         scsi_sn esxi-test
> >         #vendor_id CEPH
> >         #controller_tid 1
> >         write-cache on
> >         read-cache on
> >         driver iscsi
> >         bs-type rbd
> >         <backing-store vmware1/esxi-test>
> >             lun 1
> >             scsi_id cf10000c4a71e700506357
> >         </backing-store>
> >     </target>
> >
> >     --------------
> >
> >     If I create a VM inside ESXi 6 and try to format the virtual HDD,
> >     I see in the logs:
> >
> >     sd 2:0:0:0: [sda] CDB:
> >     Write(10): 2a 00 0f 86 a8 80 00 01 40 00
> >     mptscsih: ioc0: task abort: SUCCESS (rv=2002) (sc=ffff880068aa5e00)
> >     mptscsih: ioc0: attempting task abort! (sc=ffff880068aa4a80)
> >
> >     This is with the LSI HDD emulation. With the VMware
> >     paravirtualized controller everything just freezes.
> >
> >     Any idea about this issue?
> >
> >     On 11.07.2016 at 22:24, Jake Young wrote:
> > > I'm using this setup with ESXi 5.1 and I get very good performance.
> > > I suspect you have other issues.
> > > Reliability is another story (see Nick's posts on tgt and HA to
> > > get an idea of the awful problems you can have), but for my test
> > > labs the risk is acceptable.
> > >
> > > One change I found helpful is to run tgtd with 128 threads. I'm
> > > running Ubuntu 14.04, so I edited my /etc/init/tgt.conf file and
> > > changed the line that read:
> > >
> > > exec tgtd
> > >
> > > to
> > >
> > > exec tgtd --nr_iothreads=128
> > >
> > > If you're not concerned with reliability, you can enhance
> > > throughput even more by enabling the rbd client write-back cache
> > > in your tgt VM's ceph.conf file (you'll need to restart tgtd for
> > > this to take effect):
> > >
> > > [client]
> > > rbd_cache = true
> > > rbd_cache_size = 67108864              # (64MB)
> > > rbd_cache_max_dirty = 50331648         # (48MB)
> > > rbd_cache_target_dirty = 33554432      # (32MB)
> > > rbd_cache_max_dirty_age = 2
> > > rbd_cache_writethrough_until_flush = false
> > >
> > > Here's a sample targets.conf:
> > >
> > > <target iqn.2014-04.tgt.Charter>
> > >     initiator-address ALL
> > >     scsi_sn Charter
> > >     #vendor_id CEPH
> > >     #controller_tid 1
> > >     write-cache on
> > >     read-cache on
> > >     driver iscsi
> > >     bs-type rbd
> > >     <backing-store charter/vmguest>
> > >         lun 5
> > >         scsi_id cfe1000c4a71e700506357
> > >     </backing-store>
> > >     <backing-store charter/voting>
> > >         lun 6
> > >         scsi_id cfe1000c4a71e700507157
> > >     </backing-store>
> > >     <backing-store charter/oradata>
> > >         lun 7
> > >         scsi_id cfe1000c4a71e70050da7a
> > >     </backing-store>
> > >     <backing-store charter/oraback>
> > >         lun 8
> > >         scsi_id cfe1000c4a71e70050bac0
> > >     </backing-store>
> > > </target>
> > >
> > > I don't have FIO numbers handy, but I have some Oracle CALIBRATE_IO
> > > output.
> > >
> > > We're running Oracle RAC database servers in Linux VMs on ESXi 5.1,
> > > which use iSCSI to connect to the tgt service. I only have a single
> > > connection set up in ESXi for each LUN.
> > > I tested using multipathing and two tgt VMs presenting identical
> > > LUNs/RBD disks, but found that there wasn't a significant
> > > performance gain from doing this, even with round-robin path
> > > selection in VMware.
> > >
> > > These tests were run from two RAC VMs, each on a different host,
> > > with both hosts connected to the same tgt instance. The way we
> > > have Oracle configured, it would have been using two of the LUNs
> > > heavily during this CALIBRATE_IO test.
> > >
> > > This output is with 128 threads in tgtd and the rbd client cache
> > > enabled:
> > >
> > > START_TIME           END_TIME              MAX_IOPS  MAX_MBPS  MAX_PMBPS  LATENCY  DISKS
> > > -------------------- --------------------  --------  --------  ---------  -------  -----
> > > 28-JUN-016 15:10:50  28-JUN-016 15:20:04      14153       658        412       14     75
> > >
> > > This output is with the same configuration, but with the rbd
> > > client cache disabled:
> > >
> > > START_TIME           END_TIME              MAX_IOPS  MAX_MBPS  MAX_PMBPS  LATENCY  DISKS
> > > -------------------- --------------------  --------  --------  ---------  -------  -----
> > > 28-JUN-016 22:44:29  28-JUN-016 22:49:05       7449       161        219       20     75
> > >
> > > This output is from a directly connected EMC VNX5100 FC SAN with
> > > 25 disks using dual 8Gb FC links, on a different lab system:
> > >
> > > START_TIME           END_TIME              MAX_IOPS  MAX_MBPS  MAX_PMBPS  LATENCY  DISKS
> > > -------------------- --------------------  --------  --------  ---------  -------  -----
> > > 28-JUN-016 22:11:25  28-JUN-016 22:18:48       6487       299        224       19     75
> > >
> > > One of our goals for our Ceph cluster is to replace the EMC SANs.
> > > We've accomplished this performance-wise; the next step is to get
> > > a plausible iSCSI HA solution working. I'm very interested in what
> > > Mike Christie is putting together. I'm in the process of vetting
> > > the SUSE solution now.
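[Editor's note: a quick back-of-the-envelope from the two CALIBRATE_IO
runs quoted above (rbd cache on vs. off, same 75-disk cluster); the
ratios below are derived, not part of the original mail.]

```python
# Relative gain from enabling the rbd client write-back cache,
# using the MAX_IOPS / MAX_MBPS figures quoted in the thread.
cached = {"max_iops": 14153, "max_mbps": 658}
uncached = {"max_iops": 7449, "max_mbps": 161}

iops_gain = cached["max_iops"] / uncached["max_iops"]
mbps_gain = cached["max_mbps"] / uncached["max_mbps"]

print(f"IOPS gain: {iops_gain:.1f}x")  # roughly 1.9x
print(f"MBPS gain: {mbps_gain:.1f}x")  # roughly 4.1x
```

The bandwidth figure benefits far more than IOPS here, which is what
you would expect from a write-back cache coalescing small writes.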
> > > BTW - The tests were run when we had 75 OSDs, which are all
> > > 7200 RPM 2TB HDs, across 9 OSD hosts. We have no SSD journals;
> > > instead we have all the disks set up as single-disk RAID1 disk
> > > groups with WB cache and BBU. All OSD hosts have 40Gb networking
> > > and the ESXi hosts have 10G.
> > >
> > > Jake
> > >
> > > On Mon, Jul 11, 2016 at 12:06 PM, Oliver Dzombic
> > > <[email protected]> wrote:
> > >
> > >     Hi Mike,
> > >
> > >     I was trying:
> > >
> > >     https://ceph.com/dev-notes/adding-support-for-rbd-to-stgt/
> > >
> > >     ONE target, exported from different OSD servers directly, to
> > >     multiple VMware ESXi servers.
> > >
> > >     A config looked like:
> > >
> > >     # cat iqn.ceph-cluster_netzlaboranten-storage.conf
> > >
> > >     <target iqn.ceph-cluster:vmware-storage>
> > >         driver iscsi
> > >         bs-type rbd
> > >         backing-store rbd/vmware-storage
> > >         initiator-address 10.0.0.9
> > >         initiator-address 10.0.0.10
> > >         incominguser vmwaren-storage RPb18P0xAqkAw4M1
> > >     </target>
> > >
> > >     We had 4 OSD servers, each running this config, and 2 VMware
> > >     (ESXi) servers. So we had 4 paths to this vmware-storage RBD
> > >     image.
> > >
> > >     In the end, VMware had 8 paths: the 4 paths directly connected
> > >     to the specific VMware server, plus the 4 paths that server
> > >     saw via the other VMware server.
> > >
> > >     There were very big problems with performance. I am talking
> > >     about < 10 MB/s. The customer was not able to use it, so good
> > >     old NFS is serving instead.
> > >
> > >     At that time we used Ceph Hammer, and I think the customer was
> > >     using ESXi 5.5, or maybe ESXi 6; the testing was somewhere
> > >     last year.
> > >
> > >     --------------------
> > >
> > >     We will make a new attempt now with Ceph Jewel and ESXi 6, and
> > >     this time we will manage the VMware servers.
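[Editor's note: the path count described above can be sketched as
follows; the gateway and host names are hypothetical, only the counts
(4 tgt gateways, 2 ESXi hosts) come from the mail.]

```python
# Each ESXi host sees one direct path per tgt gateway, plus the same
# gateways again as reported via the other ESXi host: 4 + 4 = 8 paths.
gateways = ["osd1", "osd2", "osd3", "osd4"]  # hypothetical names
esxi_hosts = ["esxi1", "esxi2"]              # hypothetical names

def paths_seen_by(host):
    direct = [(gw, "direct") for gw in gateways]
    via_peer = [(gw, f"via-{peer}") for peer in esxi_hosts
                if peer != host for gw in gateways]
    return direct + via_peer

print(len(paths_seen_by("esxi1")))  # 8
```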
> > >     As soon as the issue
> > >
> > >     "ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2"
> > >
> > >     which I already mailed to the list, is solved, we can start
> > >     the testing.
> > >
> > >     On 11.07.2016 at 17:45, Mike Christie wrote:
> > > > On 07/08/2016 02:22 PM, Oliver Dzombic wrote:
> > > >> Hi,
> > > >>
> > > >> does anyone have experience with connecting VMware to Ceph in
> > > >> a smart way?
> > > >>
> > > >> iSCSI multipath did not really work well.
> > > >
> > > > Are you trying to export rbd images from multiple iSCSI targets
> > > > at the same time, or just one target?
> > > >
> > > > For the HA/multiple-target setup, I am working on this for Red
> > > > Hat. We plan to release it in RHEL 7.3 / RHCS 2.1. SUSE already
> > > > ships something, as someone mentioned.
> > > >
> > > > We just got a large chunk of code into the upstream kernel (it
> > > > is in the block layer maintainer's tree for the next kernel), so
> > > > it should be simple to add COMPARE_AND_WRITE support now. We
> > > > should be posting krbd exclusive lock support in the next couple
> > > > of weeks.
> > > >
> > > >> NFS could work, but I think that's just too many layers in
> > > >> between to get usable performance.
> > > >>
> > > >> Systems like ScaleIO have developed a VMware addon to talk to
> > > >> them.
> > > >>
> > > >> Is there something similar out there for Ceph?
> > > >>
> > > >> What are you using?
> > > >> Thank you!
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
