Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
> -----Original Message-----
> From: Eric Harney [mailto:ehar...@redhat.com]
> Sent: Friday, September 25, 2015 2:56 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
>
> On 09/25/2015 02:30 PM, Mitsuhiro Tanino wrote:
> > On 09/22/2015 06:43 PM, Robert Collins wrote:
> >> On 23 September 2015 at 09:52, Chris Friesen <chris.frie...@windriver.com> wrote:
> >>> Hi,
> >>>
> >>> I recently had an issue with one file out of a dozen or so in
> >>> "/opt/cgcs/cinder/data/volumes/" being present but of size zero.
> >>> I'm running stable/kilo if it makes a difference.
> >>>
> >>> Looking at the code in volume.targets.tgt.TgtAdm.create_iscsi_target(),
> >>> I'm wondering if we should do an fsync() before the close(). The way it
> >>> stands now, it seems like it might be possible to write the file, start
> >>> making use of it, and then take a power outage before it actually gets
> >>> written to persistent storage. When we come back up we could have an
> >>> instance expecting to make use of it, but no target information in the
> >>> on-disk copy of the file.
> >
> > I think even if there is no target information in the configuration file
> > directory, c-vol starts successfully, the iSCSI targets are created
> > automatically, and the volumes are exported, right?
> >
> > There is a problem in this case: the iSCSI target is created without
> > authentication, because we can't get the previous authentication from
> > the configuration file.
> >
> > I'm curious what kind of problem you met.
> >
> >> If it's being kept in sync with DB records, and won't self-heal from
> >> this situation, then yes. e.g. if the overall workflow is something
> >> like
> >
> > In my understanding, the provider_auth in the database has the user name
> > and password for the iSCSI target. Therefore, if we get authentication
> > from the DB, I think we can self-heal from this situation correctly
> > after the c-vol service is restarted.
>
> Is this not already done as-needed by ensure_export()?

Yes, this logic is in ensure_export(), but only the lio target uses the DB; the other targets use the file.

> > The lio target obtains authentication from provider_auth in the
> > database, but tgtd, iet, and cxt obtain authentication from the file to
> > recreate the iSCSI target when c-vol is restarted. If the file is
> > missing, these volumes are exported without authentication and the
> > configuration file is recreated, as I mentioned above.
> >
> > tgtd: Get target chap auth from file
> > iet:  Get target chap auth from file
> > cxt:  Get target chap auth from file
> > lio:  Get target chap auth from database (provider_auth)
> > scst: Get target chap auth by using original command
> >
> > If we get authentication from the DB for tgtd, iet, and cxt, same as
> > lio, we can recreate the iSCSI target with proper authentication when
> > c-vol is restarted. I think this is a solution for this situation.
> >
> > Any thought?
>
> This may be possible, but fixing the target config file to be written
> more safely, so it works as currently intended, is still a win.

I think it is better to fix both of them:
(1) Add logic to write the configuration file using fsync.
(2) Read authentication from the database during ensure_export(), same as the lio target.

Thanks,
Mitsuhiro Tanino
> > Thanks,
> > Mitsuhiro Tanino
>
> >> -----Original Message-----
> >> From: Chris Friesen [mailto:chris.frie...@windriver.com]
> >> Sent: Friday, September 25, 2015 12:48 PM
> >> To: openstack-dev@lists.openstack.org
> >> Subject: Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
> >>
> >> On 09/24/2015 04:21 PM, Chris Friesen wrote:
> >>> On 09/24/2015 12:18 PM, Chris Friesen wrote:
> >>>
> >>>> I think what happened is that we took the SIGTERM after the open()
> >>>> call in create_iscsi_target(), but before writing anything to the file.
> >>>>
> >>>>     f = open(volume_path, 'w+')
> >>>>     f.write(volume_conf)
> >>>>     f.close()
> >>>>
> >>>> The 'w+' causes the file to be immediately truncated on opening,
> >>>> leading to an empty file.
> >>>>
> >>>> To work around this, I think we need to do the classic "write to a
> >>>> temporary file and then rename it to the desired filename" trick.
> >>>> The atomicity of the rename ensures that either the old contents or
> >>>> the new contents are present.
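A minimal sketch of that combination -- the temp-file-plus-rename trick from the quoted mail, plus the fsync() proposed above. Names are illustrative; this is not the actual Cinder patch:

    import os
    import tempfile

    def write_target_config(volume_path, volume_conf):
        # Write to a temp file in the same directory (rename is only
        # atomic within one filesystem), fsync it, then rename it over
        # the real name. Readers see either the old complete file or
        # the new complete file, never a truncated one.
        dir_name = os.path.dirname(volume_path)
        fd, tmp_path = tempfile.mkstemp(dir=dir_name)
        try:
            with os.fdopen(fd, 'w') as f:
                f.write(volume_conf)
                f.flush()
                os.fsync(f.fileno())  # data reaches disk before the rename
            os.rename(tmp_path, volume_path)
            # (For full crash-safety of the rename itself, the directory
            # can be fsync()ed as well.)
        except Exception:
            os.unlink(tmp_path)
            raise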
Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
> -----Original Message-----
> From: Chris Friesen [mailto:chris.frie...@windriver.com]
> Sent: Friday, September 25, 2015 3:04 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
>
> On 09/25/2015 12:30 PM, Mitsuhiro Tanino wrote:
> > On 09/22/2015 06:43 PM, Robert Collins wrote:
> >> On 23 September 2015 at 09:52, Chris Friesen <chris.frie...@windriver.com> wrote:
> >>> Hi,
> >>>
> >>> I recently had an issue with one file out of a dozen or so in
> >>> "/opt/cgcs/cinder/data/volumes/" being present but of size zero.
> >>> I'm running stable/kilo if it makes a difference.
> >>>
> >>> Looking at the code in volume.targets.tgt.TgtAdm.create_iscsi_target(),
> >>> I'm wondering if we should do an fsync() before the close(). The way it
> >>> stands now, it seems like it might be possible to write the file, start
> >>> making use of it, and then take a power outage before it actually gets
> >>> written to persistent storage. When we come back up we could have an
> >>> instance expecting to make use of it, but no target information in the
> >>> on-disk copy of the file.
> >
> > I think even if there is no target information in the configuration file
> > directory, c-vol starts successfully, the iSCSI targets are created
> > automatically, and the volumes are exported, right?
> >
> > There is a problem in this case: the iSCSI target is created without
> > authentication, because we can't get the previous authentication from
> > the configuration file.
> >
> > I'm curious what kind of problem you met.
>
> We had an issue in a private patch that was ported to Kilo without
> realizing that the data type of chap_auth had changed.

I understand. Thank you for your explanation.

> > In my understanding, the provider_auth in the database has the user name
> > and password for the iSCSI target. Therefore, if we get authentication
> > from the DB, I think we can self-heal from this situation correctly
> > after the c-vol service is restarted.
> >
> > The lio target obtains authentication from provider_auth in the
> > database, but tgtd, iet, and cxt obtain authentication from the file to
> > recreate the iSCSI target when c-vol is restarted. If the file is
> > missing, these volumes are exported without authentication and the
> > configuration file is recreated, as I mentioned above.
> >
> > tgtd: Get target chap auth from file
> > iet:  Get target chap auth from file
> > cxt:  Get target chap auth from file
> > lio:  Get target chap auth from database (provider_auth)
> > scst: Get target chap auth by using original command
> >
> > If we get authentication from the DB for tgtd, iet, and cxt, same as
> > lio, we can recreate the iSCSI target with proper authentication when
> > c-vol is restarted. I think this is a solution for this situation.
>
> If we fixed the chap auth info then we could live with a zero-size file.
> However, with the current code, if we take a kernel panic or power outage
> it's theoretically possible to end up with a corrupt file of nonzero size
> (due to metadata hitting the persistent storage before the data). I'm not
> confident that the current code would deal properly with that.
>
> That said, if we always regenerate every file from the DB on cinder-volume
> startup (regardless of whether or not it existed, and without reading in
> the existing file), then we'd be okay without the robustness improvements.

This file is read when the SCSI target service is restarted. Therefore, adding robustness to this file is also a good approach, IMO.
Thanks,
Mitsuhiro Tanino

> Chris
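As a rough sketch of the regenerate-from-DB idea discussed above, a helper like the following could back ensure_export() for the file-based targets. The "CHAP <username> <password>" layout of provider_auth is an assumption based on the thread's description of the lio target, and the function name is illustrative:

    def _get_chap_auth_from_db(volume):
        # provider_auth is assumed to look like "CHAP <username> <password>",
        # as stored when the target was first created.
        provider_auth = volume.get('provider_auth')
        if not provider_auth:
            return None
        auth_method, auth_user, auth_pass = provider_auth.split(' ', 2)
        if auth_method != 'CHAP':
            return None
        return auth_user, auth_pass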
Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?
On 09/22/2015 06:43 PM, Robert Collins wrote:
> On 23 September 2015 at 09:52, Chris Friesen <chris.frie...@windriver.com> wrote:
>> Hi,
>>
>> I recently had an issue with one file out of a dozen or so in
>> "/opt/cgcs/cinder/data/volumes/" being present but of size zero. I'm
>> running stable/kilo if it makes a difference.
>>
>> Looking at the code in volume.targets.tgt.TgtAdm.create_iscsi_target(),
>> I'm wondering if we should do an fsync() before the close(). The way it
>> stands now, it seems like it might be possible to write the file, start
>> making use of it, and then take a power outage before it actually gets
>> written to persistent storage. When we come back up we could have an
>> instance expecting to make use of it, but no target information in the
>> on-disk copy of the file.

I think even if there is no target information in the configuration file directory, c-vol starts successfully, the iSCSI targets are created automatically, and the volumes are exported, right?

There is a problem in this case: the iSCSI target is created without authentication, because we can't get the previous authentication from the configuration file.

I'm curious what kind of problem you met.

> If it's being kept in sync with DB records, and won't self-heal from
> this situation, then yes. e.g. if the overall workflow is something
> like

In my understanding, the provider_auth in the database has the user name and password for the iSCSI target. Therefore, if we get authentication from the DB, I think we can self-heal from this situation correctly after the c-vol service is restarted.

The lio target obtains authentication from provider_auth in the database, but tgtd, iet, and cxt obtain authentication from the file to recreate the iSCSI target when c-vol is restarted. If the file is missing, these volumes are exported without authentication and the configuration file is recreated, as I mentioned above.

tgtd: Get target chap auth from file
iet:  Get target chap auth from file
cxt:  Get target chap auth from file
lio:  Get target chap auth from database (provider_auth)
scst: Get target chap auth by using original command

If we get authentication from the DB for tgtd, iet, and cxt, same as lio, we can recreate the iSCSI target with proper authentication when c-vol is restarted. I think this is a solution for this situation.

Any thought?

Thanks,
Mitsuhiro Tanino

-----Original Message-----
From: Chris Friesen [mailto:chris.frie...@windriver.com]
Sent: Friday, September 25, 2015 12:48 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [cinder] should we use fsync when writing iscsi config file?

On 09/24/2015 04:21 PM, Chris Friesen wrote:
> On 09/24/2015 12:18 PM, Chris Friesen wrote:
>
>> I think what happened is that we took the SIGTERM after the open()
>> call in create_iscsi_target(), but before writing anything to the file.
>>
>>     f = open(volume_path, 'w+')
>>     f.write(volume_conf)
>>     f.close()
>>
>> The 'w+' causes the file to be immediately truncated on opening,
>> leading to an empty file.
>>
>> To work around this, I think we need to do the classic "write to a
>> temporary file and then rename it to the desired filename" trick.
>> The atomicity of the rename ensures that either the old contents or
>> the new contents are present.
>
> I'm pretty sure that upstream code is still susceptible to zeroing out
> the file in the above scenario. However, it doesn't take an exception --
> that's due to a local change on our part that attempted to fix the below
> issue.
The stable/kilo code *does* have a problem in that when it regenerates the file it's missing the CHAP authentication line (beginning with "incominguser").

I've proposed a change at https://review.openstack.org/#/c/227943/

If anyone has suggestions on how to do this more robustly or more cleanly, please let me know.

Chris
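For context, the per-volume file that create_iscsi_target() writes is a tgt target definition roughly along these lines (IQN prefix, paths, and values are illustrative); the stable/kilo regression above regenerated it without the incominguser line:

    <target iqn.2010-10.org.openstack:volume-<id>>
        backing-store /dev/stack-volumes/volume-<id>
        driver iscsi
        incominguser <chap-username> <chap-password>
    </target>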
Re: [openstack-dev] [cinder] Spec for volume migration improvement
Hi Vincent,

> @Mitsuhiro Tanino, I believe you can submit another spec for the efficient
> migration as well.

Yup, I posted a spec and got some positive feedback.
https://review.openstack.org/#/c/186209/

Also I posted some comments on your spec. Waiting for your next update to the spec.

Thanks,

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS

From: Sheng Bo Hou [mailto:sb...@cn.ibm.com]
Sent: Monday, June 01, 2015 2:11 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [cinder] Spec for volume migration improvement

Hi all folks from Cinder,

According to our agreement at the Vancouver summit, I submitted a cinder-spec to do the volume migration improvement. The spec is here for review: https://review.openstack.org/#/c/186327/
Please feel free to give your comments.

@Mitsuhiro Tanino, I believe you can submit another spec for the efficient migration as well.

Thanks, folks.

Best wishes,
Vincent Hou (侯胜博)

Staff Software Engineer, Open Standards and Open Source Team, Emerging Technology Institute, IBM China Software Development Lab
Tel: 86-10-82450778 Fax: 86-10-82453660
Notes ID: Sheng Bo Hou/China/IBM@IBMCN  E-mail: sb...@cn.ibm.com
Address: 3F Ring, Building 28 Zhongguancun Software Park, 8 Dongbeiwang West Road, Haidian District, Beijing, P.R.C. 100193
地址:北京市海淀区东北旺西路8号中关村软件园28号楼环宇大厦3层 邮编:100193
Re: [openstack-dev] [cinder] Migrating in-use volumes with swap_volume in Nova
Hi,

> The result turns out fine finally and the volume is successfully migrated
> via the swap_volume method in Nova.

swap_volume uses libvirt's live storage migration feature, called blockcopy. If you use this approach, you don't need to detach a volume before volume migration.
http://kashyapc.com/2014/07/06/live-disk-migration-with-libvirt-blockcopy/

Before applying this approach to your system, it is recommended to test it very well, as many people said at the summit, because this scenario is not covered by a Tempest test case.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS

From: Sheng Bo Hou [mailto:sb...@cn.ibm.com]
Sent: Friday, May 29, 2015 12:06 AM
To: vivek.nandava...@vmturbo.com
Cc: openstack-dev@lists.openstack.org
Subject: [openstack-dev] [cinder] Migrating in-use volumes with swap_volume in Nova

Hi Vivek,

Per our discussion about the migration of an in-use volume, I have done the following tests. I took KVM as the hypervisor and LVM as the backend for Cinder.

* Configure two c-vol nodes.
* Create the image in Glance.
* Create a volume from this image.
* Boot a VM from this volume, so the volume becomes in-use and the VM is ready.
* Migrate this in-use volume.

The result turns out fine finally and the volume is successfully migrated via the swap_volume method in Nova. I just send you this mail as a confirmation that swap_volume works for your use case.

Best wishes,
Vincent Hou (侯胜博)

Staff Software Engineer, Open Standards and Open Source Team, Emerging Technology Institute, IBM China Software Development Lab
Tel: 86-10-82450778 Fax: 86-10-82453660
Notes ID: Sheng Bo Hou/China/IBM@IBMCN  E-mail: sb...@cn.ibm.com
Address: 3F Ring, Building 28 Zhongguancun Software Park, 8 Dongbeiwang West Road, Haidian District, Beijing, P.R.C. 100193
地址:北京市海淀区东北旺西路8号中关村软件园28号楼环宇大厦3层 邮编:100193
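To illustrate the blockcopy mechanism mentioned above, here is a minimal sketch using the libvirt Python bindings. The domain name, disk target, and destination device are hypothetical, and this is a sketch of the libvirt feature, not Nova's swap_volume implementation:

    import time
    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # hypothetical domain name

    # XML describing the copy destination (hypothetical device path).
    dest_xml = ("<disk type='block'>"
                "<source dev='/dev/mapper/new-cinder-volume'/>"
                "</disk>")

    # Start mirroring 'vda' onto the new device while the guest keeps
    # running. REUSE_EXT tells libvirt the destination already exists
    # (a pre-created volume), so it must not try to create it.
    dom.blockCopy('vda', dest_xml,
                  flags=libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)

    # Once the mirror is in sync, pivot the domain onto the copy.
    while True:
        info = dom.blockJobInfo('vda', 0)
        if info and info['end'] > 0 and info['cur'] == info['end']:
            dom.blockJobAbort('vda',
                              libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
            break
        time.sleep(1)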
Re: [openstack-dev] [stable] [Cinder] FFE for Clear migration_status from a destination volume if migration fails
Hi Jay,

Thank you for your cooperation. This fix was merged successfully today.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS

> -----Original Message-----
> From: Jay S. Bryant [mailto:jsbry...@electronicjungle.net]
> Sent: Tuesday, April 07, 2015 11:08 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [stable] [Cinder] FFE for Clear migration_status from a destination volume if migration fails
>
> Mitsuhiro,
>
> I had already put a +2 on this, so I am agreeable to an FFE.
>
> Mike or John, what are your thoughts?
>
> Jay
>
> On 04/06/2015 06:27 PM, Mitsuhiro Tanino wrote:
> > Hello,
> >
> > I would like to get an FFE for patch https://review.openstack.org/#/c/161328/.
> > This patch fixes a volume migration problem in which the proper cleanup
> > steps are not executed if the volume migration fails. This change only
> > affects the cleanup steps and does not change the normal volume
> > migration steps.
> >
> > Regards,
> > Mitsuhiro Tanino
[openstack-dev] [stable] [Cinder] FFE for Clear migration_status from a destination volume if migration fails
Hello,

I would like to get an FFE for patch https://review.openstack.org/#/c/161328/.

This patch fixes a volume migration problem in which the proper cleanup steps are not executed if the volume migration fails. This change only affects the cleanup steps and does not change the normal volume migration steps.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
Re: [openstack-dev] [Cinder] Support LVM on a shared LU
Hello, Duncan, Mike,

> 11:13 (DuncanT) mtanino, You can and should submit the code even if the
> spec isn't approved as long as it isn't looking contentious, but I will
> certainly take a look

Based on this comment at the cinder unofficial meeting on Wednesday, I posted both the updated cinder-spec and the code. Could you review the spec and the code?

Spec: https://review.openstack.org/#/c/129352/
Code: https://review.openstack.org/#/c/92479/

The code is still work in progress, but most of the functions are already implemented. Please check that the code doesn't break the existing Cinder code.

For your reference, here are all the links related to this proposal.

Blueprints:
* https://blueprints.launchpad.net/nova/+spec/lvm-driver-for-shared-storage
* https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage

Specs:
* Nova: https://review.openstack.org/#/c/127318/
* Cinder: https://review.openstack.org/#/c/129352/

Gerrit reviews:
* Nova: https://review.openstack.org/#/c/92443/
* Cinder: https://review.openstack.org/#/c/92479/

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
Re: [openstack-dev] [Nova] Nominating Jay Pipes for nova-core
+1

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
c/o Red Hat, 314 Littleton Road, Westford, MA 01886

> -----Original Message-----
> From: Russell Bryant [mailto:rbry...@redhat.com]
> Sent: Thursday, July 31, 2014 7:58 AM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Nova] Nominating Jay Pipes for nova-core
>
> On 07/30/2014 05:10 PM, Russell Bryant wrote:
> > On 07/30/2014 05:02 PM, Michael Still wrote:
> >> Greetings,
> >>
> >> I would like to nominate Jay Pipes for the nova-core team.
> >>
> >> Jay has been involved with nova for a long time now. He's previously
> >> been a nova core, as well as a glance core (and PTL). He's been
> >> around so long that there are probably other types of core status I
> >> have missed.
> >>
> >> Please respond with +1s or any concerns.
> >
> > +1
>
> Further, I'd like to propose that we treat all of his existing +1 reviews
> as +2 (once he's officially added to the team).
>
> Does anyone have a problem with doing that? I think some folks would have
> done that anyway, but I wanted to clarify that it's OK.
>
> --
> Russell Bryant
[openstack-dev] [Nova] Requesting spec freeze exception: LVM: Support a volume-group on shared storage spec
Hi Nova cores,

I would like to request a spec freeze exception for my spec on "LVM: Support a volume-group on shared storage".

Please see the spec here:
https://review.openstack.org/#/c/97602/4/specs/juno/lvm-driver-for-shared-storage.rst

This is for the blueprint:
https://blueprints.launchpad.net/nova/+spec/lvm-driver-for-shared-storage

This feature request involves both Nova and Cinder pieces. The Cinder piece is the following BP:
https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage

Both the nova-spec and the Cinder blueprint were once approved; however, there was some discussion about the Cinder piece, and the blueprint was changed from "approved" back to "discussion" status. Therefore I'm preparing to explain the benefits of my proposal to the Cinder community again.

As for the Nova patch, I have already got a +2 from a core reviewer, so once the Cinder piece is approved again, I think I can move both pieces forward to get into the Juno release.

- Nova piece
  Blueprint: https://blueprints.launchpad.net/nova/+spec/lvm-driver-for-shared-storage
  nova-spec: https://review.openstack.org/97602
  Nova patch: https://review.openstack.org/92443

- Cinder piece
  Blueprint: https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage
  Cinder patch: https://review.openstack.org/92479

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
c/o Red Hat, 314 Littleton Road, Westford, MA 01886
Re: [openstack-dev] [Cinder] Support LVM on a shared LU
Hi Deepak-san,

Thank you for your comments. Please see the following replies.

> 1) There is a lot of manual work needed here.. like every time a new host
> is added.. the admin needs to do FC zoning to ensure that the LU is visible
> by the host.

Right. Compared to the LVMiSCSI driver, the proposed driver needs some manual admin work.

> Also the method you mentioned for refreshing (echo '---' ...) doesn't work
> reliably across all storage types, does it?

The echo command is already used in rescan_hosts() in linuxfc.py before connecting a new volume to an instance. As you mentioned, whether this command works properly or not depends on the storage type. Therefore, the admin needs to confirm that the command works properly.

> 2) In slide 1-1.. how (and who?) ensures that the compute nodes don't step
> on each other in using the LVs? In other words.. how is it ensured that LV1
> is not used by compute nodes 1 and 2 at the same time?

In my understanding, Nova can't assign a single cinder volume (e.g. VOL1) to multiple instances. After attaching VOL1 to an instance, the status of VOL1 changes to "in-use" and the user can't attach VOL1 to other instances.

> 3) In slide 1-2, you show that LU1 is seen as /dev/sdx on all the nodes..
> this is wrong.. it can be seen as anything (/dev/sdx on the control node,
> sdn on compute 1, sdz on compute 2), so assuming sdx on all nodes is wrong.
> How are these different device names handled.. in short, how does compute
> node 2 know that LU1 is actually sdn and not sdz (assuming you had more
> than one LU provisioned)?

Right. The same device name may not be assigned on all nodes. With my proposed driver, the admin needs to create the PV and VG manually. Therefore, all nodes do not have to recognize LU1 as /dev/sdx.

> 4) What about multipath? In most production environments.. the FC storage
> will be multipathed.. hence you will actually see sdx and sdy on each node
> and you actually need to use the mpathN device (which is multipathed to sdx
> and sdy) and NOT the sd? device to take advantage of the customer's
> multipath environment. How do the nodes know which mpath? device to use and
> which mpath? device maps to which LU on the array?

As I mentioned above, the admin creates the PV and VG manually with my proposed driver. If a production environment uses multipath, the admin can create a PV and VG on top of the mpath device, using pvcreate /dev/mpath/mpathX.

> 5) Doesn't this new proposal also cause the compute nodes to be physically
> connected (via FC) to the array, which means more wiring and the need for
> an FC HBA on the compute nodes? With LVMiSCSI, we don't need FC HBAs on
> compute nodes, so you are actually adding the cost of an FC HBA to each
> compute node and slowly turning a commodity system into a non-commodity
> one ;-) (in a way)

I think this depends on the customer's or cloud provider's requirements (slide P9). If the requirement is a low-cost, FC-less cloud environment, LVMiSCSI is the appropriate driver. If better I/O performance is required, the proposed driver or a vendor cinder storage driver with FC is appropriate, because these drivers can issue I/O to volumes directly via FC.

> 6) Last but not the least... since you are using 1 BIG LU on the array to
> host multiple volumes, you cannot possibly take advantage of the premium,
> efficient snapshot/clone/mirroring features of the array, since they are at
> the LU level, not at the LV level. LV snapshots have limitations (as
> mentioned by you in another thread) and are always inefficient compared to
> array snapshots. Why would someone want to use a less efficient method when
> they invested in an expensive array?

Right. If the user uses an array volume directly, the user can take advantage of the efficient snapshot/clone/mirroring features. As I wrote in my reply to Avishay-san, in an OpenStack cloud environment the workload on storage has been increasing, and it is difficult to manage that workload because every user has permission to execute storage operations via cinder.

In order to use an expensive array more efficiently, I think it is better to reduce the hardware-based storage workload by offloading it to software-based volume operations on a case-by-case basis. If we have two drivers for one storage, we can provide volumes both ways as the situation demands.

Ex.
As for a Standard type storage, use the proposed software-based LVM cinder driver.
As for a High performance type storage, use the hardware-based cinder driver (e.g. at a higher charge than a Standard volume).

This is one of the use cases of my proposed driver.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
c/o Red Hat, 314 Littleton Road, Westford, MA 01886
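For reference, a minimal sketch of the SCSI rescan mechanism mentioned above -- the same thing rescan_hosts() does by writing the "- - -" wildcard triple into sysfs. It needs root, and as noted in the thread, its behavior should be verified per storage type:

    import glob

    def rescan_scsi_hosts():
        # Ask every SCSI host adapter to rescan all channels, targets,
        # and LUNs ("- - -" is the wildcard triple), so a newly mapped
        # LU appears without rebooting the host.
        for scan_path in glob.glob('/sys/class/scsi_host/host*/scan'):
            with open(scan_path, 'w') as f:
                f.write('- - -')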
Re: [openstack-dev] [Cinder] Support LVM on a shared LU
Hi Avishay-san,

Thank you for your review and comments on my proposal. I commented in-line.

> So the way I see it, the value here is a generic driver that can work with
> any storage. The downsides:

A generic driver for any storage is one benefit, but the main benefit of the proposed driver is as follows:

- Reduce the hardware-based storage workload by offloading that workload to software-based volume operations.

Conventionally, operations on an enterprise storage, such as volume creation, deletion, snapshot, etc., are permitted only to the system administrator, who handles these operations after careful examination. In an OpenStack cloud environment, every user has permission to execute these storage operations via cinder. As a result, the workload on storage has been increasing and it is difficult to manage.

If we have two drivers for one storage, we can use both ways as the situation demands.

Ex.
As for a Standard type storage, use the proposed software-based LVM cinder driver.
As for a High performance type storage, use the hardware-based cinder driver.

As a result, we can offload the workload for the standard type storage from the physical storage to the cinder host.

> 1. The admin has to manually provision a very big volume and attach it to
> the Nova and Cinder hosts. Every time a host is rebooted, or introduced,
> the admin must do manual work. This is one of the things OpenStack should
> be trying to avoid. This can't be automated without a driver, which is
> what you're trying to avoid.

I think the current FC-based cinder drivers use a SCSI scan to find a newly created LU:

    # echo "- - -" > /sys/class/scsi_host/hostX/scan

The admin can find an additional LU using this, so a host reboot is not required. Yes, some manual admin work is required and can't be automated. I would like to know whether these operations are within an acceptable range to enjoy the benefits of my proposed driver.

> 2. You lose on performance to volumes by adding another layer in the stack.

I think this is case by case. When users use a cinder volume for a database, they prefer a raw volume, and the proposed driver can't provide a raw cinder volume. In this case, I recommend the High performance type storage. LVM is a default feature in many Linux distributions; it is also used in many enterprise systems, and I think there is no critical performance loss.

> 3. You lose performance with snapshots - appliances will almost certainly
> have more efficient snapshots than LVM over network (consider that for
> every COW operation, you are reading synchronously over the network).
> (Basically, you turned your fully-capable storage appliance into a dumb
> JBOD)

I agree that the storage has an efficient COW snapshot feature, so we can create a new boot volume from Glance quickly. In this case, I recommend the High performance type storage. LVM can't create nested snapshots with shared LVM now; therefore, we can't assign writable LVM snapshots to instances. Does this answer your comment?

> In short, I think the cons outweigh the pros. Are there people deploying
> OpenStack who would deploy their storage like this?

Please consider the main benefit above.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
c/o Red Hat, 314 Littleton Road, Westford, MA 01886

From: Avishay Traeger [mailto:avis...@stratoscale.com]
Sent: Wednesday, May 21, 2014 4:36 AM
To: OpenStack Development Mailing List (not for usage questions)
Cc: Tomoki Sekiyama
Subject: Re: [openstack-dev] [Cinder] Support LVM on a shared LU

> Thanks,
> Avishay
>
> On Tue, May 20, 2014 at 6:31 PM, Mitsuhiro Tanino <mitsuhiro.tan...@hds.com> wrote:
> > Hello All,
> >
> > I'm proposing a feature of the LVM driver to support LVM on a shared LU.
> > The proposed LVM volume driver provides these benefits:
> > - Reduce hardware-based storage workload by offloading the workload to
> >   software-based volume operations.
> > - Provide quicker volume creation and snapshot creation without storage
> >   workloads.
> > - Enable cinder to use any kind of shared storage volume without a
> >   specific cinder storage driver
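To make the "two drivers for one storage" idea above concrete, a hypothetical multi-backend cinder.conf might look like this. The vendor driver class is a placeholder, not a real driver name, and the backend/VG names are illustrative:

    [DEFAULT]
    enabled_backends = standard_lvm, high_perf

    [standard_lvm]
    # Proposed software-based LVM driver on a shared LU (sketch)
    volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
    volume_group = shared-vg
    volume_backend_name = standard

    [high_perf]
    # Hardware-based vendor driver (placeholder class name)
    volume_driver = cinder.volume.drivers.vendor.VendorFCDriver
    volume_backend_name = high_perf

Volume types mapped via volume_backend_name would then let users choose between the Standard and High performance offerings at volume-creation time.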
[openstack-dev] [Cinder] Support LVM on a volume attached to multiple nodes
Hi John-san,

Thank you for your review of my summit suggestion; I understand your point. Could you look at my comments below?

- http://summit.openstack.org/cfp/details/166

> There's a lot here, based on conversations in IRC and the excellent wiki
> you put together I'm still trying to get a clear picture here. Let's start
> by taking transport protocol out of the mix; I believe what you're
> proposing is a shared, multi-attach LV (over iscsi, fibre or whatever).
> That seems to fall under the multi-attach feature.

Both my BP and the multi-attach-volume BP need a multi-attach feature, but in my understanding, the target layers and goals are different. Let me explain further.

* multi-attach-volume: Implement the volume multi-attach feature. The main target layer is the Nova layer (and a little implementation in the Cinder layer?).
* My BP: Implement a generic LVM volume driver using LVM on a storage volume (over iSCSI, Fibre Channel, or whatever) attached to multiple compute nodes. The target layers are both the Nova and Cinder layers.

The difference is that in the former case, the cinder volume needs to be created by another cinder storage driver with supported storage, and after that, the feature can attach the volume to multiple instances (and also hosts). On the other hand, my BP targets a generic LVM driver and does not depend on a specific vendor storage. The driver just provides features to create/delete/snapshot volumes, etc., from a volume group on a multi-attached storage.

The point is that the user needs to create a storage volume using their own storage management tool, attach the created volume to multiple compute nodes, create a VG on the volume, and configure it in cinder.conf as the volume_group. So a multi-attach feature is not a target feature to implement. The driver just requires an environment of LVM on a storage attached to multiple compute nodes.

I think my proposed LVM driver is orthogonal to multi-attach-volume, and we can use multi-attach-volume and my BP in combination.

Here is my additional understanding of both BPs.

[A] My BP
https://blueprints.launchpad.net/cinder/+spec/lvm-driver-for-shared-storage

The goal of my BP is supporting a volume group on a volume which is attached to multiple compute nodes. The proposed driver just provides features to create/delete/snapshot volumes, etc., from the volume group, same as the existing LVMiSCSI driver. But unlike the LVMiSCSI driver, the prepared volume group on a storage volume needs to be attached to multiple compute nodes, so that the compute nodes recognize it simultaneously, instead of using an iSCSI target. The multi-attached storage and the volume group on it need to be prepared by the user before cinder configuration, without any cinder features, because my driver is a generic LVM driver and targets storage environments which do not have a cinder storage driver.

[Preparation of volume_group] (see the sketch at the end of this mail)
(a) A user creates a storage volume using a storage management tool (not a cinder feature).
(b) Export the created volume to multiple compute nodes (using an FC host group or iSCSI target feature).
(c) Create a volume group on the exported storage volume.
(d) Configure the volume group as the cinder volume_group.

[B] BP of multi-attach-volume
https://blueprints.launchpad.net/cinder/+spec/multi-attach-volume

The goal of this BP is attaching a cinder volume to multiple instances. In my understanding, providing a cinder volume multi-attach feature is the goal of this BP.

Steps to multi-attach:
(1) Create a volume using a cinder volume driver with supported storage. (A driver and supported storage are required.)
(2) Export the created volume to multiple compute nodes via FC, iSCSI, etc.
(3) Attach the created volume to multiple instances in RO/RW mode.
(4) As a result, multiple instances can recognize a single cinder volume simultaneously.

> I'm also trying to understand your statements about improved performance.
> Could you maybe add some more detail or explanation around these things or
> maybe grab me on IRC when convenient?

This is a comparison between LVMiSCSI and my BP's driver. The current LVMiSCSI driver uses a cinder node as a virtual storage via a software iSCSI target. This is useful and does not depend on the storage arrangement, but we have to access the volume group over the network even if we have an FC environment. If we have a multi-attached FC volume, it is better to access the volume via FC instead of an iSCSI target. As a result, we expect better I/O performance and latency when accessing an FC volume attached to multiple compute nodes. This is the point I mentioned.

Regards,
Mitsuhiro Tanino <mitsuhiro.tan...@hds.com>
HITACHI DATA SYSTEMS
c/o Red Hat, 314 Littleton Road, Westford, MA 01886
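A hypothetical run of the [A] preparation steps above; the device paths and VG name are illustrative, not from the proposal:

    # (a)(b) Create the LU on the array and export it to all compute
    #        nodes with the storage management tool / FC host group
    #        (outside of cinder).

    # (c) On one node, create the PV and the shared VG on the
    #     multi-attached LU (use the mpath device when multipath is
    #     in use):
    pvcreate /dev/mapper/mpatha
    vgcreate shared-vg /dev/mapper/mpatha

    # (d) Configure the VG in cinder.conf:
    #     [DEFAULT]
    #     volume_group = shared-vg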