Re: [openstack-dev] [sahara] Upgrade of Hadoop components inside released version
Team - Please see in-line for my thoughts/opinions on the topic:

From: Andrew Lazarev alaza...@mirantis.com
Subject: [openstack-dev] [sahara] Upgrade of Hadoop components inside released version
Date: June 24, 2014 at 5:20:27 PM EDT
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org

> Hi Team,
>
> I want to raise a topic about upgrading components in a Hadoop version that is already supported by a released Sahara plugin. The question comes up because of several change requests ([1] and [2]). The topic was discussed in Atlanta ([3]), but we didn't come to a decision.

Any future policy that is put in place must preserve a plugin's ability to move forward in terms of functionality. Each plugin, depending on its implementation, is going to have limitations, sometimes with backwards compatibility. This is not a function of Sahara proper, but possibly of Hadoop and/or the distribution that the plugin implements. Each vendor/plugin should be allowed to control what it does or does not support.

With regards to the code submissions that are being delayed by the lack of a backwards-compatibility policy ([1], [2]), it is my opinion that they should be allowed to move forward, as there is no policy in place that is being challenged and/or violated. However, these code submissions serve as a good vehicle for discussing said compatibility policy.

> All of us agreed that existing clusters must continue to work after an OpenStack upgrade. So if a user creates a cluster with Icehouse Sahara and then upgrades OpenStack, everything should continue working as before. The trickiest operation is scaling, and it dictates a list of restrictions on a new version of a component:
>
> 1. The plugin-version pair supported by the plugin must not change.
> 2. If the component upgrade requires DIB involvement, then the plugin must work with both versions of the image - old and new.
> 3. A cluster with mixed nodes (created by old code and by new code) should still be operational.
>
> Given that, we should choose a policy for component upgrades. Here are several options:
>
> 1. Prohibit component upgrades in released versions of a plugin. Change the plugin version even if the Hadoop version didn't change. This solves all the listed problems but is a little bit frustrating for users. They will need to re-create all the clusters they have and migrate data, as if it were a Hadoop upgrade. They should also consider a Hadoop upgrade to do the migration only once.

Re-creating a cluster just because the version of a plugin (or Sahara) has changed is very unlikely to occur in the real world, as this could easily involve thousands of nodes and many petabytes of data. There must be a more compelling reason to re-create a cluster than "the plugin/Sahara has changed." What's more likely is that a provisioned cluster rendered incompatible with a future version of a plugin will result in an administrator making use of the 'native' management capabilities provided by the Hadoop distribution; in the case of HDP, this would be Ambari. Clusters can be completely managed through Ambari, including migration, scaling, etc. Only the VM resources are not managed by Ambari, and that is a relatively simple proposition.

> 2. Disable some operations on clusters created by a previous version. If users don't have the option to scale a cluster, there will be no problems with mixed nodes. For this option, Sahara needs to know whether the cluster was created by the current version or not.

If for some reason a change is introduced in a plugin that renders it incompatible across either Hadoop or OpenStack versions, it should still be possible to make such a change in favor of moving the state of the art forward. Such incompatibility may be difficult (read: expensive) or impossible to avoid. The requirement should be to specify the upgrade/migration support (through documentation), specifically with respect to scaling.

> 3. Require the change author to perform all kinds of tests and prove that a mixed cluster works as well as a non-mixed one. In that case we need a list of tests that is sufficient to cover all corner cases.

My opinion is that testing and backwards compatibility are ultimately the responsibility of the plugin. As such, the plugin vendor should not be restricted in terms of what it needs/must do, but should indicate through documentation what its capabilities are, to set expectations with customers/users.

> Ideas are welcome.
>
> [1] https://review.openstack.org/#/c/98260/
> [2] https://review.openstack.org/#/c/87723/
> [3] https://etherpad.openstack.org/p/juno-summit-sahara-relmngmt-backward
>
> Thanks,
> Andrew.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
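[Editor's note] Option 2 above - keep old clusters operational but disable scaling on clusters created by an older plugin version - can be made concrete in a few lines. This is an illustrative sketch only; the function and field names (`allowed_operations`, `plugin_version_at_creation`) are hypothetical, not actual Sahara code:

```python
# Illustrative sketch only: names and data shapes are hypothetical,
# not actual Sahara code.

# Versions the currently installed plugins support (example data).
CURRENT_PLUGIN_VERSIONS = {"hdp": "1.3.2", "vanilla": "2.3.0"}

def allowed_operations(cluster):
    """Return the set of operations permitted for this cluster.

    A cluster created by an older plugin version stays fully operational
    for everything except scaling, so mixed old/new nodes can never be
    created.
    """
    base = {"start", "stop", "delete", "run_job"}
    created_with = cluster["plugin_version_at_creation"]
    current = CURRENT_PLUGIN_VERSIONS[cluster["plugin"]]
    return base | {"scale"} if created_with == current else base

old_cluster = {"plugin": "hdp", "plugin_version_at_creation": "1.3.0"}
new_cluster = {"plugin": "hdp", "plugin_version_at_creation": "1.3.2"}
assert "scale" not in allowed_operations(old_cluster)  # scaling disabled
assert "scale" in allowed_operations(new_cluster)      # current version: ok
```

The key requirement this implies is the one Andrew names: Sahara must record, per cluster, which plugin version created it.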
Re: [openstack-dev] [savanna] why swift-internal:// ?
On Jan 24, 2014, at 7:50 AM, Matthew Farrellee m...@redhat.com wrote:

> andrew, what about having swift:// which defaults to the configured tenant and auth url for what we now call swift-internal, and we allow for user input to change tenant and auth url for what would be swift-external?

I like this idea; then swift-internal/swift-external becomes unnecessary. In general, doing anything outside of the existing tenant is frowned upon, at least by existing customers that we're engaged with.

> in fact, we may need to add the tenant selection in icehouse. it's a pretty big limitation to only allow a single tenant.
>
> best,
> matt
>
> On 01/23/2014 11:15 PM, Andrew Lazarev wrote:
>> Matt,
>>
>> For swift-internal we are using the same keystone (and identity protocol version) as for savanna. Also, the savanna admin tenant is used.
>>
>> Thanks,
>> Andrew.
>>
>> On Thu, Jan 23, 2014 at 6:17 PM, Matthew Farrellee m...@redhat.com wrote:
>>> what makes it internal vs external?
>>>
>>> swift-internal needs user + pass
>>> swift-external needs user + pass + auth url?
>>>
>>> best,
>>> matt
>>>
>>> On 01/23/2014 08:43 PM, Andrew Lazarev wrote:
>>>> Matt,
>>>>
>>>> I can easily imagine a situation when job binaries are stored in an external HDFS or an external Swift (like data sources). Internal and external swifts are different since we need additional credentials.
>>>>
>>>> Thanks,
>>>> Andrew.
>>>>
>>>> On Thu, Jan 23, 2014 at 5:30 PM, Matthew Farrellee m...@redhat.com wrote:
>>>>> trevor,
>>>>>
>>>>> job binaries are stored in swift or an internal savanna db, represented by swift-internal:// and savanna-db:// respectively.
>>>>>
>>>>> why swift-internal:// and not just swift://?
>>>>>
>>>>> fyi, i see mention of a potential future version of savanna w/ swift-external://
>>>>>
>>>>> best,
>>>>> matt

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
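[Editor's note] The proposal in this thread - a single swift:// scheme whose tenant and auth URL default to the values configured for savanna (the old swift-internal), with optional per-reference overrides replacing swift-external - could look roughly like this. The query-parameter convention, field names, and default values are all hypothetical, shown only to make the idea concrete:

```python
# Hypothetical sketch of a unified swift:// job-binary/data-source scheme.
# Defaults stand in for savanna's configured admin tenant and keystone URL.
from urllib.parse import urlparse, parse_qs

DEFAULTS = {"tenant": "savanna-admin", "auth_url": "http://keystone:5000/v2.0"}

def resolve_swift_ref(url):
    """Split swift://container/path?tenant=...&auth_url=... into its parts,
    falling back to the configured defaults (the old 'swift-internal')."""
    parsed = urlparse(url)
    if parsed.scheme != "swift":
        raise ValueError("not a swift:// reference: %s" % url)
    query = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    return {
        "container": parsed.netloc,
        "object": parsed.path.lstrip("/"),
        "tenant": query.get("tenant", DEFAULTS["tenant"]),
        "auth_url": query.get("auth_url", DEFAULTS["auth_url"]),
    }

# No overrides: behaves like swift-internal.
internal = resolve_swift_ref("swift://jars/wordcount.jar")
# Per-reference overrides: behaves like swift-external.
external = resolve_swift_ref(
    "swift://jars/wordcount.jar?tenant=other&auth_url=http://ext:5000/v2.0")
```

With a scheme like this, internal vs. external stops being a property of the URL prefix and becomes a property of which credentials accompany the reference.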
[openstack-dev] [Savanna] DiskBuilder / savanna-image-elements
Team -

We'd like to move our disk creation mechanism over to using diskimage-builder so that users can build (and modify) their own VM images. We'd like to piggyback off of the existing mechanism that the vanilla plugin uses. It looks like I should be able to add image elements to the savanna-image-elements/elements directory for the HDP-specific setup. Any concerns with this approach?

From an organizational standpoint, it would make sense to separate elements from the various plugins by creating a directory hierarchy under the elements directory, i.e. a directory for each plugin to keep them separated.

Thanks,
Erik
Re: [openstack-dev] [Savanna] Savanna on Bare Metal and Base Requirements
Travis,

Sounds like your environment should work fine. There are no special requirements beyond Grizzly/Havana. Following the installation guide at https://savanna.readthedocs.org/en/latest/ is relatively straightforward. If you are using neutron networking, you will need to rely on public IPs until we complete an enhancement for working around this limitation.

Erik

On Oct 25, 2013, at 12:46 AM, Tripp, Travis S travis.tr...@hp.com wrote:

> Hello Savanna team,
>
> I've just skimmed through the online documentation and I'm very interested in this project. We have a Grizzly environment with all the latest patches as well as several Havana backports applied. We are doing bare metal provisioning through Nova. It is limited to flat networking. Would Savanna work in this environment? What are the requirements? What is the minimum set of API calls that need to be supported (for example, we can't support snapshots)?
>
> Thank you,
> Travis
Re: [openstack-dev] [Savanna] Savanna on Bare Metal and Base Requirements
On Oct 25, 2013, at 9:22 AM, Dmitry Mescheryakov dmescherya...@mirantis.com wrote:

> Hello Travis,
>
> We didn't research Savanna on bare metal, though we considered it some time ago. I know little of bare metal provisioning, so I am rather unsure what problems you might experience.
>
> My main concern is images: does bare metal provisioning work with qcow2 images? The vanilla plugin (which installs vanilla Apache Hadoop) requires a pre-built Linux image with Hadoop, so if qcow2 does not work for bare metal, you will need to somehow build images in the required format. On the other hand, the HDP plugin (which installs the Hortonworks Data Platform) does not require pre-built images, but works only on Red Hat OSes, as far as I know.

The HDP plugin will support SUSE and Debian in the future, but for now HDP only provides pre-built CentOS images.

> Another concern: does bare metal support cloud-init? Savanna relies on it, and reimplementing that functionality some other way might take some time.
>
> As for your concern about which API calls Savanna makes: it is a pretty small list of requests. Mainly authentication with keystone, basic operations with VMs via nova (create, list, terminate), and basic operations with images (list, set/get attributes). Snapshots are not used. That is for basic functionality. Other than that, some features might require additional API calls. For instance, Cinder support naturally requires calls for volume create/list/delete.
>
> Thanks,
> Dmitry
>
> 2013/10/25 Tripp, Travis S travis.tr...@hp.com:
>> Hello Savanna team,
>>
>> I've just skimmed through the online documentation and I'm very interested in this project. We have a Grizzly environment with all the latest patches as well as several Havana backports applied. We are doing bare metal provisioning through Nova. It is limited to flat networking. Would Savanna work in this environment? What are the requirements? What is the minimum set of API calls that need to be supported (for example, we can't support snapshots)?
>>
>> Thank you,
>> Travis
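[Editor's note] Dmitry's list of required calls can be summarized as a small abstract interface. This is an editor's sketch, not Savanna's actual provisioning code; the class and method names are hypothetical:

```python
# Sketch (illustrative names only) of the minimal compute API surface
# described above: authentication aside, basic server and image operations
# are all that basic Savanna functionality needs. There is deliberately
# no snapshot call -- snapshots are not used.
from abc import ABC, abstractmethod

class MinimalComputeAPI(ABC):
    """Calls a compute backend (Nova, or bare metal behind the Nova API)
    must support for basic cluster provisioning."""

    @abstractmethod
    def create_server(self, name, image_id, flavor_id):
        """Boot one instance; returns a server id."""

    @abstractmethod
    def list_servers(self):
        """Enumerate instances owned by the tenant."""

    @abstractmethod
    def terminate_server(self, server_id):
        """Delete an instance (cluster scale-down/teardown)."""

    @abstractmethod
    def list_images(self):
        """Enumerate bootable images."""

    @abstractmethod
    def get_image_attrs(self, image_id):
        """Read image attributes (e.g. which plugin an image suits)."""

    @abstractmethod
    def set_image_attrs(self, image_id, attrs):
        """Tag an image for use by a particular plugin/version."""
```

Features beyond the basics (e.g. Cinder volumes) would extend this surface with the corresponding volume create/list/delete calls.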
[openstack-dev] [savanna] Savanna usability
I want to start a dialog around adding some usability features to Savanna, now that I have had a chance to spend a fair amount of time provisioning, scaling and changing clusters. Here is a list of items that I believe are important to address; comments welcomed:

1. Changing an OpenStack flavor associated with a node-group template has a ripple effect on both node-group and cluster templates. The reason for this is that OpenStack does not support modification of flavors; when a flavor is modified (RAM, CPU, root disk, etc.), the flavor is deleted and a new one is created, resulting in a new flavor id. The implication is that both node-groups referencing the flavor [id] and any cluster templates referencing the affected node-group become stale and unusable. A user then has to start from scratch, creating new node-groups and cluster templates for a simple flavor change.

   a. A possible solution is to internally associate the flavor name with the node-group and look up the flavor id by name when provisioning instances.

   b. At a minimum it should be possible to change the flavor id associated with a node-group. See #2.

2. Cluster templates and node-group templates are immutable. This is more of an issue at the node-group level, as I often want to make changes to a node-group and have them affect all cluster templates that make use of that node-group. I see this as being fairly commonplace.

3. Before provisioning a cluster, quotas should be checked to make sure that enough quota exists. I know this can't be done transactionally (check quota, then spawn the cluster), but a basic check would go a long way.

4. Spawning a large cluster comes with some problems today, as Savanna will abort if a single VM fails. In deploying large clusters (hundreds to thousands of nodes), which will be commonplace, having a single slave VM (i.e. a data node) fail to spawn should not necessarily be a reason to abort the entire deployment, particularly in a scaling operation. Obviously the failing VM cannot host a master service. This applies both to the plugin and to the controller, as they are both involved in the deployment. I could see a possible solution being the creation of an error/fault policy allowing a user to specify (perhaps optionally) a percentage or hard number for the minimum number of nodes that need to come up without aborting the deployment.

   a. This also applies to scaling the cluster by larger increments.

Just some thoughts based on my experience last week; comments welcomed.

Best,
Erik
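[Editor's note] The error/fault policy suggested in item 4 could be as simple as a threshold check after the provisioning pass. A minimal sketch, assuming a hypothetical policy of "any failed master is fatal; workers may fall short down to a configurable floor":

```python
# Sketch of the suggested fault policy; the policy shape (a fraction OR an
# absolute minimum, any failed master fatal) is hypothetical.

def deployment_acceptable(requested, started, failed_masters,
                          min_fraction=0.95, min_count=None):
    """Decide whether to keep a possibly-partial cluster deployment.

    A failed master/coordinator node is always fatal; a worker shortfall
    is tolerated down to a configurable floor.
    """
    if failed_masters:
        return False
    floor = min_count if min_count is not None else int(requested * min_fraction)
    return started >= floor

# 996 of 1000 requested workers up, no master failures: keep the cluster.
assert deployment_acceptable(1000, 996, failed_masters=[])
# Too many workers missing: abort.
assert not deployment_acceptable(1000, 900, failed_masters=[])
# A failed master always aborts, regardless of worker count.
assert not deployment_acceptable(10, 10, failed_masters=["namenode"])
```

The same check would apply to scaling operations (item 4a), with `requested` counting only the nodes added in that operation.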
Re: [openstack-dev] [savanna] Program name and Mission statement
On Sep 11, 2013, at 9:19 AM, Jon Maron jma...@hortonworks.com wrote:

> On Sep 10, 2013, at 9:42 PM, Mike Spreitzer mspre...@us.ibm.com wrote:
>
>> Jon Maron jma...@hortonworks.com wrote on 09/10/2013 08:50:23 PM:
>>
>>> Openstack Big Data Platform
>>
>> Let's see if you mean that. Does this project aim to cover big data things besides MapReduce? Can you give examples of other things that are in scope?
>>
>> Thanks,
>> Mike
>
> Hive, Pig, data storage, oozie etc

Adding a few items that are on the list, including YARN, Sqoop/2, HBase, and Hue. Other vendors will likely want to add additional services that pertain to their Hadoop distro, i.e. SOLR, Impala, etc.
Re: [openstack-dev] [savanna] Program name and Mission statement
On Sep 10, 2013, at 8:50 PM, Jon Maron jma...@hortonworks.com wrote:

> Openstack Big Data Platform

On Sep 10, 2013, at 8:39 PM, David Scott david.sc...@cloudscaling.com wrote:

> I vote for 'Open Stack Data'

On Tue, Sep 10, 2013 at 5:30 PM, Zhongyue Luo zhongyue@intel.com wrote:

> Why not OpenStack MapReduce? I think that pretty much says it all?

On Wed, Sep 11, 2013 at 3:54 AM, Glen Campbell g...@glenc.io wrote:

> "performant" isn't a word. Or, if it is, it means "having performance." I think you mean "high-performance."

On Tue, Sep 10, 2013 at 8:47 AM, Matthew Farrellee m...@redhat.com wrote:

> Rough cut -
>
> Program: OpenStack Data Processing
>
> Mission: To provide the OpenStack community with an open, cutting edge, performant and scalable data processing stack and associated management interfaces.

Proposing a slightly different mission: To provide a simple, reliable and repeatable mechanism by which to deploy Hadoop and related Big Data projects, including management, monitoring and processing mechanisms, driving further adoption of OpenStack.

On 09/10/2013 09:26 AM, Sergey Lukjanov wrote:

> It sounds too broad IMO. Looks like we need to define the Mission Statement first.
>
> Sincerely yours,
> Sergey Lukjanov
> Savanna Technical Lead
> Mirantis Inc.

On Sep 10, 2013, at 17:09, Alexander Kuznetsov akuznet...@mirantis.com wrote:

> My suggestion: OpenStack Data Processing.

On Tue, Sep 10, 2013 at 4:15 PM, Sergey Lukjanov slukja...@mirantis.com wrote:

> Hi folks,
>
> due to the Incubator Application we should prepare a Program name and Mission statement for Savanna, so I want to start a mailing thread about it. Please provide any ideas here.
>
> P.S. List of existing programs: https://wiki.openstack.org/wiki/Programs
> P.P.S. https://wiki.openstack.org/wiki/Governance/NewPrograms
>
> Sincerely yours,
> Sergey Lukjanov
> Savanna Technical Lead
> Mirantis Inc.