Re: [openstack-dev] [cyborg] [nova] Cyborg quotas

2018-05-21 Thread Blair Bethwaite
(Please excuse the top-posting)

The other possibility is that the Cyborg managed devices are plumbed in via
IP in guest network space. Then "attach" isn't so much a Nova problem as a
Neutron one - probably similar to Manila.

Has the Cyborg team considered a RESTful-API proxy driver, i.e., something
that wraps a vendor-specific accelerator service and makes it friendly to a
multi-tenant OpenStack cloud? Quantum co-processors might be a compelling
example that fits this model.

Cheers,

On Sun., 20 May 2018, 23:28 Chris Friesen, <chris.frie...@windriver.com>
wrote:

> On 05/19/2018 05:58 PM, Blair Bethwaite wrote:
> > G'day Jay,
> >
> > On 20 May 2018 at 08:37, Jay Pipes <jaypi...@gmail.com> wrote:
> >> If it's not the VM or baremetal machine that is using the accelerator,
> what
> >> is?
> >
> > It will be a VM or BM, but I don't think accelerators should be tied
> > to the life of a single instance if that isn't technically necessary
> > (i.e., they are hot-pluggable devices). I can see plenty of scope for
> > use-cases where Cyborg is managing devices that are accessible to
> > compute infrastructure via network/fabric (e.g. rCUDA or dedicated
> > PCIe fabric). And even in the simple pci passthrough case (vfio or
> > mdev) it isn't hard to imagine use-cases for workloads that only need
> > an accelerator sometimes.
>
> Currently nova only supports attach/detach of volumes and network
> interfaces.
> Is Cyborg looking to implement new Compute API operations to support hot
> attach/detach of various types of accelerators?
>
> Chris
>
>
>
>


Re: [openstack-dev] [cyborg] [nova] Cyborg quotas

2018-05-19 Thread Blair Bethwaite
G'day Jay,

On 20 May 2018 at 08:37, Jay Pipes  wrote:
> If it's not the VM or baremetal machine that is using the accelerator, what
> is?

It will be a VM or BM, but I don't think accelerators should be tied
to the life of a single instance if that isn't technically necessary
(i.e., they are hot-pluggable devices). I can see plenty of scope for
use-cases where Cyborg is managing devices that are accessible to
compute infrastructure via network/fabric (e.g. rCUDA or dedicated
PCIe fabric). And even in the simple pci passthrough case (vfio or
mdev) it isn't hard to imagine use-cases for workloads that only need
an accelerator sometimes.

-- 
Cheers,
~Blairo



Re: [openstack-dev] [cyborg] [nova] Cyborg quotas

2018-05-19 Thread Blair Bethwaite
Relatively Cyborg-naive question here...

I thought Cyborg was going to support a hot-plug model. So I certainly
hope it is not the expectation that accelerators will be encoded into
Nova flavors? That will severely limit its usefulness.

On 19 May 2018 at 23:30, Jay Pipes  wrote:
> On 05/18/2018 07:58 AM, Nadathur, Sundar wrote:
>>
>> Agreed. Not sure how other projects handle it, but here's the situation
>> for Cyborg. A request may get scheduled on a compute node with no
>> intervention by Cyborg. So, the earliest check that can be made today is in
>> the selected compute node. A simple approach can result in quota violations
>> as in this example.
>>
>> Say there are 5 devices in a cluster. A tenant has a quota of 4 and
>> is currently using 3. That leaves 2 unused devices, of which the
>> tenant is permitted to use only one. But he may submit two
>> concurrent requests, and they may land on two different compute
>> nodes. The Cyborg agent in each node will see the current tenant
>> usage as 3 and let the request go through, resulting in quota
>> violation.
>
>>
>>
>> To prevent this, we need some kind of atomic update, like SQLAlchemy's
>> with_lockmode():
>>
>> https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy#Pessimistic_Locking_-_SELECT_FOR_UPDATE
>> That seems to have issues, as documented in the link above. Also, since
>> every compute node does that, it would also serialize the bringup of all
>> instances with accelerators, across the cluster.
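As an aside, the pessimistic-locking pattern being referred to there looks
roughly like the following in SQLAlchemy; a minimal sketch against a
hypothetical DeviceUsage table (with_for_update() is the current spelling of
the older with_lockmode()):

    # Minimal sketch (not from the thread): SELECT ... FOR UPDATE quota
    # check against a hypothetical per-tenant DeviceUsage table.
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class DeviceUsage(Base):
        __tablename__ = 'device_usage'
        tenant_id = Column(String(64), primary_key=True)
        used = Column(Integer, nullable=False, default=0)
        quota = Column(Integer, nullable=False)

    def claim_device(session: Session, tenant_id: str) -> bool:
        # FOR UPDATE holds the row lock until the enclosing transaction
        # commits, so two concurrent claims cannot both see used == 3.
        usage = (session.query(DeviceUsage)
                 .filter_by(tenant_id=tenant_id)
                 .with_for_update()
                 .one())
        if usage.used >= usage.quota:
            return False
        usage.used += 1
        return True

The caveat above still applies: every compute node doing this against a
central database serializes accelerator bring-up across the cluster.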
>
>>
>>
>> If there is a better solution, I'll be happy to hear it.
>
>
> The solution is to implement the following two specs:
>
> https://review.openstack.org/#/c/509042/
> https://review.openstack.org/#/c/569011/
>
> The problem of consuming more resources than a user/project has quota for is
> not a new problem. Users have been able to go over their quota in all of the
> services for as long as I can remember -- they can do this by essentially
> DDoS'ing the API with lots of concurrent single-instance build requests [1]
> all at once. The tenant then ends up in an over-quota situation and is
> essentially unable to do anything at all before deleting resources.
>
> The only operators that I can remember that complained about this issue were
> the public cloud operators -- and rightfully so since quota abuse in public
> clouds meant their reputation for fairness might be questioned. Most
> operators I know of solved this problem by addressing *rate-limiting*, which
> is not the same as quota limits. By rate-limiting requests to the APIs, the
> operators were able to alleviate the problem by addressing a symptom, which
> was that high rates of concurrent requests could lead to over-quota
> situations.
>
> Nobody is using Cyborg separately from Nova at the moment (or ever?). It's
> not as if a user will be consuming an accelerator outside of a Nova instance
> -- since it is the Nova instance that is the workload that uses the
> accelerator.
>
> That means that Cyborg resources should be treated as just another resource
> class whose usage should be checked in a single query to the /usages
> placement API endpoint before attempting to spawn the instance (again, via
> Nova) that ends up consuming those resources.
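For illustration, that single usage check is just a GET against placement's
/usages endpoint; a rough client-side sketch (the endpoint URL, token,
resource class name and quota value are placeholders):

    # Rough sketch: check a project's current usage in Placement before
    # asking Nova to spawn.  URL, token, resource class and quota are
    # placeholders.
    import requests

    PLACEMENT = 'http://placement.example.com/placement'
    HEADERS = {
        'X-Auth-Token': 'gAAAA...',
        # GET /usages?project_id=... needs placement microversion >= 1.9
        'OpenStack-API-Version': 'placement 1.9',
    }

    def project_usages(project_id):
        resp = requests.get(PLACEMENT + '/usages',
                            params={'project_id': project_id},
                            headers=HEADERS)
        resp.raise_for_status()
        return resp.json()['usages']   # e.g. {'VCPU': 8, 'CUSTOM_FPGA': 3}

    usages = project_usages('5e3bd3f1dc2c4e61a51e0f0f6a123456')
    if usages.get('CUSTOM_FPGA', 0) >= 4:       # 4 == the tenant's quota
        raise RuntimeError('accelerator quota exhausted, not spawning')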
>
> The claiming of all resources that are consumed by a Nova instance (which
> would include any accelerator resources) is an atomic operation that
> prevents over-allocation of any provider involved in the claim transaction.
> [2]
>
> This atomic operation in Nova/Placement *significantly* cuts down on the
> chances of a user/project exceeding its quota because it reduces the amount
> of time to get an accurate read of the resource usage to a very small amount
> of time (from seconds/tens of seconds to milliseconds).
>
> So, to sum up, my recommendation is to get involved in the two Nova specs
> above and help to see them to completion in Rocky. Doing so will free Cyborg
> developers up to focus on integration with the virt driver layer via the
> os-acc library, implementing the update_provider_tree() interface, and
> coming up with some standard resource classes for describing accelerated
> resources.
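By way of illustration, a driver reporting accelerator inventory through
update_provider_tree() can be quite small; in the sketch below the resource
class name, the counting helper and the inventory numbers are all made up:

    # Hedged sketch of update_provider_tree() reporting accelerator
    # inventory; resource class name and device counting are illustrative.
    CUSTOM_FPGA = 'CUSTOM_ACCELERATOR_FPGA'

    class AcceleratorReportingDriver(object):

        def _count_local_accelerators(self):
            # Placeholder for however the agent enumerates local devices.
            return 2

        def update_provider_tree(self, provider_tree, nodename,
                                 allocations=None):
            # update_inventory() replaces the named provider's inventory.
            provider_tree.update_inventory(nodename, {
                CUSTOM_FPGA: {
                    'total': self._count_local_accelerators(),
                    'reserved': 0,
                    'min_unit': 1,
                    'max_unit': 1,
                    'step_size': 1,
                    'allocation_ratio': 1.0,
                },
            })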
>
> Best,
> -jay
>
> [1] I'm explicitly calling out multiple concurrent single build requests
> here, since a build request for multiple instances is actually not a cause
> of over-quota because the entire set of requested instances is considered as
> a single unit for usage calculation.
>
> [2] technically, NUMA topology resources and PCI devices do not currently
> participate in this single claim transaction. This is not ideal, and is
> something we are actively working on addressing. Keep in mind there are also
> no quota classes for PCI devices or NUMA topologies, though, so the
> over-quota problems don't exist for those resource classes.
>
>

Re: [openstack-dev] [nova] about rebuild instance booted from volume

2018-03-14 Thread Blair Bethwaite
Please do not default to deleting it; otherwise someone will eventually be
back here asking why an irate user has just lost data. The better scenario
is for the rebuild to fail (early, before any impact to the running
instance) with a quota error.

Cheers,

On Thu., 15 Mar. 2018, 00:46 Matt Riedemann,  wrote:

> On 3/14/2018 3:42 AM, 李杰 wrote:
> >
> > This is the spec about rebuilding an instance booted from volume. In the
> > spec, there is a question about whether we should delete the old
> > root_volume. Anyone who is interested in boot-from-volume can help to
> > review this. Any suggestion is welcome. Thank you!
> > The link is here.
> > Re: the rebuild spec: https://review.openstack.org/#/c/532407/
>
> Copying the operators list and giving some more context.
>
> This spec is proposing to add support for rebuild with a new image for
> volume-backed servers, which today is just a 400 failure in the API
> since the compute doesn't support that scenario.
>
> With the proposed solution, the backing root volume would be deleted and
> a new volume would be created from the new image, similar to how boot
> from volume works.
>
> The question raised in the spec is whether or not nova should delete the
> root volume even if its delete_on_termination flag is set to False. The
> semantics get a bit weird here since that flag was not meant for this
> scenario, it's meant to be used when deleting the server to which the
> volume is attached. Rebuilding a server is not deleting it, but we would
> need to replace the root volume, so what do we do with the volume we're
> replacing?
>
> Do we say that delete_on_termination only applies to deleting a server
> and not rebuild and therefore nova can delete the root volume during a
> rebuild?
>
> If we don't delete the volume during rebuild, we could end up leaving a
> lot of volumes lying around that the user then has to clean up,
> otherwise they'll eventually go over quota.
>
> We need user (and operator) feedback on this issue and what they would
> expect to happen.
>
> --
>
> Thanks,
>
> Matt
>
>


Re: [openstack-dev] Mixed service version CI testing

2017-12-27 Thread Blair Bethwaite
+1!

It may also be worth testing a step where Nova & Neutron remain at N-1.

On 20 December 2017 at 04:58, Matt Riedemann  wrote:
> During discussion in the TC channel today [1], we got talking about how
> there is a perception that you must upgrade all of the services together for
> anything to work, at least the 'core' services like
> keystone/nova/cinder/neutron/glance - although maybe that's really just
> nova/cinder/neutron?
>
> Anyway, I posit that the services are not as tightly coupled as some people
> assume they are, at least not since kilo era when microversions started
> happening in nova.
>
> However, with the way we do CI testing, and release everything together, the
> perception is there that all things must go together to work.
>
> In our current upgrade job, we upgrade everything to N except the
> nova-compute service, that remains at N-1 to test rolling upgrades of your
> computes and to make sure guests are unaffected by the upgrade of the
> control plane.
>
> I asked if it would be valuable to our users (mostly ops for this right?) if
> we had an upgrade job where everything *except* nova were upgraded. If
> that's how the majority of people are doing upgrades anyway it seems we
> should make sure that works.
>
> I figure leaving nova at N-1 makes more sense because nova depends on the
> other services (keystone/glance/cinder/neutron) and is likely the harder /
> slower upgrade if you're going to do rolling upgrades of your compute nodes.
>
> This type of job would not run on nova changes on the master branch, since
> those changes would not be exercised in this type of environment. So we'd
> run this on master branch changes to
> keystone/cinder/glance/neutron/trove/designate/etc.
>
> Does that make sense? Would this be valuable at all? Or should the opposite
> be tested where we upgrade nova to N and leave all of the dependent services
> at N-1?
>
> Really looking for operator community feedback here.
>
> [1]
> http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2017-12-19.log.html#t2017-12-19T15:14:15
>
> --
>
> Thanks,
>
> Matt
>



-- 
Cheers,
~Blairo



Re: [openstack-dev] [all] Switching to longer development cycles

2017-12-13 Thread Blair Bethwaite
The former - we're running Cells so only have a single region currently
(except for Swift where we have multiple proxy endpoints around the
country, all backed by a global cluster, but they have to be different
regions to put them all in the service catalog). See
https://trello.com/b/9fkuT1eU/nectar-openstack-versions for the current
version snapshot.

On 14 Dec. 2017 18:00, "Clint Byrum"  wrote:

> Excerpts from Blair Bethwaite's message of 2017-12-14 17:44:53 +1100:
> > On 14 December 2017 at 17:36, Clint Byrum  wrote:
> > > The batch size for "upgrade the whole cloud" is too big. Let's help our
> > > users advance components one at a time, and then we won't have to worry
> > > so much about doing the whole integrated release dance so often.
> >
> > Is there any data about how operators approach this currently? Nectar
> > (and I presume other large and/or loosely coordinated OpenStack
> > clouds) has been running different projects across multiple versions
> > for quite a while, sometimes 3 or 4 different versions. Coordinating
> > upgrades in a federated cloud with distributed operations requires
> > that we do this, e.g., our current Nova Newton upgrade has probably
> > been in-train for a couple of months now.
> >
>
> That's interesting. Can you share what you mean by running 3 or 4
> different versions?
>
> Do you mean you mix versions in a single region, like, Pike keystone,
> Ocata Nova, and Newton Neutron? Or do you mean you might have a region
> running Pike, and another running Ocata, and another running Newton?
>
>


Re: [openstack-dev] [all] Switching to longer development cycles

2017-12-13 Thread Blair Bethwaite
On 14 December 2017 at 17:36, Clint Byrum  wrote:
> The batch size for "upgrade the whole cloud" is too big. Let's help our
> users advance components one at a time, and then we won't have to worry
> so much about doing the whole integrated release dance so often.

Is there any data about how operators approach this currently? Nectar
(and I presume other large and/or loosely coordinated OpenStack
clouds) has been running different projects across multiple versions
for quite a while, sometimes 3 or 4 different versions. Coordinating
upgrades in a federated cloud with distributed operations requires
that we do this, e.g., our current Nova Newton upgrade has probably
been in-train for a couple of months now.

-- 
Cheers,
~Blairo



Re: [openstack-dev] Upstream LTS Releases

2017-11-14 Thread Blair Bethwaite
Hi all - please note this conversation has been split variously across
-dev and -operators.

One small observation from the discussion so far is that it seems as
though there are two issues being discussed under the one banner:
1) maintain old releases for longer
2) do stable releases less frequently

It would be interesting to understand if the people who want longer
maintenance windows would be helped by #2.

On 14 November 2017 at 09:25, Doug Hellmann  wrote:
> Excerpts from Bogdan Dobrelya's message of 2017-11-14 17:08:31 +0100:
>> >> The concept, in general, is to create a new set of cores from these
>> >> groups, and use 3rd party CI to validate patches. There are lots of
>> >> details to be worked out yet, but our amazing UC (User Committee) will
>> >> be begin working out the details.
>> >
>> > What is the most worrying is the exact "take over" process. Does it mean 
>> > that
>> > the teams will give away the +2 power to a different team? Or will our 
>> > (small)
>> > stable teams still be responsible for landing changes? If so, will they 
>> > have to
>> > learn how to debug 3rd party CI jobs?
>> >
>> > Generally, I'm scared of both overloading the teams and losing the control 
>> > over
>> > quality at the same time :) Probably the final proposal will clarify it..
>>
>> The quality of backported fixes is expected to be a direct (and only?)
>> interest of those new teams of new cores, coming from users and
>> operators and vendors. The more parties to establish their 3rd party
>
> We have an unhealthy focus on "3rd party" jobs in this discussion. We
> should not assume that they are needed or will be present. They may be,
> but we shouldn't build policy around the assumption that they will. Why
> would we have third-party jobs on an old branch that we don't have on
> master, for instance?
>
>> checking jobs, the better the proposed changes are communicated, which
>> directly affects the quality in the end. I also suppose contributors from
>> the ops world will likely only be pushing to see things get fixed, and not
>> new features adopted by the legacy deployments they're used to maintaining.
>> So in theory this works, and as a mainstream developer and maintainer
>> you need not fear losing control over LTS code :)
>>
>> Another question is how to avoid blocking everyone on each other, and not
>> push contributors away when things go awry: jobs failing, merging blocked
>> for a long time, or no consensus reached in a code review. I propose the
>> LTS policy enforce that CI jobs be non-voting, as a first step on that
>> path, and maybe give every LTS team member core rights? Not sure if that
>> works though.
>
> I'm not sure what change you're proposing for CI jobs and their voting
> status. Do you mean we should make the jobs non-voting as soon as the
> branch passes out of the stable support period?
>
> Regarding the review team, anyone on the review team for a branch
> that goes out of stable support will need to have +2 rights in that
> branch. Otherwise there's no point in saying that they're maintaining
> the branch.
>
> Doug
>



-- 
Cheers,
~Blairo



Re: [openstack-dev] [Openstack-operators] Upstream LTS Releases

2017-11-10 Thread Blair Bethwaite
I missed this session but the discussion strikes a chord, as this is
something I've been saying in my user survey every 6 months.

On 11 November 2017 at 09:51, John Dickinson  wrote:
> What I heard from ops in the room is that they want (to start) one release a
> year whose branch isn't deleted after a year. What if that's exactly what we
> did? I propose that OpenStack only do one release a year instead of two. We
> still keep N-2 stable releases around. We still do backports to all open
> stable branches. We still do all the things we're doing now, we just do it
> once a year instead of twice.

+1

-- 
Cheers,
~Blairo



Re: [openstack-dev] Supporting SSH host certificates

2017-10-05 Thread Blair Bethwaite
A related bug that hasn't seen any love for some time:
https://bugs.launchpad.net/nova/+bug/1613199

On 6 October 2017 at 07:47, James Penick  wrote:

> Hey Pino,
>
> mriedem pointed me to the vendordata code [1] which shows some fields are
> passed (such as project ID) and that SSL is supported. So that's good.
>
> The docs on vendordata suck. But I think it'll do what you're looking for.
> Michael Still wrote up a helpful post titled "Nova vendordata deployment,
> an excessively detailed guide"[2] and he's written a vendordata service
> example[3] which even shows keystone integration.
>
> At Oath, we have a system that provides a unique x509 certificate for each
> host, including the ability to sign host SSH keys against an HSM. In our
> case what we do is have Nova call the service, which generates and returns
> a signed (and time limited) host bootstrap document, which is injected into
> the instance. When the instance boots it calls our identity service and
> provides its bootstrap document as a bearer certificate. The identity
> service trusts this one-time document to attest the instance, and will then
> provide an x509 certificate as well as sign the hosts SSH keys. After the
> initial bootstrap the host will rotate its keys frequently, by providing
> its last certificate in exchange for a new one. The service tracks all host
> document and certificate IDs which have been exchanged until their expiry,
> so that a cert cannot be re-used.
>
> This infrastructure relies on Athenz [4] as the AuthNG system for all
> principals (users, services, roles, domains, etc) as well as an internal
> signatory service which signs x509 certificates and SSH host keys using an
> HSM infrastructure.
>
> Instead, you could write a vendordata service which, when called, would
> generate an ssh host keypair, sign it, and return those files as encoded
> data, which can be expanded into files in the correct locations on first
> boot. I strongly suggest not only using keystone auth, but that you
> ensure all calls from vendordata to the microservice are encrypted with TLS
> mutual auth.
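A skeletal dynamic-vendordata responder along those lines might look like the
sketch below (assumptions: Flask for the HTTP layer, ssh-keygen against a CA
key on disk rather than an HSM, and the 'hostname' field from the JSON
document nova POSTs; see [2]/[3] below for the real plumbing):

    # Skeletal dynamic-vendordata responder (sketch only).
    import subprocess
    import tempfile

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    CA_KEY = '/etc/ssh-ca/host_ca'   # placeholder path to the signing key

    @app.route('/', methods=['POST'])
    def vendordata():
        payload = request.get_json(force=True)
        hostname = payload['hostname']
        with tempfile.TemporaryDirectory() as tmp:
            keyfile = tmp + '/ssh_host_ed25519_key'
            # Generate a host keypair, then sign the public half as a
            # host certificate valid for 52 weeks.
            subprocess.run(['ssh-keygen', '-q', '-t', 'ed25519', '-N', '',
                            '-f', keyfile], check=True)
            subprocess.run(['ssh-keygen', '-s', CA_KEY, '-I', hostname,
                            '-h', '-n', hostname, '-V', '+52w',
                            keyfile + '.pub'], check=True)
            with open(keyfile) as key, open(keyfile + '-cert.pub') as cert:
                result = {'ssh_host_key': key.read(),
                          'ssh_host_cert': cert.read()}
        # Whatever is returned here ends up in the instance's
        # vendor_data2.json under the name configured in nova.
        return jsonify(result)

The instance-side piece is then just cloud-init (or similar) writing those
strings into /etc/ssh and restarting sshd.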
>
> -James
>
>
> 1: https://github.com/openstack/nova/blob/master/nova/api/
> metadata/vendordata_dynamic.py#L77
> 2: https://www.stillhq.com/openstack/22.html
> 3: https://github.com/mikalstill/vendordata
> 4: https://athenz.io
>
>
> On Fri, Sep 29, 2017 at 5:17 PM, Fox, Kevin M  wrote:
>
>> https://review.openstack.org/#/c/93/
>> --
>> *From:* Giuseppe de Candia [giuseppe.decan...@gmail.com]
>> *Sent:* Friday, September 29, 2017 1:05 PM
>> *To:* OpenStack Development Mailing List (not for usage questions)
>> *Subject:* Re: [openstack-dev] Supporting SSH host certificates
>>
>> Ihar, thanks for pointing that out - I'll definitely take a close look.
>>
>> Jon, I'm not very familiar with Barbican, but I did assume the full
>> implementation would use Barbican to store private keys. However, in terms
>> of actually getting a private key (or SSH host cert) into a VM instance,
>> Barbican doesn't help. The instance needs permission to access secrets
>> stored in Barbican. The main question of my e-mail is: how do you inject a
>> credential in an automated but secure way? I'd love to hear ideas - in the
>> meantime I'll study Ihar's link.
>>
>> thanks,
>> Pino
>>
>>
>>
>> On Fri, Sep 29, 2017 at 2:49 PM, Ihar Hrachyshka 
>> wrote:
>>
>>> What you describe (at least the use case) seems to resemble
>>> https://review.openstack.org/#/c/456394/ This work never moved
>>> anywhere since the spec was posted though. You may want to revive the
>>> discussion in scope of the spec.
>>>
>>> Ihar
>>>
>>> On Fri, Sep 29, 2017 at 12:21 PM, Giuseppe de Candia
>>>  wrote:
>>> > Hi Folks,
>>> >
>>> >
>>> >
>>> > My intent in this e-mail is to solicit advice for how to inject SSH
>>> host
>>> > certificates into VM instances, with minimal or no burden on users.
>>> >
>>> >
>>> >
>>> > Background (skip if you're already familiar with SSH certificates):
>>> without
>>> > host certificates, when clients ssh to a host for the first time (or
>>> after
>>> > the host has been re-installed), they have to hope that there's no man
>>> in
>>> > the middle and that the public key being presented actually belongs to
>>> the
>>> > host they're trying to reach. The host's public key is stored in the
>>> > client's known_hosts file. SSH host certificates eliminate the
>>> possibility of
>>> > Man-in-the-Middle attack: a Certificate Authority public key is
>>> distributed
>>> > to clients (and written to their known_hosts file with a special
>>> syntax and
>>> > options); the host public key is signed by the CA, generating an SSH
>>> > certificate that contains the hostname and validity period (among other
>>> > things). When negotiating the ssh connection, the host presents its
>>> SSH host
>>> > certificate and the client verifies that it was signed 

Re: [openstack-dev] vGPUs support for Nova - Implementation

2017-10-02 Thread Blair Bethwaite
On 29 September 2017 at 22:26, Bob Ball  wrote:
> The concepts of PCI and SR-IOV are, of course, generic, but I think out of 
> principle we should avoid a hypervisor-specific integration for vGPU (indeed 
> Citrix has been clear from the beginning that the vGPU integration we are 
> proposing is intentionally hypervisor agnostic)

To be fair, what this proposal is doing is piggy-backing on Nova's
existing PCI functionality to expose Linux/KVM VFIO mdev; it just so
happens that mdev was created for vGPU, but it was designed to extend to
other devices/things too.
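For anyone unfamiliar with the mdev plumbing: the kernel advertises the
available mediated device types via sysfs, e.g. the vGPU profiles a card can
be carved into. A small sketch using the standard vfio-mdev layout, nothing
Nova-specific:

    # List mediated-device (mdev) types advertised on this host, using the
    # standard vfio-mdev sysfs ABI.  Output depends on the vendor driver
    # (e.g. NVIDIA GRID profiles).
    import glob
    import os

    def read(path):
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            return '?'

    for tdir in glob.glob('/sys/class/mdev_bus/*/mdev_supported_types/*'):
        parent = tdir.split('/')[4]              # parent PCI address
        print('%s %s (%s): %s instances available'
              % (parent,
                 os.path.basename(tdir),         # type id, e.g. nvidia-35
                 read(tdir + '/name'),
                 read(tdir + '/available_instances')))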

> I also think there is value in exposing vGPU in a generic way, irrespective 
> of the underlying implementation (whether it is DEMU, mdev, SR-IOV or 
> whatever approach Hyper-V/VMWare use).

That is a big ask. To start with, all GPUs are not created equal, and
various vGPU functionality as designed by the GPU vendors is not
consistent, never mind the quirks added between different hypervisor
implementations. So I feel like trying to expose this in a generic
manner is, at least asking for problems, and more likely bound for
failure.

Nova already exposes plenty of hypervisor-specific functionality (or
functionality only implemented for one hypervisor), and that's fine.
Maybe there should be something in OpenStack that would generically
manage vGPU-graphics and/or vGPU-compute etc, but I'm pretty sure it
would never be allowed into Nova :-).

Anyway, take all that with a grain of salt, because frankly I would
love to see this in sooner rather than later - even if it did have a
big "this might change in non-upgradeable ways" sticker on it.

-- 
Cheers,
~Blairo



Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM

2017-09-27 Thread Blair Bethwaite
Hi Prema

On 28 September 2017 at 07:10, Premysl Kouril  wrote:
> Hi, I work with Jakub (the op of this thread) and here is my two
> cents: I think what is critical to realize is that KVM virtual
> machines can have a substantial memory overhead of up to 25% of the memory
> allocated to the KVM virtual machine itself. This overhead memory is not

I'm curious what sort of VM configuration causes such high overheads;
is this when using highly tuned virt devices with very large buffers?

> This KVM virtual machine overhead is what is causing the OOMs in our
> infrastructure and that's what we need to fix.

If you are pinning multiple guests per NUMA node in a multi-NUMA node
system then you might also have issues with uneven distribution of
system overheads across nodes, depending on how close to the sun you
are flying.
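If it helps with quantifying that, comparing each qemu process's resident set
against the guest memory size on its command line gives a rough per-guest
overhead figure. A sketch, assuming the common '-m <MiB>' argument form
(negative numbers just mean the guest hasn't touched all of its RAM yet):

    # Rough per-guest QEMU overhead estimate: RSS minus configured guest
    # RAM.  Assumes the '-m <MiB>' / '-m size=<MiB>' qemu argument forms.
    import os

    def rss_kib(pid):
        with open('/proc/%s/status' % pid) as f:
            for line in f:
                if line.startswith('VmRSS:'):
                    return int(line.split()[1])
        return 0

    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open('/proc/%s/cmdline' % pid, 'rb') as f:
                argv = f.read().decode(errors='replace').split('\0')
        except OSError:
            continue
        if not argv or 'qemu' not in os.path.basename(argv[0]):
            continue
        name, guest_mib = '?', None
        for i, arg in enumerate(argv[:-1]):
            if arg == '-name':
                name = argv[i + 1]
            elif arg == '-m':
                val = argv[i + 1].split(',')[0]
                if val.startswith('size='):
                    val = val[len('size='):]
                guest_mib = int(val.rstrip('M'))
        if guest_mib:
            print('%s (pid %s): ~%.0f MiB above guest RAM'
                  % (name, pid, rss_kib(pid) / 1024.0 - guest_mib))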

-- 
Cheers,
~Blairo



Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM

2017-09-27 Thread Blair Bethwaite
On 27 September 2017 at 23:19, Jakub Jursa  wrote:
> 'hw:cpu_policy=dedicated' (while NOT setting 'hw:numa_nodes') results in
> libvirt pinning CPU in 'strict' memory mode
>
> (from libvirt xml for given instance)
> ...
>   <numatune>
>     <memory mode='strict' nodeset='...'/>
>   </numatune>
> ...
>
> So yeah, the instance is not able to allocate memory from another NUMA node.

I can't recall what the docs say on this but I wouldn't be surprised
if that was a bug. Though I do think most users would want CPU & NUMA
pinning together (you haven't shared your use case but perhaps you do
too?).

> I'm not quite sure what you mean by 'memory will be locked for the
> guest'. Also, aren't huge pages enabled in the kernel by default?

I think that suggestion was probably referring to static hugepages,
which can be reserved (per NUMA node) at boot and then (assuming your
host is configured correctly) QEMU will be able to back guest RAM with
them.

You are probably thinking of THP (transparent huge pages), which are
now on by default in Linux but can be somewhat hit & miss if you have
a long-running host where memory has become fragmented or the
pagecache is large. In our experience performance can be severely
degraded by missing hugepage backing for even a small fraction of guest
memory, and we have noticed memory-management behaviour where THP
allocations fail when the pagecache is highly utilised despite none of it
being dirty (so it should be able to be dropped immediately).
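A quick way to see which of those situations a host is in (the paths are the
standard kernel interfaces, nothing OpenStack-specific):

    # Show static hugepage pools per NUMA node, the THP policy, and how
    # much anonymous memory is currently THP-backed.
    import glob

    def read(path):
        with open(path) as f:
            return f.read().strip()

    # 2 MiB pools shown; 1 GiB pools live under hugepages-1048576kB.
    for node in sorted(glob.glob('/sys/devices/system/node/node*')):
        pool = node + '/hugepages/hugepages-2048kB/'
        print('%s: %s of %s 2MiB hugepages free'
              % (node.rsplit('/', 1)[-1],
                 read(pool + 'free_hugepages'),
                 read(pool + 'nr_hugepages')))

    print('THP policy:', read('/sys/kernel/mm/transparent_hugepage/enabled'))
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('AnonHugePages'):
                print(line.strip())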

-- 
Cheers,
~Blairo



Re: [openstack-dev] [glance] Queens PTG: Thursday summary

2017-09-27 Thread Blair Bethwaite
On 27 September 2017 at 22:40, Belmiro Moreira
 wrote:
> In the past we used the tabs but latest Horizon versions use the visibility
> column/search instead.
> The issue is that we would like the old images to continue to be
> discoverable by everyone and have a image list that only shows the latest
> ones.

Yeah I think we hit that as well and have a patch for category
listing. It's not something I have worked on but Sam can fill the
gaps... or it could be that this is actually the last problem we have
left with upgrading to a current version of the dashboard and so are
effectively in the same boat.

> We are now using the “community” visibility to hide the old images from the
> default image list. But it’s not ideal.

Not ideal because you don't want them discoverable at all?

> I will move the old spec about image lifecycle to glance.
> https://review.openstack.org/#/c/327980/

Looks like a useful spec!

-- 
Cheers,
~Blairo



Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM

2017-09-27 Thread Blair Bethwaite
Also CC-ing os-ops as someone else may have encountered this before
and have further/better advice...

On 27 September 2017 at 18:40, Blair Bethwaite
<blair.bethwa...@gmail.com> wrote:
> On 27 September 2017 at 18:14, Stephen Finucane <sfinu...@redhat.com> wrote:
>> What you're probably looking for is the 'reserved_host_memory_mb' option. 
>> This
>> defaults to 512 (at least in the latest master) so if you up this to 4192 or
>> similar you should resolve the issue.
>
> I don't see how this would help given the problem description -
> reserved_host_memory_mb would only help avoid causing OOM when
> launching the last guest that would otherwise fit on a host based on
> Nova's simplified notion of memory capacity. It sounds like both CPU
> and NUMA pinning are in play here, otherwise the host would have no
> problem allocating RAM on a different NUMA node and OOM would be
> avoided.
>
> Jakub, your numbers sound reasonable to me, i.e., use 60 out of 64GB
> when only considering QEMU overhead - however I would expect that
> might  be a problem on NUMA node0 where there will be extra reserved
> memory regions for kernel and devices. In such a configuration where
> you are wanting to pin multiple guests into each of multiple NUMA
> nodes I think you may end up needing different flavor/instance-type
> configs (using less RAM) for node0 versus other NUMA nodes. Suggest
> freshly booting one of your hypervisors and then with no guests
> running take a look at e.g. /proc/buddyinfo/ and /proc/zoneinfo to see
> what memory is used/available and where.
>
> --
> Cheers,
> ~Blairo



-- 
Cheers,
~Blairo



Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM

2017-09-27 Thread Blair Bethwaite
On 27 September 2017 at 18:14, Stephen Finucane  wrote:
> What you're probably looking for is the 'reserved_host_memory_mb' option. This
> defaults to 512 (at least in the latest master) so if you up this to 4192 or
> similar you should resolve the issue.

I don't see how this would help given the problem description -
reserved_host_memory_mb would only help avoid causing OOM when
launching the last guest that would otherwise fit on a host based on
Nova's simplified notion of memory capacity. It sounds like both CPU
and NUMA pinning are in play here, otherwise the host would have no
problem allocating RAM on a different NUMA node and OOM would be
avoided.

Jakub, your numbers sound reasonable to me, i.e., use 60 out of 64GB
when only considering QEMU overhead - however I would expect that
might  be a problem on NUMA node0 where there will be extra reserved
memory regions for kernel and devices. In such a configuration where
you are wanting to pin multiple guests into each of multiple NUMA
nodes I think you may end up needing different flavor/instance-type
configs (using less RAM) for node0 versus other NUMA nodes. Suggest
freshly booting one of your hypervisors and then with no guests
running take a look at e.g. /proc/buddyinfo/ and /proc/zoneinfo to see
what memory is used/available and where.
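Something like the following gives that per-NUMA-node picture in one place;
the sysfs layout is standard, but how much headroom node0 needs is
deployment-specific:

    # Per-NUMA-node free/total memory snapshot, e.g. for deciding whether
    # node0 needs a smaller flavor than the other nodes.
    import glob

    for node in sorted(glob.glob('/sys/devices/system/node/node[0-9]*')):
        stats = {}
        with open(node + '/meminfo') as f:
            for line in f:
                # Lines look like: "Node 0 MemFree:  123456 kB"
                parts = line.split()
                stats[parts[2].rstrip(':')] = int(parts[3])
        print('%s: %d MiB free of %d MiB total'
              % (node.rsplit('/', 1)[-1],
                 stats['MemFree'] // 1024,
                 stats['MemTotal'] // 1024))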

-- 
Cheers,
~Blairo



Re: [openstack-dev] [glance] Queens PTG: Thursday summary

2017-09-26 Thread Blair Bethwaite
Hi Belmiro,

On 20 Sep. 2017 7:58 pm, "Belmiro Moreira" <
moreira.belmiro.email.li...@gmail.com> wrote:
> Discovering the latest image release is hard. So we added an image
property "recommended"
> that we update when a new image release is available. Also, we patched
horizon to show
> the "recommended" images first.

There is built in support in Horizon that allows displaying multiple image
category tabs where each takes contents from the list of images owned by a
specific project/tenant. In the Nectar research cloud this is what we rely
on to distinguish between "Public", "Project", "Nectar" (the base images we
maintain), and "Contributed" (images contributed by users who wish them to
be tested by us and effectively promoted as quality assured). When we
update a "Nectar" or "Contributed" image the old version stays public but
is moved into a project for deprecated images of that category, where
eventually we can clean it up.
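For reference, the Horizon knob behind those category buttons is, if I recall
correctly, IMAGES_LIST_FILTER_TENANTS in local_settings.py; the project IDs
below are placeholders and the exact shape may differ between Horizon
releases:

    # Hedged local_settings.py sketch; project UUIDs are placeholders.
    IMAGES_LIST_FILTER_TENANTS = [
        {'text': 'Nectar',
         'tenant': '0bdf024c921848c4b74d9e69af9edf08',
         'icon': 'fa-check'},
        {'text': 'Contributed',
         'tenant': '2f6b7e93ca5d479da869ecc15b9702e0',
         'icon': 'fa-share-alt'},
    ]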

> This helps our users to identify the latest image release but we continue
to show for
> each project the full list of public images + all personal user images.

Could you use the same model as us?

Cheers,
b1airo


Re: [openstack-dev] [ptg] Simplification in OpenStack

2017-09-26 Thread Blair Bethwaite
I've been watching this thread and I think we've already seen an
excellent and uncontroversial suggestion towards simplifying initial
deployment of OS - that was to push towards encoding Constellations
into the deployment and/or config management projects.

On 26 September 2017 at 15:44, Adam Lawson  wrote:
> Hey Jay,
> I think a GUI with a default config is a good start. Much would need to
> happen to enable that of course but that's where my mind goes. Any talk
> about 'default' kind of infringes on what we've all strived to embrace; a
> cloud architecture without bakes in assumptions. A default-anything need not
> mean other options are not available - only that a default gets them
> started. I would never ever agree to a default that consists of
> KVM+Contrail+NetApp. Something neutral would be great- easier said than done
> of course.
>
> Samuel,
> Default configuration as I envision it != "Promoting a single solution". I
> really hope a working default install would allow new users to get started
> with OpeStack without promoting anything. OpenStack lacking a default
> install results in an unfriendly deployment exercise. I know for a fact the
> entire community at webhostingtalk.com ignores OS for the most part because
> of how hard it is to deploy. They use Fuel or other third-party solutions
> because we as a OS community continue to fail to acknowledge the importance
> of an easier of implementation. Imagine thousands of hosting providers
> deploying OpenStack because we made it easy. That is money in the bank IMHO.
> I totally get the thinking about avoiding the term default for the reasons
> you provided but giving users a starting point does not necessarily mean
> we're trying to get them to adopt that as their final design. Giving them a
> starting point must take precedence over not giving them any starting point.
>
> Jonathan,
> "I'm not going to adopt something new that requires a new parallel
> management tool to what I use." I would hope not! :) I don't mean having a
> tool means the tool is required. Only that a user-friendly deployment tool
> is available. Isn't that better than giving them nothing at all?
>
> //adam
>
>
> Adam Lawson
>
> Principal Architect
> Office: +1-916-794-5706
>
> On Mon, Sep 25, 2017 at 5:27 PM, Samuel Cassiba  wrote:
>>
>>
>> > On Sep 25, 2017, at 16:52, Clint Byrum  wrote:
>> >
>> > Excerpts from Jonathan D. Proulx's message of 2017-09-25 11:18:51 -0400:
>> >> On Sat, Sep 23, 2017 at 12:05:38AM -0700, Adam Lawson wrote:
>> >>
>> >> :Lastly, I do think GUI's make deployments easier and because of that,
>> >> I
>> >> :feel they're critical. There is more than one vendor whose built and
>> >> :distributes a free GUI to ease OpenStack deployment and management.
>> >> That's
>> >> :a good start but those are the opinions of a specific vendor - not he
>> >> OS
>> >> :community. I have always been a big believer in a default cloud
>> >> :configuration to ease the shock of having so many options for
>> >> everything. I
>> >> :have a feeling however our commercial community will struggle with
>> >> :accepting any method/project other than their own as being part a
>> >> default
>> >> :config. That will be a tough one to crack.
>> >>
>> >> Different people have different needs, so this is not meant to
>> >> contradict Adam.
>> >>
>> >> But :)
>> >>
>> >> Any unique deployment tool would be of no value to me as OpenStack (or
>> >> any other infrastructure component) needs to fit into my environment.
>> >> I'm not going to adopt something new that requires a new parallel
>> >> management tool to what I use.
>> >>
>> >
>> > You already have that if you run OpenStack.
>> >
>> > The majority of development testing and gate testing happens via
>> > Devstack. A parallel management tool to what most people use to actually
>> > operate OpenStack.
>> >
>> >> I think focusing on the existing configuration management projects is
>> >> the way to go. Getting Ansible/Puppet/Chef/etc. to support a well-known
>> >> set of "constellations" in an opinionated way would make deployment
>> >> easy (for most people who are using one of those already) and,
>> >> assuming the opinions are the same :), make consumption easier as
>> >> well.
>> >>
>> >> As an example, when I started using OpenStack (Essex) we had recently
>> >> switched to Ubuntu as our Linux platform and Puppet as our config
>> >> management. Ubuntu had a "one click MAAS install of OpenStack" which
>> >> was impossible as it made all sorts of assumptions about our
>> >> environment and wanted control of most of them so it could provide a
>> >> full deployment solution. Puppet had a good integrated example config
>> >> where I plugged in some local choices and used existing deploy
>> >> methodologies.
>> >>
>> >> I fought with MAAS's "simple" install for a week.  When I gave up and
>> >> went with Puppet I had live users on a substantial (for the time)
>> >> cloud in less 

Re: [openstack-dev] [nova] Should PUT /os-services be idempotent?

2017-07-12 Thread Blair Bethwaite
Please don't make these 400s - it should not be a client error to be
unaware of the service status ahead of time.

On 12 July 2017 at 11:18, Matt Riedemann  wrote:
> I'm looking for some broader input on something being discussed in this
> change:
>
> https://review.openstack.org/#/c/464280/21/nova/api/openstack/compute/services.py
>
> This is collapsing the following APIs into a single API:
>
> Old:
>
> * PUT /os-services/enable
> * PUT /os-services/disable
> * PUT /os-services/disable-log-reason
> * PUT /os-services/force-down
>
> New:
>
> * PUT /os-services
>
> With the old APIs, if you tried to enable an already enabled service, it
> was not an error. The same was true if you tried to disable an already disabled
> service. It doesn't change anything, but it's not an error.
>
> The question is coming up in the new API if trying to enable an enabled
> service should be a 400, or trying to disable a disabled service. The way I
> wrote the new API, those are not 400 conditions. They don't do anything, like
> before, but they aren't errors.
>
> Looking at [1] it seems this should not be an error condition if you're
> trying to update the state of a resource and it's already at that state.
>
> I don't have a PhD in REST though so would like broader discussion on this.
>
> [1] http://www.restapitutorial.com/lessons/idempotency.html
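To make the idempotency expectation concrete, a hypothetical client-side
illustration; the URL and body shape are placeholders for whatever the
collapsed API ends up accepting, not the final Nova API:

    # Hypothetical illustration only: URL and request body are placeholders.
    import requests

    NOVA = 'http://nova.example.com/v2.1'
    HEADERS = {'X-Auth-Token': 'gAAAA...'}
    body = {'host': 'compute-01', 'binary': 'nova-compute',
            'status': 'disabled'}

    first = requests.put(NOVA + '/os-services', json=body, headers=HEADERS)
    second = requests.put(NOVA + '/os-services', json=body, headers=HEADERS)

    # Idempotent semantics: both calls succeed and describe the same final
    # state; the second is a no-op rather than a 400.
    assert first.status_code == second.status_code == 200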
>
> --
>
> Thanks,
>
> Matt
>



-- 
Cheers,
~Blairo



Re: [openstack-dev] [all][tc] Turning TC/UC workgroups into OpenStack SIGs

2017-07-05 Thread Blair Bethwaite
On 27 June 2017 at 23:47, Sean Dague  wrote:
> I still think I've missed, or not grasped, during this thread how a SIG
> functions differently than a WG, besides name. Both in theory and practice.

I think for the most part SIG is just a more fitting moniker for some
of these groups. E.g. I would say the scientific-wg is really a SIG,
typically with a few loose sub-WGs at any time, but the bulk of the
active membership is largely people doing similar jobs and
deploying/using OpenStack for similar things across science/academia -
so we basically have a lot of "shop" to talk about, and sometimes that
is only peripherally related to OpenStack.

-- 
Cheers,
~Blairo



Re: [openstack-dev] [User-committee] [all][tc] Turning TC/UC workgroups into OpenStack SIGs

2017-07-05 Thread Blair Bethwaite
On 27 June 2017 at 23:47, Thierry Carrez  wrote:
> Setting up a common ML for common discussions (openstack-sigs) will
> really help, even if there will be some pain setting them up and getting
> the right readership to them :)

It's worth a try! I agree it will probably take a while, and I would
still expect cross-posting to be allowed without being jumped on for
some time until the list admins can see that membership has reached a
reasonable level - not sure how you'd define that though...

-- 
Cheers,
~Blairo



[openstack-dev] Fwd: [scientific] IRC Meeting (Tues 2100 UTC): Science app catalogues, network security of research computing on OpenStack

2017-06-27 Thread Blair Bethwaite
Resend for openstack-dev with proper list perms...

-- Forwarded message --
From: Blair Bethwaite <blair.bethwa...@monash.edu>
Date: 27 June 2017 at 23:24
Subject: [scientific] IRC Meeting (Tues 2100 UTC): Science app catalogues,
network security of research computing on OpenStack
To: user-committee <user-commit...@lists.openstack.org>, "openstack-oper." <
openstack-operat...@lists.openstack.org>, "openstack-dev@lists.openstack.org"
<openstack-dev@lists.openstack.org>


Hi all,

Scientific-WG meeting in ~8 hours in #openstack-meeting. This week's agenda
is largely the same as last week, for alternate TZ.

Cheers,
Blair

-- Forwarded message --
From: Stig Telfer <stig.openst...@telfer.org>
Date: 21 June 2017 at 02:51
Subject: [User-committee] [scientific] IRC Meeting: Science app catalogues,
security of research computing on OpenStack - Wednesday 0900 UTC
To: user-committee <user-commit...@lists.openstack.org>, "openstack-oper." <
openstack-operat...@lists.openstack.org>


Greetings!

We have an IRC meeting on Wednesday at 0900 UTC in channel
#openstack-meeting.

This week we’d like to hear people’s thoughts and experiences on providing
scientific application catalogues to users - in particular with a view to
gathering best practice for a new chapter for the Scientific OpenStack book.

Similarly, we’d like to discuss what people are doing for security of
research computing instances on OpenStack.

The agenda is available here:
https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_June_21st_2017
Details of the IRC meeting are here:
http://eavesdrop.openstack.org/#Scientific_Working_Group

Please come along with ideas, suggestions or requirements.  All are welcome.

Cheers,
Stig





-- 
Blair Bethwaite
Senior HPC Consultant

Monash eResearch Centre
Monash University
Room G26, 15 Innovation Walk, Clayton Campus
Clayton VIC 3800
Australia
Mobile: 0439-545-002
Office: +61 3-9903-2800


Re: [openstack-dev] [User-committee] [all][tc] Turning TC/UC workgroups into OpenStack SIGs

2017-06-27 Thread Blair Bethwaite
There is a not insignificant degree of irony in the fact that this
conversation has splintered so that anyone only reading openstack-operators
and/or user-committee is missing 90% of the picture... Maybe I just need a
new ML management strategy.

I'd like to add a +1 to Sean's suggestion about WG/SIG/team/whatever tags
on reviews etc. This is something I've also suggested in the past:
http://lists.openstack.org/pipermail/user-committee/2016-October/001328.html.
My thinking at the time was that it would provide a tractable basis for
chairs to build standing discussion items around and help get more user &
ops eyes on blueprints/reviews/etc.

On 27 June 2017 at 10:25, Melvin Hillsman  wrote:

>
>
> On Wed, Jun 21, 2017 at 11:55 AM, Matt Riedemann 
> wrote:
>
>> On 6/21/2017 11:17 AM, Shamail Tahir wrote:
>>
>>>
>>>
>>> On Wed, Jun 21, 2017 at 12:02 PM, Thierry Carrez wrote:
>>> Shamail Tahir wrote:
>>> > In the past, governance has helped (on the UC WG side) to reduce
>>> > overlaps/duplication in WGs chartered for similar objectives. I
>>> would
>>> > like to understand how we will handle this (if at all) with the
>>> new SIG
>>> > proposal?
>>>
>>> I tend to think that any overlap/duplication would get solved
>>> naturally,
>>> without having to force everyone through an application process that
>>> may
>>> discourage natural emergence of such groups. I feel like an
>>> application
>>> process would be premature optimization. We can always encourage
>>> groups
>>> to merge (or clean them up) after the fact. How many
>>> overlapping/duplicative groups did you end up having?
>>>
>>>
>>> Fair point, it wasn't many. The reason I recalled this effort was
>>> because we had to go through the exercise after the fact and that made the
>>> volume of WGs to review much larger than had we asked the purpose whenever
>>> they were created. As long as we check back periodically and not let the
>>> work for validation/clean up pile up then this is probably a non-issue.
>>>
>>>
>>> > Also, do we have to replace WGs as a concept or could SIG
>>> > augment them? One suggestion I have would be to keep projects on
>>> the TC
>>> > side and WGs on the UC side and then allow for spin-up/spin-down
>>> of SIGs
>>> > as needed for accomplishing specific goals/tasks (picture of a
>>> diagram
>>> > I created at the Forum[1]).
>>>
>>> I feel like most groups should be inclusive of all community, so I'd
>>> rather see the SIGs being the default, and ops-specific or
>>> dev-specific
>>> groups the exception. To come back to my Public Cloud WG example, you
>>> need to have devs and ops in the same group in the first place before
>>> you would spin-up a "address scalability" SIG. Why not just have a
>>> Public Cloud SIG in the first place?
>>>
>>>
>>> +1, I interpreted originally that each use-case would be a SIG versus
>>> the SIG being able to be segment oriented (in which multiple use-cases
>>> could be pursued)
>>>
>>>
>>>  > [...]
>>> > Finally, how will this change impact the ATC/AUC status of the SIG
>>> > members for voting rights in the TC/UC elections?
>>>
>>> There are various options. Currently you give UC WG leads the AUC
>>> status. We could give any SIG lead both statuses. Or only give the
>>> AUC
>>> status to a subset of SIGs that the UC deems appropriate. It's
>>> really an
>>> implementation detail imho. (Also I would expect any SIG lead to
>>> already
>>> be both AUC and ATC somehow anyway, so that may be a non-issue).
>>>
>>>
>>> We can discuss this later because it really is an implementation detail.
>>> Thanks for the answers.
>>>
>>>
>>> --
>>> Thierry Carrez (ttx)
>>>
>>> 
>>> 
>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Shamail Tahir
>>> t: @ShamailXD
>>> tz: Eastern Time
>>>
>>>
>>> 
>>> __
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.op
>>> enstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>>
>> I think a key point you're going to want to convey and repeat ad nauseam
>> with this SIG idea is that each SIG is focused on a specific use case and
>> they can be spun up and spun down. Assuming that's what you want them to be.
>>
>> One 

Re: [openstack-dev] [Openstack-operators] [dev] [doc] Operations Guide future

2017-06-22 Thread Blair Bethwaite
Hi Alex,

On 2 June 2017 at 23:13, Alexandra Settle  wrote:
> O I like your thinking – I’m a pandoc fan, so, I’d be interested in
> moving this along using any tools to make it easier.

I can't realistically offer much time on this but I would be happy to
help (ad-hoc) review/catalog/clean-up issues with export.

> I think my only proviso (now I’m thinking about it more) is that we still
> have a link on docs.o.o, but it goes to the wiki page for the Ops Guide.

Agreed, need to maintain discoverability.

-- 
Cheers,
~Blairo



Re: [openstack-dev] [Openstack-operators] [dev] [doc] Operations Guide future

2017-06-01 Thread Blair Bethwaite
Hi Alex,

Likewise for option 3. If I recall correctly from the summit session
that was also the main preference in the room?

On 2 June 2017 at 11:15, George Mihaiescu  wrote:
> +1 for option 3
>
>
>
> On Jun 1, 2017, at 11:06, Alexandra Settle  wrote:
>
> Hi everyone,
>
>
>
> I haven’t had any feedback regarding moving the Operations Guide to the
> OpenStack wiki. I’m not taking silence as compliance. I would really like to
> hear people’s opinions on this matter.
>
>
>
> To recap:
>
>
>
> Option one: Kill the Operations Guide completely and move the Administration
> Guide to project repos.
> Option two: Combine the Operations and Administration Guides (and then this
> will be moved into the project-specific repos)
> Option three: Move Operations Guide to OpenStack wiki (for ease of
> operator-specific maintainability) and move the Administration Guide to
> project repos.
>
>
>
> Personally, I think that option 3 is more realistic. The idea for the last
> option is that operators are maintaining operator-specific documentation and
> updating it as they go along and we’re not losing anything by combining or
> deleting. I don’t want to lose what we have by going with option 1, and I
> think option 2 is just a workaround without fixing the problem – we are not
> getting contributions to the project.
>
>
>
> Thoughts?
>
>
>
> Alex
>
>
>
> From: Alexandra Settle 
> Date: Friday, May 19, 2017 at 1:38 PM
> To: Melvin Hillsman , OpenStack Operators
> 
> Subject: Re: [Openstack-operators] Fwd: [openstack-dev] [openstack-doc]
> [dev] What's up doc? Summit recap edition
>
>
>
> Hi everyone,
>
>
>
> Adding to this, I would like to draw your attention to the last dot point of
> my email:
>
>
>
> “One of the key takeaways from the summit was the session that I joint
> moderated with Melvin Hillsman regarding the Operations and Administration
> Guides. You can find the etherpad with notes here:
> https://etherpad.openstack.org/p/admin-ops-guides  The session was really
> helpful – we were able to discuss with the operators present the current
> situation of the documentation team, and how they could help us maintain the
> two guides, aimed at the same audience. The operator’s present at the
> session agreed that the Administration Guide was important, and could be
> maintained upstream. However, they voted and agreed that the best course of
> action for the Operations Guide was for it to be pulled down and put into a
> wiki that the operators could manage themselves. We will be looking at
> actioning this item as soon as possible.”
>
>
>
> I would like to go ahead with this, but I would appreciate feedback from
> operators who were not able to attend the summit. In the etherpad you will
> see the three options that the operators in the room recommended as being
> viable; the option voted for was moving the Operations Guide out of
> docs.openstack.org into a wiki. The aim of this was to empower the
> operations community to take more control of the updates, in an environment
> they are more familiar with (and one that is available to others).
>
>
>
> What does everyone think of the proposed options? Questions? Other thoughts?
>
>
>
> Alex
>
>
>
> From: Melvin Hillsman 
> Date: Friday, May 19, 2017 at 1:30 PM
> To: OpenStack Operators 
> Subject: [Openstack-operators] Fwd: [openstack-dev] [openstack-doc] [dev]
> What's up doc? Summit recap edition
>
>
>
>
>
> -- Forwarded message --
> From: Alexandra Settle 
> Date: Fri, May 19, 2017 at 6:12 AM
> Subject: [openstack-dev] [openstack-doc] [dev] What's up doc? Summit recap
> edition
> To: "openstack-d...@lists.openstack.org"
> 
> Cc: "OpenStack Development Mailing List (not for usage questions)"
> 
>
>
> Hi everyone,
>
>
> The OpenStack manuals project had a really productive week at the OpenStack
> summit in Boston. You can find a list of all the etherpads and attendees
> here: https://etherpad.openstack.org/p/docs-summit
>
>
>
> As we all know, we are rapidly losing key contributors and core reviewers.
> We are not alone; this is happening across the board. It is making things
> harder, but not impossible. Since our inception in 2010, we’ve been climbing
> higher and higher trying to achieve the best documentation we could, and
> uphold our high standards. This is something to be incredibly proud of.
> However, we now need to take a step back and realise that the amount of work
> we are attempting to maintain is now out of reach for the team size that we
> have. At the moment we have 13 cores, of which none are full time
> contributors or reviewers. This includes myself.
>
>
>
> That being said! I have spent the last week at the summit talking to some of
> our leaders, including Doug 

[openstack-dev] [scientific] Scientific-WG sessions in Boston

2017-05-09 Thread Blair Bethwaite
Hey all,

Hopefully you've all noticed this by now: the timing of the WG
sessions (lightning talks, meeting, BoF) has changed a little since
first published. I've just updated the etherpad to reflect that now:

https://etherpad.openstack.org/p/Scientific-WG-boston
Tues 11:15am - 11:55am - Scientific Working Group - Lightning Talks
Tues 3:40pm - 4:20pm - Scientific Working Group - Meeting
Weds 2:40pm - 3:20pm - Scientific OpenStack - BoF

Please jump on the etherpad and contribute to the agenda in order to
help us shape our focus for the next cycle.

Looking forward to seeing you today!

-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [scientific] Lightning talks on Scientific OpenStack

2017-05-09 Thread Blair Bethwaite
Morning all -

Apologies for the shotgun email, but it looks like we still have one or
two spots available for lightning talks if anyone has work they want
to share and/or discuss:

https://etherpad.openstack.org/p/Scientific-WG-Boston-Lightning

Best regards,

On 28 April 2017 at 06:19, George Mihaiescu  wrote:
> Thanks Stig,
>
> I added a presentation to the schedule.
>
>
> Cheers,
> George
>
>
>
> On Thu, Apr 27, 2017 at 3:49 PM, Stig Telfer 
> wrote:
>>
>> Hi George -
>>
>> Sorry for the slow response.  The consensus was for 8 minutes maximum.
>> That should be plenty for a lightning talk, and enables us to fit one more
>> in.
>>
>> Best wishes,
>> Stig
>>
>>
>> > On 27 Apr 2017, at 20:29, George Mihaiescu  wrote:
>> >
>> > Hi Stig, will it be 10-minute sessions like in Barcelona?
>> >
>> > Thanks,
>> > George
>> >
>> >> On Apr 26, 2017, at 03:31, Stig Telfer 
>> >> wrote:
>> >>
>> >> Hi All -
>> >>
>> >> We have planned a session of lightning talks at the Boston summit to
>> >> discuss topics specific for OpenStack and research computing applications.
>> >> This was a great success at Barcelona and generated some stimulating
>> >> discussion.  We are also hoping for a small prize for the best talk of the
>> >> session!
>> >>
>> >> This is the event:
>> >>
>> >> https://www.openstack.org/summit/boston-2017/summit-schedule/events/18676
>> >>
>> >> If you’d like to propose a talk, please add a title and your name here:
>> >> https://etherpad.openstack.org/p/Scientific-WG-boston
>> >>
>> >> Everyone is welcome.
>> >>
>> >> Cheers,
>> >> Stig
>> >>
>> >>
>> >> ___
>> >> OpenStack-operators mailing list
>> >> openstack-operat...@lists.openstack.org
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [cyborg] Fwd: GPU passthrough success and failure records

2017-05-06 Thread Blair Bethwaite
-- Forwarded message --
From: Blair Bethwaite <blair.bethwa...@gmail.com>
Date: 6 May 2017 at 17:55
Subject: GPU passthrough success and failure records
To: "openstack-oper." <openstack-operat...@lists.openstack.org>


Hi all,

I've been (very slowly) working on some docs detailing how to set up an
OpenStack Nova Libvirt+QEMU-KVM deployment to provide GPU-accelerated
instances. In Boston I hope to chat to some of the docs team and
figure out an appropriate upstream guide to fit that into. One of the
things I'd like to provide is a community record (better than ML
archives) of what works and doesn't. I've started a first attempt at
collating some basics here:
https://etherpad.openstack.org/p/GPU-passthrough-model-success-failure

I know there are at least a few lurkers out there doing this too, so
please share your own experience. Once there is a bit more data there
it probably makes sense to convert to a tabular format of some kind
(but it wasn't immediately obvious to me how that should look, given
there are several long list fields).
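
(For context, the sort of setup those docs walk through boils down to
something like the config below; the vendor/product IDs, alias name and
flavor here are placeholders for illustration, not recommendations:)

    # nova.conf on the compute node (the alias is also needed on the
    # API/scheduler hosts); placeholder NVIDIA IDs, adjust to the real card
    [pci]
    passthrough_whitelist = { "vendor_id": "10de", "product_id": "1db4" }
    alias = { "vendor_id": "10de", "product_id": "1db4", "name": "gpu" }

    # and a flavor that requests one such device:
    openstack flavor set gpu.small --property "pci_passthrough:alias"="gpu:1"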

--
Cheers,
~Blairo


-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][blazar][scientific] advanced instance scheduling: reservations and preemption - Forum session

2017-05-02 Thread Blair Bethwaite
On 2 May 2017 at 05:50, Jay Pipes  wrote:
> Masahito Muroi is currently marked as the moderator, but I will indeed be
> there and happy to assist Masahito in moderating, no problem.

The more the merrier :-).

There is a rather unfortunate clash here with the Scientific-WG BoF
session. Location permitting, I will head to that first and then run
over to catch the end of this one.

-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][blazar][scientific] advanced instance scheduling: reservations and preemption - Forum session

2017-05-01 Thread Blair Bethwaite
Hi all,

Following up on the recent thread "[Openstack-operators] [scientific]
Resource reservation requirements (Blazar) - Forum session" and adding
openstack-dev.

This is now a confirmed forum session
(https://www.openstack.org/summit/boston-2017/summit-schedule/events/18781/advanced-instance-scheduling-reservations-and-preemption)
to cover any advanced scheduling use-cases people want to talk about,
but in particular focusing on reservations and preemption as they are
big priorities particularly for scientific deployers.

The etherpad draft is at
https://etherpad.openstack.org/p/BOS-forum-advanced-instance-scheduling
- please attend and contribute! In particular, I'd appreciate background
spec and review links being added to the etherpad.

Jay, would you be able and interested to moderate this from the Nova side?

Cheers,

On 12 April 2017 at 05:22, Jay Pipes <jaypi...@gmail.com> wrote:
> On 04/11/2017 02:08 PM, Pierre Riteau wrote:
>>>
>>> On 4 Apr 2017, at 22:23, Jay Pipes <jaypi...@gmail.com
>>> <mailto:jaypi...@gmail.com>> wrote:
>>>
>>> On 04/04/2017 02:48 PM, Tim Bell wrote:
>>>>
>>>> Some combination of spot/OPIE
>>>
>>>
>>> What is OPIE?
>>
>>
>> Maybe I missed a message: I didn’t see any reply to Jay’s question about
>> OPIE.
>
>
> Thanks!
>
>> OPIE is the OpenStack Preemptible Instances
>> Extension: https://github.com/indigo-dc/opie
>> I am sure other on this list can provide more information.
>
>
> Got it.
>
>> I think running OPIE instances inside Blazar reservations would be
>> doable without many changes to the implementation.
>> We’ve talked about this idea several times, this forum session would be
>> an ideal place to draw up an implementation plan.
>
>
> I just looked through the OPIE source code. One thing I'm wondering is why
> the code for killing off pre-emptible instances lives in the
> filter_scheduler module?
>
> Why not have a separate service that simply responds to a NoValidHost
> exception being raised from the scheduler by terminating one or more
> instances, so that the original request could land on a host?
>
> Right here is where OPIE goes and terminates pre-emptible instances:
>
> https://github.com/indigo-dc/opie/blob/master/opie/scheduler/filter_scheduler.py#L92-L100
>
> However, that code should actually be run when line 90 raises NoValidHost:
>
> https://github.com/indigo-dc/opie/blob/master/opie/scheduler/filter_scheduler.py#L90
>
> There would be no need at all for "detecting overcommit" here:
>
> https://github.com/indigo-dc/opie/blob/master/opie/scheduler/filter_scheduler.py#L96
>
> Simply detect a NoValidHost being returned to the conductor from the
> scheduler, examine if there are pre-emptible instances currently running
> that could be terminated and terminate them, and re-run the original call to
> select_destinations() (the scheduler call) just like a Retry operation
> normally does.
>
> There'd be no need whatsoever to involve any changes to the scheduler at
> all.
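
(To make that flow concrete, here is a rough Python sketch of the
retry-on-NoValidHost idea described above; select_destinations,
find_preemptible_victims and terminate are hypothetical stand-ins
supplied by the caller, not nova APIs:)

    class NoValidHost(Exception):
        """Raised when the scheduler cannot place the request."""

    def schedule_with_preemption(request, select_destinations,
                                 find_preemptible_victims, terminate,
                                 max_attempts=2):
        """Try to schedule; on NoValidHost, free capacity by terminating
        preemptible instances and then retry the original request."""
        for _ in range(max_attempts):
            try:
                return select_destinations(request)
            except NoValidHost:
                victims = find_preemptible_victims(request)
                if not victims:
                    raise  # nothing left to preempt, give up
                for instance in victims:
                    terminate(instance)
        raise NoValidHost("still no capacity after preemption")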
>
>>>> and Blazar would seem doable as long as the resource provider
>>>> reserves capacity appropriately (i.e. spot resources>>blazar
>>>> committed along with no non-spot requests for the same aggregate).
>>>> Is this feasible?
>
>
> No. :)
>
> As mentioned in previous emails and on the etherpad here:
>
> https://etherpad.openstack.org/p/new-instance-reservation
>
> I am firmly against having the resource tracker or the placement API
> represent inventory or allocations with a temporal aspect to them (i.e.
> allocations in the future).
>
> A separate system (hopefully Blazar) is needed to manage the time-based
> associations to inventories of resources over a period in the future.
>
> Best,
> -jay
>
>>> I'm not sure how the above is different from the constraints I mention
>>> below about having separate sets of resource providers for preemptible
>>> instances versus non-preemptible instances?
>>>
>>> Best,
>>> -jay
>>>
>>>> Tim
>>>>
>>>> On 04.04.17, 19:21, "Jay Pipes" <jaypi...@gmail.com
>>>> <mailto:jaypi...@gmail.com>> wrote:
>>>>
>>>>On 04/03/2017 06:07 PM, Blair Bethwaite wrote:
>>>>> Hi Jay,
>>>>>
>>>>> On 4 April 2017 at 00:20, Jay Pipes <jaypi...@gmail.com
>>>> <mailto:jaypi...@gmail.com>> wrote:
>>>>>> However, implementing the above in any useful fash

Re: [openstack-dev] [scientific][nova][cyborg] Special Hardware Forum session

2017-05-01 Thread Blair Bethwaite
Thanks Rochelle. I encourage everyone to dump thoughts into the
etherpad (https://etherpad.openstack.org/p/BOS-forum-special-hardware
- feel free to garden it as you go!) so we can have some chance of
organising a coherent session. In particular it would be helpful to
know what is going to be most useful for the Nova and Cyborg devs, so
that we can give that priority before we start the show-and-tell /
knowledge-share that is often a large part of these sessions. I'd also
be very happy to have a co-moderator if anyone wants to volunteer.

On 26 April 2017 at 03:11, Rochelle Grober  wrote:
>
> I know that some cyborg folks and nova folks are planning to be there. Now
> we need to drive some ops folks.
>
>
> Sent from HUAWEI AnyOffice
> From:Blair Bethwaite
> To:openstack-dev@lists.openstack.org,openstack-oper.
> Date:2017-04-25 08:24:34
> Subject:[openstack-dev] [scientific][nova][cyborg] Special Hardware Forum
> session
>
> Hi all,
>
> A quick FYI that this Forum session is happening:
> https://www.openstack.org/summit/boston-2017/summit-schedule/events/18803/special-hardware
> (etherpad: https://etherpad.openstack.org/p/BOS-forum-special-hardware).
>
> It would be great to see a good representation from both the Nova and
> Cyborg dev teams, and also ops ready to share their experience and
> use-cases.
>
> --
> Cheers,
> ~Blairo
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [nova][glance] Who needs multiple api_servers?

2017-05-01 Thread Blair Bethwaite
On 29 April 2017 at 01:46, Mike Dorman  wrote:
> I don’t disagree with you that the client side choose-a-server-at-random is 
> not a great load balancer.  (But isn’t this roughly the same thing that 
> oslo-messaging does when we give it a list of RMQ servers?)  For us it’s more 
> about the failure handling if one is down than it is about actually equally 
> distributing the load.

Maybe not great, but still better than making operators deploy (often
complex) full-featured external LBs when they really just want
*enough* redundancy. In many cases this seems to just create pets in
the control plane. I think it'd be useful if all OpenStack APIs and
their clients actively handled this poor-man's HA, without having to
resort to haproxy etc. or to assumptions like operators owning the DNS.
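
For anyone not familiar with the option under discussion, a minimal
sketch of the compute-node side of this (hostnames invented); the
[glance]/api_servers list is what currently gives nova-compute its
simple client-side failover:

    # nova.conf on a hypervisor; nova-compute picks an entry from this
    # list and can fail over if one endpoint is down, with no external
    # load balancer on the internal path.
    [glance]
    api_servers = http://glance-az1.internal:9292,http://glance-az2.internal:9292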

-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [nova][glance] Who needs multiple api_servers?

2017-04-27 Thread Blair Bethwaite
We at Nectar are in the same boat as Mike. Our use-case is a little
bit more about geo-distributed operations though - our Cells are in
different States around the country, so the local glance-apis are
particularly important for caching popular images close to the
nova-computes. We consider these glance-apis as part of the underlying
cloud infra rather than user-facing, so I think we'd prefer not to see
them in the service-catalog returned to users either... is there going
to be a (standard) way to hide them?

On 28 April 2017 at 09:15, Mike Dorman  wrote:
> We make extensive use of the [glance]/api_servers list.  We configure that on 
> hypervisors to direct them to Glance servers which are more “local” 
> network-wise (in order to reduce network traffic across security 
> zones/firewalls/etc.)  This way nova-compute can fail over in case one of the 
> Glance servers in the list is down, without putting them behind a load 
> balancer.  We also don’t run https for these “internal” Glance calls, to save 
> the overhead when transferring images.
>
> End-user calls to Glance DO go through a real load balancer and then are 
> distributed out to the Glance servers on the backend.  From the end-user’s 
> perspective, I totally agree there should be one, and only one URL.
>
> However, we would be disappointed to see the change you’re suggesting 
> implemented.  We would lose the redundancy we get now by providing a list.  
> Or we would have to shunt all the calls through the user-facing endpoint, 
> which would generate a lot of extra traffic (in places where we don’t want 
> it) for image transfers.
>
> Thanks,
> Mike
>
>
>
> On 4/27/17, 4:02 PM, "Matt Riedemann"  wrote:
>
> On 4/27/2017 4:52 PM, Eric Fried wrote:
> > Y'all-
> >
> >   TL;DR: Does glance ever really need/use multiple endpoint URLs?
> >
> >   I'm working on bp use-service-catalog-for-endpoints[1], which intends
> > to deprecate disparate conf options in various groups, and centralize
> > acquisition of service endpoint URLs.  The idea is to introduce
> > nova.utils.get_service_url(group) -- note singular 'url'.
> >
> >   One affected conf option is [glance]api_servers[2], which currently
> > accepts a *list* of endpoint URLs.  The new API will only ever return 
> *one*.
> >
> >   Thus, as planned, this blueprint will have the side effect of
> > deprecating support for multiple glance endpoint URLs in Pike, and
> > removing said support in Queens.
> >
> >   Some have asserted that there should only ever be one endpoint URL for
> > a given service_type/interface combo[3].  I'm fine with that - it
> > simplifies things quite a bit for the bp impl - but wanted to make sure
> > there were no loudly-dissenting opinions before we get too far down this
> > path.
> >
> > [1]
> > 
> https://blueprints.launchpad.net/nova/+spec/use-service-catalog-for-endpoints
> > [2]
> > 
> https://github.com/openstack/nova/blob/7e7bdb198ed6412273e22dea72e37a6371fce8bd/nova/conf/glance.py#L27-L37
> > [3]
> > 
> http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2017-04-27.log.html#t2017-04-27T20:38:29
> >
> > Thanks,
> > Eric Fried (efried)
> > .
> >
> > 
> __
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: 
> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> +openstack-operators
>
> --
>
> Thanks,
>
> Matt
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [scientific][nova][cyborg] Special Hardware Forum session

2017-04-25 Thread Blair Bethwaite
Hi all,

A quick FYI that this Forum session is happening:
https://www.openstack.org/summit/boston-2017/summit-schedule/events/18803/special-hardware
(etherpad: https://etherpad.openstack.org/p/BOS-forum-special-hardware).

It would be great to see a good representation from both the Nova and
Cyborg dev teams, and also ops ready to share their experience and
use-cases.

-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [MassivelyDistributed] bi-weekly meeting

2016-11-08 Thread Blair Bethwaite
Devil's advocate - what is "full enough"? Surely another channel is
essentially free and having flexibility in available timing is of utmost
importance?

On 8 Nov 2016 5:37 PM, "Tony Breeds"  wrote:

> On Mon, Nov 07, 2016 at 05:52:43PM +0100, lebre.adr...@free.fr wrote:
> > Dear all,
> >
> > We are still looking for an irc channel for our meeting:
> https://review.openstack.org/#/c/393899
> > There are no available channels for the slot we selected during our
> face-to-face meeting in Barcelona.
> >
> > If I'm correct, we have two possibilities:
> > - Determine another slot: the first available slot on Wednesday is at
> 17:00 UTC.
>
> Or you could look at 1400 on Wednesday
>
> > - Ask for the creation of a new IRC channel dedicated to our WG:
> something like #openstack-massively-distributed
>
> This is an option but you can't hold meetings in project specific channels
> as
> the meeting bot only works correctly in the #openstack-meeting rooms.
>
> We get asked from time to time to create a 5th meeting room. Looking at [1]
> I don't feel like we're full enough to pursue that.
>
> Yours Tony.
> [1] https://docs.google.com/spreadsheets/d/1lQHKCQa4wQmnWpTMB3DLltY81kICh
> ZumHLHzoMNF07c/edit?usp=sharing
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-operators] [glance] Proposal for a mid-cycle virtual sync on operator issues

2016-05-31 Thread Blair Bethwaite
Hi Nikhil,

2000UTC might catch a few kiwis, but it's 6am everywhere on the east
coast of Australia, and even earlier out west. 0800UTC, on the other
hand, would be more sociable.

On 26 May 2016 at 15:30, Nikhil Komawar  wrote:
> Thanks Sam. We purposefully chose that time to accommodate some of our
> community members from the Pacific. I'm assuming it's just your case
> that's not working out for that time? So, hopefully other Australian/NZ
> friends can join.
>
>
> On 5/26/16 12:59 AM, Sam Morrison wrote:
>> I’m hoping some people from the Large Deployment Team can come along. It’s 
>> not a good time for me in Australia but hoping someone else can join in.
>>
>> Sam
>>
>>
>>> On 26 May 2016, at 2:16 AM, Nikhil Komawar  wrote:
>>>
>>> Hello,
>>>
>>>
>>> Firstly, I would like to thank Fei Long for bringing up a few operator
>>> centric issues to the Glance team. After chatting with him on IRC, we
>>> realized that there may be more operators who would want to contribute
>>> to the discussions to help us take some informed decisions.
>>>
>>>
>>> So, I would like to call for a 2 hour sync for the Glance team along
>>> with interested operators on Thursday June 9th, 2016 at 2000UTC.
>>>
>>>
>>> If you are interested in participating please RSVP here [1], and
>>> participate in the poll for the tool you'd prefer. I've also added a
>>> section for Topics and provided a template to document the issues clearly.
>>>
>>>
>>> Please be mindful of everyone's time and if you are proposing issue(s)
>>> to be discussed, come prepared with well documented & referenced topic(s).
>>>
>>>
>>> If you've feedback that you are not sure if appropriate for the
>>> etherpad, you can reach me on irc (nick: nikhil).
>>>
>>>
>>> [1] https://etherpad.openstack.org/p/newton-glance-and-ops-midcycle-sync
>>>
>>> --
>>>
>>> Thanks,
>>> Nikhil Komawar
>>> Newton PTL for OpenStack Glance
>>>
>>>
>>> __
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> --
>
> Thanks,
> Nikhil
>
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Cheers,
~Blairo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][glance] why do image properties control per-instance settings?

2014-11-25 Thread Blair Bethwaite
Hi all,

I've just been doing some user consultation and pondering a case for
use of the Qemu Guest Agent in order to get quiesced backups.

In doing so I found myself wondering why on earth I need to set an
image property in Glance (hw_qemu_guest_agent) to toggle such a thing
for any particular instance; it doesn't make any sense that what
should be an instance boot parameter (or possibly even runtime
dynamic) is controlled through the cloud's image registry. There is no
shortage of similar metadata properties, probably everything prefixed
hw_ for a start. It looks like this has even come up on reviews
before, e.g.
https://review.openstack.org/#/c/43513/
The last comment from DanielB is:
For setting per-instance, rather than doing something that only works
for passing kernel command line, it would be desirable to have a way
to pass in arbitrary key,value attribute pairs to the 'boot' API call,
because I can see this being useful for things beyond just the kernel
command line.

In some cases I imagine image properties could be useful to indicate
that the image has a certain *capability*, which could be used as a
step to verify it can support some requested feature (e.g., qemu-ga)
for any particular instance launch.

Is there similar work underway? Would it make sense to build such
functionality via the existing instance metadata API?
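
To make the contrast concrete, today's image-level toggle versus the
kind of per-instance control being asked about would look roughly like
the following (UUIDs are placeholders, and the boot-time form is
hypothetical, it does not exist today):

    # Today: the toggle lives on the image, so it applies to every
    # instance ever booted from that image.
    glance image-update --property hw_qemu_guest_agent=yes <image-uuid>

    # Hypothetical per-instance equivalent at boot time; nova does not
    # currently interpret instance metadata this way.
    nova boot --image <image-uuid> --flavor m1.small \
        --meta hw_qemu_guest_agent=yes my-instance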

-- 
Cheers,
~Blairo

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Quota management and enforcement across projects

2014-11-19 Thread Blair Bethwaite
On 20 November 2014 05:25, openstack-dev-requ...@lists.openstack.org wrote:
> --
>
> Message: 24
> Date: Wed, 19 Nov 2014 10:57:17 -0500
> From: Doug Hellmann d...@doughellmann.com
> To: OpenStack Development Mailing List (not for usage questions)
> openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] Quota management and enforcement across
> projects
> Message-ID: 13f4f7a1-d4ec-4d14-a163-d477a4fd9...@doughellmann.com
> Content-Type: text/plain; charset=windows-1252
>
>
> On Nov 19, 2014, at 9:51 AM, Sylvain Bauza sba...@redhat.com wrote:
>> My bad. Let me rephrase it. I'm seeing this service as providing added value
>> for managing quotas by ensuring consistency across all projects. But as I
>> said, I'm also thinking that the quota enforcement has still to be done at
>> the customer project level.
>
> Oh, yes, that is true. I envision the API for the new service having a call
> that means "try to consume X units of a given quota" and that it would return
> information about whether that can be done. The apps would have to define
> what quotas they care about, and make the appropriate calls.

For actions initiated directly through core OpenStack service APIs
(Nova, Cinder, Neutron, etc., anything using Keystone policy),
shouldn't quota enforcement be handled by Keystone? To me this is just
a subset of authz, and OpenStack already has a well-established
service for such decisions.

It sounds like the idea here is to provide something generic that
could be used outside of OpenStack? I worry that might be premature
scope creep that detracts from the outcome.
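
(For what it's worth, a toy sketch of the "try to consume X units" call
shape described above, purely to anchor the discussion; the names are
invented, not a proposed API:)

    class QuotaService(object):
        """Toy in-memory model of a consume-style quota call."""

        def __init__(self, limits):
            self.limits = dict(limits)            # e.g. {"cores": 20}
            self.usage = {k: 0 for k in limits}

        def try_consume(self, resource, amount):
            """Record the usage and return True if it fits under the
            limit, otherwise return False and change nothing."""
            if self.usage[resource] + amount > self.limits[resource]:
                return False
            self.usage[resource] += amount
            return True

    quotas = QuotaService({"cores": 20})
    assert quotas.try_consume("cores", 8)         # fits
    assert not quotas.try_consume("cores", 16)    # would exceed 20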

-- 
Cheers,
~Blairo

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Ephemeral bases and block-migration

2013-10-06 Thread Blair Bethwaite
Hi there,

We've been investigating some guest filesystem issues recently and noticed
what looks like a slight inconsistency in base image handling in
block-migration. We're on Grizzly from the associated Ubuntu cloud archive
and using qcow on local storage.

What we've noticed is that after block-migration the instance's secondary
disk has a generic backing file, _base/ephemeral, as opposed to the
backing file it was created with, e.g., _base/ephemeral_30_default. These
backing files have different virtual sizes:
$ qemu-img info _base/ephemeral
image: _base/ephemeral
file format: raw
virtual size: 2.0G (2147483648 bytes)
disk size: 778M
$ qemu-img info _base/ephemeral_30_default
image: _base/ephemeral_30_default
file format: raw
virtual size: 30G (32212254720 bytes)
disk size: 614M

This seems like it could be problematic considering virtual disks of
different sizes end up pointed at this _base/ephemeral file, and I've no
idea how that file is created in the first place. Can anyone explain?
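
For anyone wanting to check their own hosts, the mismatch shows up on
the instance's ephemeral disk itself; something like the following (the
path and output are illustrative, following the usual libvirt driver
layout):

    $ qemu-img info /var/lib/nova/instances/<instance-uuid>/disk.local
    image: disk.local
    file format: qcow2
    virtual size: 30G (32212254720 bytes)
    disk size: 1.1G
    backing file: /var/lib/nova/instances/_base/ephemeral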

-- 
Cheers,
~Blairo
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev