Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2014-02-05 Thread Jaromir Coufal

On 2014/05/02 15:27, Tzu-Mainn Chen wrote:

Hi,

In parallel to Jarda's updated wireframes, and based on various discussions 
over the past
weeks, here are the updated Tuskar requirements for Icehouse:

https://wiki.openstack.org/wiki/TripleO/TuskarIcehouseRequirements

Any feedback is appreciated.  Thanks!

Tzu-Mainn Chen


+1 looks good to me!

-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2014-02-05 Thread Tzu-Mainn Chen
Hi,

In parallel to Jarda's updated wireframes, and based on various discussions 
over the past
weeks, here are the updated Tuskar requirements for Icehouse:

https://wiki.openstack.org/wiki/TripleO/TuskarIcehouseRequirements

Any feedback is appreciated.  Thanks!

Tzu-Mainn Chen

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-19 Thread Radomir Dopieralski
On 11/12/13 21:42, Robert Collins wrote:
> On 12 December 2013 01:17, Jaromir Coufal  wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:

[snip]

>>> Thats speculation. We don't know if they will or will not because we
>>> haven't given them a working system to test.
>>
>> Some part of that is speculation, some part of that is feedback from people
>> who are doing deployments (of course its just very limited audience).
>> Anyway, it is not just pure theory.
> 
> Sure. Let be me more precise. There is a hypothesis that lack of
> direct control will be a significant adoption blocker for a primary
> group of users.

I'm sorry for butting in, but I think I can see where your disagreement
comes from and maybe explaining it will help resolving it.

It's not a hypothesis, but a well documented and researched fact, that
transparency has a huge impact on the ease of use of any information
artifact. In particular, the easier you can see what is actually
happening and how your actions affect the outcome, the faster you can
learn to use it and the more efficient you are in using it and resolving
any problems with it. It's no surprise that "closeness of mapping" and
"hidden dependencies" are two important congnitive dimensions that are
often measured when assesing the usability of an artifact. Humans simply
find it nice when they can tell what is happening, even if theoretically
they don't need that knowledge when everything works correctly.

This doesn't come from any direct requirements of Tuskar itself, and I
am sure that all the workarounds that Robert gave will work somehow in
every real-world problem that arises. But the whole will not necessarily
be easy or pleasant to learn and use. I am aware, that the requirment to
be able to see what is happening is a fundamental problem, because it
destroys one of the most important rules in system engineering --
separation of concerns. The parts in the upper layers should simply not
care how the parts in the lower layers do their jobs, as long as they
work properly.

I know that it is a kind of a tradition in Open Source software to
create software with the assumption, that it's enough for it to do its
job, and if every use case can be somehow done, directly or indirectly,
then it's good enough. We have a lot of working tools designed with this
principle in mind, such as CSS, autotools or our favorite git. They do
their job, and they do it well (except when they break horribly). But I
think we can put a little bit more effort into also ensuring that the
common use cases are not just doable, but also easy to implement and
maintain. And that means that we will sometimes have a requirement that
comes from how people think, and not from any particular technical need.
I know that it sounds like speculation, or theory, but I think we need
to tust in Jarda's experience with usability and his judgement about
what works better -- unless of course we are willing to learn all that
ourselves, which may take quite some time.

What is the point of having an expert, if we know better, after all?
-- 
Radomir Dopieralski


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-18 Thread Will Foster

On 13/12/13 09:41 -0500, Jay Dobies wrote:

* ability to 'preview' changes going to the scheduler


What does this give you? How detailed a preview do you need? What
information is critical there? Have you seen the proposed designs for
a heat template preview feature - would that be sufficient?


Will will probably have a better answer to this, but I feel like at 
very least this goes back to the psychology point raised earlier (I 
think in this thread, but if not, definitely one of the TripleO ones).


A weird parallel is whenever I do a new install of Fedora. I never 
accept their default disk partitioning without electing to 
review/modify it. Even if I didn't expect to change anything, I want 
to see what they are going to give me. And then I compulsively review 
the summary of what actual changes will be applied in the follow up 
screen that's displayed after I say I'm happy with the layout.


Perhaps that's more a commentary on my own OCD and cynicism that I 
feel dirty accepting the magic defaults blindly. I love the idea of 
anaconda doing the heavy lifting of figuring out sane defaults for 
home/root/swap and so on (similarly, I love the idea of Nova scheduler 
rationing out where instances are deployed), but I at least want to 
know I've seen it before it happens.


I fully admit to not knowing how common that sort of thing is. I 
suspect I'm in the majority of geeks and tame by sys admin standards, 
but I honestly don't know. So I acknowledge that my entire argument 
for the preview here is based on my own personality.




Jay,

I mirror your sentiments exactly here, the Fedora example is a good
one and is moreso the case when it comes to node allocation/details
and proposed changes in a deployment scenario.  Though 9/10 times the
defaults Nova scheduler will choose will be fine but there's a 'human'
need to review them, changing as necessary.

-will



pgpt6jWvlbElR.pgp
Description: PGP signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-16 Thread Will Foster

On 13/12/13 19:06 +1300, Robert Collins wrote:

On 13 December 2013 06:24, Will Foster  wrote:


I just wanted to add a few thoughts:


Thank you!


For some comparative information here "from the field" I work
extensively on deployments of large OpenStack implementations,
most recently with a ~220node/9rack deployment (scaling up to 42racks / 1024
nodes soon).  My primary role is of a Devops/Sysadmin nature, and not a
specific development area so rapid provisioning/tooling/automation is an
area I almost exclusively work within (mostly using API-driven
using Foreman/Puppet).  The infrastructure our small team designs/builds
supports our development and business.

I am the target user base you'd probably want to cater to.


Absolutely!


I can tell you the philosophy and mechanics of Tuskar/OOO are great,
something I'd love to start using extensively but there are some needed
aspects in the areas of control that I feel should be added (though arguably
less for me and more for my ilk who are looking to expand their OpenStack
footprint).

* ability to 'preview' changes going to the scheduler


What does this give you? How detailed a preview do you need? What
information is critical there? Have you seen the proposed designs for
a heat template preview feature - would that be sufficient?


Thanks for the reply.  Preview-wise it'd be useful to see node
allocation prior to deployment - nothing too in-depth.
I have not seen the heat template preview features, are you referring
to the YAML templating[1] or something else[2]?  I'd like to learn
more.

[1] -
http://docs.openstack.org/developer/heat/template_guide/hot_guide.html
[2] - https://github.com/openstack/heat-templates




* ability to override/change some aspects within node assignment


What would this be used to do? How often do those situations turn up?
Whats the impact if you can't do that?


One scenario might be that autodiscovery does not pick up an available
node in your pool of resources, or detects incorrectly - you could
manually change things as you like it.  Another (more common)
scenario is that you don't have an isolated, flat network with which
to deploy and nodes are picked that you do not want included in the
provisioning - you could remove those from the set of resources prior
to launching overcloud creation.  The impact would be that the tooling
would seem inflexible to those lacking a thoughtfully prepared 
network/infrastructure, or more commonly in cases where the existing

network design is too inflexible the usefulness and quick/seamless
provisioning benefits would fall short.




* ability to view at least minimal logging from within Tuskar UI


Logging of what - the deployment engine? The heat event-log? Nova
undercloud logs? Logs from the deployed instances? If it's not there
in V1, but you can get, or already have credentials for the [instances
that hold the logs that you wanted] would that be a big adoption
blocker, or just a nuisance?



Logging of the deployment engine status during the bootstrapping
process initially, and some rudimentary node success/failure
indication.  It should be simplistic enough to not rival existing monitoring/log
systems but at least provide deployment logs as the overcloud is being
built and a general node/health 'check-in' that it's complete.

Afterwards as you mentioned the logs are available on the deployed
systems.  Think of it as providing some basic written navigational signs 
for people crossing a small bridge before they get to the highway,

there's continuity from start -> finish and a clear sense of what's
occurring.  From my perspective, absence of this type of verbosity may
impede adoption of new users (who are used to this type of
information with deployment tooling).




Here's the main reason - most new adopters of OpenStack/IaaS are going to be
running legacy/mixed hardware and while they might have an initiative to
explore and invest and even a decent budget most of them are not going to
have
completely identical hardware, isolated/flat networks and things set
aside in such a way that blind auto-discovery/deployment will just work all
the time.


Thats great information (and something I reasonably well expected, to
a degree). We have a hard dependency on no wildcard DHCP servers in
the environment (or we can't deploy). Autodiscovery is something we
don't have yet, but certainly debugging deployment failures is a very
important use case and one we need to improve both at the plumbing
layer and in the stories around it in the UI.


There will be a need to sometimes adjust, and those coming from a more
vertically-scaling infrastructure (most large orgs.) will not have
100% matching standards in place of vendor, machine spec and network design
which may make Tuscar/OOO seem inflexible and 'one-way'.  This may just be a
carry-over or fear of the old ways of deployment but nonetheless it
is present.


I'm not sure what you mean by matching standards here :). Ironic is
designed to support

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-13 Thread Matt Wagner
On Mon Dec  9 15:22:04 2013, Robert Collins wrote:
> On 9 December 2013 23:56, Jaromir Coufal  wrote:
>>
>> Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
>> stats
>>
>> For registration it is just Management MAC address which is needed right? Or
>> does Ironic need also IP? I think that MAC address might be enough, we can
>> display IP in details of node later on.
>
> Ironic needs all the details I listed today. Management MAC is not
> currently used at all, but would be needed in future when we tackle
> IPMI IP managed by Neutron.

I think what happened here is that two separate things we need got
conflated.

We need the IP address of the management (IPMI) interface, for power
control, etc.

We also need the MAC of the host system (*not* its IPMI/management
interface) for PXE to serve it the appropriate content.


-- 
Matt Wagner
Software Engineer, Red Hat



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-13 Thread Jay Dobies

* ability to 'preview' changes going to the scheduler


What does this give you? How detailed a preview do you need? What
information is critical there? Have you seen the proposed designs for
a heat template preview feature - would that be sufficient?


Will will probably have a better answer to this, but I feel like at very 
least this goes back to the psychology point raised earlier (I think in 
this thread, but if not, definitely one of the TripleO ones).


A weird parallel is whenever I do a new install of Fedora. I never 
accept their default disk partitioning without electing to review/modify 
it. Even if I didn't expect to change anything, I want to see what they 
are going to give me. And then I compulsively review the summary of what 
actual changes will be applied in the follow up screen that's displayed 
after I say I'm happy with the layout.


Perhaps that's more a commentary on my own OCD and cynicism that I feel 
dirty accepting the magic defaults blindly. I love the idea of anaconda 
doing the heavy lifting of figuring out sane defaults for home/root/swap 
and so on (similarly, I love the idea of Nova scheduler rationing out 
where instances are deployed), but I at least want to know I've seen it 
before it happens.


I fully admit to not knowing how common that sort of thing is. I suspect 
I'm in the majority of geeks and tame by sys admin standards, but I 
honestly don't know. So I acknowledge that my entire argument for the 
preview here is based on my own personality.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 06:24, Will Foster  wrote:

> I just wanted to add a few thoughts:

Thank you!

> For some comparative information here "from the field" I work
> extensively on deployments of large OpenStack implementations,
> most recently with a ~220node/9rack deployment (scaling up to 42racks / 1024
> nodes soon).  My primary role is of a Devops/Sysadmin nature, and not a
> specific development area so rapid provisioning/tooling/automation is an
> area I almost exclusively work within (mostly using API-driven
> using Foreman/Puppet).  The infrastructure our small team designs/builds
> supports our development and business.
>
> I am the target user base you'd probably want to cater to.

Absolutely!

> I can tell you the philosophy and mechanics of Tuskar/OOO are great,
> something I'd love to start using extensively but there are some needed
> aspects in the areas of control that I feel should be added (though arguably
> less for me and more for my ilk who are looking to expand their OpenStack
> footprint).
>
> * ability to 'preview' changes going to the scheduler

What does this give you? How detailed a preview do you need? What
information is critical there? Have you seen the proposed designs for
a heat template preview feature - would that be sufficient?

> * ability to override/change some aspects within node assignment

What would this be used to do? How often do those situations turn up?
Whats the impact if you can't do that?

> * ability to view at least minimal logging from within Tuskar UI

Logging of what - the deployment engine? The heat event-log? Nova
undercloud logs? Logs from the deployed instances? If it's not there
in V1, but you can get, or already have credentials for the [instances
that hold the logs that you wanted] would that be a big adoption
blocker, or just a nuisance?


> Here's the main reason - most new adopters of OpenStack/IaaS are going to be
> running legacy/mixed hardware and while they might have an initiative to
> explore and invest and even a decent budget most of them are not going to
> have
> completely identical hardware, isolated/flat networks and things set
> aside in such a way that blind auto-discovery/deployment will just work all
> the time.

Thats great information (and something I reasonably well expected, to
a degree). We have a hard dependency on no wildcard DHCP servers in
the environment (or we can't deploy). Autodiscovery is something we
don't have yet, but certainly debugging deployment failures is a very
important use case and one we need to improve both at the plumbing
layer and in the stories around it in the UI.

> There will be a need to sometimes adjust, and those coming from a more
> vertically-scaling infrastructure (most large orgs.) will not have
> 100% matching standards in place of vendor, machine spec and network design
> which may make Tuscar/OOO seem inflexible and 'one-way'.  This may just be a
> carry-over or fear of the old ways of deployment but nonetheless it
> is present.

I'm not sure what you mean by matching standards here :). Ironic is
designed to support extremely varied environments with arbitrary mixes
of IPMI/drac/ilo/what-have-you, and abstract that away for us. From a
network perspective I've been arguing the following:

 - we need routable access to the mgmt cards
 - if we don't have that (say there are 5 different mgmt domains with
no routing between them) then we install 5 deployment layers (5
underclouds) which could be as small as one machine each.
 - within the machines that are served by one routable region of mgmt
cards, we need no wildcard DHCP servers, for our DHCP server to serve
PXE to the machines (for the PXE driver in Ironic).
 - building a single region overcloud from multiple undercloud regions
will involve manually injecting well known endpoints (such as the
floating virtual IP for API endpoints) into some of the regions, but
it's in principle straightforward to do and use with the plumbing
layer today.

> In my case, we're lucky enough to have dedicated, near-identical
> equipment and a flexible network design we've architected prior that
> makes Tuskar/OOO a great fit.  Most people will not have this
> greenfield ability and will use what they have lying around initially
> as to not make a big investment until familiarity and trust of
> something new is permeated.
>
> That said, I've been working with Jaromir Coufal on some UI mockups of
> Tuskar with some of this 'advanced' functionality included and from
> my perspective it looks like something to consider pulling in sooner than
> later if you want to maximize the adoption of new users.

So, for Tuskar my short term goals are to support RH in shipping a
polished product while still architecting and building something
sustainable and suitable for integration into the OpenStack ecosystem.
(For instance, one of the requirements for integration is that we
don't [significantly] overlap other projects - and thats why I've been
pushing so hard on the do

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 10:05, Jay Dobies  wrote:
>> Maybe this is a valid use case?

> You mention three specific nodes, but what you're describing is more likely
> three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
>
> They may have one node in each, but I think your example of three nodes is
> potentially *too* simplified to be considered as proper sample size. I'd
> guess there are more than three in play commonly, in which case the concepts
> breakdown starts to be more appealing.
>
> I think the disk flavor in particular has quite a few use cases, especially
> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the
> disk hotness") as hosting the data-intensive portions, but where I had
> previously been viewing that as manual allocation, it sounds like the
> approach is to properly categorize them for what they are and teach Nova how
> to use them.
>
> Robert - Please correct me if I misread any of what your intention was, I
> don't want to drive people down the wrong path if I'm misinterpretting
> anything.

You nailed it, no butchering involved at all!

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 06:13, Keith Basil  wrote:
> On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

>>> My question is - can't we help them now? To enable users to use our app even
>>> when we don't have enough smartness to help them 'auto' way?
>>
>> I understand the question: but I can't answer it until we have *an*
>> example that is both real and not deliverable today. At the moment the
>> only one we know of is HA, and thats certainly an important feature on
>> the nova scheduled side, so doing manual control to deliver a future
>> automatic feature doesn't make a lot of sense to me. Crawl, walk, run.
>
> Maybe this is a valid use case?
>
> Cloud operator has several core service nodes of differing configuration
> types.
>
> [node1]  <-- balanced mix of disk/cpu/ram for general core services
> [node2]  <-- lots of disks for Ceilometer data storage
> [node3]  <-- low-end "appliance like" box for a specialized/custom core 
> service
>  (SIEM box for example)
>
> All nodes[1,2,3] are in the same deployment grouping ("core services)".  As 
> such,
> this is a heterogenous deployment grouping.  Heterogeneity in this case 
> defined by
> differing roles and hardware configurations.
>
> This is a real use case.
>
> How do we handle this?

Ok, so node1 gets flavor A, node2 gets flavor B, node3 gets flavor C.

We have three disk images, one with general core services on it
(imageA), one with ceilometer backend storage (imageB), one with SIEM
on it (imageC).
And we have three service groups, one that binds imageA to {flavors:
[FlavorA], count:1}, one that binds imageB to {flavors:[FlavorB],
count:1}, one that binds imageC to {flavors:[FlavorC], count:1}

Thats doable by the plumbing today, without any bypass of the Nova scheduler.

FlavorB might be the same as the flavor for gluster boxes for
instance, in which case you'll get commonality - if one fails, we can
schedule onto another.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Robert Collins
On 13 December 2013 05:35, Keith Basil  wrote:
> On Dec 10, 2013, at 5:09 PM, Robert Collins wrote:






 unallocated | aqvailable | undeployed
>>>
>>> +1 unallocated
>>
>> I think available is most accurate, but undeployed works too. I really
>> don't like unallocated, sorry!
>
> Would "available" introduce/denote that the service is deployed
> and operational?

It could lead to that confusion. Jaromir suggested free in the other
thread, I think that that would work well and avoid the confusion with
'working service' that available has.


>> Brainstorming: role is something like 'KVM compute', but we may have
>> two differing only in configuration sets of that role. In a very
>> technical sense it's actually:
>> image + configuration -> scaling group in Heat.
>> So perhaps:
>> Role + Service group ?
>> e.g. GPU KVM Hypervisor would be a service group, using the KVM
>> Compute role aka disk image.
>>
>> Or perhaps we should actually surface image all the way up:
>>
>> Image + Service group ?
>> image = what things we build into the image
>> service group = what runtime configuration we're giving, including how
>> many machines we want in the group
>>
> How about just leaving it as Resource Class?  The things you've
> brainstormed about are in line with the original thinking around
> the resource class concept.
>
> role (assumes role specific image) +
> service/resource grouping +
> hardware that can provide that service/resource

So, Resource Class as originally communicated really is quite
different to me: though obviously there is some overlap. I can drill
into that if you want ... however the implications of the words and
how folk can map from them back to the plumbing is what really
concerns me, so thats what I'll focus on here.

Specifically: Resource Class was focused on the resources being
offered into the overcloud, but the image + (service config/service
group/group config) idea applies to all things we deploy equally -
it's relevant to management instances, control plane instances, as
well as Nova and Cinder. So the Resource part of it doesn't really
fit. Using 'Class' is just jargon - I would expect it to be pretty
impenetrable to non-programmers.

Ideally I think we want something that:
 - has a fairly obvious mapping back to Nova/Heat terminology (e.g. if
the concepts are the same, lets call them the same)
 - doesn't overlap other terms unless they are compatible.

For instance Heat has a concept 'resourcegroup' where resource means
'the object that heat has created and is managing' and the group
refers to scaling to some N of them. This is what we will eventually
back a particular image + config onto - that becomes one resourcegroup
in heat; using resource class to refer to that when the resource
referred to is the delivered service, not 'Instance's (the Nova
baremetal instances we create through the resourcegroup) is going to
cause significant confusion at minimum :)

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Jay Dobies



On 12/12/2013 04:25 PM, Keith Basil wrote:

On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote:


Maybe this is a valid use case?

Cloud operator has several core service nodes of differing configuration
types.

[node1]  <-- balanced mix of disk/cpu/ram for general core services
[node2]  <-- lots of disks for Ceilometer data storage
[node3]  <-- low-end "appliance like" box for a specialized/custom core service
 (SIEM box for example)

All nodes[1,2,3] are in the same deployment grouping ("core services)".  As 
such,
this is a heterogenous deployment grouping.  Heterogeneity in this case defined 
by
differing roles and hardware configurations.

This is a real use case.

How do we handle this?


This is the sort of thing I had been concerned with, but I think this is just a 
variation on Robert's GPU example. Rather than butcher it by paraphrasing, I'll 
just include the relevant part:


"The basic stuff we're talking about so far is just about saying each
role can run on some set of undercloud flavors. If that new bit of kit
has the same coarse metadata as other kit, Nova can't tell it apart.
So the way to solve the problem is:
- a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
- b) teach Nova that there is a flavor that maps to the presence of
that specialness, and
   c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU
is a special role ('GPU compute' for instance) and only that role
would be given that flavor to use. That special config is probably
being in a host aggregate, with an overcloud flavor that specifies
that aggregate, which means at the TripleO level we need to put the
aggregate in the config metadata for that role, and the admin does a
one-time setup in the Nova Horizon UI to configure their GPU compute
flavor."



Yes, the core services example is a variation on the above.  The idea
of _undercloud_ flavor assignment (flavor to role mapping) escaped me
when I read that earlier.

It appears to be very elegant and provides another attribute for Tuskar's
notion of resource classes.  So +1 here.



You mention three specific nodes, but what you're describing is more likely 
three concepts:
- Balanced Nodes
- High Disk I/O Nodes
- Low-End Appliance Nodes

They may have one node in each, but I think your example of three nodes is 
potentially *too* simplified to be considered as proper sample size. I'd guess 
there are more than three in play commonly, in which case the concepts 
breakdown starts to be more appealing.


Correct - definitely more than three, I just wanted to illustrate the use case.


I not sure I explained what I was getting at properly. I wasn't implying 
you thought it was limited to just three. I do the same thing, simplify 
down for discussion purposes (I've done so in my head about this very 
topic).


But I think this may be a rare case where simplifying actually masks the 
concept rather than exposes it. Manual feels a bit more desirable in 
small sample groups but when looking at larger sets of nodes, the flavor 
concept feels less odd than it does when defining a flavor for a single 
machine.


That's all. :) Maybe that was clear already, but I wanted to make sure I 
didn't come off as attacking your example. It certainly wasn't my 
intention. The balanced v. disk machine thing is the sort of thing I'd 
been thinking for a while but hadn't found a good way to make concrete.



I think the disk flavor in particular has quite a few use cases, especially until SSDs 
are ubiquitous. I'd want to flag those (in Jay terminology, "the disk hotness") 
as hosting the data-intensive portions, but where I had previously been viewing that as 
manual allocation, it sounds like the approach is to properly categorize them for what 
they are and teach Nova how to use them.

Robert - Please correct me if I misread any of what your intention was, I don't 
want to drive people down the wrong path if I'm misinterpretting anything.


-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 12, 2013, at 4:05 PM, Jay Dobies wrote:

>> Maybe this is a valid use case?
>> 
>> Cloud operator has several core service nodes of differing configuration
>> types.
>> 
>> [node1]  <-- balanced mix of disk/cpu/ram for general core services
>> [node2]  <-- lots of disks for Ceilometer data storage
>> [node3]  <-- low-end "appliance like" box for a specialized/custom core 
>> service
>>   (SIEM box for example)
>> 
>> All nodes[1,2,3] are in the same deployment grouping ("core services)".  As 
>> such,
>> this is a heterogenous deployment grouping.  Heterogeneity in this case 
>> defined by
>> differing roles and hardware configurations.
>> 
>> This is a real use case.
>> 
>> How do we handle this?
> 
> This is the sort of thing I had been concerned with, but I think this is just 
> a variation on Robert's GPU example. Rather than butcher it by paraphrasing, 
> I'll just include the relevant part:
> 
> 
> "The basic stuff we're talking about so far is just about saying each
> role can run on some set of undercloud flavors. If that new bit of kit
> has the same coarse metadata as other kit, Nova can't tell it apart.
> So the way to solve the problem is:
> - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
> - b) teach Nova that there is a flavor that maps to the presence of
> that specialness, and
>   c) teach Nova that other flavors may not map to that specialness
> 
> then in Tuskar whatever Nova configuration is needed to use that GPU
> is a special role ('GPU compute' for instance) and only that role
> would be given that flavor to use. That special config is probably
> being in a host aggregate, with an overcloud flavor that specifies
> that aggregate, which means at the TripleO level we need to put the
> aggregate in the config metadata for that role, and the admin does a
> one-time setup in the Nova Horizon UI to configure their GPU compute
> flavor."
> 

Yes, the core services example is a variation on the above.  The idea
of _undercloud_ flavor assignment (flavor to role mapping) escaped me
when I read that earlier.

It appears to be very elegant and provides another attribute for Tuskar's
notion of resource classes.  So +1 here.


> You mention three specific nodes, but what you're describing is more likely 
> three concepts:
> - Balanced Nodes
> - High Disk I/O Nodes
> - Low-End Appliance Nodes
> 
> They may have one node in each, but I think your example of three nodes is 
> potentially *too* simplified to be considered as proper sample size. I'd 
> guess there are more than three in play commonly, in which case the concepts 
> breakdown starts to be more appealing.

Correct - definitely more than three, I just wanted to illustrate the use case.

> I think the disk flavor in particular has quite a few use cases, especially 
> until SSDs are ubiquitous. I'd want to flag those (in Jay terminology, "the 
> disk hotness") as hosting the data-intensive portions, but where I had 
> previously been viewing that as manual allocation, it sounds like the 
> approach is to properly categorize them for what they are and teach Nova how 
> to use them.
> 
> Robert - Please correct me if I misread any of what your intention was, I 
> don't want to drive people down the wrong path if I'm misinterpretting 
> anything.

-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Jay Dobies

Maybe this is a valid use case?

Cloud operator has several core service nodes of differing configuration
types.

[node1]  <-- balanced mix of disk/cpu/ram for general core services
[node2]  <-- lots of disks for Ceilometer data storage
[node3]  <-- low-end "appliance like" box for a specialized/custom core service
 (SIEM box for example)

All nodes[1,2,3] are in the same deployment grouping ("core services)".  As 
such,
this is a heterogenous deployment grouping.  Heterogeneity in this case defined 
by
differing roles and hardware configurations.

This is a real use case.

How do we handle this?


This is the sort of thing I had been concerned with, but I think this is 
just a variation on Robert's GPU example. Rather than butcher it by 
paraphrasing, I'll just include the relevant part:



"The basic stuff we're talking about so far is just about saying each
role can run on some set of undercloud flavors. If that new bit of kit
has the same coarse metadata as other kit, Nova can't tell it apart.
So the way to solve the problem is:
 - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
 - b) teach Nova that there is a flavor that maps to the presence of
that specialness, and
   c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU
is a special role ('GPU compute' for instance) and only that role
would be given that flavor to use. That special config is probably
being in a host aggregate, with an overcloud flavor that specifies
that aggregate, which means at the TripleO level we need to put the
aggregate in the config metadata for that role, and the admin does a
one-time setup in the Nova Horizon UI to configure their GPU compute
flavor."


You mention three specific nodes, but what you're describing is more 
likely three concepts:

- Balanced Nodes
- High Disk I/O Nodes
- Low-End Appliance Nodes

They may have one node in each, but I think your example of three nodes 
is potentially *too* simplified to be considered as proper sample size. 
I'd guess there are more than three in play commonly, in which case the 
concepts breakdown starts to be more appealing.


I think the disk flavor in particular has quite a few use cases, 
especially until SSDs are ubiquitous. I'd want to flag those (in Jay 
terminology, "the disk hotness") as hosting the data-intensive portions, 
but where I had previously been viewing that as manual allocation, it 
sounds like the approach is to properly categorize them for what they 
are and teach Nova how to use them.


Robert - Please correct me if I misread any of what your intention was, 
I don't want to drive people down the wrong path if I'm misinterpretting 
anything.






-k


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Will Foster

On 12/12/13 09:42 +1300, Robert Collins wrote:

On 12 December 2013 01:17, Jaromir Coufal  wrote:

On 2013/10/12 23:09, Robert Collins wrote:



The 'easiest' way is to support bigger companies with huge deployments,
tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old
heterogeneous
hardware. Very likely even more than the number of companies having
already
mentioned large deployments. And giving them only the way of 'setting up
rules' in order to get the service on the node - this type of user is not
gonna use our deployment system.



Thats speculation. We don't know if they will or will not because we
haven't given them a working system to test.


Some part of that is speculation, some part of that is feedback from people
who are doing deployments (of course its just very limited audience).
Anyway, it is not just pure theory.


Sure. Let be me more precise. There is a hypothesis that lack of
direct control will be a significant adoption blocker for a primary
group of users.

I think it's safe to say that some users in the group 'sysadmins
having to deploy an OpenStack cloud' will find it a bridge too far and
not use a system without direct control. Call this group A.

I think it's also safe to say that some users will not care in the
slightest, because their deployment is too small for them to be
particularly worried (e.g. about occasional downtime (but they would
worry a lot about data loss)). Call this group B.

I suspect we don't need to consider group C - folk who won't use a
system if it *has* manual control, but thats only a suspicion. It may
be that the side effect of adding direct control is to reduce
usability below the threshold some folk need...

To assess 'significant adoption blocker' we basically need to find the
% of users who will care sufficiently that they don't use TripleO.

How can we do that? We can do questionnaires, and get such folk to
come talk with use, but that suffers from selection bias - group B can
use the system with or without direct manual control, so have little
motivation to argue vigorously in any particular direction. Group A
however have to argue because they won't use the system at all without
that feature, and they may want to use the system for other reasons,
so that because a crucial aspect for them.

A much better way IMO is to test it - to get a bunch of volunteers and
see who responds positively to a demo *without* direct manual control.

To do that we need a demoable thing, which might just be mockups that
show a set of workflows (and include things like Jay's
shiny-new-hardware use case in the demo).

I rather suspect we're building that anyway as part of doing UX work,
so maybe what we do is put a tweet or blog post up asking for
sysadmins who a) have not yet deployed openstack, b) want to, and c)
are willing to spend 20-30 minutes with us, walk them through a demo
showing no manual control, and record what questions they ask, and
whether they would like to have that product to us, and if not, then
(a) what use cases they can't address with the mockups and (b) what
other reasons they have for not using it.

This is a bunch of work though!

So, do we need to do that work?

*If* we can layer manual control on later, then we could defer this
testing until we are at the point where we can say 'the nova scheduled
version is ready, now lets decide if we add the manual control'.

OTOH, if we *cannot* layer manual control on later - if it has
tentacles through too much of the code base, then we need to decide
earlier, because it will be significantly harder to add later and that
may be too late of a ship date for vendors shipping on top of TripleO.

So with that as a prelude, my technical sense is that we can layer
manual scheduling on later: we provide an advanced screen, show the
list of N instances we're going to ask for and allow each instance to
be directly customised with a node id selected from either the current
node it's running on or an available node. It's significant work both
UI and plumbing, but it's not going to be made harder by the other
work we're doing AFAICT.

-> My proposal is that we shelve this discussion until we have the
nova/heat scheduled version in 'and now we polish' mode, and then pick
it back up and assess user needs.

An alternative argument is to say that group A is a majority of the
userbase and that doing an automatic version is entirely unnecessary.
Thats also possible, but I'm extremely skeptical, given the huge cost
of staff time, and the complete lack of interest my sysadmin friends
(and my former sysadmin self) have in doing automatable things by
hand.


I just wanted to add a few thoughts:

For some comparative information here "from the field" I work
extensively on deployments of large OpenStack implementations,
most recently with a ~220node/9rack deployment (scaling up to 
42racks / 1024 nodes soon).  My primary role is of a Devops/Sysadmin 
nature, and no

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

> On 12 December 2013 01:17, Jaromir Coufal  wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:
> 
 The 'easiest' way is to support bigger companies with huge deployments,
 tailored infrastructure, everything connected properly.
 
 But there are tons of companies/users who are running on old
 heterogeneous
 hardware. Very likely even more than the number of companies having
 already
 mentioned large deployments. And giving them only the way of 'setting up
 rules' in order to get the service on the node - this type of user is not
 gonna use our deployment system.
>>> 
>>> 
>>> Thats speculation. We don't know if they will or will not because we
>>> haven't given them a working system to test.
>> 
>> Some part of that is speculation, some part of that is feedback from people
>> who are doing deployments (of course its just very limited audience).
>> Anyway, it is not just pure theory.
> 
> Sure. Let be me more precise. There is a hypothesis that lack of
> direct control will be a significant adoption blocker for a primary
> group of users.
> 
> I think it's safe to say that some users in the group 'sysadmins
> having to deploy an OpenStack cloud' will find it a bridge too far and
> not use a system without direct control. Call this group A.
> 
> I think it's also safe to say that some users will not care in the
> slightest, because their deployment is too small for them to be
> particularly worried (e.g. about occasional downtime (but they would
> worry a lot about data loss)). Call this group B.
> 
> I suspect we don't need to consider group C - folk who won't use a
> system if it *has* manual control, but thats only a suspicion. It may
> be that the side effect of adding direct control is to reduce
> usability below the threshold some folk need...
> 
> To assess 'significant adoption blocker' we basically need to find the
> % of users who will care sufficiently that they don't use TripleO.
> 
> How can we do that? We can do questionnaires, and get such folk to
> come talk with use, but that suffers from selection bias - group B can
> use the system with or without direct manual control, so have little
> motivation to argue vigorously in any particular direction. Group A
> however have to argue because they won't use the system at all without
> that feature, and they may want to use the system for other reasons,
> so that because a crucial aspect for them.
> 
> A much better way IMO is to test it - to get a bunch of volunteers and
> see who responds positively to a demo *without* direct manual control.
> 
> To do that we need a demoable thing, which might just be mockups that
> show a set of workflows (and include things like Jay's
> shiny-new-hardware use case in the demo).
> 
> I rather suspect we're building that anyway as part of doing UX work,
> so maybe what we do is put a tweet or blog post up asking for
> sysadmins who a) have not yet deployed openstack, b) want to, and c)
> are willing to spend 20-30 minutes with us, walk them through a demo
> showing no manual control, and record what questions they ask, and
> whether they would like to have that product to us, and if not, then
> (a) what use cases they can't address with the mockups and (b) what
> other reasons they have for not using it.
> 
> This is a bunch of work though!
> 
> So, do we need to do that work?
> 
> *If* we can layer manual control on later, then we could defer this
> testing until we are at the point where we can say 'the nova scheduled
> version is ready, now lets decide if we add the manual control'.
> 
> OTOH, if we *cannot* layer manual control on later - if it has
> tentacles through too much of the code base, then we need to decide
> earlier, because it will be significantly harder to add later and that
> may be too late of a ship date for vendors shipping on top of TripleO.
> 
> So with that as a prelude, my technical sense is that we can layer
> manual scheduling on later: we provide an advanced screen, show the
> list of N instances we're going to ask for and allow each instance to
> be directly customised with a node id selected from either the current
> node it's running on or an available node. It's significant work both
> UI and plumbing, but it's not going to be made harder by the other
> work we're doing AFAICT.
> 
> -> My proposal is that we shelve this discussion until we have the
> nova/heat scheduled version in 'and now we polish' mode, and then pick
> it back up and assess user needs.
> 
> An alternative argument is to say that group A is a majority of the
> userbase and that doing an automatic version is entirely unnecessary.
> Thats also possible, but I'm extremely skeptical, given the huge cost
> of staff time, and the complete lack of interest my sysadmin friends
> (and my former sysadmin self) have in doing automatable things by
> hand.
> 
>>> Lets break the concern into two halves:
>>> A) Users who could have t

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-12 Thread Keith Basil
On Dec 10, 2013, at 5:09 PM, Robert Collins wrote:

> On 11 December 2013 05:42, Jaromir Coufal  wrote:
>> On 2013/09/12 23:38, Tzu-Mainn Chen wrote:
>>> The disagreement comes from whether we need manual node assignment or not.
>>> I would argue that we
>>> need to step back and take a look at the real use case: heterogeneous
>>> nodes.  If there are literally
>>> no characteristics that differentiate nodes A and B, then why do we care
>>> which gets used for what?  Why
>>> do we need to manually assign one?
>> 
>> 
>> Ideally, we don't. But with this approach we would take out the possibility
>> to change something or decide something from the user.
> 
> So, I think this is where the confusion is. Using the nova scheduler
> doesn't prevent change or control. It just ensures the change and
> control happen in the right place: the Nova scheduler has had years of
> work, of features and facilities being added to support HPC, HA and
> other such use cases. It should have everything we need [1], without
> going down to manual placement. For clarity: manual placement is when
> any of the user, Tuskar, or Heat query Ironic, select a node, and then
> use a scheduler hint to bypass the scheduler.
> 
>> The 'easiest' way is to support bigger companies with huge deployments,
>> tailored infrastructure, everything connected properly.
>> 
>> But there are tons of companies/users who are running on old heterogeneous
>> hardware. Very likely even more than the number of companies having already
>> mentioned large deployments. And giving them only the way of 'setting up
>> rules' in order to get the service on the node - this type of user is not
>> gonna use our deployment system.
> 
> Thats speculation. We don't know if they will or will not because we
> haven't given them a working system to test.
> 
> Lets break the concern into two halves:
> A) Users who could have their needs met, but won't use TripleO because
> meeting their needs in this way is too hard/complex/painful.
> 
> B) Users who have a need we cannot meet with the current approach.
> 
> For category B users, their needs might be specific HA things - like
> the oft discussed failure domains angle, where we need to split up HA
> clusters across power bars, aircon, switches etc. Clearly long term we
> want to support them, and the undercloud Nova scheduler is entirely
> capable of being informed about this, and we can evolve to a holistic
> statement over time. Lets get a concrete list of the cases we can
> think of today that won't be well supported initially, and we can
> figure out where to do the work to support them properly.
> 
> For category A users, I think that we should get concrete examples,
> and evolve our design (architecture and UX) to make meeting those
> needs pleasant.
> 
> What we shouldn't do is plan complex work without concrete examples
> that people actually need. Jay's example of some shiny new compute
> servers with special parts that need to be carved out was a great one
> - we can put that in category A, and figure out if it's easy enough,
> or obvious enough - and think about whether we document it or make it
> a guided workflow or $whatever.
> 
>> Somebody might argue - why do we care? If user doesn't like TripleO
>> paradigm, he shouldn't use the UI and should use another tool. But the UI is
>> not only about TripleO. Yes, it is underlying concept, but we are working on
>> future *official* OpenStack deployment tool. We should care to enable people
>> to deploy OpenStack - large/small scale, homo/heterogeneous hardware,
>> typical or a bit more specific use-cases.
> 
> The difficulty I'm having is that the discussion seems to assume that
> 'heterogeneous implies manual', but I don't agree that that
> implication is necessary!
> 
>> As an underlying paradigm of how to install cloud - awesome idea, awesome
>> concept, it works. But user doesn't care about how it is being deployed for
>> him. He cares about getting what he wants/needs. And we shouldn't go that
>> far that we violently force him to treat his infrastructure as cloud. I
>> believe that possibility to change/control - if needed - is very important
>> and we should care.
> 
> I propose that we make concrete use cases: 'Fred cannot use TripleO
> without manual assignment because XYZ'. Then we can assess how
> important XYZ is to our early adopters and go from there.
> 
>> And what is key for us is to *enable* users - not to prevent them from using
>> our deployment tool, because it doesn't work for their requirements.
> 
> Totally agreed :)
> 
>>> If we can agree on that, then I think it would be sufficient to say that
>>> we want a mechanism to allow
>>> UI users to deal with heterogeneous nodes, and that mechanism must use
>>> nova-scheduler.  In my mind,
>>> that's what resource classes and node profiles are intended for.
>> 
>> 
>> Not arguing on this point. Though that mechanism should support also cases,
>> where user specifies a role for a node / removes node fro

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-11 Thread Robert Collins
On 12 December 2013 01:17, Jaromir Coufal  wrote:
> On 2013/10/12 23:09, Robert Collins wrote:

>>> The 'easiest' way is to support bigger companies with huge deployments,
>>> tailored infrastructure, everything connected properly.
>>>
>>> But there are tons of companies/users who are running on old
>>> heterogeneous
>>> hardware. Very likely even more than the number of companies having
>>> already
>>> mentioned large deployments. And giving them only the way of 'setting up
>>> rules' in order to get the service on the node - this type of user is not
>>> gonna use our deployment system.
>>
>>
>> Thats speculation. We don't know if they will or will not because we
>> haven't given them a working system to test.
>
> Some part of that is speculation, some part of that is feedback from people
> who are doing deployments (of course its just very limited audience).
> Anyway, it is not just pure theory.

Sure. Let be me more precise. There is a hypothesis that lack of
direct control will be a significant adoption blocker for a primary
group of users.

I think it's safe to say that some users in the group 'sysadmins
having to deploy an OpenStack cloud' will find it a bridge too far and
not use a system without direct control. Call this group A.

I think it's also safe to say that some users will not care in the
slightest, because their deployment is too small for them to be
particularly worried (e.g. about occasional downtime (but they would
worry a lot about data loss)). Call this group B.

I suspect we don't need to consider group C - folk who won't use a
system if it *has* manual control, but thats only a suspicion. It may
be that the side effect of adding direct control is to reduce
usability below the threshold some folk need...

To assess 'significant adoption blocker' we basically need to find the
% of users who will care sufficiently that they don't use TripleO.

How can we do that? We can do questionnaires, and get such folk to
come talk with use, but that suffers from selection bias - group B can
use the system with or without direct manual control, so have little
motivation to argue vigorously in any particular direction. Group A
however have to argue because they won't use the system at all without
that feature, and they may want to use the system for other reasons,
so that because a crucial aspect for them.

A much better way IMO is to test it - to get a bunch of volunteers and
see who responds positively to a demo *without* direct manual control.

To do that we need a demoable thing, which might just be mockups that
show a set of workflows (and include things like Jay's
shiny-new-hardware use case in the demo).

I rather suspect we're building that anyway as part of doing UX work,
so maybe what we do is put a tweet or blog post up asking for
sysadmins who a) have not yet deployed openstack, b) want to, and c)
are willing to spend 20-30 minutes with us, walk them through a demo
showing no manual control, and record what questions they ask, and
whether they would like to have that product to us, and if not, then
(a) what use cases they can't address with the mockups and (b) what
other reasons they have for not using it.

This is a bunch of work though!

So, do we need to do that work?

*If* we can layer manual control on later, then we could defer this
testing until we are at the point where we can say 'the nova scheduled
version is ready, now lets decide if we add the manual control'.

OTOH, if we *cannot* layer manual control on later - if it has
tentacles through too much of the code base, then we need to decide
earlier, because it will be significantly harder to add later and that
may be too late of a ship date for vendors shipping on top of TripleO.

So with that as a prelude, my technical sense is that we can layer
manual scheduling on later: we provide an advanced screen, show the
list of N instances we're going to ask for and allow each instance to
be directly customised with a node id selected from either the current
node it's running on or an available node. It's significant work both
UI and plumbing, but it's not going to be made harder by the other
work we're doing AFAICT.

-> My proposal is that we shelve this discussion until we have the
nova/heat scheduled version in 'and now we polish' mode, and then pick
it back up and assess user needs.

An alternative argument is to say that group A is a majority of the
userbase and that doing an automatic version is entirely unnecessary.
Thats also possible, but I'm extremely skeptical, given the huge cost
of staff time, and the complete lack of interest my sysadmin friends
(and my former sysadmin self) have in doing automatable things by
hand.

>> Lets break the concern into two halves:
>> A) Users who could have their needs met, but won't use TripleO because
>> meeting their needs in this way is too hard/complex/painful.
>>
>> B) Users who have a need we cannot meet with the current approach.
>>
>> For category B users, their needs might be speci

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-11 Thread Tzu-Mainn Chen
> On 2013/10/12 19:39, Tzu-Mainn Chen wrote:
> >>
> >> Ideally, we don't. But with this approach we would take out the
> >> possibility to change something or decide something from the user.
> >>
> >> The 'easiest' way is to support bigger companies with huge deployments,
> >> tailored infrastructure, everything connected properly.
> >>
> >> But there are tons of companies/users who are running on old
> >> heterogeneous hardware. Very likely even more than the number of
> >> companies having already mentioned large deployments. And giving them
> >> only the way of 'setting up rules' in order to get the service on the
> >> node - this type of user is not gonna use our deployment system.
> >>
> >> Somebody might argue - why do we care? If user doesn't like TripleO
> >> paradigm, he shouldn't use the UI and should use another tool. But the
> >> UI is not only about TripleO. Yes, it is underlying concept, but we are
> >> working on future *official* OpenStack deployment tool. We should care
> >> to enable people to deploy OpenStack - large/small scale,
> >> homo/heterogeneous hardware, typical or a bit more specific use-cases.
> >
> > I think this is a very important clarification, and I'm glad you made it.
> > It sounds
> > like manual assignment is actually a sub-requirement, and the feature
> > you're arguing
> > for is: supporting non-TripleO deployments.
>
> Mostly but not only. The other argument is - keeping control on stuff I
> am doing. Note that undercloud user is different from overcloud user.

Sure, but again, that argument seems to me to be a non-TripleO approach.  I'm
not saying that it's not a possible use case, I'm saying that you're advocating
for a deployment strategy that fundamentally diverges from the TripleO
philosophy - and as such, that strategy will likely require a separate UI, 
underlying
architecture, etc, and should not be planned for in the Icehouse timeframe.

> > That might be a worthy goal, but I think it's a distraction for the
> > Icehouse timeframe.
> > Each new deployment strategy requires not only a new UI, but different
> > deployment
> > architectures that could have very little common with each other.
> > Designing them all
> > to work in the same space is a recipe for disaster, a convoluted gnarl of
> > code that
> > doesn't do any one thing particularly well.  To use an analogy: there's a
> > reason why
> > no one makes a flying boat car.
> >
> > I'm going to strongly advocate that for Icehouse, we focus exclusively on
> > large scale
> > TripleO deployments, working to make that UI and architecture as sturdy as
> > we can.  Future
> > deployment strategies should be discussed in the future, and if they're not
> > TripleO based,
> > they should be discussed with the proper OpenStack group.
> One concern here is - it is quite likely that we get people excited
> about this approach - it will be a new boom - 'wow', there is automagic
> doing everything for me. But then the question would be reality - how
> many from that excited users will actually use TripleO for their real
> deployments (I mean in the early stages)? Would it be only couple of
> them (because of covered use cases, concerns of maturity, lack of
> control scarcity)? Can we assure them that if anything goes wrong, they
> have control over it?
> -- Jarda
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-11 Thread Jaromir Coufal



On 2013/10/12 19:39, Tzu-Mainn Chen wrote:


Ideally, we don't. But with this approach we would take out the
possibility to change something or decide something from the user.

The 'easiest' way is to support bigger companies with huge deployments,
tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old
heterogeneous hardware. Very likely even more than the number of
companies having already mentioned large deployments. And giving them
only the way of 'setting up rules' in order to get the service on the
node - this type of user is not gonna use our deployment system.

Somebody might argue - why do we care? If user doesn't like TripleO
paradigm, he shouldn't use the UI and should use another tool. But the
UI is not only about TripleO. Yes, it is underlying concept, but we are
working on future *official* OpenStack deployment tool. We should care
to enable people to deploy OpenStack - large/small scale,
homo/heterogeneous hardware, typical or a bit more specific use-cases.


I think this is a very important clarification, and I'm glad you made it.  It 
sounds
like manual assignment is actually a sub-requirement, and the feature you're 
arguing
for is: supporting non-TripleO deployments.
Mostly but not only. The other argument is - keeping control on stuff I 
am doing. Note that undercloud user is different from overcloud user.



That might be a worthy goal, but I think it's a distraction for the Icehouse 
timeframe.
Each new deployment strategy requires not only a new UI, but different 
deployment
architectures that could have very little common with each other.  Designing 
them all
to work in the same space is a recipe for disaster, a convoluted gnarl of code 
that
doesn't do any one thing particularly well.  To use an analogy: there's a 
reason why
no one makes a flying boat car.

I'm going to strongly advocate that for Icehouse, we focus exclusively on large 
scale
TripleO deployments, working to make that UI and architecture as sturdy as we 
can.  Future
deployment strategies should be discussed in the future, and if they're not 
TripleO based,
they should be discussed with the proper OpenStack group.
One concern here is - it is quite likely that we get people excited 
about this approach - it will be a new boom - 'wow', there is automagic 
doing everything for me. But then the question would be reality - how 
many from that excited users will actually use TripleO for their real 
deployments (I mean in the early stages)? Would it be only couple of 
them (because of covered use cases, concerns of maturity, lack of 
control scarcity)? Can we assure them that if anything goes wrong, they 
have control over it?


-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-11 Thread Jaromir Coufal

On 2013/10/12 23:09, Robert Collins wrote:

On 11 December 2013 05:42, Jaromir Coufal  wrote:

On 2013/09/12 23:38, Tzu-Mainn Chen wrote:

The disagreement comes from whether we need manual node assignment or not.
I would argue that we
need to step back and take a look at the real use case: heterogeneous
nodes.  If there are literally
no characteristics that differentiate nodes A and B, then why do we care
which gets used for what?  Why
do we need to manually assign one?



Ideally, we don't. But with this approach we would take out the possibility
to change something or decide something from the user.


So, I think this is where the confusion is. Using the nova scheduler
doesn't prevent change or control. It just ensures the change and
control happen in the right place: the Nova scheduler has had years of
work, of features and facilities being added to support HPC, HA and
other such use cases. It should have everything we need [1], without
going down to manual placement. For clarity: manual placement is when
any of the user, Tuskar, or Heat query Ironic, select a node, and then
use a scheduler hint to bypass the scheduler.

This is very well written. I am all for things going to right places.


The 'easiest' way is to support bigger companies with huge deployments,
tailored infrastructure, everything connected properly.

But there are tons of companies/users who are running on old heterogeneous
hardware. Very likely even more than the number of companies having already
mentioned large deployments. And giving them only the way of 'setting up
rules' in order to get the service on the node - this type of user is not
gonna use our deployment system.


Thats speculation. We don't know if they will or will not because we
haven't given them a working system to test.
Some part of that is speculation, some part of that is feedback from 
people who are doing deployments (of course its just very limited 
audience). Anyway, it is not just pure theory.



Lets break the concern into two halves:
A) Users who could have their needs met, but won't use TripleO because
meeting their needs in this way is too hard/complex/painful.

B) Users who have a need we cannot meet with the current approach.

For category B users, their needs might be specific HA things - like
the oft discussed failure domains angle, where we need to split up HA
clusters across power bars, aircon, switches etc. Clearly long term we
want to support them, and the undercloud Nova scheduler is entirely
capable of being informed about this, and we can evolve to a holistic
statement over time. Lets get a concrete list of the cases we can
think of today that won't be well supported initially, and we can
figure out where to do the work to support them properly.
My question is - can't we help them now? To enable users to use our app 
even when we don't have enough smartness to help them 'auto' way?



For category A users, I think that we should get concrete examples,
and evolve our design (architecture and UX) to make meeting those
needs pleasant.
+1... I tried to pull some operators into this discussion thread, will 
try to get more.



What we shouldn't do is plan complex work without concrete examples
that people actually need. Jay's example of some shiny new compute
servers with special parts that need to be carved out was a great one
- we can put that in category A, and figure out if it's easy enough,
or obvious enough - and think about whether we document it or make it
a guided workflow or $whatever.


Somebody might argue - why do we care? If user doesn't like TripleO
paradigm, he shouldn't use the UI and should use another tool. But the UI is
not only about TripleO. Yes, it is underlying concept, but we are working on
future *official* OpenStack deployment tool. We should care to enable people
to deploy OpenStack - large/small scale, homo/heterogeneous hardware,
typical or a bit more specific use-cases.


The difficulty I'm having is that the discussion seems to assume that
'heterogeneous implies manual', but I don't agree that that
implication is necessary!
No, I don't agree with this either. Heterogeneous hardware can be very 
well managed automatically as well as homogeneous (classes, node profiles).



As an underlying paradigm of how to install cloud - awesome idea, awesome
concept, it works. But user doesn't care about how it is being deployed for
him. He cares about getting what he wants/needs. And we shouldn't go that
far that we violently force him to treat his infrastructure as cloud. I
believe that possibility to change/control - if needed - is very important
and we should care.


I propose that we make concrete use cases: 'Fred cannot use TripleO
without manual assignment because XYZ'. Then we can assess how
important XYZ is to our early adopters and go from there.
+1, yes. I will try to bug more relevant people, who could contribute at 
this area.



And what is key for us is to *enable* users - not to prevent them from using
our deployme

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Robert Collins
On 11 December 2013 05:42, Jaromir Coufal  wrote:
> On 2013/09/12 23:38, Tzu-Mainn Chen wrote:
>> The disagreement comes from whether we need manual node assignment or not.
>> I would argue that we
>> need to step back and take a look at the real use case: heterogeneous
>> nodes.  If there are literally
>> no characteristics that differentiate nodes A and B, then why do we care
>> which gets used for what?  Why
>> do we need to manually assign one?
>
>
> Ideally, we don't. But with this approach we would take out the possibility
> to change something or decide something from the user.

So, I think this is where the confusion is. Using the nova scheduler
doesn't prevent change or control. It just ensures the change and
control happen in the right place: the Nova scheduler has had years of
work, of features and facilities being added to support HPC, HA and
other such use cases. It should have everything we need [1], without
going down to manual placement. For clarity: manual placement is when
any of the user, Tuskar, or Heat query Ironic, select a node, and then
use a scheduler hint to bypass the scheduler.

> The 'easiest' way is to support bigger companies with huge deployments,
> tailored infrastructure, everything connected properly.
>
> But there are tons of companies/users who are running on old heterogeneous
> hardware. Very likely even more than the number of companies having already
> mentioned large deployments. And giving them only the way of 'setting up
> rules' in order to get the service on the node - this type of user is not
> gonna use our deployment system.

Thats speculation. We don't know if they will or will not because we
haven't given them a working system to test.

Lets break the concern into two halves:
A) Users who could have their needs met, but won't use TripleO because
meeting their needs in this way is too hard/complex/painful.

B) Users who have a need we cannot meet with the current approach.

For category B users, their needs might be specific HA things - like
the oft discussed failure domains angle, where we need to split up HA
clusters across power bars, aircon, switches etc. Clearly long term we
want to support them, and the undercloud Nova scheduler is entirely
capable of being informed about this, and we can evolve to a holistic
statement over time. Lets get a concrete list of the cases we can
think of today that won't be well supported initially, and we can
figure out where to do the work to support them properly.

For category A users, I think that we should get concrete examples,
and evolve our design (architecture and UX) to make meeting those
needs pleasant.

What we shouldn't do is plan complex work without concrete examples
that people actually need. Jay's example of some shiny new compute
servers with special parts that need to be carved out was a great one
- we can put that in category A, and figure out if it's easy enough,
or obvious enough - and think about whether we document it or make it
a guided workflow or $whatever.

> Somebody might argue - why do we care? If user doesn't like TripleO
> paradigm, he shouldn't use the UI and should use another tool. But the UI is
> not only about TripleO. Yes, it is underlying concept, but we are working on
> future *official* OpenStack deployment tool. We should care to enable people
> to deploy OpenStack - large/small scale, homo/heterogeneous hardware,
> typical or a bit more specific use-cases.

The difficulty I'm having is that the discussion seems to assume that
'heterogeneous implies manual', but I don't agree that that
implication is necessary!

> As an underlying paradigm of how to install cloud - awesome idea, awesome
> concept, it works. But user doesn't care about how it is being deployed for
> him. He cares about getting what he wants/needs. And we shouldn't go that
> far that we violently force him to treat his infrastructure as cloud. I
> believe that possibility to change/control - if needed - is very important
> and we should care.

I propose that we make concrete use cases: 'Fred cannot use TripleO
without manual assignment because XYZ'. Then we can assess how
important XYZ is to our early adopters and go from there.

> And what is key for us is to *enable* users - not to prevent them from using
> our deployment tool, because it doesn't work for their requirements.

Totally agreed :)

>> If we can agree on that, then I think it would be sufficient to say that
>> we want a mechanism to allow
>> UI users to deal with heterogeneous nodes, and that mechanism must use
>> nova-scheduler.  In my mind,
>> that's what resource classes and node profiles are intended for.
>
>
> Not arguing on this point. Though that mechanism should support also cases,
> where user specifies a role for a node / removes node from a role. The rest
> of nodes which I don't care about should be handled by nova-scheduler.

Why! What is a use case for removing a role from a node while leaving
that node in service? Lets be specific, alway

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Jay Dobies

Thanks for the explanation!

I'm going to claim that the thread revolves around two main areas of 
disagreement.  Then I'm going
to propose a way through:

a) Manual Node Assignment

I think that everyone is agreed that automated node assignment through 
nova-scheduler is by
far the most ideal case; there's no disagreement there.

The disagreement comes from whether we need manual node assignment or not.  I 
would argue that we
need to step back and take a look at the real use case: heterogeneous nodes.  
If there are literally
no characteristics that differentiate nodes A and B, then why do we care which 
gets used for what?  Why
do we need to manually assign one?


This is a better way of verbalizing my concerns. I suspect there are 
going to be quite a few heterogeneous environments built from legacy 
pieces in the near term and fewer built from the ground up with all new 
matching hotness.


On the other side of it, instead of handling legacy hardware I was 
worried about the new hotness (not sure why I keep using that term) 
specialized for a purpose. This is exactly what Robert described in his 
GPU example. I think his explanation of how to use the scheduler to 
accommodate that makes a lot of sense, so I'm much less behind the idea 
of a strict manual assignment than I previously was.



If we can agree on that, then I think it would be sufficient to say that we 
want a mechanism to allow
UI users to deal with heterogeneous nodes, and that mechanism must use 
nova-scheduler.  In my mind,
that's what resource classes and node profiles are intended for.

One possible objection might be: nova scheduler doesn't have the appropriate 
filter that we need to
separate out two nodes.  In that case, I would say that needs to be taken up 
with nova developers.


b) Terminology

It feels a bit like some of the disagreement come from people using different 
words for the same thing.
For example, the wireframes already details a UI where Robert's roles come 
first, but I think that message
was confused because I mentioned "node types" in the requirements.

So could we come to some agreement on what the most exact terminology would be? 
 I've listed some examples below,
but I'm sure there are more.

node type | role
management node | ?
resource node | ?
unallocated | available | undeployed
create a node distribution | size the deployment
resource classes | ?
node profiles | ?

Mainn

- Original Message -

On 10 December 2013 09:55, Tzu-Mainn Chen  wrote:

* created as part of undercloud install process



By that note I meant, that Nodes are not resources, Resource instances
run on Nodes. Nodes are the generic pool of hardware we can deploy
things onto.


I don't think "resource nodes" is intended to imply that nodes are
resources; rather, it's supposed to
indicate that it's a node where a resource instance runs.  It's supposed to
separate it from "management node"
and "unallocated node".


So the question is are we looking at /nodes/ that have a /current
role/, or are we looking at /roles/ that have some /current nodes/.

My contention is that the role is the interesting thing, and the nodes
is the incidental thing. That is, as a sysadmin, my hierarchy of
concerns is something like:
  A: are all services running
  B: are any of them in a degraded state where I need to take prompt
action to prevent a service outage [might mean many things: - software
update/disk space criticals/a machine failed and we need to scale the
cluster back up/too much load]
  C: are there any planned changes I need to make [new software deploy,
feature request from user, replacing a faulty machine]
  D: are there long term issues sneaking up on me [capacity planning,
machine obsolescence]

If we take /nodes/ as the interesting thing, and what they are doing
right now as the incidental thing, it's much harder to map that onto
the sysadmin concerns. If we start with /roles/ then can answer:
  A: by showing the list of roles and the summary stats (how many
machines, service status aggregate), role level alerts (e.g. nova-api
is not responding)
  B: by showing the list of roles and more detailed stats (overall
load, response times of services, tickets against services
  and a list of in trouble instances in each role - instances with
alerts against them - low disk, overload, failed service,
early-detection alerts from hardware
  C: probably out of our remit for now in the general case, but we need
to enable some things here like replacing faulty machines
  D: by looking at trend graphs for roles (not machines), but also by
looking at the hardware in aggregate - breakdown by age of machines,
summary data for tickets filed against instances that were deployed to
a particular machine

C: and D: are (F) category work, but for all but the very last thing,
it seems clear how to approach this from a roles perspective.

I've tried to approach this using /nodes/ as the starting point, and
after two terrible drafts I've deleted th

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Tzu-Mainn Chen
Thanks for the reply!  Comments in-line:

> > The disagreement comes from whether we need manual node assignment or not.
> > I would argue that we
> > need to step back and take a look at the real use case: heterogeneous
> > nodes.  If there are literally
> > no characteristics that differentiate nodes A and B, then why do we care
> > which gets used for what?  Why
> > do we need to manually assign one?
> 
> Ideally, we don't. But with this approach we would take out the
> possibility to change something or decide something from the user.
> 
> The 'easiest' way is to support bigger companies with huge deployments,
> tailored infrastructure, everything connected properly.
> 
> But there are tons of companies/users who are running on old
> heterogeneous hardware. Very likely even more than the number of
> companies having already mentioned large deployments. And giving them
> only the way of 'setting up rules' in order to get the service on the
> node - this type of user is not gonna use our deployment system.
> 
> Somebody might argue - why do we care? If user doesn't like TripleO
> paradigm, he shouldn't use the UI and should use another tool. But the
> UI is not only about TripleO. Yes, it is underlying concept, but we are
> working on future *official* OpenStack deployment tool. We should care
> to enable people to deploy OpenStack - large/small scale,
> homo/heterogeneous hardware, typical or a bit more specific use-cases.

I think this is a very important clarification, and I'm glad you made it.  It 
sounds
like manual assignment is actually a sub-requirement, and the feature you're 
arguing
for is: supporting non-TripleO deployments.

That might be a worthy goal, but I think it's a distraction for the Icehouse 
timeframe.
Each new deployment strategy requires not only a new UI, but different 
deployment
architectures that could have very little common with each other.  Designing 
them all
to work in the same space is a recipe for disaster, a convoluted gnarl of code 
that
doesn't do any one thing particularly well.  To use an analogy: there's a 
reason why
no one makes a flying boat car.

I'm going to strongly advocate that for Icehouse, we focus exclusively on large 
scale
TripleO deployments, working to make that UI and architecture as sturdy as we 
can.  Future
deployment strategies should be discussed in the future, and if they're not 
TripleO based,
they should be discussed with the proper OpenStack group.


> As an underlying paradigm of how to install cloud - awesome idea,
> awesome concept, it works. But user doesn't care about how it is being
> deployed for him. He cares about getting what he wants/needs. And we
> shouldn't go that far that we violently force him to treat his
> infrastructure as cloud. I believe that possibility to change/control -
> if needed - is very important and we should care.
> 
> And what is key for us is to *enable* users - not to prevent them from
> using our deployment tool, because it doesn't work for their requirements.
> 
> 
> > If we can agree on that, then I think it would be sufficient to say that we
> > want a mechanism to allow
> > UI users to deal with heterogeneous nodes, and that mechanism must use
> > nova-scheduler.  In my mind,
> > that's what resource classes and node profiles are intended for.
> 
> Not arguing on this point. Though that mechanism should support also
> cases, where user specifies a role for a node / removes node from a
> role. The rest of nodes which I don't care about should be handled by
> nova-scheduler.
> 
> > One possible objection might be: nova scheduler doesn't have the
> > appropriate filter that we need to
> > separate out two nodes.  In that case, I would say that needs to be taken
> > up with nova developers.
> 
> Give it to Nova guys to fix it... What if that user's need would be
> undercloud specific requirement?  Why should Nova guys care? What should
> our unhappy user do until then? Use other tool? Will he be willing to
> get back to use our tool once it is ready?
> 
> I can also see other use-cases. It can be distribution based on power
> sockets, networking connections, etc. We can't think about all the ways
> which our user will need.

In this case - it would be our job to make the Nova guys care and to work with 
them to develop
the feature.  Creating parallel services with the same fundamental purpose - I 
think that
runs counter to what OpenStack is designed for.

> 
> > b) Terminology
> >
> > It feels a bit like some of the disagreement come from people using
> > different words for the same thing.
> > For example, the wireframes already details a UI where Robert's roles come
> > first, but I think that message
> > was confused because I mentioned "node types" in the requirements.
> >
> > So could we come to some agreement on what the most exact terminology would
> > be?  I've listed some examples below,
> > but I'm sure there are more.
> >
> > node type | role
> +1 role
> 
> > management node | ?
> > resource

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Jaromir Coufal

On 2013/09/12 23:38, Tzu-Mainn Chen wrote:

Thanks for the explanation!

I'm going to claim that the thread revolves around two main areas of 
disagreement.  Then I'm going
to propose a way through:

a) Manual Node Assignment

I think that everyone is agreed that automated node assignment through 
nova-scheduler is by
far the most ideal case; there's no disagreement there.


+1


The disagreement comes from whether we need manual node assignment or not.  I 
would argue that we
need to step back and take a look at the real use case: heterogeneous nodes.  
If there are literally
no characteristics that differentiate nodes A and B, then why do we care which 
gets used for what?  Why
do we need to manually assign one?


Ideally, we don't. But with this approach we would take out the 
possibility to change something or decide something from the user.


The 'easiest' way is to support bigger companies with huge deployments, 
tailored infrastructure, everything connected properly.


But there are tons of companies/users who are running on old 
heterogeneous hardware. Very likely even more than the number of 
companies having already mentioned large deployments. And giving them 
only the way of 'setting up rules' in order to get the service on the 
node - this type of user is not gonna use our deployment system.


Somebody might argue - why do we care? If user doesn't like TripleO 
paradigm, he shouldn't use the UI and should use another tool. But the 
UI is not only about TripleO. Yes, it is underlying concept, but we are 
working on future *official* OpenStack deployment tool. We should care 
to enable people to deploy OpenStack - large/small scale, 
homo/heterogeneous hardware, typical or a bit more specific use-cases.


As an underlying paradigm of how to install cloud - awesome idea, 
awesome concept, it works. But user doesn't care about how it is being 
deployed for him. He cares about getting what he wants/needs. And we 
shouldn't go that far that we violently force him to treat his 
infrastructure as cloud. I believe that possibility to change/control - 
if needed - is very important and we should care.


And what is key for us is to *enable* users - not to prevent them from 
using our deployment tool, because it doesn't work for their requirements.




If we can agree on that, then I think it would be sufficient to say that we 
want a mechanism to allow
UI users to deal with heterogeneous nodes, and that mechanism must use 
nova-scheduler.  In my mind,
that's what resource classes and node profiles are intended for.


Not arguing on this point. Though that mechanism should support also 
cases, where user specifies a role for a node / removes node from a 
role. The rest of nodes which I don't care about should be handled by 
nova-scheduler.



One possible objection might be: nova scheduler doesn't have the appropriate 
filter that we need to
separate out two nodes.  In that case, I would say that needs to be taken up 
with nova developers.


Give it to Nova guys to fix it... What if that user's need would be 
undercloud specific requirement?  Why should Nova guys care? What should 
our unhappy user do until then? Use other tool? Will he be willing to 
get back to use our tool once it is ready?


I can also see other use-cases. It can be distribution based on power 
sockets, networking connections, etc. We can't think about all the ways 
which our user will need.




b) Terminology

It feels a bit like some of the disagreement come from people using different 
words for the same thing.
For example, the wireframes already details a UI where Robert's roles come 
first, but I think that message
was confused because I mentioned "node types" in the requirements.

So could we come to some agreement on what the most exact terminology would be? 
 I've listed some examples below,
but I'm sure there are more.

node type | role

+1 role


management node | ?
resource node | ?
unallocated | aqvailable | undeployed

+1 unallocated


ceate a node distribution | size the deployment

* Distribute nodes


resource classes | ?

Service classes?


node profiles | ?




So when we talk about 'unallocated Nodes', the implication is that
users 'allocate Nodes', but they don't: they size roles, and after
doing all that there may be some Nodes that are - yes - unallocated,
or have nothing scheduled to them. So... I'm not debating that we
should have a list of free hardware - we totally should - I'm debating
how we frame it. 'Available Nodes' or 'Undeployed machines' or
whatever.
The allocation can happen automatically, so from my point of view I 
don't see big problem with 'allocate' term.



I just want to get away from talking about something
([manual] allocation) that we don't offer.

We don't at the moment but we should :)

-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Jaromir Coufal



On 2013/09/12 21:22, Robert Collins wrote:

Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
stats

For registration it is just Management MAC address which is needed right? Or
does Ironic need also IP? I think that MAC address might be enough, we can
display IP in details of node later on.


Ironic needs all the details I listed today. Management MAC is not
currently used at all, but would be needed in future when we tackle
IPMI IP managed by Neutron.

OK, I will reflect that in wireframes for UI.



  >   * Auto-discovery during undercloud install process (M)

* Monitoring
* assignment, availability, status
* capacity, historical statistics (M)

Why is this under 'nodes'? I challenge the idea that it should be
there. We will need to surface some stuff about nodes, but the
underlying idea is to take a cloud approach here - so we're monitoring
services, that happen to be on nodes. There is room to monitor nodes,
as an undercloud feature set, but lets be very very specific about
what is sitting at what layer.

We need both - we need to track services but also state of nodes (CPU, RAM,
Network bandwidth, etc). So in node detail you should be able to track both.


Those are instance characteristics, not node characteristics. An
instance is software running on a Node, and the amount of CPU/RAM/NIC
utilisation is specific to that software while it's on that Node, not
to future or past instances running on that Node.
I think this is minor detail. Node has certain CPU/RAM/NIC capacity and 
instance is consuming it. Either way it is important for us to display 
this utilization in the UI as well as service statistics.



 * Resource nodes

 ^ nodes is again confusing layers - nodes are
what things are deployed to, but they aren't the entry point

Can you, please be a bit more specific here? I don't understand this note.


By the way, can you get your email client to insert > before the text
you are replying to rather than HTML | marks? Hard to tell what I
wrote and what you did :).

Oh right, sure, sorry. Should be fixed ;)


By that note I meant, that Nodes are not resources, Resource instances
run on Nodes. Nodes are the generic pool of hardware we can deploy
things onto.
Well right, this is the terminology. From my point of view, resources 
for overcloud are the instances which are running on Nodes. Once we 
deploy the nodes with appropriate software they become Resource Nodes 
(from unallocated pool). If this terminology is confusing already then 
we should fix it. Any suggestions for improvements?



 * Unallocated nodes

This implies an 'allocation' step, that we don't have - how about
'Idle nodes' or something.

It can be auto-allocation. I don't see problem with 'unallocated' term.


Ok, it's not a biggy. I do think it will frame things poorly and lead
to an expectation about how TripleO works that doesn't match how it
does, but we can change it later if I'm right, and if I'm wrong, well
it won't be the first time :).
I think we will figure it out in the other thread (where we talk about 
allocation). Anyway - I am interested in how differently would you 
formulate Unallocated / Resource / Management Nodes? Maybe your is better :)


-- Jarda

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-10 Thread Jaromir Coufal


On 2013/09/12 17:15, Tzu-Mainn Chen wrote:


- As an infrastructure administrator, Anna wants to be able to 
unallocate a node from a deployment.

Why? Whats her motivation. One plausible one for me is 'a machine
needs to be serviced so Anna wants to remove it from the deployment to
avoid causing user visible downtime.'  So lets say that: Anna needs to
be able to take machines out of service so they can be maintained or
disposed of.

Node being serviced is a different user story for me.

I believe we are still 'fighting' here with two approaches and I
believe we need both. We can't only provide a way 'give us
resources we will do a magic'. Yes this is preferred way -
especially for large deployments, but we also need a fallback so
that user can say - no, this node doesn't belong to the class, I
don't want it there - unassign. Or I need to have this node there
- assign.

Just for clarification - the wireframes don't cover individual nodes 
being manually assigned, do they?  I thought the concession to manual 
control was entirely through resource classes and node profiles, which 
are still parameters to be passed through to the nova-scheduler 
filter.  To me, that's very different from manual assignment.


Mainn
It's all doable and wireframes are prepared for the manual assignment as 
well, Mainn. I just was not designing details for now, since we are 
going to focus on auto-distribution first. But I will cover this use 
case in later iterations of wireframes.


Cheers
-- Jarda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Tzu-Mainn Chen
Thanks for the explanation!

I'm going to claim that the thread revolves around two main areas of 
disagreement.  Then I'm going
to propose a way through:

a) Manual Node Assignment

I think that everyone is agreed that automated node assignment through 
nova-scheduler is by
far the most ideal case; there's no disagreement there.

The disagreement comes from whether we need manual node assignment or not.  I 
would argue that we
need to step back and take a look at the real use case: heterogeneous nodes.  
If there are literally
no characteristics that differentiate nodes A and B, then why do we care which 
gets used for what?  Why
do we need to manually assign one?

If we can agree on that, then I think it would be sufficient to say that we 
want a mechanism to allow
UI users to deal with heterogeneous nodes, and that mechanism must use 
nova-scheduler.  In my mind,
that's what resource classes and node profiles are intended for.

One possible objection might be: nova scheduler doesn't have the appropriate 
filter that we need to
separate out two nodes.  In that case, I would say that needs to be taken up 
with nova developers.


b) Terminology

It feels a bit like some of the disagreement come from people using different 
words for the same thing.
For example, the wireframes already details a UI where Robert's roles come 
first, but I think that message
was confused because I mentioned "node types" in the requirements.

So could we come to some agreement on what the most exact terminology would be? 
 I've listed some examples below,
but I'm sure there are more.

node type | role
management node | ?
resource node | ?
unallocated | available | undeployed
create a node distribution | size the deployment
resource classes | ?
node profiles | ?

Mainn

- Original Message -
> On 10 December 2013 09:55, Tzu-Mainn Chen  wrote:
> >> >* created as part of undercloud install process
> 
> >> By that note I meant, that Nodes are not resources, Resource instances
> >> run on Nodes. Nodes are the generic pool of hardware we can deploy
> >> things onto.
> >
> > I don't think "resource nodes" is intended to imply that nodes are
> > resources; rather, it's supposed to
> > indicate that it's a node where a resource instance runs.  It's supposed to
> > separate it from "management node"
> > and "unallocated node".
> 
> So the question is are we looking at /nodes/ that have a /current
> role/, or are we looking at /roles/ that have some /current nodes/.
> 
> My contention is that the role is the interesting thing, and the nodes
> is the incidental thing. That is, as a sysadmin, my hierarchy of
> concerns is something like:
>  A: are all services running
>  B: are any of them in a degraded state where I need to take prompt
> action to prevent a service outage [might mean many things: - software
> update/disk space criticals/a machine failed and we need to scale the
> cluster back up/too much load]
>  C: are there any planned changes I need to make [new software deploy,
> feature request from user, replacing a faulty machine]
>  D: are there long term issues sneaking up on me [capacity planning,
> machine obsolescence]
> 
> If we take /nodes/ as the interesting thing, and what they are doing
> right now as the incidental thing, it's much harder to map that onto
> the sysadmin concerns. If we start with /roles/ then can answer:
>  A: by showing the list of roles and the summary stats (how many
> machines, service status aggregate), role level alerts (e.g. nova-api
> is not responding)
>  B: by showing the list of roles and more detailed stats (overall
> load, response times of services, tickets against services
>  and a list of in trouble instances in each role - instances with
> alerts against them - low disk, overload, failed service,
> early-detection alerts from hardware
>  C: probably out of our remit for now in the general case, but we need
> to enable some things here like replacing faulty machines
>  D: by looking at trend graphs for roles (not machines), but also by
> looking at the hardware in aggregate - breakdown by age of machines,
> summary data for tickets filed against instances that were deployed to
> a particular machine
> 
> C: and D: are (F) category work, but for all but the very last thing,
> it seems clear how to approach this from a roles perspective.
> 
> I've tried to approach this using /nodes/ as the starting point, and
> after two terrible drafts I've deleted the section. I'd love it if
> someone could show me how it would work:)
> 
> >> > * Unallocated nodes
> >> >
> >> > This implies an 'allocation' step, that we don't have - how about
> >> > 'Idle nodes' or something.
> >> >
> >> > It can be auto-allocation. I don't see problem with 'unallocated' term.
> >>
> >> Ok, it's not a biggy. I do think it will frame things poorly and lead
> >> to an expectation about how TripleO works that doesn't match how it
> >> does, but we can change it later if I'm right, and if I'm wrong, wel

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Robert Collins
On 10 December 2013 10:57, Jay Dobies  wrote:

>>
>> So we have:
>>   - node - a physical general purpose machine capable of running in
>> many roles. Some nodes may have hardware layout that is particularly
>> useful for a given role.
>>   - role - a specific workload we want to map onto one or more nodes.
>> Examples include 'undercloud control plane', 'overcloud control
>> plane', 'overcloud storage', 'overcloud compute' etc.
>>   - instance - A role deployed on a node - this is where work actually
>> happens.
>>   - scheduling - the process of deciding which role is deployed on which
>> node.
>
>
> This glossary is really handy to make sure we're all speaking the same
> language.
>
>
>> The way TripleO works is that we defined a Heat template that lays out
>> policy: 5 instances of 'overcloud control plane please', '20
>> hypervisors' etc. Heat passes that to Nova, which pulls the image for
>> the role out of Glance, picks a node, and deploys the image to the
>> node.
>>
>> Note in particular the order: Heat -> Nova -> Scheduler -> Node chosen.
>>
>> The user action is not 'allocate a Node to 'overcloud control plane',
>> it is 'size the control plane through heat'.
>>
>> So when we talk about 'unallocated Nodes', the implication is that
>> users 'allocate Nodes', but they don't: they size roles, and after
>> doing all that there may be some Nodes that are - yes - unallocated,
>
>
> I'm not sure if I should ask this here or to your point above, but what
> about multi-role nodes? Is there any piece in here that says "The policy
> wants 5 instances but I can fit two of them on this existing underutilized
> node and three of them on unallocated nodes" or since it's all at the image
> level you get just what's in the image and that's the finest-level of
> granularity?

The way we handle that today is to create a composite role that says
'overcloud-compute+cinder storage', for instance - because image is
the level of granularity. If/when we get automatic container
subdivision - see the other really interesting long-term thread - we
could subdivide, but I'd still do that using image as the level of
granularity, it's just that we'd have the host image + the container
images.

>> or have nothing scheduled to them. So... I'm not debating that we
>> should have a list of free hardware - we totally should - I'm debating
>> how we frame it. 'Available Nodes' or 'Undeployed machines' or
>> whatever. I just want to get away from talking about something
>> ([manual] allocation) that we don't offer.
>
>
> My only concern here is that we're not talking about cloud users, we're
> talking about admins adminning (we'll pretend it's a word, come with me) a
> cloud. To a cloud user, "give me some power so I can do some stuff" is a
> safe use case if I trust the cloud I'm running on. I trust that the cloud
> provider has taken the proper steps to ensure that my CPU isn't in New York
> and my storage in Tokyo.

Sure :)

> To the admin setting up an overcloud, they are the ones providing that trust
> to eventual cloud users. That's where I feel like more visibility and
> control are going to be desired/appreciated.
>
> I admit what I just said isn't at all concrete. Might even be flat out
> wrong. I was never an admin, I've just worked on sys management software
> long enough to have the opinion that their levels of OCD are legendary. I
> can't shake this feeling that someone is going to slap some fancy new
> jacked-up piece of hardware onto the network and have a specific purpose
> they are going to want to use it for. But maybe that's antiquated thinking
> on my part.

I think concrete use cases are the only way we'll get light at the end
of the tunnel.

So lets say someone puts a new bit of fancy kit onto their network and
wants it for e.g. GPU VM instances only. Thats a reasonable desire.

The basic stuff we're talking about so far is just about saying each
role can run on some set of undercloud flavors. If that new bit of kit
has the same coarse metadata as other kit, Nova can't tell it apart.
So the way to solve the problem is:
 - a) teach Ironic about the specialness of the node (e.g. a tag 'GPU')
 - b) teach Nova that there is a flavor that maps to the presence of
that specialness, and
   c) teach Nova that other flavors may not map to that specialness

then in Tuskar whatever Nova configuration is needed to use that GPU
is a special role ('GPU compute' for instance) and only that role
would be given that flavor to use. That special config is probably
being in a host aggregate, with an overcloud flavor that specifies
that aggregate, which means at the TripleO level we need to put the
aggregate in the config metadata for that role, and the admin does a
one-time setup in the Nova Horizon UI to configure their GPU compute
flavor.

This isn't 'manual allocation' to me - it's surfacing the capabilities
from the bottom ('has GPU') and the constraints from the top ('needs
GPU') and letting Nova and Heat sort it out.

-Rob

--

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jay Dobies




So the question is are we looking at /nodes/ that have a /current
role/, or are we looking at /roles/ that have some /current nodes/.

My contention is that the role is the interesting thing, and the nodes
is the incidental thing. That is, as a sysadmin, my hierarchy of
concerns is something like:
  A: are all services running
  B: are any of them in a degraded state where I need to take prompt
action to prevent a service outage [might mean many things: - software
update/disk space criticals/a machine failed and we need to scale the
cluster back up/too much load]
  C: are there any planned changes I need to make [new software deploy,
feature request from user, replacing a faulty machine]
  D: are there long term issues sneaking up on me [capacity planning,
machine obsolescence]

If we take /nodes/ as the interesting thing, and what they are doing
right now as the incidental thing, it's much harder to map that onto
the sysadmin concerns. If we start with /roles/ then can answer:
  A: by showing the list of roles and the summary stats (how many
machines, service status aggregate), role level alerts (e.g. nova-api
is not responding)
  B: by showing the list of roles and more detailed stats (overall
load, response times of services, tickets against services
  and a list of in trouble instances in each role - instances with
alerts against them - low disk, overload, failed service,
early-detection alerts from hardware
  C: probably out of our remit for now in the general case, but we need
to enable some things here like replacing faulty machines
  D: by looking at trend graphs for roles (not machines), but also by
looking at the hardware in aggregate - breakdown by age of machines,
summary data for tickets filed against instances that were deployed to
a particular machine

C: and D: are (F) category work, but for all but the very last thing,
it seems clear how to approach this from a roles perspective.

I've tried to approach this using /nodes/ as the starting point, and
after two terrible drafts I've deleted the section. I'd love it if
someone could show me how it would work:)


 * Unallocated nodes

This implies an 'allocation' step, that we don't have - how about
'Idle nodes' or something.

It can be auto-allocation. I don't see problem with 'unallocated' term.


Ok, it's not a biggy. I do think it will frame things poorly and lead
to an expectation about how TripleO works that doesn't match how it
does, but we can change it later if I'm right, and if I'm wrong, well
it won't be the first time :).



I'm interested in what the distinction you're making here is.  I'd rather get 
things
defined correctly the first time, and it's very possible that I'm missing a 
fundamental
definition here.


So we have:
  - node - a physical general purpose machine capable of running in
many roles. Some nodes may have hardware layout that is particularly
useful for a given role.
  - role - a specific workload we want to map onto one or more nodes.
Examples include 'undercloud control plane', 'overcloud control
plane', 'overcloud storage', 'overcloud compute' etc.
  - instance - A role deployed on a node - this is where work actually happens.
  - scheduling - the process of deciding which role is deployed on which node.


This glossary is really handy to make sure we're all speaking the same 
language.



The way TripleO works is that we defined a Heat template that lays out
policy: 5 instances of 'overcloud control plane please', '20
hypervisors' etc. Heat passes that to Nova, which pulls the image for
the role out of Glance, picks a node, and deploys the image to the
node.

Note in particular the order: Heat -> Nova -> Scheduler -> Node chosen.

The user action is not 'allocate a Node to 'overcloud control plane',
it is 'size the control plane through heat'.

So when we talk about 'unallocated Nodes', the implication is that
users 'allocate Nodes', but they don't: they size roles, and after
doing all that there may be some Nodes that are - yes - unallocated,


I'm not sure if I should ask this here or to your point above, but what 
about multi-role nodes? Is there any piece in here that says "The policy 
wants 5 instances but I can fit two of them on this existing 
underutilized node and three of them on unallocated nodes" or since it's 
all at the image level you get just what's in the image and that's the 
finest-level of granularity?



or have nothing scheduled to them. So... I'm not debating that we
should have a list of free hardware - we totally should - I'm debating
how we frame it. 'Available Nodes' or 'Undeployed machines' or
whatever. I just want to get away from talking about something
([manual] allocation) that we don't offer.


My only concern here is that we're not talking about cloud users, we're 
talking about admins adminning (we'll pretend it's a word, come with me) 
a cloud. To a cloud user, "give me some power so I can do some stuff" is 
a safe use case if I trust the cloud I'm running on. I t

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Robert Collins
On 10 December 2013 09:55, Tzu-Mainn Chen  wrote:
>> >* created as part of undercloud install process

>> By that note I meant, that Nodes are not resources, Resource instances
>> run on Nodes. Nodes are the generic pool of hardware we can deploy
>> things onto.
>
> I don't think "resource nodes" is intended to imply that nodes are resources; 
> rather, it's supposed to
> indicate that it's a node where a resource instance runs.  It's supposed to 
> separate it from "management node"
> and "unallocated node".

So the question is are we looking at /nodes/ that have a /current
role/, or are we looking at /roles/ that have some /current nodes/.

My contention is that the role is the interesting thing, and the nodes
is the incidental thing. That is, as a sysadmin, my hierarchy of
concerns is something like:
 A: are all services running
 B: are any of them in a degraded state where I need to take prompt
action to prevent a service outage [might mean many things: - software
update/disk space criticals/a machine failed and we need to scale the
cluster back up/too much load]
 C: are there any planned changes I need to make [new software deploy,
feature request from user, replacing a faulty machine]
 D: are there long term issues sneaking up on me [capacity planning,
machine obsolescence]

If we take /nodes/ as the interesting thing, and what they are doing
right now as the incidental thing, it's much harder to map that onto
the sysadmin concerns. If we start with /roles/ then can answer:
 A: by showing the list of roles and the summary stats (how many
machines, service status aggregate), role level alerts (e.g. nova-api
is not responding)
 B: by showing the list of roles and more detailed stats (overall
load, response times of services, tickets against services
 and a list of in trouble instances in each role - instances with
alerts against them - low disk, overload, failed service,
early-detection alerts from hardware
 C: probably out of our remit for now in the general case, but we need
to enable some things here like replacing faulty machines
 D: by looking at trend graphs for roles (not machines), but also by
looking at the hardware in aggregate - breakdown by age of machines,
summary data for tickets filed against instances that were deployed to
a particular machine

C: and D: are (F) category work, but for all but the very last thing,
it seems clear how to approach this from a roles perspective.

I've tried to approach this using /nodes/ as the starting point, and
after two terrible drafts I've deleted the section. I'd love it if
someone could show me how it would work:)

>> > * Unallocated nodes
>> >
>> > This implies an 'allocation' step, that we don't have - how about
>> > 'Idle nodes' or something.
>> >
>> > It can be auto-allocation. I don't see problem with 'unallocated' term.
>>
>> Ok, it's not a biggy. I do think it will frame things poorly and lead
>> to an expectation about how TripleO works that doesn't match how it
>> does, but we can change it later if I'm right, and if I'm wrong, well
>> it won't be the first time :).
>>
>
> I'm interested in what the distinction you're making here is.  I'd rather get 
> things
> defined correctly the first time, and it's very possible that I'm missing a 
> fundamental
> definition here.

So we have:
 - node - a physical general purpose machine capable of running in
many roles. Some nodes may have hardware layout that is particularly
useful for a given role.
 - role - a specific workload we want to map onto one or more nodes.
Examples include 'undercloud control plane', 'overcloud control
plane', 'overcloud storage', 'overcloud compute' etc.
 - instance - A role deployed on a node - this is where work actually happens.
 - scheduling - the process of deciding which role is deployed on which node.

The way TripleO works is that we defined a Heat template that lays out
policy: 5 instances of 'overcloud control plane please', '20
hypervisors' etc. Heat passes that to Nova, which pulls the image for
the role out of Glance, picks a node, and deploys the image to the
node.

Note in particular the order: Heat -> Nova -> Scheduler -> Node chosen.

The user action is not 'allocate a Node to 'overcloud control plane',
it is 'size the control plane through heat'.

So when we talk about 'unallocated Nodes', the implication is that
users 'allocate Nodes', but they don't: they size roles, and after
doing all that there may be some Nodes that are - yes - unallocated,
or have nothing scheduled to them. So... I'm not debating that we
should have a list of free hardware - we totally should - I'm debating
how we frame it. 'Available Nodes' or 'Undeployed machines' or
whatever. I just want to get away from talking about something
([manual] allocation) that we don't offer.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.o

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Tzu-Mainn Chen
> >* created as part of undercloud install process
> >* can create additional management nodes (F)
> > * Resource nodes
> >
> > ^ nodes is again confusing layers - nodes are
> > what things are deployed to, but they aren't the entry point
> >
> > Can you, please be a bit more specific here? I don't understand this note.
> 
> By the way, can you get your email client to insert > before the text
> you are replying to rather than HTML | marks? Hard to tell what I
> wrote and what you did :).
> 
> By that note I meant, that Nodes are not resources, Resource instances
> run on Nodes. Nodes are the generic pool of hardware we can deploy
> things onto.

I don't think "resource nodes" is intended to imply that nodes are resources; 
rather, it's supposed to
indicate that it's a node where a resource instance runs.  It's supposed to 
separate it from "management node"
and "unallocated node".

> > * Unallocated nodes
> >
> > This implies an 'allocation' step, that we don't have - how about
> > 'Idle nodes' or something.
> >
> > It can be auto-allocation. I don't see problem with 'unallocated' term.
> 
> Ok, it's not a biggy. I do think it will frame things poorly and lead
> to an expectation about how TripleO works that doesn't match how it
> does, but we can change it later if I'm right, and if I'm wrong, well
> it won't be the first time :).
> 

I'm interested in what the distinction you're making here is.  I'd rather get 
things
defined correctly the first time, and it's very possible that I'm missing a 
fundamental
definition here.


Mainn

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Robert Collins
On 9 December 2013 23:56, Jaromir Coufal  wrote:
>
> On 2013/07/12 01:59, Robert Collins wrote:
>
>* Creation
>   * Manual registration
>  * hardware specs from Ironic based on mac address (M)
>
> Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
> stats
>
> For registration it is just Management MAC address which is needed right? Or
> does Ironic need also IP? I think that MAC address might be enough, we can
> display IP in details of node later on.

Ironic needs all the details I listed today. Management MAC is not
currently used at all, but would be needed in future when we tackle
IPMI IP managed by Neutron.

 >   * Auto-discovery during undercloud install process (M)
>* Monitoring
>* assignment, availability, status
>* capacity, historical statistics (M)
>
> Why is this under 'nodes'? I challenge the idea that it should be
> there. We will need to surface some stuff about nodes, but the
> underlying idea is to take a cloud approach here - so we're monitoring
> services, that happen to be on nodes. There is room to monitor nodes,
> as an undercloud feature set, but lets be very very specific about
> what is sitting at what layer.
>
> We need both - we need to track services but also state of nodes (CPU, RAM,
> Network bandwidth, etc). So in node detail you should be able to track both.

Those are instance characteristics, not node characteristics. An
instance is software running on a Node, and the amount of CPU/RAM/NIC
utilisation is specific to that software while it's on that Node, not
to future or past instances running on that Node.

>* created as part of undercloud install process
>* can create additional management nodes (F)
> * Resource nodes
>
> ^ nodes is again confusing layers - nodes are
> what things are deployed to, but they aren't the entry point
>
> Can you, please be a bit more specific here? I don't understand this note.

By the way, can you get your email client to insert > before the text
you are replying to rather than HTML | marks? Hard to tell what I
wrote and what you did :).

By that note I meant, that Nodes are not resources, Resource instances
run on Nodes. Nodes are the generic pool of hardware we can deploy
things onto.

> * searchable by status, name, cpu, memory, and all attributes from
> ironic
> * can be allocated as one of four node types
>
> Not by users though. We need to stop thinking of this as 'what we do
> to nodes' - Nova/Ironic operate on nodes, we operate on Heat
> templates.
>
> Discussed in other threads, but I still believe (and I am not alone) that we
> need to allow 'force nodes'.

I'll respond in the other thread :).

> * Unallocated nodes
>
> This implies an 'allocation' step, that we don't have - how about
> 'Idle nodes' or something.
>
> It can be auto-allocation. I don't see problem with 'unallocated' term.

Ok, it's not a biggy. I do think it will frame things poorly and lead
to an expectation about how TripleO works that doesn't match how it
does, but we can change it later if I'm right, and if I'm wrong, well
it won't be the first time :).

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Tzu-Mainn Chen
> > - As an infrastructure administrator, Anna wants to be able to unallocate a
> > node from a deployment.
> 
> > Why? Whats her motivation. One plausible one for me is 'a machine
> 
> > needs to be serviced so Anna wants to remove it from the deployment to
> 
> > avoid causing user visible downtime.'  So lets say that: Anna needs to
> 
> > be able to take machines out of service so they can be maintained or
> 
> > disposed of.
> 

> Node being serviced is a different user story for me.

> I believe we are still 'fighting' here with two approaches and I believe we
> need both. We can't only provide a way 'give us resources we will do a
> magic'. Yes this is preferred way - especially for large deployments, but we
> also need a fallback so that user can say - no, this node doesn't belong to
> the class, I don't want it there - unassign. Or I need to have this node
> there - assign.
Just for clarification - the wireframes don't cover individual nodes being 
manually assigned, do they? I thought the concession to manual control was 
entirely through resource classes and node profiles, which are still parameters 
to be passed through to the nova-scheduler filter. To me, that's very different 
from manual assignment. 

Mainn 
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread mar...@redhat.com
On 09/12/13 18:01, Jay Dobies wrote:
>> I believe we are still 'fighting' here with two approaches and I believe
>> we need both. We can't only provide a way 'give us resources we will do
>> a magic'. Yes this is preferred way - especially for large deployments,
>> but we also need a fallback so that user can say - no, this node doesn't
>> belong to the class, I don't want it there - unassign. Or I need to have
>> this node there - assign.
> 
> +1 to this. I think there are still a significant amount of admins out
> there that are really opposed to magic and want that fine-grained
> control. Even if they don't use it that frequently, in my experience
> they want to know it's there in the event they need it (and will often
> dream up a case that they'll need it).

+1 to the responses to the 'automagic' vs 'manual' discussion. The
latter is in fact only really possible in small deployments. But that's
not to say it is not a valid use case. Perhaps we need to split it
altogether into two use cases.

At least we should have a level of agreement here and register
blueprints for both: for Icehouse the auto selection of which services
go onto which nodes (i.e. allocation of services to nodes is entirely
transparent). For post Icehouse allow manual allocation of services to
nodes. This last bit may also coincide with any work being done in
Ironic/Nova scheduler which will make this allocation prettier than the
current force_nodes situation.


> 
> I'm absolutely for pushing the magic approach as the preferred use. And
> in large deployments that's where people are going to see the biggest
> gain. The fine-grained approach can even be pushed off as a future
> feature. But I wouldn't be surprised to see people asking for it and I'd
> like to at least be able to say it's been talked about.
> 
 - As an infrastructure administrator, Anna wants to be able to view
 the history of nodes that have been in a deployment.
>>> Why? This is super generic and could mean anything.
>> I believe this has something to do with 'archived nodes'. But correct me
>> if I am wrong.
>>
>> -- Jarda
>>
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jay Dobies



On 12/06/2013 09:39 PM, Tzu-Mainn Chen wrote:

Thanks for the comments and questions!  I fully expect that this list of 
requirements
will need to be fleshed out, refined, and heavily modified, so the more the 
merrier.

Comments inline:



*** Requirements are assumed to be targeted for Icehouse, unless marked
otherwise:
(M) - Maybe Icehouse, dependency on other in-development features
(F) - Future requirement, after Icehouse

* NODES


Note that everything in this section should be Ironic API calls.


* Creation
   * Manual registration
  * hardware specs from Ironic based on mac address (M)


Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
stats


  * IP auto populated from Neutron (F)


Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here.


   * Auto-discovery during undercloud install process (M)
* Monitoring
* assignment, availability, status
* capacity, historical statistics (M)


Why is this under 'nodes'? I challenge the idea that it should be
there. We will need to surface some stuff about nodes, but the
underlying idea is to take a cloud approach here - so we're monitoring
services, that happen to be on nodes. There is room to monitor nodes,
as an undercloud feature set, but lets be very very specific about
what is sitting at what layer.


That's a fair point.  At the same time, the UI does want to monitor both
services and the nodes that the services are running on, correct?  I would
think that a user would want this.

Would it be better to explicitly split this up into two separate requirements?


That was my understanding as well, that Tuskar would not only care about 
the services of the undercloud but the health of the actual hardware on 
which it's running. As I write that I think you're correct, two separate 
requirements feels much more explicit in how that's different from 
elsewhere in OpenStack.



* Management node (where triple-o is installed)


This should be plural :) - TripleO isn't a single service to be
installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron,
etc.


I misspoke here - this should be "where the undercloud is installed".  My
current understanding is that our initial release will only support the 
undercloud
being installed onto a single node, but my understanding could very well be 
flawed.


* created as part of undercloud install process
* can create additional management nodes (F)
 * Resource nodes


 ^ nodes is again confusing layers - nodes are
what things are deployed to, but they aren't the entry point


 * searchable by status, name, cpu, memory, and all attributes from
 ironic
 * can be allocated as one of four node types


Not by users though. We need to stop thinking of this as 'what we do
to nodes' - Nova/Ironic operate on nodes, we operate on Heat
templates.


Right, I didn't mean to imply that users would be doing this allocation.  But 
once Nova
does this allocation, the UI does want to be aware of how the allocation is 
done, right?
That's what this requirement meant.


 * compute
 * controller
 * object storage
 * block storage
 * Resource class - allows for further categorization of a node type
 * each node type specifies a single default resource class
 * allow multiple resource classes per node type (M)


Whats a node type?


Compute/controller/object storage/block storage.  Is another term besides "node 
type"
more accurate?




 * optional node profile for a resource class (M)
 * acts as filter for nodes that can be allocated to that
 class (M)


I'm not clear on this - you can list the nodes that have had a
particular thing deployed on them; we probably can get a good answer
to being able to see what nodes a particular flavor can deploy to, but
we don't want to be second guessing the scheduler..


Correct; the goal here is to provide a way through the UI to send additional 
filtering
requirements that will eventually be passed into the scheduler, allowing the 
scheduler
to apply additional filters.


 * nodes can be viewed by node types
 * additional group by status, hardware specification


*Instances* - e.g. hypervisors, storage, block storage etc.


 * controller node type


Again, need to get away from node type here.


* each controller node will run all openstack services
   * allow each node to run specified service (F)
* breakdown by workload (percentage of cpu used per node) (M)
 * Unallocated nodes


This implies an 'allocation' step, that we don't have - how about
'Idle nodes' or something.


Is it imprecise to say that nodes are allocated by the scheduler?  Would 
something like
'active/idle' be better?


 * Archived nodes (F)
 * Will be se

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jay Dobies

I believe we are still 'fighting' here with two approaches and I believe
we need both. We can't only provide a way 'give us resources we will do
a magic'. Yes this is preferred way - especially for large deployments,
but we also need a fallback so that user can say - no, this node doesn't
belong to the class, I don't want it there - unassign. Or I need to have
this node there - assign.


+1 to this. I think there are still a significant amount of admins out 
there that are really opposed to magic and want that fine-grained 
control. Even if they don't use it that frequently, in my experience 
they want to know it's there in the event they need it (and will often 
dream up a case that they'll need it).


I'm absolutely for pushing the magic approach as the preferred use. And 
in large deployments that's where people are going to see the biggest 
gain. The fine-grained approach can even be pushed off as a future 
feature. But I wouldn't be surprised to see people asking for it and I'd 
like to at least be able to say it's been talked about.



- As an infrastructure administrator, Anna wants to be able to view the history 
of nodes that have been in a deployment.

Why? This is super generic and could mean anything.

I believe this has something to do with 'archived nodes'. But correct me
if I am wrong.

-- Jarda


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread mar...@redhat.com
On 07/12/13 04:42, Tzu-Mainn Chen wrote:
>> On 7 December 2013 08:15, Jay Dobies  wrote:
>>> Disclaimer: I'm very new to the project, so apologies if some of my
>>> questions have been already answered or flat out don't make sense.
>>
>>
>> NP :)
>>
>>
  * optional node profile for a resource class (M)
  * acts as filter for nodes that can be allocated to that
 class (M)
>>>
>>>
>>> To my understanding, once this is in Icehouse, we'll have to support
>>> upgrades. If this filtering is pushed off, could we get into a situation
>>> where an allocation created in Icehouse would no longer be valid in
>>> Icehouse+1 once these filters are in place? If so, we might want to make it
>>> more of a priority to get them in place earlier and not eat the headache of
>>> addressing these sorts of integrity issues later.
>>
>> We need to be wary of over-implementing now; a lot of the long term
>> picture is moving Tuskar prototype features into proper homes like
>> Heat and Nova; so the more we implement now the more we have to move.
>>
  * Unallocated nodes
>>>
>>>
>>> Is there more still being flushed out here? Things like:
>>>  * Listing unallocated nodes
>>>  * Unallocating a previously allocated node (does this make it a vanilla
>>> resource or does it retain the resource type? is this the only way to
>>> change
>>> a node's resource type?)
>>
>> Nodes don't have resource types. Nodes are machines Ironic knows
>> about, and thats all they are.
> 
> Once nodes are assigned by nova scheduler, would it be accurate to say that 
> they
> have an implicit resource type?  Or am I missing the point entirely?
> 
>>>  * Unregistering nodes from Tuskar's inventory (I put this under
>>>  unallocated
>>> under the assumption that the workflow will be an explicit unallocate
>>> before
>>> unregister; I'm not sure if this is the same as "archive" below).
>>
>> Tuskar shouldn't have an inventory of nodes.
> 
> Would it be correct to say that Ironic has an inventory of nodes, and that we 
> may
> want to remove a node from Ironic's inventory?

right, in which case (needs to be clarified): Tuskar doesn't store info
about nodes BUT Tuskar (??) the Tuskar UI (??) uses a client to fetch
info directly from Ironic on demand (from the UI).  ??

> 
> Mainn
> 
>> -Rob
>>
>>
>> --
>> Robert Collins 
>> Distinguished Technologist
>> HP Converged Cloud
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Liz Blanchard

On Dec 6, 2013, at 8:20 PM, Robert Collins  wrote:

> On 7 December 2013 09:31, Liz Blanchard  wrote:
>> This list is great, thanks very much for taking the time to write this up! I 
>> think a big part of the User Experience design is to take a step back and 
>> understand the requirements from an end user's point of view…what would they 
>> want to accomplish by using this UI? This might influence the design in 
>> certain ways, so I've taken a cut at a set of user stories for the Icehouse 
>> timeframe based on these requirements that I hope will be useful during 
>> discussions.
>> 
>> Based on the OpenStack Personas[1], I think that Anna would be the main 
>> consumer of the TripleO UI, but please let me know if you think otherwise.
>> 
>> - As an infrastructure administrator, Anna needs to deploy or update a set 
>> of resources that will run OpenStack (This isn't a very specific use case, 
>> but more of the larger end goal of Anna coming into the UI.)
>> - As an infrastructure administrator, Anna expects that the management node 
>> for the deployment services is already up and running and the status of this 
>> node is shown in the UI.
>> - As an infrastructure administrator, Anna wants to be able to quickly see 
>> the set of unallocated nodes that she could use for her deployment of 
>> OpenStack. Ideally, she would not have to manually tell the system about 
>> these nodes. If she needs to manually register nodes for whatever reason, 
>> Anna would only want to have to define the essential data needed to register 
>> these nodes.
> 
> I want to challenge this one. There are two concerns conflated. A)
> seeing available resources for scaling up her cloud. B) minimising
> effort to enroll additional resources. B) is a no-brainer. For A)
> though, as phrased, we're talking about seeing a set of individual
> items: but actually, wouldn't aggregated capacity being more useful,
> with optional drill down - '400 cores, 2TB RAM, 1PB of disk'

Good point. I will update this to read that the user wants to see the available 
capacity and have the option to drill in further. [1]

> 
>> - As an infrastructure administrator, Anna needs to assign a role to each of 
>> the necessary nodes in her OpenStack deployment. The nodes could be either 
>> controller, compute, networking, or storage resources depending on the needs 
>> of this deployment.
> 
> Definitely not: she needs to deliver a running cloud. Manually saying
> 'machine X is a compute node' is confusing an implementation with a
> need. She needs to know that her cloud will have enough capacity to
> meet her users needs; she needs to know that it will be resilient
> against a wide set of failures (and this might be a dial with
> different clouds having different uptime guarantees); she may need to
> ensure that some specific hardware configuration is used for storage,
> as a performance optimisation. None of those needs imply assigning
> roles to machines.
> 
>> - As an infrastructure administrator, Anna wants to review the distribution 
>> of the nodes that she has assigned before kicking off the "Deploy" task.
> 
> If by distribution you mean the top level stats (15 control nodes, 200
> hypervisors, etc) - then I agree. If you mean 'node X will be a
> hypervisor' - I thoroughly disagree. What does that do for her?

We are in agreement, I'd expect the former. I've updated the use case to be 
more specific. [1] 

> 
>> - As an infrastructure administrator, Anna wants to monitor the deployment 
>> process of all of the nodes that she has assigned.
> 
> I don't think she wants to do that. I think she wants to be told if
> there is a problem that needs her intervention to solve - e.g. bad
> IPMI details for a node, or a node not responding when asked to boot
> via PXE.
> 
>> - As an infrastructure administrator, Anna needs to be able to troubleshoot 
>> any errors that may occur during the deployment of nodes process.
> 
> Definitely.
> 
>> - As an infrastructure administrator, Anna wants to monitor the availability 
>> and status of each node in her deployment.
> 
> Yes, with the caveat that I think instance is the key thing here for
> now; there is a lifecycle aspect where being able to say 'machine X is
> having persistent network issues' is very important, as a long term
> thing we should totally aim at that.
> 
>> - As an infrastructure administrator, Anna wants to be able to unallocate a 
>> node from a deployment.
> 
> Why? Whats her motivation. One plausible one for me is 'a machine
> needs to be serviced so Anna wants to remove it from the deployment to
> avoid causing user visible downtime.'  So lets say that: Anna needs to
> be able to take machines out of service so they can be maintained or
> disposed of.
> 
>> - As an infrastructure administrator, Anna wants to be able to view the 
>> history of nodes that have been in a deployment.
> 
> Why? This is super generic and could mean anything.
> 
>> - As an infrastructure administrator, Anna ne

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Liz Blanchard

On Dec 9, 2013, at 8:58 AM, James Slagle  wrote:

> On Fri, Dec 6, 2013 at 4:55 PM, Matt Wagner  wrote:
>>> - As an infrastructure administrator, Anna expects that the
>>> management node for the deployment services is already up and running
>>> and the status of this node is shown in the UI.
>> 
>> The 'management node' here is the undercloud node that Anna is
>> interacting with, as I understand it. (Someone correct me if I'm wrong.)
>> So it's not a bad idea to show its status, but I guess the mere fact
>> that she's using it will indicate that it's operational.
> 
> That's how I read it as well, which assumes that you're using the
> undercloud to manage itself.
> 
> FWIW, based on the OpenStack personas I think that Anna would be the
> one doing the undercloud setup.  So, maybe this use case should be:
> 
> - As an infrastructure administrator, Anna wants to install the
> undercloud so she can use the UI.
> 
> That piece is going to be a pretty big part of the entire deployment
> process, so I think having a use case for it makes sense.

+1. I've added this as the very first use case.

> 
> Nice work on the use cases Liz, thanks for pulling them together.

Thanks to all for the great discussion on these use cases. The 
questions/comments that they've generated is exactly what I was hoping for. I 
will continue to make updates and refine these[1] based on discussions. Of 
course, feel free to add to/change these yourself as well.

Liz

[1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories

> 
> -- 
> -- James Slagle
> --
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread mar...@redhat.com
On 06/12/13 04:31, Tzu-Mainn Chen wrote:
> Hey all,
> 
> I've attempted to spin out the requirements behind Jarda's excellent 
> wireframes 
> (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> Hopefully this can add some perspective on both the wireframes and the needed 
> changes to the tuskar-api.
> 
> All comments are welcome!
> 
> Thanks,
> Tzu-Mainn Chen
> 
> 
> 
> *** Requirements are assumed to be targeted for Icehouse, unless marked 
> otherwise:
>(M) - Maybe Icehouse, dependency on other in-development features
>(F) - Future requirement, after Icehouse
> 
> * NODES
>* Creation
>   * Manual registration
>  * hardware specs from Ironic based on mac address (M)
>  * IP auto populated from Neutron (F)
>   * Auto-discovery during undercloud install process (M)
>* Monitoring
>* assignment, availability, status
>* capacity, historical statistics (M)
>* Management node (where triple-o is installed)
>* created as part of undercloud install process
>* can create additional management nodes (F)
> * Resource nodes
> * searchable by status, name, cpu, memory, and all attributes from 
> ironic
> * can be allocated as one of four node types
> * compute
> * controller
> * object storage
> * block storage
> * Resource class - allows for further categorization of a node type
> * each node type specifies a single default resource class
> * allow multiple resource classes per node type (M)
> * optional node profile for a resource class (M)
> * acts as filter for nodes that can be allocated to that 
> class (M)
> * nodes can be viewed by node types
> * additional group by status, hardware specification
> * controller node type
>* each controller node will run all openstack services
>   * allow each node to run specified service (F)
>* breakdown by workload (percentage of cpu used per node) (M)
> * Unallocated nodes
> * Archived nodes (F)
> * Will be separate openstack service (F)
> 
> * DEPLOYMENT
>* multiple deployments allowed (F)
>  * initially just one
>* deployment specifies a node distribution across node types
>   * node distribution can be updated after creation
>* deployment configuration, used for initial creation only
>   * defaulted, with no option to change
>  * allow modification (F)
>* review distribution map (F)
>* notification when a deployment is ready to go or whenever something 
> changes
> 
> * DEPLOYMENT ACTION
>* Heat template generated on the fly
>   * hardcoded images
>  * allow image selection (F)
>   * pre-created template fragments for each node type
>   * node type distribution affects generated template

sorry am a bit late to the discussion - fyi:

 there are two sides to these previous points 1) temp solution using
merge.py from tuskar and the tripleo-heat-templates repo. (Icehouse,
imo) and 2) doing it 'properly' with the merge functionality pushed into
heat. (F, imo).

For 1) various bits are in play: fyi/if interested:

 /#/c/56947/ (Make merge.py invokable), /#/c/58823/ (Make merge.py
installable) and /#/c/52045/ (WIP : sketch of what using merge.py looks
like for tuskar) this last one needs updating and thought. Also
/#/c/58229/ and /#/c/57210/ which need some more thought,



>* nova scheduler allocates nodes
>   * filters based on resource class and node profile information (M)
>* Deployment action can create or update
>* status indicator to determine overall state of deployment
>   * status indicator for nodes as well
>   * status includes 'time left' (F)
> 
> * NETWORKS (F)
> * IMAGES (F)
> * LOGS (F)
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Liz Blanchard

On Dec 9, 2013, at 4:57 AM, Jaromir Coufal  wrote:

> 
> On 2013/07/12 02:20, Robert Collins wrote:
>>> - As an infrastructure administrator, Anna needs to assign a role to each 
>>> of the necessary nodes in her OpenStack deployment. The nodes could be 
>>> either controller, compute, networking, or storage resources depending on 
>>> the needs of this deployment.
>> Definitely not: she needs to deliver a running cloud. Manually saying
>> 'machine X is a compute node' is confusing an implementation with a
>> need. She needs to know that her cloud will have enough capacity to
>> meet her users needs; she needs to know that it will be resilient
>> against a wide set of failures (and this might be a dial with
>> different clouds having different uptime guarantees); she may need to
>> ensure that some specific hardware configuration is used for storage,
>> as a performance optimisation. None of those needs imply assigning
>> roles to machines.
> Yes, in ideal world and large deployments. But there might be cases when Anna 
> will need to say - deploy storage to this specific node. Not arguing that we 
> want to have policy based approach, but we need to cover also manual control 
> (forcing node to take some role).

Perhaps the use case is that Anna would want to define the different capacities 
that her cloud deployment will need? You both a right though, we don't want to 
force the user to manually select which nodes will run which services, but we 
should allow it for cases in which it's needed. I've updated the use case as an 
attempt to clear this up. [1]

> 
>>> - As an infrastructure administrator, Anna wants to monitor the deployment 
>>> process of all of the nodes that she has assigned.
>> I don't think she wants to do that. I think she wants to be told if
>> there is a problem that needs her intervention to solve - e.g. bad
>> IPMI details for a node, or a node not responding when asked to boot
>> via PXE.
> I think by this user story Liz wanted to capture that Anna wants to see if 
> the deployment process is still being in progress or if it has 
> finished/failed, etc. Which I agree with. I don't think that she will sit and 
> watch what is happening.

Yes, definitely. I've updated this use case to reflect reality in that Anna 
would not sit there and actively monitor, but rather she would want to 
ultimately make sure that there weren't any errors during the deployment 
process. [1]

>  
>> - As an infrastructure administrator, Anna wants to be able to unallocate a 
>> node from a deployment.
>> Why? Whats her motivation. One plausible one for me is 'a machine
>> needs to be serviced so Anna wants to remove it from the deployment to
>> avoid causing user visible downtime.'  So lets say that: Anna needs to
>> be able to take machines out of service so they can be maintained or
>> disposed of.
> Node being serviced is a different user story for me.
> 
> I believe we are still 'fighting' here with two approaches and I believe we 
> need both. We can't only provide a way 'give us resources we will do a 
> magic'. Yes this is preferred way - especially for large deployments, but we 
> also need a fallback so that user can say - no, this node doesn't belong to 
> the class, I don't want it there - unassign. Or I need to have this node 
> there - assign.

This is a great question, Robert. I think the reason you bring up for Anna 
wanting to remove a node is actually more of a "Disable node" action. This way 
she could potentially bring it back up after the maintenance is done. I will 
add some more details to this use case to try to clarify. [1]

> 
>>> - As an infrastructure administrator, Anna wants to be able to view the 
>>> history of nodes that have been in a deployment.
>> Why? This is super generic and could mean anything.
> I believe this has something to do with 'archived nodes'. But correct me if I 
> am wrong.

I was assuming it would be incase the user wants to go back to view the history 
of a certain node. Potentially the user could bring an archived node back 
online? Although maybe at this point it would just be rediscovered?

Thanks,
Liz

[1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories

> 
> -- Jarda
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Liz Blanchard

On Dec 9, 2013, at 4:29 AM, Jaromir Coufal  wrote:

> 
> On 2013/06/12 22:55, Matt Wagner wrote:
>>> - As an infrastructure administrator, Anna wants to review the
>>> distribution of the nodes that she has assigned before kicking off
>>> the "Deploy" task.
>> What does she expect to see here on the review screen that she didn't
>> see on the previous screens, if anything? Is this just a summation, or
>> is she expecting to see things like which node will get which role? (I'd
>> argue for the former; I don't know that we can predict the latter.)
> At the beginning, just summation. Later (when we have nova-scheduler 
> reservation) we can get the real distribution of which node is taking which 
> role.

Yes, the idea is that Anna wants to see some representation of what the 
distribution of nodes would be (how many would be assigned to each profile) 
before kicking off the "deploy" action.

> 
>>> - As an infrastructure administrator, Anna wants to monitor the
>>> deployment process of all of the nodes that she has assigned.
>> I think there's an implied "...through the UI" here, versus tailing log
>> files to watch state. Does she just expect to see states like "Pending",
>> "Deploying", or "Finished", versus, say, having the full logs shown in
>> the UI? (I'd vote 'yes'.)
> For simplified view - yes, only change of states and progress bar. However 
> log should be available.

I'd vote 'yes' as well. These are definitely design decisions we should be 
making based on what we know of our end user. Although some use cases like 
troubleshooting might point towards using logs, this one definitely seems like 
a UI addition. I'll update the use case to be more specific. [1]

> 
>>> - As an infrastructure administrator, Anna needs to be able to
>>> troubleshoot any errors that may occur during the deployment of nodes
>>> process.
>> I'm not sure that the "...through the UI" implication I mentioned above
>> extends here. (IMHO) I assume that if things fail, Anna might be okay
>> with us showing a message that $foo failed on $bar, and she should try
>> looking in /var/log/$baz for full details. Does that seem fair? (At
>> least early on.)
> As said above, for simplified views, it is ok to say $foo failed on $bar, but 
> she should be able to track the problem - logs section in the UI.

Yes, this is meant to be through the UI. I've updated the use case. [1]

> 
>>> - As an infrastructure administrator, Anna wants to be able to view
>>> the history of nodes that have been in a deployment.
>> Why does she want to view history of past nodes?
>> 
>> Note that I'm not arguing against this; it's just not abundantly clear
>> to me what she'll be using this information for. Does she want a history
>> to check off an "Audit log" checkbox, or will she be looking to extract
>> certain data from this history?
> Short answer is Graphs - history of utilization of the class etc.

I've updated this one to be more specific about the reasons why historic nodes 
is important to Anna. [1]

Thanks for all of the feedback,
Liz

[1] https://wiki.openstack.org/wiki/TripleO/Tuskar/IcehouseUserStories

> 
> -- Jarda
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread James Slagle
On Fri, Dec 6, 2013 at 4:55 PM, Matt Wagner  wrote:
>> - As an infrastructure administrator, Anna expects that the
>> management node for the deployment services is already up and running
>> and the status of this node is shown in the UI.
>
> The 'management node' here is the undercloud node that Anna is
> interacting with, as I understand it. (Someone correct me if I'm wrong.)
> So it's not a bad idea to show its status, but I guess the mere fact
> that she's using it will indicate that it's operational.

That's how I read it as well, which assumes that you're using the
undercloud to manage itself.

FWIW, based on the OpenStack personas I think that Anna would be the
one doing the undercloud setup.  So, maybe this use case should be:

- As an infrastructure administrator, Anna wants to install the
undercloud so she can use the UI.

That piece is going to be a pretty big part of the entire deployment
process, so I think having a use case for it makes sense.

Nice work on the use cases Liz, thanks for pulling them together.

-- 
-- James Slagle
--

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread James Slagle
Mainn,

Thanks for pulling this together.

> * NODES
>* Management node (where triple-o is installed)
>* created as part of undercloud install process

I think getting the undercloud installed/deployed should be a
requirement for Icehouse.  I'm not sure if you meant that or were
assuming that it would already be done :).  I'd like to see a simpler
process than building the seed vm, starting it, deploying undercloud,
etc.  But, that's something we can work to define if others agree as
well.

>* can create additional management nodes (F)

By this, do you mean using the undercloud to scale itself?  e.g.,
using nova on the undercloud to launch an additional undercloud
compute node, etc.  I like that concept, and don't see any reason why
that wouldn't be technically possible.

> * DEPLOYMENT ACTION
>* Heat template generated on the fly
>   * hardcoded images
>  * allow image selection (F)

So, I think this may be what Robert was getting at, but I think this
one should be M or possibly even committed to Icehouse.  I think it's
very likely we're going to need to update which image is used to do
the deployment, e.g., if you build a new image to pick up a security
update.

IIRC, the image is just referenced by name in the template.  So,
maybe the process is just:

* build the new image
* rename/delete the old image
* upload the new image with the required name (overcloud-compute,
overcloud-control)

However, having a nicer image selection process would be nice.


-- 
-- James Slagle
--

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Imre Farkas

On 12/09/2013 11:56 AM, Jaromir Coufal wrote:

On 2013/07/12 01:59, Robert Collins wrote:

* Monitoring
* assignment, availability, status
* capacity, historical statistics (M)

Why is this under 'nodes'? I challenge the idea that it should be
there. We will need to surface some stuff about nodes, but the
underlying idea is to take a cloud approach here - so we're monitoring
services, that happen to be on nodes. There is room to monitor nodes,
as an undercloud feature set, but lets be very very specific about
what is sitting at what layer.

We need both - we need to track services but also state of nodes (CPU,
RAM, Network bandwidth, etc). So in node detail you should be able to
track both.


I agree. Monitoring services and monitoring nodes are both valid 
features for Tuskar. I think splitting it into two separate requirements 
as Mainn suggested would make a lot of sense.



 * searchable by status, name, cpu, memory, and all attributes from 
ironic
 * can be allocated as one of four node types

Not by users though. We need to stop thinking of this as 'what we do
to nodes' - Nova/Ironic operate on nodes, we operate on Heat
templates.

Discussed in other threads, but I still believe (and I am not alone)
that we need to allow 'force nodes'.


Yeah, having both approaches would be nice to have. Instead of using the 
existing 'force nodes' implementation, wouldn't it be better/cleaner to 
implement support for it in Nova and Heat?


Imre

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jaromir Coufal


On 2013/07/12 01:59, Robert Collins wrote:


* Creation
   * Manual registration
  * hardware specs from Ironic based on mac address (M)

Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory stats
For registration it is just Management MAC address which is needed 
right? Or does Ironic need also IP? I think that MAC address might be 
enough, we can display IP in details of node later on.



  * IP auto populated from Neutron (F)

Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here.

+1


   * Auto-discovery during undercloud install process (M)
* Monitoring
* assignment, availability, status
* capacity, historical statistics (M)

Why is this under 'nodes'? I challenge the idea that it should be
there. We will need to surface some stuff about nodes, but the
underlying idea is to take a cloud approach here - so we're monitoring
services, that happen to be on nodes. There is room to monitor nodes,
as an undercloud feature set, but lets be very very specific about
what is sitting at what layer.
We need both - we need to track services but also state of nodes (CPU, 
RAM, Network bandwidth, etc). So in node detail you should be able to 
track both.



* Management node (where triple-o is installed)

This should be plural :) - TripleO isn't a single service to be
installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron,
etc.


* created as part of undercloud install process
* can create additional management nodes (F)
 * Resource nodes

 ^ nodes is again confusing layers - nodes are
what things are deployed to, but they aren't the entry point

Can you, please be a bit more specific here? I don't understand this note.




 * searchable by status, name, cpu, memory, and all attributes from 
ironic
 * can be allocated as one of four node types

Not by users though. We need to stop thinking of this as 'what we do
to nodes' - Nova/Ironic operate on nodes, we operate on Heat
templates.
Discussed in other threads, but I still believe (and I am not alone) 
that we need to allow 'force nodes'.



 * Unallocated nodes
This implies an 'allocation' step, that we don't have - how about
'Idle nodes' or something.

It can be auto-allocation. I don't see problem with 'unallocated' term.


   * defaulted, with no option to change
  * allow modification (F)
* review distribution map (F)
* notification when a deployment is ready to go or whenever something 
changes

Is this an (M) ?
Might be M but with higher priority. I see it in the middle. But if we 
have to decide, it can be M.

-- Jarda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jaromir Coufal


On 2013/07/12 02:20, Robert Collins wrote:

- As an infrastructure administrator, Anna needs to assign a role to each of 
the necessary nodes in her OpenStack deployment. The nodes could be either 
controller, compute, networking, or storage resources depending on the needs of 
this deployment.

Definitely not: she needs to deliver a running cloud. Manually saying
'machine X is a compute node' is confusing an implementation with a
need. She needs to know that her cloud will have enough capacity to
meet her users needs; she needs to know that it will be resilient
against a wide set of failures (and this might be a dial with
different clouds having different uptime guarantees); she may need to
ensure that some specific hardware configuration is used for storage,
as a performance optimisation. None of those needs imply assigning
roles to machines.
Yes, in ideal world and large deployments. But there might be cases when 
Anna will need to say - deploy storage to this specific node. Not 
arguing that we want to have policy based approach, but we need to cover 
also manual control (forcing node to take some role).



- As an infrastructure administrator, Anna wants to monitor the deployment 
process of all of the nodes that she has assigned.

I don't think she wants to do that. I think she wants to be told if
there is a problem that needs her intervention to solve - e.g. bad
IPMI details for a node, or a node not responding when asked to boot
via PXE.
I think by this user story Liz wanted to capture that Anna wants to see 
if the deployment process is still being in progress or if it has 
finished/failed, etc. Which I agree with. I don't think that she will 
sit and watch what is happening.



- As an infrastructure administrator, Anna wants to be able to unallocate a 
node from a deployment.
Why? Whats her motivation. One plausible one for me is 'a machine
needs to be serviced so Anna wants to remove it from the deployment to
avoid causing user visible downtime.'  So lets say that: Anna needs to
be able to take machines out of service so they can be maintained or
disposed of.

Node being serviced is a different user story for me.

I believe we are still 'fighting' here with two approaches and I believe 
we need both. We can't only provide a way 'give us resources we will do 
a magic'. Yes this is preferred way - especially for large deployments, 
but we also need a fallback so that user can say - no, this node doesn't 
belong to the class, I don't want it there - unassign. Or I need to have 
this node there - assign.



- As an infrastructure administrator, Anna wants to be able to view the history 
of nodes that have been in a deployment.

Why? This is super generic and could mean anything.
I believe this has something to do with 'archived nodes'. But correct me 
if I am wrong.


-- Jarda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jaromir Coufal


On 2013/06/12 22:55, Matt Wagner wrote:

- As an infrastructure administrator, Anna wants to review the
distribution of the nodes that she has assigned before kicking off
the "Deploy" task.

What does she expect to see here on the review screen that she didn't
see on the previous screens, if anything? Is this just a summation, or
is she expecting to see things like which node will get which role? (I'd
argue for the former; I don't know that we can predict the latter.)
At the beginning, just summation. Later (when we have nova-scheduler 
reservation) we can get the real distribution of which node is taking 
which role.



- As an infrastructure administrator, Anna wants to monitor the
deployment process of all of the nodes that she has assigned.

I think there's an implied "...through the UI" here, versus tailing log
files to watch state. Does she just expect to see states like "Pending",
"Deploying", or "Finished", versus, say, having the full logs shown in
the UI? (I'd vote 'yes'.)
For simplified view - yes, only change of states and progress bar. 
However log should be available.



- As an infrastructure administrator, Anna needs to be able to
troubleshoot any errors that may occur during the deployment of nodes
process.

I'm not sure that the "...through the UI" implication I mentioned above
extends here. (IMHO) I assume that if things fail, Anna might be okay
with us showing a message that $foo failed on $bar, and she should try
looking in /var/log/$baz for full details. Does that seem fair? (At
least early on.)
As said above, for simplified views, it is ok to say $foo failed on 
$bar, but she should be able to track the problem - logs section in the UI.



- As an infrastructure administrator, Anna wants to be able to view
the history of nodes that have been in a deployment.

Why does she want to view history of past nodes?

Note that I'm not arguing against this; it's just not abundantly clear
to me what she'll be using this information for. Does she want a history
to check off an "Audit log" checkbox, or will she be looking to extract
certain data from this history?

Short answer is Graphs - history of utilization of the class etc.

-- Jarda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-09 Thread Jaromir Coufal


On 2013/06/12 21:26, Tzu-Mainn Chen wrote:

* can be allocated as one of four node types

It's pretty clear by the current verbiage but I'm going to ask anyway:
"one and only one"?

Yep, that's right!

Confirming. One and only one.


My gut reaction is that we want to bite this off sooner rather than
later. This will have data model and API implications that, even if we
don't commit to it for Icehouse, should still be in our minds during it,
so it might make sense to make it a first class thing to just nail down now.

That is entirely correct, which is one reason it's on the list of requirements. 
 The
forthcoming API design will have to account for it.  Not recreating the entire 
data
model between releases is a key goal :)
Well yeah, that's why we should try to think in a longer-term and 
wireframes are covering also a bit more than might land in Icehouse. So 
that we are aware of future direction and we don't have to completely 
rebuild underlying models later on.



  * optional node profile for a resource class (M)
  * acts as filter for nodes that can be allocated to that
  class (M)

To my understanding, once this is in Icehouse, we'll have to support
upgrades. If this filtering is pushed off, could we get into a situation
where an allocation created in Icehouse would no longer be valid in
Icehouse+1 once these filters are in place? If so, we might want to make
it more of a priority to get them in place earlier and not eat the
headache of addressing these sorts of integrity issues later.
Hm, can you be a bit more specific about how the allocation created in I 
might no longer be valid in I+1?



That's true.  The problem is that to my understanding, the filters we'd
need in nova-scheduler are not yet fully in place.
I think at the moment there are 'extra params' which we might use to 
some level. But yes, AFAIK there is missing part for filtered scheduling 
in nova.


I also think that this is an issue that we'll need to address no matter what.
Even once filters exist, if a user applies a filter *after* nodes are allocated,
we'll need to do something clever if the already-allocated nodes don't meet the
filter criteria.
Well here is a thing. Once nodes are allocated, you can get warning, 
that those nodes in the resource class are not fulfilling the criteria 
(if they were changed) but that's all. It will be up to user's decision 
if he wants to keep them in or unallocate them. The profiles are 
important when a decision 'which node can get in' is being made.



  * nodes can be viewed by node types
  * additional group by status, hardware specification
  * controller node type
 * each controller node will run all openstack services
* allow each node to run specified service (F)
 * breakdown by workload (percentage of cpu used per node) (M)
  * Unallocated nodes

Is there more still being flushed out here? Things like:
   * Listing unallocated nodes
   * Unallocating a previously allocated node (does this make it a
vanilla resource or does it retain the resource type? is this the only
way to change a node's resource type?)
If we use policy based approach then yes this is correct. First 
unallocate a node and then increase number of resources in other class.


But I believe that we need keep control over your infrastructure and not 
to relay only on policies. So I hope we can get into something like 
'reallocate'/'allocate manually' which will force a node to be part of 
specific class.



   * Unregistering nodes from Tuskar's inventory (I put this under
unallocated under the assumption that the workflow will be an explicit
unallocate before unregister; I'm not sure if this is the same as
"archive" below).

Ah, you're entirely right.  I'll add these to the list.


  * Archived nodes (F)

Can you elaborate a bit more on what this is?

To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there was
an OpenStack service in the process of being planned that would handle this
requirement.  Jarda, can you detail a bit?
So the thing is based on historical data. At the moment, there is no 
service which would keep this type of data (might be new project?). 
Since Tuskar will not be only deploying but also monitoring your 
deployment, it is important to have historical data available. If user 
removes some nodes from infrastructure, he would lose all the data and 
we would not be able to generate graphs.That's why archived nodes = 
nodes which were registered in past but are no longer available.


-- Jarda
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Tzu-Mainn Chen
> On 7 December 2013 08:15, Jay Dobies  wrote:
> > Disclaimer: I'm very new to the project, so apologies if some of my
> > questions have been already answered or flat out don't make sense.
> 
> 
> NP :)
> 
> 
> >>  * optional node profile for a resource class (M)
> >>  * acts as filter for nodes that can be allocated to that
> >> class (M)
> >
> >
> > To my understanding, once this is in Icehouse, we'll have to support
> > upgrades. If this filtering is pushed off, could we get into a situation
> > where an allocation created in Icehouse would no longer be valid in
> > Icehouse+1 once these filters are in place? If so, we might want to make it
> > more of a priority to get them in place earlier and not eat the headache of
> > addressing these sorts of integrity issues later.
> 
> We need to be wary of over-implementing now; a lot of the long term
> picture is moving Tuskar prototype features into proper homes like
> Heat and Nova; so the more we implement now the more we have to move.
> 
> >>  * Unallocated nodes
> >
> >
> > Is there more still being flushed out here? Things like:
> >  * Listing unallocated nodes
> >  * Unallocating a previously allocated node (does this make it a vanilla
> > resource or does it retain the resource type? is this the only way to
> > change
> > a node's resource type?)
> 
> Nodes don't have resource types. Nodes are machines Ironic knows
> about, and thats all they are.

Once nodes are assigned by nova scheduler, would it be accurate to say that they
have an implicit resource type?  Or am I missing the point entirely?

> >  * Unregistering nodes from Tuskar's inventory (I put this under
> >  unallocated
> > under the assumption that the workflow will be an explicit unallocate
> > before
> > unregister; I'm not sure if this is the same as "archive" below).
> 
> Tuskar shouldn't have an inventory of nodes.

Would it be correct to say that Ironic has an inventory of nodes, and that we 
may
want to remove a node from Ironic's inventory?

Mainn

> -Rob
> 
> 
> --
> Robert Collins 
> Distinguished Technologist
> HP Converged Cloud
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Tzu-Mainn Chen
Thanks for the comments and questions!  I fully expect that this list of 
requirements
will need to be fleshed out, refined, and heavily modified, so the more the 
merrier.

Comments inline:

> >
> > *** Requirements are assumed to be targeted for Icehouse, unless marked
> > otherwise:
> >(M) - Maybe Icehouse, dependency on other in-development features
> >(F) - Future requirement, after Icehouse
> >
> > * NODES
> 
> Note that everything in this section should be Ironic API calls.
> 
> >* Creation
> >   * Manual registration
> >  * hardware specs from Ironic based on mac address (M)
> 
> Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory
> stats
> 
> >  * IP auto populated from Neutron (F)
> 
> Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here.
> 
> >   * Auto-discovery during undercloud install process (M)
> >* Monitoring
> >* assignment, availability, status
> >* capacity, historical statistics (M)
> 
> Why is this under 'nodes'? I challenge the idea that it should be
> there. We will need to surface some stuff about nodes, but the
> underlying idea is to take a cloud approach here - so we're monitoring
> services, that happen to be on nodes. There is room to monitor nodes,
> as an undercloud feature set, but lets be very very specific about
> what is sitting at what layer.

That's a fair point.  At the same time, the UI does want to monitor both
services and the nodes that the services are running on, correct?  I would
think that a user would want this.

Would it be better to explicitly split this up into two separate requirements?

> >* Management node (where triple-o is installed)
> 
> This should be plural :) - TripleO isn't a single service to be
> installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron,
> etc.

I misspoke here - this should be "where the undercloud is installed".  My
current understanding is that our initial release will only support the 
undercloud
being installed onto a single node, but my understanding could very well be 
flawed.

> >* created as part of undercloud install process
> >* can create additional management nodes (F)
> > * Resource nodes
> 
> ^ nodes is again confusing layers - nodes are
> what things are deployed to, but they aren't the entry point
> 
> > * searchable by status, name, cpu, memory, and all attributes from
> > ironic
> > * can be allocated as one of four node types
> 
> Not by users though. We need to stop thinking of this as 'what we do
> to nodes' - Nova/Ironic operate on nodes, we operate on Heat
> templates.

Right, I didn't mean to imply that users would be doing this allocation.  But 
once Nova
does this allocation, the UI does want to be aware of how the allocation is 
done, right?
That's what this requirement meant.

> > * compute
> > * controller
> > * object storage
> > * block storage
> > * Resource class - allows for further categorization of a node type
> > * each node type specifies a single default resource class
> > * allow multiple resource classes per node type (M)
> 
> Whats a node type?

Compute/controller/object storage/block storage.  Is another term besides "node 
type"
more accurate?

> 
> > * optional node profile for a resource class (M)
> > * acts as filter for nodes that can be allocated to that
> > class (M)
> 
> I'm not clear on this - you can list the nodes that have had a
> particular thing deployed on them; we probably can get a good answer
> to being able to see what nodes a particular flavor can deploy to, but
> we don't want to be second guessing the scheduler..

Correct; the goal here is to provide a way through the UI to send additional 
filtering
requirements that will eventually be passed into the scheduler, allowing the 
scheduler
to apply additional filters.

> > * nodes can be viewed by node types
> > * additional group by status, hardware specification
> 
> *Instances* - e.g. hypervisors, storage, block storage etc.
> 
> > * controller node type
> 
> Again, need to get away from node type here.
> 
> >* each controller node will run all openstack services
> >   * allow each node to run specified service (F)
> >* breakdown by workload (percentage of cpu used per node) (M)
> > * Unallocated nodes
> 
> This implies an 'allocation' step, that we don't have - how about
> 'Idle nodes' or something.

Is it imprecise to say that nodes are allocated by the scheduler?  Would 
something like
'active/idle' be better?

> > * Archived nodes (F)
> > * Will be separate openstack service (F)
> >
> > * DEPLOYMENT
> >* multiple deployments allowed (F)
> >  * initially just one
> >* deployment specifies a node distribution across no

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Robert Collins
On 7 December 2013 10:55, Matt Wagner  wrote:

> The 'management node' here is the undercloud node that Anna is
> interacting with, as I understand it. (Someone correct me if I'm wrong.)
> So it's not a bad idea to show its status, but I guess the mere fact
> that she's using it will indicate that it's operational.

There are potentially many such nodes, and Anna will be interacting
with some of them; I don't think we can make too many assumptions
about what the UI working implies.

>> - As an infrastructure administrator, Anna needs to be able to
>> troubleshoot any errors that may occur during the deployment of nodes
>> process.
>
> I'm not sure that the "...through the UI" implication I mentioned above
> extends here. (IMHO) I assume that if things fail, Anna might be okay
> with us showing a message that $foo failed on $bar, and she should try
> looking in /var/log/$baz for full details. Does that seem fair? (At
> least early on.)

I don't think we necessarily need to do anything here other than make
sure the system is a) well documented and b) Anna has all the normal
sysadmin access to the infrastructure. Her needs can be met by us
getting out of the way gracefully; at least in the short term.


-Rob

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Robert Collins
On 7 December 2013 09:31, Liz Blanchard  wrote:
> This list is great, thanks very much for taking the time to write this up! I 
> think a big part of the User Experience design is to take a step back and 
> understand the requirements from an end user's point of view…what would they 
> want to accomplish by using this UI? This might influence the design in 
> certain ways, so I've taken a cut at a set of user stories for the Icehouse 
> timeframe based on these requirements that I hope will be useful during 
> discussions.
>
> Based on the OpenStack Personas[1], I think that Anna would be the main 
> consumer of the TripleO UI, but please let me know if you think otherwise.
>
> - As an infrastructure administrator, Anna needs to deploy or update a set of 
> resources that will run OpenStack (This isn't a very specific use case, but 
> more of the larger end goal of Anna coming into the UI.)
> - As an infrastructure administrator, Anna expects that the management node 
> for the deployment services is already up and running and the status of this 
> node is shown in the UI.
> - As an infrastructure administrator, Anna wants to be able to quickly see 
> the set of unallocated nodes that she could use for her deployment of 
> OpenStack. Ideally, she would not have to manually tell the system about 
> these nodes. If she needs to manually register nodes for whatever reason, 
> Anna would only want to have to define the essential data needed to register 
> these nodes.

I want to challenge this one. There are two concerns conflated. A)
seeing available resources for scaling up her cloud. B) minimising
effort to enroll additional resources. B) is a no-brainer. For A)
though, as phrased, we're talking about seeing a set of individual
items: but actually, wouldn't aggregated capacity being more useful,
with optional drill down - '400 cores, 2TB RAM, 1PB of disk'

> - As an infrastructure administrator, Anna needs to assign a role to each of 
> the necessary nodes in her OpenStack deployment. The nodes could be either 
> controller, compute, networking, or storage resources depending on the needs 
> of this deployment.

Definitely not: she needs to deliver a running cloud. Manually saying
'machine X is a compute node' is confusing an implementation with a
need. She needs to know that her cloud will have enough capacity to
meet her users needs; she needs to know that it will be resilient
against a wide set of failures (and this might be a dial with
different clouds having different uptime guarantees); she may need to
ensure that some specific hardware configuration is used for storage,
as a performance optimisation. None of those needs imply assigning
roles to machines.

> - As an infrastructure administrator, Anna wants to review the distribution 
> of the nodes that she has assigned before kicking off the "Deploy" task.

If by distribution you mean the top level stats (15 control nodes, 200
hypervisors, etc) - then I agree. If you mean 'node X will be a
hypervisor' - I thoroughly disagree. What does that do for her?

> - As an infrastructure administrator, Anna wants to monitor the deployment 
> process of all of the nodes that she has assigned.

I don't think she wants to do that. I think she wants to be told if
there is a problem that needs her intervention to solve - e.g. bad
IPMI details for a node, or a node not responding when asked to boot
via PXE.

> - As an infrastructure administrator, Anna needs to be able to troubleshoot 
> any errors that may occur during the deployment of nodes process.

Definitely.

> - As an infrastructure administrator, Anna wants to monitor the availability 
> and status of each node in her deployment.

Yes, with the caveat that I think instance is the key thing here for
now; there is a lifecycle aspect where being able to say 'machine X is
having persistent network issues' is very important, as a long term
thing we should totally aim at that.

> - As an infrastructure administrator, Anna wants to be able to unallocate a 
> node from a deployment.

Why? Whats her motivation. One plausible one for me is 'a machine
needs to be serviced so Anna wants to remove it from the deployment to
avoid causing user visible downtime.'  So lets say that: Anna needs to
be able to take machines out of service so they can be maintained or
disposed of.

> - As an infrastructure administrator, Anna wants to be able to view the 
> history of nodes that have been in a deployment.

Why? This is super generic and could mean anything.

> - As an infrastructure administrator, Anna needs to be notified of any 
> important changes to nodes that are in the OpenStack deployment. She does not 
> want to be spammed with non-important notifications.

What sort of changes do you mean here?



Thanks for putting this together, I love Personas as a way to make
designs concrete and connected to user needs.

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

_

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Robert Collins
On 7 December 2013 09:26, Tzu-Mainn Chen  wrote:

>> >  * Archived nodes (F)
>>
>> Can you elaborate a bit more on what this is?
>
> To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there 
> was
> an OpenStack service in the process of being planned that would handle this
> requirement.  Jarda, can you detail a bit?

Ironic is a hypervisor service, roughly like libvirt+kvm for virtual
machines : so it doesn't keep a deep history of whats been deployed
where and other similar things : it's not a CMDB. Historical reporting
is something to push data for into ceilometer, for instance. Nova has
some support for historical data, and it's possible that that minimal
approach might fit in Ironic, but I'm super skeptical - Ironic would
have nothing to *do* for historical data, so why keep it there at all?

-Rob


--
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Robert Collins
On 7 December 2013 08:15, Jay Dobies  wrote:
> Disclaimer: I'm very new to the project, so apologies if some of my
> questions have been already answered or flat out don't make sense.


NP :)


>>  * optional node profile for a resource class (M)
>>  * acts as filter for nodes that can be allocated to that
>> class (M)
>
>
> To my understanding, once this is in Icehouse, we'll have to support
> upgrades. If this filtering is pushed off, could we get into a situation
> where an allocation created in Icehouse would no longer be valid in
> Icehouse+1 once these filters are in place? If so, we might want to make it
> more of a priority to get them in place earlier and not eat the headache of
> addressing these sorts of integrity issues later.

We need to be wary of over-implementing now; a lot of the long term
picture is moving Tuskar prototype features into proper homes like
Heat and Nova; so the more we implement now the more we have to move.

>>  * Unallocated nodes
>
>
> Is there more still being flushed out here? Things like:
>  * Listing unallocated nodes
>  * Unallocating a previously allocated node (does this make it a vanilla
> resource or does it retain the resource type? is this the only way to change
> a node's resource type?)

Nodes don't have resource types. Nodes are machines Ironic knows
about, and thats all they are.

>  * Unregistering nodes from Tuskar's inventory (I put this under unallocated
> under the assumption that the workflow will be an explicit unallocate before
> unregister; I'm not sure if this is the same as "archive" below).

Tuskar shouldn't have an inventory of nodes.

-Rob


-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Robert Collins
Thanks for doing this!

On 6 December 2013 15:31, Tzu-Mainn Chen  wrote:
> Hey all,
>
> I've attempted to spin out the requirements behind Jarda's excellent 
> wireframes 
> (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> Hopefully this can add some perspective on both the wireframes and the needed 
> changes to the tuskar-api.
>
> All comments are welcome!
>
> Thanks,
> Tzu-Mainn Chen
>
> 
>
> *** Requirements are assumed to be targeted for Icehouse, unless marked 
> otherwise:
>(M) - Maybe Icehouse, dependency on other in-development features
>(F) - Future requirement, after Icehouse
>
> * NODES

Note that everything in this section should be Ironic API calls.

>* Creation
>   * Manual registration
>  * hardware specs from Ironic based on mac address (M)

Ironic today will want IPMI address + MAC for each NIC + disk/cpu/memory stats

>  * IP auto populated from Neutron (F)

Do you mean IPMI IP ? I'd say IPMI address managed by Neutron here.

>   * Auto-discovery during undercloud install process (M)
>* Monitoring
>* assignment, availability, status
>* capacity, historical statistics (M)

Why is this under 'nodes'? I challenge the idea that it should be
there. We will need to surface some stuff about nodes, but the
underlying idea is to take a cloud approach here - so we're monitoring
services, that happen to be on nodes. There is room to monitor nodes,
as an undercloud feature set, but lets be very very specific about
what is sitting at what layer.

>* Management node (where triple-o is installed)

This should be plural :) - TripleO isn't a single service to be
installed - We've got Tuskar, Ironic, Nova, Glance, Keystone, Neutron,
etc.

>* created as part of undercloud install process
>* can create additional management nodes (F)
> * Resource nodes

^ nodes is again confusing layers - nodes are
what things are deployed to, but they aren't the entry point

> * searchable by status, name, cpu, memory, and all attributes from 
> ironic
> * can be allocated as one of four node types

Not by users though. We need to stop thinking of this as 'what we do
to nodes' - Nova/Ironic operate on nodes, we operate on Heat
templates.

> * compute
> * controller
> * object storage
> * block storage
> * Resource class - allows for further categorization of a node type
> * each node type specifies a single default resource class
> * allow multiple resource classes per node type (M)

Whats a node type?

> * optional node profile for a resource class (M)
> * acts as filter for nodes that can be allocated to that 
> class (M)

I'm not clear on this - you can list the nodes that have had a
particular thing deployed on them; we probably can get a good answer
to being able to see what nodes a particular flavor can deploy to, but
we don't want to be second guessing the scheduler..

> * nodes can be viewed by node types
> * additional group by status, hardware specification

*Instances* - e.g. hypervisors, storage, block storage etc.

> * controller node type

Again, need to get away from node type here.

>* each controller node will run all openstack services
>   * allow each node to run specified service (F)
>* breakdown by workload (percentage of cpu used per node) (M)
> * Unallocated nodes

This implies an 'allocation' step, that we don't have - how about
'Idle nodes' or something.

> * Archived nodes (F)
> * Will be separate openstack service (F)
>
> * DEPLOYMENT
>* multiple deployments allowed (F)
>  * initially just one
>* deployment specifies a node distribution across node types

I can't parse this. Deployments specify how many instances to deploy
in what roles (e.g. 2 control, 2 storage, 4 block storage, 20
hypervisors), some minor metadata about the instances (such as 'kvm'
for the hypervisor, and what undercloud flavors to deploy on).

>   * node distribution can be updated after creation
>* deployment configuration, used for initial creation only

Can you enlarge on what you mean here?

>   * defaulted, with no option to change
>  * allow modification (F)
>* review distribution map (F)
>* notification when a deployment is ready to go or whenever something 
> changes

Is this an (M) ?

> * DEPLOYMENT ACTION
>* Heat template generated on the fly
>   * hardcoded images
>  * allow image selection (F)

We'll be spinning images up as part of the deployment, I presume - so
this is really M, isn't it? or do you mean 'allow supplying images
rather than building just in time' ? Or --- I dunno, but lets get some
clarity here.

>   * pre-created template fragments for each node type
>   * nod

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Tzu-Mainn Chen
The relevant wiki page is here:

https://wiki.openstack.org/wiki/TripleO/Tuskar#Icehouse_Planning


- Original Message -
> That looks really good, thanks for putting that together!
> 
> I'm going to put together a wiki page that consolidates the various Tuskar
> planning documents - requirements, user stories, wireframes, etc - so it's
> easier to see the whole planning picture.
> 
> Mainn
> 
> - Original Message -
> > 
> > On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen  wrote:
> > 
> > > Hey all,
> > > 
> > > I've attempted to spin out the requirements behind Jarda's excellent
> > > wireframes
> > > (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> > > Hopefully this can add some perspective on both the wireframes and the
> > > needed changes to the tuskar-api.
> > 
> > This list is great, thanks very much for taking the time to write this up!
> > I
> > think a big part of the User Experience design is to take a step back and
> > understand the requirements from an end user's point of view…what would
> > they
> > want to accomplish by using this UI? This might influence the design in
> > certain ways, so I've taken a cut at a set of user stories for the Icehouse
> > timeframe based on these requirements that I hope will be useful during
> > discussions.
> > 
> > Based on the OpenStack Personas[1], I think that Anna would be the main
> > consumer of the TripleO UI, but please let me know if you think otherwise.
> > 
> > - As an infrastructure administrator, Anna needs to deploy or update a set
> > of
> > resources that will run OpenStack (This isn't a very specific use case, but
> > more of the larger end goal of Anna coming into the UI.)
> > - As an infrastructure administrator, Anna expects that the management node
> > for the deployment services is already up and running and the status of
> > this
> > node is shown in the UI.
> > - As an infrastructure administrator, Anna wants to be able to quickly see
> > the set of unallocated nodes that she could use for her deployment of
> > OpenStack. Ideally, she would not have to manually tell the system about
> > these nodes. If she needs to manually register nodes for whatever reason,
> > Anna would only want to have to define the essential data needed to
> > register
> > these nodes.
> > - As an infrastructure administrator, Anna needs to assign a role to each
> > of
> > the necessary nodes in her OpenStack deployment. The nodes could be either
> > controller, compute, networking, or storage resources depending on the
> > needs
> > of this deployment.
> > - As an infrastructure administrator, Anna wants to review the distribution
> > of the nodes that she has assigned before kicking off the "Deploy" task.
> > - As an infrastructure administrator, Anna wants to monitor the deployment
> > process of all of the nodes that she has assigned.
> > - As an infrastructure administrator, Anna needs to be able to troubleshoot
> > any errors that may occur during the deployment of nodes process.
> > - As an infrastructure administrator, Anna wants to monitor the
> > availability
> > and status of each node in her deployment.
> > - As an infrastructure administrator, Anna wants to be able to unallocate a
> > node from a deployment.
> > - As an infrastructure administrator, Anna wants to be able to view the
> > history of nodes that have been in a deployment.
> > - As an infrastructure administrator, Anna needs to be notified of any
> > important changes to nodes that are in the OpenStack deployment. She does
> > not want to be spammed with non-important notifications.
> > 
> > Please feel free to comment, change, or add to this list.
> > 
> > [1]https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1#
> > 
> > Thanks,
> > Liz
> > 
> > > 
> > > All comments are welcome!
> > > 
> > > Thanks,
> > > Tzu-Mainn Chen
> > > 
> > > 
> > > 
> > > *** Requirements are assumed to be targeted for Icehouse, unless marked
> > > otherwise:
> > >   (M) - Maybe Icehouse, dependency on other in-development features
> > >   (F) - Future requirement, after Icehouse
> > > 
> > > * NODES
> > >   * Creation
> > >  * Manual registration
> > > * hardware specs from Ironic based on mac address (M)
> > > * IP auto populated from Neutron (F)
> > >  * Auto-discovery during undercloud install process (M)
> > >   * Monitoring
> > >   * assignment, availability, status
> > >   * capacity, historical statistics (M)
> > >   * Management node (where triple-o is installed)
> > >   * created as part of undercloud install process
> > >   * can create additional management nodes (F)
> > >* Resource nodes
> > >* searchable by status, name, cpu, memory, and all attributes from
> > >ironic
> > >* can be allocated as one of four node types
> > >* compute
> > >* controller
> > >* object storage
> > > 

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Matt Wagner
Thanks, Liz! Seeing things this way is really helpful.

(I actually feel like wireframes -> requirements -> user stories is
exactly the opposite of how this normally goes, but hitting all of the
steps either way makes things much clearer.)

I've raised some questions below. I think many of them aren't aimed at
you per se, but are more general things that seeing the user stories has
helped me realize we could clarify.


On Fri Dec  6 15:31:36 2013, Liz Blanchard wrote:

> - As an infrastructure administrator, Anna expects that the
> management node for the deployment services is already up and running
> and the status of this node is shown in the UI.

The 'management node' here is the undercloud node that Anna is
interacting with, as I understand it. (Someone correct me if I'm wrong.)
So it's not a bad idea to show its status, but I guess the mere fact
that she's using it will indicate that it's operational.


> - As an infrastructure administrator, Anna wants to review the
> distribution of the nodes that she has assigned before kicking off
> the "Deploy" task.

What does she expect to see here on the review screen that she didn't
see on the previous screens, if anything? Is this just a summation, or
is she expecting to see things like which node will get which role? (I'd
argue for the former; I don't know that we can predict the latter.)


> - As an infrastructure administrator, Anna wants to monitor the
> deployment process of all of the nodes that she has assigned.

I think there's an implied "...through the UI" here, versus tailing log
files to watch state. Does she just expect to see states like "Pending",
"Deploying", or "Finished", versus, say, having the full logs shown in
the UI? (I'd vote 'yes'.)


> - As an infrastructure administrator, Anna needs to be able to
> troubleshoot any errors that may occur during the deployment of nodes
> process.

I'm not sure that the "...through the UI" implication I mentioned above
extends here. (IMHO) I assume that if things fail, Anna might be okay
with us showing a message that $foo failed on $bar, and she should try
looking in /var/log/$baz for full details. Does that seem fair? (At
least early on.)


> - As an infrastructure administrator, Anna wants to be able to view
> the history of nodes that have been in a deployment.

Why does she want to view history of past nodes?

Note that I'm not arguing against this; it's just not abundantly clear
to me what she'll be using this information for. Does she want a history
to check off an "Audit log" checkbox, or will she be looking to extract
certain data from this history?

Thanks again for creating these user stories, Liz!

-- 
Matt Wagner
Software Engineer, Red Hat



signature.asc
Description: OpenPGP digital signature
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Tzu-Mainn Chen
That looks really good, thanks for putting that together!

I'm going to put together a wiki page that consolidates the various Tuskar
planning documents - requirements, user stories, wireframes, etc - so it's
easier to see the whole planning picture.

Mainn

- Original Message -
> 
> On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen  wrote:
> 
> > Hey all,
> > 
> > I've attempted to spin out the requirements behind Jarda's excellent
> > wireframes
> > (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> > Hopefully this can add some perspective on both the wireframes and the
> > needed changes to the tuskar-api.
> 
> This list is great, thanks very much for taking the time to write this up! I
> think a big part of the User Experience design is to take a step back and
> understand the requirements from an end user's point of view…what would they
> want to accomplish by using this UI? This might influence the design in
> certain ways, so I've taken a cut at a set of user stories for the Icehouse
> timeframe based on these requirements that I hope will be useful during
> discussions.
> 
> Based on the OpenStack Personas[1], I think that Anna would be the main
> consumer of the TripleO UI, but please let me know if you think otherwise.
> 
> - As an infrastructure administrator, Anna needs to deploy or update a set of
> resources that will run OpenStack (This isn't a very specific use case, but
> more of the larger end goal of Anna coming into the UI.)
> - As an infrastructure administrator, Anna expects that the management node
> for the deployment services is already up and running and the status of this
> node is shown in the UI.
> - As an infrastructure administrator, Anna wants to be able to quickly see
> the set of unallocated nodes that she could use for her deployment of
> OpenStack. Ideally, she would not have to manually tell the system about
> these nodes. If she needs to manually register nodes for whatever reason,
> Anna would only want to have to define the essential data needed to register
> these nodes.
> - As an infrastructure administrator, Anna needs to assign a role to each of
> the necessary nodes in her OpenStack deployment. The nodes could be either
> controller, compute, networking, or storage resources depending on the needs
> of this deployment.
> - As an infrastructure administrator, Anna wants to review the distribution
> of the nodes that she has assigned before kicking off the "Deploy" task.
> - As an infrastructure administrator, Anna wants to monitor the deployment
> process of all of the nodes that she has assigned.
> - As an infrastructure administrator, Anna needs to be able to troubleshoot
> any errors that may occur during the deployment of nodes process.
> - As an infrastructure administrator, Anna wants to monitor the availability
> and status of each node in her deployment.
> - As an infrastructure administrator, Anna wants to be able to unallocate a
> node from a deployment.
> - As an infrastructure administrator, Anna wants to be able to view the
> history of nodes that have been in a deployment.
> - As an infrastructure administrator, Anna needs to be notified of any
> important changes to nodes that are in the OpenStack deployment. She does
> not want to be spammed with non-important notifications.
> 
> Please feel free to comment, change, or add to this list.
> 
> [1]https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1#
> 
> Thanks,
> Liz
> 
> > 
> > All comments are welcome!
> > 
> > Thanks,
> > Tzu-Mainn Chen
> > 
> > 
> > 
> > *** Requirements are assumed to be targeted for Icehouse, unless marked
> > otherwise:
> >   (M) - Maybe Icehouse, dependency on other in-development features
> >   (F) - Future requirement, after Icehouse
> > 
> > * NODES
> >   * Creation
> >  * Manual registration
> > * hardware specs from Ironic based on mac address (M)
> > * IP auto populated from Neutron (F)
> >  * Auto-discovery during undercloud install process (M)
> >   * Monitoring
> >   * assignment, availability, status
> >   * capacity, historical statistics (M)
> >   * Management node (where triple-o is installed)
> >   * created as part of undercloud install process
> >   * can create additional management nodes (F)
> >* Resource nodes
> >* searchable by status, name, cpu, memory, and all attributes from
> >ironic
> >* can be allocated as one of four node types
> >* compute
> >* controller
> >* object storage
> >* block storage
> >* Resource class - allows for further categorization of a node type
> >* each node type specifies a single default resource class
> >* allow multiple resource classes per node type (M)
> >* optional node profile for a resource class (M)
> >* acts as filter for nodes that can b

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Liz Blanchard

On Dec 5, 2013, at 9:31 PM, Tzu-Mainn Chen  wrote:

> Hey all,
> 
> I've attempted to spin out the requirements behind Jarda's excellent 
> wireframes 
> (http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
> Hopefully this can add some perspective on both the wireframes and the needed 
> changes to the tuskar-api.

This list is great, thanks very much for taking the time to write this up! I 
think a big part of the User Experience design is to take a step back and 
understand the requirements from an end user's point of view…what would they 
want to accomplish by using this UI? This might influence the design in certain 
ways, so I've taken a cut at a set of user stories for the Icehouse timeframe 
based on these requirements that I hope will be useful during discussions.

Based on the OpenStack Personas[1], I think that Anna would be the main 
consumer of the TripleO UI, but please let me know if you think otherwise.

- As an infrastructure administrator, Anna needs to deploy or update a set of 
resources that will run OpenStack (This isn't a very specific use case, but 
more of the larger end goal of Anna coming into the UI.)
- As an infrastructure administrator, Anna expects that the management node for 
the deployment services is already up and running and the status of this node 
is shown in the UI.
- As an infrastructure administrator, Anna wants to be able to quickly see the 
set of unallocated nodes that she could use for her deployment of OpenStack. 
Ideally, she would not have to manually tell the system about these nodes. If 
she needs to manually register nodes for whatever reason, Anna would only want 
to have to define the essential data needed to register these nodes.
- As an infrastructure administrator, Anna needs to assign a role to each of 
the necessary nodes in her OpenStack deployment. The nodes could be either 
controller, compute, networking, or storage resources depending on the needs of 
this deployment.
- As an infrastructure administrator, Anna wants to review the distribution of 
the nodes that she has assigned before kicking off the "Deploy" task.
- As an infrastructure administrator, Anna wants to monitor the deployment 
process of all of the nodes that she has assigned.
- As an infrastructure administrator, Anna needs to be able to troubleshoot any 
errors that may occur during the deployment of nodes process.
- As an infrastructure administrator, Anna wants to monitor the availability 
and status of each node in her deployment.
- As an infrastructure administrator, Anna wants to be able to unallocate a 
node from a deployment.
- As an infrastructure administrator, Anna wants to be able to view the history 
of nodes that have been in a deployment.
- As an infrastructure administrator, Anna needs to be notified of any 
important changes to nodes that are in the OpenStack deployment. She does not 
want to be spammed with non-important notifications.

Please feel free to comment, change, or add to this list.

[1]https://docs.google.com/document/d/16rkiXWxxgzGT47_Wc6hzIPzO2-s2JWAPEKD0gP2mt7E/edit?pli=1#

Thanks,
Liz

> 
> All comments are welcome!
> 
> Thanks,
> Tzu-Mainn Chen
> 
> 
> 
> *** Requirements are assumed to be targeted for Icehouse, unless marked 
> otherwise:
>   (M) - Maybe Icehouse, dependency on other in-development features
>   (F) - Future requirement, after Icehouse
> 
> * NODES
>   * Creation
>  * Manual registration
> * hardware specs from Ironic based on mac address (M)
> * IP auto populated from Neutron (F)
>  * Auto-discovery during undercloud install process (M)
>   * Monitoring
>   * assignment, availability, status
>   * capacity, historical statistics (M)
>   * Management node (where triple-o is installed)
>   * created as part of undercloud install process
>   * can create additional management nodes (F)
>* Resource nodes
>* searchable by status, name, cpu, memory, and all attributes from 
> ironic
>* can be allocated as one of four node types
>* compute
>* controller
>* object storage
>* block storage
>* Resource class - allows for further categorization of a node type
>* each node type specifies a single default resource class
>* allow multiple resource classes per node type (M)
>* optional node profile for a resource class (M)
>* acts as filter for nodes that can be allocated to that class 
> (M)
>* nodes can be viewed by node types
>* additional group by status, hardware specification
>* controller node type
>   * each controller node will run all openstack services
>  * allow each node to run specified service (F)
>   * breakdown by workload (percentage of cpu used per node) (M)
>* Unallocated nodes
>* Archived nodes (F)
>* Will be separate o

Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Tzu-Mainn Chen
Thanks for the comments!  Responses inline:

> Disclaimer: I'm very new to the project, so apologies if some of my
> questions have been already answered or flat out don't make sense.
> 
> As I proofread, some of my comments may drift a bit past basic
> requirements, so feel free to tell me to take certain questions out of
> this thread into specific discussion threads if I'm getting too detailed.
> 
> > 
> >
> > *** Requirements are assumed to be targeted for Icehouse, unless marked
> > otherwise:
> > (M) - Maybe Icehouse, dependency on other in-development features
> > (F) - Future requirement, after Icehouse
> >
> > * NODES
> > * Creation
> >* Manual registration
> >   * hardware specs from Ironic based on mac address (M)
> >   * IP auto populated from Neutron (F)
> >* Auto-discovery during undercloud install process (M)
> > * Monitoring
> > * assignment, availability, status
> > * capacity, historical statistics (M)
> > * Management node (where triple-o is installed)
> > * created as part of undercloud install process
> > * can create additional management nodes (F)
> >  * Resource nodes
> >  * searchable by status, name, cpu, memory, and all attributes from
> >  ironic
> >  * can be allocated as one of four node types
> 
> It's pretty clear by the current verbiage but I'm going to ask anyway:
> "one and only one"?

Yep, that's right!

> >  * compute
> >  * controller
> >  * object storage
> >  * block storage
> >  * Resource class - allows for further categorization of a node
> >  type
> >  * each node type specifies a single default resource class
> >  * allow multiple resource classes per node type (M)
> 
> My gut reaction is that we want to bite this off sooner rather than
> later. This will have data model and API implications that, even if we
> don't commit to it for Icehouse, should still be in our minds during it,
> so it might make sense to make it a first class thing to just nail down now.

That is entirely correct, which is one reason it's on the list of requirements. 
 The
forthcoming API design will have to account for it.  Not recreating the entire 
data
model between releases is a key goal :)


> >  * optional node profile for a resource class (M)
> >  * acts as filter for nodes that can be allocated to that
> >  class (M)
> 
> To my understanding, once this is in Icehouse, we'll have to support
> upgrades. If this filtering is pushed off, could we get into a situation
> where an allocation created in Icehouse would no longer be valid in
> Icehouse+1 once these filters are in place? If so, we might want to make
> it more of a priority to get them in place earlier and not eat the
> headache of addressing these sorts of integrity issues later.

That's true.  The problem is that to my understanding, the filters we'd
need in nova-scheduler are not yet fully in place.

I also think that this is an issue that we'll need to address no matter what.
Even once filters exist, if a user applies a filter *after* nodes are allocated,
we'll need to do something clever if the already-allocated nodes don't meet the
filter criteria.

> >  * nodes can be viewed by node types
> >  * additional group by status, hardware specification
> >  * controller node type
> > * each controller node will run all openstack services
> >* allow each node to run specified service (F)
> > * breakdown by workload (percentage of cpu used per node) (M)
> >  * Unallocated nodes
> 
> Is there more still being flushed out here? Things like:
>   * Listing unallocated nodes
>   * Unallocating a previously allocated node (does this make it a
> vanilla resource or does it retain the resource type? is this the only
> way to change a node's resource type?)
>   * Unregistering nodes from Tuskar's inventory (I put this under
> unallocated under the assumption that the workflow will be an explicit
> unallocate before unregister; I'm not sure if this is the same as
> "archive" below).

Ah, you're entirely right.  I'll add these to the list.

> >  * Archived nodes (F)
> 
> Can you elaborate a bit more on what this is?

To be honest, I'm a bit fuzzy about this myself; Jarda mentioned that there was
an OpenStack service in the process of being planned that would handle this
requirement.  Jarda, can you detail a bit?

Thanks again for the comments!


Mainn

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-06 Thread Jay Dobies
Disclaimer: I'm very new to the project, so apologies if some of my 
questions have been already answered or flat out don't make sense.


As I proofread, some of my comments may drift a bit past basic 
requirements, so feel free to tell me to take certain questions out of 
this thread into specific discussion threads if I'm getting too detailed.





*** Requirements are assumed to be targeted for Icehouse, unless marked 
otherwise:
(M) - Maybe Icehouse, dependency on other in-development features
(F) - Future requirement, after Icehouse

* NODES
* Creation
   * Manual registration
  * hardware specs from Ironic based on mac address (M)
  * IP auto populated from Neutron (F)
   * Auto-discovery during undercloud install process (M)
* Monitoring
* assignment, availability, status
* capacity, historical statistics (M)
* Management node (where triple-o is installed)
* created as part of undercloud install process
* can create additional management nodes (F)
 * Resource nodes
 * searchable by status, name, cpu, memory, and all attributes from 
ironic
 * can be allocated as one of four node types


It's pretty clear by the current verbiage but I'm going to ask anyway: 
"one and only one"?



 * compute
 * controller
 * object storage
 * block storage
 * Resource class - allows for further categorization of a node type
 * each node type specifies a single default resource class
 * allow multiple resource classes per node type (M)


My gut reaction is that we want to bite this off sooner rather than 
later. This will have data model and API implications that, even if we 
don't commit to it for Icehouse, should still be in our minds during it, 
so it might make sense to make it a first class thing to just nail down now.



 * optional node profile for a resource class (M)
 * acts as filter for nodes that can be allocated to that class 
(M)


To my understanding, once this is in Icehouse, we'll have to support 
upgrades. If this filtering is pushed off, could we get into a situation 
where an allocation created in Icehouse would no longer be valid in 
Icehouse+1 once these filters are in place? If so, we might want to make 
it more of a priority to get them in place earlier and not eat the 
headache of addressing these sorts of integrity issues later.



 * nodes can be viewed by node types
 * additional group by status, hardware specification
 * controller node type
* each controller node will run all openstack services
   * allow each node to run specified service (F)
* breakdown by workload (percentage of cpu used per node) (M)
 * Unallocated nodes


Is there more still being flushed out here? Things like:
 * Listing unallocated nodes
 * Unallocating a previously allocated node (does this make it a 
vanilla resource or does it retain the resource type? is this the only 
way to change a node's resource type?)
 * Unregistering nodes from Tuskar's inventory (I put this under 
unallocated under the assumption that the workflow will be an explicit 
unallocate before unregister; I'm not sure if this is the same as 
"archive" below).



 * Archived nodes (F)


Can you elaborate a bit more on what this is?


 * Will be separate openstack service (F)

* DEPLOYMENT
* multiple deployments allowed (F)
  * initially just one
* deployment specifies a node distribution across node types
   * node distribution can be updated after creation
* deployment configuration, used for initial creation only
   * defaulted, with no option to change
  * allow modification (F)
* review distribution map (F)
* notification when a deployment is ready to go or whenever something 
changes

* DEPLOYMENT ACTION
* Heat template generated on the fly
   * hardcoded images
  * allow image selection (F)
   * pre-created template fragments for each node type
   * node type distribution affects generated template
* nova scheduler allocates nodes
   * filters based on resource class and node profile information (M)
* Deployment action can create or update
* status indicator to determine overall state of deployment
   * status indicator for nodes as well
   * status includes 'time left' (F)

* NETWORKS (F)
* IMAGES (F)
* LOGS (F)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][Tuskar] Icehouse Requirements

2013-12-05 Thread Tzu-Mainn Chen
Hey all,

I've attempted to spin out the requirements behind Jarda's excellent wireframes 
(http://lists.openstack.org/pipermail/openstack-dev/2013-December/020944.html).
Hopefully this can add some perspective on both the wireframes and the needed 
changes to the tuskar-api.

All comments are welcome!

Thanks,
Tzu-Mainn Chen



*** Requirements are assumed to be targeted for Icehouse, unless marked 
otherwise:
   (M) - Maybe Icehouse, dependency on other in-development features
   (F) - Future requirement, after Icehouse

* NODES
   * Creation
  * Manual registration
 * hardware specs from Ironic based on mac address (M)
 * IP auto populated from Neutron (F)
  * Auto-discovery during undercloud install process (M)
   * Monitoring
   * assignment, availability, status
   * capacity, historical statistics (M)
   * Management node (where triple-o is installed)
   * created as part of undercloud install process
   * can create additional management nodes (F)
* Resource nodes
* searchable by status, name, cpu, memory, and all attributes from 
ironic
* can be allocated as one of four node types
* compute
* controller
* object storage
* block storage
* Resource class - allows for further categorization of a node type
* each node type specifies a single default resource class
* allow multiple resource classes per node type (M)
* optional node profile for a resource class (M)
* acts as filter for nodes that can be allocated to that class 
(M)
* nodes can be viewed by node types
* additional group by status, hardware specification
* controller node type
   * each controller node will run all openstack services
  * allow each node to run specified service (F)
   * breakdown by workload (percentage of cpu used per node) (M)
* Unallocated nodes
* Archived nodes (F)
* Will be separate openstack service (F)

* DEPLOYMENT
   * multiple deployments allowed (F)
 * initially just one
   * deployment specifies a node distribution across node types
  * node distribution can be updated after creation
   * deployment configuration, used for initial creation only
  * defaulted, with no option to change
 * allow modification (F)
   * review distribution map (F)
   * notification when a deployment is ready to go or whenever something changes

* DEPLOYMENT ACTION
   * Heat template generated on the fly
  * hardcoded images
 * allow image selection (F)
  * pre-created template fragments for each node type
  * node type distribution affects generated template
   * nova scheduler allocates nodes
  * filters based on resource class and node profile information (M)
   * Deployment action can create or update
   * status indicator to determine overall state of deployment
  * status indicator for nodes as well
  * status includes 'time left' (F)

* NETWORKS (F)
* IMAGES (F)
* LOGS (F)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev