Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-28 Thread Oleg Gelbukh
Vladimir,

Sorry for the delayed response. Let's continue inline.

On Thu, Oct 22, 2015 at 2:16 PM, Vladimir Kuklin 
wrote:

>
>
>>
>>  Each task can use some code to transform this output to the
>> representation that is actually needed for this particular task. Whenever a
>> task transforms this data it can access the API and do version negotiation,
>> for example. Each time this transformation is performed, the task can return
>> the data to some storage that will save it for the sake of control and
>> troubleshooting - for example, the user can always see which changes
>> are going to be applied and decide what to do next.
>>
>> Also, this means that the process of data calculation itself is very
>> 'lazy' or 'delayed', i.e. the data itself is calculated right at the
>> beginning of the deployment transaction, so that it is not locked to some
>> particular details of deployment engine data processing and not prone to
>> issues like 'oh, I cannot get the VIP because it has not been allocated yet
>> by Nailgun' / 'oh, I cannot set it because it has already been set by
>> Nailgun and there is no way to alter it'.
>>
>
> >> To me, the two paragraphs above are contradictory. If the data
> >> calculations are lazy, I don't really see how one can introspect into
> >> changes that will be applied by a component at any given run. You just
> >> don't have this information, and you need to calculate it anyway to see
> >> which settings will be passed to a component. I might have gotten your
> >> point wrong here. Please correct me if this is the case.
>
> Oleg, I actually meant that we do it in the following stages:
>
> 1) Change stuff in any amount of business logic engines you want,
> configuration databases, wikipedia, 4chan, etc.
>

Every business logic provider is a 'component' with its own 'view' in the
proposed notation. A component has full control over the 'authoritative'
settings in its view. In the case of components that serve as sources of
configuration, it is likely that most, if not all, of the settings in their
views will be 'authoritative'.
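
To make the notation concrete, here is a minimal sketch of what a rendered
view could contain; the component name, settings and link path are all
hypothetical, not a committed format:

```python
# Hypothetical rendered 'view' of a networking component: 'authoritative'
# values are owned and modified by the component itself, 'external' values
# are links into some other component's view.
network_view = {
    "component": "network-manager",
    "authoritative": {
        "public_vip": "172.16.0.2",
        "ip_ranges": [["172.16.0.10", "172.16.0.254"]],
    },
    "external": {
        # resolved by following the link into the owning component's view
        "node_macs": {"link": "hardware-inventory/nodes/macs"},
    },
}
```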


> 2) Schedule a transaction of deployment
> 3) Make 'transformers/serializers' for each of the tasks collect all the
> data and store it before execution is started
>

The idea of the configuration provisioning system is to strictly define all
sources of changes to settings and automate recalculation. Recalculation
should happen every time any setting is changed, and must 'touch' all
connected components. Otherwise we'll be stuck with situations similar to
the one described in this bug [1].

[1] https://bugs.launchpad.net/fuel/+bug/1450100
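
A minimal sketch of that recalculation rule, with an entirely made-up store
API (nothing here is existing Fuel code):

```python
from collections import defaultdict

# Sketch of the recalculation rule: changing an authoritative setting
# 'touches' every view that links to it.
class SettingsStore:
    def __init__(self):
        self.views = defaultdict(dict)   # view name -> {setting: value}
        self.links = defaultdict(list)   # (view, setting) -> linked copies

    def link(self, src_view, src_setting, dst_view, dst_setting):
        self.links[(src_view, src_setting)].append((dst_view, dst_setting))

    def set_authoritative(self, view, setting, value):
        self.views[view][setting] = value
        # propagate to all connected components so nothing goes stale
        for dst_view, dst_setting in self.links[(view, setting)]:
            self.views[dst_view][dst_setting] = value
```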

> 4) Allow user to compare differences and decide whether he actually wants
> to apply this change
> 5) Commit the deployment - run particular tasks with particular set of
> settings which are staged and frozen (otherwise it will be impossible to
>  debug this stuff)
>

Versioning and persistent storage of views should be useful to solve this.


> 6) If there is a lack of data for some task, e.g. you need some entities to
> be created during the deployment so that another task will use their output
> or side-effects to calculate things - this task should not be executed
> within this transaction. This means that the whole deployment should be
> split into 2 transactions. I can mention an old story here - when we
> were running puppet we needed to create some stuff for neutron knowing the
> ID of the network that had been created by another resource 5 seconds
> earlier. But we could not do this because puppet 'freezes' the input
> provided with "facts" before this transaction runs. This is exactly the
> same use case.
>

Ideally, every task should be a separate component with its own view, and
should update its authoritative values upon execution.
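
A sketch of that pattern; 'store' is the hypothetical SettingsStore sketched
above, and the task class is invented purely for illustration:

```python
# Sketch: a task is itself a component with its own view; after execution
# it writes its outputs back as authoritative values.
class CreateNetworkTask:
    view_name = "neutron-net-task"

    def execute(self, inputs):
        net_id = "net-42"                 # stand-in for the real side effect
        return {"network_id": net_id}

def run_task(task, store):
    inputs = dict(store.views[task.view_name])    # frozen copy of the view
    for setting, value in task.execute(inputs).items():
        store.set_authoritative(task.view_name, setting, value)
```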


>
> So these 6 items actually mean:
>
> 1) Clear separation between layers of the system and their functional
> boundaries
> 2) Minimum of cross-dependencies between component data - e.g. deployment
> tasks should not ever produce anything that is then stored in the storage.
>

This contradicts the requirement to control and track the state and inputs
of components, doesn't it? If a deployment task does produce some parameters
that are used by another component, they should be stored and versioned just
like any other change to the state of the configuration.


> Instead, you should have an API that provides you with data which is the
> result of deployment run. E.g. if you need to create a user in LDAP and you
> need this user's ID for some reason, your deployment task should create
> this user and, instead of returning this output to the storage, you just
> run another transaction and the task that requires this ID fetches it from
> LDAP.
>
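
For illustration, a sketch of that two-transaction pattern with a fake LDAP
client (all names are hypothetical stand-ins):

```python
# Sketch of the two-transaction split: the first transaction creates the
# user as a side effect and stores nothing; the task that needs the ID
# runs later and fetches it from LDAP, the source of truth.
class FakeLdap:
    def __init__(self):
        self._users = {}

    def create_user(self, name):
        self._users[name] = 1000 + len(self._users)

    def get_user_id(self, name):
        return self._users[name]

def transaction_one(ldap):
    ldap.create_user("glance")            # output is NOT written to storage

def transaction_two(ldap):
    user_id = ldap.get_user_id("glance")  # fetched fresh from LDAP
    print("configuring quota for", user_id)

ldap = FakeLdap()
transaction_one(ldap)
transaction_two(ldap)
```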

It seems to me that here you're mixing the responsibilities of the
orchestrator and the settings store. The orchestrator tells a component when
to start, and it could poll the settings store to determine whether a
certain task has to be started. Alternatively, the component itself might be
notified by the settings store about changes to the settings it depends on.

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-22 Thread Vladimir Kuklin
>
>  Each task can use some code to transform this output to the
> representation that is actually needed for this particular task. Whenever a
> task transforms this data it can access the API and do version negotiation,
> for example. Each time this transformation is performed, the task can return
> the data to some storage that will save it for the sake of control and
> troubleshooting - for example, the user can always see which changes
> are going to be applied and decide what to do next.
>
> Also, this means that the process of data calculation itself is very
> 'lazy' or 'delayed', i.e. the data itself is calculated right at the
> beginning of the deployment transaction, so that it is not locked to some
> particular details of deployment engine data processing and not prone to
> issues like 'oh, I cannot get the VIP because it has not been allocated yet
> by Nailgun' / 'oh, I cannot set it because it has already been set by
> Nailgun and there is no way to alter it'.
>

>> To me, the two paragraphs above are contradictory. If the data
>> calculations are lazy, I don't really see how one can introspect into
>> changes that will be applied by a component at any given run. You just
>> don't have this information, and you need to calculate it anyway to see
>> which settings will be passed to a component. I might have gotten your
>> point wrong here. Please correct me if this is the case.

Oleg, I actually meant that we do it in the following stages:

1) Change stuff in any amount of business logic engines you want,
configuration databases, wikipedia, 4chan, etc.
2) Schedule a transaction of deployment
3) Make 'transformers/serializers' for each of the tasks collect all the
data and store it before execution is started (see the sketch after this
list)
4) Allow user to compare differences and decide whether he actually wants
to apply this change
5) Commit the deployment - run particular tasks with particular set of
settings which are staged and frozen (otherwise it will be impossible to
 debug this stuff)
6) If there is a lack of data for some task, e.g. you need some entities to
be created during the deployment so that another task will use their output
or side-effects to calculate things - this task should not be executed
within this transaction. This means that the whole deployment should be
split into 2 transactions. I can mention an old story here - when we
were running puppet we needed to create some stuff for neutron knowing the
ID of the network that had been created by another resource 5 seconds
earlier. But we could not do this because puppet 'freezes' the input
provided with "facts" before this transaction runs. This is exactly the same
use case.

So these 6 items actually mean:

1) Clear separation between layers of the system and their functional
boundaries
2) Minimum of cross-dependencies between component data - e.g. deployment
tasks should not ever produce anything that is then stored in the storage.
Instead, you should have an API that provides you with data which is the
result of deployment run. E.g. if you need to create a user in LDAP and you
need this user's ID for some reason, your deployment task should create
this user and, instead of returning this output to the storage, you just
run another transaction and the task that requires this ID fetches it from
LDAP.


On Thu, Oct 22, 2015 at 1:25 PM, Dmitriy Shulyak 
wrote:

>
> Hi Oleg,
>
> I want to mention that we are using a similar approach for the deployment
> engine; the difference is that we are working not with components, but with
> deployment objects (these could be resources or tasks).
> Right now all the data should be provided by the user, but we are going to
> add the concept of a managed resource, so that a resource will be able to
> request data from a 3rd-party service before execution, or by notification,
> if it is supported.
> I think this is similar to what Vladimir describes.
>
> As for the components - I see how they can be useful; for example, the
> provisioning service will require data from the networking service, but I
> think Nailgun can act as a router for such cases.
> This way we will keep components simple and purely functional, and Nailgun
> will perform the role of a client which knows how to build interaction
> between components.
>
> So, as a summary, I think these are 2 different problems.


-- 
Yours Faithfully,
Vladimir Kuklin,
Fuel Library Tech Lead,
Mirantis, Inc.
+7 (495) 640-49-04
+7 (926) 702-39-68
Skype kuklinvv
35bk3, Vorontsovskaya Str.
Moscow, Russia,
www.mirantis.com 
www.mirantis.ru
vkuk...@mirantis.com

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-22 Thread Oleg Gelbukh
Hello,

We discussed this proposal in our team and came up with the following
vision of a configuration provisioning system:

- Installer is a system of multiple components: hardware inventory,
user interfaces, provisioning modules, deployment modules, checker modules,
volume manager, network manager, plugins, etc.
- Every component has its own data representation (we call them 'views'
as they provide an introspection into the configuration of the system),
which should include all the settings data the component should have access
to in order to perform its functions.
- Every component has 2 types of data in its view/representation:
authoritative data (which the component can modify) and external data
(which is essentially links to elements of another component's
view/representation).
- There is no 'universal' or 'general' representation of data which
serves as a source of truth for all other views: every component is a
source of truth for its authoritative data.
- Views are defined as templates in some declarative language (YAML,
JSON, XML, %whatever%), think of jsonschema here (a sketch follows this
list). Authoritative settings of the component have only a type; external
settings must also contain a link to the external view (it might be just a
piece of code with properly referenced elements of the external view as
parameters).
- The view template shall be rendered into the data store during
'registration' of the component in the system, i.e. a data structure shall
be created to represent the format of the data with the necessary links.
- Views can be saved to the data store and modified by the component
that 'owns' the view's template, or via the system's API. Changes to
authoritative settings in a view shall be propagated to all views that
contain external links to those settings.
- Both the view template and the views defined by it have versions. The
template version is defined by the version of its owner component. The view
version increases with every change made to it and can be used by the
orchestrator and the component to determine whether an async update of the
view was made via external links.
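
As promised above, a rough sketch of what such a view template could look
like, rendered here as a jsonschema-flavoured Python dict; the component
names, link syntax and field names are illustrative assumptions, not a
committed format:

```python
# Hypothetical view template: authoritative settings carry only a type,
# external settings carry a link into the owning component's view.
network_view_template = {
    "component": "network-manager",
    "version": "1.0.0",   # follows the owner component's version
    "settings": {
        "public_vip": {"mode": "authoritative", "type": "string"},
        "ip_ranges":  {"mode": "authoritative", "type": "array"},
        "node_macs": {
            "mode": "external",
            "link": "hardware-inventory/settings/node_macs",
        },
    },
}
```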

We will continue to flesh it out as a specification in Fuel specs
repository. I will greatly appreciate any feedback on this vision,
including comments, objections, concerns and questions.

--
Best regards,
Oleg Gelbukh

On Tue, Oct 20, 2015 at 2:13 PM, Vladimir Kuklin 
wrote:

> Folks
>
> Can we please stop using etherpad and move to something more usable, such
> as Google Docs? Etherpad seems too unusable for such a discussion,
> especially with this coloured formatting.
>
> Mike
>
> I currently see no need to follow the marketing trend for noSQL here - we
> need to store a set of structured data. This store should be one that
> can be easily consumed directly or with some API wrapper. That is all. We
> will need to carefully evaluate each storage engine and decide which to
> pick. I personally insist on an engine that provides 100% consistency,
> which is in fact the opposite of what most noSQL and distributed
> architectures provide. Nobody cares if you lose 1 billion messages in a
> social network (not even the messages' authors) - it is almost all
> garbage with porn and cat pictures. Things will get worse if you destroy
> something in production serving accounting in your cloud due to the fact
> that nodes are
>
> I agree with option #2 - we actually should have a task abstraction layer
> with drivers for execution, but I would go with baby steps for supporting
> other deployment tools - currently I do not see any benefit in using
> Ansible for tasks that Fuel is solving. The same is almost true for
> containers, but that is a different story.
>
> Eugene, Mike
>
> I agree with you that we need to think about where to execute these
> serializers. I think that we could do it the following way - a serializer
> can be executed wherever it can actually work, and it should possibly put
> data into centralized storage for the purposes of logging, control and
> accounting. I am not sure that this is a limitation all users will agree
> with, but we need to think about it.
>
> Regarding this 'last task throwing an exception' issue - we can handle
> this properly by simply rerunning a task that failed only due to a
> serialization problem. Or even better - reorder its execution for later
> steps and try it again in a while if there are other tasks to be executed.
>
> But Mike's approach of data preparation prior to deployment/workflow
> transaction execution seems more viable. I think we should follow this
> rule: "If you do not know the data before the transaction run, this data
> should be calculated after this transaction ends and used for another
> workflow in a different transaction".
>
>
> On Tue, Oct 20, 2015 at 1:20 PM, Evgeniy L  wrote:
>
>> Hi,
>>
>> I have a comment regarding when/where to run translators.
>> I think data processing (fetching + validation + translation) should be
>> done
>> 

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-22 Thread Vladimir Kuklin
Oleg

Thank you for your feedback. IMO, the schema you are providing is very
complex and would surely benefit from some examples.

If I understand your proposal correctly, you are trying to do the things
that we actually want to get rid of - tight coupling and schema control of
data that is being used by components. There should be no cross-references
between components that do the actual deployment. Instead, there should be a
clear separation between the layers of our deployment system.

All the data that is provided to deployment (or
provisioning/power_management/etc.) tasks should be accessible through API
of the top-level components such as
Network/Partitioning/IPAddressAllocation/ Manager or any other type of
external configuration database such as ENC of external puppet
master/LDAP/.

Each task can use some code to transform this output to the representation
that is actually needed for this particular task. Whenever a task
transforms this data it can access the API and do version negotiation, for
example. Each time this transformation is performed, the task can return
the data to some storage that will save it for the sake of control and
troubleshooting - for example, the user can always see which changes
are going to be applied and decide what to do next.

Also, this means that the process of data calculation itself is very 'lazy'
or 'delayed', i.e. the data itself is calculated right at the beginning of
the deployment transaction, so that it is not locked to some particular
details of deployment engine data processing and not prone to issues like
'oh, I cannot get the VIP because it has not been allocated yet by Nailgun'
/ 'oh, I cannot set it because it has already been set by Nailgun and there
is no way to alter it'.

On Thu, Oct 22, 2015 at 12:16 PM, Oleg Gelbukh 
wrote:

> Hello,
>
> We discussed this proposal in our team and came up with the following
> vision of a configuration provisioning system:
>
> - Installer is a system of multiple components: hardware inventory,
> user interfaces, provisioning modules, deployment modules, checker modules,
> volume manager, network manager, plugins, etc.
> - Every component has its own data representation (we call them
> 'views' as they provide an introspection into the configuration of the
> system), which should include all the settings data the component should
> have access to in order to perform its functions.
> - Every component has 2 types of data in its view/representation:
> authoritative data (which the component can modify) and external data
> (which is essentially links to elements of another component's
> view/representation).
> - There is no 'universal' or 'general' representation of data which
> serves as a source of truth for all other views: every component is a
> source of truth for its authoritative data.
> - Views are defined as templates in some declarative language (YAML,
> JSON, XML, %whatever%), think of jsonschema here. Authoritative settings of
> the component have only a type; external settings must also contain a link
> to the external view (it might be just a piece of code with properly
> referenced elements of the external view as parameters).
> - The view template shall be rendered into the data store during
> 'registration' of the component in the system, i.e. a data structure shall
> be created to represent the format of the data with the necessary links.
> - Views can be saved to the data store and modified by the component
> that 'owns' the view's template, or via the system's API. Changes to
> authoritative settings in a view shall be propagated to all views that
> contain external links to those settings.
> - Both the view template and the views defined by it have versions. The
> template version is defined by the version of its owner component. The view
> version increases with every change made to it and can be used by the
> orchestrator and the component to determine whether an async update of the
> view was made via external links.
>
> We will continue to flesh it out as a specification in Fuel specs
> repository. I will greatly appreciate any feedback on this vision,
> including comments, objections, concerns and questions.
>
> --
> Best regards,
> Oleg Gelbukh
>
> On Tue, Oct 20, 2015 at 2:13 PM, Vladimir Kuklin 
> wrote:
>
>> Folks
>>
>> Can we please stop using etherpad and move to something more usable, such
>> as Google Docs? Etherpad seems too unusable for such a discussion,
>> especially with this coloured formatting.
>>
>> Mike
>>
>> I currently see no need to follow the marketing trend for noSQL here - we
>> need to store a set of structured data. This store should be one that
>> can be easily consumed directly or with some API wrapper. That is all. We
>> will need to carefully evaluate each storage engine and decide which to
>> pick. I personally insist on an engine that provides 100% consistency,
>> which is in fact the opposite of what most noSQL and distributed
>> architectures provide. Nobody cares if you lose 1 

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-22 Thread Dmitriy Shulyak
Hi Oleg,

I want to mention that we are using a similar approach for the deployment
engine; the difference is that we are working not with components, but with
deployment objects (these could be resources or tasks).
Right now all the data should be provided by the user, but we are going to
add the concept of a managed resource, so that a resource will be able to
request data from a 3rd-party service before execution, or by notification,
if it is supported.
I think this is similar to what Vladimir describes.
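
A minimal sketch of that 'managed resource' idea, under the assumption that
the fetcher is just a callable hitting the 3rd-party service (the interface
is invented for illustration):

```python
# Sketch: a managed resource pulls its missing inputs from an external
# service right before execution, instead of requiring the user to
# provide all data up front.
class ManagedResource:
    def __init__(self, name, fetcher):
        self.name = name
        self.fetcher = fetcher          # callable querying the 3rd-party service

    def execute(self, user_data):
        data = dict(user_data)
        data.update(self.fetcher())     # request data just before execution
        print("applying %s with %s" % (self.name, data))

resource = ManagedResource("keystone", lambda: {"db_host": "10.0.0.5"})
resource.execute({"admin_user": "admin"})
```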

As for the components - I see how they can be useful; for example, the
provisioning service will require data from the networking service, but I
think Nailgun can act as a router for such cases.
This way we will keep components simple and purely functional, and Nailgun
will perform the role of a client which knows how to build interaction
between components.

So, as a summary, I think these are 2 different problems.


Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-22 Thread Oleg Gelbukh
Hi Vladimir,

Thanks for the prompt reply. Please see my comments inline.

On Thu, Oct 22, 2015 at 12:44 PM, Vladimir Kuklin 
wrote:

> Oleg
>
> Thank you for your feedback. IMO, the schema you are providing is very
> complex and would surely benefit from some examples.
>

I'm going to submit a spec for review that will incorporate examples and
diagrams for sure. I expect to come up with it in a couple of days, most
likely by Monday.

>
> If I understand your proposal correctly, you are trying to do the things
> that we actually want to get rid of - tight coupling and schema control of
> data that is being used by components.
>

Your understanding is mostly correct. However, the important thing here is
that we propose an API that will allow adjusting the schema of a particular
view at any time, registering a new schema (for a new/added component), etc.
- (almost) without writing Python code.


> There should be no cross-references between components that do the actual
> deployment. Instead, there should be a clear separation between the layers
> of our deployment system.
>

Such a separation will not be enforced by the system we propose. However,
if we indeed have some 'hierarchy' of components, it will be naturally
reflected in the way the links are specified in templates. For example, if
our primary source of configuration settings is the UI/API, then it will be
authoritative for configurable parameters, like backend selection, IP
address ranges, etc. However, settings that are discovered from actual
nodes shall be provided by the corresponding components, like
'nailgun-agent'.

Deployment modules most likely won't be authoritative for any settings, as
far as I can tell at the moment. They could, however, provide feedback-like
parameters, for instance, those that can be calculated only at runtime.


>
> All the data that is provided to deployment (or
> provisioning/power_management/etc.) tasks should be accessible through API
> of the top-level components such as
> Network/Partitioning/IPAddressAllocation/ Manager or any other type of
> external configuration database such as ENC of external puppet
> master/LDAP/.
>

This very proposal is about creating such an API (whether service-like or
library-like) that other components and even end users can leverage to
access and manage configuration parameters. We should probably start with a
library API, and decide whether we need a service of this kind later.
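
For the library-API option, a sketch of what the surface could look like;
every name and signature here is an assumption, not a committed interface:

```python
# Hypothetical library API for the configuration store: enough surface
# for components to register templates, read views and push changes.
class ConfigLibrary:
    def register_template(self, component, template):
        """Render the component's view template into the data store."""

    def get_view(self, component, version=None):
        """Return the (optionally version-pinned) rendered view."""

    def set_authoritative(self, component, setting, value):
        """Change an owned setting; propagation to linked views follows."""

    def subscribe(self, component, setting, callback):
        """Notify a component when one of its external links changes."""
```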


>
>  Each task can use some code to transform this output to the
> representation that is actually needed for this particular task. Whenever a
> task transforms this data it can access the API and do version negotiation,
> for example. Each time this transformation is performed, the task can
> return the data to some storage that will save it for the sake of control
> and troubleshooting - for example, the user can always see which changes
> are going to be applied and decide what to do next.
>
> Also, this means that the process of data calculation itself is very
> 'lazy' or 'delayed', i.e. the data itself is calculated right at the
> beginning of the deployment transaction, so that it is not locked to some
> particular details of deployment engine data processing and not prone to
> issues like 'oh, I cannot get the VIP because it has not been allocated yet
> by Nailgun' / 'oh, I cannot set it because it has already been set by
> Nailgun and there is no way to alter it'.
>

To me, the two paragraphs above are contradictory. If the data calculations
are lazy, I don't really see how one can introspect into changes that will
be applied by a component at any given run. You just don't have this
information, and you need to calculate it anyway to see which settings
will be passed to a component. I might have gotten your point wrong here.
Please correct me if this is the case.

Thanks again, looking forward to hearing from you.

--
Best regards,
Oleg Gelbukh


>
> On Thu, Oct 22, 2015 at 12:16 PM, Oleg Gelbukh 
> wrote:
>
>> Hello,
>>
>> We discussed this proposal in our team and came up with the following
>> vision of a configuration provisioning system:
>>
>> - Installer is a system of multiple components: hardware inventory,
>> user interfaces, provisioning modules, deployment modules, checker modules,
>> volume manager, network manager, plugins, etc.
>> - Every component has its own data representation (we call them
>> 'views' as they provide an introspection into the configuration of the
>> system), which should include all the settings data the component should
>> have access to in order to perform its functions.
>> - Every component has 2 types of data in its view/representation:
>> authoritative data (which the component can modify) and external data
>> (which is essentially links to elements of another component's
>> view/representation).
>> - There is no 'universal' or 'general' representation of data which
>> serves as a source of truth for all other views: every component is a
>> source of truth for its 

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-20 Thread Evgeniy L
Hi,

I have a comment regarding when/where to run translators.
I think data processing (fetching + validation + translation) should be
done as a separate stage; this way, when we start a deployment, we are sure
that we have everything we need to perform it. But I understand that there
might be exceptions and sometimes we will have to get the data on the fly.
If we are going to put translated data into some distributed store, I'm not
sure that a distributed approach to fetching and updating the data won't
cause problems with race conditions; with a centralised approach, the
probability of such problems can be reduced.

Thanks,


On Tue, Oct 20, 2015 at 5:14 AM, Mike Scherbakov 
wrote:

> Thanks Vladimir.
> This is very important work, I'd say. I'd split it into two parts:
>
>    1. Have some store where you'd dump data from serializers. Hiera
>    should be able to read easily directly from this store.
>    2. Refactor serializers so as to get rid of the single Python method
>    which creates data for multiple OpenStack components, and allow
>    deployment engineers to easily modify the code of a particular piece
>
> For #1, it is important to think broadly. We want this store to be used by
> other tools which users may have (Puppet Master, Ansible, etc.) as a source
> of data, and so that Fuel & other tools can coexist on the same env if
> really needed (even though I'd ideally try to avoid it).
> We need to have an abstraction layer there, so that we can have drivers
> for a key-value store and for such things as Zookeeper, for instance, in
> the future. I think we need to address #1 first, before going to #2 (if we
> can't do them in parallel).
>
> For #2, I think we need to consider flexibility. What if we use Ansible,
> or containers for some of our services? So we need to think about where we
> can put these per-component / per-task serializers, so those can be used
> for both a Puppet module & something different.
>
> Also, it's an interesting problem from the execution point of view. Do we
> run serialization on the Fuel Master side or on slave nodes, where we
> install OpenStack? I see some issues with running it on OpenStack nodes,
> even though I like the idea of load distribution, etc. For instance, if you
> run almost all of the graph, and then the last task in the graph runs the
> corresponding serializer - and there is a Python exception for whatever
> reason (user input leads to a bug in calculation). You could catch it right
> away if you calculated it before the overall deployment - but now you've
> been waiting for the deployment to be almost done to catch it.
>
> Thank you,
>
> On Fri, Oct 16, 2015 at 9:22 AM Vladimir Kuklin 
> wrote:
>
>> Hey, Fuelers
>>
>> TL;DR This email is about how to make Fuel more flexible.
>>
>> * Intro
>> I want to bring up one of the important topics on how to make Fuel more
>> flexible. Some of you know that we have been discussing means of doing this
>> internally and now it is time to share these thoughts with all of you.
>>
>> As you may know from Evgeniy L's message [0], we are looking forward to
>> splitting Fuel (specifically its Fuel-Web part) into a set of
>> microservices, each one serving its own purpose like networking
>> configuration, partitioning, etc.
>>
>>
>> And while we are working on this, it seems that we need to get rid of the
>> so-called Nailgun serializers that sit too close to the business logic
>> engine and have a lot of duplicated attributes; you are not able to
>> easily modify or extend them; you are not able to change their behaviour
>> even when Fuel Library is capable of doing so - everything is hardcoded in
>> Nailgun code without a clear separation between business logic and the
>> actual generation and orchestration of deployment workflow data.
>>
>> Let me give you an example:
>>
>> * Case A. Replace Linux bridges with OVS bridges by default
>>
>> We all know that we removed OVS as much as possible from our reference
>> architecture due to its bugginess. Imagine a situation where someone
>> magically fixed OVS and wants to use it as a provider for generic bonds
>> and bridges. It actually means that he needs to set the default provider
>> in network_scheme for the l23network puppet module to 'ovs' instead of
>> 'lnx'. Imagine he has put this magical OVS into a package and created a
>> plugin. The problem here will be that he needs to override what the
>> network serializer is sending to the nodes.
>>
>> But the problem here is that he cannot do it without editing Nailgun code
>> - there is no way to override this serializer.
>>
>> * Case B. Make Swift Partitions Known to Fuel Library
>>
>> Imagine you altered the way you partition your disks in Nailgun. You
>> created a special role for swift disks which should occupy the whole disk.
>> In this case you should be able to get this info from the API and feed it
>> to the swift deployment task. But it is not so easy - this stuff is still
>> hardcoded in deployment serializers like the {mp} field of the nodes array
>> of hashes.
>>
>> 

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-20 Thread Vladimir Kuklin
Folks

Can we please stop using etherpad and move to something more usable, such as
Google Docs? Etherpad seems too unusable for such a discussion, especially
with this coloured formatting.

Mike

I currently see no need to follow the marketing trend for noSQL here - we
need to store a set of structured data. This store should be one that
can be easily consumed directly or with some API wrapper. That is all. We
will need to carefully evaluate each storage engine and decide which to
pick. I personally insist on an engine that provides 100% consistency,
which is in fact the opposite of what most noSQL and distributed
architectures provide. Nobody cares if you lose 1 billion messages in a
social network (not even the messages' authors) - it is almost all
garbage with porn and cat pictures. Things will get worse if you destroy
something in production serving accounting in your cloud due to the fact
that nodes are

I agree with option #2 - we actually should have a task abstraction layer
with drivers for execution, but I would go with baby steps for supporting
other deployment tools - currently I do not see any benefit in using
Ansible for tasks that Fuel is solving. The same is almost true for
containers, but that is a different story.

Eugene, Mike

I agree with you that we need to think about where to execute these
serializers. I think that we could do it the following way - a serializer
can be executed wherever it can actually work, and it should possibly put
data into centralized storage for the purposes of logging, control and
accounting. I am not sure that this is a limitation all users will agree
with, but we need to think about it.

Regarding this 'last task throwing an exception' issue - we can handle this
properly by simply rerunning a task that failed only due to a serialization
problem. Or even better - reorder its execution for later steps and try it
again in a while if there are other tasks to be executed.

But Mike's approach of data preparation prior to deployment/workflow
transaction execution seems more viable. I think we should follow this
rule: "If you do not know the data before the transaction run, this data
should be calculated after this transaction ends and used for another
workflow in a different transaction".


On Tue, Oct 20, 2015 at 1:20 PM, Evgeniy L  wrote:

> Hi,
>
> I have a comment regarding when/where to run translators.
> I think data processing (fetching + validation + translation) should be
> done as a separate stage; this way, when we start a deployment, we are sure
> that we have everything we need to perform it. But I understand that there
> might be exceptions and sometimes we will have to get the data on the fly.
> If we are going to put translated data into some distributed store, I'm not
> sure that a distributed approach to fetching and updating the data won't
> cause problems with race conditions; with a centralised approach, the
> probability of such problems can be reduced.
>
> Thanks,
>
>
On Tue, Oct 20, 2015 at 5:14 AM, Mike Scherbakov wrote:
>
>> Thanks Vladimir.
>> This is very important work, I'd say. I'd split it into two parts:
>>
>>    1. Have some store where you'd dump data from serializers. Hiera
>>    should be able to read easily directly from this store.
>>    2. Refactor serializers so as to get rid of the single Python method
>>    which creates data for multiple OpenStack components, and allow
>>    deployment engineers to easily modify the code of a particular piece
>>
>> For #1, it is important to think broadly. We want this store to be used
>> by other tools which users may have (Puppet Master, Ansible, etc.) as a
>> source of data, and so that Fuel & other tools can coexist on the same env
>> if really needed (even though I'd ideally try to avoid it).
>> We need to have an abstraction layer there, so that we can have drivers
>> for a key-value store and for such things as Zookeeper, for instance, in
>> the future. I think we need to address #1 first, before going to #2 (if
>> we can't do them in parallel).
>>
>> For #2, I think we need to consider flexibility. What if we use Ansible,
>> or containers for some of our services? So we need to think about where
>> we can put these per-component / per-task serializers, so those can be
>> used for both a Puppet module & something different.
>>
>> Also, it's an interesting problem from the execution point of view. Do we
>> run serialization on the Fuel Master side or on slave nodes, where we
>> install OpenStack? I see some issues with running it on OpenStack nodes,
>> even though I like the idea of load distribution, etc. For instance, if
>> you run almost all of the graph, and then the last task in the graph runs
>> the corresponding serializer - and there is a Python exception for
>> whatever reason (user input leads to a bug in calculation). You could
>> catch it right away if you calculated it before the overall deployment -
>> but 

Re: [openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-19 Thread Mike Scherbakov
Thanks Vladimir.
This is very important work, I'd say. I'd split it into two parts:

   1. Have some store where you'd dump data from serializers. Hiera should
   be able to read easily directly from this store.
   2. Refactor serializers so as to get rid of the single Python method
   which creates data for multiple OpenStack components, and allow
   deployment engineers to easily modify the code of a particular piece

For #1, it is important to think broadly. We want this store to be used by
other tools which users may have (Puppet Master, Ansible, etc.) as a source
of data, and so that Fuel & other tools can coexist on the same env if
really needed (even though I'd ideally try to avoid it).
We need to have an abstraction layer there, so that we can have drivers for
a key-value store and for such things as Zookeeper, for instance, in the
future. I think we need to address #1 first, before going to #2 (if we
can't do them in parallel).
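
A sketch of that abstraction layer, assuming nothing beyond a get/put
contract; a Zookeeper driver would implement the same interface on top of a
client library (the names below are illustrative, not an existing Fuel API):

```python
import abc

# Hypothetical driver interface for the serialized-data store: start with
# a trivial key-value backend, swap in Zookeeper (or similar) later.
class StoreDriver(abc.ABC):
    @abc.abstractmethod
    def put(self, key, value):
        """Persist serialized data under a key."""

    @abc.abstractmethod
    def get(self, key):
        """Return previously stored data; hiera would read through this."""

class DictStore(StoreDriver):
    """In-memory reference driver used for illustration only."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]
```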

For #2, I think we need to consider flexibility. What if we use Ansible, or
containers for some of our services? So we need to think about where we can
put these per-component / per-task serializers, so those can be used for
both a Puppet module & something different.

Also, it's an interesting problem from the execution point of view. Do we
run serialization on the Fuel Master side or on slave nodes, where we
install OpenStack? I see some issues with running it on OpenStack nodes,
even though I like the idea of load distribution, etc. For instance, if you
run almost all of the graph, and then the last task in the graph runs the
corresponding serializer - and there is a Python exception for whatever
reason (user input leads to a bug in calculation). You could catch it right
away if you calculated it before the overall deployment - but now you've
been waiting for the deployment to be almost done to catch it.

Thank you,

On Fri, Oct 16, 2015 at 9:22 AM Vladimir Kuklin 
wrote:

> Hey, Fuelers
>
> TL;DR This email is about how to make Fuel more flexible.
>
> * Intro
> I want to bring up one of the important topics on how to make Fuel more
> flexible. Some of you know that we have been discussing means of doing this
> internally and now it is time to share these thoughts with all of you.
>
> As you may know from Evgeniy L's message [0], we are looking forward to
> splitting Fuel (specifically its Fuel-Web part) into a set of
> microservices, each one serving its own purpose like networking
> configuration, partitioning, etc.
>
>
> And while we are working on this, it seems that we need to get rid of the
> so-called Nailgun serializers that sit too close to the business logic
> engine and have a lot of duplicated attributes; you are not able to
> easily modify or extend them; you are not able to change their behaviour
> even when Fuel Library is capable of doing so - everything is hardcoded in
> Nailgun code without a clear separation between business logic and the
> actual generation and orchestration of deployment workflow data.
>
> Let me give you an example:
>
> * Case A. Replace Linux bridges with OVS bridges by default
>
> We all know that we removed OVS as much as possible from our reference
> architecture due to its bugginess. Imagine a situation where someone
> magically fixed OVS and wants to use it as a provider for generic bonds and
> bridges. It actually means that he needs to set the default provider in
> network_scheme for the l23network puppet module to 'ovs' instead of 'lnx'.
> Imagine he has put this magical OVS into a package and created a plugin.
> The problem here will be that he needs to override what the network
> serializer is sending to the nodes.
>
> But the problem here is that he cannot do it without editing Nailgun code -
> there is no way to override this serializer.
>
> * Case B. Make Swift Partitions Known to Fuel Library
>
> Imagine you altered the way you partition your disks in Nailgun. You
> created a special role for swift disks which should occupy the whole disk.
> In this case you should be able to get this info from the API and feed it
> to the swift deployment task. But it is not so easy - this stuff is still
> hardcoded in deployment serializers like the {mp} field of the nodes array
> of hashes.
>
> * Proposed solution
>
> In order to tackle this I propose to extract these so-called serializers
> (see links [1] and [2]) and put them closer to the library. You can see
> that half of the code is actually duplicated between the deployment and
> provisioning serializers, and there is actually no inheritance of common
> code between them. If you want to introduce a new attribute and put it into
> astute.yaml, you need to rewrite Nailgun code. This is not very
> deployment/sysop/sysadmin engineer-friendly. Essentially, the proposal is
> to introduce a library of such 'serializers' (I would actually like to call
> them translators) which could leverage inheritance, polymorphism and
> encapsulation pretty much in OOP mode, but with the ability for deployment
> engineers to apply versioning to serializers and allow each particular task
> to work 

[openstack-dev] [Fuel][Fuel-Modularization] Proposal on Decoupling Serializers from Nailgun

2015-10-16 Thread Vladimir Kuklin
Hey, Fuelers

TL;DR This email is about how to make Fuel more flexible.

* Intro
I want to bring up one of the important topics on how to make Fuel more
flexible. Some of you know that we have been discussing means of doing this
internally and now it is time to share these thoughts with all of you.

As you may know from Evgeniy L's message [0], we are looking forward to
splitting Fuel (specifically its Fuel-Web part) into a set of microservices,
each one serving its own purpose like networking configuration,
partitioning, etc.


And while we are working on this, it seems that we need to get rid of the
so-called Nailgun serializers that sit too close to the business logic
engine and have a lot of duplicated attributes; you are not able to
easily modify or extend them; you are not able to change their behaviour
even when Fuel Library is capable of doing so - everything is hardcoded in
Nailgun code without a clear separation between business logic and the
actual generation and orchestration of deployment workflow data.

Let me give you an example:

* Case A. Replace Linux bridges with OVS bridges by default

We all know that we removed OVS as much as possible from our reference
architecture due to its bugginess. Imagine a situation where someone
magically fixed OVS and wants to use it as a provider for generic bonds and
bridges. It actually means that he needs to set the default provider in
network_scheme for the l23network puppet module to 'ovs' instead of 'lnx'.
Imagine he has put this magical OVS into a package and created a plugin.
The problem here will be that he needs to override what the network
serializer is sending to the nodes.

But the problem here is that he cannot do it without editing Nailgun code -
there is no way to override this serializer.

* Case B. Make Swift Partitions Known to Fuel Library

Imagine you altered the way you partition your disks in Nailgun. You
created a special role for swift disks which should occupy the whole disk.
In this case you should be able to get this info from the API and feed it to
the swift deployment task. But it is not so easy - this stuff is still
hardcoded in deployment serializers like the {mp} field of the nodes array
of hashes.

* Proposed solution

In order to tackle this I propose to extract these so-called serializers
(see links [1] and [2]) and put them closer to the library. You can see that
half of the code is actually duplicated between the deployment and
provisioning serializers, and there is actually no inheritance of common
code between them. If you want to introduce a new attribute and put it into
astute.yaml, you need to rewrite Nailgun code. This is not very
deployment/sysop/sysadmin engineer-friendly. Essentially, the proposal is
to introduce a library of such 'serializers' (I would actually like to call
them translators) which could leverage inheritance, polymorphism and
encapsulation pretty much in OOP mode, but with the ability for deployment
engineers to apply versioning to serializers and allow each particular task
to work with different sources of data and different versions of the API.

What this actually means: each task has a step called 'translation' which
fetches attributes from an arbitrary set of sources and converts them into
the format that is consumable by the deployment stage of this task. From
our current architectural point of view it will look like the generation of
a set of YAML files that are merged by hiera, so that each puppet task can
leverage the power of hiera.
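
A sketch of what such a versioned translator could look like; the base
class, method names and data sources are assumptions for illustration, not
the proposed implementation:

```python
import yaml  # PyYAML

# Hypothetical translator: fetch attributes from component APIs, convert
# them to a task-specific format, and dump YAML for hiera to merge.
class Translator(object):
    version = "1.0"          # deployment engineers can version translators

    def fetch(self, sources):
        raise NotImplementedError

    def translate(self, raw):
        raise NotImplementedError

    def run(self, sources, path):
        data = self.translate(self.fetch(sources))
        with open(path, "w") as f:
            yaml.safe_dump(data, f)

class NetworkSchemeTranslator(Translator):
    version = "2.0"          # a task can pin the version it understands

    def fetch(self, sources):
        return sources["network"].get_scheme()   # assumed component API

    def translate(self, raw):
        raw.setdefault("provider", "lnx")   # a plugin could override to 'ovs'
        return {"network_scheme": raw}
```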

This actually means that, in the scope of our modularization initiative,
each module should have an API which will be accessed by those tasks at
runtime right before the tasks are executed. This also means that if a user
changes some of the values in the databases of those modules, a rerun of
such a task will lead to a different result of 'translation' and trigger
some actions like 'keystone_config ~> Service[keystone]' in puppet.

There is a tough discussion (etherpad here: [4]) on:

1) how to handle versioning/revert capabilities
2) where to store output produced by those 'translators'
3) which type of the storage to use

Please, feel free to provide your feedback on this approach and tell me
where this approach is going to be wrong.

[0] http://permalink.gmane.org/gmane.comp.cloud.openstack.devel/66563
[1]
https://github.com/stackforge/fuel-web/blob/master/nailgun/nailgun/orchestrator/deployment_serializers.py
[2]
https://github.com/stackforge/fuel-web/blob/master/nailgun/nailgun/orchestrator/provisioning_serializers.py
[3] https://github.com/xenolog/l23network
[4] https://etherpad.openstack.org/p/data-processor-per-component

-- 
Yours Faithfully,
Vladimir Kuklin,
Fuel Library Tech Lead,
Mirantis, Inc.
+7 (495) 640-49-04
+7 (926) 702-39-68
Skype kuklinvv
35bk3, Vorontsovskaya Str.
Moscow, Russia,
www.mirantis.com 
www.mirantis.ru
vkuk...@mirantis.com