Re: [openstack-dev] [oslo][oslo.db] MySQL Cluster support

Octave J. Orgeron Thu, 02 Feb 2017 11:31:59 -0800

Hi Doug,

Comments below..


Thanks,
Octave

On 2/2/2017 11:27 AM, Doug Hellmann wrote:

Excerpts from Octave J. Orgeron's message of 2017-02-02 09:40:23 -0700:

Hi Doug,

One could try to detect the default engine. However, in MySQL Cluster,
you can support multiple storage engines. Only NDB is fully clustered
and replicated, so if you accidentally set a table to be InnoDB it won't
be replicated. So it makes more sense for the operator to be explicit
on which engine they want to use.

I think this change is probably a bigger scale item than I understood
it to be when you originally contacted me off-list for advice about
how to get started. I hope I haven't steered you too far wrong, but
at least the conversation is started.

As someone (Mike?) pointed out on the review, the option by itself
doesn't do much of anything, now. Before we add it, I think we'll
want to see some more detail about how it's going to be used. It may be
easier to have that broader conversation here on email than on the
patch currently up for review.

Understood, it's a complicated topic since it involves gritty details in
SQL Alchemy and Alembic that are masked from end-users and operators
alike. Figuring out how to make this work did take some time on my part.


It sounds like part of the plan is to use the configuration setting
to control how the migration scripts create tables. How will that
work? Does each migration need custom logic, or can we build helpers
into oslo.db somehow? Or will the option be passed to the database
to change its behavior transparently?

These are good questions. For each service, when the db sync or db
manage operation is done it will call into SQL Alchemy or Alembic
depending on the methods used by the given service. For example, most
use SQL Alchemy, but there are services like Ironic and Neutron that use
Alembic. It is within these scripts under the <service>/db/* hierarchy
that the logic exists today to configure the database schema for any
given service. Both approaches will look at the schema version in the
database to determine where to start the create, upgrade, heal, etc.
operations. What my patches do is add custom IF/THEN logic, in the
scripts where a table needs to be modified, that checks the
cfg.CONF.database.mysql_storage_engine setting and makes the required
modifications. There are also use cases where the api.py or model(s).py
under the <service>/db/ hierarchy needs to look at this setting as well
for API and CLI operations where mysql_engine is auto-inserted into DB
operations. In those use cases, I replace the hard coded "InnoDB" with
the mysql_storage_engine variable.
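
As a rough illustration of the IF/THEN logic described above, here is a
minimal sketch. The helper names table_options and column_length, the
"NDBCLUSTER" value, and the example column lengths are assumptions made
for illustration; they are not taken from the actual patches. In a real
migration script the engine would be read from
cfg.CONF.database.mysql_storage_engine, and the returned dict would be
passed as keyword arguments to a SQL Alchemy Table.

```python
# Hypothetical helpers; not part of oslo.db or any service today.
DEFAULT_ENGINE = "InnoDB"


def table_options(storage_engine=None):
    """Return MySQL table keyword arguments for the selected engine.

    A real script would read the engine from
    cfg.CONF.database.mysql_storage_engine instead of taking it as
    an argument.
    """
    engine = storage_engine or DEFAULT_ENGINE
    return {"mysql_engine": engine, "mysql_charset": "utf8"}


def column_length(innodb_length, ndb_length, storage_engine=None):
    """Pick a column length, shrinking it only when NDB is selected."""
    if (storage_engine or DEFAULT_ENGINE).lower() == "ndbcluster":
        return ndb_length
    return innodb_length
```

With no engine configured, both helpers fall back to the existing InnoDB
behavior, which matches the goal of not affecting current deployments.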

It would be interesting if we could develop some helpers to automate
this, but it would probably have to be at the SQL Alchemy or Alembic
levels. Unfortunately, throughout all of the OpenStack services today we
are hard coding things like mysql_engine, using InnoDB specific features
(savepoints, nested operations, etc.), and not following the strict SQL
ordering for modifying table elements (foreign keys, constraints, and
indexes). That actually makes it difficult to support other MySQL
dialects or other databases out of the box. SQL Alchemy can be used to
fix some of these things if the SQL statements are all generic and we
follow strict SQL rules. But to change that would be a monumental
effort. That is why I took this approach of just adding custom logic.
There is a precedent for this already for Postgres and DB2 support in
some of the OpenStack services using custom logic to deal with similar
differences.

As to why we should place the configuration setting into oslo.db? Here
are a couple of logical reasons:


 * The configuration block for database settings for each service comes
   from the oslo.db namespace today under cfg.CONF.database.*. Placing
   it here makes the location consistent across all of the services.
 * Within the SQL Alchemy and Alembic scripts, this is one of the few
   common namespaces that are available without bringing in a larger
   number of modules across the services today.
 * Many of the SQL Alchemy and Alembic scripts only import the minimal
   set of python modules. If we imported others, we would also have to
   initialize those namespaces, which means a lot more code :(
 * Reduces the amount of overhead required to make these changes.


Keep in mind that we do not encourage code outside of libraries to
rely on configuration settings defined within libraries, because
that limits our ability to change the names and locations of the
configuration variables.  If migration scripts need to access the
configuration setting we will need to add some sort of public API
to oslo.db to query the value. The function can simply return the
configured value.
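
A minimal sketch of what such an accessor could look like follows. The
function name get_mysql_storage_engine is hypothetical, and the _CONF
object is only a stand-in for oslo.config's cfg.CONF so that the example
runs on its own; neither is an existing oslo.db API.

```python
from types import SimpleNamespace

# Stand-in for cfg.CONF, used here only to keep the sketch runnable.
_CONF = SimpleNamespace(
    database=SimpleNamespace(mysql_storage_engine="InnoDB"))


def get_mysql_storage_engine(conf=_CONF):
    """Hypothetical public accessor: return the configured value."""
    return conf.database.mysql_storage_engine
```

Migration scripts would then call the function rather than reading the
option directly, so the option could later be renamed or moved without
touching code outside the library.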

Configuration parameters within any given service will make use of a
large namespace that pulls in things from oslo and the .conf files for a
given service. So even when an API, CLI, or DB related call is made,
these namespaces are key for things to work. In the case of the SQL
Alchemy and Alembic scripts, they also make use of this namespace with
oslo, oslo.db, etc. to figure out how to connect to the database and
other database settings. I don't think we need a public API for these
kinds of calls as the community already makes use of the libraries to
build the namespace. My oslo.db setting and patches for each service
just make use of the cfg.CONF.database namespace to determine the
correct behavior to execute.


What other behaviors are likely to be changed by the new option?
Will application runtime behavior need to know about the storage
engine?

The changes will be transparent to the application runtime behavior. The
APIs and CLI tools call into the <service>/db/api.py as the entry point
for database calls. Behind this you usually have a models.py that is
aware of the database schema to understand the layout of things. So the
underlying structure is abstracted away from the run-time. These entry
points sometimes do require minor modifications to handle any hard coded
issues or intercept functions like savepoints and nested operations.
Again I use the cfg.CONF.database namespace to check for the appropriate
behavior and implement IF/THEN logic to do the right thing.
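
To make the savepoint interception more concrete, here is a minimal
sketch. It assumes the NDB path should simply skip the savepoint and run
in the enclosing transaction; the actual patches may handle this
differently. The maybe_nested helper, the FakeSession class, and the
"NDBCLUSTER" value are hypothetical. FakeSession only mimics the
begin_nested() method of a SQL Alchemy session so the example runs
without a database.

```python
import contextlib


class FakeSession:
    """Stand-in for a SQL Alchemy session, for illustration only."""

    def __init__(self):
        self.savepoints = 0

    @contextlib.contextmanager
    def begin_nested(self):
        # A real session would issue a SAVEPOINT here.
        self.savepoints += 1
        yield self


def maybe_nested(session, storage_engine):
    """Open a savepoint unless the NDB engine is selected."""
    if storage_engine.lower() == "ndbcluster":
        # Assumption: skip the savepoint and do nothing extra.
        return contextlib.nullcontext(session)
    return session.begin_nested()
```

A caller would write "with maybe_nested(session, engine):" where it
previously wrote "with session.begin_nested():", which leaves the InnoDB
path unchanged.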


Some of my design objectives for all of these patches are:

 * Zero impact on OpenStack functionality and usability (API, CLI, user
   experience)
 * No loss in database structure. Consistent foreign keys, constraints,
   indexes, etc.
 * Minimal impact on column size and/or types to fit within NDB table
   row limits. Many columns are over-sized today.
 * Validate functionality of APIs, service processes, and CLI. Tempest
   is our friend :)
 * Zero impact for users not using MySQL Cluster (NDB).


Doug

Thanks,
Octave

On 2/2/2017 6:46 AM, Doug Hellmann wrote:

Excerpts from Octave J. Orgeron's message of 2017-02-01 20:33:38 -0700:

Hi Folks,

I'm working on adding support for MySQL Cluster to the core OpenStack
services. This will enable the community to benefit from an
active/active, auto-sharding, and scale-out MySQL database. My approach
is to have a single configuration setting in each core OpenStack service
in the oslo.db configuration section called mysql_storage_engine that
will enable the logic in the SQL Alchemy or Alembic upgrade scripts to
handle the differences between InnoDB and NDB storage engines
respectively. When enabled, this logic will make the required table
schema changes around:

    * Row character length limits 65k -> 14k
    * Proper SQL ordering of foreign key, constraints, and index operations
    * Interception of savepoint and nested operations
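
As a rough illustration of the row length point, the sketch below
estimates whether a set of VARCHAR columns fits within a row limit. The
limits are taken from the approximate "65k" and "14k" figures above, the
3 bytes per character is an assumed utf8 worst case, and the estimate
ignores per-column and per-row overhead. The function names are
hypothetical.

```python
# Assumed limits, based on the approximate figures quoted above.
ROW_LIMITS = {"innodb": 65535, "ndbcluster": 14000}
BYTES_PER_CHAR = 3  # assumed utf8 worst case


def estimated_row_bytes(varchar_lengths):
    """Rough upper bound on row size; ignores storage overhead."""
    return sum(length * BYTES_PER_CHAR for length in varchar_lengths)


def fits(varchar_lengths, storage_engine="InnoDB"):
    """Return True if the estimated row fits the engine's limit."""
    limit = ROW_LIMITS[storage_engine.lower()]
    return estimated_row_bytes(varchar_lengths) <= limit
```

For example, twenty VARCHAR(255) columns come to an estimated 15300
bytes, which passes the InnoDB check but not the NDB one, so some of
those columns would need to be shortened for NDB.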

By default this functionality will not be enabled and will have no
impact on the default InnoDB functionality. These changes have been
tested on Kilo and Mitaka in previous releases of our OpenStack
distributions with Tempest. I'm working on updating these patches for
upstream consumption. We are also working on a 3rd party CI for
regression testing against MySQL Cluster for the community.

The first change set is for oslo.db and can be reviewed at:

https://review.openstack.org/427970

Thanks,
Octave

Is it possible to detect the storage engine at runtime, instead of
having the operator configure it?

Doug

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Oracle <http://www.oracle.com/>
Octave J. Orgeron | Sr. Principal Architect and Software Engineer
Oracle Linux OpenStack
Mobile: +1-720-616-1550 <tel:+17206161550>
500 Eldorado Blvd. | Broomfield, CO 80021

Certified Oracle Enterprise Architect: Systems Infrastructure
<http://www.oracle.com/us/solutions/enterprise-architecture/index.html>
Green Oracle <http://www.oracle.com/commitment> Oracle is committed to
developing practices and products that help protect the environment
