MySQL Cluster 7.6.4-dmr has been released

Balasubramanian Kandasamy Thu, 01 Feb 2018 10:33:02 -0800


Dear MySQL Users,


MySQL Cluster is the distributed, shared-nothing variant of MySQL.
This storage engine provides:

  - In-Memory storage - Real-time performance (with optional
    checkpointing to disk)
  - Transparent Auto-Sharding - Read & write scalability
  - Active-Active/Multi-Master geographic replication

  - 99.999% High Availability with no single point of failure
    and on-line maintenance
  - NoSQL and SQL APIs (including C++, Java, http, Memcached
    and JavaScript/Node.js)

MySQL Cluster 7.6.4-dmr has been released and can be downloaded from

http://www.mysql.com/downloads/cluster/

where you will also find Quick Start guides to help you get your
first MySQL Cluster database up and running.

The release notes are available from

http://dev.mysql.com/doc/relnotes/mysql-cluster/7.6/en/index.html

MySQL Cluster enables users to meet the database challenges of next
generation web, cloud, and communications services with uncompromising
scalability, uptime and agility.

More details can be found at

http://www.mysql.com/products/cluster/

Enjoy!

Changes in MySQL NDB Cluster 7.6.4 (5.7.20-ndb-7.6.4) (2018-01-31,
Development Milestone 4)

   MySQL NDB Cluster 7.6.4 is a new release of NDB 7.6, based on
   MySQL Server 5.7 and including features in version 7.6 of the
   NDB storage engine, as well as fixing recently discovered
   bugs in previous NDB Cluster releases.

   Obtaining NDB Cluster 7.6.  NDB Cluster 7.6 source code and
   binaries can be obtained from
http://dev.mysql.com/downloads/cluster/.

   For an overview of changes made in NDB Cluster 7.6, see What
   is New in NDB Cluster 7.6
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-what-is-new-7-6.html).

   This release also incorporates all bug fixes and changes made
   in previous NDB Cluster releases, as well as all bug fixes
   and feature changes which were added in mainline MySQL 5.7
   through MySQL 5.7.20 (see Changes in MySQL 5.7.20
   (2017-10-16, General Availability)
(http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-20.html)).

     * Functionality Added or Changed

     * Bugs Fixed

   Functionality Added or Changed

     * Incompatible Change; NDB Disk Data: Due to changes in
       disk file formats, it is necessary to perform an
       --initial restart of each data node when upgrading to or
       downgrading from this release.

     * Important Change; NDB Disk Data: NDB Cluster has improved
       node restart times and overall performance with larger
       data sets by implementing partial local checkpoints.
       Prior to this release, an LCP always made a copy of the
       entire database.
       NDB now supports LCPs that write individual records, so
       it is no longer strictly necessary for an LCP to write
       the entire database. Since, at recovery, it remains
       necessary to restore the database fully, the strategy is
       to save one fourth of all records at each LCP, as well as
       to write the records that have changed since the last
       LCP.
       Two data node configuration parameters relating to this
       change are introduced in this release: EnablePartialLcp
       (default true, or enabled) enables partial LCPs. When
       partial LCPs are enabled, RecoveryWork controls the
       percentage of space given over to LCPs; it increases with
       the amount of work which must be performed on LCPs during
       restarts as opposed to that performed during normal
       operations. Raising this value causes LCPs during normal
       operations to require writing fewer records and so
       decreases the usual workload. Raising this value also
       means that restarts can take longer.
       Important
       Upgrading disk data tables to NDB 7.6.4 or downgrading
       them from this release requires an initial restart of
       each data node. An initial node restart still requires a
       complete LCP; a partial LCP is not used for this purpose.
       This release also deprecates the data node configuration
       parameters BackupDataBufferSize, BackupWriteSize, and
       BackupMaxWriteSize; these are now subject to removal in a
       future NDB Cluster release.
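
       The partial LCP strategy described above can be illustrated
       with a toy model. The sketch below is not NDB's
       implementation: the assignment of records to four parts by
       record ID modulo 4 and the function name partial_lcp are
       assumptions made for illustration only. It shows that
       writing one fourth of all records plus all changed records
       at each LCP still covers every record after four
       consecutive LCPs.

```python
# Toy model of partial local checkpoints (LCPs); not NDB's actual algorithm.
# Assumption: records are split into 4 parts by record ID modulo 4.

PARTS = 4  # "one fourth of all records at each LCP"


def partial_lcp(lcp_number, all_records, changed_records):
    """Return the set of record IDs written by one partial LCP."""
    part = lcp_number % PARTS
    # The quarter of the database that is written in full this time.
    full_part = {r for r in all_records if r % PARTS == part}
    # Plus every record changed since the last LCP, whatever its part.
    return full_part | set(changed_records)


records = set(range(20))
written = [partial_lcp(n, records, changed_records={7}) for n in range(PARTS)]

# Each partial LCP writes far fewer records than a full LCP would,
sizes = [len(w) for w in written]
# yet four consecutive partial LCPs together cover the whole database.
covered = set().union(*written)
print(sizes, covered == records)
```

       In this model a restart has to read several partial LCPs
       to rebuild the full data set, which is the trade-off
       between normal operation and restart work that
       RecoveryWork is described as controlling.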

     * Important Change: Added the ndb_perror utility for
       obtaining information about NDB Cluster error codes. This
       tool replaces perror --ndb; the --ndb option for perror
       is now deprecated and raises a warning when used; the
       option is subject to removal in a future NDB release.
       See ndb_perror --- Obtain NDB error message information
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-programs-ndb-perror.html),
       for more information. (Bug
       #81703, Bug #81704, Bug #23523869, Bug #23523926)
       References: See also: Bug #26966826, Bug #88086.

     * NDB Client Programs: NDB Cluster Auto-Installer node
       configuration parameters as supported in the UI and
       accompanying documentation were in some cases hard coded
       to an arbitrary value, or were missing altogether.
       Configuration parameters, their default values, and the
       documentation have been better aligned with those found
       in release versions of the NDB Cluster software.
       One necessary addition to this task was implementing the
       mechanism which the Auto-Installer now provides for
       setting parameters that take discrete values. For
       example, the value of the data node parameter Arbitration
       must now be one of Default, Disabled, or WaitExternal.
       The Auto-Installer also now gets and uses the amount of
       disk space available to NDB on each host for deriving
       reasonable default values for configuration parameters
       which depend on this value.
       See The NDB Cluster Auto-Installer
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-install-auto.html),
       for more information.

     * NDB Client Programs: Secure connection support in the
       MySQL NDB Cluster Auto-Installer has been updated or
       improved in this release as follows:

          + Added a mechanism for setting SSH membership on a
            per-host basis.

          + Updated the Paramiko Python module to the most
            recent available version (2.6.1).

          + Provided a place in the GUI for encrypted private
            key passwords, and discontinued use of hardcoded
            passwords.
       Related enhancements implemented in the current release
       include the following:

          + Discontinued use of cookies as a persistent store
            for NDB Cluster configuration information; these
            were not secure and came with a hard upper limit on
            storage. Now the Auto-Installer uses an encrypted
            file for this purpose.

          + In order to secure data transfer between the web
            browser front end and the back end web server, the
            default communications protocol has been switched
            from HTTP to HTTPS.
       See The NDB Cluster Auto-Installer
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-install-auto.html),
       for more information.

     * It is now possible to specify a set of cores to be used
       for I/O threads performing offline multithreaded builds
       of ordered indexes, as opposed to normal I/O duties such
       as file I/O, compression, or decompression. "Offline" in
       this context refers to building of ordered indexes
       performed when the parent table is not being written to;
       such building takes place when an NDB cluster performs a
       node or system restart, or as part of restoring a cluster
       from backup using ndb_restore --rebuild-indexes.
       In addition, the default behaviour for offline index
       build work is modified to use all cores available to
       ndbmtd, rather than limiting itself to the core reserved for
       the I/O thread. Doing so can improve restart and restore
       times and performance, availability, and the user
       experience.
       This enhancement is implemented as follows:

         1. The default value for BuildIndexThreads is changed
            from 0 to 128. This means that offline ordered index
            builds are now multithreaded by default.

         2. The default value for TwoPassInitialNodeRestartCopy
            is changed from false to true. This means that an
            initial node restart first copies all data from a
            "live" node to one that is starting---without
            creating any indexes---builds ordered indexes
            offline, and then again synchronizes its data with
            the live node, that is, synchronizing twice and
            building indexes offline between the two
            synchronizations. This causes an initial node restart
            to behave more like the normal restart of a node,
            and reduces the time required for building indexes.

         3. A new thread type (idxbld) is defined for the
            ThreadConfig configuration parameter, to allow
            locking of offline index build threads to specific
            CPUs.
       In addition, NDB now distinguishes the thread types that
       are accessible to "ThreadConfig" by the following two
       criteria:

         1. Whether the thread is an execution thread. Threads
            of types main, ldm, recv, rep, tc, and send are
            execution threads; thread types io, watchdog, and
            idxbld are not.

         2. Whether the allocation of the thread to a given task
            is permanent or temporary. Currently all thread
            types except idxbld are permanent.
       For additional information, see the descriptions of the
       parameters in the Manual. (Bug #25835748, Bug #26928111)
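
       The two criteria just listed can be summarized as a small
       lookup. The following sketch only restates the
       classification given in these notes; the function name
       classify and the set names are hypothetical and are not
       part of any NDB API.

```python
# Thread-type classification as stated in these release notes.
EXECUTION_THREADS = {"main", "ldm", "recv", "rep", "tc", "send"}
NON_EXECUTION_THREADS = {"io", "watchdog", "idxbld"}
TEMPORARY_THREADS = {"idxbld"}  # all other types are permanent


def classify(thread_type):
    """Return (is_execution_thread, is_permanent) for a ThreadConfig type."""
    if thread_type not in EXECUTION_THREADS | NON_EXECUTION_THREADS:
        raise ValueError(f"unknown thread type: {thread_type!r}")
    return (thread_type in EXECUTION_THREADS,
            thread_type not in TEMPORARY_THREADS)


for name in ("ldm", "io", "idxbld"):
    print(name, classify(name))
```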

     * Added the ODirectSyncFlag configuration parameter for
       data nodes. When enabled, the data node treats all
       completed filesystem writes to the redo log as though
       they had been performed using fsync.
       Note
       This parameter has no effect if at least one of the
       following conditions is true:

          + ODirect is not enabled.

          + InitFragmentLogFiles is set to SPARSE.
       (Bug #25428560)

     * Added the ndbinfo.error_messages table, which provides
       information about NDB Cluster errors, including error
       codes, status types, brief descriptions, and
       classifications. This makes it possible to obtain error
       information using SQL in the mysql client (or other MySQL
       client program), like this:
mysql> SELECT * FROM ndbinfo.error_messages WHERE error_code='321';
+------------+----------------------+-----------------+----------------------+
| error_code | error_description    | error_status    | error_classification |
+------------+----------------------+-----------------+----------------------+
|        321 | Invalid nodegroup id | Permanent error | Application error    |
+------------+----------------------+-----------------+----------------------+
1 row in set (0.00 sec)

       The query just shown provides equivalent information to
       that obtained by issuing ndb_perror 321 or (now
       deprecated) perror --ndb 321 on the command line. (Bug
       #86295, Bug #26048272)

     * When executing a scan as a pushed join, all instances of
       DBSPJ were involved in the execution of a single query;
       some of these received multiple requests from the same
       query. This situation is improved by enabling a single
       SPJ request to handle a set of root fragments to be
       scanned, such that only a single SPJ request is sent to
       each DBSPJ instance on each node. Because batch sizes are
       allocated per fragment, the multi-fragment scan can
       obtain a larger total batch size, allowing for some
       scheduling optimizations to be done within DBSPJ, which
       can scan a single fragment at a time (giving it the total
       batch size allocation), scan all fragments in parallel
       using smaller sub-batches, or some combination of the
       two.
       Since the effect of this change is generally to require
       fewer SPJ requests and instances, performance of
       pushed-down joins should be improved in many cases.
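
       The scheduling choices mentioned above (one fragment at a
       time with the total batch size, all fragments in parallel
       with smaller sub-batches, or a mix) can be pictured with
       the following toy model. It is a sketch only: the
       function plan_sub_batches, the parallelism argument, and
       the use of equal-sized sub-batches are assumptions, and
       DBSPJ's real scheduling is not shown here.

```python
# Toy model of dividing a scan's total batch size among root fragments.
# Assumption: sub-batches are equal-sized; DBSPJ's real scheduling is not shown.


def plan_sub_batches(total_batch_size, fragments, parallelism):
    """Yield rounds of (fragment, sub_batch_size) pairs scanned together."""
    if not 1 <= parallelism <= len(fragments):
        raise ValueError("parallelism must be between 1 and len(fragments)")
    sub_batch = total_batch_size // parallelism
    for start in range(0, len(fragments), parallelism):
        yield [(frag, sub_batch) for frag in fragments[start:start + parallelism]]


fragments = [0, 1, 2, 3]
one_at_a_time = list(plan_sub_batches(64, fragments, parallelism=1))
all_parallel = list(plan_sub_batches(64, fragments, parallelism=4))
mixed = list(plan_sub_batches(64, fragments, parallelism=2))
print(one_at_a_time, all_parallel, mixed, sep="\n")
```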

     * As part of work ongoing to optimize bulk DDL performance
       by ndbmtd, it is now possible to obtain performance
       improvements by increasing the batch size for the bulk
       data parts of DDL operations which process all of the
       data in a fragment or set of fragments using a scan.
       Batch sizes are now made configurable for unique index
       builds, foreign key builds, and online reorganization, by
       setting the respective data node configuration parameters
       listed here:

          + MaxFKBuildBatchSize: Maximum scan batch size used
            for building foreign keys.

          + MaxReorgBuildBatchSize: Maximum scan batch size used
            for reorganization of table partitions.

          + MaxUIBuildBatchSize: Maximum scan batch size used
            for building unique keys.
       For each of the parameters just listed, the default value
       is 64, the minimum is 16, and the maximum is 512.
       Increasing the appropriate batch size or sizes can help
       amortize inter-thread and inter-node latencies and make
       use of more parallel resources (local and remote) to help
       scale DDL performance.
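
       As an illustration of the limits just given, the sketch
       below checks proposed values against the stated range and
       prints them as lines for the [ndbd default] section of
       config.ini. The helper render_batch_settings is
       hypothetical and is not shipped with NDB Cluster; the
       value 256 is only an example, not a recommendation.

```python
# Limits quoted in the release notes for all three parameters.
BATCH_DEFAULT, BATCH_MIN, BATCH_MAX = 64, 16, 512
BATCH_PARAMS = ("MaxFKBuildBatchSize", "MaxReorgBuildBatchSize",
                "MaxUIBuildBatchSize")


def render_batch_settings(**overrides):
    """Return config.ini lines for the three batch-size parameters.

    Hypothetical helper: unknown names or out-of-range values raise ValueError.
    """
    unknown = set(overrides) - set(BATCH_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    lines = ["[ndbd default]"]
    for name in BATCH_PARAMS:
        value = overrides.get(name, BATCH_DEFAULT)
        if not BATCH_MIN <= value <= BATCH_MAX:
            raise ValueError(f"{name}={value} is outside {BATCH_MIN}..{BATCH_MAX}")
        lines.append(f"{name}={value}")
    return "\n".join(lines)


print(render_batch_settings(MaxUIBuildBatchSize=256))
```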

     * Formerly, the data node LGMAN kernel block processed undo
       log records serially; now this is done in parallel. The
       rep thread, which hands off undo records to local data
       handler (LDM) threads, waited for an LDM to finish
       applying a record before fetching the next one; now the
       rep thread no longer waits, but proceeds immediately to
       the next record and LDM.
       There are no user-visible changes in functionality
       directly associated with this work; this performance
       enhancement is part of the work being done in NDB 7.6 to
       improve undo log handling for partial local checkpoints.

     * When applying an undo log the table ID and fragment ID
       are obtained from the page ID. This was done by reading
       the page from PGMAN using an extra PGMAN worker thread,
       but when applying the undo log it was necessary to read
       the page again.
       This became very inefficient when using O_DIRECT (see
       ODirect) since the page was not cached in the OS kernel.
       Mapping from page ID to table ID and fragment ID is now
       done using information the extent header contains about
       the table IDs and fragment IDs of the pages used in a
       given extent. Since the extent pages are always present
       in the page cache, no extra disk reads are required to
       perform the mapping, and the information can be read
       using existing TSMAN data structures.

     * Added the NODELOG DEBUG command in the ndb_mgm client to
       provide runtime control over data node debug logging.
       NODELOG DEBUG ON causes a data node to write extra debugging
       information to its node log, the same as if the node had
       been started with --verbose. NODELOG DEBUG OFF disables
       the extra logging.

     * Added the LocationDomainId configuration parameter for
       management, data, and API nodes. When using NDB Cluster
       in a cloud environment, you can set this parameter to
       assign a node to a given availability domain or
       availability zone. This can improve performance in the
       following ways:

          + If requested data is not found on the same node,
            reads can be directed to another node in the same
            availability domain.

          + Communication between nodes in different
            availability domains is guaranteed to use NDB
            transporters' WAN support without any further manual
            intervention.

          + The transporter's group number can be based on which
            availability domain is used, so that SQL and other
            API nodes also communicate with local data nodes in
            the same availability domain whenever possible.

          + The arbitrator can be selected from an availability
            domain in which no data nodes are present, or, if no
            such availability domain can be found, from a third
            availability domain.
       This parameter takes an integer value between 0 and 16,
       with 0 being the default; using 0 is the same as leaving
       LocationDomainId unset.
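
       The idea of preferring nodes in the same availability
       domain can be sketched as follows. This is a toy model,
       not NDB's node selection logic: the function
       pick_read_node and the (node_id, location_domain_id)
       pairs are assumptions made for illustration. Only the
       0 to 16 range and the meaning of 0 come from the
       description above.

```python
# Toy model of location-aware read routing; not NDB's actual selection logic.
# Assumption: each replica is described by a (node_id, location_domain_id) pair.

UNSET = 0        # LocationDomainId=0 is the same as leaving the parameter unset
MAX_DOMAIN = 16  # upper bound stated in the release notes


def pick_read_node(api_domain, replicas):
    """Pick a replica for a read, preferring the API node's own domain."""
    if not UNSET <= api_domain <= MAX_DOMAIN:
        raise ValueError("LocationDomainId must be between 0 and 16")
    if api_domain != UNSET:
        for node_id, domain in replicas:
            if domain == api_domain:
                return node_id
    # No domain configured, or no replica in the same domain: take the first.
    return replicas[0][0]


replicas = [(1, 1), (2, 2)]
print(pick_read_node(2, replicas), pick_read_node(UNSET, replicas))
```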

   Bugs Fixed

     * Important Change: The --passwd option for ndb_top is now
       deprecated, and thus subject to removal in a future
       release of NDB Cluster. (Bug #88236, Bug #20733646)
       References: See also: Bug #86615, Bug #26236320.

     * Replication: With GTIDs generated for incident log
       events, MySQL error code 1590 (ER_SLAVE_INCIDENT) could
       not be skipped using the --slave-skip-errors=1590 startup
       option on a replication slave. (Bug #26266758)

     * NDB Disk Data: An ALTER TABLE that switched the table
       storage format between MEMORY and DISK was always
       performed in place for all columns. This is not correct
       in the case of a column whose storage format is inherited
       from the table; the column's storage type is not changed.
       For example, this statement creates a table t1 whose
       column c2 uses in-memory storage since the table does so
       implicitly:
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT) ENGINE NDB;

       The ALTER TABLE statement shown here is expected to cause
       c2 to be stored on disk, but failed to do so:
ALTER TABLE t1 STORAGE DISK TABLESPACE ts1;

       Similarly, an on-disk column that inherited its storage
       format from the table to which it belonged did not have
       the format changed by ALTER TABLE ... STORAGE MEMORY.
       These two cases are now performed as a copying alter, and
       the storage format of the affected column is now changed.
       (Bug #26764270)

     * NDB Replication: On an SQL node not being used for a
       replication channel with sql_log_bin=0, it was possible,
       after creating and populating an NDB table, for a table
       map event to be written to the binary log for the created
       table with no corresponding row events. This led to
       problems when this log was later used by a slave cluster
       replicating from the mysqld where this table was created.
       Fixed this by adding support for maintaining a cumulative
       any_value bitmap for global checkpoint event operations
       that represents bits set consistently for all rows of a
       specific table in a given epoch, and by adding a check to
       determine whether all operations (rows) for a specific
       table are all marked as NOLOGGING, to prevent the
       addition of this table to the Table_map held by the
       binlog injector.
       As part of this fix, the NDB API adds a new
       getNextEventOpInEpoch3() method which provides
       information about any AnyValue received by making it
       possible to retrieve the cumulative any_value bitmap.
       (Bug #26333981)

     * ndbinfo Information Database: Counts of committed rows
       and committed operations per fragment used by some tables
       in ndbinfo were taken from the DBACC block, but due to
       the fact that commit signals can arrive out of order,
       transient counter values could be negative. This could
       happen if, for example, a transaction contained several
       interleaved insert and delete operations on the same row;
       in such cases, commit signals for delete operations could
       arrive before those for the corresponding insert
       operations, leading to a failure in DBACC.
       This issue is fixed by using the counts of committed rows
       which are kept in DBTUP, which do not have this problem.
       (Bug #88087, Bug #26968613)

     * Errors in parsing NDB_TABLE modifiers could cause memory
       leaks. (Bug #26724559)

     * Added DUMP code 7027 to facilitate testing of issues
       relating to local checkpoints. For more information, see
       DUMP 7027
(http://dev.mysql.com/doc/ndb-internals/en/ndb-internals-dump-command-7027.html).
       (Bug #26661468)

     * A previous fix intended to improve logging of node
       failure handling in the transaction coordinator included
       logging of transactions that could occur in normal
       operation, which made the resulting logs needlessly
       verbose. Such normal transactions are no longer written
       to the log in such cases. (Bug #26568782)
       References: This issue is a regression of: Bug #26364729.

     * Due to a configuration file error, CPU locking capability
       was not available on builds for Linux platforms. (Bug
       #26378589)

     * Some DUMP codes used for the LGMAN kernel block were
       incorrectly assigned numbers in the range used for codes
       belonging to DBTUX. These have now been assigned symbolic
       constants and numbers in the proper range (10001, 10002,
       and 10003). (Bug #26365433)

     * Node failure handling in the DBTC kernel block consists
       of a number of tasks which execute concurrently, and all
       of which must complete before TC node failure handling is
       complete. This fix extends logging coverage to record
       when each task completes and which tasks remain, and
       includes the following improvements:

          + Handling of interactions between GCP and node failure
            handling, in which TC takeover causes
            GCP participant stall at the master TC to allow it
            to extend the current GCI with any transactions that
            were taken over; the stall can begin and end in
            different GCP protocol states. Logging coverage is
            extended to cover all scenarios. Debug logging is
            now more consistent and understandable to users.

          + Logging done by the QMGR block as it monitors the
            duration of node failure handling is done
            more frequently. A warning log is now generated
            every 30 seconds (instead of 1 minute), and this now
            includes DBDIH block debug information (formerly
            this was written separately, and less often).

          + To reduce space used, DBTC instance number: is
            shortened to DBTC number:.

          + A new error code is added to assist testing.
       (Bug #26364729)

     * During a restart, DBLQH loads redo log part metadata for
       each redo log part it manages, from one or more redo log
       files. Since each file has a limited capacity for
       metadata, the number of files which must be consulted
       depends on the size of the redo log part. These files are
       opened, read, and closed sequentially, but the closing of
       one file occurs concurrently with the opening of the
       next.
       In cases where closing of the file was slow, it was
       possible for more than 4 files per redo log part to be
       open concurrently; since these files were opened using
       the OM_WRITE_BUFFER option, more than 4 chunks of write
       buffer were allocated per part in such cases. The write
       buffer pool is not unlimited; if all redo log parts were
       in a similar state, the pool was exhausted, causing the
       data node to shut down.
       This issue is resolved by avoiding the use of
       OM_WRITE_BUFFER during metadata reload, so that any
       transient opening of more than 4 redo log files per log
       file part no longer leads to failure of the data node.
       (Bug #25965370)

     * A join entirely within the materialized part of a
       semi-join was not pushed even if it could have been. In
       addition, EXPLAIN provided no information about why the
       join was not pushed. (Bug #88224, Bug #27022925)
       References: See also: Bug #27067538.

     * When the duplicate weedout algorithm was used for
       evaluating a semi-join, the result had missing rows. (Bug
       #88117, Bug #26984919)
       References: See also: Bug #87992, Bug #26926666.

     * A table used in a loose scan could be used as a child in
       a pushed join query, leading to possibly incorrect
       results. (Bug #87992, Bug #26926666)

     * When representing a materialized semi-join in the query
       plan, the MySQL Optimizer inserted extra QEP_TAB and
       JOIN_TAB objects to represent access to the materialized
       subquery result. The join pushdown analyzer did not
       properly set up its internal data structures for these,
       leaving them uninitialized instead. This meant that later
       usage of any item objects referencing the materialized
       semi-join accessed an uninitialized tableno column when
       accessing a 64-bit tableno bitmask, possibly referring to
       a point beyond its end, leading to an unplanned shutdown
       of the SQL node. (Bug #87971, Bug #26919289)

     * In some cases, a SCAN_FRAGCONF signal was received after
       a SCAN_FRAGREQ with a close flag had already been sent,
       clearing the timer. When this occurred, the next
       SCAN_FRAGREF to arrive caused time tracking to fail. Now
       in such cases, a check for a cleared timer is performed
       prior to processing the SCAN_FRAGREF message. (Bug
       #87942, Bug #26908347)

     * While deleting an element in Dbacc, or moving it during
       hash table expansion or reduction, the method used
       (getLastAndRemove()) could return a reference to a
       removed element on a released page, which could later be
       referenced from the functions calling it. This was due to
       a change brought about by the implementation of dynamic
       index memory in NDB 7.6.2; previously, the page had
       always belonged to a single Dbacc instance, so accessing
       it was safe. This was no longer the case following the
       change; a page released in Dbacc could be placed directly
       into the global page pool where any other thread could
       then allocate it.
       Now we make sure that newly released pages in Dbacc are
       kept within the current Dbacc instance and not given over
       directly to the global page pool. In addition, the
       reference to a released page has been removed; the
       affected internal method now returns the last element by
       value, rather than by reference. (Bug #87932, Bug
       #26906640)
       References: See also: Bug #87987, Bug #26925595.

     * The DBTC kernel block could receive a TCRELEASEREQ signal
       in a state for which it was unprepared. Now in such cases
       it responds with a TCRELEASECONF message, and
       subsequently behaves just as if the API connection had
       failed. (Bug #87838, Bug #26847666)
       References: See also: Bug #20981491.

     * When a data node was configured for locking threads to
       CPUs, it failed during startup with Failed to lock tid.
       This was a side effect of a fix for a previous issue,
       which disabled CPU locking based on the version of the
       available glibc. The specific glibc issue being guarded
       against is encountered only in response to an internal
       NDB API call (Ndb_UnlockCPU()) not used by data nodes
       (and which can be accessed only through internal API
       calls). The current fix enables CPU locking for data
       nodes and disables it only for the relevant API calls
       when an affected glibc version is used. (Bug #87683, Bug
       #26758939)
       References: This issue is a regression of: Bug #86892,
       Bug #26378589.

     * ndb_top failed to build on platforms where the ncurses
       library did not define stdscr. Now these platforms
       require the tinfo library to be included. (Bug #87185,
       Bug #26524441)

     * On completion of a local checkpoint, every node sends a
       LCP_COMPLETE_REP signal to every other node in the
       cluster; a node does not consider the LCP complete until
       it has been notified that all other nodes have sent this
       signal. Due to a minor flaw in the LCP protocol, if this
       message was delayed from a node other than the
       master, it was possible to start the next LCP before one
       or more nodes had completed the one ongoing; this caused
       problems with LCP_COMPLETE_REP signals from previous LCPs
       becoming mixed up with such signals from the current LCP,
       which in turn led to node failures.
       To fix this problem, we now ensure that the previous LCP
       is complete before responding to any TCGETOPSIZEREQ
       signal initiating a new LCP. (Bug #87184, Bug #26524096)
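
       The ordering rule behind this fix can be pictured with a
       toy model. The signal names below follow the text above,
       but the class LcpTracker and its methods are hypothetical
       and greatly simplified; this is not the actual LCP
       protocol code.

```python
# Toy model of LCP completion tracking; signal names follow the release notes,
# but the class and its methods are hypothetical.


class LcpTracker:
    def __init__(self, node_ids):
        self.node_ids = set(node_ids)
        self.reported = set()  # nodes whose LCP_COMPLETE_REP has arrived

    def on_lcp_complete_rep(self, node_id):
        """Record a completion report from a known node."""
        if node_id in self.node_ids:
            self.reported.add(node_id)

    def lcp_complete(self):
        """The LCP is complete only when every node has reported."""
        return self.reported == self.node_ids

    def on_tcgetopsizereq(self):
        """Answer a request that starts a new LCP only if the old one is done."""
        if not self.lcp_complete():
            return False  # hold the response: the previous LCP is still ongoing
        self.reported.clear()  # start counting reports for the next LCP
        return True


tracker = LcpTracker([1, 2, 3])
tracker.on_lcp_complete_rep(1)
tracker.on_lcp_complete_rep(2)
held = tracker.on_tcgetopsizereq()      # node 3 has not reported yet
tracker.on_lcp_complete_rep(3)
answered = tracker.on_tcgetopsizereq()  # all nodes have reported
print(held, answered)
```

       Because the response is withheld until every report for
       the previous LCP has arrived, reports from two different
       LCPs cannot be mixed in this model.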

     * NDB Cluster did not compile successfully when the build
       used WITH_UNIT_TESTS=OFF. (Bug #86881, Bug #26375985)

     * Recent improvements in local checkpoint handling that use
       OM_CREATE to open files did not work correctly on Windows
       platforms, where the system tried to create a new file
       and failed if it already existed. (Bug #86776, Bug
       #26321303)

     * A potential hundredfold signal fan-out when sending a
       START_FRAG_REQ signal could lead to a node failure due to
       a job buffer full error in start phase 5 while trying to
       perform a local checkpoint during a restart. (Bug #86675,
       Bug #26263397)
       References: See also: Bug #26288247, Bug #26279522.

     * Compilation of NDB Cluster failed when using
       -DWITHOUT_SERVER=1 to build only the client libraries.
       (Bug #85524, Bug #25741111)

     * The NDBFS block's OM_SYNC flag is intended to make sure
       that all FSWRITEREQ signals used for a given file are
       synchronized, but was ignored by platforms that do not
       support O_SYNC, meaning that this feature did not behave
       properly on those platforms. Now the synchronization flag
       is used on those platforms that do not support O_SYNC.
       (Bug #76975, Bug #21049554)

On Behalf of Oracle/MySQL Release Engineering Team
Balasubramanian Kandasamy



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql
