This is an automated email from the ASF dual-hosted git repository.
dmagda pushed a commit to branch IGNITE-7595
in repository https://gitbox.apache.org/repos/asf/ignite.git
The following commit(s) were added to refs/heads/IGNITE-7595 by this push:
new ea7d8f2 Restructured and update the snapshotting docs.
ea7d8f2 is described below
commit ea7d8f217b4b8d1932d2acc173b6cad683d3e335
Author: Denis Magda <[email protected]>
AuthorDate: Thu Sep 17 12:52:41 2020 -0700
Restructured and update the snapshotting docs.
---
docs/_data/toc.yaml | 4 +-
docs/_docs/code-snippets/xml/snapshots.xml | 2 +-
docs/_docs/persistence/snapshot.adoc | 198 -----------------------------
docs/_docs/persistence/snapshots.adoc | 188 +++++++++++++++++++++++++++
4 files changed, 192 insertions(+), 200 deletions(-)
diff --git a/docs/_data/toc.yaml b/docs/_data/toc.yaml
index 9aad776..836032d 100644
--- a/docs/_data/toc.yaml
+++ b/docs/_data/toc.yaml
@@ -25,7 +25,7 @@
url: /installation/installing-using-zip
- title: Installing Using Docker
url: /installation/installing-using-docker
- - title: Installing DEB or RPM package
+ - title: Installing DEB or RPM package
url: /installation/deb-rpm
- title: Kubernetes
items:
@@ -103,6 +103,8 @@
url: /persistence/swap
- title: Implementing Custom Cache Store
url: /persistence/custom-cache-store
+ - title: Cluster Snapshots
+ url: /persistence/snapshots
- title: Disk Compression
url: /persistence/disk-compression
- title: Tuning Persistence
diff --git a/docs/_docs/code-snippets/xml/snapshots.xml
b/docs/_docs/code-snippets/xml/snapshots.xml
index beeea0c..b41dedd 100644
--- a/docs/_docs/code-snippets/xml/snapshots.xml
+++ b/docs/_docs/code-snippets/xml/snapshots.xml
@@ -4,7 +4,7 @@
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<!--
Sets a path to the root directory where snapshot files will be
persisted.
- By default, the relative `snapshots` directory is used in the
IGNITE_HOME/db
+ By default, the `snapshots` directory is placed under the
`IGNITE_HOME/db`.
-->
<property name="snapshotPath" value="/snapshots"/>
diff --git a/docs/_docs/persistence/snapshot.adoc
b/docs/_docs/persistence/snapshot.adoc
deleted file mode 100644
index 49fc0f1..0000000
--- a/docs/_docs/persistence/snapshot.adoc
+++ /dev/null
@@ -1,198 +0,0 @@
-= Snapshots and recovery
-
-== Overview
-
-Apache Ignite 2.9 comes with an ability to create fully consistent
cluster-wide snapshots for deployments with
-the link:persistence/native-persistence[Ignite Native Persistence]. At
runtime, you can create multiple snapshots of all
-data stored in your cluster. The snapshot is a consistent copy of all cache
data files (except
-structures used for crash recovery) for each node in a cluster. Since data of
caches are stored
-on disk in files for each node (cache group partition files, configuration
files, binary metadata) in a cluster,
-the snapshot will contain a copy of the same files with keeping Ignite cluster
node data directory structure and node consistent IDs.
-
-=== Snapshot Consistency
-
-All snapshots you've created are fully consistent in terms of concurrent
cluster-wide operations and all ongoing changes of
-system files on the local node. Primary and backup cache group partitions will
also be fully consistent in created
-snapshots.
-
-The cluster-wide snapshot consistency is achieved by triggering the
-link:https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood[Partition-Map-Exchange]
-procedure. Doing this the cluster will eventually get a point in time when all
previously started transactions are
-finished on primary and backups, and new ones are hold until a new snapshot
operation is initiated.
-
-The local system files (e.g. cache group partition files, binary metadata
files, configuration files) consistency is achieved
-by copying them to the destination snapshot directory with tracking all
concurrent ongoing changes. Tracking concurrent
-changes during copying of cache group partition files might require additional
space in the Ignite work directory.
-
-=== Snapshot Structure
-
-The created snapshot has the same
-link:https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStoreunderthehood-FoldersStructure[Directory
Structure]
-as the Ignite native persistence does with keeping nodes `consistentId` in the
snapshot directory. The `wal` and `checkpoint`
-directories will be excluded from the snapshot since recovery procedures are
not necessary for cache group data files.
-
-The created snapshot contains:
-
-- cache group partition files;
-- cache configuration files related to cache groups;
-- binary metadata files;
-- marshaller data files;
-
-.Example of snapshot directory structure
-[source,shell]
-----
-work
-└── snapshots
- └── backup23012020
- ├── binary_meta
- │ ├── snapshot_IgniteClusterSnapshotSelfTest0
- │ ├── snapshot_IgniteClusterSnapshotSelfTest1
- │ └── snapshot_IgniteClusterSnapshotSelfTest2
- ├── db
- │ ├── snapshot_IgniteClusterSnapshotSelfTest0
- │ │ └── cache-txCache
- │ │ ├── cache_data.dat
- │ │ ├── part-3.bin
- │ │ ├── part-4.bin
- │ │ └── part-6.bin
- │ ├── snapshot_IgniteClusterSnapshotSelfTest1
- │ │ └── cache-txCache
- │ │ ├── cache_data.dat
- │ │ ├── part-1.bin
- │ │ ├── part-5.bin
- │ │ └── part-7.bin
- │ └── snapshot_IgniteClusterSnapshotSelfTest2
- │ └── cache-txCache
- │ ├── cache_data.dat
- │ ├── part-0.bin
- │ └── part-2.bin
- └── marshaller
-----
-
-=== Limitations
-
-The snapshot procedure has some limitations that you should be aware of before
using it within your production environment:
-
-* taking a snapshot of a particular cache or cache group is not supported;
-* in-memory caches will not be included into a snapshot;
-* encrypted caches are not supported and will be ignored;
-* only one snapshot operation at a time can be initiated;
-* snapshot procedure will be interrupted if any node leaves the cluster;
-* snapshot may be restored only at the same baseline topology with the same
node consistent IDs;
-* the automatic snapshot restore is not available yet, you must copy files
manually;
-
-== Configuration
-
-A configured Ignite native persistence directory requires additional disk
space (up to the current size of this directory).
-This space will be used for storing intermediate snapshot results and cleaned
up after the snapshot operation is completed.
-
-The destination snapshot directory can be configured via `IgniteConfiguration`.
-
-[tabs]
---
-tab:XML[]
-
-[source, xml]
-----
-include::code-snippets/xml/snapshots.xml[tags=ignite-config;!discovery,
indent=0]
-
-----
-
-tab:Java[]
-
-[source, java]
-----
-include::{javaCodeDir}/Snapshots.java[tags=config, indent=0]
-
-----
-
-tab:C#/.NET[]
-tab:C++[]
---
-
-== Snapshot creation
-
-Ignite provides the ability to start a snapshot operation from the following
interfaces:
-
-- link:#command-line[Command line]
-- link:#jmx[JMX]
-- link:#java-api[Java API]
-
-=== Command line
-
-Ignite ships `control.(sh|bat)` scripts, located in the `$IGNITE_HOME/bin`
directory, that act like a tool to
-start or cancel snapshot operation from the command line. The following
commands can be used with `control.(sh|bat)`:
-
-[source,shell]
-----
-#Create cluster snapshot:
-control.(sh|bat) --snapshot create snapshot_name
-
-#Cancel running snapshot:
-control.(sh|bat) --snapshot cancel snapshot_name
-
-#Kill running snapshot:
-control.(sh|bat) --kill SNAPSHOT snapshot_name
-----
-
-=== JMX
-
-You can start snapshot operation or cancel currently running one via the
`SnapshotMXBean` interface:
-
-[cols="1,1",opts="header"]
-|===
-|Method | Description
-|createSnapshot(String snpName) | Create cluster-wide snapshot.
-|cancelSnapshot(String snpName) | Cancel started cluster-wide snapshot on the
node initiator.
-|===
-
-=== Java API
-
-The snapshot operation can be started using Java API:
-
-[tabs]
---
-tab:Java[]
-
-[source, java]
-----
-include::{javaCodeDir}/Snapshots.java[tags=create, indent=0]
-----
---
-
-== Restoring from a snapshot (manual)
-
-[NOTE]
-Removing data from the $IGNITE_HOME/work directory performed at your own risk.
-
-Currently, the data restore procedure is fully manual. Since the snapshot is a
consistent copy of all data files for each node,
-in order to recover from the particular snapshot, you need to stop the cluster
and start every cluster node swapping its
-current data from the `{IGNITE_WORK_DIR}/db` directory with the data from the
snapshot
-(see link:#snapshot-structure[Snapshot Structure] for details). Also, you
should take into account if the `storagePath`
-has been overridden by your DataStorageConfiguration object
-(see
link:persistence/native-persistence#configuring-persistent-storage-directory[Configuring
Persistent Storage Directory]).
-
-To restore a cluster from a snapshot, the steps below should be performed.
-
-- Stop the cluster you are intending to restore;
-- Remove files related to cluster-recovery procedure: `wal` and `cp`
directories;
-- Locate the snapshot you've created by name;
-
-After the cluster has been stopped for each node id in the cluster you should
do the following:
-
-- Remove all the data from `$IGNITE_HOME/work/{node_id}` where `{node_id}` is
a consistent id of a node you work with;
-- Copy all snapshot files from the snapshot related to `{node_id}` to the
$IGNITE_HOME/work/{node_id} directory;
-
-*Restore on different baseline*
-
-Some use cases at production deployments may require taking a snapshot at the
m-node cluster and applying it to the
-n-node cluster. Here are some options:
-
-[cols="1,1",opts="header"]
-|===
-|Condition | Description
-|m == n | The *recommended* case. Use the same baseline and configuration.
-|m < n | Start the m-node cluster from the snapshot and add additional nodes
to the started baseline. The process
-will require the cluster to be rebalanced and SQL indexes to be rebuilt. These
operations may take a long time.
-|m > n | Currently, it is not supported.
-|===
diff --git a/docs/_docs/persistence/snapshots.adoc
b/docs/_docs/persistence/snapshots.adoc
new file mode 100644
index 0000000..5f3292d
--- /dev/null
+++ b/docs/_docs/persistence/snapshots.adoc
@@ -0,0 +1,188 @@
+= Cluster Snapshots
+
+== Overview
+
+Ignite provides the ability to create full cluster snapshots for deployments
using
+link:persistence/native-persistence[Ignite Persistence]. An Ignite snapshot
includes a consistent cluster-wide copy of
+all data records persisted on disk and some other files needed for the recovery
procedure.
+
+The snapshot structure is similar to the layout of the
+link:persistence/native-persistence#configuring-persistent-storage-directory[Ignite
Persistence storage directory],
+with several exceptions. Let's take this snapshot as an example to review the
structure:
+[source,shell]
+----
+work
+└── snapshots
+ └── backup23012020
+ ├── binary_meta
+ │ ├── node1
+ │ ├── node2
+ │ └── node3
+ ├── db
+ │ ├── node1
+ │ │ └── my-sample-cache
+ │ │ ├── cache_data.dat
+ │ │ ├── part-3.bin
+ │ │ ├── part-4.bin
+ │ │ └── part-6.bin
+ │ ├── node2
+ │ │ └── my-sample-cache
+ │ │ ├── cache_data.dat
+ │ │ ├── part-1.bin
+ │ │ ├── part-5.bin
+ │ │ └── part-7.bin
+ │ └── node3
+ │ └── my-sample-cache
+ │ ├── cache_data.dat
+ │ ├── part-0.bin
+ │ └── part-2.bin
+ └── marshaller
+----
+* The snapshot is located under the `work/snapshots` directory and is named
`backup23012020`, where `work` is Ignite's work
+directory.
+* The snapshot is created for a 3-node cluster with all the nodes running on
the same machine. In this example,
+the nodes are named `node1`, `node2`, and `node3`, while in practice, the
names are equal to the nodes'
+link:https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStoreunderthehood-SubfoldersGeneration[consistent
IDs].
+* The snapshot keeps a copy of the `my-sample-cache` cache.
+* The `db` folder keeps a copy of data records in `part-N.bin` and
`cache_data.dat` files. Write-ahead log and checkpoint files
+are not added to the snapshot because they are not required for the
current recovery procedure.
+* The `binary_meta` and `marshaller` directories store metadata and
marshaller-specific information needed for the recovery.
+
+[NOTE]
+====
+[discrete]
+=== A Snapshot Is Usually Spread Across the Cluster
+
+The previous example shows a snapshot created for a cluster whose nodes all run on the
same physical machine, so the whole
+snapshot is located in a single place. In practice, the nodes usually run
on different machines, and the
+snapshot data is spread across the cluster. Each node keeps a segment of the
snapshot with the data belonging to this particular node.
+The link:persistence/snapshots#restoring-from-snapshot[restore procedure]
explains how to bring all the segments together during recovery.
+====
+
+== Configuring Snapshot Directory
+
+By default, a segment of the snapshot is stored in the work directory of the
respective Ignite node and uses the same storage
+media where Ignite Persistence keeps data, index, WAL, and other files. Since
the snapshot can consume as much space as
+already taken by the persistence files and can affect your applications'
performance by sharing the disk I/O with the
+Ignite Persistence routines, it's suggested to store the snapshot and
persistence files on different media.
+
+You can avoid this interference between Ignite Persistence and
snapshotting
+by either changing
link:persistence/native-persistence#configuring-persistent-storage-directory[storage
directories of the persistence files]
+or overriding the default snapshots' location as shown below:
+[tabs]
+--
+tab:XML[]
+[source, xml]
+----
+include::code-snippets/xml/snapshots.xml[tags=ignite-config;!discovery,
indent=0]
+----
+tab:Java[]
+[source, java]
+----
+include::{javaCodeDir}/Snapshots.java[tags=config, indent=0]
+----
+--
+
+== Creating Snapshot
+
+Ignite provides several APIs for snapshot creation. Let's review all the
options.
+
+=== Using Control Script
+
+Ignite ships a link:control-script[control script] that supports the
snapshot-related commands listed below:
+
+[source,shell]
+----
+#Create a cluster snapshot:
+control.(sh|bat) --snapshot create snapshot_name
+
+#Cancel a running snapshot:
+control.(sh|bat) --snapshot cancel snapshot_name
+
+#Kill a running snapshot:
+control.(sh|bat) --kill SNAPSHOT snapshot_name
+----
+
+=== Using JMX
+
+Use the `SnapshotMXBean` interface to perform the snapshot-specific procedures
via JMX:
+
+[cols="1,1",opts="header"]
+|===
+|Method | Description
+|createSnapshot(String snpName) | Create a snapshot.
+|cancelSnapshot(String snpName) | Cancel a snapshot on the node that initiated its
creation.
+|===
+
+=== Using Java API
+
+Also, it's possible to create a snapshot programmatically in Java:
+
+[tabs]
+--
+tab:Java[]
+
+[source, java]
+----
+include::{javaCodeDir}/Snapshots.java[tags=create, indent=0]
+----
+--
+
+== Restoring From Snapshot
+
+Currently, the data restore procedure has to be performed manually. In a
nutshell, you need to stop the cluster,
+replace persistence data and other files with the data from the snapshot, and
restart the nodes. The detailed procedure
+looks as follows:
+
+. Stop the cluster you intend to restore
+. Do the following on each node:
+ - Remove all the files and directories under the
`$IGNITE_HOME/work/{node_id}` directory. Clean the
+link:persistence/native-persistence#configuring-persistent-storage-directory[`db/{node_id}`]
directory separately if
+it's not located under the Ignite `work` dir.
+ - Copy the files belonging to the node with the `{node_id}` from the
snapshot into the `$IGNITE_HOME/work/{node_id}` directory.
+If the `db/{node_id}` directory is not located under the Ignite `work` dir,
you also need to copy the data files there.
+. Restart the cluster
+
+*Restore On Cluster of Different Topology*
+
+Sometimes you might want to create a snapshot of an N-node cluster and use it
to restore on an M-node cluster. The table
+below explains what options are supported:
+
+[cols="1,1",opts="header"]
+|===
+|Condition | Description
+|N == M | The *recommended* case. Create and use the snapshot on clusters of the
same topology.
+|N < M | Start the first N nodes of the M-node cluster and apply the snapshot.
Add the rest of the M-cluster nodes to
+the topology and wait while the data gets rebalanced and indexes are rebuilt.
+|N > M | Unsupported.
+|===
+
+== Consistency Guarantees
+
+All snapshots are fully consistent in terms of concurrent cluster-wide
operations as well as ongoing changes to Ignite
+Persistence and other files on nodes.
+
+The cluster-wide snapshot consistency is achieved by triggering the
link:https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood[Partition-Map-Exchange]
+procedure. By doing that, the cluster will eventually get to the point in time
when all previously started transactions are completed, and new
+ones are paused. Once this happens, the cluster initiates the snapshot
creation procedure.
+
+The consistency between the primary Ignite Persistence files and their
snapshot copies is achieved by copying the primary
+files to the destination snapshot directory with tracking all concurrent
ongoing changes. The tracking of the changes
+might require extra space on your storage media.
+
+
+== Current Limitations
+
+The snapshot procedure has some limitations that you should be aware of before
using the feature in your production environment:
+
+* Snapshotting of specific caches/tables is unsupported. You always create a
full cluster snapshot.
+* Incremental snapshots are not supported.
+* Caches/tables that are not persisted in Ignite Persistence are not added
into the snapshot.
+* Encrypted caches are not included in the snapshot.
+* You can have only one snapshotting operation running at a time.
+* The snapshot procedure is interrupted if the cluster topology changes (new
nodes join the cluster or existing ones leave it).
+* A snapshot can be restored only on the same cluster topology with the same
node IDs.
+* The automatic restore procedure is not available yet. You have to restore the
snapshot manually.
+
+If any of these limitations prevent you from using Apache Ignite, consider the
alternative snapshotting implementations for
+Ignite provided by enterprise vendors.
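
As a rough illustration of the per-node step in the "Restoring From Snapshot" section of the new `snapshots.adoc`, the shell sketch below replaces one node's files with the copies kept in a snapshot. The helper name `restore_node` and its arguments are hypothetical and not part of Ignite. The directory layout is assumed to match the example tree in the Overview (`db/<node_id>`, `binary_meta/<node_id>`, and a shared `marshaller` directory). The sketch does not touch WAL directories or storage paths configured outside the work directory; those would need to be cleaned and copied separately, according to your configuration.

```shell
# Hypothetical helper, not shipped with Ignite: restores one stopped node
# from a snapshot, assuming the directory layout of the example tree.
set -eu

# Usage: restore_node WORK_DIR SNAPSHOT_DIR NODE_ID
restore_node() {
  rn_work_dir=$1
  rn_snapshot_dir=$2
  rn_node_id=$3

  # Per-node directories: <snapshot>/db/<node_id> and
  # <snapshot>/binary_meta/<node_id> replace their counterparts
  # under the work directory.
  for rn_sub in db binary_meta; do
    if [ -d "$rn_snapshot_dir/$rn_sub/$rn_node_id" ]; then
      rm -rf "${rn_work_dir:?}/$rn_sub/$rn_node_id"
      mkdir -p "$rn_work_dir/$rn_sub"
      cp -R "$rn_snapshot_dir/$rn_sub/$rn_node_id" "$rn_work_dir/$rn_sub/$rn_node_id"
    fi
  done

  # In the example tree the marshaller directory is not split per node,
  # so it is copied as a whole.
  if [ -d "$rn_snapshot_dir/marshaller" ]; then
    rm -rf "${rn_work_dir:?}/marshaller"
    cp -R "$rn_snapshot_dir/marshaller" "$rn_work_dir/marshaller"
  fi
}
```

With the cluster stopped, a call such as `restore_node "$IGNITE_HOME/work" "$IGNITE_HOME/work/snapshots/backup23012020" node1` would be repeated for every node ID before the cluster is restarted. Removing data from the work directory is destructive, so it is worth trying the sketch on a copy first.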