Author: frm
Date: Tue Dec 13 09:47:32 2016
New Revision: 1773935

URL: http://svn.apache.org/viewvc?rev=1773935&view=rev
Log:
OAK-5274 - Document CLI tools for oak-segment-tar

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md

Modified: 
jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md?rev=1773935&r1=1773934&r2=1773935&view=diff
==============================================================================
--- 
jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md 
(original)
+++ 
jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/segment/overview.md 
Tue Dec 13 09:47:32 2016
@@ -41,6 +41,14 @@ Oak Segment Tar is an implementation of
             * [What happened during cleanup?](#what-happened-during-cleanup)
         * [Monitoring via JMX](#monitoring-via-jmx)
             * 
[SegmentRevisionGarbageCollection](#SegmentRevisionGarbageCollection)
+* [Tools](#tools)
+    * [Backup](#backup)
+    * [Restore](#restore)
+    * [Check](#check)
+    * [Compact](#compact)
+    * [Debug](#debug)
+    * [Diff](#diff)
+    * [History](#history)
 * [Design](#design)
 
 ## <a name="garbage-collection"/> Garbage Collection
@@ -463,6 +471,164 @@ If garbage collection is not running, th
 Start garbage collection.
 If garbage collection is already running, this operation has no effect.
 
+## <a name="tools"/> Tools
+
+Oak Segment Tar exposes a number of command line tools that can be used to 
perform different tasks on the repository.
+
+The tools are exposed as sub-commands of [Oak 
Run](https://github.com/apache/jackrabbit-oak/tree/trunk/oak-run).
+The following sections assume that you have built this module or that you have 
a compiled version of it.
+
+### <a name="backup"/> Backup
+
+```
+java -jar oak-run.jar backup ORIGINAL BACKUP 
+```
+
+The `backup` tool performs a backup of a Segment Store `ORIGINAL` and saves it 
to the folder `BACKUP`. 
+`ORIGINAL` must be the path to an existing, valid Segment Store.
+`BACKUP` must be a valid path to a folder on the file system.
+If `BACKUP` doesn't exist, it will be created.
+If `BACKUP` exists, it must be a path to an existing, valid Segment Store.
+
+The tool assumes that the `ORIGINAL` Segment Store doesn't use an external 
Blob Store.
+If this is the case, it's necessary to set the `oak.backup.UseFakeBlobStore` 
system property to `true` on the command line as shown below.
+
+```
+java -Doak.backup.UseFakeBlobStore=true -jar oak-run.jar backup ...
+```
+
+When a backup is performed, if `BACKUP` points to an existing Segment Store, 
only the content that is different from `ORIGINAL` is copied.
+This is similar to an incremental backup performed at the level of the content.
+When an incremental backup is performed, the tool will automatically try to 
cleanup eventual garbage from the `BACKUP` Segment Store.
+
+### <a name="restore"/> Restore
+
+```
+java -jar oak-run.jar restore ORIGINAL BACKUP
+```
+
+The `restore` tool restores the state of the `ORIGINAL` Node Store from a 
previous backup `BACKUP`. 
+This tool is the counterpart of `backup`.
+
+### <a name="check"/> Check
+
+```
+java -jar oak-run.jar check --path PATH [--journal JOURNAL] [--deep [SECS]] 
[--bin [LENGTH]]
+```
+
+The `check` tool inspects an existing Segment Store at `PATH` for eventual 
inconsistencies. 
+The algorithm implemented by this tool traverses every revision in the 
journal, from the most recent to the oldest.
+For every revision, the actual nodes and properties are traversed, verifying 
that every piece of data is reachable and undamaged.
+  
+If the `--journal` option is specified, the tool will use the journal file at 
`JOURNAL` instead of picking up the one contained in `PATH`. 
+`JOURNAL` must be a path to a valid journal file for the Segment Store. 
+
+If the `--deep` option is specified, the tool will perform a deep scan of the 
content tree, traversing every node.
+The optional argument `SECS` is the number of seconds between progress 
information messages.
+If not specified, `SECS` defaults to positive infinity, effectively disabling 
progress information messages.
+If `SECS` is specified to be `0`, every progress information message is 
printed.
+
+If the `--bin` option is specified, the tool will scan the content of binary 
properties, up to the specified length `LENGTH`.
+The default value for `LENGTH` is `0`, effectively disabling the traversal of 
binary properties.
+If `LENGTH` is set to a value greater than `0`, only the initial `LENGTH` 
bytes of binary properties are traversed.
+If `LENGTH` is set to `-1`, binary properties are fully traversed.
+The `--bin` property has no effect on binary properties stored in an external 
Blob Store.
+
+### <a name="compact"/> Compact
+
+```
+java -jar oak-run.jar compact PATH
+```
+
+The `compact` command performs offline compaction on the Segment Store at 
`PATH`. 
+`PATH` must be a valid path to an existing Segment Store. 
+
+### <a name="debug"/> Debug
+
+```
+java -jar oak-run.jar debug PATH
+java -jar oak-run.jar debug PATH ITEMS...
+```
+
+The `debug` command prints diagnostic information about a Segment Store or 
individual Segment Store items.
+
+`PATH` is mandatory and must be a valid path to an existing Segment Store. 
+If only the path is specified - as in the first example above - only general 
debugging information about the Segment Store are printed.
+ 
+`ITEMS` is a sequence of one or more TAR file name, segment ID, node record ID 
or range of node record ID.
+If one or more items are specified - as in the second example above - general 
debugging information about the segment store are not printed.
+Instead, detailed information about the specified items are shown.
+
+A TAR file is specified by its name.
+Every string in `ITEMS` ending in`.tar` is assumed to be a name of a TAR file.
+
+A segment ID is specified by its UUID representation, e.g. 
`333dc24d-438f-4cca-8b21-3ebf67c05856`.
+
+A node record ID is specified by a concatenation of a UUID and a record 
number, e.g. `333dc24d-438f-4cca-8b21-3ebf67c05856:12345`.
+The record ID must point to a valid node record.
+A node record ID can be optionally followed by path, like 
`333dc24d-438f-4cca-8b21-3ebf67c05856:12345/path/to/child`. 
+When a node record ID is provided, the tool will print information about the 
node record pointed by it.
+If a path is specified, the tool will additionally print information about 
every child node identified by that path.
+
+A node record ID range is specified by a pair of record IDs separated by a 
hyphen (`-`), e.g. 
`333dc24d-438f-4cca-8b21-3ebf67c05856:12345-46116fda-7a72-4dbc-af88-a09322a7753a:67890`.
+Both record IDs must point to valid node records.
+The pair of record IDs can be followed by a path, like 
`333dc24d-438f-4cca-8b21-3ebf67c05856:12345-46116fda-7a72-4dbc-af88-a09322a7753a:67890/path/to/child`.
+When a node record ID range is specified, the tool will perform a diff between 
the two nodes pointed by the record IDs, optionally following the provided path.
+The result of the diff will be printed in JSOP format.
+
+### <a name="diff"/> Diff
+
+```
+java -jar oak-run.jar tarmkdiff [--output OUTPUT] --list PATH
+java -jar oak-run.jar tarmkdiff [--output OUTPUT] [--incremental] [--path 
NODE] [--ignore-snfes] --diff REVS PATH
+```
+
+The `diff` command prints content diffs between revisions in the Segment Store 
at `PATH`.
+
+The `--output` option instructs the command to print its output to the file 
`OUTPUT`.
+If this option is not specified, the tool will print to a `.log` file 
augmented with the current timestamp.
+The default file will be saved in the current directory.
+
+If the `--list` option is specified, the command just prints a list of 
revisions available in the Segment Store.
+This is equivalent to the first command line specification in the example 
above.
+
+If the `--list` option is not specified, `tarmkdiff` prints one or more 
content diff between a pair of revisions.
+In this case, the command line specification is the second in the example 
above.
+
+The `--diff` option specifies an interval of revisions `REVS`.
+The interval is specified by a couple of revisions separated by two dots, e.g. 
`333dc24d-438f-4cca-8b21-3ebf67c05856:12345..46116fda-7a72-4dbc-af88-a09322a7753a:67890`.
+In place of any of the two revisions, the placeholder `head` can be used.
+The `head` placeholder is substituted (in a case-insensitive way) to the most 
recent revision in the Segment Store.
+ 
+The `--path` option can be used to restrict the diff to a portion of the 
content tree.
+The value `NODE` must be a valid path in the content tree.
+
+If the flag `--incremental` is specified, the output will contain an 
incremental diff between every pair of successive revisions occurring in the 
interval specified with `--diff`.
+This parameter is useful if you are interested in every change in content 
between every commit that happened in a specified range. 
+
+The `--ignore-snfes` flag can be used in combination with `--incremental` to 
ignore errors that might occur while generating the incremental diff because of 
damaged or too old content.
+If this flag is not specified and an error occurs while generating the 
incremental diff, the tool stops immediately and reports the error.
+
+### <a name="history"/> History
+
+```
+java -jar oak-run.jar history [--journal JOURNAL] [--path NODE] [--depth 
DEPTH] PATH
+```
+
+The `history` command shows how the content of a node or of a sub-tree changed 
over time in the Segment Store at `PATH`.
+
+The history of the node is computed based on the revisions reported by the 
journal in the Segment Store.
+If a different set of revisions needs to be used, it is possible to specify a 
custom journal file by using the `--journal` option.
+If this option is used, `JOURNAL` must be a path to a valid journal file.
+
+The `--path` parameter specifies the node whose history will be printed. 
+If not specified, the history of the root node will be printed.
+`NODE` must be a valid path to a node in the Segment Store.
+
+The `--depth` parameter determines if the content of a single node should be 
printed, or if the content of the sub-tree rooted at that node should be 
printed instead.
+`DEPTH` must be a positive integer specifying how deep the printed content 
should be.
+If this option is not specified, the depth is assumed to be `0`, i.e. only 
information about the node will be printed.
+
 ## <a name="design"/> Design
 
 The Segment Node Store serializes repository data and stores it in a set of 
TAR files.


Reply via email to