Author: amitj
Date: Thu Jul 21 06:24:56 2016
New Revision: 1753641
URL: http://svn.apache.org/viewvc?rev=1753641&view=rev
Log:
OAK-301: Document Oak
Documentation for blob GC #checkConsistency and #getGlobalGCStatus
Modified:
jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
URL:
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md?rev=1753641&r1=1753640&r2=1753641&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
(original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md Thu Jul
21 06:24:56 2016
@@ -144,6 +144,9 @@ The garbage collection can be triggered
* `MarkSweepGarbageCollector#collectGarbage()` (Oak 1.0.x)
* `MarkSweepGarbageCollector#collectGarbage(false)` (Oak 1.2.x)
+* If the MBeans are registered in the MBeanServer then the following can also
be used to trigger GC:
+ * `BlobGC#startBlobGC()` which takes in a `markOnly` boolean parameter to
indicate mark only or complete gc
+
#### Shared DataStore Blob Garbage Collection (Since 1.2.0)
@@ -175,6 +178,105 @@ The shared DataStore garbage collection
* SharedS3DataStore - Extends the S3DataStore to enable sharing of the data
store with
multiple repositories
+##### Checking GC status for Shared DataStore Garbage Collection
+
+The status of the GC operations on all the repositories connected to the
DataStore can be checked by calling:
+
+* `MarkSweepGarbageCollector#getStats()` which returns a list of
`GarbageCollectionRepoStats` objects having the following fields:
+ * repositoryId - The repositoryId of the repository
+ * local - Indicates whether the repositoryId is of local instance where
the operation ran
+ * startTime - Start time of the mark operation on the repository
+ * endTime - End time of the mark operation on the repository
+ * length - Size of the references file created
+ * numLines - Number of references available
+* If the MBeans are registered in the MBeanServer then the following can also
be used to retrieve the status:
+ * `BlobGC#getBlobGCStatus()` which returns a CompositeData with the above
fields.
+
+This operation can also be used to ascertain when the 'Mark' phase has
executed successfully on all the repositories, as part of the steps to automate
the GC in the Shared DataStore configuration.
+It should be a sufficient condition to check that the references file is
available on all repositories.
+If the server running Oak has remote JMX connection enabled the following code
example can be used to connect remotely and check if the mark phase has
concluded on all repository instances.
+
+
+```java
+import java.util.Hashtable;
+
+import javax.management.openmbean.TabularData;
+import javax.management.MBeanServerConnection;
+import javax.management.MBeanServerInvocationHandler;
+import javax.management.ObjectName;
+import javax.management.remote.JMXConnectorFactory;
+import javax.management.remote.JMXServiceURL;
+import javax.management.openmbean.CompositeData;
+
+
+/**
+ * Checks the status of the mark operation on all instances sharing the
DataStore.
+ */
+public class GetGCStats {
+
+ public static void main(String[] args) throws Exception {
+ String userid = "<user>";
+ String password = "<password>";
+ String serverUrl = "service:jmx:rmi:///jndi/rmi://<host:port>/jmxrmi";
+ String OBJECT_NAME = "org.apache.jackrabbit.oak:name=Document node
store blob garbage collection,type=BlobGarbageCollection";
+ String[] buffer = new String[] {userid, password};
+ Hashtable<String, String[]> attributes = new Hashtable<String,
String[]>();
+ attributes.put("jmx.remote.credentials", buffer);
+ MBeanServerConnection server = JMXConnectorFactory
+ .connect(new JMXServiceURL(serverUrl),
attributes).getMBeanServerConnection();
+ ObjectName name = new ObjectName(OBJECT_NAME);
+ BlobGCMBean gcBean = MBeanServerInvocationHandler
+ .newProxyInstance(server, name, BlobGCMBean.class, false);
+
+ boolean markDone = checkMarkDone("GlobalMarkStats",
gcBean.getGlobalMarkStats());
+ System.out.println("Mark done on all instances - " + markDone);
+ }
+
+ public static boolean checkMarkDone(String operation, TabularData data) {
+ System.out.println("-----Operation " + operation + "--------------");
+
+ boolean markDoneOnOthers = true;
+ try {
+ System.out.println("Number of instances " + data.size());
+
+ for (Object o : data.values()) {
+ CompositeData row = (CompositeData) o;
+ String repositoryId = row.get("repositoryId").toString();
+ System.out.println("Repository " + repositoryId);
+
+ if ((!row.containsKey("markEndTime")
+ || row.get("markEndTime") == null
+ || row.get("markEndTime").toString().length() == 0)) {
+ markDoneOnOthers = false;
+ System.out.println("Mark not done on repository : " +
repositoryId);
+ }
+ }
+ } catch (Exception e) {
+ System.out.println(
+ "-----Error during operation " + operation + "--------------"
+ e.getMessage());
+ }
+ System.out.println("-----Completed " + operation + "--------------");
+
+ return markDoneOnOthers;
+ }
+}
+```
+
+#### Consistency Check
+The data store consistency check will report any data store binaries that are
missing but are still referenced. The consistency check can be triggered by:
+
+* `MarkSweepGarbageCollector#checkConsistency`
+* If the MBeans are registered in the MBeanServer then the following can also
be used:
+ * `BlobGCMbean#checkConsistency`
+
+After the consistency check is complete, a message will show the number of
binaries reported as missing. If the number is greater than 0, check the logs
configured for
`org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector` for more
details on the missing binaries.
+
+Below is an example of how the missing binaries are reported in the logs:
+>
+> 11:32:39.673 INFO [main] MarkSweepGarbageCollector.java:600 Consistency
check found [1] missing blobs
+> 11:32:39.673 WARN [main] MarkSweepGarbageCollector.java:602 Consistency
check failure in the the blob store : DataStore backed BlobStore
[org.apache.jackrabbit.oak.plugins.blob.datastore.OakFileDataStore], check
missing candidates in file /tmp/gcworkdir-1467352959243/gccand-1467352959243
+
+
[1]:
http://serverfault.com/questions/52861/how-does-dropbox-version-upload-large-files
[2]: http://wiki.apache.org/jackrabbit/DataStore