Hi,

I'm setting up a TarMK cold standby for a repository for the first time, and 
have a couple of questions regarding synchronization and administration.  I've 
included the configuration and current dump of the primary and standby MBeans 
below.  The primary and standby are in peered VPCs in AWS, using a shared S3 
bucket for blob storage.


1.) I'm curious as to how long I should expect to wait for the standby to 
establish synchronization.  How much data gets moved over the wire?  I'm seeing 
a steady stream of read cache invalidations on the standby - does this mean 
that all of the blob data must be transferred, even though the two repositories 
use shared storage?

2.) I see in the logs a period where there are read cache invalidations, and 
then there is a 12-hour period where nothing is logged, followed by a 
"org.apache.jackrabbit.oak.plugins.segment.standby.client.SegmentLoaderHandler 
timeout" message.  The quiet period is consistent with my setting 
standby.readtimeout=I"43200000".  Would it make sense to choose a shorter 
timeout to lessen the impact of occasional network issues?  At what point might 
the timeout value be "too short"?
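As a sanity check on the units, the short Python sketch below converts the configured value to hours. It assumes standby.readtimeout is expressed in milliseconds, which is consistent with the 12-hour quiet period seen in the logs; the variable names are illustrative only.

```python
from datetime import timedelta

# Value of standby.readtimeout from the standby config (assumed to be milliseconds).
read_timeout_ms = 43_200_000

read_timeout = timedelta(milliseconds=read_timeout_ms)
read_timeout_hours = read_timeout.total_seconds() / 3600

print(f"standby.readtimeout = {read_timeout_hours} hours")
```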

3.) Is there a definitive way to know that the standby is synced?  The 
SyncEndTimestamp value below converts to 2016-11-02T09:26:18+00:00, which 
matches the timestamp of the "SegmentLoaderHandler timeout" message exactly.  
This suggests that the value doesn't really tell me that the standby is 
synchronized.  When I tried with small repositories, synchronization appeared 
to be complete once the tarmk.log file started outputting the same repository 
head every 5 seconds (the "interval" setting).
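The timestamp conversion can be reproduced with a minimal Python sketch. It assumes the SyncStartTimestamp and SyncEndTimestamp values in the standby MBean are epoch milliseconds in UTC; the helper name epoch_ms_to_iso is made up for this example and is not part of Oak.

```python
from datetime import datetime, timezone


def epoch_ms_to_iso(ms: int) -> str:
    """Convert epoch milliseconds to an ISO 8601 UTC string, truncated to whole seconds."""
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc).isoformat(timespec="seconds")


# Values copied from the standby MBean dump.
sync_start = epoch_ms_to_iso(1478021232280)
sync_end = epoch_ms_to_iso(1478078778813)

print("SyncStartTimestamp:", sync_start)
print("SyncEndTimestamp:  ", sync_end)
```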

4.) Assuming that the standby eventually becomes synchronized, is there a 
documented procedure by which I could "split the mirror"; that is, convert the 
standby into a new, independent primary containing a replica of the original?  
If the current primary and standby are referring to S3 bucket "P", could I shut 
down both instances, copy the contents of bucket "P" to a new bucket "S", 
update the standby Oak S3 configuration to refer to the new bucket "S", and 
restart what was the standby as a new primary?  Are there other steps I would 
need to take?


Thanks!  John


CONFIG VALUES FOR BOTH INSTANCES


STANDBY CONFIG:


/var/lib/sling/install/install.standby/org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService.config:
org.apache.sling.installer.configuration.persist=B"false"
port=I"8023"
secure=B"true"
mode="standby"
primary.host="john-proto.dev"
interval=I"5"
standby.readtimeout=I"43200000"


PRIMARY CONFIG:


/var/lib/sling/install/install.primary/org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService.config:
org.apache.sling.installer.configuration.persist=B"false"
port=I"8023"
secure=B"true"
mode="primary"
primary.allowed-client-ip-ranges=["0.0.0.0-255.255.255.255"]


OAK S3 CONFIG:


/var/lib/sling/install/oak_s3/org.apache.jackrabbit.oak.plugins.blob.datastore.SharedS3DataStore.config:
accessKey=""
secretKey=""
s3Bucket="my-primary-bucket"
s3Region="us-west-2"
s3EndPoint="s3-us-west-2.amazonaws.com"
connectionTimeout="120000"
socketTimeout="120000"
maxConnections="40"
writeThreads="30"
maxErrorRetry="10"


JMX MBEANS


STANDBY:


#mbean = org.apache.jackrabbit.oak:id="fa2b9a7c-fc69-4a0c-aa7e-b0cfc61bd1c6",name=Status,type="Standby":
FailedRequests = 0;

SecondsSinceLastSuccess = 24269;

SyncStartTimestamp = 1478021232280;

SyncEndTimestamp = 1478078778813;

Status = running;

Running = true;

Mode = client: fa2b9a7c-fc69-4a0c-aa7e-b0cfc61bd1c6;


PRIMARY:

#mbean = org.apache.jackrabbit.oak:id=8023,name=Status,type="Standby":
Status = got message;

Running = true;

Mode = primary;

#mbean = org.apache.jackrabbit.oak:id="Client fa2b9a7c-fc69-4a0c-aa7e-b0cfc61bd1c6",name=Status,type="Standby":
RemotePort = 44322;

RemoteAddress = 10.16.12.44;

LastSeenTimestamp = Wed Nov 02 13:48:59 UTC 2016;

TransferredSegments = 186780;

TransferredSegmentBytes = 1198693232;

TransferredBinaries = 5579;

TransferredBinariesBytes = 170312256398;

LastRequest = b.678851bb77bec68db82c6bda37aca8e763d8a32e#655084301;

Name = fa2b9a7c-fc69-4a0c-aa7e-b0cfc61bd1c6;
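
To put the transfer counters above in more readable units, here is a small Python sketch. It assumes that TransferredSegmentBytes and TransferredBinariesBytes are plain byte counts, as their names suggest; the helper and variable names are illustrative only.

```python
GIB = 1024 ** 3  # bytes per gibibyte

# Counters copied from the primary's per-client MBean dump (assumed to be bytes).
counters = {
    "TransferredSegmentBytes": 1198693232,
    "TransferredBinariesBytes": 170312256398,
}


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


sizes_gib = {name: to_gib(value) for name, value in counters.items()}

# Fraction of all transferred bytes that were binaries rather than segments.
binary_share = counters["TransferredBinariesBytes"] / sum(counters.values())

for name, size in sizes_gib.items():
    print(f"{name}: {size:.2f} GiB")
print(f"Binaries as a share of transferred bytes: {binary_share:.1%}")
```

If the assumption about units holds, nearly all of the traffic so far has been binary data rather than segments, which is what prompts question 1.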
