Re: S3DataStore leverage Cross-Region Replication

2015-11-22 Thread Bruce Edge
Maybe not a bottleneck, but for large data sets, wouldn't it reduce the data 
transfer costs, since the inter-S3 bucket transfer would use AWS’ internal links 
rather than route over external interfaces?

-Bruce



> Has this been tested already? Generally, wdyt?
No. I suggest first testing a cross-geo mongo deployment with a single S3
bucket. There shouldn't be any functional issue in using a single S3 bucket.
A few customers use a single shared S3 bucket between non-clustered cross-geo
jackrabbit2 repositories in production.


Sure, adding more complexity only makes sense if we can demonstrate this is
a bottleneck.

Regards,

Timothee




RE: S3DataStore leverage Cross-Region Replication

2015-06-30 Thread Shashank Gupta
Yes, Michael. It would be in Oak's BlobStore.

Thanks,
-shashank

-Original Message-
From: Michael Marth [mailto:mma...@adobe.com] 
Sent: Tuesday, June 30, 2015 4:50 PM
To: oak-dev@jackrabbit.apache.org
Subject: Re: S3DataStore leverage Cross-Region Replication

Shashank,

If we think we need to implement multiple chained S3 DSs, then I think 
we should model it after Jackrabbit’s MultiDataStore, which allows arbitrary DS 
implementations to be chained:
http://jackrabbit.510166.n4.nabble.com/MultiDataStore-td4655772.html

Michael




On 30/06/15 12:11, Shashank Gupta shgu...@adobe.com wrote:

Hi Tim,
AWS provides no time-bound SLA for when a given binary will be 
successfully replicated to the destination S3 bucket. There would be cases of 
missing binaries whenever mongo nodes sync faster than S3 replicates. Also, S3 
replication works between a given pair of buckets, so one S3 bucket can 
replicate to only a single destination S3 bucket. 

I think we can implement a tiered S3DataStore which writes to and reads from 
multiple S3 buckets. The tiered S3DS first tries to read from the same-region 
bucket and, if the binary is not found there, falls back to the cross-geo buckets. 

> Has this been tested already? Generally, wdyt?
No. I suggest first testing a cross-geo mongo deployment with a single S3 bucket. 
There shouldn't be any functional issue in using a single S3 bucket. A few customers 
use a single shared S3 bucket between non-clustered cross-geo jackrabbit2 
repositories in production. 

Thanks,
-shashank




-Original Message-
From: maret.timot...@gmail.com [mailto:maret.timot...@gmail.com] On 
Behalf Of Timothée Maret
Sent: Monday, June 29, 2015 4:05 PM
To: oak-dev@jackrabbit.apache.org
Subject: S3DataStore leverage Cross-Region Replication

Hi,

In a cross-region setup using the S3 data store, it may make sense to leverage 
the Cross-Region Replication of S3 buckets [0,1].

In order to avoid data replication issues, it would make sense IMO to allow 
configuring the S3DataStore with two S3 buckets, one for writing and one for 
reading.
The writing bucket would be shared among all instances (from all regions), while 
each region would have its own reading bucket (thus decreasing the latency).
The writing bucket would auto-replicate to the reading buckets.

Has this been tested already? Generally, wdyt?

Regards,

Timothee



[0]
https://aws.amazon.com/blogs/aws/new-cross-region-replication-for-amazon-s3/
[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html


Re: S3DataStore leverage Cross-Region Replication

2015-06-30 Thread Timothée Maret
Hi Shashank,

Thanks for this.

2015-06-30 12:11 GMT+02:00 Shashank Gupta shgu...@adobe.com:

> Hi Tim,
> There is no time-bound SLA provided by AWS for when a given binary would be
> successfully replicated to the destination S3 bucket.
>
> There would be cases of missing binaries if mongo nodes sync faster than S3
> replication.


Yes, this would be expected, until the buckets replicate.


> Also, S3 replication works between a given pair of buckets. So one S3
> bucket can replicate to a single S3 destination bucket.


Yes, the setup would be limited to two regions.



> I think we can implement a tiered S3DataStore which writes/reads to/from
> multiple S3 buckets. The tiered S3DS first tries to read from the same-region
> bucket and, if not found, then falls back to the cross-geo buckets.


Great, although I see it would be valuable in a limited set of use cases
(only two regions involved).



>> Has this been tested already? Generally, wdyt?
> No. I suggest first testing a cross-geo mongo deployment with a single S3
> bucket. There shouldn't be any functional issue in using a single S3 bucket.
> A few customers use a single shared S3 bucket between non-clustered cross-geo
> jackrabbit2 repositories in production.


Sure, adding more complexity only makes sense if we can demonstrate this is
a bottleneck.

Regards,

Timothee





RE: S3DataStore leverage Cross-Region Replication

2015-06-30 Thread Shashank Gupta
Hi Tim,
AWS provides no time-bound SLA for when a given binary will be 
successfully replicated to the destination S3 bucket. There would be cases of 
missing binaries whenever mongo nodes sync faster than S3 replicates. Also, S3 
replication works between a given pair of buckets, so one S3 bucket can 
replicate to only a single destination S3 bucket. 

I think we can implement a tiered S3DataStore which writes to and reads from 
multiple S3 buckets. The tiered S3DS first tries to read from the same-region 
bucket and, if the binary is not found there, falls back to the cross-geo buckets. 
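
A minimal sketch of that tiered read path, for illustration only. The names
BinaryBucket, InMemoryBucket and TieredBucketReader are hypothetical, not Oak
or AWS SDK types, and an in-memory map stands in for each S3 bucket so the
fallback order can be followed without AWS access.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for one S3 bucket; not an Oak or AWS SDK type.
interface BinaryBucket {
    Optional<byte[]> get(String blobId);

    void put(String blobId, byte[] data);
}

// In-memory bucket (assumption) so the sketch runs without AWS access.
final class InMemoryBucket implements BinaryBucket {
    private final Map<String, byte[]> objects = new HashMap<>();

    @Override
    public Optional<byte[]> get(String blobId) {
        return Optional.ofNullable(objects.get(blobId));
    }

    @Override
    public void put(String blobId, byte[] data) {
        objects.put(blobId, data);
    }
}

// Reads try the same-region bucket first, then each cross-geo bucket in order.
final class TieredBucketReader {
    private final BinaryBucket sameRegion;
    private final List<BinaryBucket> crossGeo;

    TieredBucketReader(BinaryBucket sameRegion, List<BinaryBucket> crossGeo) {
        this.sameRegion = sameRegion;
        this.crossGeo = new ArrayList<>(crossGeo);
    }

    Optional<byte[]> read(String blobId) {
        Optional<byte[]> hit = sameRegion.get(blobId);
        if (hit.isPresent()) {
            return hit;
        }
        for (BinaryBucket remote : crossGeo) {
            hit = remote.get(blobId);
            if (hit.isPresent()) {
                return hit;
            }
        }
        // Not found in any configured bucket.
        return Optional.empty();
    }
}
```

The sketch covers reads only. Whether a binary found in a cross-geo bucket
should also be copied into the same-region bucket is left open here.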

> Has this been tested already? Generally, wdyt?
No. I suggest first testing a cross-geo mongo deployment with a single S3 bucket. 
There shouldn't be any functional issue in using a single S3 bucket. A few customers 
use a single shared S3 bucket between non-clustered cross-geo jackrabbit2 
repositories in production. 

Thanks,
-shashank





Re: S3DataStore leverage Cross-Region Replication

2015-06-30 Thread Michael Marth
Shashank,

If we think we need to implement multiple chained S3 DSs, then I think 
we should model it after Jackrabbit’s MultiDataStore, which allows arbitrary DS 
implementations to be chained:
http://jackrabbit.510166.n4.nabble.com/MultiDataStore-td4655772.html

Michael





S3DataStore leverage Cross-Region Replication

2015-06-29 Thread Timothée Maret
Hi,

In a cross-region setup using the S3 data store, it may make sense to
leverage the Cross-Region Replication of S3 buckets [0,1].

In order to avoid data replication issues, it would make sense IMO to allow
configuring the S3DataStore with two S3 buckets, one for writing and one
for reading.
The writing bucket would be shared among all instances (from all regions),
while each region would have its own reading bucket (thus decreasing the
latency).
The writing bucket would auto-replicate to the reading buckets.
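
As a rough illustration of this split, the sketch below routes writes to one
bucket and reads to another, with replication simulated by an explicit call.
SimpleBucket and SplitBucketStore are hypothetical names, not existing
S3DataStore configuration or AWS SDK types. Real S3 replication is
asynchronous and is not triggered by the client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for an S3 bucket (assumption, not an AWS type).
final class SimpleBucket {
    private final Map<String, byte[]> objects = new HashMap<>();

    void put(String key, byte[] data) {
        objects.put(key, data);
    }

    Optional<byte[]> get(String key) {
        return Optional.ofNullable(objects.get(key));
    }

    // Simulates bucket-to-bucket replication; in S3 this happens
    // asynchronously in the background, with no client call.
    void replicateTo(SimpleBucket destination) {
        destination.objects.putAll(objects);
    }
}

// Writes go to the shared writing bucket; reads come from the regional reading bucket.
final class SplitBucketStore {
    private final SimpleBucket writeBucket;
    private final SimpleBucket readBucket;

    SplitBucketStore(SimpleBucket writeBucket, SimpleBucket readBucket) {
        this.writeBucket = writeBucket;
        this.readBucket = readBucket;
    }

    void write(String blobId, byte[] data) {
        writeBucket.put(blobId, data);
    }

    Optional<byte[]> read(String blobId) {
        // Empty until the binary has been replicated into the reading bucket.
        return readBucket.get(blobId);
    }
}
```

In this sketch a read issued before replicateTo runs returns empty, which
corresponds to the window between a write and its replication.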

Has this been tested already? Generally, wdyt?

Regards,

Timothee



[0]
https://aws.amazon.com/blogs/aws/new-cross-region-replication-for-amazon-s3/
[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html