Github user sraghunandan commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2576#discussion_r207074600
--- Diff: docs/s3-guide.md ---
@@ -0,0 +1,64 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to you under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+# S3 Guide (Alpha Feature 1.4.1)
+S3 is a cloud Object Storage API and is recommended for storing large data files. You can use
+this feature if you want to store data on Amazon cloud or Huawei cloud (OBS).
+Since the data is stored on the cloud, there are no restrictions on the size of the data, and
+the data can be accessed from anywhere at any time.
+CarbonData supports any Object Storage that conforms to the Amazon S3 API.
+
+# Writing to Object Storage
+To store carbondata files in an Object Store location, you need to set the
+`carbon.storelocation` property to the Object Store path in the CarbonProperties file.
+For example, carbon.storelocation=s3a://mybucket/carbonstore.
+By setting this property, all tables will be created on the specified Object Store path.
+
+If your existing store is HDFS and you want to store specific tables on an S3 location,
+then the `location` parameter has to be set during create table.
+For example:
+
+```
+CREATE TABLE IF NOT EXISTS db1.table1(col1 string, col2 int) STORED AS carbondata LOCATION 's3a://mybucket/carbonstore'
+```
+
+For more details on create table, refer to
+[data-management-on-carbondata](https://github.com/apache/carbondata/blob/master/docs/data-management-on-carbondata.md#create-table).
+
+# Authentication
+You need to set authentication properties to store carbondata files in an S3 location.
+For more details on authentication properties, refer to the
+[Hadoop authentication documentation](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Authentication_properties).
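+
+Alternatively, the same properties can be passed on the command line. A sketch using
+`spark-submit` (the key, secret, and endpoint values below are placeholders, and the
+`spark.hadoop.` prefix forwards each property to the Hadoop configuration):
+
+```
+spark-submit \
+  --conf spark.hadoop.fs.s3a.access.key=<access-key> \
+  --conf spark.hadoop.fs.s3a.secret.key=<secret-key> \
+  --conf spark.hadoop.fs.s3a.endpoint=<endpoint> \
+  ...
+```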
+
+Another way of setting the authentication parameters is as follows:
+
+```
+// getOrCreateCarbonSession is provided via the CarbonSession implicits
+import org.apache.spark.sql.CarbonSession._
+
+SparkSession
+  .builder()
+  .master(masterURL)
+  .appName("S3Example")
+  .config("spark.driver.host", "localhost")
+  .config("spark.hadoop.fs.s3a.access.key", "1111")
+  .config("spark.hadoop.fs.s3a.secret.key", "2222")
+  .config("spark.hadoop.fs.s3a.endpoint", "1.1.1.1")
+  .getOrCreateCarbonSession()
+```
+
+# Recommendations
+1. Object Storage like S3 does not support the file leasing mechanism (supported by HDFS) that is
+required to take locks which ensure consistency between concurrent operations. Therefore, it is
+recommended to set the configurable lock path property ([carbon.lock.path](https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md#miscellaneous-configuration))
+to an HDFS directory.
+2. As Object Stores are eventually consistent, meaning that any put request can take some time to
--- End diff --
Concurrent data manipulation operations are not supported. Object stores
follow eventual consistency semantics, i.e., any put request might take some time
to be reflected when listing. This behaviour means that data reads are not
guaranteed to be consistent or up to date.
---