http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/installation-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/installation-guide.md 
b/src/site/markdown/installation-guide.md
deleted file mode 100644
index f679338..0000000
--- a/src/site/markdown/installation-guide.md
+++ /dev/null
@@ -1,198 +0,0 @@
-<!--
-    Licensed to the Apache Software Foundation (ASF) under one or more 
-    contributor license agreements.  See the NOTICE file distributed with
-    this work for additional information regarding copyright ownership. 
-    The ASF licenses this file to you under the Apache License, Version 2.0
-    (the "License"); you may not use this file except in compliance with 
-    the License.  You may obtain a copy of the License at
-
-      http://www.apache.org/licenses/LICENSE-2.0
-
-    Unless required by applicable law or agreed to in writing, software 
-    distributed under the License is distributed on an "AS IS" BASIS, 
-    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-    See the License for the specific language governing permissions and 
-    limitations under the License.
--->
-
-# Installation Guide
-This tutorial guides you through the installation and configuration of 
CarbonData in the following two modes :
-
-* [Installing and Configuring CarbonData on Standalone Spark 
Cluster](#installing-and-configuring-carbondata-on-standalone-spark-cluster)
-* [Installing and Configuring CarbonData on Spark on YARN 
Cluster](#installing-and-configuring-carbondata-on-spark-on-yarn-cluster)
-
-followed by :
-
-* [Query Execution using CarbonData Thrift 
Server](#query-execution-using-carbondata-thrift-server)
-
-## Installing and Configuring CarbonData on Standalone Spark Cluster
-
-### Prerequisites
-
-   - Hadoop HDFS and Yarn should be installed and running.
-
-   - Spark should be installed and running on all the cluster nodes.
-
-   - CarbonData user should have permission to access HDFS.
-
-
-### Procedure
-
-1. [Build the 
CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md) 
project and get the assembly jar from 
`./assembly/target/scala-2.1x/carbondata_xxx.jar`. 
-
-2. Copy `./assembly/target/scala-2.1x/carbondata_xxx.jar` to 
`$SPARK_HOME/carbonlib` folder.
-
-     **NOTE**: Create the carbonlib folder if it does not exist inside 
`$SPARK_HOME` path.
-
-3. Add the carbonlib folder path in the Spark classpath. (Edit 
`$SPARK_HOME/conf/spark-env.sh` file and modify the value of `SPARK_CLASSPATH` 
by appending `$SPARK_HOME/carbonlib/*` to the existing value)
-
-4. Copy the `./conf/carbon.properties.template` file from CarbonData 
repository to `$SPARK_HOME/conf/` folder and rename the file to 
`carbon.properties`.
-
-5. Repeat Step 2 to Step 5 in all the nodes of the cluster.
-    
-6. In Spark node[master], configure the properties mentioned in the following 
table in `$SPARK_HOME/conf/spark-defaults.conf` file.
-
-| Property | Value | Description |
-|---------------------------------|-----------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
-| spark.driver.extraJavaOptions | `-Dcarbon.properties.filepath = 
$SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to 
the driver. For instance, GC settings or other logging. |
-| spark.executor.extraJavaOptions | `-Dcarbon.properties.filepath = 
$SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to 
executors. For instance, GC settings or other logging. **NOTE**: You can enter 
multiple values separated by space. |
-
-7. Add the following properties in `$SPARK_HOME/conf/carbon.properties` file:
-
-| Property             | Required | Description                                
                                            | Example                           
  | Remark  |
-|----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------|
-| carbon.storelocation | NO       | Location where CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path. | hdfs://HOSTNAME:PORT/Opt/CarbonStore      | It is recommended to set an HDFS directory |
-
-
-8. Verify the installation. For example:
-
-```
-./spark-shell --master spark://HOSTNAME:PORT --total-executor-cores 2
---executor-memory 2G
-```
-
-**NOTE**: Make sure you have permissions for CarbonData JARs and files through 
which driver and executor will start.
-
-To get started with CarbonData : [Quick Start](quick-start-guide.md), [Data 
Management on CarbonData](data-management-on-carbondata.md)
-
-## Installing and Configuring CarbonData on Spark on YARN Cluster
-
-   This section provides the procedure to install CarbonData on "Spark on 
YARN" cluster.
-
-### Prerequisites
-   * Hadoop HDFS and Yarn should be installed and running.
-   * Spark should be installed and running in all the clients.
-   * CarbonData user should have permission to access HDFS.
-
-### Procedure
-
-   The following steps are only for Driver Nodes. (Driver nodes are the ones which start the Spark context.)
-
-1. [Build the 
CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md) 
project and get the assembly jar from 
`./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to 
`$SPARK_HOME/carbonlib` folder.
-
-    **NOTE**: Create the carbonlib folder if it does not exist inside the `$SPARK_HOME` path.
-
-2. Copy the `./conf/carbon.properties.template` file from CarbonData 
repository to `$SPARK_HOME/conf/` folder and rename the file to 
`carbon.properties`.
-
-3. Create `tar.gz` file of carbonlib folder and move it inside the carbonlib 
folder.
-
-```
-cd $SPARK_HOME
-tar -zcvf carbondata.tar.gz carbonlib/
-mv carbondata.tar.gz carbonlib/
-```
-
-4. Configure the properties mentioned in the following table in 
`$SPARK_HOME/conf/spark-defaults.conf` file.
-
-| Property | Description | Value |
-|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
-| spark.master | Set this value to run the Spark in yarn cluster mode. | Set 
yarn-client to run the Spark in yarn cluster mode. |
-| spark.yarn.dist.files | Comma-separated list of files to be placed in the 
working directory of each executor. |`$SPARK_HOME/conf/carbon.properties` |
-| spark.yarn.dist.archives | Comma-separated list of archives to be extracted 
into the working directory of each executor. 
|`$SPARK_HOME/carbonlib/carbondata.tar.gz` |
-| spark.executor.extraJavaOptions | A string of extra JVM options to pass to 
executors. For instance  **NOTE**: You can enter multiple values separated by 
space. |`-Dcarbon.properties.filepath = carbon.properties` |
-| spark.executor.extraClassPath | Extra classpath entries to prepend to the 
classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in 
spark-env.sh, then comment it and append the values in below parameter 
spark.driver.extraClassPath |`carbondata.tar.gz/carbonlib/*` |
-| spark.driver.extraClassPath | Extra classpath entries to prepend to the 
classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in 
spark-env.sh, then comment it and append the value in below parameter 
spark.driver.extraClassPath. |`$SPARK_HOME/carbonlib/*` |
-| spark.driver.extraJavaOptions | A string of extra JVM options to pass to the 
driver. For instance, GC settings or other logging. 
|`-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` |
-
-
-5. Add the following properties in `$SPARK_HOME/conf/carbon.properties`:
-
-| Property | Required | Description | Example | Default Value |
-|----------------------|----------|----------------------------------------------------------------------------------------|-------------------------------------|---------------|
-| carbon.storelocation | NO | Location where CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path.| hdfs://HOSTNAME:PORT/Opt/CarbonStore | It is recommended to set an HDFS directory|
-
-6. Verify the installation.
-
-```
- ./bin/spark-shell --master yarn-client --driver-memory 1g
- --executor-cores 2 --executor-memory 2G
-```
-  **NOTE**: Make sure you have permissions for CarbonData JARs and files 
through which driver and executor will start.
-
-  Getting started with CarbonData : [Quick Start](quick-start-guide.md), [Data 
Management on CarbonData](data-management-on-carbondata.md)
-
-## Query Execution Using CarbonData Thrift Server
-
-### Starting CarbonData Thrift Server.
-
-   a. cd `$SPARK_HOME`
-
-   b. Run the following command to start the CarbonData thrift server.
-
-```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
-$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
-```
-
-| Parameter | Description | Example |
-|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
-| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the 
`$SPARK_HOME/carbonlib/` folder. | 
carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar |
-| carbon_store_path | This is a parameter to the CarbonThriftServer class. This is an HDFS path where CarbonData files will be kept. It is strongly recommended to set this to the same value as the carbon.storelocation parameter of carbon.properties. If not specified then it takes spark.sql.warehouse.dir path. | `hdfs://<host_name>:port/user/hive/warehouse/carbon.store` |
-
-**NOTE**: From Spark 1.6, by default the Thrift server runs in multi-session mode, which means each JDBC/ODBC connection owns its own copy of the SQL configuration and temporary function registry. Cached tables are still shared though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and the temporary function registry, please set the option `spark.sql.hive.thriftServer.singleSession` to `true`. You may either add this option to `spark-defaults.conf`, or pass it to `spark-submit.sh` via `--conf`:
-
-```
-./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
-$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
-```
-
-**But** in single-session mode, if one user changes the database from one 
connection, the database of the other connections will be changed too.
-
-**Examples**
-   
-   * Start with default memory and executors.
-
-```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
-$SPARK_HOME/carbonlib
-/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
-hdfs://<host_name>:port/user/hive/warehouse/carbon.store
-```
-   
-   * Start with Fixed executors and resources.
-
-```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
---num-executors 3 --driver-memory 20g --executor-memory 250g 
---executor-cores 32 
-/srv/OSCON/BigData/HACluster/install/spark/sparkJdbc/lib
-/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
-hdfs://<host_name>:port/user/hive/warehouse/carbon.store
-```
-  
-### Connecting to CarbonData Thrift Server Using Beeline.
-
-```
-     cd $SPARK_HOME
-     ./sbin/start-thriftserver.sh
-     ./bin/beeline -u jdbc:hive2://<thriftserver_host>:port
-
-     Example
-     ./bin/beeline -u jdbc:hive2://10.10.10.10:10000
-```
-

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/introduction.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/introduction.md 
b/src/site/markdown/introduction.md
new file mode 100644
index 0000000..8169958
--- /dev/null
+++ b/src/site/markdown/introduction.md
@@ -0,0 +1,172 @@
+## What is CarbonData
+
+CarbonData is a fully indexed columnar and Hadoop native data-store for 
processing heavy analytical workloads and detailed queries on big data with 
Spark SQL. CarbonData allows faster interactive queries over PetaBytes of data.
+
+
+
+## What does this mean
+
+CarbonData has specially engineered optimizations like multi level indexing, compression and encoding techniques targeted at improving the performance of analytical queries involving filters, aggregations and distinct counts, where users expect sub-second response times for queries on TB-level data on commodity hardware clusters with just a few nodes.
+
+CarbonData has:
+
+- **Unique data organisation** for faster retrieval and to minimise the amount of data retrieved
+
+- **Advanced push down optimisations** for deep integration with Spark, improving on the Spark DataSource API and other experimental features, thereby ensuring computing is performed close to the data and minimising the amount of data read, processed, converted and transmitted (shuffled)
+
+- **Multi level indexing** to efficiently prune the files and data to be scanned, reducing I/O scans and CPU processing
+
+
+
+## Architecture
+
+![](./images/carbondata_architecture.png)
+
+
+
+#### Spark Interface Layer: 
+
+CarbonData has deep integration with Apache Spark. CarbonData integrates a custom parser, strategies and optimization rules into Spark to take advantage of computing performed closer to the data.
+
+![](./images/carbondata_spark_integration.png)
+
+1. **Carbon Parser**: Enhances Spark’s SQL parser to support Carbon specific DDL and DML commands to create carbon tables, create aggregate tables, and manage data loading, data retention and cleanup.
+2. **Carbon Strategies**: Modify Spark SQL’s physical query execution plan to push down possible operations to Carbon, for example Grouping, Distinct Count, Top N etc., to improve query performance.
+3. **Carbon Data RDD**: Makes the data present in Carbon tables visible to Spark as an RDD, which enables Spark to perform distributed computation on Carbon tables.
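+
+For illustration, the statements below show the kind of Carbon specific DDL and DML that the integrated parser accepts (the table name, columns and input path are hypothetical; see the DDL/DML documentation for the full syntax):
+
+```
+CREATE TABLE IF NOT EXISTS sales (
+  id INT,
+  name STRING,
+  city STRING,
+  quantity INT
+)
+STORED BY 'carbondata';
+
+LOAD DATA INPATH 'hdfs://namenode:port/data/sample.csv' INTO TABLE sales;
+```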
+
+
+
+#### Carbon Processor: 
+
+Receives a query execution fragment from Spark and executes the same on the Carbon storage. This involves scanning the carbon store files for matching records, using the indices to directly locate the row sets and even the rows that may contain the data being searched for. The Carbon processor also performs all pushed down operations, such as:
+
+- Aggregation/Group By
+- Distinct Count
+- Top N
+- Expression Evaluation
+
+And many more…
+
+#### Carbon Storage:
+
+Custom columnar data store which is heavily compressed, binary, dictionary encoded and heavily indexed. It is usually stored in HDFS.
+
+## CarbonData Features
+
+CarbonData has a rich set of features to support various use cases in Big Data analytics.
+
+ 
+
+## Design
+
+- ### Dictionary Encoding
+
+CarbonData supports encoding of data with surrogate values to reduce storage space and speed up processing. Most databases and big data SQL data stores adopt dictionary encoding (integer surrogate numbers) to achieve data compression. Unlike other column store databases where the dictionary is local to each data block, CarbonData maintains a global dictionary, which provides the opportunity for lazy conversion to actual values, enabling all computation to be performed on the lightweight surrogate values.
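+
+As an illustrative sketch only (the table and column names are hypothetical), dictionary encoding can be requested explicitly for selected columns through the `DICTIONARY_INCLUDE` table property at table creation time:
+
+```
+CREATE TABLE user_profile (
+  country STRING,
+  city STRING,
+  age INT
+)
+STORED BY 'carbondata'
+TBLPROPERTIES ('DICTIONARY_INCLUDE'='country, city')
+```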
+
+##### Dictionary generation
+
+![](./images/carbondata_dict_encoding.png)
+
+
+
+##### MDK Indexing
+
+All the surrogate keys are byte packed to generate an MDK (Multi Dimensional 
Key) Index.
+
+Any non surrogate columns of String data type are compressed using one of the configured compression algorithms and stored. For numeric columns where surrogates are not generated, the data is stored as it is after compression.
+
+![](./images/carbondata_mdk.png)
+
+##### Sorted MDK
+
+The data is sorted based on the MDK Index. Sorting helps in logically grouping similar data and thereby aids in faster lookups during queries.
+
+![](./images/carbondata_mdk_sort.png)
+
+##### Custom Columnar Encoding
+
+The Sorted MDK Index is split into each column. Unlike other stores where the column is compressed and stored as it is, CarbonData sorts this column data so that a Binary Search can be performed on individual column data based on the filter conditions. This aids in an order-of-magnitude increase in query performance and also in better compression. Since the individual column's data gets sorted, it is necessary to maintain the row mapping with the sorted MDK Index data in order to retrieve data from other columns which do not participate in the filter. This row mapping is termed the **Inverted Index** and is stored along with the column data. The picture below depicts the logical column view. The user has the option to **turn off** the Inverted Index for columns on which filters are never applied or are very rare. In such cases, scanning would be sequential, but it can help reduce the storage size (occupied due to inverted index data).
+
+![](./images/carbondata_blocklet_view.png)
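+
+For example, assuming a hypothetical table where filters are rarely applied on a `comment` column, the inverted index could be turned off for that column through the `NO_INVERTED_INDEX` table property (a minimal sketch only):
+
+```
+CREATE TABLE events (
+  event_time BIGINT,
+  event_type STRING,
+  comment STRING
+)
+STORED BY 'carbondata'
+TBLPROPERTIES ('NO_INVERTED_INDEX'='comment')
+```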
+
+- ### CarbonData Storage Format
+
+  CarbonData has a unique storage structure which aids in efficient storage and retrieval of data. Please refer to [File Structure of CarbonData](./file-structure-of-carbondata.md) for detailed information on the format.
+
+- ### Indexing
+
+  CarbonData maintains multiple indexes at multiple levels to assist in efficient pruning of unwanted data from scans during queries. CarbonData also has support for plugging in external indexing solutions to speed up the query process.
+
+  ##### Min-Max Indexing
+
+  Storing data along with an index significantly accelerates query performance and reduces I/O scans and CPU resources when the query contains filters. The CarbonData index consists of multiple levels of indices; a processing framework can leverage this index to reduce the number of tasks it needs to schedule and process. It can also do skip scans in more fine grained units (called blocklets) during task side scanning instead of scanning the whole file. **CarbonData maintains a Min-Max Index for all the columns.**
+
+  CarbonData maintains a separate index file which contains the footer 
information for efficient IO reads.
+
+  Using the Min-Max info in these index files, two levels of filtering can be 
achieved.
+
+  Min-Max at the carbondata file level, to efficiently prune the files when the filter condition doesn't fall in the range. This information, when maintained at the Spark Driver, helps to efficiently schedule the tasks for scanning.
+
+  Min-Max at the blocklet level, to efficiently prune the blocklets when the filter condition doesn't fall in the range. This information, when maintained at the executor, can significantly reduce the amount of unnecessary data processed by the executor tasks.
+
+
+
+  ![](./images/carbondata-minmax-blocklet.png)
+
+- #### DataMaps
+
+  DataMap is a framework for indexing and also for statistics that can be used to add a primary index (Blocklet Index), secondary index types and statistical types to CarbonData.
+
+  DataMap is a standardized general interface which CarbonData uses to prune 
data blocks for scanning.
+
+  DataMaps are of 2 types:
+
+  **CG (Coarse Grained) DataMaps** can prune data to the blocklet or to the page level, i.e., they hold information for deciding which blocks/blocklets are to be scanned. This type of DataMap is used in the Spark Driver to decide the number of tasks to be scheduled.
+
+  **FG (Fine Grained) DataMaps** can prune data to the row level. This type of DataMap is used in the Spark executor for scanning and fetching the data much faster.
+
+  Since the DataMap interfaces are generalised, a thin adaptor called a **DataMap Provider** can be written to interface between CarbonData and external indexing engines, for example Lucene, Solr, ES, etc.
+
+  CarbonData has its own DSL to create and manage DataMaps. Please refer to [CarbonData DSL](./datamap/datamap-management.md#overview) for more information. A brief sketch is shown below.
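+
+  As an illustration only (the table, datamap and column names are hypothetical; refer to the DataMap documentation for the exact syntax), DataMaps are created, listed and dropped with SQL-like statements:
+
+  ```
+  CREATE DATAMAP agg_sales
+  ON TABLE sales
+  USING 'preaggregate'
+  AS SELECT country, sum(quantity) FROM sales GROUP BY country;
+
+  SHOW DATAMAP ON TABLE sales;
+
+  DROP DATAMAP IF EXISTS agg_sales ON TABLE sales;
+  ```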
+
+  The diagram below explains the DataMap execution in CarbonData.
+
+  ![](./images/carbondata-datamap.png)
+
+- #### Update & Delete
+
+
+CarbonData supports update and delete operations over big data. This functionality is not targeted at OLTP scenarios where highly concurrent updates/deletes are required. The following are the assumptions considered when this feature was designed.
+
+1. Updates or Deletes are periodic and in Bulk
+2. Updates or Deletes are atomic
+3. Data is immediately visible
+4. Concurrent query to be allowed during an update or delete operation
+5. Single statement auto-commit support (not OLTP-style transaction)
+
+Since data stored in HDFS is immutable, data blocks cannot be updated in-place. Re-writing an entire data block is not efficient for IO and is also a slow process.
+
+To overcome these limitations, CarbonData adopts the methodology of writing a delta file containing the rows to be deleted and another delta file containing the values to be updated with. During processing, these two delta files are merged with the main carbondata file and the correct result is returned for the query.
+
+The below diagram describes the process.
+
+![](./images/carbondata_update_delete.png)
+
+
+
+## Integration with Big Data ecosystem
+
+Refer to the integration guides for [Spark](./quick-start-guide.md#spark) and [Presto](./quick-start-guide.md#presto) for detailed information on integrating CarbonData with these execution engines.
+
+## Scenarios where CarbonData is suitable
+
+
+
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__intro').addClass('selected'); });
+</script>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/language-manual.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/language-manual.md 
b/src/site/markdown/language-manual.md
new file mode 100644
index 0000000..9fef71b
--- /dev/null
+++ b/src/site/markdown/language-manual.md
@@ -0,0 +1,51 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more 
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership. 
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with 
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software 
+    distributed under the License is distributed on an "AS IS" BASIS, 
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and 
+    limitations under the License.
+-->
+
+# Overview
+
+
+
+CarbonData has its own parser, in addition to Spark's SQL Parser, to parse and 
process certain Commands related to CarbonData table handling. You can interact 
with the SQL interface using the 
[command-line](https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-spark-sql-cli)
 or over 
[JDBC/ODBC](https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server).
+
+- [Data Types](./supported-data-types-in-carbondata.md)
+- Data Definition Statements
+  - [DDL:](./ddl-of-carbondata.md) [Create](./ddl-of-carbondata.md#create-table), [Drop](./ddl-of-carbondata.md#drop-table), [Partition](./ddl-of-carbondata.md#partition), [Bucketing](./ddl-of-carbondata.md#bucketing), [Alter](./ddl-of-carbondata.md#alter-table), [CTAS](./ddl-of-carbondata.md#create-table-as-select), [External Table](./ddl-of-carbondata.md#create-external-table)
+  - Indexes
+  - [DataMaps](./datamap-management.md)
+    - [Bloom](./bloomfilter-datamap-guide.md)
+    - [Lucene](./lucene-datamap-guide.md)
+    - [Pre-Aggregate](./preaggregate-datamap-guide.md)
+    - [Time Series](./timeseries-datamap-guide.md)
+  - Materialized Views (MV)
+  - [Streaming](./streaming-guide.md)
+- Data Manipulation Statements
+  - [DML:](./dml-of-carbondata.md) [Load](./dml-of-carbondata.md#load-data), 
[Insert](./ddl-of-carbondata.md#insert-overwrite), 
[Update](./dml-of-carbondata.md#update), [Delete](./dml-of-carbondata.md#delete)
+  - [Segment Management](./segment-management-on-carbondata.md)
+- [Configuration Properties](./configuration-parameters.md)
+
+<script>
+$(function() {
+  // Show selected style on nav item
+  $('.b-nav__docs').addClass('selected');
+
+  // Display docs subnav items
+  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) 
{
+    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
+  }
+});
+</script>
+

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/lucene-datamap-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/lucene-datamap-guide.md 
b/src/site/markdown/lucene-datamap-guide.md
index 06cd194..248c8e5 100644
--- a/src/site/markdown/lucene-datamap-guide.md
+++ b/src/site/markdown/lucene-datamap-guide.md
@@ -173,4 +173,15 @@ release, user can do as following:
 3. Create the lucene datamap again by `CREATE DATAMAP` command.
 Basically, user can manually trigger the operation by re-building the datamap.
 
+<script>
+$(function() {
+  // Show selected style on nav item
+  $('.b-nav__datamap').addClass('selected');
+  
+  if 
(!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
+    // Display datamap subnav items
+    
$('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
+  }
+});
+</script>
 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/performance-tuning.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/performance-tuning.md 
b/src/site/markdown/performance-tuning.md
new file mode 100644
index 0000000..d8b53f2
--- /dev/null
+++ b/src/site/markdown/performance-tuning.md
@@ -0,0 +1,183 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more 
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership. 
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with 
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing, software 
+    distributed under the License is distributed on an "AS IS" BASIS, 
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and 
+    limitations under the License.
+-->
+
+# Useful Tips
+  This tutorial guides you to create CarbonData Tables and optimize 
performance.
+  The following sections will elaborate on the below topics :
+
+  * [Suggestions to create CarbonData 
Table](#suggestions-to-create-carbondata-table)
+  * [Configuration for Optimizing Data Loading performance for Massive 
Data](#configuration-for-optimizing-data-loading-performance-for-massive-data)
+  * [Optimizing Query 
Performance](#configurations-for-optimizing-carbondata-performance)
+
+## Suggestions to Create CarbonData Table
+
+  For example, the results of an analysis of table creation, with row counts ranging from 10 thousand to 10 billion and 100 to 300 columns, have been summarized below.
+  The following table describes some of the columns from the table used.
+
+  - **Table Column Description**
+
+| Column Name | Data Type     | Cardinality | Attribution |
+|-------------|---------------|-------------|-------------|
+| msisdn      | String        | 30 million  | Dimension   |
+| BEGIN_TIME  | BigInt        | 10 Thousand | Dimension   |
+| HOST        | String        | 1 million   | Dimension   |
+| Dime_1      | String        | 1 Thousand  | Dimension   |
+| counter_1   | Decimal       | NA          | Measure     |
+| counter_2   | Numeric(20,0) | NA          | Measure     |
+| ...         | ...           | NA          | Measure     |
+| counter_100 | Decimal       | NA          | Measure     |
+
+
+  - **Put the frequently-used column filter in the beginning of SORT_COLUMNS**
+
+  For example, if the MSISDN filter is used in most of the queries, then MSISDN must be put as the first column in the SORT_COLUMNS property.
+  The create table command can be modified as suggested below :
+
+  ```
+  create table carbondata_table(
+    msisdn String,
+    BEGIN_TIME bigint,
+    HOST String,
+    Dime_1 String,
+    counter_1 Decimal,
+    ...
+    
+    )STORED BY 'carbondata'
+    TBLPROPERTIES ('SORT_COLUMNS'='msisdn, Dime_1')
+  ```
+
+  Now the query with MSISDN in the filter will be more efficient.
+
+  - **Put the frequently-used columns in the order of low to high cardinality 
in SORT_COLUMNS**
+
+  If the table in the specified query has multiple columns which are 
frequently used to filter the results, it is suggested to put
+  the columns in the order of cardinality low to high in SORT_COLUMNS 
configuration. This ordering of frequently used columns improves the 
compression ratio and
+  enhances the performance of queries with filter on these columns.
+
+  For example, if MSISDN, HOST and Dime_1 are frequently-used columns, then 
the column order of table is suggested as
+  Dime_1>HOST>MSISDN, because Dime_1 has the lowest cardinality.
+  The create table command can be modified as suggested below :
+
+  ```
+  create table carbondata_table(
+      msisdn String,
+      BEGIN_TIME bigint,
+      HOST String,
+      Dime_1 String,
+      counter_1 Decimal,
+      ...
+      
+      )STORED BY 'carbondata'
+      TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+  ```
+
+  - **For measure type columns that do not require high accuracy, replace the Numeric(20,0) data type with the Double data type**
+
+  For columns of measure type that do not require high accuracy, it is suggested to replace the Numeric data type with Double to enhance query performance.
+  The create table command can be modified as below :
+
+```
+  create table carbondata_table(
+    Dime_1 String,
+    BEGIN_TIME bigint,
+    END_TIME bigint,
+    HOST String,
+    MSISDN String,
+    counter_1 decimal,
+    counter_2 double,
+    ...
+    )STORED BY 'carbondata'
+    TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+```
+  The result of the performance analysis of this test case shows a reduction in query execution time from 15 to 3 seconds, thereby improving performance by nearly 5 times.
+
+ - **Columns with incremental values should be re-arranged at the end of the dimensions**
+
+  Consider a scenario where data is loaded each day and begin_time is incremental for each load; it is suggested to put begin_time at the end of the dimensions.
+  Incremental values make efficient use of the min/max index. The create table command can be modified as below :
+
+  ```
+  create table carbondata_table(
+    Dime_1 String,
+    HOST String,
+    MSISDN String,
+    counter_1 double,
+    counter_2 double,
+    BEGIN_TIME bigint,
+    END_TIME bigint,
+    ...
+    counter_100 double
+    )STORED BY 'carbondata'
+    TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
+  ```
+
+  **NOTE:**
+  + BloomFilter can be created to enhance performance for queries with precise 
equal/in conditions. You can find more information about it in BloomFilter 
datamap 
[document](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md).
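+
+  For instance, a minimal sketch of such a datamap (the datamap name and the DMPROPERTIES values are illustrative only; the table and column come from the example above):
+
+  ```
+  CREATE DATAMAP dm_msisdn
+  ON TABLE carbondata_table
+  USING 'bloomfilter'
+  DMPROPERTIES ('INDEX_COLUMNS'='msisdn', 'BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001');
+  ```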
+
+
+## Configuration for Optimizing Data Loading performance for Massive Data
+
+
+  CarbonData supports loading large volumes of data. In this process, sorting the data while loading consumes a lot of memory and disk IO and
+  can sometimes result in an "Out Of Memory" exception.
+  If you do not have much memory to use, then you may prefer to slow down the data loading instead of having the data load fail.
+  You can configure CarbonData by tuning the following properties in the carbon.properties file to get better performance.
+
+| Parameter | Default Value | Description/Tuning |
+|-----------|-------------|--------|
+|carbon.number.of.cores.while.loading|Default: 2. This value should be >= 2|Specifies the number of cores used for data processing during data loading in CarbonData. |
+|carbon.sort.size|Default: 100000. The value should be >= 100.|Threshold to 
write local file in sort step when loading data|
+|carbon.sort.file.write.buffer.size|Default:  50000.|DataOutputStream buffer. |
+|carbon.merge.sort.reader.thread|Default: 3 |Specifies the number of cores 
used for temp file merging during data loading in CarbonData.|
+|carbon.merge.sort.prefetch|Default: true | You may want to set this value to false if you do not have enough memory|
+
+  For example, suppose 10 million records are to be loaded into a CarbonData table on a machine with only 16 cores and 64GB of memory.
+  Using the default configuration, the load always fails in the sort step. Modify carbon.properties as suggested below:
+
+  ```
+  carbon.merge.sort.reader.thread=1
+  carbon.sort.size=5000
+  carbon.sort.file.write.buffer.size=5000
+  carbon.merge.sort.prefetch=false
+  ```
+
+## Configurations for Optimizing CarbonData Performance
+
+  Recently we did some performance POCs on CarbonData for the finance and telecommunication fields. They involved detailed queries and aggregation
+  scenarios. After the completion of the POCs, some of the configurations impacting the performance have been identified and tabulated below :
+
+| Parameter | Location | Used For  | Description | Tuning |
+|----------------------------------------------|-----------------------------------|---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| carbon.sort.intermediate.files.limit | spark/carbonlib/carbon.properties | 
Data loading | During the loading of data, local temp is used to sort the data. 
This number specifies the minimum number of intermediate files after which the  
merge sort has to be initiated. | Increasing the parameter to a higher value 
will improve the load performance. For example, when we increase the value from 
20 to 100, it increases the data load performance from 35MB/S to more than 
50MB/S. Higher values of this parameter consumes  more memory during the load. |
+| carbon.number.of.cores.while.loading | spark/carbonlib/carbon.properties | Data loading | Specifies the number of cores used for data processing during data loading in CarbonData. | If you have more CPUs, then you can increase this value, which will increase the performance. For example, if we increase the value from 2 to 4 then the CSV reading performance can nearly double. |
+| carbon.compaction.level.threshold | spark/carbonlib/carbon.properties | Data loading and Querying | For minor compaction, specifies the number of segments to be merged in stage 1 and the number of compacted segments to be merged in stage 2. | Each CarbonData load will create one segment; if every load is small in size it will generate many small files over a period of time, impacting the query performance. Configuring this parameter will merge the small segments into one big segment which will sort the data and improve the performance. For example, in one telecommunication scenario, the performance improves about 2 times after minor compaction. |
+| spark.sql.shuffle.partitions | spark/conf/spark-defaults.conf | Querying | 
The number of task started when spark shuffle. | The value can be 1 to 2 times 
as much as the executor cores. In an aggregation scenario, reducing the number 
from 200 to 32 reduced the query time from 17 to 9 seconds. |
+| spark.executor.instances/spark.executor.cores/spark.executor.memory | spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, and memory used for CarbonData queries. | In the bank scenario, providing 4 CPU cores and 15 GB for each executor achieved good performance. These two values do not mean the more the better; they need to be configured properly in case of limited resources. For example, in the bank scenario each node has enough CPU (32 cores) but less memory (64 GB), so we cannot give more CPU with less memory. For example, with 4 cores and 12GB for each executor, GC sometimes happens during the query, which impacts the query performance very much, from 3 seconds to more than 15 seconds. In this scenario you need to increase the memory or decrease the number of CPU cores. |
+| carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data loading | The buffer size to store records returned from the block scan. | In limit scenarios this parameter is very important. For example, if your query limit is 1000 and we set this value to 3000, then 3000 records are fetched from the scan but Spark will only take 1000 rows, so the remaining 2000 are useless. In one finance test case, after we set it to 100, the performance in the limit 1000 scenario increased about 2 times in comparison to setting this value to 12000. |
+| carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading | Whether to use YARN local directories for multi-table load disk load balance | If this is set to true, CarbonData will use YARN local directories for multi-table load disk load balance, which will improve the data load performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data 
loading | Whether to use multiple YARN local directories during table data 
loading for disk load balance | After enabling 'carbon.use.local.dir', if this 
is set to true, CarbonData will use all YARN local directories during data load 
for disk load balance, that will improve the data load performance. Please 
enable this property when you encounter disk hotspot problem during data 
loading. |
+| carbon.sort.temp.compressor | spark/carbonlib/carbon.properties | Data 
loading | Specify the name of compressor to compress the intermediate sort 
temporary files during sort procedure in data loading. | The optional values 
are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty. By default, empty means 
that Carbondata will not compress the sort temp files. This parameter will be 
useful if you encounter disk bottleneck. |
+| carbon.load.skewedDataOptimization.enabled | 
spark/carbonlib/carbon.properties | Data loading | Whether to enable size based 
block allocation strategy for data loading. | When loading, carbondata will use 
file size based block allocation strategy for task distribution. It will make 
sure that all the executors process the same size of data -- It's useful if the 
size of your input data files varies widely, say 1MB~1GB. |
+| carbon.load.min.size.enabled | spark/carbonlib/carbon.properties | Data loading | Whether to enable the node minimum input data size allocation strategy for data loading.| When loading, carbondata will use the node minimum input data size allocation strategy for task distribution. It will make sure each node loads a minimum amount of data -- it's useful if the size of your input data files is very small, say 1MB~256MB, to avoid generating a large number of small files. |
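+
+  As a purely illustrative sketch (the values below are placeholders, not recommendations, and must be tuned for your own cluster), the carbon.properties entries from the table above are plain key=value pairs:
+
+  ```
+  carbon.number.of.cores.while.loading=4
+  carbon.compaction.level.threshold=4,3
+  carbon.detail.batch.size=100
+  carbon.use.local.dir=true
+  carbon.sort.temp.compressor=SNAPPY
+  ```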
+
+  Note: If your CarbonData instance is used only for queries, you may specify the property 'spark.speculation=true' in the conf directory of Spark.
+
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__perf').addClass('selected'); });
+</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/preaggregate-datamap-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/preaggregate-datamap-guide.md 
b/src/site/markdown/preaggregate-datamap-guide.md
index ff4c28e..9c7a5f8 100644
--- a/src/site/markdown/preaggregate-datamap-guide.md
+++ b/src/site/markdown/preaggregate-datamap-guide.md
@@ -126,7 +126,7 @@ kinds of DataMap:
    a. 'path' is used to specify the store location of the 
datamap.('path'='/location/').
    b. 'partitioning' when set to false enables user to disable partitioning of 
the datamap.
        Default value is true for this property.
-2. timeseries, for timeseries roll-up table. Please refer to [Timeseries 
DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
+2. timeseries, for timeseries roll-up table. Please refer to [Timeseries 
DataMap](./timeseries-datamap-guide.md)
 
 DataMap can be dropped using following DDL
   ```
@@ -271,3 +271,14 @@ release, user can do as following:
 Basically, user can manually trigger the operation by re-building the datamap.
 
 
+<script>
+$(function() {
+  // Show selected style on nav item
+  $('.b-nav__datamap').addClass('selected');
+  
+  if 
(!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
+    // Display datamap subnav items
+    
$('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
+  }
+});
+</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/quick-start-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/quick-start-guide.md 
b/src/site/markdown/quick-start-guide.md
index 84f871d..7ac5a3f 100644
--- a/src/site/markdown/quick-start-guide.md
+++ b/src/site/markdown/quick-start-guide.md
@@ -7,7 +7,7 @@
     the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
-
+    
     Unless required by applicable law or agreed to in writing, software 
     distributed under the License is distributed on an "AS IS" BASIS, 
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -16,10 +16,11 @@
 -->
 
 # Quick Start
-This tutorial provides a quick introduction to using CarbonData.
+This tutorial provides a quick introduction to using CarbonData. To follow along with this guide, first download a packaged release of CarbonData from the [CarbonData website](https://dist.apache.org/repos/dist/release/carbondata/). Alternatively, it can be built by following the [Building CarbonData](https://github.com/apache/carbondata/tree/master/build) steps.
 
 ##  Prerequisites
-* [Installation and building 
CarbonData](https://github.com/apache/carbondata/blob/master/build).
+* Spark 2.2.1 is installed and running. CarbonData supports Spark versions up to 2.2.1. Please follow the steps described in the [Spark documentation](https://spark.apache.org/docs/latest) for installing and running Spark.
+
 * Create a sample.csv file using the following commands. The CSV file is 
required for loading data into CarbonData.
 
   ```
@@ -32,7 +33,30 @@ This tutorial provides a quick introduction to using 
CarbonData.
   EOF
   ```
 
-## Interactive Analysis with Spark Shell Version 2.1
+## Deployment modes
+
+CarbonData can be integrated with the Spark and Presto execution engines. The documentation below guides you through installing and configuring CarbonData with these execution engines.
+
+### Spark
+
+[Installing and Configuring CarbonData to run locally with Spark 
Shell](#installing-and-configuring-carbondata-to-run-locally-with-spark-shell)
+
+[Installing and Configuring CarbonData on Standalone Spark 
Cluster](#installing-and-configuring-carbondata-on-standalone-spark-cluster)
+
+[Installing and Configuring CarbonData on Spark on YARN 
Cluster](#installing-and-configuring-carbondata-on-spark-on-yarn-cluster)
+
+
+### Presto
+[Installing and Configuring CarbonData on 
Presto](#installing-and-configuring-carbondata-on-presto)
+
+
+## Querying Data
+
+[Query Execution using CarbonData Thrift 
Server](#query-execution-using-carbondata-thrift-server)
+
+
+## Installing and Configuring CarbonData to run locally with Spark Shell
 
 Apache Spark Shell provides a simple way to learn the API, as well as a 
powerful tool to analyze data interactively. Please visit [Apache Spark 
Documentation](http://spark.apache.org/docs/latest/) for more details on Spark 
shell.
 
@@ -43,7 +67,7 @@ Start Spark shell by running the following command in the 
Spark directory:
 ```
 ./bin/spark-shell --jars <carbondata assembly jar path>
 ```
-**NOTE**: Assembly jar will be available after [building 
CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md) 
and can be copied from `./assembly/target/scala-2.1x/carbondata_xxx.jar`
+**NOTE**: The jar path is either the location where the packaged release of CarbonData was downloaded, or, after [building CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md), the assembly jar which can be copied from `./assembly/target/scala-2.1x/carbondata_xxx.jar`
 
 In this shell, SparkSession is readily available as `spark` and Spark context 
is readily available as `sc`.
 
@@ -62,7 +86,7 @@ import org.apache.spark.sql.CarbonSession._
 val carbon = SparkSession.builder().config(sc.getConf)
              .getOrCreateCarbonSession("<hdfs store path>")
 ```
-**NOTE**: By default metastore location is pointed to `../carbon.metastore`, 
user can provide own metastore location to CarbonSession like 
`SparkSession.builder().config(sc.getConf)
+**NOTE**: By default the metastore location points to `../carbon.metastore`; users can provide their own metastore location to CarbonSession like `SparkSession.builder().config(sc.getConf)
 .getOrCreateCarbonSession("<hdfs store path>", "<local metastore path>")`
 
 #### Executing Queries
@@ -86,7 +110,7 @@ scala>carbon.sql("LOAD DATA INPATH '/path/to/sample.csv'
                   INTO TABLE test_table")
 ```
 **NOTE**: Please provide the real file path of `sample.csv` for the above 
script. 
-If you get "tablestatus.lock" issue, please refer to 
[troubleshooting](troubleshooting.md)
+If you get "tablestatus.lock" issue, please refer to [FAQ](faq.md)
 
 ###### Query Data from a Table
 
@@ -97,3 +121,317 @@ scala>carbon.sql("SELECT city, avg(age), sum(age)
                   FROM test_table
                   GROUP BY city").show()
 ```
+
+
+
+## Installing and Configuring CarbonData on Standalone Spark Cluster
+
+### Prerequisites
+
+- Hadoop HDFS and Yarn should be installed and running.
+- Spark should be installed and running on all the cluster nodes.
+- CarbonData user should have permission to access HDFS.
+
+### Procedure
+
+1. [Build the 
CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md) 
project and get the assembly jar from 
`./assembly/target/scala-2.1x/carbondata_xxx.jar`. 
+
+2. Copy `./assembly/target/scala-2.1x/carbondata_xxx.jar` to 
`$SPARK_HOME/carbonlib` folder.
+
+   **NOTE**: Create the carbonlib folder if it does not exist inside 
`$SPARK_HOME` path.
+
+3. Add the carbonlib folder path in the Spark classpath. (Edit 
`$SPARK_HOME/conf/spark-env.sh` file and modify the value of `SPARK_CLASSPATH` 
by appending `$SPARK_HOME/carbonlib/*` to the existing value)
+
+4. Copy the `./conf/carbon.properties.template` file from CarbonData 
repository to `$SPARK_HOME/conf/` folder and rename the file to 
`carbon.properties`.
+
+5. Repeat Step 2 to Step 5 in all the nodes of the cluster.
+
+6. On the Spark master node, configure the properties mentioned in the following table in the `$SPARK_HOME/conf/spark-defaults.conf` file.
+
+| Property                        | Value                                      
                  | Description                                                 
 |
+| ------------------------------- | 
------------------------------------------------------------ | 
------------------------------------------------------------ |
+| spark.driver.extraJavaOptions   | `-Dcarbon.properties.filepath = 
$SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to 
the driver. For instance, GC settings or other logging. |
+| spark.executor.extraJavaOptions | `-Dcarbon.properties.filepath = 
$SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to 
executors. For instance, GC settings or other logging. **NOTE**: You can enter 
multiple values separated by space. |
+
+7. Add the following properties in the `$SPARK_HOME/conf/carbon.properties` file:
+
+| Property             | Required | Description                                
                  | Example                              | Remark               
         |
+| -------------------- | -------- | 
------------------------------------------------------------ | 
------------------------------------ | ----------------------------- |
+| carbon.storelocation | NO       | Location where CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | It is recommended to set an HDFS directory |
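+
+   For example, a one-line sketch of the corresponding `carbon.properties` entry (the HDFS host, port and path are placeholders to be replaced):
+
+```
+carbon.storelocation=hdfs://HOSTNAME:PORT/Opt/CarbonStore
+```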
+
+8. Verify the installation. For example:
+
+```
+./spark-shell --master spark://HOSTNAME:PORT --total-executor-cores 2
+--executor-memory 2G
+```
+
+**NOTE**: Make sure you have the required permissions on the CarbonData JARs and files with which the driver and executors will start.
+
+
+
+## Installing and Configuring CarbonData on Spark on YARN Cluster
+
+   This section provides the procedure to install CarbonData on "Spark on 
YARN" cluster.
+
+### Prerequisites
+
+- Hadoop HDFS and Yarn should be installed and running.
+- Spark should be installed and running in all the clients.
+- CarbonData user should have permission to access HDFS.
+
+### Procedure
+
+   The following steps are only for Driver Nodes. (Driver nodes are the ones which start the Spark context.)
+
+1. [Build the 
CarbonData](https://github.com/apache/carbondata/blob/master/build/README.md) 
project and get the assembly jar from 
`./assembly/target/scala-2.1x/carbondata_xxx.jar` and copy to 
`$SPARK_HOME/carbonlib` folder.
+
+   **NOTE**: Create the carbonlib folder if it does not exist inside the `$SPARK_HOME` path.
+
+2. Copy the `./conf/carbon.properties.template` file from CarbonData 
repository to `$SPARK_HOME/conf/` folder and rename the file to 
`carbon.properties`.
+
+3. Create `tar.gz` file of carbonlib folder and move it inside the carbonlib 
folder.
+
+```
+cd $SPARK_HOME
+tar -zcvf carbondata.tar.gz carbonlib/
+mv carbondata.tar.gz carbonlib/
+```
+
+4. Configure the properties mentioned in the following table in the `$SPARK_HOME/conf/spark-defaults.conf` file.
+
+| Property                        | Description                                
                  | Value                                                       
 |
+| ------------------------------- | 
------------------------------------------------------------ | 
------------------------------------------------------------ |
+| spark.master                    | Set this value to run Spark in yarn cluster mode.        | Set yarn-client to run Spark in yarn cluster mode.       |
+| spark.yarn.dist.files           | Comma-separated list of files to be placed 
in the working directory of each executor. | 
`$SPARK_HOME/conf/carbon.properties`                         |
+| spark.yarn.dist.archives        | Comma-separated list of archives to be 
extracted into the working directory of each executor. | 
`$SPARK_HOME/carbonlib/carbondata.tar.gz`                    |
+| spark.executor.extraJavaOptions | A string of extra JVM options to pass to 
executors. For instance  **NOTE**: You can enter multiple values separated by 
space. | `-Dcarbon.properties.filepath = carbon.properties`           |
+| spark.executor.extraClassPath   | Extra classpath entries to prepend to the 
classpath of executors. **NOTE**: If SPARK_CLASSPATH is defined in 
spark-env.sh, then comment it and append the values in below parameter 
spark.driver.extraClassPath | `carbondata.tar.gz/carbonlib/*`                   
           |
+| spark.driver.extraClassPath     | Extra classpath entries to prepend to the 
classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in 
spark-env.sh, then comment it and append the value in below parameter 
spark.driver.extraClassPath. | `$SPARK_HOME/carbonlib/*`                        
            |
+| spark.driver.extraJavaOptions   | A string of extra JVM options to pass to 
the driver. For instance, GC settings or other logging. | 
`-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` |
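+
+A consolidated `spark-defaults.conf` sketch of the above properties is shown below for reference only (the `$SPARK_HOME` placeholder must be replaced with the actual path, and the values must match your environment):
+
+```
+spark.master                     yarn-client
+spark.yarn.dist.files            $SPARK_HOME/conf/carbon.properties
+spark.yarn.dist.archives         $SPARK_HOME/carbonlib/carbondata.tar.gz
+spark.executor.extraJavaOptions  -Dcarbon.properties.filepath=carbon.properties
+spark.executor.extraClassPath    carbondata.tar.gz/carbonlib/*
+spark.driver.extraClassPath      $SPARK_HOME/carbonlib/*
+spark.driver.extraJavaOptions    -Dcarbon.properties.filepath=$SPARK_HOME/conf/carbon.properties
+```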
+
+5. Add the following properties in `$SPARK_HOME/conf/carbon.properties`:
+
+| Property             | Required | Description                                                  | Example                              | Remark                         |
+| -------------------- | -------- | 
------------------------------------------------------------ | 
------------------------------------ | ----------------------------- |
+| carbon.storelocation | NO       | Location where CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | It is recommended to set an HDFS directory |
+
+6. Verify the installation.
+
+```
+ ./bin/spark-shell --master yarn-client --driver-memory 1g
+ --executor-cores 2 --executor-memory 2G
+```
+
+  **NOTE**: Make sure you have the required permissions on the CarbonData JARs and files with which the driver and executors will start.
+
+
+
+## Query Execution Using CarbonData Thrift Server
+
+### Starting CarbonData Thrift Server.
+
+   a. cd `$SPARK_HOME`
+
+   b. Run the following command to start the CarbonData thrift server.
+
+```
+./bin/spark-submit
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
+```
+
+| Parameter           | Description                                            
      | Example                                                    |
+| ------------------- | 
------------------------------------------------------------ | 
---------------------------------------------------------- |
+| CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the 
`$SPARK_HOME/carbonlib/` folder. | 
carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar       |
+| carbon_store_path   | This is a parameter to the CarbonThriftServer class. This is an HDFS path where CarbonData files will be kept. It is strongly recommended to set this to the same value as the carbon.storelocation parameter of carbon.properties. If not specified then it takes spark.sql.warehouse.dir path. | `hdfs://<host_name>:port/user/hive/warehouse/carbon.store` |
+
+**NOTE**: From Spark 1.6, by default the Thrift server runs in multi-session mode, which means each JDBC/ODBC connection owns its own copy of the SQL configuration and temporary function registry. Cached tables are still shared though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and the temporary function registry, please set the option `spark.sql.hive.thriftServer.singleSession` to `true`. You may either add this option to `spark-defaults.conf`, or pass it to `spark-submit.sh` via `--conf`:
+
+```
+./bin/spark-submit \
+  --conf spark.sql.hive.thriftServer.singleSession=true \
+  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+  $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
+```
+
+**But** note that in single-session mode, if one user changes the current database from one connection, the database of all other connections changes too.
+
+**Examples**
+
+- Start with default memory and executors.
+
+```
+./bin/spark-submit \
+  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+  $SPARK_HOME/carbonlib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
+  hdfs://<host_name>:port/user/hive/warehouse/carbon.store
+```
+
+- Start with fixed executors and resources.
+
+```
+./bin/spark-submit \
+  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+  --num-executors 3 --driver-memory 20g --executor-memory 250g \
+  --executor-cores 32 \
+  /srv/OSCON/BigData/HACluster/install/spark/sparkJdbc/lib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
+  hdfs://<host_name>:port/user/hive/warehouse/carbon.store
+```
+
+### Connecting to CarbonData Thrift Server Using Beeline.
+
+```
+     cd $SPARK_HOME
+     ./sbin/start-thriftserver.sh
+     ./bin/beeline -u jdbc:hive2://<thriftserver_host>:port
+
+     Example
+     ./bin/beeline -u jdbc:hive2://10.10.10.10:10000
+```
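+
+Once connected, a couple of quick statements can confirm that the JDBC connection is being served (table name illustrative):
+
+```
+0: jdbc:hive2://10.10.10.10:10000> SHOW DATABASES;
+0: jdbc:hive2://10.10.10.10:10000> CREATE TABLE IF NOT EXISTS carbon_check(id INT, name STRING) STORED AS carbondata;
+0: jdbc:hive2://10.10.10.10:10000> SELECT COUNT(*) FROM carbon_check;
+```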
+
+
+
+## Installing and Configuring CarbonData on Presto
+
+
+* ### Installing Presto
+
+ 1. Download the 0.187 version of Presto using:
+    `wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+ 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`.
+
+ 3. Download the Presto CLI for the coordinator and name it presto.
+
+  ```
+    wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+    mv presto-cli-0.187-executable.jar presto
+
+    chmod +x presto
+  ```
+
+### Create Configuration Files
+
+  1. Create `etc` folder in presto-server-0.187 directory.
+  2. Create `config.properties`, `jvm.config`, `log.properties`, and 
`node.properties` files.
+  3. Install uuid to generate a node.id.
+
+      ```
+      sudo apt-get install uuid
+
+      uuid
+      ```
+
+
+##### Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=<generated uuid>
+  node.data-dir=/home/ubuntu/data
+  ```
+
+##### Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+
+##### Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, 
`WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+##### Contents of your config.properties
+  ```
+  coordinator=true
+  node-scheduler.include-coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery-server.enabled=true
+  discovery.uri=<coordinator_ip>:8086
+  ```
+The options `coordinator=true` and `node-scheduler.include-coordinator=false` indicate that the node is the coordinator and tell it not to do any of the computation work itself but to delegate to the workers.
+
+**Note**: It is recommended to set `query.max-memory-per-node` to half of the maximum JVM heap configured in `jvm.config`; if the workload is highly concurrent, use a lower value for `query.max-memory-per-node`.
+
+The two properties should also be related as follows:
+If `query.max-memory-per-node=30GB`,
+then `query.max-memory=<30GB * number of nodes>`.
+
+### Worker Configurations
+
+##### Contents of your config.properties
+
+  ```
+  coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery.uri=<coordinator_ip>:8086
+  ```
+
+**Note**: The `jvm.config` and `node.properties` files are the same for all the nodes (workers and the coordinator), except that every node must have a unique `node.id` (generated by the uuid command).
+
+### Catalog Configurations
+
+1. Create a folder named `catalog` in the `etc` directory of Presto on all the nodes of the cluster, including the coordinator.
+
+##### Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and set the required properties on all the nodes (a minimal sketch follows below).
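+
+As a rough sketch, the catalog file names the connector and points it at the CarbonData store. Apart from `connector.name`, the CarbonData-specific keys differ between releases, so treat everything below the first line as a placeholder to be replaced by the properties documented for the CarbonData Presto integration you built:
+
+  ```
+  connector.name=carbondata
+  # placeholder: add the store/metastore property required by your CarbonData release,
+  # e.g. one pointing at hdfs://<namenode>:<port>/user/hive/warehouse/carbon.store
+  ```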
+
+### Add Plugins
+
+1. Create a directory named `carbondata` in the `plugin` directory of Presto.
+2. Copy the CarbonData jars to the `plugin/carbondata` directory on all the nodes (see the sketch below).
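+
+For example (the source path of the CarbonData-Presto integration jars is illustrative and depends on where you built or downloaded them):
+
+```
+mkdir -p presto-server-0.187/plugin/carbondata
+cp /path/to/carbondata-presto-jars/*.jar presto-server-0.187/plugin/carbondata/
+```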
+
+### Start Presto Server on all nodes
+
+To run it as a background process:
+
+```
+./presto-server-0.187/bin/launcher start
+```
+
+To run it in the foreground:
+
+```
+./presto-server-0.187/bin/launcher run
+```
+
+### Start Presto CLI
+```
+./presto
+```
+To connect to the carbondata catalog, use the following command:
+
+```
+./presto --server <coordinator_ip>:8086 --catalog carbondata --schema <schema_name>
+```
+Execute the following command to ensure the workers are connected.
+
+```
+select * from system.runtime.nodes;
+```
+Now you can use the Presto CLI on the coordinator to query data sources in the 
catalog using the Presto workers.
+
+**Note:** Tables must be created and data must be loaded before executing queries, because carbon tables cannot be created from this interface.
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__quickstart').addClass('selected'); });
+</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/release-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/release-guide.md 
b/src/site/markdown/release-guide.md
new file mode 100644
index 0000000..40a9058
--- /dev/null
+++ b/src/site/markdown/release-guide.md
@@ -0,0 +1,428 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more 
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership. 
+    The ASF licenses this file to you under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with 
+    the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software 
+    distributed under the License is distributed on an "AS IS" BASIS, 
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and 
+    limitations under the License.
+-->
+
+# Apache CarbonData Release Guide
+
+Apache CarbonData periodically declares and publishes releases.
+
+Each release is executed by a _Release Manager_, who is selected among the 
CarbonData committers.
+ This document describes the process that the Release Manager follows to 
perform a release. Any 
+ changes to this process should be discussed and adopted on the 
+ [dev@ mailing list](mailto:d...@carbondata.apache.org).
+ 
+Please remember that publishing software has legal consequences. This guide 
complements the 
+foundation-wide [Product Release 
Policy](http://www.apache.org/dev/release.html) and [Release 
+Distribution Policy](http://www.apache.org/dev/release-distribution).
+
+## Decide to release
+
+Deciding to release and selecting a Release Manager is the first step of the 
release process. 
+This is a consensus-based decision of the entire community.
+
+Anybody can propose a release on the dev@ mailing list, giving a solid 
argument and nominating a 
+committer as the Release Manager (including themselves). There's no formal 
process, no vote 
+requirements, and no timing requirements. Any objections should be resolved by 
consensus before 
+starting the release.
+
+_Checklist to proceed to next step:_
+
+1. Community agrees to release
+2. Community selects a Release Manager
+
+## Prepare for the release
+
+Before your first release, you should perform one-time configuration steps. This will set up your security keys for signing the artifacts and give you access to the release repository.
+ 
+To prepare for each release, you should audit the project status in the Jira, 
and do necessary 
+bookkeeping. Finally, you should tag a release.
+
+### One-time setup instructions
+
+#### GPG Key
+
+You need to have a GPG key to sign the release artifacts. Please be aware of 
the ASF-wide 
+[release signing guidelines](https://www.apache.org/dev/release-signing.html). 
If you don't have 
+a GPG key associated with your Apache account, please create one according to 
the guidelines.
+
+Determine your Apache GPG key and key ID, as follows:
+
+```
+gpg --list-keys
+```
+
+This will list your GPG keys. One of these should reflect your Apache account, 
for example:
+
+```
+pub   2048R/845E6689 2016-02-23
+uid                  Nomen Nescio <anonym...@apache.org>
+sub   2048R/BA4D50BE 2016-02-23
+```
+
+Here, the key ID is the 8-digit hex string in the `pub` line: `845E6689`.
+
+Now, add your Apache GPG key to the CarbonData's `KEYS` file in `dev` and 
`release` repositories 
+at `dist.apache.org`. Follow the instructions listed at the top of these files.
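+
+The instructions at the top of the `KEYS` files typically amount to appending your public key block and committing the change, for example (key ID illustrative, taken from the listing above):
+
+```
+(gpg --list-sigs 845E6689 && gpg --armor --export 845E6689) >> KEYS
+```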
+ 
+Configure `git` to use this key when signing code by giving it your key ID, as 
follows:
+
+```
+git config --global user.signingkey 845E6689
+```
+
+You may drop the `--global` option if you'd prefer to use this key for the 
current repository only.
+
+You may wish to start `gpg-agent` to unlock your GPG key only once using your 
passphrase. 
+Otherwise, you may need to enter this passphrase several times. The setup of 
`gpg-agent` varies 
+based on operating system, but may be something like this:
+
+```
+eval $(gpg-agent --daemon --no-grab --write-env-file $HOME/.gpg-agent-info)
+export GPG_TTY=$(tty)
+export GPG_AGENT_INFO
+```
+
+#### Access to Apache Nexus
+
+Configure access to the [Apache Nexus repository](https://repository.apache.org), which is used for staging artifacts and promoting them to Maven Central.
+
+1. Log in with your Apache account.
+2. Confirm you have appropriate access by finding `org.apache.carbondata` 
under `Staging Profiles`.
+3. Navigate to your `Profile` (top right dropdown menu of the page).
+4. Choose `User Token` from the dropdown, then click `Access User Token`. Copy 
a snippet of the 
+Maven XML configuration block.
+5. Insert this snippet twice into your global Maven `settings.xml` file, typically `${HOME}/.m2/settings.xml`. The end result should look like this, where `TOKEN_NAME` and `TOKEN_PASSWORD` are your secret tokens:
+
+```
+ <settings>
+   <servers>
+     <server>
+       <id>apache.releases.https</id>
+       <username>TOKEN_NAME</username>
+       <password>TOKEN_PASSWORD</password>
+     </server>
+     <server>
+       <id>apache.snapshots.https</id>
+       <username>TOKEN_NAME</username>
+       <password>TOKEN_PASSWORD</password>
+     </server>
+   </servers>
+ </settings>
+```
+
+#### Create a new version in Jira
+
+When contributors resolve an issue in Jira, they are tagging it with a release 
that will contain 
+their changes. With the release currently underway, new issues should be 
resolved against a 
+subsequent future release. Therefore, you should create a release item for 
this subsequent 
+release, as follows:
+
+1. In Jira, navigate to `CarbonData > Administration > Versions`.
+2. Add a new release: choose the next minor version number compared to the one 
currently 
+underway, select today's date as the `Start Date`, and choose `Add`. 
+
+#### Triage release-blocking issues in Jira
+
+There could be outstanding release-blocking issues, which should be triaged 
before proceeding to 
+build the release. We track them by assigning a specific `Fix Version` field 
even before the 
+issue is resolved.
+
+The list of release-blocking issues is available at the [version status 
page](https://issues.apache.org/jira/browse/CARBONDATA/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel).
 
+Triage each unresolved issue with one of the following resolutions:
+
+* If the issue has been resolved and Jira was not updated, resolve it 
accordingly.
+* If the issue has not been resolved and it is acceptable to defer until the 
next release, update
+ the `Fix Version` field to the new version you just created. Please consider 
discussing this 
+ with stakeholders and the dev@ mailing list, as appropriate.
+* If the issue has not been resolved and it is not acceptable to release until 
it is fixed, the 
+ release cannot proceed. Instead, work with the CarbonData community to 
resolve the issue.
+ 
+#### Review Release Notes in Jira
+
+Jira automatically generates Release Notes based on the `Fix Version` applied 
to the issues. 
+Release Notes are intended for CarbonData users (not CarbonData 
committers/contributors). You 
+should ensure that Release Notes are informative and useful.
+
+Open the release notes from the [version status 
page](https://issues.apache.org/jira/browse/CARBONDATA/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel)
+by choosing the release underway and clicking Release Notes.
+
+You should verify that the issues listed automatically by Jira are appropriate 
to appear in the 
+Release Notes. Specifically, issues should:
+
+* Be appropriately classified as `Bug`, `New Feature`, `Improvement`, etc.
+* Represent noteworthy user-facing changes, such as new functionality, 
backward-incompatible 
+changes, or performance improvements.
+* Have occurred since the previous release; an issue that was introduced and 
fixed between 
+releases should not appear in the Release Notes.
+* Have an issue title that makes sense when read on its own.
+
+Adjust any of the above properties to improve the clarity and presentation of the Release Notes.
+
+#### Verify that a Release Build works
+
+Run `mvn clean install -Prelease` to ensure that the build processes that are 
specific to that 
+profile are in good shape.
+
+_Checklist to proceed to the next step:_
+
+1. Release Manager's GPG key is published to `dist.apache.org`.
+2. Release Manager's GPG key is configured in `git` configuration.
+3. Release Manager has `org.apache.carbondata` listed under `Staging Profiles` 
in Nexus.
+4. Release Manager's Nexus User Token is configured in `settings.xml`.
+5. Jira release item for the subsequent release has been created.
+6. There are no release blocking Jira issues.
+7. Release Notes in Jira have been audited and adjusted.
+
+### Build a release
+
+Use the Maven release plugin to tag and build the release artifacts, as follows:
+
+```
+mvn release:prepare
+```
+
+Use the Maven release plugin to stage these artifacts on the Apache Nexus repository, as follows:
+
+```
+mvn release:perform
+```
+
+Review all staged artifacts. They should contain all relevant parts for each 
module, including 
+`pom.xml`, jar, test jar, source, etc. Artifact names should follow 
+[the existing 
format](https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.carbondata%22)
+in which artifact name mirrors directory structure. Carefully review any new 
artifacts.
+
+Close the staging repository on Nexus. When prompted for a description, enter 
"Apache CarbonData 
+x.x.x release".
+
+### Stage source release on dist.apache.org
+
+Copy the source release to dev repository on `dist.apache.org`.
+
+1. If you have not already done so, check out the CarbonData section of the `dev` repository on `dist.apache.org` via Subversion. In a fresh directory:
+
+```
+svn co https://dist.apache.org/repos/dist/dev/carbondata
+```
+
+2. Make a directory for the new release:
+
+```
+mkdir x.x.x
+```
+
+3. Copy the CarbonData source distribution, hash, and GPG signature:
+
+```
+cp apache-carbondata-x.x.x-source-release.zip x.x.x
+```
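+
+The hash and signature files that accompany the source zip should be copied alongside it; their exact extensions depend on how the artifacts were generated (`.asc` and `.sha512` below are an assumption):
+
+```
+cp apache-carbondata-x.x.x-source-release.zip.asc x.x.x
+cp apache-carbondata-x.x.x-source-release.zip.sha512 x.x.x
+```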
+
+4. Add and commit the files:
+
+```
+svn add x.x.x
+svn commit
+```
+
+5. Verify the files are 
[present](https://dist.apache.org/repos/dist/dev/carbondata).
+
+### Propose a pull request for website updates
+
+The final step of building a release candidate is to propose a website pull 
request.
+
+This pull request should update the following pages with the new release:
+
+* `src/main/webapp/index.html`
+* `src/main/webapp/docs/latest/mainpage.html`
+
+_Checklist to proceed to the next step:_
+
+1. Maven artifacts deployed to the staging repository of 
+[repository.apache.org](https://repository.apache.org)
+2. Source distribution deployed to the dev repository of
+[dist.apache.org](https://dist.apache.org/repos/dist/dev/carbondata/)
+3. Website pull request to list the release.
+
+## Vote on the release candidate
+
+Once you have built and individually reviewed the release candidate, please 
share it for the 
+community-wide review. Please review foundation-wide [voting 
guidelines](http://www.apache.org/foundation/voting.html)
+for more information.
+
+Start the review-and-vote thread on the dev@ mailing list. Here's an email 
template; please 
+adjust as you see fit:
+
+```
+From: Release Manager
+To: d...@carbondata.apache.org
+Subject: [VOTE] Apache CarbonData Release x.x.x
+
+Hi everyone,
+Please review and vote on the release candidate for the version x.x.x, as 
follows:
+
+[ ] +1, Approve the release
+[ ] -1, Do not approve the release (please provide specific comments)
+
+The complete staging area is available for your review, which includes:
+* JIRA release notes [1],
+* the official Apache source release to be deployed to dist.apache.org [2], 
which is signed with the key with fingerprint FFFFFFFF [3],
+* all artifacts to be deployed to the Maven Central Repository [4],
+* source code tag "x.x.x" [5],
+* website pull request listing the release [6].
+
+The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.
+
+Thanks,
+Release Manager
+
+[1] link
+[2] link
+[3] https://dist.apache.org/repos/dist/release/carbondata/KEYS
+[4] link
+[5] link
+[6] link
+```
+
+If there are any issues found in the release candidate, reply on the vote 
thread to cancel the vote.
+There’s no need to wait 72 hours. Proceed to the `Cancel a Release (Fix 
Issues)` step below and 
+address the problem.
+However, some issues don’t require cancellation.
+For example, if an issue is found in the website pull request, just correct it 
on the spot and the
+vote can continue as-is.
+
+If there are no issues, reply on the vote thread to close the voting. Then, 
tally the votes in a
+separate email. Here’s an email template; please adjust as you see fit.
+
+```
+From: Release Manager
+To: d...@carbondata.apache.org
+Subject: [RESULT][VOTE] Apache CarbonData Release x.x.x
+
+I'm happy to announce that we have unanimously approved this release.
+
+There are XXX approving votes, XXX of which are binding:
+* approver 1
+* approver 2
+* approver 3
+* approver 4
+
+There are no disapproving votes.
+
+Thanks everyone!
+```
+
+While in incubation, the Apache Incubator PMC must also vote on each release, 
using the same 
+process as above. Start the review and vote thread on the 
`gene...@incubator.apache.org` list.
+
+
+_Checklist to proceed to the final step:_
+
+1. Community votes to release the proposed release
+2. While in incubation, Apache Incubator PMC votes to release the proposed 
release
+
+## Cancel a Release (Fix Issues)
+
+Any issue identified during the community review and vote should be fixed in 
this step.
+
+To fully cancel a vote:
+
+* Cancel the current release and verify the version is back to the correct 
SNAPSHOT:
+
+```
+mvn release:rollback
+```
+
+* Drop the release tag:
+
+```
+git tag -d x.x.x
+git push --delete apache x.x.x
+```
+
+* Drop the staging repository on Nexus 
([repository.apache.org](https://repository.apache.org))
+
+
+Verify the version is back to the correct SNAPSHOT.
+
+Code changes should be proposed as standard pull requests and merged.
+
+Once all issues have been resolved, you should go back and build a new release 
candidate with 
+these changes.
+
+## Finalize the release
+
+Once the release candidate has been reviewed and approved by the community, 
the release should be
+ finalized. This involves the final deployment of the release to the release 
repositories, 
+ merging the website changes, and announcing the release.
+ 
+### Deploy artifacts to Maven Central repository
+
+On Nexus, release the staged artifacts to Maven Central repository. In the 
`Staging Repositories`
+ section, find the relevant release candidate `orgapachecarbondata-XXX` entry 
and click `Release`.
+
+### Deploy source release to dist.apache.org
+
+Copy the source release from the `dev` repository to the `release` repository at `dist.apache.org` using Subversion.
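+
+For example, a server-side copy between the two repositories (version number and commit message illustrative):
+
+```
+svn cp -m "Apache CarbonData x.x.x release" \
+  https://dist.apache.org/repos/dist/dev/carbondata/x.x.x \
+  https://dist.apache.org/repos/dist/release/carbondata/x.x.x
+```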
+
+### Merge website pull request
+
+Merge the website pull request to list the release created earlier.
+
+### Mark the version as released in Jira
+
+In Jira, inside [version 
management](https://issues.apache.org/jira/plugins/servlet/project-config/CARBONDATA/versions)
+, hover over the current release and a settings menu will appear. Click `Release`, and select today's date.
+
+_Checklist to proceed to the next step:_
+
+1. Maven artifacts released and indexed in the
+ [Maven Central 
repository](https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.carbondata%22)
+2. Source distribution available in the release repository of
+ [dist.apache.org](https://dist.apache.org/repos/dist/release/carbondata/)
+3. Website pull request to list the release merged
+4. Release version finalized in Jira
+
+## Promote the release
+
+Once the release has been finalized, the last step of the process is to 
promote the release 
+within the project and beyond.
+
+### Apache mailing lists
+
+Announce on the dev@ mailing list that the release has been finished.
+ 
+Announce on the user@ mailing list that the release is available, listing 
major improvements and 
+contributions.
+
+While in incubation, announce the release on the Incubator's general@ mailing 
list.
+
+_Checklist to declare the process completed:_
+
+1. Release announced on the user@ mailing list.
+2. Release announced on the Incubator's general@ mailing list.
+3. Completion declared on the dev@ mailing list.
+
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__release').addClass('selected'); });
+</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/s3-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/s3-guide.md b/src/site/markdown/s3-guide.md
index 2f4dfa9..37f157c 100644
--- a/src/site/markdown/s3-guide.md
+++ b/src/site/markdown/s3-guide.md
@@ -15,7 +15,7 @@
     limitations under the License.
 -->
 
-#S3 Guide (Alpha Feature 1.4.1)
+# S3 Guide (Alpha Feature 1.4.1)
 
 Object storage is the recommended storage format in cloud as it can support 
storing large data 
 files. S3 APIs are widely used for accessing object stores. This can be 
@@ -26,7 +26,7 @@ data and the data can be accessed from anywhere at any time.
 Carbondata can support any Object Storage that conforms to Amazon S3 API.
 Carbondata relies on Hadoop provided S3 filesystem APIs to access Object 
stores.
 
-#Writing to Object Storage
+# Writing to Object Storage
 
 To store carbondata files onto Object Store, `carbon.storelocation` property 
will have 
 to be configured with Object Store path in CarbonProperties file. 
@@ -46,9 +46,9 @@ For example:
 CREATE TABLE IF NOT EXISTS db1.table1(col1 string, col2 int) STORED AS 
carbondata LOCATION 's3a://mybucket/carbonstore'
 ``` 
 
-For more details on create table, Refer 
[data-management-on-carbondata](./data-management-on-carbondata.md#create-table)
+For more details on create table, Refer [DDL of 
CarbonData](ddl-of-carbondata.md#create-table)
 
-#Authentication
+# Authentication
 
 Authentication properties will have to be configured to store the carbondata 
files on to S3 location. 
 
@@ -80,12 +80,15 @@ 
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", "123")
 sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.access.key","456")
 ```
 
-#Recommendations
+# Recommendations
 
 1. Object Storage like S3 does not support file leasing mechanism(supported by 
HDFS) that is 
 required to take locks which ensure consistency between concurrent operations 
therefore, it is 
-recommended to set the configurable lock path 
property([carbon.lock.path](https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md#miscellaneous-configuration))
+recommended to set the configurable lock path 
property([carbon.lock.path](./configuration-parameters.md#system-configuration))
  to a HDFS directory.
-2. Concurrent data manipulation operations are not supported. Object stores 
follow eventual 
-consistency semantics, i.e., any put request might take some time to reflect 
when trying to list
-.This behaviour causes not to ensure the data read is always consistent or 
latest.
+2. Concurrent data manipulation operations are not supported. Object stores follow eventual consistency semantics, i.e., any put request might take some time to reflect when trying to list. Because of this behaviour, the data read may not always be consistent or up to date.
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__s3').addClass('selected'); });
+</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/44eed099/src/site/markdown/sdk-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/sdk-guide.md b/src/site/markdown/sdk-guide.md
index e592aa5..66f3d61 100644
--- a/src/site/markdown/sdk-guide.md
+++ b/src/site/markdown/sdk-guide.md
@@ -351,6 +351,25 @@ public CarbonWriterBuilder withLoadOptions(Map<String, 
String> options);
 
 ```
 /**
+ * To support the table properties for sdk writer
+ *
+ * @param options key,value pair of create table properties.
+ * supported keys values are
+ * a. blocksize -- [1-2048] values in MB. Default value is 1024
+ * b. blockletsize -- values in MB. Default value is 64 MB
+ * c. localDictionaryThreshold -- positive value, default is 10000
+ * d. enableLocalDictionary -- true / false. Default is false
+ * e. sortcolumns -- comma separated columns, e.g. "c1,c2". By default all dimensions are sorted.
+ *
+ * @return updated CarbonWriterBuilder
+ */
+public CarbonWriterBuilder withTableProperties(Map<String, String> options);
+```
+
+
+```
+/**
+* This writer is not thread safe; use buildThreadSafeWriterForCSVInput in a multi-threaded environment
 * Build a {@link CarbonWriter}, which accepts row in CSV format object
 * @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
 * @return CSVCarbonWriter
@@ -360,8 +379,24 @@ public CarbonWriterBuilder withLoadOptions(Map<String, 
String> options);
 public CarbonWriter 
buildWriterForCSVInput(org.apache.carbondata.sdk.file.Schema schema) throws 
IOException, InvalidLoadOptionException;
 ```
 
+```
+/**
+* This writer can be used in a multi-threaded environment.
+* Build a {@link CarbonWriter}, which accepts row in CSV format
+* @param schema carbon Schema object {org.apache.carbondata.sdk.file.Schema}
+* @param numOfThreads number of threads in which .write will be called concurrently.
+* @return CSVCarbonWriter
+* @throws IOException
+* @throws InvalidLoadOptionException
+*/
+public CarbonWriter buildThreadSafeWriterForCSVInput(Schema schema, short 
numOfThreads)
+  throws IOException, InvalidLoadOptionException;
+```
+
+
 ```  
 /**
+* This writer is not thread safe; use buildThreadSafeWriterForAvroInput in a multi-threaded environment
 * Build a {@link CarbonWriter}, which accepts Avro format object
 * @param avroSchema avro Schema object {org.apache.avro.Schema}
 * @return AvroCarbonWriter 
@@ -373,6 +408,22 @@ public CarbonWriter 
buildWriterForAvroInput(org.apache.avro.Schema schema) throw
 
 ```
 /**
+* This writer can be used in a multi-threaded environment.
+* Build a {@link CarbonWriter}, which accepts Avro object
+* @param avroSchema avro Schema object {org.apache.avro.Schema}
+* @param numOfThreads number of threads in which .write will be called concurrently.
+* @return AvroCarbonWriter
+* @throws IOException
+* @throws InvalidLoadOptionException
+*/
+public CarbonWriter buildThreadSafeWriterForAvroInput(org.apache.avro.Schema 
avroSchema, short numOfThreads)
+  throws IOException, InvalidLoadOptionException
+```
+
+
+```
+/**
+* This writer is not thread safe; use buildThreadSafeWriterForJsonInput in a multi-threaded environment
 * Build a {@link CarbonWriter}, which accepts Json object
 * @param carbonSchema carbon Schema object
 * @return JsonCarbonWriter
@@ -382,6 +433,19 @@ public CarbonWriter 
buildWriterForAvroInput(org.apache.avro.Schema schema) throw
 public JsonCarbonWriter buildWriterForJsonInput(Schema carbonSchema);
 ```
 
+```
+/**
+* This writer can be used in a multi-threaded environment.
+* Build a {@link CarbonWriter}, which accepts Json object
+* @param carbonSchema carbon Schema object
+* @param numOfThreads number of threads in which .write will be called concurrently.
+* @return JsonCarbonWriter
+* @throws IOException
+* @throws InvalidLoadOptionException
+*/
+public JsonCarbonWriter buildThreadSafeWriterForJsonInput(Schema carbonSchema, 
short numOfThreads)
+```
+
 ### Class org.apache.carbondata.sdk.file.CarbonWriter
 ```
 /**
@@ -390,7 +454,7 @@ public JsonCarbonWriter buildWriterForJsonInput(Schema 
carbonSchema);
 *                      which is one row of data.
 * If CSVCarbonWriter, object is of type String[], which is one row of data
 * If JsonCarbonWriter, object is of type String, which is one row of json
-* Note: This API is not thread safe
+* Note: This API is not thread safe unless the writer is built with the number-of-threads argument.
 * @param object
 * @throws IOException
 */
@@ -678,7 +742,6 @@ Find example code at 
[CarbonReaderExample](https://github.com/apache/carbondata/
    *
    * @param dataFilePath complete path including carbondata file name
    * @return Schema object
-   * @throws IOException
    */
   public static Schema readSchemaInDataFile(String dataFilePath);
 ```
@@ -802,4 +865,10 @@ public String getProperty(String key);
 */
 public String getProperty(String key, String defaultValue);
 ```
-Reference : [list of carbon 
properties](http://carbondata.apache.org/configuration-parameters.html)
+Reference : [list of carbon properties](./configuration-parameters.md)
+
+
+<script>
+// Show selected style on nav item
+$(function() { $('.b-nav__api').addClass('selected'); });
+</script>
