[GitHub] carbondata pull request #2965: [Documentation] Editorial review

2018-11-29 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2965

[Documentation] Editorial review

Corrected spelling and grammar mistakes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata DTS

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2965


commit 2dd9603aece16a53f781265ebc0db6cd482a4d5f
Author: sgururajshetty 
Date:   2018-11-29T13:14:22Z

Spelling mistakes corrected




---


[GitHub] carbondata issue #2805: [Documentation] Local dictionary Data which are not ...

2018-11-05 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2805
  
@sraghunandan kindly review


---


[GitHub] carbondata issue #2805: [Documentation] Local dictionary Data which are not ...

2018-11-05 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2805
  
@sraghunandan kindly review and help me to merge my changes


---


[GitHub] carbondata pull request #2888: [CARBONDATA-3066]add documentation for writte...

2018-10-31 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2888#discussion_r229732059
  
--- Diff: docs/sdk-guide.md ---
@@ -124,7 +124,7 @@ public class TestSdkAvro {
 try {
   CarbonWriter writer = CarbonWriter.builder()
   .outputPath(path)
-  .withAvroInput(new 
org.apache.avro.Schema.Parser().parse(avroSchema)).build();
+  .withAvroInput(new 
org.apache.avro.Schema.Parser().parse(avroSchema))..writtenBy("SDK").build();
--- End diff --

There are two consecutive dots (..) before `writtenBy`. Please remove one.
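
For reference, a minimal sketch (Scala, mirroring the Java snippet in the diff) of the corrected chain, assuming the `path` and `avroSchema` values defined earlier in the sdk-guide example:

```scala
// Sketch only: a single dot between each chained builder call.
// Assumes `path` and `avroSchema` are defined as in the sdk-guide example.
import org.apache.carbondata.sdk.file.CarbonWriter

val writer = CarbonWriter.builder()
  .outputPath(path)
  .withAvroInput(new org.apache.avro.Schema.Parser().parse(avroSchema))
  .writtenBy("SDK")
  .build()
```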


---


[GitHub] carbondata issue #2805: [Documentation] Local dictionary Data which are not ...

2018-10-27 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2805
  
@sraghunandan kindly review and help me to merge my changes


---


[GitHub] carbondata pull request #2805: Local dictionary Data which are not supported...

2018-10-09 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2805

Local dictionary: the note is updated to state that float and byte data types are not 
supported



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata Data_not_supported

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2805


commit 335ae173c72657852539caaecce3437c70e1e07a
Author: sgururajshetty 
Date:   2018-10-09T11:04:20Z

Note updated to state that float and byte data types are not supported




---


[GitHub] carbondata issue #2788: [Documentation] Readme updated with latest topics an...

2018-10-05 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2788
  
@sraghunandan kindly review the PR and help me to merge the same


---


[GitHub] carbondata pull request #2788: [Documentation] Readme updated with latest to...

2018-10-04 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2788#discussion_r222666063
  
--- Diff: docs/carbon-as-spark-datasource-guide.md ---
@@ -41,25 +42,23 @@ Carbon table can be created with spark's datasource DDL 
syntax as follows.
 
 | Property | Default Value | Description |
 |---|--||
-| table_blocksize | 1024 | Size of blocks to write onto hdfs |
+| table_blocksize | 1024 | Size of blocks to write onto hdfs. For  more 
details, see [Table Block Size 
Configuration](./ddl-of-carbondata.md#table-block-size-configuration) |
--- End diff --

Added reference to all properties 
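
For context, a hedged sketch of how the linked property could be supplied through the Spark datasource DDL this guide describes (table and column names are made up; see ddl-of-carbondata.md for the authoritative syntax):

```scala
// Illustrative only: table_blocksize is the block size in MB written to HDFS.
// The table and columns below are invented for the example.
spark.sql(
  """
    | CREATE TABLE IF NOT EXISTS sales_carbon (id INT, name STRING)
    | USING carbondata
    | OPTIONS ('table_blocksize'='512')
  """.stripMargin)
```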


---


[GitHub] carbondata issue #2788: [Documentation] Readme updated with latest topics an...

2018-10-03 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2788
  
@sraghunandan & @kunal642 kindly review and merge the doc


---


[GitHub] carbondata issue #2788: [Documentation] Readme updated with latest topics an...

2018-09-28 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2788
  
@sraghunandan review 


---


[GitHub] carbondata pull request #2788: [Documentation] Readme updated with latest to...

2018-09-28 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2788

[Documentation] Readme updated with latest topics and new TOC

> Readme updated with the new structure
> Formatting issue fixed
> Review comments fixed

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata doc_sept

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2788.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2788


commit c8bc47ec164e43736d6f5b39b7c883d2b11bd7f7
Author: sgururajshetty 
Date:   2018-09-28T13:43:08Z

Readme updated with latest topics and new TOC
Formatting issues fixed




---


[GitHub] carbondata issue #2766: [WIP] Added documentation for fallback condition for...

2018-09-26 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2766
  
LGTM


---


[GitHub] carbondata issue #2757: [DOC] Add spark carbon file format documentation

2018-09-26 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2757
  
LGTM


---


[GitHub] carbondata issue #2744: [CARBONDATA-2957][doc] update doc for supporting com...

2018-09-26 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2744
  
LGTM


---


[GitHub] carbondata issue #2756: [CARBONDATA-2966]Update Documentation For Avro DataT...

2018-09-26 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2756
  
LGTM


---


[GitHub] carbondata pull request #2756: [CARBONDATA-2966]Update Documentation For Avr...

2018-09-25 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2756#discussion_r220428630
  
--- Diff: docs/sdk-guide.md ---
@@ -181,22 +181,31 @@ public class TestSdkJson {
 ```
 
 ## Datatypes Mapping
-Each of SQL data types are mapped into data types of SDK. Following are 
the mapping:
+Each of SQL data types and Avro Data Types are mapped into data types of 
SDK. Following are the mapping:
 
-| SQL DataTypes | Mapped SDK DataTypes |
+| SQL DataTypes | Avro DataTypes | Mapped SDK DataTypes |
 |---|--|
--- End diff --

Check the formatting 


---


[GitHub] carbondata pull request #2756: [CARBONDATA-2966]Update Documentation For Avr...

2018-09-25 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2756#discussion_r220427375
  
--- Diff: docs/configuration-parameters.md ---
@@ -42,6 +42,7 @@ This section provides the details of all the 
configurations required for the Car
 | carbon.lock.type | LOCALLOCK | This configuration specifies the type of 
lock to be acquired during concurrent operations on table. There are following 
types of lock implementation: - LOCALLOCK: Lock is created on local file system 
as file. This lock is useful when only one spark driver (thrift server) runs on 
a machine and no other CarbonData spark application is launched concurrently. - 
HDFSLOCK: Lock is created on HDFS file system as file. This lock is useful when 
multiple CarbonData spark applications are launched and no ZooKeeper is running 
on cluster and HDFS supports file based locking. |
 | carbon.lock.path | TABLEPATH | This configuration specifies the path 
where lock files have to be created. Recommended to configure zookeeper lock 
type or configure HDFS lock path(to this property) in case of S3 file system as 
locking is not feasible on S3. |
 | carbon.unsafe.working.memory.in.mb | 512 | CarbonData supports storing 
data in off-heap memory for certain operations during data loading and 
query.This helps to avoid the Java GC and thereby improve the overall 
performance.The Minimum value recommeded is 512MB.Any value below this is reset 
to default value of 512MB.**NOTE:** The below formulas explain how to arrive at 
the off-heap size required.Memory Required For Data 
Loading:(*carbon.number.of.cores.while.loading*) * (Number of tables to 
load in parallel) * (*offheap.sort.chunk.size.inmb* + 
*carbon.blockletgroup.size.in.mb* + *carbon.blockletgroup.size.in.mb*/3.5 ). 
Memory required for Query:SPARK_EXECUTOR_INSTANCES * 
(*carbon.blockletgroup.size.in.mb* + *carbon.blockletgroup.size.in.mb* * 3.5) * 
spark.executor.cores |
+| carbon.unsafe.driver.working.memory.in.mb | 60% of JVM Heap Memory | 
CarbonData supports storing data in unsafe on-heap memory in driver for certain 
operations like insert into, query for loading datamap cache. The Minimum value 
recommended is 512MB. |
--- End diff --

The parameter description should answer the following questions:
a.  What does this parameter do?
b.  In what scenario does the user need to configure this parameter?
c.  Are there any benefits in configuring this parameter?
d.  What is the default value?
e.  What is the value range, if any?
f.  Are there any limitations?
g.  Any key information to be highlighted?
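
To make the memory formulas quoted in the diff above concrete, a purely illustrative calculation follows (the plugged-in values are examples, not defaults or recommendations):

```scala
// Illustrative arithmetic only; the numbers are example settings.
val coresWhileLoading  = 2      // carbon.number.of.cores.while.loading
val tablesInParallel   = 1      // number of tables loaded in parallel
val offheapSortChunkMb = 64.0   // offheap.sort.chunk.size.inmb
val blockletGroupMb    = 64.0   // carbon.blockletgroup.size.in.mb

// Memory required for data loading: about 292 MB with these example values.
val loadingMb = coresWhileLoading * tablesInParallel *
  (offheapSortChunkMb + blockletGroupMb + blockletGroupMb / 3.5)

// Memory required for query: 2 * (64 + 64 * 3.5) * 2 = 1152 MB.
val executorInstances = 2       // SPARK_EXECUTOR_INSTANCES
val executorCores     = 2       // spark.executor.cores
val queryMb = executorInstances * (blockletGroupMb + blockletGroupMb * 3.5) * executorCores
```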



---


[GitHub] carbondata pull request #2756: [CARBONDATA-2966]Update Documentation For Avr...

2018-09-25 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2756#discussion_r220427408
  
--- Diff: docs/configuration-parameters.md ---
@@ -42,6 +42,7 @@ This section provides the details of all the 
configurations required for the Car
 | carbon.lock.type | LOCALLOCK | This configuration specifies the type of 
lock to be acquired during concurrent operations on table. There are following 
types of lock implementation: - LOCALLOCK: Lock is created on local file system 
as file. This lock is useful when only one spark driver (thrift server) runs on 
a machine and no other CarbonData spark application is launched concurrently. - 
HDFSLOCK: Lock is created on HDFS file system as file. This lock is useful when 
multiple CarbonData spark applications are launched and no ZooKeeper is running 
on cluster and HDFS supports file based locking. |
 | carbon.lock.path | TABLEPATH | This configuration specifies the path 
where lock files have to be created. Recommended to configure zookeeper lock 
type or configure HDFS lock path(to this property) in case of S3 file system as 
locking is not feasible on S3. |
 | carbon.unsafe.working.memory.in.mb | 512 | CarbonData supports storing 
data in off-heap memory for certain operations during data loading and 
query.This helps to avoid the Java GC and thereby improve the overall 
performance.The Minimum value recommeded is 512MB.Any value below this is reset 
to default value of 512MB.**NOTE:** The below formulas explain how to arrive at 
the off-heap size required.Memory Required For Data 
Loading:(*carbon.number.of.cores.while.loading*) * (Number of tables to 
load in parallel) * (*offheap.sort.chunk.size.inmb* + 
*carbon.blockletgroup.size.in.mb* + *carbon.blockletgroup.size.in.mb*/3.5 ). 
Memory required for Query:SPARK_EXECUTOR_INSTANCES * 
(*carbon.blockletgroup.size.in.mb* + *carbon.blockletgroup.size.in.mb* * 3.5) * 
spark.executor.cores |
+| carbon.unsafe.driver.working.memory.in.mb | 60% of JVM Heap Memory | 
CarbonData supports storing data in unsafe on-heap memory in driver for certain 
operations like insert into, query for loading datamap cache. The Minimum value 
recommended is 512MB. |
--- End diff --

Kindly follow the same for all the parameter description
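
As a rough illustration of where such a value would be set, a hedged sketch using CarbonProperties (the 512 MB shown is just the documented minimum, not a recommendation):

```scala
import org.apache.carbondata.core.util.CarbonProperties

// Sketch only: configure the driver-side unsafe working memory before the
// CarbonSession is created. 512 is the documented minimum, in MB.
CarbonProperties.getInstance()
  .addProperty("carbon.unsafe.driver.working.memory.in.mb", "512")
```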


---


[GitHub] carbondata issue #2753: [CARBONDATA-2964] Fix for unsupported float data typ...

2018-09-25 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2753
  
LGTM



---


[GitHub] carbondata pull request #2643: Formatting fix s3

2018-08-17 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2643

Formatting fix s3

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata formattingFix_S3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2643


commit 529f80dda6db3ce34e0baf766b03a9a13190b286
Author: sgururajshetty 
Date:   2018-07-25T12:44:07Z

Documentation for support for COLUMN_META_CACHE in create table and alter 
table properties

commit d816aaa7a89155b3579906f960ed6a0ba4d4a59f
Author: sgururajshetty 
Date:   2018-07-25T12:48:43Z

Documentation to support for CACHE_LEVEL in create table and alter table 
properties

commit 8ac243f8e9cff8359b6064352deb823eda7b9835
Author: sgururajshetty 
Date:   2018-07-25T13:24:52Z

Review  comment fixed

commit 98501d35cfd110bcb9e75eb02628f3bce0c0f4ab
Author: sgururajshetty 
Date:   2018-07-25T13:26:58Z

review comment fixed

commit 62caf822cbcde1e519501c1d5db3c5cfc05fbd63
Author: Indhumathi27 
Date:   2018-07-21T10:46:21Z

[CARBONDATA-2606]Fix Complex array Pushdown and block auto merge compaction

1.Check for if Complex Column contains ArrayType at n levels and add parent 
to projection if contains array.
2.Block Auto merge compaction for table containing complex datatype columns.
3.Fix Decimal Datatype scale and precision with two level struct type
4.Fix Dictionary Include for ComplexDataType
- If other complex columns other than first complex column is given in 
dictionary include, then its insertion fails.
5.Fix BadRecord and dateformat for Complex primitive type-DATE

This closes #2535

commit d287a102b5c96e54261ac00c77038a1a56161fe9
Author: kumarvishal09 
Date:   2018-07-24T14:40:54Z

[CARBONDATA-2779]Fixed filter query issue in case of V1/v2 format store

Problem:
Filter query is failing for V1/V2 carbondata store

Root Cause:
In the V1 store, measure min/max was not added to the block min/max index in the executor; 
when a filter is applied, min/max pruning fails with an array index out of bound exception.

Solution:
Add min/max for measure columns, the same as is already handled in driver block pruning.

This closes #2550

commit b08745f68624ff066e0b23a41ce12d4a99618ac5
Author: Manhua 
Date:   2018-07-25T08:51:49Z

[CARBONDATA-2783][BloomDataMap][Doc] Update document for bloom filter 
datamap

add example for enable/disable datamap

This closes #2554

commit 964d26866468df6be130e9d65d339439cb4cf3b0
Author: praveenmeenakshi56 
Date:   2018-07-25T15:31:37Z

[CARBONDATA-2750] Added Documentation for Local Dictionary Support

Added Documentation for Local Dictionary Support

This closes #2520

commit 1fa9f64d70123d0bc988427a34c0750283f5daae
Author: BJangir 
Date:   2018-07-23T16:44:12Z

[CARBONDATA-2772] Size based dictionary fallback is failing even threshold 
is not reached.

Issue:- Size-based fallback happened even though the threshold was not reached.
Root cause:- The current size calculation is wrong; it is calculated for each data value 
instead of for the generated dictionary data.

Solution :- The current size should be calculated only for the generated dictionary data.

This closes #2542

commit eae5817e56a20aecb7694c8d387dbb05b96e1045
Author: kunal642 
Date:   2018-07-24T10:42:54Z

[CARBONDATA-2778]Fixed bug when select after delete and cleanup is showing 
empty records

Problem: During a delete operation, the data being deleted can lead to a state where one 
complete block's data gets deleted. In that case the status of that block is marked for 
delete, and during the next delete operation run the block is deleted along with its 
carbonIndex file. The problem arises due to deletion of the carbonIndex file, because for 
multiple blocks there can be one carbonIndex file, as one carbonIndex file represents one task

[GitHub] carbondata pull request #2603: [Documentation] Editorial review comment fixe...

2018-08-03 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2603#discussion_r207516087
  
--- Diff: docs/configuration-parameters.md ---
@@ -140,7 +140,7 @@ This section provides the details of all the 
configurations required for CarbonD
 | carbon.enableMinMax | true | Min max is feature added to enhance query 
performance. To disable this feature, set it false. |
 | carbon.dynamicallocation.schedulertimeout | 5 | Specifies the maximum 
time (unit in seconds) the scheduler can wait for executor to be active. 
Minimum value is 5 sec and maximum value is 15 sec. |
 | carbon.scheduler.minregisteredresourcesratio | 0.8 | Specifies the 
minimum resource (executor) ratio needed for starting the block distribution. 
The default value is 0.8, which indicates 80% of the requested resource is 
allocated for starting block distribution.  The minimum value is 0.1 min and 
the maximum value is 1.0. | 
-| carbon.search.enabled | false | If set to true, it will use CarbonReader 
to do distributed scan directly instead of using compute framework like spark, 
thus avoiding limitation of compute framework like SQL optimizer and task 
scheduling overhead. |
+| carbon.search.enabled (Alpha Feature) | false | If set to true, it will 
use CarbonReader to do distributed scan directly instead of using compute 
framework like spark, thus avoiding limitation of compute framework like SQL 
optimizer and task scheduling overhead. |
 
 * **Global Dictionary Configurations**
--- End diff --

This issue is handled in a different PR #2576


---


[GitHub] carbondata pull request #2603: [Documentation] Editorial review comment fixe...

2018-08-03 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2603#discussion_r207516006
  
--- Diff: docs/configuration-parameters.md ---
@@ -140,7 +140,7 @@ This section provides the details of all the 
configurations required for CarbonD
 | carbon.enableMinMax | true | Min max is feature added to enhance query 
performance. To disable this feature, set it false. |
 | carbon.dynamicallocation.schedulertimeout | 5 | Specifies the maximum 
time (unit in seconds) the scheduler can wait for executor to be active. 
Minimum value is 5 sec and maximum value is 15 sec. |
 | carbon.scheduler.minregisteredresourcesratio | 0.8 | Specifies the 
minimum resource (executor) ratio needed for starting the block distribution. 
The default value is 0.8, which indicates 80% of the requested resource is 
allocated for starting block distribution.  The minimum value is 0.1 min and 
the maximum value is 1.0. | 
-| carbon.search.enabled | false | If set to true, it will use CarbonReader 
to do distributed scan directly instead of using compute framework like spark, 
thus avoiding limitation of compute framework like SQL optimizer and task 
scheduling overhead. |
+| carbon.search.enabled (Alpha Feature) | false | If set to true, it will 
use CarbonReader to do distributed scan directly instead of using compute 
framework like spark, thus avoiding limitation of compute framework like SQL 
optimizer and task scheduling overhead. |
 
 * **Global Dictionary Configurations**
--- End diff --

The minimum value need not be mentioned now


---


[GitHub] carbondata issue #2576: [CARBONDATA-2795] Add documentation for S3

2018-08-02 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2576
  
LGTM


---


[GitHub] carbondata pull request #2576: [CARBONDATA-2795] Add documentation for S3

2018-08-02 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2576#discussion_r207223686
  
--- Diff: docs/configuration-parameters.md ---
@@ -106,7 +106,12 @@ This section provides the details of all the 
configurations required for CarbonD
 
|-|--|-|
 | carbon.sort.file.write.buffer.size | 16384 | File write buffer size used 
during sorting. Minimum allowed buffer size is 10240 byte and Maximum allowed 
buffer size is 10485760 byte. |
 | carbon.lock.type | LOCALLOCK | This configuration specifies the type of 
lock to be acquired during concurrent operations on table. There are following 
types of lock implementation: - LOCALLOCK: Lock is created on local file system 
as file. This lock is useful when only one spark driver (thrift server) runs on 
a machine and no other CarbonData spark application is launched concurrently. - 
HDFSLOCK: Lock is created on HDFS file system as file. This lock is useful when 
multiple CarbonData spark applications are launched and no ZooKeeper is running 
on cluster and HDFS supports file based locking. |
-| carbon.lock.path | TABLEPATH | This configuration specifies the path 
where lock files have to be created. Recommended to configure zookeeper lock 
type or configure HDFS lock path(to this property) in case of S3 file system as 
locking is not feasible on S3.
+| carbon.lock.path | TABLEPATH | Locks on the files are used to prevent 
concurrent operation from modifying the same files. This 
+configuration specifies the path where lock files have to be created. 
Recommended to configure 
+HDFS lock path(to this property) in case of S3 file system as locking is 
not feasible on S3. 
+**Note:** If this property is not set to HDFS location for S3 store, then 
there is a possibility 
+of data corruption because multiple data manipulation calls might try to 
update the status file 
+and as lock is not acquired before updation data might get overwritten.
--- End diff --

Since it is a table, end the line with a pipe (|).
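
To illustrate the recommendation in the diff above, a hedged sketch of pointing the lock files at HDFS for an S3 store (the HDFS path is a placeholder, not a documented default):

```scala
import org.apache.carbondata.core.util.CarbonProperties

// Sketch only: with a table stored on S3, keep lock files on HDFS so that
// concurrent operations do not overwrite the table status file.
val props = CarbonProperties.getInstance()
props.addProperty("carbon.lock.type", "HDFSLOCK")
props.addProperty("carbon.lock.path", "hdfs://namenode:8020/user/carbon/locks")
```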



---


[GitHub] carbondata pull request #2603: [Documentation] Editorial review comment fixe...

2018-08-02 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2603

[Documentation] Editorial review comment fixed

Minor issues fixed (spelling, syntax, and missing info)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata editorial_review1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2603.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2603


commit 529f80dda6db3ce34e0baf766b03a9a13190b286
Author: sgururajshetty 
Date:   2018-07-25T12:44:07Z

Documentation for support for COLUMN_META_CACHE in create table and alter 
table properties

commit d816aaa7a89155b3579906f960ed6a0ba4d4a59f
Author: sgururajshetty 
Date:   2018-07-25T12:48:43Z

Documentation to support for CACHE_LEVEL in create table and alter table 
properties

commit 8ac243f8e9cff8359b6064352deb823eda7b9835
Author: sgururajshetty 
Date:   2018-07-25T13:24:52Z

Review  comment fixed

commit 98501d35cfd110bcb9e75eb02628f3bce0c0f4ab
Author: sgururajshetty 
Date:   2018-07-25T13:26:58Z

review comment fixed

commit 62caf822cbcde1e519501c1d5db3c5cfc05fbd63
Author: Indhumathi27 
Date:   2018-07-21T10:46:21Z

[CARBONDATA-2606]Fix Complex array Pushdown and block auto merge compaction

1.Check for if Complex Column contains ArrayType at n levels and add parent 
to projection if contains array.
2.Block Auto merge compaction for table containing complex datatype columns.
3.Fix Decimal Datatype scale and precision with two level struct type
4.Fix Dictionary Include for ComplexDataType
- If other complex columns other than first complex column is given in 
dictionary include, then its insertion fails.
5.Fix BadRecord and dateformat for Complex primitive type-DATE

This closes #2535

commit d287a102b5c96e54261ac00c77038a1a56161fe9
Author: kumarvishal09 
Date:   2018-07-24T14:40:54Z

[CARBONDATA-2779]Fixed filter query issue in case of V1/v2 format store

Problem:
Filter query is failing for V1/V2 carbondata store

Root Cause:
In the V1 store, measure min/max was not added to the block min/max index in the executor; 
when a filter is applied, min/max pruning fails with an array index out of bound exception.

Solution:
Add min/max for measure columns, the same as is already handled in driver block pruning.

This closes #2550

commit b08745f68624ff066e0b23a41ce12d4a99618ac5
Author: Manhua 
Date:   2018-07-25T08:51:49Z

[CARBONDATA-2783][BloomDataMap][Doc] Update document for bloom filter 
datamap

add example for enable/disable datamap

This closes #2554

commit 964d26866468df6be130e9d65d339439cb4cf3b0
Author: praveenmeenakshi56 
Date:   2018-07-25T15:31:37Z

[CARBONDATA-2750] Added Documentation for Local Dictionary Support

Added Documentation for Local Dictionary Support

This closes #2520

commit 1fa9f64d70123d0bc988427a34c0750283f5daae
Author: BJangir 
Date:   2018-07-23T16:44:12Z

[CARBONDATA-2772] Size based dictionary fallback is failing even threshold 
is not reached.

Issue:- Size-based fallback happened even though the threshold was not reached.
Root cause:- The current size calculation is wrong; it is calculated for each data value 
instead of for the generated dictionary data.

Solution :- The current size should be calculated only for the generated dictionary data.

This closes #2542

commit eae5817e56a20aecb7694c8d387dbb05b96e1045
Author: kunal642 
Date:   2018-07-24T10:42:54Z

[CARBONDATA-2778]Fixed bug when select after delete and cleanup is showing 
empty records

Problem: During a delete operation, the data being deleted can lead to a state where one 
complete block's data gets deleted. In that case the status of that block is marked for 
delete, and during the next delete operation run the block is deleted along with its 
carbonIndex file. The problem arises due to deletion of the carbonIndex file, because for 
multiple blocks there can be one carbonIndex file, as one carbonIndex file represents one task.

Solution: Do not delete the carbondata and carbonIndex file. After 
compaction it will automatically take care of deleting the stale data and stale 
segments.

This closes #2548

commit 6d6874a11482a8aa79f2280f6572e84b5e3cbc93
Author: dhatchayani 
Date:   2018-07-25T09:11:58Z

[CARBONDATA-2753][Compatibility] Row count of page is calculated wrong for 
old store(V2 store)

Row count of page is calculated wrong for V2 store.

commit b6f5af6af96140876ec10ff09c3313d9b35ceb36
Author: Sssan520 
Date:   2018-07-25T11:36:00Z

[CARBONDATA-2782]delete dead code in class 'CarbonCleanFilesCommand'

The variables(dms、indexDms) in function processMetadata

[GitHub] carbondata pull request #2572: [CARBONDATA-2793][32k][Doc] Add 32k support i...

2018-07-29 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2572#discussion_r206017345
  
--- Diff: docs/data-management-on-carbondata.md ---
@@ -283,7 +283,29 @@ This tutorial is going to introduce all commands and 
data operations on CarbonDa
 ```
 ALTER TABLE employee SET TBLPROPERTIES 
(‘CACHE_LEVEL’=’Blocklet’)
 ```
-
+
+   - **String longer than 32000 characters**
--- End diff --

If it is an Alpha feature, then please mention it in brackets: (Alpha Feature 1.4.1)
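
For illustration, a hedged sketch of the DDL this section describes; the property name LONG_STRING_COLUMNS is assumed from the 32k feature and the table/column names are made up:

```scala
// Hypothetical example (Alpha Feature 1.4.1): allow a column to store strings
// longer than 32000 characters. LONG_STRING_COLUMNS is an assumed property name.
spark.sql(
  """
    | CREATE TABLE long_string_demo (id INT, description STRING)
    | STORED BY 'carbondata'
    | TBLPROPERTIES ('LONG_STRING_COLUMNS'='description')
  """.stripMargin)
```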


---


[GitHub] carbondata issue #2572: [CARBONDATA-2793][32k][Doc] Add 32k support in docum...

2018-07-29 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2572
  
LGTM


---


[GitHub] carbondata issue #2520: [CARBONDATA-2750] Added Documentation for Local Dict...

2018-07-25 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2520
  
LGTM



---


[GitHub] carbondata pull request #2558: [CARBONDATA-2648] Documentation for support f...

2018-07-25 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2558

[CARBONDATA-2648] Documentation for support for COLUMN_META_CACHE in create 
table and a…



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2558.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2558


commit 529f80dda6db3ce34e0baf766b03a9a13190b286
Author: sgururajshetty 
Date:   2018-07-25T12:44:07Z

Documentation for support for COLUMN_META_CACHE in create table and alter 
table properties
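
For readers of this thread, a hedged sketch of the kind of DDL the documentation covers (table and column names are invented; refer to the merged doc for exact semantics):

```scala
// Illustrative only: cache column metadata for selected columns at create
// time, then narrow the list later via ALTER TABLE.
spark.sql(
  """
    | CREATE TABLE cache_demo (id INT, name STRING, city STRING)
    | STORED BY 'carbondata'
    | TBLPROPERTIES ('COLUMN_META_CACHE'='name,city')
  """.stripMargin)

spark.sql("ALTER TABLE cache_demo SET TBLPROPERTIES ('COLUMN_META_CACHE'='name')")
```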




---


[GitHub] carbondata issue #2520: [CARBONDATA-2750] Added Documentation for Local Dict...

2018-07-20 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2520
  
LGTM


---


[GitHub] carbondata pull request #:

2018-07-19 Thread sgururajshetty
Github user sgururajshetty commented on the pull request:


https://github.com/apache/carbondata/commit/71048d7b7d2e4aa4a06536b45f2a33a4542f8b76#commitcomment-29761755
  
LGTM


---


[GitHub] carbondata pull request #2520: [CARBONDATA-2750] Added Documentation for Loc...

2018-07-18 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2520#discussion_r203281996
  
--- Diff: docs/data-management-on-carbondata.md ---
@@ -291,6 +330,11 @@ This tutorial is going to introduce all commands and 
data operations on CarbonDa
  ALTER TABLE carbon ADD COLUMNS (a1 INT, b1 STRING) 
TBLPROPERTIES('DEFAULT.VALUE.a1'='10')
  ```
 
+ Users can specify which columns to include and exclude for local 
dictionary generation after adding   new columns. These will be appended with 
the already   existing local dictionary include and exclude  columns of 
main table respectively.
--- End diff --

check the spacing between words
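
As a hedged illustration of the behaviour described in the diff above (the property name LOCAL_DICTIONARY_INCLUDE is assumed from the local dictionary feature; the table and columns come from the quoted example):

```scala
// Sketch only: the include list given here is appended to the main table's
// existing local dictionary include columns after the new columns are added.
spark.sql(
  """
    | ALTER TABLE carbon ADD COLUMNS (a1 INT, b1 STRING)
    | TBLPROPERTIES ('LOCAL_DICTIONARY_INCLUDE'='b1')
  """.stripMargin)
```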


---


[GitHub] carbondata pull request #2520: [CARBONDATA-2750] Added Documentation for Loc...

2018-07-18 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2520#discussion_r203277284
  
--- Diff: docs/data-management-on-carbondata.md ---
@@ -122,6 +122,45 @@ This tutorial is going to introduce all commands and 
data operations on CarbonDa
  TBLPROPERTIES ('streaming'='true')
  ```
 
+  - **Local Dictionary Configuration**
+  
+  Local Dictionary is generated only for no-dictionary string/varchar 
datatype columns. It helps in:
+  1. Getting more compression on dimension columns with less cardinality.
+  2. Filter queries and full scan queries on No-dictionary columns with 
local dictionary will be faster as filter will be done on encoded data.
+  3. Reducing the store size and memory footprint as only unique values 
will be stored as part of local dictionary and corresponding data will be 
stored as encoded data.
+
+   By default, Local Dictionary will be enabled and generated for all 
no-dictionary string/varchar datatype columns.
--- End diff --

Convert this into table

| Properties | Default Value | Description |

The **description** should satisfy the following points:
a.  What does this parameter do?
b.  In what scenario the user needs to configure this parameter?
c.  Are there any benefits in configuring this parameter?
d.  What is the default value?
e.  What is the value range if any?
f.  Are there any limitations?
g.  Any key information to be highlighted?
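
For readers following this thread, a hedged sketch of turning the default off at create time (the property name LOCAL_DICTIONARY_ENABLE is assumed from the local dictionary feature; the table is made up):

```scala
// Sketch only: local dictionary is enabled by default for no-dictionary
// string/varchar columns; this example disables it for one table.
spark.sql(
  """
    | CREATE TABLE local_dict_demo (id INT, name STRING)
    | STORED BY 'carbondata'
    | TBLPROPERTIES ('LOCAL_DICTIONARY_ENABLE'='false')
  """.stripMargin)
```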


---


[GitHub] carbondata issue #2502: [CARBONDATA-2738]Update documentation for Complex da...

2018-07-15 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2502
  
LGTM


---


[GitHub] carbondata issue #2361: [CARBONDATA-2577] [CARBONDATA-2579] Fixed issue in A...

2018-06-05 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2361
  
LGTM


---


[GitHub] carbondata issue #2356: [CARBONDATA-2566] Optimize CarbonReaderExample

2018-05-31 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2356
  
LGTM


---


[GitHub] carbondata pull request #2320: [Documentation] Editorial review comment fixe...

2018-05-18 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2320

[Documentation] Editorial review comment fixed

 Editorial review comment fixed

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata Editorial_Review

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2320


commit f8236ae2eb9e52d22f37a4b61967260071a50957
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-05-18T11:29:38Z

Editorial review comment fixed




---


[GitHub] carbondata issue #2274: [CARBONDATA-2440] doc updated to set the property fo...

2018-05-11 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2274
  
LGTM


---


[GitHub] carbondata issue #2296: [CARBONDATA-2369] updated the document about AVRO to...

2018-05-11 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2296
  
LGTM


---


[GitHub] carbondata issue #2258: [CARBONDATA-2424] Added documentation for properties...

2018-05-02 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2258
  
LGTM


---


[GitHub] carbondata issue #2199: [CARBONDATA-2370] Added document for presto multinod...

2018-05-02 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2199
  
LGTM


---


[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2220
  
LGTM


---


[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2198
  
LGTM


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183617096
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
--- End diff --

End all sentences with a period (.)


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183617702
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
+
+  For instance, main table called **sales** which is defined as 
+  
+  ```
+  CREATE TABLE datamap_test (
+name string,
+age int,
+city string,
+country string)
+  STORED BY 'carbondata'
+  ```
+  
+  User can create Lucene datamap using the Create DataMap DDL
+  
+  ```
+  CREATE DATAMAP dm
+  ON TABLE datamap_test
+  USING "lucene&quo

[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183616698
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
--- End diff --

Lucene DataMap can be created using following DDL:


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183615908
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
--- End diff --

These are procedure steps, so we can use a numbered list


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183618083
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
+
+  For instance, main table called **sales** which is defined as 
+  
+  ```
+  CREATE TABLE datamap_test (
+name string,
+age int,
+city string,
+country string)
+  STORED BY 'carbondata'
+  ```
+  
+  User can create Lucene datamap using the Create DataMap DDL
+  
+  ```
+  CREATE DATAMAP dm
+  ON TABLE datamap_test
+  USING "lucene&quo

[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183617201
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
+
+  For instance, main table called **sales** which is defined as 
--- End diff --

For instance, main table called **sales** which is defined as:


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183617996
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
+
+  For instance, main table called **sales** which is defined as 
+  
+  ```
+  CREATE TABLE datamap_test (
+name string,
+age int,
+city string,
+country string)
+  STORED BY 'carbondata'
+  ```
+  
+  User can create Lucene datamap using the Create DataMap DDL
+  
+  ```
+  CREATE DATAMAP dm
+  ON TABLE datamap_test
+  USING "lucene&quo

[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183616317
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
--- End diff --

Close all the sentences with a period (.). This is applicable to all the sentences in this topic.


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183617239
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
+  ```
+  DROP DATAMAP [IF EXISTS] datamap_name
+  ON TABLE main_table
+  ```
+To show all DataMaps created, use:
+  ```
+  SHOW DATAMAP 
+  ON TABLE main_table
+  ```
+It will show all DataMaps created on main table.
+
+
+## Lucene DataMap Introduction
+  Lucene datamap are created as index DataMaps and managed along with main 
tables by CarbonData.
+  User can create as many lucene datamaps required to improve query 
performance,
+  provided the storage requirements and loading speeds are acceptable.
+  
+  Once lucene datamaps are created, the indexes generated by lucene will 
be read for pruning till
+  row level for the filter query by launching a spark datamap job. This 
pruned data will be read to
+  give the proper and faster result
+
+  For instance, main table called **sales** which is defined as 
+  
+  ```
+  CREATE TABLE datamap_test (
+name string,
+age int,
+city string,
+country string)
+  STORED BY 'carbondata'
+  ```
+  
+  User can create Lucene datamap using the Create DataMap DDL
--- End diff --

User can create Lucene datamap using the Create DataMap DDL:


---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183616213
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
--- End diff --

The content below is a procedure, so put it in a numbered list: 
Step 1:
Step 2:



---


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183616653
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
--- End diff --

Why is there a red background? Please check it once.


---
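The red highlighting called out above is most likely because the CREATE DATAMAP statement in the quoted example is never terminated: the s""" opened before `CREATE DATAMAP dm` has no closing `""".stripMargin)` before the `import` lines, so the renderer (and the Scala compiler) treats everything that follows as part of an unterminated string. A corrected fragment, keeping the same names as the quoted example, would be:

```scala
// Close the CREATE DATAMAP statement before the imports that follow it;
// otherwise the rest of the paste is parsed as part of the string literal.
spark.sql(
  s"""
     |CREATE DATAMAP dm
     |ON TABLE datamap_test
     |USING "lucene"
     |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
   """.stripMargin)

import spark.implicits._
import org.apache.spark.sql.SaveMode
import scala.util.Random
```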


[GitHub] carbondata pull request #2215: [wip]add documentation for lucene datamap

2018-04-24 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2215#discussion_r183616769
  
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -0,0 +1,180 @@
+# CarbonData Lucene DataMap
+  
+* [Quick Example](#quick-example)
+* [DataMap Management](#datamap-management)
+* [Lucene Datamap](#lucene-datamap-introduction)
+* [Loading Data](#loading-data)
+* [Querying Data](#querying-data)
+* [Data Management](#data-management-with-pre-aggregate-tables)
+
+## Quick example
+Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
+
+Package carbon jar, and copy 
assembly/target/scala-2.11/carbondata_2.11-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar 
to $SPARK_HOME/jars
+```shell
+mvn clean package -DskipTests -Pspark-2.2
+```
+
+Start spark-shell in new terminal, type :paste, then copy and run the 
following code.
+```scala
+ import java.io.File
+ import org.apache.spark.sql.{CarbonEnv, SparkSession}
+ import org.apache.spark.sql.CarbonSession._
+ import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+ import org.apache.carbondata.core.util.path.CarbonStorePath
+ 
+ val warehouse = new File("./warehouse").getCanonicalPath
+ val metastore = new File("./metastore").getCanonicalPath
+ 
+ val spark = SparkSession
+   .builder()
+   .master("local")
+   .appName("preAggregateExample")
+   .config("spark.sql.warehouse.dir", warehouse)
+   .getOrCreateCarbonSession(warehouse, metastore)
+
+ spark.sparkContext.setLogLevel("ERROR")
+
+ // drop table if exists previously
+ spark.sql(s"DROP TABLE IF EXISTS datamap_test")
+ 
+ // Create main table
+ spark.sql(
+   s"""
+  |CREATE TABLE datamap_test (
+  |name string,
+  |age int,
+  |city string,
+  |country string)
+  |STORED BY 'carbondata'
+""".stripMargin)
+ 
+ // Create lucene datamap on the main table
+ spark.sql(
+   s"""
+  |CREATE DATAMAP dm
+  |ON TABLE datamap_test
+  |USING "lucene"
+  |DMPROPERTIES ('TEXT_COLUMNS' = 'name, country')
+  
+  import spark.implicits._
+  import org.apache.spark.sql.SaveMode
+  import scala.util.Random
+ 
+  // Load data to the main table, if
+  // lucene index writing fails, the datamap
+  // will be disabled in query
+  val r = new Random()
+  spark.sparkContext.parallelize(1 to 10)
+   .map(x => ("c1" + x % 8, x % 8, "city" + x % 50, "country" + x % 60))
+   .toDF("name", "age", "city", "country")
+   .write
+   .format("carbondata")
+   .option("tableName", "datamap_test")
+   .option("compress", "true")
+   .mode(SaveMode.Append)
+   .save()
+  
+  spark.sql(
+s"""
+   |SELECT *
+   |from datamap_test where
+   |TEXT_MATCH('name:c10')
+ """.stripMargin).show
+
+  spark.stop
+```
+
+ DataMap Management
+Lucene DataMap can be created using following DDL
+  ```
+  CREATE DATAMAP [IF NOT EXISTS] datamap_name
+  ON TABLE main_table
+  USING "lucene"
+  DMPROPERTIES ('text_columns'='city, name', ...)
+  ```
+
+DataMap can be dropped using following DDL
--- End diff --

DataMap can be dropped using following DDL:


---


[GitHub] carbondata issue #2198: [CARBONDATA-2369] Add a document for Non Transaction...

2018-04-23 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2198
  
LGTM


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-23 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183304329
  
--- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md ---
@@ -0,0 +1,133 @@
+# Presto Multinode Cluster setup For Carbondata
+
+## Installing Presto
+
+  1. Download the 0.187 version of Presto using:
+  `wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+  2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`
+
+  3. Download the Presto CLI for the coordinator and name it presto.
+
+  ```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+mv presto-cli-0.187-executable.jar presto
+
+chmod +x presto
+  ```
+
+ ## Create Configuration Files
+
+  1. Create `etc` folder in presto-server-0.187 directory.
+  2. Create `config.properties`, `jvm.config`, `log.properties`, and 
`node.properties` files.
+  3. Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+## Coordinator Configurations
+
+  # Contents of your config.properties
+  ```
+  coordinator=true
+  node-scheduler.include-coordinator=false
+  http-server.http.port=8080
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery-server.enabled=true
+  discovery.uri=:8080
+  ```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+## Worker Configurations
+
+# Contents of your config.properties
+
+  ```
+  coordinator=false
+  http-server.http.port=8080
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery.uri=:8080
+  ```
+
+**Note**: `jvm.config` and `node.properties` files are same for all the 
nodes (worker + coordinator). All the nodes should have different `node.id`
+
+## Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the 
nodes of the cluster including the coordinator.
+
+# Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and 
set the required properties on all the nodes.
+
+## Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto
--- End diff --

Add a period at the end of the sentence for both the points. Check all the sentences.


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-23 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183304497
  
--- Diff: integration/presto/Presto_Cluster_Setup_For_Carbondata.md ---
@@ -0,0 +1,133 @@
+# Presto Multinode Cluster setup For Carbondata
+
+## Installing Presto
+
+  1. Download the 0.187 version of Presto using:
+  `wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+  2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`
+
+  3. Download the Presto CLI for the coordinator and name it presto.
+
+  ```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+mv presto-cli-0.187-executable.jar presto
+
+chmod +x presto
+  ```
+
+ ## Create Configuration Files
+
+  1. Create `etc` folder in presto-server-0.187 directory.
+  2. Create `config.properties`, `jvm.config`, `log.properties`, and 
`node.properties` files.
+  3. Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+## Coordinator Configurations
+
+  # Contents of your config.properties
+  ```
+  coordinator=true
+  node-scheduler.include-coordinator=false
+  http-server.http.port=8080
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery-server.enabled=true
+  discovery.uri=:8080
+  ```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like:
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+## Worker Configurations
+
+# Contents of your config.properties
+
+  ```
+  coordinator=false
+  http-server.http.port=8080
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery.uri=:8080
+  ```
+
+**Note**: `jvm.config` and `node.properties` files are same for all the 
nodes (worker + coordinator). All the nodes should have different `node.id`
+
+## Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the 
nodes of the cluster including the coordinator.
+
+# Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and 
set the required properties on all the nodes.
+
+## Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto
+2. Copy `carbondata` jars to `plugin/carbondata` directory on all nodes
+
+## Start Presto Server on all nodes
+
+```
+./presto-server-0.187/bin/launcher start
+```
+To run it as a background process.
+
+```
+./presto-server-0.187/bin/launcher run
+```
+To run it in foreground.
+
+## Start Presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
+
+```
+./presto --server :8080 --catalog carbondata --schema 

+```
+Execute the following command to ensure the workers are connected
--- End diff --

Add a colon (:) at the end of the sentence.


---
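For the worker-connectivity check mentioned in the last quoted line (the actual command is cut off by the diff, so the statement below is an assumption based on stock Presto), the cluster membership can be listed from the Presto CLI through the built-in `system.runtime.nodes` table:

```sql
-- Lists the coordinator and every worker that has registered with the
-- discovery service; each expected node should appear with state 'active'.
SELECT * FROM system.runtime.nodes;
```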


[GitHub] carbondata pull request #2198: [CARBONDATA-2369] Add a document for Non Tran...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2198#discussion_r183031656
  
--- Diff: docs/sdk-writer-guide.md ---
@@ -0,0 +1,140 @@
+# SDK Writer Guide
+In the carbon jars package, there exist a 
carbondata-store-sdk-x.x.x-SNAPSHOT.jar.
+This SDK writer, writes carbondata file and carbonindex file at a given 
path.
+External client can make use of this writer to convert other format data 
or live data to create carbondata and index files.
+These SDK writer output contains just a carbondata and carbonindex files. 
No metadata folder will be present.
+
+## Quick example
+
+```scala
+ import java.io.IOException;
+
+ import 
org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+ import org.apache.carbondata.core.metadata.datatype.DataTypes;
+ import org.apache.carbondata.sdk.file.CarbonWriter;
+ import org.apache.carbondata.sdk.file.CarbonWriterBuilder;
+ import org.apache.carbondata.sdk.file.Field;
+ import org.apache.carbondata.sdk.file.Schema;
+
+ public class TestSdk {
+
+ public static void main(String[] args) throws IOException, 
InvalidLoadOptionException {
+ testSdkWriter();
+ }
+
+ public static void testSdkWriter() throws IOException, 
InvalidLoadOptionException {
+ String path ="/home/root1/Documents/ab/temp";
+
+ Field[] fields =new Field[2]; 
+ fields[0] = new Field("name", DataTypes.STRING);
+ fields[1] = new Field("age", DataTypes.INT);
+
+ Schema schema =new Schema(fields);
+
+ CarbonWriterBuilder builder = CarbonWriter.builder()
+ .withSchema(schema)
+ .outputPath(path);
+
+ CarbonWriter writer = builder.buildWriterForCSVInput();
+
+ int rows = 5;
+ for (int i = 0; i < rows; i++) {
+ writer.write(new String[]{"robot" + (i % 10), String.valueOf(i)});
+ }
+ writer.close();
+ }
+ }
+```
+
+## Datatypes Mapping
+Each of SQL data types are mapped into data types of SDK. Following are 
the mapping:
+| SQL DataTypes | Mapped SDK DataTypes |
--- End diff --

The table formatting has an issue, please check.


---
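On the table-formatting issue pointed out above: GitHub-flavoured markdown only renders a table when the header row is followed by a separator row of dashes (and the table is preceded by a blank line). A sketch of the expected layout, filled in only with the two mappings visible in the quoted example code (`DataTypes.STRING` and `DataTypes.INT`), would be:

```
| SQL DataTypes | Mapped SDK DataTypes |
|---------------|----------------------|
| STRING        | DataTypes.STRING     |
| INT           | DataTypes.INT        |
```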


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183030442
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
+  * Create config.properties, jvm.config, log.properties, and 
node.properties files.
+  * Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+  
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+  
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+  
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
--- End diff --

Heading 2


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183030674
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
+  * Create config.properties, jvm.config, log.properties, and 
node.properties files.
+  * Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+  
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+  
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+  
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+  
+  # Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=:8080
+```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like: 
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+# Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes 
(worker + coordinator). All the nodes should have different `node.id`
--- End diff --

`jvm.config` and `node.properties` files are same for all the nodes (worker 
+ coordinator). All the nodes should have different `node.id`.


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183031033
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
+  * Create config.properties, jvm.config, log.properties, and 
node.properties files.
+  * Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+  
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+  
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+  
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+  
+  # Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=:8080
+```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like: 
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+# Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes 
(worker + coordinator). All the nodes should have different `node.id`
+
+### Catalog Configurations
+
+Create a folder named `catalog` in etc directory of presto on all the 
nodes of the cluster including the coordinator. 
+
+# Configuring Carbondata in Presto
+* Create a file named `carbondata.properties` in the `catalog` folder and 
set the required properties on all the nodes.
+
+### Add Plugins
+
+* Create a directory named `carbondata` in plugin directory of presto
+* Copy `carbondata` jars to `plugin/carbondata` directory on all nodes
+
+### Start Presto Server on all nodes
+
+```
+./presto-server-0.187/bin/launcher start
+```
+To run it as a background process.
+
+```
+./presto-server-0.187/bin/launcher run
+```
+To run it in foreground.
+
+### Start presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
--- End diff --

To connect to carbondata catalog, use the following command:


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183027350
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --

We can make it heading 2 (##) and change the heading to "Installing Presto"


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183030457
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
+  * Create config.properties, jvm.config, log.properties, and 
node.properties files.
+  * Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+  
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+  
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+  
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+  
+  # Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=:8080
+```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like: 
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
--- End diff --

Heading 2


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183027087
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --

Give a space after # 


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183028214
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
--- End diff --

If these are steps, then change them from bulleted points to numbered points.


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183027976
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
--- End diff --

Leave a space after # 


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183030962
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
+  * Create config.properties, jvm.config, log.properties, and 
node.properties files.
+  * Install uuid to generate a node.id
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+  
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+  
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+  
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+  
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+  
+  # Contents of your config.properties
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery-server.enabled=true
+discovery.uri=:8080
+```
+The options `node-scheduler.include-coordinator=false` and 
`coordinator=true` indicate that the node is the coordinator and tells the 
coordinator not to do any of the computation work itself and to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also relation between below two configuration-properties should be like: 
+If, `query.max-memory-per-node=30GB`
+Then, `query.max-memory=<30GB * number of nodes>`
+
+### Worker Configurations
+
+# Contents of your config.properties
+
+```
+coordinator=false
+http-server.http.port=8080
+query.max-memory=50GB
+query.max-memory-per-node=2GB
+discovery.uri=:8080
+```
+
+**Note**: `jvm.config`, `node.properties` file is same for all the nodes 
(worker + coordinator). All the nodes should have different `node.id`
+
+### Catalog Configurations
+
+Create a folder named `catalog` in etc directory of presto on all the 
nodes of the cluster including the coordinator. 
+
+# Configuring Carbondata in Presto
+* Create a file named `carbondata.properties` in the `catalog` folder and 
set the required properties on all the nodes.
+
+### Add Plugins
+
+* Create a directory named `carbondata` in plugin directory of presto
--- End diff --

This is a procedure, so change it to numbered steps.


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183028486
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
--- End diff --

All 'presto' instances can be changed to title case 'Presto'


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183029300
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
+
+  * Create etc folder in presto-server-0.187 directory.
--- End diff --

This is a procedure, so change it to numbered points.


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183028111
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
--- End diff --

We can change it to Heading 2 (##) and change the heading to "Installing 
Presto"


---


[GitHub] carbondata pull request #2199: [CARBONDATA-2370] Added document for presto m...

2018-04-20 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2199#discussion_r183028525
  
--- Diff: integration/presto/Presto_Cluster_setup_for_Carbondata.md ---
@@ -0,0 +1,135 @@
+#Presto Multinode Cluster setup For Carbondata
+
+### Install Presto
+
+  * Download the 0.187 version of presto using:
+  
+``wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz
+ ``
+  * Extract presto tar file
+   ``tar zxvf presto-server-0.187.tar.gz``
+  
+  * Download the presto CLI for the coordinator and name it presto.
+  
+```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+  
+mv presto-cli-0.187-executable.jar presto
+  
+chmod +x presto
+```
+  
+ ### Create configuration Files
--- End diff --

Heading 2 (##) and make it title case.


---


[GitHub] carbondata issue #2183: [Documentation] FAQ added for Why all executors are ...

2018-04-18 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2183
  
@rahulforallp kindly review


---


[GitHub] carbondata pull request #2183: [Documentation] FAQ added for Why all executo...

2018-04-18 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2183

[Documentation] FAQ added for Why all executors are showing success in 
Spark UI even …

FAQ added for 
Why are all executors showing success in Spark UI even after Dataload 
command failed at Driver side?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata Faq_dadaload

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2183


commit aca05e32a4acb12491b85e9989c82b3331f20c0f
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-04-18T11:38:05Z

FAQ added for Why all executors are showing success in Spark UI even after 
Dataload command failed at Driver side?




---


[GitHub] carbondata issue #1944: [Documentation] Added a FAQ for executor returning s...

2018-04-18 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1944
  
Closed as this FAQ is not needed


---


[GitHub] carbondata pull request #1944: [Documentation] Added a FAQ for executor retu...

2018-04-18 Thread sgururajshetty
Github user sgururajshetty closed the pull request at:

https://github.com/apache/carbondata/pull/1944


---


[GitHub] carbondata issue #2138: [CARBONDATA-2230][Documentation]add documentation fo...

2018-04-04 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2138
  
LGTM


---


[GitHub] carbondata pull request #2138: [CARBONDATA-2230][Documentation]add documenta...

2018-04-04 Thread sgururajshetty
Github user sgururajshetty commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2138#discussion_r179039767
  
--- Diff: docs/configuration-parameters.md ---
@@ -39,6 +39,7 @@ This section provides the details of all the 
configurations required for the Car
 | carbon.streaming.auto.handoff.enabled | true | If this parameter value 
is set to true, auto trigger handoff function will be enabled.|
 | carbon.streaming.segment.max.size | 102400 | This parameter defines 
the maximum size of the streaming segment. Setting this parameter to 
appropriate value will avoid impacting the streaming ingestion. The value is in 
bytes.|
 | carbon.query.show.datamaps | true | If this parameter value is set to 
true, show tables command will list all the tables including datatmaps(eg: 
Preaggregate table), else datamaps will be excluded from the table list. |
+| carbon.segment.lock.files.preserve.hours | 48 | This property value 
indicates the number of hours the segment lock files will be preserved after 
dataload. These lock fils will be deleted with clean files command after the 
configured amount of hours. |
--- End diff --

Spelling error "fils"

These lock files will be deleted with the clean command after the 
configured number of hours. 


---
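For reference, the property discussed above is configured in carbon.properties like any other CarbonData system property; a minimal sketch of overriding the default (the conf file location is an assumption, it depends on the deployment) would be:

```
# $CARBON_HOME/conf/carbon.properties (location depends on the deployment)
# Keep segment lock files for 24 hours after data load instead of the default 48;
# they are removed by the clean files command after this period.
carbon.segment.lock.files.preserve.hours=24
```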


[GitHub] carbondata pull request #2116: [Documentation] The syntax and the example is...

2018-03-30 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2116

[Documentation] The syntax and the example is corrected

The overwrite syntax and examples were corrected as they were throwing errors.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata Syntax_correction

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2116


commit b9ad02b94f9ab8293143125cdf60bc4a2c7d515b
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-03-30T12:14:05Z

The syntax and the example is corrected as it was throwing error




---


[GitHub] carbondata pull request #2079: [Documentation] Editorial Review

2018-03-19 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2079

[Documentation] Editorial Review

Editorial Review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata review1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2079.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2079


commit af6e717f895001fa44416b1cdabb97f948b08af7
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-03-20T05:31:11Z

Spelling corrected




---


[GitHub] carbondata pull request #2067: [Documentation] Example added for Drop Partit...

2018-03-15 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2067

[Documentation] Example added for Drop Partition

Example added for Drop Partition

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata exampleDropPartition

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2067


commit ae1f161764b4464d3051dc750aa20139e0480b8d
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-03-15T10:50:41Z

Example added for Drop Partition




---


[GitHub] carbondata issue #2041: [CARBONDATA-2235]Update configuration-parameters.md

2018-03-07 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/2041
  
LGTM


---


[GitHub] carbondata pull request #2044: [Documentation] Updated Readme for Datamap Fe...

2018-03-07 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/2044

[Documentation] Updated Readme for Datamap Feature

Readme is updated with the links to the following new topics on Datamap
> CarbonData Pre-aggregate DataMap
> CarbonData Timeseries DataMap



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata Datamap

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2044.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2044


commit 574c69f93bb78cf09daf492634791a3ef30f27fd
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-03-08T06:02:17Z

Updated Readme for Datamap
> CarbonData Pre-aggregate DataMap
> CarbonData Timeseries DataMap




---


[GitHub] carbondata pull request #1992: [Documentation] Editorial review

2018-02-23 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1992

[Documentation] Editorial review

Editorial review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata reviewed

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1992.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1992


commit e9e4b92772eb0ea04b0d56dd7b94828f7648a657
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-23T11:35:17Z

Editorial review




---


[GitHub] carbondata issue #1936: [CARBONDATA-2135] Documentation for Table comment an...

2018-02-22 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1936
  
@chenliang613 kindly review 




---


[GitHub] carbondata issue #1944: [Documentation] Added a FAQ for executor returning s...

2018-02-15 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1944
  
@sraghunandan kindly review 


---


[GitHub] carbondata issue #1954: [Documentation] Formatting issue fixed

2018-02-08 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1954
  
LGTM


---


[GitHub] carbondata pull request #1944: [Documentation] Added a FAQ for executor retu...

2018-02-07 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1944

[Documentation] Added a FAQ for executor returning successful even after 
the query fa…

Added a FAQ for executor returning successful even after the query failed

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata faq1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1944.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1944


commit d32c887b4b683c2cdd4160bf9552e64f706a654c
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-07T11:24:57Z

Added a FAQ for executor returning successful even after the query failed

commit 411267affcfec7e6e6955de9435adfc9bb4497d9
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-07T11:27:54Z

Fixed a link issue




---


[GitHub] carbondata issue #1938: [CARBONDATA-2138] Added documentation for HEADER opt...

2018-02-06 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1938
  
@QiangCai fixed the review comment. Kindly review and merge.


---


[GitHub] carbondata pull request #1938: [CARBONDATA-2138] Added documentation for HEA...

2018-02-06 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1938

[CARBONDATA-2138] Added documentation for HEADER option while loading data

Added documentation for HEADER option in load data

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2138

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1938


commit 20ec4b85d4bf41d1ee99402a8e59aa4ad5d1f08f
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-06T15:28:59Z

Added documentation for HEADER option while loading data
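
As a rough illustration of the HEADER load option that #1938 documents (a sketch only, not text from the PR): when the CSV file carries no header row, HEADER can be set to false and the column names supplied through FILEHEADER. The Scala snippet assumes a SparkSession named `spark` with CarbonData configured; the table, HDFS path, and column list are illustrative.

    // Sketch: load a header-less CSV into an existing Carbon table `sales`.
    // `spark`, the path, and the column list are assumptions for illustration.
    spark.sql("""
      LOAD DATA INPATH 'hdfs://hacluster/data/sales.csv'
      INTO TABLE sales
      OPTIONS('HEADER'='false', 'FILEHEADER'='id,name,amount')
    """)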




---


[GitHub] carbondata pull request #1936: [CARBONDATA-2135] Documentation for Table com...

2018-02-06 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1936

[CARBONDATA-2135] Documentation for Table comment and Column Comment

Documentation for table comment and column comment

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2135

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1936.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1936


commit c78d3a56f209dea3259e0af147a96641de4e9c0f
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-06T10:36:42Z

Documentation for Table comment and Column Comment
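
A hedged sketch of what the table and column comment syntax from #1936 might look like, assuming a SparkSession `spark` with CarbonData configured; the table, columns, and comment text are illustrative, not taken from the PR.

    // Sketch: column-level and table-level comments on a Carbon table.
    spark.sql("""
      CREATE TABLE IF NOT EXISTS employee (
        id INT COMMENT 'employee id',
        name STRING COMMENT 'employee name'
      )
      COMMENT 'Employee master data'
      STORED BY 'carbondata'
    """)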




---


[GitHub] carbondata pull request #1927: [CARBONDATA-2128] Documentation for table pat...

2018-02-03 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1927

[CARBONDATA-2128] Documentation for table path while creating the table

Documentation for table path while creating the table

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2128

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1927.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1927


commit c640768a31f97cb5278d1804a9ceba8283889726
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-03T15:50:41Z

Documentation for table path while creating the table
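
A minimal sketch of specifying the table path at creation time, the feature #1927 documents; the LOCATION clause, the SparkSession `spark`, and the path are assumptions for illustration rather than text from the PR.

    // Sketch: create a Carbon table at an explicit, user-chosen path.
    spark.sql("""
      CREATE TABLE products (
        id INT,
        name STRING
      )
      STORED BY 'carbondata'
      LOCATION 'hdfs://hacluster/user/warehouse/products'
    """)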




---


[GitHub] carbondata pull request #1926: [CARBONDATA-2127] Documentation for Hive Stan...

2018-02-03 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1926

[CARBONDATA-2127] Documentation for Hive Standard Partition

Documentation for Hive Standard Partition

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2127

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1926.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1926


commit f97d0b5eca4d06d40d14436c7b62d93691d77e58
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-03T15:34:23Z

Documentation for Hive Standard Partition
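
A minimal sketch of a Hive-style (standard) partitioned Carbon table, the feature #1926 documents; the SparkSession `spark` and all table, column, and partition names are illustrative assumptions.

    // Sketch: Hive standard partitioning on a Carbon table.
    spark.sql("""
      CREATE TABLE orders (
        id INT,
        amount DOUBLE
      )
      PARTITIONED BY (country STRING)
      STORED BY 'carbondata'
    """)

    // Static-partition insert; partition values become directory names as in Hive.
    spark.sql("INSERT INTO TABLE orders PARTITION (country='IN') SELECT 1, 100.0")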




---


[GitHub] carbondata issue #1925: [CARBONDATA-2126] Documentation for Create database ...

2018-02-03 Thread sgururajshetty
Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1925
  
Closed


---


[GitHub] carbondata pull request #1925: [CARBONDATA-2126] Documentation for Create da...

2018-02-03 Thread sgururajshetty
Github user sgururajshetty closed the pull request at:

https://github.com/apache/carbondata/pull/1925


---


[GitHub] carbondata pull request #1925: [CARBONDATA-2126] Documentation for Create da...

2018-02-03 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1925

[CARBONDATA-2126] Documentation for Create database and custom location

Link issue fixed

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2126

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1925


commit d66529dd9cca066b2030f3926a8d8c2945016cfc
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-03T13:08:10Z

Documentation for create database and custom location

commit 241ba99b939ea3b14150485d016794b761d2fe50
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-03T14:35:55Z

Link issue fixed




---


[GitHub] carbondata pull request #1923: [CARBONDATA-2126] Documentation for create da...

2018-02-03 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1923

[CARBONDATA-2126] Documentation for create database and custom location

Added documentation for create database and also for specifying a custom location

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2126

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1923.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1923


commit d66529dd9cca066b2030f3926a8d8c2945016cfc
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-03T13:08:10Z

Documentation for create database and custom location
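
A one-line sketch of creating a database at a custom location, the topic of #1923 and #1925; the SparkSession `spark` and the HDFS path are illustrative assumptions.

    // Sketch: database whose default table location is a user-specified path.
    spark.sql("CREATE DATABASE IF NOT EXISTS sales_db LOCATION 'hdfs://hacluster/user/custom/sales_db'")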




---


[GitHub] carbondata pull request #1907: Data types

2018-02-01 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1907

Data types

Spelling fixed

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata data_types

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1907.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1907


commit e9fa0a2a09b89bbef10ea7a635ea929c54a58edc
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-01T15:00:12Z

The supported datatype mentioned for dictionary exclude and sort columns

commit 0a0af9d38359414cc194f8a9e2f6c5850c88e9f9
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-01T15:03:34Z

Spelling correction
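
As a hedged sketch of where the data-type restrictions from #1907 apply: DICTIONARY_EXCLUDE and SORT_COLUMNS accept only certain column types, and which types are allowed is exactly what the PR documents, so treat the property values here as illustrative. The SparkSession `spark` and all names are assumptions.

    // Sketch: table properties whose listed columns must be of supported data types.
    spark.sql("""
      CREATE TABLE customer (
        id INT,
        name STRING,
        city STRING
      )
      STORED BY 'carbondata'
      TBLPROPERTIES('DICTIONARY_EXCLUDE'='name', 'SORT_COLUMNS'='city,id')
    """)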




---


[GitHub] carbondata pull request #1906: [CARBONDATA-2116] Documentation for CTAS

2018-02-01 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1906

[CARBONDATA-2116] Documentation for CTAS

Added the documentation for CTAS

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2116

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1906.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1906


commit 7527f15e021613df89efc97375ce293898abf9e1
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-01T14:34:54Z

Documentation for CTAS
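
A minimal sketch of CREATE TABLE AS SELECT into a Carbon table, the feature #1906 documents; the SparkSession `spark`, the source table `sales`, and the columns are illustrative assumptions.

    // Sketch: create a Carbon table directly from a query result.
    spark.sql("""
      CREATE TABLE sales_summary
      STORED BY 'carbondata'
      AS SELECT country, SUM(amount) AS total_amount
         FROM sales
         GROUP BY country
    """)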




---


[GitHub] carbondata pull request #1905: [CARBONDATA-2115] Scenarios in which aggregat...

2018-02-01 Thread sgururajshetty
GitHub user sgururajshetty opened a pull request:

https://github.com/apache/carbondata/pull/1905

[CARBONDATA-2115] Scenarios in which aggregate query is not fetching …

Added the FAQ on scenarios in which an aggregate query is not fetching data from the aggregate table

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sgururajshetty/carbondata 2115

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1905.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1905


commit 4d96e0f8f48f64a261f63eca6e60ae385f70f872
Author: sgururajshetty <sgururajshetty@...>
Date:   2018-02-01T12:29:17Z

[CARBONDATA-2115] Scenarios in which aggregate query is not fetching data 
from aggregate table
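
To make the scenario concrete, here is a hedged sketch of a pre-aggregate datamap and two queries; whether a query is rewritten to the aggregate table depends on its grouping and aggregate expressions matching the datamap definition, which is the kind of case the FAQ covers. The SparkSession `spark`, the tables, and the columns are illustrative assumptions.

    // Sketch: a pre-aggregate datamap over a Carbon table `sales`.
    spark.sql("""
      CREATE DATAMAP agg_sales ON TABLE sales
      USING 'preaggregate'
      AS SELECT country, SUM(amount) FROM sales GROUP BY country
    """)

    // Same grouping and aggregate as the datamap: likely served by the aggregate table.
    spark.sql("SELECT country, SUM(amount) FROM sales GROUP BY country").show()

    // Groups on a column the datamap does not cover: likely answered from the main table.
    spark.sql("SELECT city, SUM(amount) FROM sales GROUP BY city").show()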




---

