Jenkins build became unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Spark2 #2124

2018-03-08 Thread Apache Jenkins Server
See 




Jenkins build became unstable: carbondata-master-spark-2.1 #2124

2018-03-08 Thread Apache Jenkins Server
See 




Jenkins build became unstable: carbondata-master-spark-2.2 #161

2018-03-08 Thread Apache Jenkins Server
See 




Jenkins build became unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark2 #161

2018-03-08 Thread Apache Jenkins Server
See 




carbondata git commit: [CARBONDATA-1993] Carbon properties default values fix and corresponding template and document correction

2018-03-08 Thread gvramana
Repository: carbondata
Updated Branches:
  refs/heads/branch-1.3 c36e944d6 -> 0283c938b


[CARBONDATA-1993] Carbon properties default values fix and corresponding template and document correction

This closes #1831


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/0283c938
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/0283c938
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/0283c938

Branch: refs/heads/branch-1.3
Commit: 0283c938bd4b1c292fa6536fb7e5c04d61f21fff
Parents: c36e944
Author: mohammadshahidkhan 
Authored: Thu Jan 18 16:54:15 2018 +0530
Committer: Venkata Ramana G 
Committed: Fri Mar 9 12:10:38 2018 +0530

--
 conf/carbon.properties.template | 32 ---
 .../core/constants/CarbonCommonConstants.java   | 56 +++-
 .../core/datastore/page/ColumnPage.java |  4 +-
 docs/configuration-parameters.md| 19 +++
 docs/installation-guide.md  |  6 +--
 .../carbondata/examples/CompareTest.scala   |  2 +-
 .../carbondata/examples/ConcurrencyTest.scala   |  2 +-
 .../carbondata/examples/ExampleUtils.scala  |  2 +-
 .../hadoop/test/util/StoreCreator.java  |  1 -
 .../presto/util/CarbonDataStoreCreator.scala|  4 +-
 .../sdv/generated/OffheapQuery1TestCase.scala   |  2 -
 .../sdv/generated/OffheapQuery2TestCase.scala   |  2 -
 .../aggquery/IntegerDataTypeTestCase.scala  |  6 +--
 .../TestNullAndEmptyFieldsUnsafe.scala  |  6 +--
 .../TestLoadDataWithHiveSyntaxUnsafe.scala  |  6 +--
 .../spark/rdd/NewCarbonDataLoadRDD.scala|  1 -
 .../CarbonAlterTableCompactionCommand.scala |  4 +-
 .../sql/test/Spark2TestQueryExecutor.scala  |  2 -
 .../BooleanDataTypesBigFileTest.scala   |  4 +-
 .../booleantype/BooleanDataTypesLoadTest.scala  |  4 +-
 .../carbondata/processing/StoreCreator.java |  1 -
 21 files changed, 47 insertions(+), 119 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/0283c938/conf/carbon.properties.template
--
diff --git a/conf/carbon.properties.template b/conf/carbon.properties.template
index 8c1e458..9727578 100644
--- a/conf/carbon.properties.template
+++ b/conf/carbon.properties.template
@@ -17,29 +17,26 @@
 #
 
## System Configuration ##
-#Mandatory. Carbon Store path
-carbon.storelocation=hdfs://hacluster/Opt/CarbonStore
+##Optional. Location where CarbonData will create the store, and write the data in its own format.
+##If not specified then it takes spark.sql.warehouse.dir path.
+#carbon.storelocation
 #Base directory for Data files
-carbon.ddl.base.hdfs.url=hdfs://hacluster/opt/data
+#carbon.ddl.base.hdfs.url
 #Path where the bad records are stored
-carbon.badRecords.location=/opt/Carbon/Spark/badrecords
+#carbon.badRecords.location
 
## Performance Configuration ##
## DataLoading Configuration ##
 #File read buffer size used during sorting(in MB) :MIN=1:MAX=100
-carbon.sort.file.buffer.size=20
-#Rowset size exchanged between data load graph steps :MIN=500:MAX=100
-carbon.graph.rowset.size=10
+carbon.sort.file.buffer.size=10
 #Number of cores to be used while data loading
-carbon.number.of.cores.while.loading=6
+carbon.number.of.cores.while.loading=2
 #Record count to sort and write to temp intermediate files
-carbon.sort.size=50
+carbon.sort.size=10
 #Algorithm for hashmap for hashkey calculation
 carbon.enableXXHash=true
 #Number of cores to be used for block sort while dataloading
 #carbon.number.of.cores.block.sort=7
-#max level cache size upto which level cache will be loaded in memory
-#carbon.max.level.cache.size=-1
#enable prefetch of data during merge sort while reading data from sort temp files in data loading
 #carbon.merge.sort.prefetch=true
## Alter Partition Configuration ##
@@ -76,22 +73,15 @@ carbon.enable.quick.filter=false
 #carbon.block.meta.size.reserved.percentage=10
 ##csv reading buffer size.
 #carbon.csv.read.buffersize.byte=1048576
-##To identify and apply compression for non-high cardinality columns
-#high.cardinality.value=10
 ##maximum no of threads used for reading intermediate files for final merging.
 #carbon.merge.sort.reader.thread=3
##Carbon blocklet size. Note: this configuration cannot be changed once the store is generated
 #carbon.blocklet.size=12
-##number of retries to get the metadata lock for loading data to table
-#carbon.load.metadata.lock.retries=3
 ##Minimum blocklets needed for distribution.
 #carbon.blockletdistribution.min.blocklet.size=10
 ##Interval between the retries to get the lock
 
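The template change above makes carbon.storelocation optional, with spark.sql.warehouse.dir as the fallback. A minimal Scala sketch of setting it programmatically instead, using the CarbonProperties API quoted later in this digest (the HDFS path is illustrative, not a recommended layout):

  import org.apache.carbondata.core.util.CarbonProperties

  // Optional override; when absent, CarbonData falls back to spark.sql.warehouse.dir.
  CarbonProperties.getInstance()
    .addProperty("carbon.storelocation", "hdfs://hacluster/user/carbon/store")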

carbondata git commit: [CARBONDATA-1993] Carbon properties default values fix and corresponding template and document correction

2018-03-08 Thread gvramana
Repository: carbondata
Updated Branches:
  refs/heads/master 39fa1eb58 -> be600bc90


[CARBONDATA-1993] Carbon properties default values fix and corresponding template and document correction

This closes #1831


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/be600bc9
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/be600bc9
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/be600bc9

Branch: refs/heads/master
Commit: be600bc907dbd6051b7ef51452c7a4fe044f4786
Parents: 39fa1eb
Author: mohammadshahidkhan 
Authored: Thu Jan 18 16:54:15 2018 +0530
Committer: Venkata Ramana G 
Committed: Fri Mar 9 12:07:35 2018 +0530

--
 conf/carbon.properties.template | 32 ---
 .../core/constants/CarbonCommonConstants.java   | 56 +++-
 .../core/datastore/page/ColumnPage.java |  4 +-
 docs/configuration-parameters.md| 19 +++
 docs/installation-guide.md  |  6 +--
 .../carbondata/examples/CompareTest.scala   |  2 +-
 .../carbondata/examples/ConcurrencyTest.scala   |  2 +-
 .../carbondata/examples/ExampleUtils.scala  |  2 +-
 .../hadoop/test/util/StoreCreator.java  |  1 -
 .../presto/util/CarbonDataStoreCreator.scala|  4 +-
 .../sdv/generated/OffheapQuery1TestCase.scala   |  2 -
 .../sdv/generated/OffheapQuery2TestCase.scala   |  2 -
 .../aggquery/IntegerDataTypeTestCase.scala  |  6 +--
 .../TestNullAndEmptyFieldsUnsafe.scala  |  6 +--
 .../TestLoadDataWithHiveSyntaxUnsafe.scala  |  6 +--
 .../spark/rdd/NewCarbonDataLoadRDD.scala|  1 -
 .../CarbonAlterTableCompactionCommand.scala |  4 +-
 .../sql/test/Spark2TestQueryExecutor.scala  |  2 -
 .../BooleanDataTypesBigFileTest.scala   |  4 +-
 .../booleantype/BooleanDataTypesLoadTest.scala  |  4 +-
 .../carbondata/processing/StoreCreator.java |  1 -
 21 files changed, 47 insertions(+), 119 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/be600bc9/conf/carbon.properties.template
--
diff --git a/conf/carbon.properties.template b/conf/carbon.properties.template
index 8c1e458..9727578 100644
--- a/conf/carbon.properties.template
+++ b/conf/carbon.properties.template
@@ -17,29 +17,26 @@
 #
 
## System Configuration ##
-#Mandatory. Carbon Store path
-carbon.storelocation=hdfs://hacluster/Opt/CarbonStore
+##Optional. Location where CarbonData will create the store, and write the data in its own format.
+##If not specified then it takes spark.sql.warehouse.dir path.
+#carbon.storelocation
 #Base directory for Data files
-carbon.ddl.base.hdfs.url=hdfs://hacluster/opt/data
+#carbon.ddl.base.hdfs.url
 #Path where the bad records are stored
-carbon.badRecords.location=/opt/Carbon/Spark/badrecords
+#carbon.badRecords.location
 
## Performance Configuration ##
## DataLoading Configuration ##
 #File read buffer size used during sorting(in MB) :MIN=1:MAX=100
-carbon.sort.file.buffer.size=20
-#Rowset size exchanged between data load graph steps :MIN=500:MAX=100
-carbon.graph.rowset.size=10
+carbon.sort.file.buffer.size=10
 #Number of cores to be used while data loading
-carbon.number.of.cores.while.loading=6
+carbon.number.of.cores.while.loading=2
 #Record count to sort and write to temp intermediate files
-carbon.sort.size=50
+carbon.sort.size=10
 #Algorithm for hashmap for hashkey calculation
 carbon.enableXXHash=true
 #Number of cores to be used for block sort while dataloading
 #carbon.number.of.cores.block.sort=7
-#max level cache size upto which level cache will be loaded in memory
-#carbon.max.level.cache.size=-1
#enable prefetch of data during merge sort while reading data from sort temp files in data loading
 #carbon.merge.sort.prefetch=true
## Alter Partition Configuration ##
@@ -76,22 +73,15 @@ carbon.enable.quick.filter=false
 #carbon.block.meta.size.reserved.percentage=10
 ##csv reading buffer size.
 #carbon.csv.read.buffersize.byte=1048576
-##To identify and apply compression for non-high cardinality columns
-#high.cardinality.value=10
 ##maximum no of threads used for reading intermediate files for final merging.
 #carbon.merge.sort.reader.thread=3
##Carbon blocklet size. Note: this configuration cannot be changed once the store is generated
 #carbon.blocklet.size=12
-##number of retries to get the metadata lock for loading data to table
-#carbon.load.metadata.lock.retries=3
 ##Minimum blocklets needed for distribution.
 #carbon.blockletdistribution.min.blocklet.size=10
 ##Interval between the retries to get the lock
 

[41/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
deleted file mode 100644
index 90178b1..000
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
+++ /dev/null
@@ -1,971 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.carbondata.core.indexstore.blockletindex;
-
-import java.io.ByteArrayInputStream;
-import java.io.ByteArrayOutputStream;
-import java.io.DataInputStream;
-import java.io.DataOutput;
-import java.io.DataOutputStream;
-import java.io.IOException;
-import java.io.UnsupportedEncodingException;
-import java.math.BigDecimal;
-import java.nio.ByteBuffer;
-import java.util.ArrayList;
-import java.util.BitSet;
-import java.util.Comparator;
-import java.util.List;
-
-import org.apache.carbondata.common.logging.LogService;
-import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.cache.Cacheable;
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.datamap.dev.DataMapModel;
-import org.apache.carbondata.core.datamap.dev.cgdatamap.AbstractCoarseGrainDataMap;
-import org.apache.carbondata.core.datastore.IndexKey;
-import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.block.TableBlockInfo;
-import org.apache.carbondata.core.indexstore.BlockMetaInfo;
-import org.apache.carbondata.core.indexstore.Blocklet;
-import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
-import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
-import org.apache.carbondata.core.indexstore.PartitionSpec;
-import org.apache.carbondata.core.indexstore.UnsafeMemoryDMStore;
-import org.apache.carbondata.core.indexstore.row.DataMapRow;
-import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
-import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
-import org.apache.carbondata.core.memory.MemoryException;
-import org.apache.carbondata.core.metadata.blocklet.BlockletInfo;
-import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
-import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
-import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
-import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
-import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
-import org.apache.carbondata.core.scan.filter.FilterExpressionProcessor;
-import org.apache.carbondata.core.scan.filter.FilterUtil;
-import org.apache.carbondata.core.scan.filter.executer.FilterExecuter;
-import org.apache.carbondata.core.scan.filter.executer.ImplicitColumnFilterExecutor;
-import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
-import org.apache.carbondata.core.util.ByteUtil;
-import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.DataFileFooterConverter;
-import org.apache.carbondata.core.util.DataTypeUtil;
-import org.apache.carbondata.core.util.path.CarbonTablePath;
-
-import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.Path;
-import org.xerial.snappy.Snappy;
-
-/**
- * Datamap implementation for blocklet.
- */
-public class BlockletDataMap extends AbstractCoarseGrainDataMap implements Cacheable {
-
-  private static final LogService LOGGER =
-  LogServiceFactory.getLogService(BlockletDataMap.class.getName());
-
-  private static int KEY_INDEX = 0;
-
-  private static int MIN_VALUES_INDEX = 1;
-
-  private static int MAX_VALUES_INDEX = 2;
-
-  private static int ROW_COUNT_INDEX = 3;
-
-  private static int FILE_PATH_INDEX = 4;
-
-  private static int PAGE_COUNT_INDEX = 5;
-
-  

[33/54] [abbrv] carbondata git commit: [CARBONDATA-1114][Tests] Fix bugs in tests in windows env

2018-03-08 Thread ravipesala
[CARBONDATA-1114][Tests] Fix bugs in tests in windows env

Fix bugs in tests that will cause failure under windows env

This closes #1994


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/859d71c1
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/859d71c1
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/859d71c1

Branch: refs/heads/master
Commit: 859d71c1737b6287c4f1a06f7cd9055d32ff8a99
Parents: d5396b1
Author: xuchuanyin 
Authored: Sat Feb 24 21:18:17 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../carbondata/core/datamap/dev/DataMap.java|   6 -
 .../core/datamap/dev/DataMapFactory.java|   2 +-
 .../exception/ConcurrentOperationException.java |  16 +-
 .../carbondata/core/locks/LocalFileLock.java|  30 ++--
 .../core/metadata/PartitionMapFileStore.java|   0
 .../statusmanager/SegmentStatusManager.java |  10 +-
 .../SegmentUpdateStatusManager.java |   1 -
 .../store/impl/DFSFileReaderImplUnitTest.java   |  11 +-
 .../store/impl/FileFactoryImplUnitTest.java |  28 +++-
 .../filesystem/HDFSCarbonFileTest.java  |   3 +-
 .../filesystem/LocalCarbonFileTest.java |  20 ++-
 datamap/examples/pom.xml| 145 +++--
 .../datamap/examples/MinMaxDataWriter.java  |   1 -
 examples/flink/pom.xml  |   4 +-
 .../carbondata/examples/FlinkExample.scala  |  10 +-
 .../CarbonStreamSparkStreamingExample.scala |   1 -
 .../hadoop/api/CarbonTableInputFormat.java  |   5 +-
 .../TestInsertAndOtherCommandConcurrent.scala   |   2 +-
 .../StandardPartitionGlobalSortTestCase.scala   |   2 +-
 .../exception/ProcessMetaDataException.java |   2 +
 .../org/apache/carbondata/api/CarbonStore.scala |   6 +-
 .../carbondata/spark/load/CsvRDDHelper.scala| 157 +++
 .../load/DataLoadProcessBuilderOnSpark.scala|   3 +-
 .../carbondata/spark/util/CarbonScalaUtil.scala |   2 +-
 .../carbondata/spark/util/CommonUtil.scala  |   2 -
 .../command/carbonTableSchemaCommon.scala   |   6 +-
 .../CarbonAlterTableCompactionCommand.scala |   3 +-
 .../management/CarbonCleanFilesCommand.scala|   2 +-
 .../CarbonDeleteLoadByIdCommand.scala   |   2 +-
 .../CarbonDeleteLoadByLoadDateCommand.scala |   2 +-
 .../management/CarbonLoadDataCommand.scala  |  28 ++--
 .../CarbonProjectForDeleteCommand.scala |   2 +-
 .../CarbonProjectForUpdateCommand.scala |   2 +-
 .../schema/CarbonAlterTableRenameCommand.scala  |   2 +-
 .../command/table/CarbonDropTableCommand.scala  |   2 +-
 .../datasources/CarbonFileFormat.scala  |   3 -
 .../BooleanDataTypesInsertTest.scala|   5 +-
 .../vectorreader/AddColumnTestCases.scala   |   1 +
 .../datamap/DataMapWriterListener.java  |   3 +-
 .../loading/model/CarbonLoadModelBuilder.java   |  34 +++-
 .../processing/loading/model/LoadOption.java|  15 +-
 .../processing/merger/CarbonDataMergerUtil.java |   3 +-
 .../util/CarbonDataProcessorUtil.java   |   3 +-
 .../processing/util/CarbonLoaderUtil.java   |   8 +
 .../carbondata/lcm/locks/LocalFileLockTest.java |   2 +-
 .../loading/csvinput/CSVInputFormatTest.java|   1 +
 store/sdk/pom.xml   |   2 +-
 .../carbondata/sdk/file/CSVCarbonWriter.java|   8 +-
 48 files changed, 400 insertions(+), 208 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/859d71c1/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
index 02db8af..dd5507c 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java
@@ -38,9 +38,6 @@ public interface DataMap {
   /**
* Prune the datamap with filter expression and partition information. It returns the list of
* blocklets where these filters can exist.
-   *
-   * @param filterExp
-   * @return
*/
  List prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties,
      List partitions);
@@ -48,9 +45,6 @@ public interface DataMap {
   // TODO Move this method to Abstract class
   /**
* Validate whether the current segment needs to be fetching the required data
-   *
-   * @param filterExp
-   * @return
*/
   boolean isScanRequired(FilterResolverIntf filterExp);
 

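The javadoc above defines the pruning contract. A self-contained Scala sketch of that contract with stand-in types, since the quoted diff elides the real generic parameters (this mirrors, and is not, the project's DataMap interface):

  // Stand-in for CarbonData's Blocklet; the real prune() also takes
  // FilterResolverIntf, SegmentProperties and partition info.
  case class Blocklet(blockId: String, blockletId: String)

  trait PruningContract {
    /** Return only the blocklets in which the filter can possibly match. */
    def prune(filterMatches: Blocklet => Boolean, all: Seq[Blocklet]): Seq[Blocklet] =
      all.filter(filterMatches)

    /** Whether the segment needs to be scanned at all for the given filter. */
    def isScanRequired(filterMatches: Blocklet => Boolean, all: Seq[Blocklet]): Boolean =
      all.exists(filterMatches)
  }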

[48/54] [abbrv] carbondata git commit: [CARBONDATA-2213][DataMap] Fixed wrong version for module datamap-example

2018-03-08 Thread ravipesala
[CARBONDATA-2213][DataMap] Fixed wrong version for module datamap-example

The version of module ‘carbondata-datamap-example’ should be 1.4.0-SNAPSHOT instead of 1.3.0-SNAPSHOT, otherwise compilation will fail.

This closes #2011


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/5397c05c
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/5397c05c
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/5397c05c

Branch: refs/heads/master
Commit: 5397c05c9bd80a0ee956046d027b6a2900475f52
Parents: 5eb476f
Author: xuchuanyin 
Authored: Wed Feb 28 10:31:02 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 datamap/examples/pom.xml|  8 
 .../datamap/examples/MinMaxIndexDataMapFactory.java |  1 -
 .../datamap/lucene/LuceneCoarseGrainDataMap.java|  3 ++-
 .../lucene/LuceneCoarseGrainDataMapFactory.java |  9 ++---
 .../datamap/lucene/LuceneDataMapFactoryBase.java| 16 +++-
 .../datamap/lucene/LuceneDataMapWriter.java |  5 +++--
 .../datamap/lucene/LuceneFineGrainDataMap.java  |  3 ++-
 .../lucene/LuceneFineGrainDataMapFactory.java   |  9 ++---
 8 files changed, 34 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/5397c05c/datamap/examples/pom.xml
--
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index 8539a86..30e1522 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -22,10 +22,10 @@
   4.0.0
 
   
-  org.apache.carbondata
-  carbondata-parent
-  1.4.0-SNAPSHOT
-  ../../pom.xml
+org.apache.carbondata
+carbondata-parent
+1.4.0-SNAPSHOT
+../../pom.xml
   
 
   carbondata-datamap-examples

http://git-wip-us.apache.org/repos/asf/carbondata/blob/5397c05c/datamap/examples/src/minmaxdatamap/main/java/org/apache/carbondata/datamap/examples/MinMaxIndexDataMapFactory.java
--
diff --git a/datamap/examples/src/minmaxdatamap/main/java/org/apache/carbondata/datamap/examples/MinMaxIndexDataMapFactory.java b/datamap/examples/src/minmaxdatamap/main/java/org/apache/carbondata/datamap/examples/MinMaxIndexDataMapFactory.java
index 9fea55b..45dee2a 100644
--- a/datamap/examples/src/minmaxdatamap/main/java/org/apache/carbondata/datamap/examples/MinMaxIndexDataMapFactory.java
+++ b/datamap/examples/src/minmaxdatamap/main/java/org/apache/carbondata/datamap/examples/MinMaxIndexDataMapFactory.java
@@ -25,7 +25,6 @@ import java.util.List;
 import org.apache.carbondata.core.datamap.DataMapDistributable;
 import org.apache.carbondata.core.datamap.DataMapMeta;
 import org.apache.carbondata.core.datamap.Segment;
-import org.apache.carbondata.core.datamap.dev.AbstractDataMapWriter;
 import org.apache.carbondata.core.datamap.dev.DataMapModel;
 import org.apache.carbondata.core.datamap.dev.DataMapWriter;
 import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap;

http://git-wip-us.apache.org/repos/asf/carbondata/blob/5397c05c/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneCoarseGrainDataMap.java
--
diff --git a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneCoarseGrainDataMap.java b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneCoarseGrainDataMap.java
index 0b7df86..580f18b 100644
--- a/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneCoarseGrainDataMap.java
+++ b/datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneCoarseGrainDataMap.java
@@ -33,6 +33,7 @@ import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.indexstore.Blocklet;
+import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.memory.MemoryException;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 
@@ -133,7 +134,7 @@ public class LuceneCoarseGrainDataMap extends CoarseGrainDataMap {
*/
   @Override
  public List prune(FilterResolverIntf filterExp, SegmentProperties segmentProperties,
-  List partitions) throws IOException {
+  List partitions) throws IOException {
 
 // convert filter expr into lucene list query
 List fields = new ArrayList();


[37/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala
--
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala
new file mode 100644
index 000..4c0e637
--- /dev/null
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.preaaggregate
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.execution.command._
+import org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand
+import org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand
+import org.apache.spark.sql.execution.command.timeseries.TimeSeriesUtil
+import org.apache.spark.sql.optimizer.CarbonFilters
+import org.apache.spark.sql.parser.CarbonSpark2SqlParser
+
+import org.apache.carbondata.common.exceptions.sql.MalformedDataMapCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.schema.datamap.DataMapProvider
+import org.apache.carbondata.core.metadata.schema.table.AggregationDataMapSchema
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.metadata.schema.datamap.DataMapProvider
+import org.apache.carbondata.core.metadata.schema.table.{AggregationDataMapSchema, CarbonTable, DataMapSchema}
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, SegmentStatusManager}
+
+/**
+ * Below helper class will be used to create pre-aggregate table
+ * and updating the parent table about the child table information
+ * It will be either success or nothing happen in case of failure:
+ * 1. failed to create pre aggregate table.
+ * 2. failed to update main table
+ *
+ */
+case class PreAggregateTableHelper(
+var parentTable: CarbonTable,
+dataMapName: String,
+dataMapClassName: String,
+dataMapProperties: java.util.Map[String, String],
+queryString: String,
+timeSeriesFunction: Option[String] = None,
+ifNotExistsSet: Boolean = false) {
+
+  var loadCommand: CarbonLoadDataCommand = _
+
+  def initMeta(sparkSession: SparkSession): Seq[Row] = {
+val dmProperties = dataMapProperties.asScala
+val updatedQuery = new CarbonSpark2SqlParser().addPreAggFunction(queryString)
+val df = sparkSession.sql(updatedQuery)
+val fieldRelationMap = PreAggregateUtil.validateActualSelectPlanAndGetAttributes(
+  df.logicalPlan, queryString)
+val fields = fieldRelationMap.keySet.toSeq
+val tableProperties = mutable.Map[String, String]()
+dmProperties.foreach(t => tableProperties.put(t._1, t._2))
+
+val selectTable = PreAggregateUtil.getParentCarbonTable(df.logicalPlan)
+if (!parentTable.getTableName.equalsIgnoreCase(selectTable.getTableName)) {
+  throw new MalformedDataMapCommandException(
+"Parent table name is different in select and create")
+}
+var neworder = Seq[String]()
+val parentOrder = parentTable.getSortColumns(parentTable.getTableName).asScala
+parentOrder.foreach(parentcol =>
+  fields.filter(col => fieldRelationMap(col).aggregateFunction.isEmpty &&
+   parentcol.equals(fieldRelationMap(col).
+ columnTableRelationList.get(0).parentColumnName))
+.map(cols => neworder :+= cols.column)
+)
+tableProperties.put(CarbonCommonConstants.SORT_COLUMNS, neworder.mkString(","))
+tableProperties.put("sort_scope", parentTable.getTableInfo.getFactTable.getTableProperties.asScala.getOrElse("sort_scope", CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT))
+tableProperties
+   

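The helper above is driven by the preaggregate datamap DDL. A hypothetical sketch of the statements that reach it, following the syntax quoted from TestIndexDataMapCommand below (table, column and datamap names are made up):

  // Runs inside a Spark/Carbon test session exposing sql(); names are illustrative.
  sql("create table sales (country string, amount int) stored by 'carbondata'")
  sql("create datamap sales_agg on table sales using 'preaggregate' " +
    "as select country, sum(amount) from sales group by country")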
[49/54] [abbrv] carbondata git commit: [HOTFIX] Add java doc for datamap interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/fc2a7eb3/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/TestIndexDataMapCommand.scala
--
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/TestIndexDataMapCommand.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/TestIndexDataMapCommand.scala
deleted file mode 100644
index a05a8c2..000
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/TestIndexDataMapCommand.scala
+++ /dev/null
@@ -1,285 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.carbondata.spark.testsuite.datamap
-
-import java.io.{File, FilenameFilter}
-
-import org.apache.spark.sql.Row
-import org.apache.spark.sql.test.util.QueryTest
-import org.scalatest.BeforeAndAfterAll
-
-import org.apache.carbondata.common.exceptions.MetadataProcessException
-import org.apache.carbondata.common.exceptions.sql.{MalformedDataMapCommandException, NoSuchDataMapException}
-import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.metadata.CarbonMetadata
-import org.apache.carbondata.core.util.CarbonProperties
-import org.apache.carbondata.core.util.path.CarbonTablePath
-
-class TestIndexDataMapCommand extends QueryTest with BeforeAndAfterAll {
-
-  val testData = s"$resourcesPath/sample.csv"
-
-  override def beforeAll {
-sql("drop table if exists datamaptest")
-sql("drop table if exists datamapshowtest")
-sql("drop table if exists uniqdata")
-sql("create table datamaptest (a string, b string, c string) stored by 'carbondata'")
-  }
-
-  val newClass = "org.apache.spark.sql.CarbonSource"
-
-  test("test datamap create: don't support using non-exist class") {
-intercept[MetadataProcessException] {
-  sql(s"CREATE DATAMAP datamap1 ON TABLE datamaptest USING '$newClass'")
-}
-  }
-
-  test("test datamap create with dmproperties: don't support using non-exist class") {
-intercept[MetadataProcessException] {
-  sql(s"CREATE DATAMAP datamap2 ON TABLE datamaptest USING '$newClass' DMPROPERTIES('key'='value')")
-}
-  }
-
-  test("test datamap create with existing name: don't support using non-exist class") {
-intercept[MetadataProcessException] {
-  sql(
-s"CREATE DATAMAP datamap2 ON TABLE datamaptest USING '$newClass' DMPROPERTIES('key'='value')")
-}
-  }
-
-  test("test datamap create with preagg") {
-sql("drop datamap if exists datamap3 on table datamaptest")
-sql(
-  "create datamap datamap3 on table datamaptest using 'preaggregate' as select count(a) from datamaptest")
-val table = CarbonMetadata.getInstance().getCarbonTable("default", 
"datamaptest")
-assert(table != null)
-val dataMapSchemaList = table.getTableInfo.getDataMapSchemaList
-assert(dataMapSchemaList.size() == 1)
-assert(dataMapSchemaList.get(0).getDataMapName.equals("datamap3"))
-
assert(dataMapSchemaList.get(0).getChildSchema.getTableName.equals("datamaptest_datamap3"))
-  }
-
-  test("check hivemetastore after drop datamap") {
-try {
-  CarbonProperties.getInstance()
-.addProperty(CarbonCommonConstants.ENABLE_HIVE_SCHEMA_META_STORE,
-  "true")
-  sql("drop table if exists hiveMetaStoreTable")
-  sql("create table hiveMetaStoreTable (a string, b string, c string) stored by 'carbondata'")
-
-  sql(
-"create datamap datamap_hiveMetaStoreTable on table hiveMetaStoreTable using 'preaggregate' as select count(a) from hiveMetaStoreTable")
-  checkExistence(sql("show datamap on table hiveMetaStoreTable"), true, "datamap_hiveMetaStoreTable")
-
-  sql("drop datamap datamap_hiveMetaStoreTable on table hiveMetaStoreTable")
-  checkExistence(sql("show datamap on table hiveMetaStoreTable"), false, "datamap_hiveMetaStoreTable")
-
-} finally {
-  sql("drop table hiveMetaStoreTable")
-  CarbonProperties.getInstance()
-.addProperty(CarbonCommonConstants.ENABLE_HIVE_SCHEMA_META_STORE,
- 

[19/54] [abbrv] carbondata git commit: [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

2018-03-08 Thread ravipesala
[CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

Pick up the no-sort fields in the row and pack them as a byte array, and skip parsing them during merge sort to reduce CPU consumption

This closes #1792


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/2b41f140
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/2b41f140
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/2b41f140

Branch: refs/heads/master
Commit: 2b41f140229c1799178313b257f3779908e69010
Parents: 89cfd8e
Author: xuchuanyin 
Authored: Thu Feb 8 14:35:14 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../carbondata/core/util/NonDictionaryUtil.java |  67 +--
 .../presto/util/CarbonDataStoreCreator.scala|   1 -
 .../load/DataLoadProcessorStepOnSpark.scala |   6 +-
 .../loading/row/IntermediateSortTempRow.java| 117 +
 .../loading/sort/SortStepRowHandler.java| 466 +++
 .../loading/sort/SortStepRowUtil.java   | 103 
 .../sort/unsafe/UnsafeCarbonRowPage.java| 331 ++---
 .../loading/sort/unsafe/UnsafeSortDataRows.java |  57 +--
 .../unsafe/comparator/UnsafeRowComparator.java  |  95 ++--
 .../UnsafeRowComparatorForNormalDIms.java   |  59 ---
 .../UnsafeRowComparatorForNormalDims.java   |  59 +++
 .../sort/unsafe/holder/SortTempChunkHolder.java |   3 +-
 .../holder/UnsafeFinalMergePageHolder.java  |  19 +-
 .../unsafe/holder/UnsafeInmemoryHolder.java |  21 +-
 .../holder/UnsafeSortTempFileChunkHolder.java   | 138 ++
 .../merger/UnsafeIntermediateFileMerger.java| 118 +
 .../UnsafeSingleThreadFinalSortFilesMerger.java |  27 +-
 .../merger/CompactionResultSortProcessor.java   |   1 -
 .../sort/sortdata/IntermediateFileMerger.java   |  95 +---
 .../IntermediateSortTempRowComparator.java  |  73 +++
 .../sort/sortdata/NewRowComparator.java |   5 +-
 .../sortdata/NewRowComparatorForNormalDims.java |   3 +-
 .../processing/sort/sortdata/RowComparator.java |  94 
 .../sortdata/RowComparatorForNormalDims.java|  62 ---
 .../SingleThreadFinalSortFilesMerger.java   |  25 +-
 .../processing/sort/sortdata/SortDataRows.java  |  85 +---
 .../sort/sortdata/SortTempFileChunkHolder.java  | 174 ++-
 .../sort/sortdata/TableFieldStat.java   | 176 +++
 28 files changed, 1186 insertions(+), 1294 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/2b41f140/core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java b/core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java
index d6ecfbc..fca1244 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java
@@ -82,18 +82,26 @@ public class NonDictionaryUtil {
   }
 
   /**
-   * Method to get the required Dimension from obj []
+   * Method to get the required dictionary Dimension from obj []
*
* @param index
* @param row
* @return
*/
-  public static Integer getDimension(int index, Object[] row) {
-
-Integer[] dimensions = (Integer[]) row[WriteStepRowUtil.DICTIONARY_DIMENSION];
-
+  public static int getDictDimension(int index, Object[] row) {
+int[] dimensions = (int[]) row[WriteStepRowUtil.DICTIONARY_DIMENSION];
 return dimensions[index];
+  }
 
+  /**
+   * Method to get the required non-dictionary & complex from 3-parted row
+   * @param index
+   * @param row
+   * @return
+   */
+  public static byte[] getNoDictOrComplex(int index, Object[] row) {
+byte[][] nonDictArray = (byte[][]) row[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX];
+return nonDictArray[index];
   }
 
   /**
@@ -108,60 +116,11 @@ public class NonDictionaryUtil {
 return measures[index];
   }
 
-  public static byte[] getByteArrayForNoDictionaryCols(Object[] row) {
-
-return (byte[]) row[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX];
-  }
-
  public static void prepareOutObj(Object[] out, int[] dimArray, byte[][] byteBufferArr,
   Object[] measureArray) {
-
 out[WriteStepRowUtil.DICTIONARY_DIMENSION] = dimArray;
 out[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX] = byteBufferArr;
 out[WriteStepRowUtil.MEASURE] = measureArray;
 
   }
-
-  /**
-   * This method will extract the single dimension from the complete high card dims byte[].
-   *
-   * The format of the byte [] will be,  Totallength,CompleteStartOffsets,Dat
-   *
-   * @param highCardArr
-   * @param index
-   * @param highCardinalityCount
-   * 

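The optimization above treats no-sort fields as opaque bytes: sort columns stay parsed so the merge can compare them, while everything else is packed once and carried through untouched. A minimal Scala sketch of the packing idea (an illustration, not the project's SortStepRowHandler; the [length][bytes] layout is invented for the example):

  import java.nio.ByteBuffer

  // Sort columns stay typed for comparison; the rest travels as one blob.
  case class SortTempRow(sortCols: Array[AnyRef], packedNoSortFields: Array[Byte])

  // Pack each no-sort field as [length][bytes] so it can be unpacked later.
  def packNoSortFields(fields: Array[Array[Byte]]): Array[Byte] = {
    val buf = ByteBuffer.allocate(4 * fields.length + fields.map(_.length).sum)
    fields.foreach { f => buf.putInt(f.length); buf.put(f) }
    buf.array()
  }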
[18/54] [abbrv] carbondata git commit: [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/2b41f140/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
index 11b3d43..527452a 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
@@ -31,15 +31,14 @@ import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
 import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.DataTypeUtil;
-import org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage;
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
import org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException;
-import org.apache.carbondata.processing.sort.sortdata.NewRowComparator;
+import org.apache.carbondata.processing.sort.sortdata.IntermediateSortTempRowComparator;
 import org.apache.carbondata.processing.sort.sortdata.SortParameters;
+import org.apache.carbondata.processing.sort.sortdata.TableFieldStat;
 
 public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
 
@@ -63,21 +62,15 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
* entry count
*/
   private int entryCount;
-
   /**
* return row
*/
-  private Object[] returnRow;
-  private int dimCnt;
-  private int complexCnt;
-  private int measureCnt;
-  private boolean[] isNoDictionaryDimensionColumn;
-  private DataType[] measureDataTypes;
+  private IntermediateSortTempRow returnRow;
   private int readBufferSize;
   private String compressorName;
-  private Object[][] currentBuffer;
+  private IntermediateSortTempRow[] currentBuffer;
 
-  private Object[][] backupBuffer;
+  private IntermediateSortTempRow[] backupBuffer;
 
   private boolean isBackupFilled;
 
@@ -100,27 +93,21 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
 
   private int numberOfObjectRead;
 
-  private int nullSetWordsLength;
-
-  private Comparator comparator;
-
+  private TableFieldStat tableFieldStat;
+  private SortStepRowHandler sortStepRowHandler;
+  private Comparator comparator;
   /**
* Constructor to initialize
*/
  public UnsafeSortTempFileChunkHolder(File tempFile, SortParameters parameters) {
 // set temp file
 this.tempFile = tempFile;
-this.dimCnt = parameters.getDimColCount();
-this.complexCnt = parameters.getComplexDimColCount();
-this.measureCnt = parameters.getMeasureColCount();
-this.isNoDictionaryDimensionColumn = parameters.getNoDictionaryDimnesionColumn();
-this.measureDataTypes = parameters.getMeasureDataType();
 this.readBufferSize = parameters.getBufferSize();
 this.compressorName = parameters.getSortTempCompressorName();
-
+this.tableFieldStat = new TableFieldStat(parameters);
+this.sortStepRowHandler = new SortStepRowHandler(tableFieldStat);
 this.executorService = Executors.newFixedThreadPool(1);
-this.nullSetWordsLength = ((parameters.getMeasureColCount() - 1) >> 6) + 1;
-comparator = new NewRowComparator(parameters.getNoDictionarySortColumn());
+comparator = new IntermediateSortTempRowComparator(parameters.getNoDictionarySortColumn());
 initialize();
   }
 
@@ -169,11 +156,17 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
*
* @throws CarbonSortKeyAndGroupByException problem while reading
*/
+  @Override
   public void readRow() throws CarbonSortKeyAndGroupByException {
 if (prefetch) {
   fillDataForPrefetch();
 } else {
-  this.returnRow = getRowFromStream();
+  try {
+this.returnRow = sortStepRowHandler.readIntermediateSortTempRowFromInputStream(stream);
+this.numberOfObjectRead++;
+  } catch (IOException e) {
+throw new CarbonSortKeyAndGroupByException("Problems while reading row", e);
+  }
 }
   }
 
@@ -207,63 +200,22 @@ public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
   }
 
   /**

[27/54] [abbrv] carbondata git commit: [CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89cfd8e0/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataLoadingUtil.scala
--
diff --git a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataLoadingUtil.scala b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataLoadingUtil.scala
index 8d394db..e69de29 100644
--- a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataLoadingUtil.scala
+++ b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataLoadingUtil.scala
@@ -1,610 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.carbondata.spark.util
-
-import java.text.SimpleDateFormat
-import java.util
-import java.util.{Date, List, Locale}
-
-import scala.collection.{immutable, mutable}
-import scala.collection.JavaConverters._
-import scala.collection.mutable.ArrayBuffer
-
-import org.apache.commons.lang3.StringUtils
-import org.apache.hadoop.conf.Configuration
-import org.apache.hadoop.fs.Path
-import org.apache.hadoop.mapred.JobConf
-import org.apache.hadoop.mapreduce.{TaskAttemptID, TaskType}
-import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, FileSplit}
-import org.apache.hadoop.mapreduce.task.{JobContextImpl, TaskAttemptContextImpl}
-import org.apache.spark.deploy.SparkHadoopUtil
-import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.SparkSession
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
-import org.apache.spark.sql.execution.datasources.{FilePartition, FileScanRDD, PartitionedFile}
-import org.apache.spark.sql.util.CarbonException
-import org.apache.spark.sql.util.SparkSQLUtil.sessionState
-
-import org.apache.carbondata.common.constants.LoggerAction
-import org.apache.carbondata.common.logging.{LogService, LogServiceFactory}
-import org.apache.carbondata.core.constants.{CarbonCommonConstants, CarbonLoadOptionConstants}
-import org.apache.carbondata.core.indexstore.PartitionSpec
-import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, LockUsage}
-import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
-import org.apache.carbondata.core.metadata.schema.table.CarbonTable
-import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, SegmentStatus, SegmentStatusManager}
-import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
-import org.apache.carbondata.processing.loading.constants.DataLoadProcessorConstants
-import org.apache.carbondata.processing.loading.csvinput.CSVInputFormat
-import org.apache.carbondata.processing.loading.model.{CarbonDataLoadSchema, CarbonLoadModel}
-import org.apache.carbondata.processing.util.{CarbonLoaderUtil, DeleteLoadFolders, TableOptionConstant}
-import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
-import org.apache.carbondata.spark.load.DataLoadProcessBuilderOnSpark.LOGGER
-import org.apache.carbondata.spark.load.ValidateUtil
-import org.apache.carbondata.spark.rdd.SerializableConfiguration
-
-/**
- * the util object of data loading
- */
-object DataLoadingUtil {
-
-  val LOGGER: LogService = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
-
-  /**
-   * get data loading options and initialise default value
-   */
-  def getDataLoadingOptions(
-  carbonProperty: CarbonProperties,
-  options: immutable.Map[String, String]): mutable.Map[String, String] = {
-val optionsFinal = scala.collection.mutable.Map[String, String]()
-optionsFinal.put("delimiter", options.getOrElse("delimiter", ","))
-optionsFinal.put("quotechar", options.getOrElse("quotechar", "\""))
-optionsFinal.put("fileheader", options.getOrElse("fileheader", ""))
-optionsFinal.put("commentchar", options.getOrElse("commentchar", "#"))
-optionsFinal.put("columndict", options.getOrElse("columndict", null))
-
-optionsFinal.put("escapechar",
-  CarbonLoaderUtil.getEscapeChar(options.getOrElse("escapechar", "\\")))
-
-optionsFinal.put(
-  "serialization_null_format",
-  

[17/54] [abbrv] carbondata git commit: [CARBONDATA-2186] Add InterfaceAudience.Internal to annotate internal interface

2018-03-08 Thread ravipesala
[CARBONDATA-2186] Add InterfaceAudience.Internal to annotate internal interface

This closes #1986


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e72bfd1a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e72bfd1a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e72bfd1a

Branch: refs/heads/master
Commit: e72bfd1a949b9992026ebaa9df3224e4fa004422
Parents: 0539662
Author: Jacky Li 
Authored: Tue Feb 20 11:16:53 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../java/org/apache/carbondata/common/Maps.java  |  2 +-
 .../org/apache/carbondata/common/Strings.java|  2 +-
 .../common/annotations/InterfaceAudience.java| 19 ++-
 .../common/annotations/InterfaceStability.java   |  2 +-
 .../loading/model/CarbonLoadModelBuilder.java|  2 +-
 .../processing/loading/model/LoadOption.java |  2 +-
 .../carbondata/sdk/file/CSVCarbonWriter.java |  4 +---
 7 files changed, 20 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/e72bfd1a/common/src/main/java/org/apache/carbondata/common/Maps.java
--
diff --git a/common/src/main/java/org/apache/carbondata/common/Maps.java b/common/src/main/java/org/apache/carbondata/common/Maps.java
index 14fc329..4e76192 100644
--- a/common/src/main/java/org/apache/carbondata/common/Maps.java
+++ b/common/src/main/java/org/apache/carbondata/common/Maps.java
@@ -21,7 +21,7 @@ import java.util.Map;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
 
-@InterfaceAudience.Developer
+@InterfaceAudience.Internal
 public class Maps {
 
   /**

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e72bfd1a/common/src/main/java/org/apache/carbondata/common/Strings.java
--
diff --git a/common/src/main/java/org/apache/carbondata/common/Strings.java b/common/src/main/java/org/apache/carbondata/common/Strings.java
index 08fdc3c..23c7f9f 100644
--- a/common/src/main/java/org/apache/carbondata/common/Strings.java
+++ b/common/src/main/java/org/apache/carbondata/common/Strings.java
@@ -21,7 +21,7 @@ import java.util.Objects;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
 
-@InterfaceAudience.Developer
+@InterfaceAudience.Internal
 public class Strings {
 
   /**

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e72bfd1a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
--
diff --git a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
index fa9729d..8d214ff 100644
--- a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
+++ b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
@@ -25,10 +25,10 @@ import java.lang.annotation.RetentionPolicy;
  * This annotation is ported and modified from Apache Hadoop project.
  *
 * Annotation to inform users of a package, class or method's intended audience.
- * Currently the audience can be {@link User}, {@link Developer}
+ * Currently the audience can be {@link User}, {@link Developer}, {@link Internal}
  *
  * Public classes that are not marked with this annotation must be
- * considered by default as {@link Developer}.
+ * considered by default as {@link Internal}.
  *
  * External applications must only use classes that are marked {@link User}.
  *
@@ -47,12 +47,21 @@ public class InterfaceAudience {
   public @interface User { }
 
   /**
-   * Intended only for developers to extend interface for CarbonData project
-   * For example, new Datamap implementations.
+   * Intended for developers to develop extension for Apache CarbonData project
+   * For example, "Index DataMap" to add a new index implementation, etc
*/
   @Documented
   @Retention(RetentionPolicy.RUNTIME)
-  public @interface Developer { }
+  public @interface Developer {
+String[] value();
+  }
+
+  /**
+   * Intended only for internal usage within Apache CarbonData project.
+   */
+  @Documented
+  @Retention(RetentionPolicy.RUNTIME)
+  public @interface Internal { }
 
   private InterfaceAudience() { } // Audience can't exist on its own
 }

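A hypothetical usage sketch of the annotations after this change (class names are made up); note that Developer now carries a String[] value describing the extension point:

  import org.apache.carbondata.common.annotations.InterfaceAudience

  // Internal: only for use inside the Apache CarbonData project itself.
  @InterfaceAudience.Internal
  class InternalStringUtils

  // Developer now names the extension point it is intended for.
  @InterfaceAudience.Developer(Array("DataMap"))
  class ExampleDataMapExtension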
http://git-wip-us.apache.org/repos/asf/carbondata/blob/e72bfd1a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceStability.java
--
diff --git 

[52/54] [abbrv] carbondata git commit: [HOTFIX] Add java doc for datamap interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/fc2a7eb3/core/src/main/java/org/apache/carbondata/core/indexstore/FineGrainBlocklet.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/FineGrainBlocklet.java b/core/src/main/java/org/apache/carbondata/core/indexstore/FineGrainBlocklet.java
deleted file mode 100644
index 229e5bf..000
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/FineGrainBlocklet.java
+++ /dev/null
@@ -1,128 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.carbondata.core.indexstore;
-
-import java.io.DataInput;
-import java.io.DataOutput;
-import java.io.IOException;
-import java.io.Serializable;
-import java.util.ArrayList;
-import java.util.BitSet;
-import java.util.List;
-
-import org.apache.carbondata.core.constants.CarbonV3DataFormatConstants;
-import org.apache.carbondata.core.metadata.schema.table.Writable;
-import org.apache.carbondata.core.util.BitSetGroup;
-
-/**
- * FineGrainBlocklet
- */
-public class FineGrainBlocklet extends Blocklet implements Serializable {
-
-  private List pages;
-
-  public FineGrainBlocklet(String blockId, String blockletId, List pages) {
-super(blockId, blockletId);
-this.pages = pages;
-  }
-
-  // For serialization purpose
-  public FineGrainBlocklet() {
-
-  }
-
-  public List getPages() {
-return pages;
-  }
-
-  public static class Page implements Writable,Serializable {
-
-private int pageId;
-
-private int[] rowId;
-
-public BitSet getBitSet() {
-  BitSet bitSet =
-  new BitSet(CarbonV3DataFormatConstants.NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT);
-  for (int row : rowId) {
-bitSet.set(row);
-  }
-  return bitSet;
-}
-
-@Override public void write(DataOutput out) throws IOException {
-  out.writeInt(pageId);
-  out.writeInt(rowId.length);
-  for (int i = 0; i < rowId.length; i++) {
-out.writeInt(rowId[i]);
-  }
-}
-
-@Override public void readFields(DataInput in) throws IOException {
-  pageId = in.readInt();
-  int length = in.readInt();
-  rowId = new int[length];
-  for (int i = 0; i < length; i++) {
-rowId[i] = in.readInt();
-  }
-}
-
-public void setPageId(int pageId) {
-  this.pageId = pageId;
-}
-
-public void setRowId(int[] rowId) {
-  this.rowId = rowId;
-}
-  }
-
-  public BitSetGroup getBitSetGroup(int numberOfPages) {
-BitSetGroup bitSetGroup = new BitSetGroup(numberOfPages);
-for (int i = 0; i < pages.size(); i++) {
-  bitSetGroup.setBitSet(pages.get(i).getBitSet(), pages.get(i).pageId);
-}
-return bitSetGroup;
-  }
-
-  @Override public void write(DataOutput out) throws IOException {
-super.write(out);
-int size = pages.size();
-out.writeInt(size);
-for (Page page : pages) {
-  page.write(out);
-}
-  }
-
-  @Override public void readFields(DataInput in) throws IOException {
-super.readFields(in);
-int size = in.readInt();
-pages = new ArrayList<>(size);
-for (int i = 0; i < size; i++) {
-  Page page = new Page();
-  page.readFields(in);
-  pages.add(page);
-}
-  }
-
-  @Override public boolean equals(Object o) {
-return super.equals(o);
-  }
-
-  @Override public int hashCode() {
-return super.hashCode();
-  }
-}

http://git-wip-us.apache.org/repos/asf/carbondata/blob/fc2a7eb3/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
new file mode 100644
index 000..3ca9c5a
--- /dev/null
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java
@@ -0,0 +1,971 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.

[30/54] [abbrv] carbondata git commit: [CARBONDATA-2172][Lucene] Add text_columns property for Lucene DataMap

2018-03-08 Thread ravipesala
[CARBONDATA-2172][Lucene] Add text_columns property for Lucene DataMap

Add text_columns property for Lucene DataMap

This closes #2019
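
A hedged usage sketch (syntax reconstructed from the commit title, not taken from
this digest's hunks; table, datamap and column names are made up):

    // Scala; `spark` is assumed to be a CarbonData-enabled SparkSession
    spark.sql(
      "CREATE TABLE datamap_test (id INT, name STRING, city STRING) " +
        "STORED BY 'carbondata'")
    // TEXT_COLUMNS (assumed property key) picks the string columns the Lucene
    // index should cover
    spark.sql(
      "CREATE DATAMAP dm ON TABLE datamap_test USING 'lucene' " +
        "DMPROPERTIES('TEXT_COLUMNS'='name,city')")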


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/d23f7fad
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/d23f7fad
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/d23f7fad

Branch: refs/heads/master
Commit: d23f7fad1f7db029d1dd0cc8e3db7a5b79463179
Parents: f9291cd
Author: QiangCai 
Authored: Thu Mar 1 15:40:01 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../core/datamap/DataMapStoreManager.java   | 16 ++--
 .../core/datamap/dev/DataMapFactory.java|  4 +-
 .../blockletindex/BlockletDataMapFactory.java   |  2 +-
 .../ThriftWrapperSchemaConverterImpl.java   |  2 +-
 .../schema/datamap/DataMapProvider.java |  4 +
 .../schema/table/AggregationDataMapSchema.java  |  2 +-
 .../core/metadata/schema/table/CarbonTable.java | 16 +++-
 .../metadata/schema/table/DataMapSchema.java| 44 --
 .../core/metadata/schema/table/TableInfo.java   |  2 +-
 .../core/metadata/schema/table/TableSchema.java | 16 ++--
 datamap/lucene/pom.xml  |  5 ++
 .../lucene/LuceneDataMapFactoryBase.java| 89 ++--
 .../lucene/LuceneFineGrainDataMapSuite.scala| 58 -
 .../preaggregate/TestPreAggCreateCommand.scala  | 75 -
 .../preaggregate/TestPreAggregateLoad.scala |  7 +-
 .../timeseries/TestTimeseriesDataLoad.scala |  6 +-
 .../testsuite/datamap/CGDataMapTestCase.scala   |  4 +-
 .../testsuite/datamap/DataMapWriterSuite.scala  |  1 +
 .../testsuite/datamap/FGDataMapTestCase.scala   |  4 +-
 .../iud/InsertOverwriteConcurrentTest.scala |  0
 .../TestInsertAndOtherCommandConcurrent.scala   | 38 -
 .../carbondata/spark/load/ValidateUtil.scala|  0
 .../carbondata/spark/util/DataLoadingUtil.scala |  0
 .../carbondata/datamap/DataMapManager.java  |  4 +-
 .../datamap/IndexDataMapProvider.java   |  6 +-
 .../datamap/PreAggregateDataMapProvider.java|  2 +-
 .../datamap/TimeseriesDataMapProvider.java  |  4 +-
 .../datamap/CarbonCreateDataMapCommand.scala| 21 +++--
 .../datamap/CarbonDataMapShowCommand.scala  |  2 +-
 .../preaaggregate/PreAggregateListeners.scala   |  9 +-
 .../command/timeseries/TimeSeriesUtil.scala | 16 ++--
 .../loading/model/CarbonLoadModelBuilder.java   | 16 ++--
 .../processing/loading/model/LoadOption.java| 23 -
 33 files changed, 311 insertions(+), 187 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/d23f7fad/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index ab339e8..a8d467f 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -33,7 +33,6 @@ import org.apache.carbondata.core.indexstore.BlockletDetailsFetcher;
 import org.apache.carbondata.core.indexstore.SegmentPropertiesFetcher;
 import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
-import org.apache.carbondata.core.metadata.schema.datamap.DataMapProvider;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
 import org.apache.carbondata.core.mutate.SegmentUpdateDetails;
@@ -89,8 +88,7 @@ public final class DataMapStoreManager {
     List<TableDataMap> dataMaps = new ArrayList<>();
     if (dataMapSchemaList != null) {
       for (DataMapSchema dataMapSchema : dataMapSchemaList) {
-        if (!dataMapSchema.getClassName().equalsIgnoreCase(
-            DataMapProvider.PREAGGREGATE.toString())) {
+        if (dataMapSchema.isIndexDataMap()) {
           dataMaps.add(getDataMap(carbonTable.getAbsoluteTableIdentifier(), dataMapSchema));
         }
       }
@@ -144,26 +142,28 @@ public final class DataMapStoreManager {
    * Return a new datamap instance registered in the store manager.
    * The datamap is created using the datamap name, the datamap factory class and
    * the table identifier.
    */
-  private TableDataMap createAndRegisterDataMap(AbsoluteTableIdentifier identifier,
+  // TODO: make it private
+  public TableDataMap createAndRegisterDataMap(AbsoluteTableIdentifier identifier,
       DataMapSchema dataMapSchema) throws MalformedDataMapCommandException, 

[43/54] [abbrv] carbondata git commit: [CARBONDATA-2206] support lucene index datamap

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/bbb10922/processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java b/processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java
index 80d8154..1c9ccc8 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java
@@ -110,7 +110,7 @@ public class CarbonFactDataWriterImplV3 extends AbstractFactDataWriter {
    * @param tablePage
    */
   @Override public void writeTablePage(TablePage tablePage)
-      throws CarbonDataWriterException {
+      throws CarbonDataWriterException, IOException {
     // condition for writing all the pages
     if (!tablePage.isLastPage()) {
       boolean isAdded = false;
@@ -148,7 +148,7 @@ public class CarbonFactDataWriterImplV3 extends AbstractFactDataWriter {
     }
   }
 
-  private void addPageData(TablePage tablePage) {
+  private void addPageData(TablePage tablePage) throws IOException {
     blockletDataHolder.addPage(tablePage);
     if (listener != null) {
       if (pageId == 0) {



[31/54] [abbrv] carbondata git commit: [CARBONDATA-2216][Test] Fix bugs in sdv tests

2018-03-08 Thread ravipesala
[CARBONDATA-2216][Test] Fix bugs in sdv tests

This closes #2012


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/f9291cdb
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/f9291cdb
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/f9291cdb

Branch: refs/heads/master
Commit: f9291cdb3b480d3ad46f963b5c772b8fa3185034
Parents: 5397c05
Author: xuchuanyin 
Authored: Wed Feb 28 16:02:55 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../core/datamap/dev/expr/AndDataMapExprWrapper.java |  2 +-
 .../core/datamap/dev/expr/DataMapExprWrapper.java|  2 +-
 .../datamap/dev/expr/DataMapExprWrapperImpl.java |  2 +-
 .../core/datamap/dev/expr/OrDataMapExprWrapper.java  |  2 +-
 .../hadoop/api/CarbonTableInputFormat.java   |  3 ++-
 .../cluster/sdv/generated/MergeIndexTestCase.scala   | 15 +--
 .../preaaggregate/PreAggregateTableHelper.scala  |  5 -
 .../apache/spark/sql/optimizer/CarbonFilters.scala   |  4 +---
 8 files changed, 20 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/f9291cdb/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/AndDataMapExprWrapper.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/AndDataMapExprWrapper.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/AndDataMapExprWrapper.java
index 12b60b4..74469d7 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/AndDataMapExprWrapper.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/AndDataMapExprWrapper.java
@@ -20,8 +20,8 @@ import java.io.IOException;
 import java.util.ArrayList;
 import java.util.List;
 
-import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.DataMapLevel;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f9291cdb/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapper.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapper.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapper.java
index ddb19e9..14cfc33 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapper.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapper.java
@@ -20,8 +20,8 @@ import java.io.IOException;
 import java.io.Serializable;
 import java.util.List;
 
-import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.DataMapLevel;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f9291cdb/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapperImpl.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapperImpl.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapperImpl.java
index d4be416..c6b011c 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapperImpl.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/DataMapExprWrapperImpl.java
@@ -22,8 +22,8 @@ import java.util.List;
 import java.util.UUID;
 
 import org.apache.carbondata.core.datamap.DataMapDistributable;
-import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.DataMapLevel;
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datamap.TableDataMap;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f9291cdb/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/OrDataMapExprWrapper.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/OrDataMapExprWrapper.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/expr/OrDataMapExprWrapper.java

[34/54] [abbrv] carbondata git commit: [CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data loading

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/d5396b15/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java
deleted file mode 100644
index f605b22..0000000
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java
+++ /dev/null
@@ -1,263 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.carbondata.processing.loading.sort.impl;
-
-import java.io.File;
-import java.util.Iterator;
-import java.util.List;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-import java.util.concurrent.TimeUnit;
-
-import org.apache.carbondata.common.CarbonIterator;
-import org.apache.carbondata.common.logging.LogService;
-import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.datastore.row.CarbonRow;
-import org.apache.carbondata.core.memory.MemoryException;
-import org.apache.carbondata.core.metadata.schema.BucketingInfo;
-import org.apache.carbondata.core.util.CarbonProperties;
-import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
-import org.apache.carbondata.processing.loading.DataField;
-import org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException;
-import org.apache.carbondata.processing.loading.row.CarbonRowBatch;
-import org.apache.carbondata.processing.loading.sort.AbstractMergeSorter;
-import org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage;
-import org.apache.carbondata.processing.loading.sort.unsafe.UnsafeSortDataRows;
-import org.apache.carbondata.processing.loading.sort.unsafe.merger.UnsafeIntermediateMerger;
-import org.apache.carbondata.processing.loading.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger;
-import org.apache.carbondata.processing.sort.sortdata.SortParameters;
-import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
-
-/**
- * It reads data from an array of iterators in parallel and does a merge sort.
- * First it sorts the data and writes it to temp files. These temp files will be
- * merge sorted to get the final merge sort result.
- * This step is specifically for bucketing: it sorts each bucket's data separately
- * and writes it to temp files.
- */
-public class UnsafeParallelReadMergeSorterWithBucketingImpl extends AbstractMergeSorter {
-
-  private static final LogService LOGGER =
-      LogServiceFactory.getLogService(
-          UnsafeParallelReadMergeSorterWithBucketingImpl.class.getName());
-
-  private SortParameters sortParameters;
-
-  private BucketingInfo bucketingInfo;
-
-  public UnsafeParallelReadMergeSorterWithBucketingImpl(DataField[] inputDataFields,
-      BucketingInfo bucketingInfo) {
-    this.bucketingInfo = bucketingInfo;
-  }
-
-  @Override public void initialize(SortParameters sortParameters) {
-    this.sortParameters = sortParameters;
-  }
-
-  @Override public Iterator<CarbonRowBatch>[] sort(Iterator<CarbonRowBatch>[] iterators)
-      throws CarbonDataLoadingException {
-    UnsafeSortDataRows[] sortDataRows = new UnsafeSortDataRows[bucketingInfo.getNumberOfBuckets()];
-    UnsafeIntermediateMerger[] intermediateFileMergers =
-        new UnsafeIntermediateMerger[sortDataRows.length];
-    int inMemoryChunkSizeInMB = CarbonProperties.getInstance().getSortMemoryChunkSizeInMB();
-    inMemoryChunkSizeInMB = inMemoryChunkSizeInMB / bucketingInfo.getNumberOfBuckets();
-    if (inMemoryChunkSizeInMB < 5) {
-      inMemoryChunkSizeInMB = 5;
-    }
-    try {
-      for (int i = 0; i < bucketingInfo.getNumberOfBuckets(); i++) {
-        SortParameters parameters = sortParameters.getCopy();
-        parameters.setPartitionID(i + "");
-        setTempLocation(parameters);
-

[11/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStore.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStore.java b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStore.java
index 8c8d08f..a689d8e 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStore.java
@@ -124,22 +124,22 @@ public class UnsafeFixedLengthDimensionDataChunkStore
   /**
    * to compare the two byte array
    *
-   * @param index        index of first byte array
+   * @param rowId        index of first byte array
    * @param compareValue value to be compared
    * @return compare result
    */
-  @Override public int compareTo(int index, byte[] compareValue) {
+  @Override public int compareTo(int rowId, byte[] compareValue) {
     // based on index we need to calculate the actual position in memory block
-    index = index * columnValueSize;
+    rowId = rowId * columnValueSize;
     int compareResult = 0;
     for (int i = 0; i < compareValue.length; i++) {
       compareResult = (CarbonUnsafe.getUnsafe()
-          .getByte(dataPageMemoryBlock.getBaseObject(), dataPageMemoryBlock.getBaseOffset() + index)
+          .getByte(dataPageMemoryBlock.getBaseObject(), dataPageMemoryBlock.getBaseOffset() + rowId)
           & 0xff) - (compareValue[i] & 0xff);
       if (compareResult != 0) {
         break;
       }
-      index++;
+      rowId++;
     }
     return compareResult;
   }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeVariableLengthDimesionDataChunkStore.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeVariableLengthDimesionDataChunkStore.java b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeVariableLengthDimesionDataChunkStore.java
index 36b2bd8..e1eb378 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeVariableLengthDimesionDataChunkStore.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeVariableLengthDimesionDataChunkStore.java
@@ -189,11 +189,11 @@ public class UnsafeVariableLengthDimesionDataChunkStore
   /**
    * to compare the two byte array
    *
-   * @param index index of first byte array
+   * @param rowId index of first byte array
    * @param compareValue value to be compared
    * @return compare result
    */
-  @Override public int compareTo(int index, byte[] compareValue) {
+  @Override public int compareTo(int rowId, byte[] compareValue) {
     // now to get the row from memory block we need to do following thing
     // 1. first get the current offset
     // 2. if it's not a last row- get the next row offset
     // 3. if it's last row get the complete data length
     // else subtract the current row offset
     // with complete data length get the offset of set of data
     int currentDataOffset = CarbonUnsafe.getUnsafe().getInt(dataPageMemoryBlock.getBaseObject(),
-        dataPageMemoryBlock.getBaseOffset() + this.dataPointersOffsets + ((long)index
+        dataPageMemoryBlock.getBaseOffset() + this.dataPointersOffsets + ((long) rowId
             * CarbonCommonConstants.INT_SIZE_IN_BYTE * 1L));
     short length = 0;
     // calculating the length of data
-    if (index < numberOfRows - 1) {
+    if (rowId < numberOfRows - 1) {
       int OffsetOfNextdata = CarbonUnsafe.getUnsafe().getInt(dataPageMemoryBlock.getBaseObject(),
-          dataPageMemoryBlock.getBaseOffset() + this.dataPointersOffsets + ((index + 1)
+          dataPageMemoryBlock.getBaseOffset() + this.dataPointersOffsets + ((rowId + 1)
               * CarbonCommonConstants.INT_SIZE_IN_BYTE));
       length = (short) (OffsetOfNextdata - (currentDataOffset
           + CarbonCommonConstants.SHORT_SIZE_IN_BYTE));

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/datastore/columnar/ColumnGroupModel.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/columnar/ColumnGroupModel.java b/core/src/main/java/org/apache/carbondata/core/datastore/columnar/ColumnGroupModel.java
index 74d268a..e2a4161 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/columnar/ColumnGroupModel.java

[14/54] [abbrv] carbondata git commit: [CARBONDATA-2023][DataLoad] Add size based block allocation in data loading

2018-03-08 Thread ravipesala
[CARBONDATA-2023][DataLoad] Add size based block allocation in data loading

CarbonData assigns blocks to nodes at the beginning of data loading.
The previous block allocation strategy was block-number based, so it
suffered from data skew when the sizes of the input files differed a lot.

We introduced a size-based block allocation strategy to optimize data
loading performance in skewed-data scenarios (see the sketch below).

This closes #1808
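
The allocation code itself is not shown in this digest, so the following is an
illustrative Scala sketch of the greedy idea only (not the actual CarbonLoaderUtil
implementation): sort blocks largest-first and always hand the next block to the
node with the least assigned bytes so far.

    case class Block(path: String, sizeInBytes: Long)

    // Greedy size-based assignment: largest blocks first, each one to the
    // currently least-loaded node, so total bytes per node stay balanced.
    def assignBySize(blocks: Seq[Block], nodes: Seq[String]): Map[String, Seq[Block]] = {
      val load = scala.collection.mutable.Map(nodes.map(_ -> 0L): _*)
      val plan = scala.collection.mutable.Map(nodes.map(_ -> Seq.empty[Block]): _*)
      for (b <- blocks.sortBy(-_.sizeInBytes)) {
        val target = load.minBy(_._2)._1
        load(target) += b.sizeInBytes
        plan(target) = plan(target) :+ b
      }
      plan.toMap
    }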


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/9a423c24
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/9a423c24
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/9a423c24

Branch: refs/heads/master
Commit: 9a423c241b01d6e0b3f1946c052230adb47f4538
Parents: 2b41f14
Author: xuchuanyin 
Authored: Thu Feb 8 14:42:39 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../constants/CarbonLoadOptionConstants.java|  10 +
 .../core/datastore/block/TableBlockInfo.java|  29 ++
 .../carbondata/core/util/CarbonProperties.java  |  11 +
 docs/useful-tips-on-carbondata.md   |   1 +
 .../spark/rdd/NewCarbonDataLoadRDD.scala|   4 +-
 .../spark/sql/hive/DistributionUtil.scala   |   2 +-
 .../spark/rdd/CarbonDataRDDFactory.scala|  18 +-
 .../merger/NodeMultiBlockRelation.java  |  40 ++
 .../processing/util/CarbonLoaderUtil.java   | 480 ---
 .../processing/util/CarbonLoaderUtilTest.java   | 125 +
 10 files changed, 545 insertions(+), 175 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/9a423c24/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
index bcfeba0..a6bf60f 100644
--- a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
+++ b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
@@ -114,4 +114,14 @@ public final class CarbonLoadOptionConstants {
*/
   public static final int MAX_EXTERNAL_DICTIONARY_SIZE = 1000;
 
+  /**
+   * enable block size based block allocation while loading data. By default, carbondata assigns
+   * blocks to node based on block number. If this option is set to `true`, carbondata will
+   * consider block size first and make sure that all the nodes will process almost equal size of
+   * data. This option is especially useful when you encounter skewed data.
+   */
+  @CarbonProperty
+  public static final String ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION
+      = "carbon.load.skewedDataOptimization.enabled";
+  public static final String ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION_DEFAULT = "false";
 }
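
A hedged sketch of turning the option on (CarbonProperties.addProperty is an
existing setter in carbondata-core; using it right before a load is illustrative):

    // Scala: opt in to size-based block allocation before triggering a data load
    import org.apache.carbondata.core.constants.CarbonLoadOptionConstants
    import org.apache.carbondata.core.util.CarbonProperties

    CarbonProperties.getInstance().addProperty(
      CarbonLoadOptionConstants.ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION, "true")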

http://git-wip-us.apache.org/repos/asf/carbondata/blob/9a423c24/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java b/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
index a7bfdba..c0cebe0 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
@@ -19,6 +19,8 @@ package org.apache.carbondata.core.datastore.block;
 import java.io.IOException;
 import java.io.Serializable;
 import java.nio.charset.Charset;
+import java.util.Arrays;
+import java.util.Comparator;
 import java.util.HashMap;
 import java.util.Map;
 
@@ -98,6 +100,20 @@ public class TableBlockInfo implements Distributable, Serializable {
 
   private String dataMapWriterPath;
 
+  /**
+   * comparator to sort by block size in descending order.
+   * Since each line is not exactly the same, the size of an InputSplit may differ,
+   * so we allow some deviation for these splits.
+   */
+  public static final Comparator<Distributable> DATA_SIZE_DESC_COMPARATOR =
+      new Comparator<Distributable>() {
+        @Override public int compare(Distributable o1, Distributable o2) {
+          long diff =
+              ((TableBlockInfo) o1).getBlockLength() - ((TableBlockInfo) o2).getBlockLength();
+          return diff < 0 ? 1 : (diff == 0 ? 0 : -1);
+        }
+      };
+
   public TableBlockInfo(String filePath, long blockOffset, String segmentId,
       String[] locations, long blockLength, ColumnarFormatVersion version,
       String[] deletedDeltaFilePath) {
@@ -434,4 +450,17 @@ public class TableBlockInfo 

[21/54] [abbrv] carbondata git commit: [CARBONDATA-2023][DataLoad] Add size based block allocation in data loading

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/8d8b589e/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
index 7ea5cb3..e5583c2 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/UnsafeCarbonRowPage.java
@@ -19,20 +19,35 @@ package org.apache.carbondata.processing.loading.sort.unsafe;
 
 import java.io.DataOutputStream;
 import java.io.IOException;
-import java.nio.ByteBuffer;
+import java.math.BigDecimal;
+import java.util.Arrays;
 
+import org.apache.carbondata.core.memory.CarbonUnsafe;
 import org.apache.carbondata.core.memory.IntPointerBuffer;
 import org.apache.carbondata.core.memory.MemoryBlock;
 import org.apache.carbondata.core.memory.UnsafeMemoryManager;
 import org.apache.carbondata.core.memory.UnsafeSortMemoryManager;
-import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
-import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
-import org.apache.carbondata.processing.sort.sortdata.TableFieldStat;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.util.DataTypeUtil;
 
 /**
  * It keeps data of a prescribed size in offheap/onheap memory and returns it when needed
  */
 public class UnsafeCarbonRowPage {
+
+  private boolean[] noDictionaryDimensionMapping;
+
+  private boolean[] noDictionarySortColumnMapping;
+
+  private int dimensionSize;
+
+  private int measureSize;
+
+  private DataType[] measureDataType;
+
+  private long[] nullSetWords;
+
   private IntPointerBuffer buffer;
 
   private int lastSize;
@@ -47,14 +62,16 @@ public class UnsafeCarbonRowPage {
 
   private long taskId;
 
-  private TableFieldStat tableFieldStat;
-  private SortStepRowHandler sortStepRowHandler;
-
-  public UnsafeCarbonRowPage(TableFieldStat tableFieldStat, MemoryBlock memoryBlock,
-      boolean saveToDisk, long taskId) {
-    this.tableFieldStat = tableFieldStat;
-    this.sortStepRowHandler = new SortStepRowHandler(tableFieldStat);
+  public UnsafeCarbonRowPage(boolean[] noDictionaryDimensionMapping,
+      boolean[] noDictionarySortColumnMapping, int dimensionSize, int measureSize, DataType[] type,
+      MemoryBlock memoryBlock, boolean saveToDisk, long taskId) {
+    this.noDictionaryDimensionMapping = noDictionaryDimensionMapping;
+    this.noDictionarySortColumnMapping = noDictionarySortColumnMapping;
+    this.dimensionSize = dimensionSize;
+    this.measureSize = measureSize;
+    this.measureDataType = type;
     this.saveToDisk = saveToDisk;
+    this.nullSetWords = new long[((measureSize - 1) >> 6) + 1];
     this.taskId = taskId;
     buffer = new IntPointerBuffer(this.taskId);
     this.dataBlock = memoryBlock;
@@ -63,44 +80,255 @@ public class UnsafeCarbonRowPage {
     this.managerType = MemoryManagerType.UNSAFE_MEMORY_MANAGER;
   }
 
-  public int addRow(Object[] row, ByteBuffer rowBuffer) {
-    int size = addRow(row, dataBlock.getBaseOffset() + lastSize, rowBuffer);
+  public int addRow(Object[] row) {
+    int size = addRow(row, dataBlock.getBaseOffset() + lastSize);
     buffer.set(lastSize);
     lastSize = lastSize + size;
     return size;
   }
 
-  /**
-   * add raw row as intermediate sort temp row to page
-   *
-   * @param row
-   * @param address
-   * @return
-   */
-  private int addRow(Object[] row, long address, ByteBuffer rowBuffer) {
-    return sortStepRowHandler.writeRawRowAsIntermediateSortTempRowToUnsafeMemory(row,
-        dataBlock.getBaseObject(), address, rowBuffer);
+  private int addRow(Object[] row, long address) {
+    if (row == null) {
+      throw new RuntimeException("Row is null ??");
+    }
+    int dimCount = 0;
+    int size = 0;
+    Object baseObject = dataBlock.getBaseObject();
+    for (; dimCount < noDictionaryDimensionMapping.length; dimCount++) {
+      if (noDictionaryDimensionMapping[dimCount]) {
+        byte[] col = (byte[]) row[dimCount];
+        CarbonUnsafe.getUnsafe()
+            .putShort(baseObject, address + size, (short) col.length);
+        size += 2;
+        CarbonUnsafe.getUnsafe().copyMemory(col, CarbonUnsafe.BYTE_ARRAY_OFFSET, baseObject,
+            address + size, col.length);
+        size += col.length;
+      } else {
+        int value = (int) row[dimCount];
+        CarbonUnsafe.getUnsafe().putInt(baseObject, address + size, value);
+        size += 4;
+      }
+    }
+
+    // write complex dimensions here.
+    for (; dimCount < dimensionSize; dimCount++) {
+ 

[51/54] [abbrv] carbondata git commit: [HOTFIX] Add java doc for datamap interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/fc2a7eb3/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
deleted file mode 100644
index 34e11ac..0000000
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
+++ /dev/null
@@ -1,971 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.carbondata.core.indexstore.blockletindex;
-
-import java.io.ByteArrayInputStream;
-import java.io.ByteArrayOutputStream;
-import java.io.DataInputStream;
-import java.io.DataOutput;
-import java.io.DataOutputStream;
-import java.io.IOException;
-import java.io.UnsupportedEncodingException;
-import java.math.BigDecimal;
-import java.nio.ByteBuffer;
-import java.util.ArrayList;
-import java.util.BitSet;
-import java.util.Comparator;
-import java.util.List;
-
-import org.apache.carbondata.common.logging.LogService;
-import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.cache.Cacheable;
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.datamap.dev.DataMapModel;
-import org.apache.carbondata.core.datamap.dev.cgdatamap.AbstractCoarseGrainIndexDataMap;
-import org.apache.carbondata.core.datastore.IndexKey;
-import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.block.TableBlockInfo;
-import org.apache.carbondata.core.indexstore.BlockMetaInfo;
-import org.apache.carbondata.core.indexstore.Blocklet;
-import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
-import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
-import org.apache.carbondata.core.indexstore.PartitionSpec;
-import org.apache.carbondata.core.indexstore.UnsafeMemoryDMStore;
-import org.apache.carbondata.core.indexstore.row.DataMapRow;
-import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
-import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
-import org.apache.carbondata.core.memory.MemoryException;
-import org.apache.carbondata.core.metadata.blocklet.BlockletInfo;
-import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
-import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
-import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
-import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
-import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
-import org.apache.carbondata.core.scan.filter.FilterExpressionProcessor;
-import org.apache.carbondata.core.scan.filter.FilterUtil;
-import org.apache.carbondata.core.scan.filter.executer.FilterExecuter;
-import org.apache.carbondata.core.scan.filter.executer.ImplicitColumnFilterExecutor;
-import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
-import org.apache.carbondata.core.util.ByteUtil;
-import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.DataFileFooterConverter;
-import org.apache.carbondata.core.util.DataTypeUtil;
-import org.apache.carbondata.core.util.path.CarbonTablePath;
-
-import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.Path;
-import org.xerial.snappy.Snappy;
-
-/**
- * Datamap implementation for blocklet.
- */
-public class BlockletIndexDataMap extends AbstractCoarseGrainIndexDataMap implements Cacheable {
-
-  private static final LogService LOGGER =
-      LogServiceFactory.getLogService(BlockletIndexDataMap.class.getName());
-
-  private static int KEY_INDEX = 0;
-
-  private static int MIN_VALUES_INDEX = 1;
-
-  private static int MAX_VALUES_INDEX = 2;
-
-  private static int ROW_COUNT_INDEX = 3;
-
-  private static int FILE_PATH_INDEX = 4;
-
-  private 

[12/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/FixedLengthDimensionDataChunk.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/FixedLengthDimensionDataChunk.java b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/FixedLengthDimensionDataChunk.java
deleted file mode 100644
index 6629d31..0000000
--- a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/FixedLengthDimensionDataChunk.java
+++ /dev/null
@@ -1,163 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.carbondata.core.datastore.chunk.impl;
-
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.datastore.chunk.store.DimensionChunkStoreFactory;
-import org.apache.carbondata.core.datastore.chunk.store.DimensionChunkStoreFactory.DimensionStoreType;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
-import org.apache.carbondata.core.scan.executor.infos.KeyStructureInfo;
-import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
-import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
-
-/**
- * This class gives access to the fixed length dimension data chunk store
- */
-public class FixedLengthDimensionDataChunk extends AbstractDimensionDataChunk {
-
-  /**
-   * Constructor
-   *
-   * @param dataChunk            data chunk
-   * @param invertedIndex        inverted index
-   * @param invertedIndexReverse reverse inverted index
-   * @param numberOfRows         number of rows
-   * @param columnValueSize      size of each column value
-   */
-  public FixedLengthDimensionDataChunk(byte[] dataChunk, int[] invertedIndex,
-      int[] invertedIndexReverse, int numberOfRows, int columnValueSize) {
-    long totalSize = null != invertedIndex ?
-        dataChunk.length + (2 * numberOfRows * CarbonCommonConstants.INT_SIZE_IN_BYTE) :
-        dataChunk.length;
-    dataChunkStore = DimensionChunkStoreFactory.INSTANCE
-        .getDimensionChunkStore(columnValueSize, null != invertedIndex, numberOfRows, totalSize,
-            DimensionStoreType.FIXEDLENGTH);
-    dataChunkStore.putArray(invertedIndex, invertedIndexReverse, dataChunk);
-  }
-
-  /**
-   * Below method will be used to fill the data based on offset and row id
-   *
-   * @param data             data to be filled
-   * @param offset           offset from which data needs to be filled
-   * @param index            row id of the chunk
-   * @param keyStructureInfo define the structure of the key
-   * @return how many bytes were copied
-   */
-  @Override public int fillChunkData(byte[] data, int offset, int index,
-      KeyStructureInfo keyStructureInfo) {
-    dataChunkStore.fillRow(index, data, offset);
-    return dataChunkStore.getColumnValueSize();
-  }
-
-  /**
-   * Converts to column dictionary integer value
-   *
-   * @param rowId
-   * @param columnIndex
-   * @param row
-   * @param restructuringInfo
-   * @return
-   */
-  @Override public int fillConvertedChunkData(int rowId, int columnIndex, int[] row,
-      KeyStructureInfo restructuringInfo) {
-    row[columnIndex] = dataChunkStore.getSurrogate(rowId);
-    return columnIndex + 1;
-  }
-
-  /**
-   * Fill the data to vector
-   *
-   * @param vectorInfo
-   * @param column
-   * @param restructuringInfo
-   * @return next column index
-   */
-  @Override public int fillConvertedChunkData(ColumnVectorInfo[] vectorInfo, int column,
-      KeyStructureInfo restructuringInfo) {
-    ColumnVectorInfo columnVectorInfo = vectorInfo[column];
-    int offset = columnVectorInfo.offset;
-    int vectorOffset = columnVectorInfo.vectorOffset;
-    int len = columnVectorInfo.size + offset;
-    CarbonColumnVector vector = columnVectorInfo.vector;
-    for (int j = offset; j < len; j++) {
-      int dict = dataChunkStore.getSurrogate(j);
-      if (columnVectorInfo.directDictionaryGenerator == null) {
-        vector.putInt(vectorOffset++, dict);
-      } else {
-Object 

[50/54] [abbrv] carbondata git commit: [HOTFIX] Add java doc for datamap interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/fc2a7eb3/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGIndexDataMapTestCase.scala
--
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGIndexDataMapTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGIndexDataMapTestCase.scala
deleted file mode 100644
index 72a88ea..0000000
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/CGIndexDataMapTestCase.scala
+++ /dev/null
@@ -1,383 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.carbondata.spark.testsuite.datamap
-
-import java.io.{ByteArrayInputStream, DataOutputStream, ObjectInputStream, ObjectOutputStream}
-
-import scala.collection.JavaConverters._
-import scala.collection.mutable.ArrayBuffer
-
-import com.sun.xml.internal.messaging.saaj.util.ByteOutputStream
-import org.apache.spark.sql.test.util.QueryTest
-import org.scalatest.BeforeAndAfterAll
-
-import org.apache.carbondata.core.datamap.dev.{AbstractDataMapWriter, DataMapModel}
-import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta, DataMapStoreManager, Segment}
-import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta, DataMapStoreManager}
-import org.apache.carbondata.core.datamap.dev.{AbstractDataMapWriter, DataMapModel}
-import org.apache.carbondata.core.datamap.dev.cgdatamap.{AbstractCoarseGrainIndexDataMap, AbstractCoarseGrainIndexDataMapFactory}
-import org.apache.carbondata.core.datastore.FileReader
-import org.apache.carbondata.core.datastore.block.SegmentProperties
-import org.apache.carbondata.core.datastore.compression.SnappyCompressor
-import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
-import org.apache.carbondata.core.datastore.impl.FileFactory
-import org.apache.carbondata.core.datastore.page.ColumnPage
-import org.apache.carbondata.core.indexstore.{Blocklet, PartitionSpec}
-import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapDistributable
-import org.apache.carbondata.core.metadata.schema.table.DataMapSchema
-import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata}
-import org.apache.carbondata.core.scan.expression.Expression
-import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression
-import org.apache.carbondata.core.scan.filter.intf.ExpressionType
-import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf
-import org.apache.carbondata.core.util.ByteUtil
-import org.apache.carbondata.core.util.path.CarbonTablePath
-import org.apache.carbondata.events.Event
-import org.apache.carbondata.spark.testsuite.datacompaction.CompactionSupportGlobalSortBigFileTest
-
-class CGIndexDataMapFactory extends AbstractCoarseGrainIndexDataMapFactory {
-  var identifier: AbsoluteTableIdentifier = _
-  var dataMapSchema: DataMapSchema = _
-
-  /**
-   * Initialization of Datamap factory with the identifier and datamap name
-   */
-  override def init(identifier: AbsoluteTableIdentifier, dataMapSchema: DataMapSchema): Unit = {
-    this.identifier = identifier
-    this.dataMapSchema = dataMapSchema
-  }
-
-  /**
-   * Return a new writer for this datamap
-   */
-  override def createWriter(segment: Segment, dataWritePath: String): AbstractDataMapWriter = {
-    new CGDataMapWriter(identifier, segment, dataWritePath, dataMapSchema)
-  }
-
-  /**
-   * Get the datamap for segmentid
-   */
-  override def getDataMaps(segment: Segment): java.util.List[AbstractCoarseGrainDataMap] = {
-    val file = FileFactory.getCarbonFile(
-      CarbonTablePath.getSegmentPath(identifier.getTablePath, segment.getSegmentNo))
-
-    val files = file.listFiles(new CarbonFileFilter {
-      override def accept(file: CarbonFile): Boolean = file.getName.endsWith(".datamap")
-    })
-    files.map { f =>
-      val dataMap: AbstractCoarseGrainIndexDataMap = new CGIndexDataMap()
-      dataMap.init(new DataMapModel(f.getCanonicalPath))
-  

[23/54] [abbrv] carbondata git commit: [CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp row

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/21704cf7/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeInmemoryHolder.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeInmemoryHolder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeInmemoryHolder.java
index 6f05088..cbcbbae 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeInmemoryHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeInmemoryHolder.java
@@ -19,8 +19,9 @@ package org.apache.carbondata.processing.loading.sort.unsafe.holder;
 
 import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
 import org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage;
-import org.apache.carbondata.processing.sort.sortdata.NewRowComparator;
+import org.apache.carbondata.processing.sort.sortdata.IntermediateSortTempRowComparator;
 
 public class UnsafeInmemoryHolder implements SortTempChunkHolder {
 
@@ -33,21 +34,18 @@ public class UnsafeInmemoryHolder implements SortTempChunkHolder {
 
   private UnsafeCarbonRowPage rowPage;
 
-  private Object[] currentRow;
+  private IntermediateSortTempRow currentRow;
 
   private long address;
 
-  private NewRowComparator comparator;
+  private IntermediateSortTempRowComparator comparator;
 
-  private int columnSize;
-
-  public UnsafeInmemoryHolder(UnsafeCarbonRowPage rowPage, int columnSize,
-      int numberOfSortColumns) {
+  public UnsafeInmemoryHolder(UnsafeCarbonRowPage rowPage) {
     this.actualSize = rowPage.getBuffer().getActualSize();
     this.rowPage = rowPage;
-    LOGGER.audit("Processing unsafe inmemory rows page with size : " + actualSize);
-    this.comparator = new NewRowComparator(rowPage.getNoDictionarySortColumnMapping());
-    this.columnSize = columnSize;
+    this.comparator = new IntermediateSortTempRowComparator(
+        rowPage.getTableFieldStat().getIsSortColNoDictFlags());
   }
 
   public boolean hasNext() {
@@ -58,13 +56,12 @@ public class UnsafeInmemoryHolder implements SortTempChunkHolder {
   }
 
   public void readRow() {
-    currentRow = new Object[columnSize];
     address = rowPage.getBuffer().get(counter);
-    rowPage.getRow(address + rowPage.getDataBlock().getBaseOffset(), currentRow);
+    currentRow = rowPage.getRow(address + rowPage.getDataBlock().getBaseOffset());
     counter++;
   }
 
-  public Object[] getRow() {
+  public IntermediateSortTempRow getRow() {
     return currentRow;
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/21704cf7/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
index 11b3d43..527452a 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/loading/sort/unsafe/holder/UnsafeSortTempFileChunkHolder.java
@@ -31,15 +31,14 @@ import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
-import org.apache.carbondata.core.metadata.datatype.DataType;
-import org.apache.carbondata.core.metadata.datatype.DataTypes;
 import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.DataTypeUtil;
-import org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage;
+import org.apache.carbondata.processing.loading.row.IntermediateSortTempRow;
+import org.apache.carbondata.processing.loading.sort.SortStepRowHandler;
 import org.apache.carbondata.processing.sort.exception.CarbonSortKeyAndGroupByException;
-import org.apache.carbondata.processing.sort.sortdata.NewRowComparator;
+import org.apache.carbondata.processing.sort.sortdata.IntermediateSortTempRowComparator;
 import org.apache.carbondata.processing.sort.sortdata.SortParameters;
+import org.apache.carbondata.processing.sort.sortdata.TableFieldStat;
 
 public class UnsafeSortTempFileChunkHolder implements SortTempChunkHolder {
 
@@ -63,21 +62,15 @@ public class 

[42/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
[CARBONDATA-2189] Add DataMapProvider developer interface

Add a developer interface for 2 types of DataMap:

1. IndexDataMap: a DataMap that leverages an index to accelerate filter queries
2. MVDataMap: a DataMap that leverages a Materialized View to accelerate OLAP-style
queries, such as SPJG queries (select, predicate, join, group by)

This PR adds support for the following logic when creating and dropping a DataMap
(see the DDL sketch below).

This closes #1987
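
For context, a hedged sketch of the user-facing DDL these provider classes back
(preaggregate syntax as documented in CarbonData; the table and datamap names are
made up, and `spark` is assumed to be a CarbonData-enabled SparkSession):

    // Scala: create and drop a preaggregate datamap through the provider interface
    spark.sql(
      "CREATE DATAMAP agg_sales ON TABLE sales USING 'preaggregate' " +
        "AS SELECT country, sum(amount) FROM sales GROUP BY country")
    spark.sql("DROP DATAMAP IF EXISTS agg_sales ON TABLE sales")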


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/89a12af5
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/89a12af5
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/89a12af5

Branch: refs/heads/master
Commit: 89a12af5aba17f12c4e695971982abfeff256fc1
Parents: 56330ae
Author: Jacky Li 
Authored: Thu Feb 22 20:59:59 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../sql/MalformedDataMapCommandException.java   |   4 +
 .../carbondata/core/datamap/DataMapChooser.java |   5 +-
 .../core/datamap/DataMapRegistry.java   |  37 +
 .../core/datamap/DataMapStoreManager.java   |  82 +-
 .../carbondata/core/datamap/TableDataMap.java   |  42 +-
 .../core/datamap/dev/AbstractDataMapWriter.java |   2 +-
 .../carbondata/core/datamap/dev/DataMap.java|  56 --
 .../core/datamap/dev/DataMapFactory.java|  86 --
 .../core/datamap/dev/IndexDataMap.java  |  56 ++
 .../core/datamap/dev/IndexDataMapFactory.java   |  86 ++
 .../cgdatamap/AbstractCoarseGrainDataMap.java   |  24 -
 .../AbstractCoarseGrainDataMapFactory.java  |  34 -
 .../AbstractCoarseGrainIndexDataMap.java|  24 +
 .../AbstractCoarseGrainIndexDataMapFactory.java |  34 +
 .../dev/expr/DataMapExprWrapperImpl.java|   2 +-
 .../dev/fgdatamap/AbstractFineGrainDataMap.java |  24 -
 .../AbstractFineGrainDataMapFactory.java|  38 -
 .../AbstractFineGrainIndexDataMap.java  |  24 +
 .../AbstractFineGrainIndexDataMapFactory.java   |  38 +
 .../indexstore/BlockletDataMapIndexStore.java   |  33 +-
 .../blockletindex/BlockletDataMap.java  | 971 ---
 .../blockletindex/BlockletDataMapFactory.java   | 285 --
 .../blockletindex/BlockletDataMapModel.java |   2 +-
 .../blockletindex/BlockletIndexDataMap.java | 971 +++
 .../BlockletIndexDataMapFactory.java| 285 ++
 .../core/metadata/schema/table/CarbonTable.java |  10 +-
 .../metadata/schema/table/DataMapSchema.java|   1 +
 .../schema/table/DataMapSchemaFactory.java  |  13 +-
 .../core/metadata/schema/table/TableSchema.java |   3 +-
 .../blockletindex/TestBlockletDataMap.java  |  66 --
 .../blockletindex/TestBlockletIndexDataMap.java |  59 ++
 .../datamap/examples/MinMaxDataMap.java | 152 ---
 .../datamap/examples/MinMaxDataMapFactory.java  | 117 ---
 .../datamap/examples/MinMaxIndexDataMap.java| 150 +++
 .../examples/MinMaxIndexDataMapFactory.java | 117 +++
 .../MinMaxDataMapExample.scala  |   4 +-
 docs/datamap-developer-guide.md |  16 +
 .../hadoop/api/CarbonTableInputFormat.java  |   5 +-
 .../preaggregate/TestPreAggCreateCommand.scala  |   2 +-
 .../TestPreAggregateTableSelection.scala|   4 +-
 .../timeseries/TestTimeSeriesCreateTable.scala  |   4 +-
 .../testsuite/datamap/CGDataMapTestCase.scala   | 381 
 .../datamap/CGIndexDataMapTestCase.scala| 383 
 .../testsuite/datamap/DataMapWriterSuite.scala  | 216 -
 .../testsuite/datamap/FGDataMapTestCase.scala   | 473 -
 .../datamap/FGIndexDataMapTestCase.scala| 472 +
 .../datamap/IndexDataMapWriterSuite.scala   | 217 +
 .../testsuite/datamap/TestDataMapCommand.scala  | 288 --
 .../datamap/TestIndexDataMapCommand.scala   | 285 ++
 .../carbondata/datamap/DataMapManager.java  |  53 +
 .../carbondata/datamap/DataMapProperty.java |  32 +
 .../carbondata/datamap/DataMapProvider.java | 105 ++
 .../datamap/IndexDataMapProvider.java   | 116 +++
 .../datamap/PreAggregateDataMapProvider.java|  92 ++
 .../datamap/TimeseriesDataMapProvider.java  |  50 +
 .../scala/org/apache/spark/sql/CarbonEnv.scala  |   2 +-
 .../datamap/CarbonCreateDataMapCommand.scala|  92 +-
 .../datamap/CarbonDropDataMapCommand.scala  |  73 +-
 .../CarbonAlterTableDropPartitionCommand.scala  |   2 +-
 .../CarbonAlterTableSplitPartitionCommand.scala |   2 +-
 .../CreatePreAggregateTableCommand.scala| 203 
 .../preaaggregate/PreAggregateTableHelper.scala | 195 
 .../preaaggregate/PreAggregateUtil.scala|  24 +-
 .../CarbonAlterTableAddColumnCommand.scala  |   2 +-
 .../CarbonAlterTableDataTypeChangeCommand.scala |   2 +-
 .../CarbonAlterTableDropColumnCommand.scala |   2 +-
 .../schema/CarbonAlterTableRenameCommand.scala  |   2 +-
 

[13/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
[CARBONDATA-2099] Refactor query scan process to improve readability

Unified concepts in the scan process flow:

1. QueryModel contains all parameters for a scan; it is created by an API in
CarbonTable. (In future, CarbonTable will be the entry point for various table
operations.)
2. Use the term ColumnChunk to represent one column in one blocklet, and use
ChunkIndex in the reader to read a specified column chunk
3. Use the term ColumnPage to represent one page in one ColumnChunk
4. QueryColumn => ProjectionColumn, indicating it is for projection
(see the containment sketch below)

This closes #1874
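
A conceptual sketch of the containment these terms imply (plain illustrative Scala
types, not the actual CarbonData classes):

    // one block file holds blocklets; a blocklet holds one ColumnChunk per column;
    // a ColumnChunk holds one or more ColumnPages, addressed via a ChunkIndex
    case class ColumnPage(values: Array[Any])
    case class ColumnChunk(pages: Seq[ColumnPage])   // one column within one blocklet
    case class Blocklet(chunks: Seq[ColumnChunk])
    case class Block(blocklets: Seq[Blocklet])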


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/daa64650
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/daa64650
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/daa64650

Branch: refs/heads/master
Commit: daa646503ce9ddc5d581b23cbec5494ad8e0be08
Parents: 5ab0957
Author: Jacky Li 
Authored: Tue Jan 30 21:24:04 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:09 2018 +0530

--
 .../dictionary/AbstractDictionaryCache.java |   3 +-
 .../cache/dictionary/DictionaryCacheLoader.java |   7 +-
 .../dictionary/DictionaryCacheLoaderImpl.java   |  11 +-
 .../core/datastore/BTreeBuilderInfo.java|   6 -
 .../carbondata/core/datastore/DataRefNode.java  |  81 +--
 .../carbondata/core/datastore/FileHolder.java   | 118 
 .../carbondata/core/datastore/FileReader.java   | 114 +++
 .../core/datastore/block/SegmentProperties.java |  50 +-
 .../chunk/DimensionColumnDataChunk.java | 116 ---
 .../datastore/chunk/DimensionColumnPage.java| 111 +++
 .../chunk/impl/AbstractDimensionColumnPage.java |  89 +++
 .../chunk/impl/AbstractDimensionDataChunk.java  |  95 ---
 .../impl/ColumnGroupDimensionColumnPage.java| 194 ++
 .../impl/ColumnGroupDimensionDataChunk.java | 194 --
 .../chunk/impl/DimensionRawColumnChunk.java |  46 +-
 .../impl/FixedLengthDimensionColumnPage.java| 163 +
 .../impl/FixedLengthDimensionDataChunk.java | 163 -
 .../chunk/impl/MeasureRawColumnChunk.java   |  26 +-
 .../impl/VariableLengthDimensionColumnPage.java | 133 
 .../impl/VariableLengthDimensionDataChunk.java  | 140 
 .../reader/DimensionColumnChunkReader.java  |  14 +-
 .../chunk/reader/MeasureColumnChunkReader.java  |  12 +-
 .../AbstractChunkReaderV2V3Format.java  |  34 +-
 ...mpressedDimensionChunkFileBasedReaderV1.java |  38 +-
 ...mpressedDimensionChunkFileBasedReaderV2.java |  30 +-
 ...essedDimChunkFileBasedPageLevelReaderV3.java |  11 +-
 ...mpressedDimensionChunkFileBasedReaderV3.java |  49 +-
 .../AbstractMeasureChunkReaderV2V3Format.java   |  42 +-
 ...CompressedMeasureChunkFileBasedReaderV1.java |  16 +-
 ...CompressedMeasureChunkFileBasedReaderV2.java |  24 +-
 ...CompressedMeasureChunkFileBasedReaderV3.java |  45 +-
 ...essedMsrChunkFileBasedPageLevelReaderV3.java |   8 +-
 .../chunk/store/ColumnPageWrapper.java  |  30 +-
 .../chunk/store/DimensionDataChunkStore.java|   8 +-
 .../SafeFixedLengthDimensionDataChunkStore.java |   6 +-
 ...feVariableLengthDimensionDataChunkStore.java |   8 +-
 ...nsafeFixedLengthDimensionDataChunkStore.java |  10 +-
 ...afeVariableLengthDimesionDataChunkStore.java |  10 +-
 .../datastore/columnar/ColumnGroupModel.java|  26 -
 .../core/datastore/impl/DFSFileHolderImpl.java  | 166 -
 .../core/datastore/impl/DFSFileReaderImpl.java  | 155 
 .../datastore/impl/DefaultFileTypeProvider.java |  16 +-
 .../core/datastore/impl/FileFactory.java|   4 +-
 .../core/datastore/impl/FileHolderImpl.java | 224 --
 .../core/datastore/impl/FileReaderImpl.java | 215 ++
 .../core/datastore/impl/FileTypeInerface.java   |   4 +-
 .../impl/btree/AbstractBTreeLeafNode.java   |  60 +-
 .../impl/btree/BTreeDataRefNodeFinder.java  |   6 +-
 .../datastore/impl/btree/BTreeNonLeafNode.java  |  52 +-
 .../impl/btree/BlockBTreeLeafNode.java  |   6 +-
 .../impl/btree/BlockletBTreeLeafNode.java   |  46 +-
 .../page/encoding/EncodingFactory.java  |   8 +-
 .../server/NonSecureDictionaryServer.java   |   1 -
 .../core/indexstore/BlockletDetailInfo.java |   4 -
 .../blockletindex/BlockletDataRefNode.java  | 228 ++
 .../BlockletDataRefNodeWrapper.java | 241 ---
 .../indexstore/blockletindex/IndexWrapper.java  |   2 +-
 .../blockletindex/SegmentIndexFileStore.java|   7 +-
 .../core/memory/HeapMemoryAllocator.java|   2 +-
 .../core/metadata/blocklet/SegmentInfo.java |  19 -
 .../core/metadata/schema/table/CarbonTable.java | 130 +++-
 .../schema/table/RelationIdentifier.java|  16 -
 .../core/metadata/schema/table/TableInfo.java   |   6 +-
 .../schema/table/column/CarbonColumn.java   |   2 +-
 .../schema/table/column/CarbonDimension.java|  12 -
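
A minimal usage sketch of the refactored API, for orientation (carbonTable,
dataTypeConverter and blockList are assumed to be in scope; the method names
are taken from the CarbonCompactionExecutor diff later in this digest):

  // Illustrative only: QueryModel is now created through CarbonTable, and the
  // projection is carried by ProjectionDimension/ProjectionMeasure internally.
  val queryModel = carbonTable.createQueryModelWithProjectAllColumns(dataTypeConverter)
  queryModel.setForcedDetailRawQuery(true)
  queryModel.setTableBlockInfos(blockList)
  val queryExecutor = QueryExecutorFactory.getQueryExecutor(queryModel)
  val resultIterator = queryExecutor.execute(queryModel)  // iterates RowBatch results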
 

[36/54] [abbrv] carbondata git commit: [HOTFIX] Fix timestamp issue in TestSortColumnsWithUnsafe

2018-03-08 Thread ravipesala
[HOTFIX] Fix timestamp issue in TestSortColumnsWithUnsafe

The test relied on the globally configured CARBON_TIMESTAMP_FORMAT; the fix 
passes an explicit TIMESTAMPFORMAT option in the LOAD DATA statements so the 
load no longer depends on ambient configuration. (A standalone sketch of the 
option follows the file summary below.)

This closes #2001


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/a80db8e8
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/a80db8e8
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/a80db8e8

Branch: refs/heads/master
Commit: a80db8e851a6e0b7061a013b5a9149e4f7041984
Parents: fc2a7eb
Author: Jacky Li 
Authored: Tue Feb 27 13:06:02 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../testsuite/sortcolumns/TestSortColumns.scala | 14 +--
 .../sortcolumns/TestSortColumnsWithUnsafe.scala | 25 +++-
 2 files changed, 15 insertions(+), 24 deletions(-)
--
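
The essence of the fix, as a standalone sketch (the table name and CSV path
are placeholders): pass the timestamp format explicitly in the LOAD DATA
options instead of relying on the global CARBON_TIMESTAMP_FORMAT property:

  // Sketch only: t1 and the CSV path are illustrative.
  sql("CREATE TABLE t1 (empno INT, doj TIMESTAMP) STORED BY 'org.apache.carbondata.format'")
  sql(s"""LOAD DATA LOCAL INPATH '$resourcesPath/data.csv' INTO TABLE t1
      OPTIONS('DELIMITER'=',', 'QUOTECHAR'='\"', 'TIMESTAMPFORMAT'='dd-MM-yyyy')""")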


http://git-wip-us.apache.org/repos/asf/carbondata/blob/a80db8e8/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
index 13db652..adf8423 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
@@ -37,7 +37,7 @@ class TestSortColumns extends QueryTest with 
BeforeAndAfterAll {
 CarbonProperties.getInstance()
   .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "dd-MM-yyyy")
 sql("CREATE TABLE origintable1 (empno int, empname String, designation 
String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, 
deptno int, deptname String, projectcode int, projectjoindate Timestamp, 
projectenddate Timestamp,attendance int,utilization int,salary int) STORED BY 
'org.apache.carbondata.format'")
-sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
origintable1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""")
+sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
origintable1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"', 
'TIMESTAMPFORMAT'='dd-MM-yyyy')""")
 
 sql("CREATE TABLE tableOne(id int, name string, city string, age int) 
STORED BY 'org.apache.carbondata.format'")
 sql("CREATE TABLE tableTwo(id int, age int) STORED BY 
'org.apache.carbondata.format'")
@@ -244,7 +244,7 @@ class TestSortColumns extends QueryTest with 
BeforeAndAfterAll {
   sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
origintable1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""")
   setLoadingProperties("false", "false", "false")
   sql("CREATE TABLE unsortedtable_heap_safe (empno int, empname String, 
designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname 
String, deptno int, deptname String, projectcode int, projectjoindate 
Timestamp, projectenddate Timestamp,attendance int,utilization int,salary int) 
STORED BY 'org.apache.carbondata.format' tblproperties('sort_columns'='')")
-  sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
unsortedtable_heap_safe OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""")
+  sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
unsortedtable_heap_safe OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"', 
'TIMESTAMPFORMAT'='dd-MM-yyyy')""")
   checkAnswer(sql("select * from unsortedtable_heap_safe where empno = 
11"), sql("select * from origintable1 where empno = 11"))
   checkAnswer(sql("select * from unsortedtable_heap_safe order by empno"), 
sql("select * from origintable1 order by empno"))
 } finally {
@@ -259,7 +259,7 @@ class TestSortColumns extends QueryTest with 
BeforeAndAfterAll {
   sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
origintable1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""")
   setLoadingProperties("false", "true", "false")
   sql("CREATE TABLE unsortedtable_heap_unsafe (empno int, empname String, 
designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname 
String, deptno int, deptname String, projectcode int, projectjoindate 
Timestamp, projectenddate Timestamp,attendance int,utilization int,salary int) 
STORED BY 'org.apache.carbondata.format' tblproperties('sort_columns'='')")
-  sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
unsortedtable_heap_unsafe OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""")
+  sql(s"""LOAD DATA local inpath 

[22/54] [abbrv] carbondata git commit: [CARBONDATA-2023][DataLoad] Add size base block allocation in data loading

2018-03-08 Thread ravipesala
[CARBONDATA-2023][DataLoad] Add size base block allocation in data loading

CarbonData assigns blocks to nodes at the beginning of data loading. The
previous allocation strategy was based on the number of blocks, so it suffers
from data skew when the sizes of the input files differ a lot.

We introduce a size-based block allocation strategy to optimize data loading
performance in skewed-data scenarios. (A simplified sketch of the idea
follows the change summary below.)

This closes #1808


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/8d8b589e
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/8d8b589e
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/8d8b589e

Branch: refs/heads/master
Commit: 8d8b589e78a9db1ddc101d20c1e3feb500acce19
Parents: 21704cf
Author: xuchuanyin 
Authored: Thu Feb 8 14:42:39 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../core/datamap/dev/AbstractDataMapWriter.java |   5 +-
 .../core/datamap/dev/DataMapFactory.java|   2 +-
 .../blockletindex/BlockletDataMapFactory.java   |   2 +-
 .../SegmentUpdateStatusManager.java |   9 +-
 .../carbondata/core/util/NonDictionaryUtil.java |  67 ++-
 .../datamap/examples/MinMaxDataMapFactory.java  |   5 +-
 .../datamap/examples/MinMaxDataWriter.java  |   7 +-
 .../presto/util/CarbonDataStoreCreator.scala|   1 +
 .../CarbonIndexFileMergeTestCase.scala  |   4 -
 .../testsuite/datamap/CGDataMapTestCase.scala   |  26 +-
 .../testsuite/datamap/DataMapWriterSuite.scala  |  19 +-
 .../testsuite/datamap/FGDataMapTestCase.scala   |  31 +-
 .../iud/DeleteCarbonTableTestCase.scala |   2 +-
 .../TestInsertAndOtherCommandConcurrent.scala   |  14 +-
 .../StandardPartitionTableCleanTestCase.scala   |  12 +-
 .../StandardPartitionTableLoadingTestCase.scala |   2 +-
 .../load/DataLoadProcessorStepOnSpark.scala |   6 +-
 .../carbondata/spark/util/DataLoadingUtil.scala |   2 +-
 .../datamap/DataMapWriterListener.java  |   2 +-
 .../loading/row/IntermediateSortTempRow.java| 117 -
 .../loading/sort/SortStepRowHandler.java| 466 ---
 .../loading/sort/SortStepRowUtil.java   | 103 
 .../sort/unsafe/UnsafeCarbonRowPage.java| 331 +++--
 .../loading/sort/unsafe/UnsafeSortDataRows.java |  57 ++-
 .../unsafe/comparator/UnsafeRowComparator.java  |  95 ++--
 .../UnsafeRowComparatorForNormalDIms.java   |  59 +++
 .../UnsafeRowComparatorForNormalDims.java   |  59 ---
 .../sort/unsafe/holder/SortTempChunkHolder.java |   3 +-
 .../holder/UnsafeFinalMergePageHolder.java  |  19 +-
 .../unsafe/holder/UnsafeInmemoryHolder.java |  21 +-
 .../holder/UnsafeSortTempFileChunkHolder.java   | 138 --
 .../merger/UnsafeIntermediateFileMerger.java| 118 -
 .../UnsafeSingleThreadFinalSortFilesMerger.java |  27 +-
 .../processing/merger/CarbonDataMergerUtil.java |   8 +-
 .../merger/CompactionResultSortProcessor.java   |   5 +-
 .../merger/RowResultMergerProcessor.java|   5 +-
 .../partition/spliter/RowResultProcessor.java   |   5 +-
 .../sort/sortdata/IntermediateFileMerger.java   |  95 +++-
 .../IntermediateSortTempRowComparator.java  |  73 ---
 .../sort/sortdata/NewRowComparator.java |   5 +-
 .../sortdata/NewRowComparatorForNormalDims.java |   3 +-
 .../processing/sort/sortdata/RowComparator.java |  94 
 .../sortdata/RowComparatorForNormalDims.java|  62 +++
 .../SingleThreadFinalSortFilesMerger.java   |  25 +-
 .../processing/sort/sortdata/SortDataRows.java  |  85 +++-
 .../sort/sortdata/SortTempFileChunkHolder.java  | 174 +--
 .../sort/sortdata/TableFieldStat.java   | 176 ---
 .../util/CarbonDataProcessorUtil.java   |   4 +-
 .../processing/util/CarbonLoaderUtil.java   |   9 -
 49 files changed, 1368 insertions(+), 1291 deletions(-)
--
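
The idea behind the strategy, as a simplified sketch (the real logic lives in
CarbonLoaderUtil and also weighs data locality; the Block class and the node
list here are illustrative, and the feature is gated by the
carbon.load.skewedDataOptimization.enabled property added in this change):

  // Greedy size-based assignment: give the next-largest block to the node with
  // the fewest accumulated bytes, so bytes per node stay even despite skew.
  case class Block(path: String, sizeInBytes: Long)

  def assignBySize(blocks: Seq[Block], nodes: Seq[String]): Map[String, Seq[Block]] = {
    val assigned = scala.collection.mutable.Map(nodes.map(_ -> Vector.empty[Block]): _*)
    for (block <- blocks.sortBy(-_.sizeInBytes)) {
      val target = nodes.minBy(n => assigned(n).map(_.sizeInBytes).sum)
      assigned(target) = assigned(target) :+ block
    }
    assigned.toMap
  }

With a count-based strategy, two nodes can receive the same number of blocks
yet very different byte totals; sorting by size and filling the emptiest node
first keeps the totals close.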


http://git-wip-us.apache.org/repos/asf/carbondata/blob/8d8b589e/core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java
index bcc9bad..de6dcb1 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java
@@ -18,6 +18,7 @@ package org.apache.carbondata.core.datamap.dev;
 
 import java.io.IOException;
 
+import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.datastore.page.ColumnPage;
 import 

[35/54] [abbrv] carbondata git commit: [CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data loading

2018-03-08 Thread ravipesala
[CARBONDATA-2091][DataLoad] Support specifying sort column bounds in data 
loading

Enhance data loading performance by letting the user specify sort column bounds:

1. Add a row range number during the convert process step.
2. Dispatch rows to each sorter by range number.
3. The sort/write process steps can then run concurrently within each range.
4. Since all sort-temp files are written into one folder, the range number is
appended to the file name to distinguish them.

Tests added and docs updated. (A hedged usage sketch follows the change
summary below.)

This closes #1953


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/d5396b15
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/d5396b15
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/d5396b15

Branch: refs/heads/master
Commit: d5396b154433dc3e24e7b95666375f7f2981fea1
Parents: e72bfd1
Author: xuchuanyin 
Authored: Tue Feb 13 10:58:06 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../constants/CarbonLoadOptionConstants.java|  10 +
 .../core/datastore/row/CarbonRow.java   |  10 +-
 .../ThriftWrapperSchemaConverterImpl.java   |   2 +-
 .../core/metadata/schema/BucketingInfo.java |  24 +-
 .../core/metadata/schema/ColumnRangeInfo.java   |  29 ++
 .../metadata/schema/SortColumnRangeInfo.java|  83 +
 docs/data-management-on-carbondata.md   |  11 +
 .../TestLoadDataWithSortColumnBounds.scala  | 348 +++
 .../carbondata/spark/rdd/CarbonScanRDD.scala|   2 +-
 .../carbondata/spark/rdd/PartitionDropper.scala |   2 +-
 .../spark/rdd/PartitionSplitter.scala   |   2 +-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala |   3 +-
 .../strategy/CarbonLateDecodeStrategy.scala |   2 +-
 .../loading/CarbonDataLoadConfiguration.java|  11 +
 .../loading/DataLoadProcessBuilder.java |  77 +++-
 .../loading/converter/RowConverter.java |   2 +-
 .../converter/impl/RowConverterImpl.java|   5 +
 .../loading/model/CarbonLoadModel.java  |  14 +
 .../loading/model/CarbonLoadModelBuilder.java   |   1 +
 .../processing/loading/model/LoadOption.java|   1 +
 .../partition/impl/HashPartitionerImpl.java |  10 +-
 .../partition/impl/RangePartitionerImpl.java|  71 
 .../partition/impl/RawRowComparator.java|  63 
 .../processing/loading/sort/SorterFactory.java  |  16 +-
 ...arallelReadMergeSorterWithBucketingImpl.java | 272 ---
 ...allelReadMergeSorterWithColumnRangeImpl.java | 289 +++
 ...arallelReadMergeSorterWithBucketingImpl.java | 263 --
 ...allelReadMergeSorterWithColumnRangeImpl.java | 293 
 .../loading/sort/unsafe/UnsafeSortDataRows.java |   6 +-
 .../unsafe/merger/UnsafeIntermediateMerger.java |   6 +-
 .../UnsafeSingleThreadFinalSortFilesMerger.java |  11 +-
 .../steps/DataConverterProcessorStepImpl.java   | 102 +-
 ...ConverterProcessorWithBucketingStepImpl.java | 161 -
 .../steps/DataWriterProcessorStepImpl.java  |  70 +++-
 .../SingleThreadFinalSortFilesMerger.java   |   3 +-
 .../processing/sort/sortdata/SortDataRows.java  |  11 +-
 .../sortdata/SortIntermediateFileMerger.java|   6 +-
 .../sort/sortdata/SortParameters.java   |  10 +
 .../store/CarbonFactDataHandlerColumnar.java|   6 +-
 39 files changed, 1558 insertions(+), 750 deletions(-)
--
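
A hedged usage sketch, built from the ',' field and ';' row delimiters defined
in CarbonLoadOptionConstants below (the SORT_COLUMN_BOUNDS option name and the
sample bounds are assumptions for illustration, not taken verbatim from this
diff; $csvPath is a placeholder):

  // Sketch only: three bounds split the (name, id) sort-key space into four
  // ranges, so four sorters can sort and write concurrently during the load.
  sql("""CREATE TABLE source (id INT, name STRING, city STRING)
      STORED BY 'org.apache.carbondata.format'
      TBLPROPERTIES('SORT_COLUMNS'='name,id')""")
  sql(s"""LOAD DATA LOCAL INPATH '$csvPath' INTO TABLE source
      OPTIONS('SORT_COLUMN_BOUNDS'='f,300;k,600;p,900')""")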


http://git-wip-us.apache.org/repos/asf/carbondata/blob/d5396b15/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
index a6bf60f..8ff8dc4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
@@ -124,4 +124,14 @@ public final class CarbonLoadOptionConstants {
   public static final String ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION
   = "carbon.load.skewedDataOptimization.enabled";
   public static final String 
ENABLE_CARBON_LOAD_SKEWED_DATA_OPTIMIZATION_DEFAULT = "false";
+
+  /**
+   * field delimiter for each field in one bound
+   */
+  public static final String SORT_COLUMN_BOUNDS_FIELD_DELIMITER = ",";
+
+  /**
+   * row delimiter for each sort column bounds
+   */
+  public static final String SORT_COLUMN_BOUNDS_ROW_DELIMITER = ";";
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/d5396b15/core/src/main/java/org/apache/carbondata/core/datastore/row/CarbonRow.java
--
diff --git 

[01/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
Repository: carbondata
Updated Branches:
  refs/heads/master 3fb406618 -> 39fa1eb58


http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
index f51ced3..6a401d8 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
@@ -34,20 +34,16 @@ import 
org.apache.carbondata.core.datastore.block.TableBlockInfo;
 import org.apache.carbondata.core.datastore.block.TaskBlockInfo;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
-import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
-import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
 import org.apache.carbondata.core.scan.executor.QueryExecutor;
 import org.apache.carbondata.core.scan.executor.QueryExecutorFactory;
 import 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException;
-import org.apache.carbondata.core.scan.model.QueryDimension;
-import org.apache.carbondata.core.scan.model.QueryMeasure;
 import org.apache.carbondata.core.scan.model.QueryModel;
-import org.apache.carbondata.core.scan.result.BatchResult;
+import org.apache.carbondata.core.scan.result.RowBatch;
 import org.apache.carbondata.core.scan.result.iterator.RawResultIterator;
 import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.CarbonUtil;
-import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.core.util.DataTypeConverter;
 
 /**
  * Executor class for executing the query on the selected segments to be 
merged.
@@ -70,6 +66,9 @@ public class CarbonCompactionExecutor {
*/
   private boolean restructuredBlockExists;
 
+  // converter for UTF8String and decimal conversion
+  private DataTypeConverter dataTypeConverter;
+
   /**
* Constructor
*
@@ -82,13 +81,14 @@ public class CarbonCompactionExecutor {
  public CarbonCompactionExecutor(Map<String, TaskBlockInfo> segmentMapping,
   SegmentProperties segmentProperties, CarbonTable carbonTable,
  Map<String, List<DataFileFooter>> dataFileMetadataSegMapping,
-  boolean restructuredBlockExists) {
+  boolean restructuredBlockExists, DataTypeConverter dataTypeConverter) {
 this.segmentMapping = segmentMapping;
 this.destinationSegProperties = segmentProperties;
 this.carbonTable = carbonTable;
 this.dataFileMetadataSegMapping = dataFileMetadataSegMapping;
 this.restructuredBlockExists = restructuredBlockExists;
-queryExecutorList = new 
ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
+this.queryExecutorList = new 
ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
+this.dataTypeConverter = dataTypeConverter;
   }
 
   /**
@@ -100,7 +100,9 @@ public class CarbonCompactionExecutor {
 List<RawResultIterator> resultList =
 new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
 List<TableBlockInfo> list = null;
-queryModel = prepareQueryModel(list);
+queryModel = 
carbonTable.createQueryModelWithProjectAllColumns(dataTypeConverter);
+queryModel.setReadPageByPage(enablePageLevelReaderForCompaction());
+queryModel.setForcedDetailRawQuery(true);
 // iterate each seg ID
 for (Map.Entry<String, TaskBlockInfo> taskMap : segmentMapping.entrySet()) {
   String segmentId = taskMap.getKey();
@@ -156,7 +158,7 @@ public class CarbonCompactionExecutor {
* @param blockList
* @return
*/
-  private CarbonIterator<BatchResult> executeBlockList(List<TableBlockInfo> blockList)
+  private CarbonIterator<RowBatch> executeBlockList(List<TableBlockInfo> blockList)
   throws QueryExecutionException, IOException {
 queryModel.setTableBlockInfos(blockList);
 QueryExecutor queryExecutor = 
QueryExecutorFactory.getQueryExecutor(queryModel);
@@ -195,48 +197,6 @@ public class CarbonCompactionExecutor {
   }
 
   /**
-   * Preparing of the query model.
-   *
-   * @param blockList
-   * @return
-   */
-  private QueryModel prepareQueryModel(List<TableBlockInfo> blockList) {
-QueryModel model = new QueryModel();
-model.setTableBlockInfos(blockList);
-model.setForcedDetailRawQuery(true);
-model.setFilterExpressionResolverTree(null);
-model.setConverter(DataTypeUtil.getDataTypeConverter());
-model.setReadPageByPage(enablePageLevelReaderForCompaction());
-
-    List<QueryDimension> dims = new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
-
-    List<CarbonDimension> dimensions =
-

[09/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
index 69f5ceb..22d1df1 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
@@ -43,10 +43,9 @@ import 
org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
 import org.apache.carbondata.core.datastore.block.TableBlockUniqueIdentifier;
 import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
-import 
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataRefNodeWrapper;
+import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataRefNode;
 import org.apache.carbondata.core.indexstore.blockletindex.IndexWrapper;
 import org.apache.carbondata.core.keygenerator.KeyGenException;
-import org.apache.carbondata.core.keygenerator.KeyGenerator;
 import org.apache.carbondata.core.memory.UnsafeMemoryManager;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.blocklet.BlockletInfo;
@@ -64,8 +63,8 @@ import 
org.apache.carbondata.core.scan.executor.util.RestructureUtil;
 import org.apache.carbondata.core.scan.filter.FilterUtil;
 import org.apache.carbondata.core.scan.filter.SingleTableProvider;
 import org.apache.carbondata.core.scan.filter.TableProvider;
-import org.apache.carbondata.core.scan.model.QueryDimension;
-import org.apache.carbondata.core.scan.model.QueryMeasure;
+import org.apache.carbondata.core.scan.model.ProjectionDimension;
+import org.apache.carbondata.core.scan.model.ProjectionMeasure;
 import org.apache.carbondata.core.scan.model.QueryModel;
 import org.apache.carbondata.core.stats.QueryStatistic;
 import org.apache.carbondata.core.stats.QueryStatisticsConstants;
@@ -121,7 +120,6 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 queryProperties.queryStatisticsRecorder =
 
CarbonTimeStatisticsFactory.createExecutorRecorder(queryModel.getQueryId());
 queryModel.setStatisticsRecorder(queryProperties.queryStatisticsRecorder);
-QueryUtil.resolveQueryModel(queryModel);
 QueryStatistic queryStatistic = new QueryStatistic();
 // sort the block info
 // so block will be loaded in sorted order this will be required for
@@ -168,12 +166,12 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 .addStatistics(QueryStatisticsConstants.LOAD_BLOCKS_EXECUTOR, 
System.currentTimeMillis());
 queryProperties.queryStatisticsRecorder.recordStatistics(queryStatistic);
 // calculating the total number of aggregated columns
-int measureCount = queryModel.getQueryMeasures().size();
+int measureCount = queryModel.getProjectionMeasures().size();
 
 int currentIndex = 0;
 DataType[] dataTypes = new DataType[measureCount];
 
-for (QueryMeasure carbonMeasure : queryModel.getQueryMeasures()) {
+for (ProjectionMeasure carbonMeasure : queryModel.getProjectionMeasures()) 
{
   // adding the data type and aggregation type of all the measure this
   // can be used
   // to select the aggregator
@@ -198,9 +196,11 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 queryStatistic = new QueryStatistic();
 // dictionary column unique column id to dictionary mapping
 // which will be used to get column actual data
-queryProperties.columnToDictionayMapping = QueryUtil
-.getDimensionDictionaryDetail(queryModel.getQueryDimension(),
-queryProperties.complexFilterDimension, 
queryModel.getAbsoluteTableIdentifier(),
+queryProperties.columnToDictionayMapping =
+QueryUtil.getDimensionDictionaryDetail(
+queryModel.getProjectionDimensions(),
+queryProperties.complexFilterDimension,
+queryModel.getAbsoluteTableIdentifier(),
 tableProvider);
 queryStatistic
 .addStatistics(QueryStatisticsConstants.LOAD_DICTIONARY, 
System.currentTimeMillis());
@@ -263,8 +263,8 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 // and query will be executed based on that infos
 for (int i = 0; i < queryProperties.dataBlocks.size(); i++) {
   AbstractIndex abstractIndex = queryProperties.dataBlocks.get(i);
-  BlockletDataRefNodeWrapper dataRefNode =
-  (BlockletDataRefNodeWrapper) abstractIndex.getDataRefNode();
+  BlockletDataRefNode dataRefNode =
+  (BlockletDataRefNode) abstractIndex.getDataRefNode();
   

[46/54] [abbrv] carbondata git commit: [CARBONDATA-1543] Supported DataMap chooser and expression for supporting multiple datamaps in single query

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/56330ae2/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
index ccb9b68..b1962c1 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
@@ -19,15 +19,15 @@ package 
org.apache.carbondata.integration.spark.testsuite.preaggregate
 
 import scala.collection.JavaConverters._
 
-import org.apache.spark.sql.{AnalysisException, 
CarbonDatasourceHadoopRelation, Row}
+import org.apache.spark.sql.{CarbonDatasourceHadoopRelation, Row}
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 import org.apache.spark.sql.execution.datasources.LogicalRelation
 import org.apache.spark.sql.hive.CarbonRelation
 import org.apache.spark.sql.test.util.QueryTest
 import org.scalatest.BeforeAndAfterAll
 
-import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.core.constants.CarbonCommonConstants
+import 
org.apache.carbondata.common.exceptions.sql.{MalformedCarbonCommandException, 
MalformedDataMapCommandException}
 import org.apache.carbondata.core.metadata.encoder.Encoding
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import 
org.apache.carbondata.core.metadata.schema.datamap.DataMapProvider.TIMESERIES
@@ -293,7 +293,7 @@ class TestPreAggCreateCommand extends QueryTest with 
BeforeAndAfterAll {
   test("test pre agg create table 22: using invalid datamap provider") {
 sql("DROP DATAMAP IF EXISTS agg0 ON TABLE maintable")
 
-val e: Exception = intercept[MalformedDataMapCommandException] {
+val e = intercept[MalformedDataMapCommandException] {
   sql(
 """
   | CREATE DATAMAP agg0 ON TABLE mainTable

http://git-wip-us.apache.org/repos/asf/carbondata/blob/56330ae2/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
index 43316b3..ec76b37 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
@@ -201,7 +201,7 @@ class TestTimeSeriesCreateTable extends QueryTest with 
BeforeAndAfterAll {
   | GROUP BY dataTime
 """.stripMargin)
 }
-assert(e.getMessage.equals("Unknown datamap provider/class abc"))
+assert(e.getMessage.equals("DataMap class 'abc' not found"))
   }
 
   test("test timeseries create table 12: USING and catch 
MalformedCarbonCommandException") {
@@ -216,7 +216,7 @@ class TestTimeSeriesCreateTable extends QueryTest with 
BeforeAndAfterAll {
   | GROUP BY dataTime
 """.stripMargin)
 }
-assert(e.getMessage.equals("Unknown datamap provider/class abc"))
+assert(e.getMessage.equals("DataMap class 'abc' not found"))
   }
 
   test("test timeseries create table 13: Only one granularity level can be 
defined 1") {
@@ -237,6 +237,7 @@ class TestTimeSeriesCreateTable extends QueryTest with 
BeforeAndAfterAll {
| GROUP BY dataTime
""".stripMargin)
 }
+e.printStackTrace()
 assert(e.getMessage.equals("Only one granularity level can be defined"))
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/56330ae2/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala
index c522c1e..e31896f 100644
--- 

[07/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
index de97e82..540607d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java
@@ -34,8 +34,8 @@ import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.cache.dictionary.Dictionary;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.chunk.DimensionColumnDataChunk;
-import 
org.apache.carbondata.core.datastore.chunk.impl.VariableLengthDimensionDataChunk;
+import org.apache.carbondata.core.datastore.chunk.DimensionColumnPage;
+import 
org.apache.carbondata.core.datastore.chunk.impl.VariableLengthDimensionColumnPage;
 import org.apache.carbondata.core.datastore.page.ColumnPage;
 import org.apache.carbondata.core.keygenerator.KeyGenException;
 import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryGenerator;
@@ -58,7 +58,7 @@ import org.apache.carbondata.core.scan.filter.intf.RowImpl;
 import org.apache.carbondata.core.scan.filter.intf.RowIntf;
 import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
 import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
-import org.apache.carbondata.core.scan.processor.BlocksChunkHolder;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
 import org.apache.carbondata.core.util.BitSetGroup;
 import org.apache.carbondata.core.util.ByteUtil;
 import org.apache.carbondata.core.util.CarbonUtil;
@@ -68,20 +68,20 @@ public class RowLevelFilterExecuterImpl implements 
FilterExecuter {
 
   private static final LogService LOGGER =
   
LogServiceFactory.getLogService(RowLevelFilterExecuterImpl.class.getName());
-  protected List<DimColumnResolvedFilterInfo> dimColEvaluatorInfoList;
-  protected List<MeasureColumnResolvedFilterInfo> msrColEvalutorInfoList;
+  List<DimColumnResolvedFilterInfo> dimColEvaluatorInfoList;
+  List<MeasureColumnResolvedFilterInfo> msrColEvalutorInfoList;
   protected Expression exp;
   protected AbsoluteTableIdentifier tableIdentifier;
   protected SegmentProperties segmentProperties;
   /**
* it has index at which given dimension is stored in file
*/
-  protected int[] dimensionBlocksIndex;
+  int[] dimensionChunkIndex;
 
   /**
* it has index at which given measure is stored in file
*/
-  protected int[] measureBlocksIndex;
+  int[] measureChunkIndex;
 
  private Map<Integer, GenericQueryType> complexDimensionInfoMap;
 
@@ -89,18 +89,18 @@ public class RowLevelFilterExecuterImpl implements 
FilterExecuter {
* flag to check whether the filter dimension is present in current block 
list of dimensions.
* Applicable for restructure scenarios
*/
-  protected boolean[] isDimensionPresentInCurrentBlock;
+  boolean[] isDimensionPresentInCurrentBlock;
 
   /**
* flag to check whether the filter measure is present in current block list 
of measures.
* Applicable for restructure scenarios
*/
-  protected boolean[] isMeasurePresentInCurrentBlock;
+  boolean[] isMeasurePresentInCurrentBlock;
 
   /**
* is dimension column data is natural sorted
*/
-  protected boolean isNaturalSorted;
+  boolean isNaturalSorted;
 
   /**
* date direct dictionary generator
@@ -124,10 +124,10 @@ public class RowLevelFilterExecuterImpl implements 
FilterExecuter {
 }
 if (this.dimColEvaluatorInfoList.size() > 0) {
   this.isDimensionPresentInCurrentBlock = new 
boolean[dimColEvaluatorInfoList.size()];
-  this.dimensionBlocksIndex = new int[dimColEvaluatorInfoList.size()];
+  this.dimensionChunkIndex = new int[dimColEvaluatorInfoList.size()];
 } else {
   this.isDimensionPresentInCurrentBlock = new boolean[]{false};
-  this.dimensionBlocksIndex = new int[]{0};
+  this.dimensionChunkIndex = new int[]{0};
 }
 if (null == msrColEvalutorInfoList) {
   this.msrColEvalutorInfoList = new ArrayList<MeasureColumnResolvedFilterInfo>(20);
@@ -136,10 +136,10 @@ public class RowLevelFilterExecuterImpl implements 
FilterExecuter {
 }
 if (this.msrColEvalutorInfoList.size() > 0) {
   this.isMeasurePresentInCurrentBlock = new 
boolean[msrColEvalutorInfoList.size()];
-  this.measureBlocksIndex = new int[msrColEvalutorInfoList.size()];
+  this.measureChunkIndex = new int[msrColEvalutorInfoList.size()];
 } else {
   this.isMeasurePresentInCurrentBlock = new boolean[]{false};
-  this.measureBlocksIndex = new int[] {0};

[15/54] [abbrv] carbondata git commit: Support generating assembling JAR for store-sdk module

2018-03-08 Thread ravipesala
Support generating assembling JAR for store-sdk module

Support generating an assembly JAR for the store-sdk module, and scope the
junit/jmockit dependencies to test so they are not pulled into the assembly.

This closes #1976


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/05396623
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/05396623
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/05396623

Branch: refs/heads/master
Commit: 053966237e8061a9d13131c4abd9a52567ed8642
Parents: 9a423c2
Author: Jacky Li 
Authored: Tue Feb 13 09:12:09 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 common/pom.xml|  2 +
 core/pom.xml  |  2 +
 hadoop/pom.xml|  1 +
 integration/presto/pom.xml|  3 +-
 integration/spark-common-cluster-test/pom.xml |  2 +-
 integration/spark-common-test/pom.xml |  3 +-
 integration/spark-common/pom.xml  |  2 +-
 integration/spark2/pom.xml|  2 +-
 pom.xml   |  5 +++
 processing/pom.xml|  1 +
 store/sdk/pom.xml | 50 +-
 streaming/pom.xml |  1 -
 12 files changed, 66 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/common/pom.xml
--
diff --git a/common/pom.xml b/common/pom.xml
index 5550129..433d575 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -42,10 +42,12 @@
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.jmockit</groupId>
       <artifactId>jmockit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.hadoop</groupId>

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index 92c9607..824de0d 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -70,10 +70,12 @@
     <dependency>
       <groupId>org.jmockit</groupId>
       <artifactId>jmockit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/hadoop/pom.xml
--
diff --git a/hadoop/pom.xml b/hadoop/pom.xml
index 2aaac99..c3964c5 100644
--- a/hadoop/pom.xml
+++ b/hadoop/pom.xml
@@ -42,6 +42,7 @@
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
+      <scope>test</scope>
     </dependency>
   </dependencies>
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/integration/presto/pom.xml
--
diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index 17f5d41..c3c7c64 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -193,7 +193,7 @@
         </exclusion>
         <exclusion>
           <groupId>org.scalatest</groupId>
-          <artifactId>scalatest_2.11</artifactId>
+          <artifactId>scalatest_${scala.binary.version}</artifactId>
         </exclusion>
         <exclusion>
           <groupId>org.apache.zookeeper</groupId>
@@ -330,7 +330,6 @@
     <dependency>
       <groupId>org.scalatest</groupId>
       <artifactId>scalatest_${scala.binary.version}</artifactId>
-      <version>2.2.1</version>
       <scope>test</scope>
     </dependency>
     <dependency>

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/integration/spark-common-cluster-test/pom.xml
--
diff --git a/integration/spark-common-cluster-test/pom.xml 
b/integration/spark-common-cluster-test/pom.xml
index fd907a3..028da11 100644
--- a/integration/spark-common-cluster-test/pom.xml
+++ b/integration/spark-common-cluster-test/pom.xml
@@ -49,11 +49,11 @@
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.scalatest</groupId>
       <artifactId>scalatest_${scala.binary.version}</artifactId>
-      <version>2.2.1</version>
       <scope>test</scope>
     </dependency>
     <dependency>

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/integration/spark-common-test/pom.xml
--
diff --git a/integration/spark-common-test/pom.xml 
b/integration/spark-common-test/pom.xml
index 67a2317..d1c04ae 100644
--- a/integration/spark-common-test/pom.xml
+++ b/integration/spark-common-test/pom.xml
@@ -106,16 +106,17 @@
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.scalatest</groupId>
       <artifactId>scalatest_${scala.binary.version}</artifactId>
-      <version>2.2.1</version>
       <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.jmockit</groupId>
       <artifactId>jmockit</artifactId>
+      <scope>test</scope>
     </dependency>
   </dependencies>
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/05396623/integration/spark-common/pom.xml
--
diff --git a/integration/spark-common/pom.xml b/integration/spark-common/pom.xml
index 295d62b..16f327d 100644
--- a/integration/spark-common/pom.xml
+++ b/integration/spark-common/pom.xml
@@ -58,11 +58,11 @@
 
 

[03/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
 
b/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
index 94a041a..b74c279 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
@@ -378,7 +378,6 @@ public abstract class AbstractDataFileFooterConverter {
   cardinality[i] = segmentInfo.getColumn_cardinalities().get(i);
 }
 info.setColumnCardinality(cardinality);
-info.setNumberOfColumns(segmentInfo.getNum_cols());
 return info;
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 52305bd..0cc783e 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -48,10 +48,10 @@ import 
org.apache.carbondata.core.cache.dictionary.DictionaryColumnUniqueIdentif
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.constants.CarbonLoadOptionConstants;
 import org.apache.carbondata.core.datamap.Segment;
-import org.apache.carbondata.core.datastore.FileHolder;
+import org.apache.carbondata.core.datastore.FileReader;
 import org.apache.carbondata.core.datastore.block.AbstractIndex;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
-import org.apache.carbondata.core.datastore.chunk.DimensionColumnDataChunk;
+import org.apache.carbondata.core.datastore.chunk.DimensionColumnPage;
 import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
 import org.apache.carbondata.core.datastore.chunk.impl.MeasureRawColumnChunk;
 import org.apache.carbondata.core.datastore.columnar.ColumnGroupModel;
@@ -82,7 +82,7 @@ import 
org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
 import org.apache.carbondata.core.mutate.UpdateVO;
 import org.apache.carbondata.core.reader.ThriftReader;
 import org.apache.carbondata.core.reader.ThriftReader.TBaseCreator;
-import org.apache.carbondata.core.scan.model.QueryDimension;
+import org.apache.carbondata.core.scan.model.ProjectionDimension;
 import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
 import org.apache.carbondata.core.statusmanager.SegmentStatus;
 import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
@@ -248,16 +248,13 @@ public final class CarbonUtil {
   public static ColumnGroupModel getColGroupModel(int[][] columnGroups) {
 int[] columnSplit = new int[columnGroups.length];
 int noOfColumnStore = columnSplit.length;
-boolean[] columnarStore = new boolean[noOfColumnStore];
 
 for (int i = 0; i < columnGroups.length; i++) {
   columnSplit[i] = columnGroups[i].length;
-  columnarStore[i] = columnGroups[i].length <= 1;
 }
 ColumnGroupModel colGroupModel = new ColumnGroupModel();
 colGroupModel.setNoOfColumnStore(noOfColumnStore);
 colGroupModel.setColumnSplit(columnSplit);
-colGroupModel.setColumnarStore(columnarStore);
 colGroupModel.setColumnGroup(columnGroups);
 return colGroupModel;
   }
@@ -418,7 +415,7 @@ public final class CarbonUtil {
 }
   }
 
-  public static int getFirstIndexUsingBinarySearch(DimensionColumnDataChunk 
dimColumnDataChunk,
+  public static int getFirstIndexUsingBinarySearch(DimensionColumnPage 
dimColumnDataChunk,
   int low, int high, byte[] compareValue, boolean matchUpLimit) {
 int cmpResult = 0;
 while (high >= low) {
@@ -457,7 +454,7 @@ public final class CarbonUtil {
* @return the compareValue's range index in the dimColumnDataChunk
*/
   public static int[] getRangeIndexUsingBinarySearch(
-  DimensionColumnDataChunk dimColumnDataChunk, int low, int high, byte[] 
compareValue) {
+  DimensionColumnPage dimColumnDataChunk, int low, int high, byte[] 
compareValue) {
 
 int[] rangeIndex = new int[2];
 int cmpResult = 0;
@@ -551,7 +548,7 @@ public final class CarbonUtil {
* @return index value
*/
   public static int nextLesserValueToTarget(int currentIndex,
-  DimensionColumnDataChunk dimColumnDataChunk, byte[] compareValue) {
+  DimensionColumnPage dimColumnDataChunk, byte[] compareValue) {
 while (currentIndex - 1 >= 0
 && dimColumnDataChunk.compareTo(currentIndex - 1, compareValue) >= 0) {
   --currentIndex;
@@ -571,7 +568,7 @@ 

[38/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/IndexDataMapWriterSuite.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/IndexDataMapWriterSuite.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/IndexDataMapWriterSuite.scala
new file mode 100644
index 000..795ef6a
--- /dev/null
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/IndexDataMapWriterSuite.scala
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datamap
+
+import java.util
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{DataFrame, SaveMode}
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta, 
DataMapStoreManager, Segment}
+import org.apache.carbondata.core.datamap.dev.{AbstractDataMapWriter}
+import org.apache.carbondata.core.datamap.dev.AbstractDataMapWriter
+import 
org.apache.carbondata.core.datamap.dev.cgdatamap.{AbstractCoarseGrainIndexDataMap,
 AbstractCoarseGrainIndexDataMapFactory}
+import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta}
+import org.apache.carbondata.core.datastore.page.ColumnPage
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema
+import org.apache.carbondata.core.scan.filter.intf.ExpressionType
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.events.Event
+
+class C2IndexDataMapFactory() extends AbstractCoarseGrainIndexDataMapFactory {
+
+  var identifier: AbsoluteTableIdentifier = _
+
+  override def init(identifier: AbsoluteTableIdentifier,
+  dataMapSchema: DataMapSchema): Unit = {
+this.identifier = identifier
+  }
+
+  override def fireEvent(event: Event): Unit = ???
+
+  override def clear(segment: Segment): Unit = {}
+
+  override def clear(): Unit = {}
+
+  override def getDataMaps(distributable: DataMapDistributable): 
util.List[AbstractCoarseGrainIndexDataMap] = ???
+
+  override def getDataMaps(segment: Segment): 
util.List[AbstractCoarseGrainIndexDataMap] = ???
+
+  override def createWriter(segment: Segment, dataWritePath: String): 
AbstractDataMapWriter =
+IndexDataMapWriterSuite.dataMapWriterC2Mock(identifier, segment, 
dataWritePath)
+
+  override def getMeta: DataMapMeta = new DataMapMeta(List("c2").asJava, 
List(ExpressionType.EQUALS).asJava)
+
+  /**
+   * Get all distributable objects of a segmentid
+   *
+   * @return
+   */
+  override def toDistributable(segmentId: Segment): 
util.List[DataMapDistributable] = {
+???
+  }
+
+}
+
+class IndexDataMapWriterSuite extends QueryTest with BeforeAndAfterAll {
+  def buildTestData(numRows: Int): DataFrame = {
+import sqlContext.implicits._
+sqlContext.sparkContext.parallelize(1 to numRows, 1)
+  .map(x => ("a" + x, "b", x))
+  .toDF("c1", "c2", "c3")
+  }
+
+  def dropTable(): Unit = {
+sql("DROP TABLE IF EXISTS carbon1")
+sql("DROP TABLE IF EXISTS carbon2")
+  }
+
+  override def beforeAll {
+dropTable()
+  }
+
+  test("test write datamap 2 pages") {
+sql(s"CREATE TABLE carbon1(c1 STRING, c2 STRING, c3 INT) STORED BY 
'org.apache.carbondata.format'")
+// register datamap writer
+sql(s"CREATE DATAMAP test ON TABLE carbon1 USING 
'${classOf[C2IndexDataMapFactory].getName}'")
+val df = buildTestData(33000)
+
+// save dataframe to carbon file
+df.write
+  .format("carbondata")
+  .option("tableName", "carbon1")
+  .option("tempCSV", "false")
+  .option("sort_columns","c1")
+  .mode(SaveMode.Overwrite)
+  .save()
+
+assert(IndexDataMapWriterSuite.callbackSeq.head.contains("block start"))
+assert(IndexDataMapWriterSuite.callbackSeq.last.contains("block 

[25/54] [abbrv] carbondata git commit: [CARBONDATA-2156] Add interface annotation

2018-03-08 Thread ravipesala
[CARBONDATA-2156] Add interface annotation

The InterfaceAudience and InterfaceStability annotations should be added so
that APIs are explicitly marked for users and developers:

1. InterfaceAudience can be User or Developer.
2. InterfaceStability can be Stable, Evolving, or Unstable.

(A short usage sketch follows the change summary below.)

This closes #1968


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/ce6e71c5
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/ce6e71c5
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/ce6e71c5

Branch: refs/heads/master
Commit: ce6e71c5b19876415f00bd20b1cdf7fe5cc22825
Parents: 8d8b589
Author: Jacky Li 
Authored: Sun Feb 11 10:12:10 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../common/annotations/InterfaceAudience.java   | 58 
 .../common/annotations/InterfaceStability.java  | 69 
 2 files changed, 127 insertions(+)
--
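
A short usage sketch (the factory class below is hypothetical; only the
annotations come from this commit):

  import org.apache.carbondata.common.annotations.{InterfaceAudience, InterfaceStability}

  // Hypothetical extension point: intended for developers only, still evolving.
  @InterfaceAudience.Developer
  @InterfaceStability.Evolving
  abstract class MyIndexDataMapFactory {
    def createWriter(segmentId: String): AnyRef  // illustrative member
  }

Per the javadoc below, public classes without an audience annotation default
to Developer, so external applications should rely only on classes marked User.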


http://git-wip-us.apache.org/repos/asf/carbondata/blob/ce6e71c5/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
--
diff --git 
a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
 
b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
new file mode 100644
index 000..fa9729d
--- /dev/null
+++ 
b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceAudience.java
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.common.annotations;
+
+import java.lang.annotation.Documented;
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+/**
+ * This annotation is ported and modified from Apache Hadoop project.
+ *
+ * Annotation to inform users of a package, class or method's intended 
audience.
+ * Currently the audience can be {@link User}, {@link Developer}
+ *
+ * Public classes that are not marked with this annotation must be
+ * considered by default as {@link Developer}.
+ *
+ * External applications must only use classes that are marked {@link User}.
+ *
+ * Methods may have a different annotation that it is more restrictive
+ * compared to the audience classification of the class. Example: A class
+ * might be {@link User}, but a method may be {@link Developer}
+ */
+@InterfaceAudience.User
+@InterfaceStability.Evolving
+public class InterfaceAudience {
+  /**
+   * Intended for use by any project or application.
+   */
+  @Documented
+  @Retention(RetentionPolicy.RUNTIME)
+  public @interface User { }
+
+  /**
+   * Intended only for developers to extend interface for CarbonData project
+   * For example, new Datamap implementations.
+   */
+  @Documented
+  @Retention(RetentionPolicy.RUNTIME)
+  public @interface Developer { }
+
+  private InterfaceAudience() { } // Audience can't exist on its own
+}

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ce6e71c5/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceStability.java
--
diff --git 
a/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceStability.java
 
b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceStability.java
new file mode 100644
index 000..b8e5e52
--- /dev/null
+++ 
b/common/src/main/java/org/apache/carbondata/common/annotations/InterfaceStability.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable 

[05/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
new file mode 100644
index 000..fde4e55
--- /dev/null
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/processor/DataBlockIterator.java
@@ -0,0 +1,269 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.processor;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.core.datastore.DataRefNode;
+import org.apache.carbondata.core.datastore.FileReader;
+import org.apache.carbondata.core.scan.collector.ResultCollectorFactory;
+import org.apache.carbondata.core.scan.collector.ScannedResultCollector;
+import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo;
+import org.apache.carbondata.core.scan.result.BlockletScannedResult;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch;
+import org.apache.carbondata.core.scan.scanner.BlockletScanner;
+import org.apache.carbondata.core.scan.scanner.impl.BlockletFilterScanner;
+import org.apache.carbondata.core.scan.scanner.impl.BlockletFullScanner;
+import org.apache.carbondata.core.stats.QueryStatisticsModel;
+import org.apache.carbondata.core.util.TaskMetricsMap;
+
+/**
+ * Iterator over the data blocks of a query: it iterates the blocklets of a
+ * block and returns the scanned result in batches.
+ */
+public class DataBlockIterator extends CarbonIterator<List<Object[]>> {
+
+  /**
+   * iterator which will be used to iterate over blocklets
+   */
+  private BlockletIterator blockletIterator;
+
+  /**
+   * result collector which will be used to aggregate the scanned result
+   */
+  private ScannedResultCollector scannerResultAggregator;
+
+  /**
+   * scanner which will be used to process the blocklets; the processing can
+   * be either filter based or full scan based
+   */
+  private BlockletScanner blockletScanner;
+
+  /**
+   * batch size of result
+   */
+  private int batchSize;
+
+  private ExecutorService executorService;
+
+  private Future future;
+
+  private Future futureIo;
+
+  private BlockletScannedResult scannedResult;
+
+  private BlockExecutionInfo blockExecutionInfo;
+
+  private FileReader fileReader;
+
+  private AtomicBoolean nextBlock;
+
+  private AtomicBoolean nextRead;
+
+  public DataBlockIterator(BlockExecutionInfo blockExecutionInfo, FileReader 
fileReader,
+  int batchSize, QueryStatisticsModel queryStatisticsModel, 
ExecutorService executorService) {
+this.blockExecutionInfo = blockExecutionInfo;
+this.fileReader = fileReader;
+blockletIterator = new 
BlockletIterator(blockExecutionInfo.getFirstDataBlock(),
+blockExecutionInfo.getNumberOfBlockToScan());
+if (blockExecutionInfo.getFilterExecuterTree() != null) {
+  blockletScanner = new BlockletFilterScanner(blockExecutionInfo, 
queryStatisticsModel);
+} else {
+  blockletScanner = new BlockletFullScanner(blockExecutionInfo, 
queryStatisticsModel);
+}
+this.scannerResultAggregator =
+ResultCollectorFactory.getScannedResultCollector(blockExecutionInfo);
+this.batchSize = batchSize;
+this.executorService = executorService;
+this.nextBlock = new AtomicBoolean(false);
+this.nextRead = new AtomicBoolean(false);
+  }
+
+  @Override
+  public List<Object[]> next() {
+List<Object[]> collectedResult = null;
+if (updateScanner()) {
+  collectedResult =
+  this.scannerResultAggregator.collectResultInRow(scannedResult, batchSize);
+  while (collectedResult.size() < batchSize && updateScanner()) {
+List<Object[]> data = this.scannerResultAggregator
+
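
(The diff is truncated here by the mail archive.) From the class declaration above, DataBlockIterator is a CarbonIterator over List<Object[]> batches, so a consumer only needs hasNext()/next(). A minimal, hedged Scala sketch of draining one iterator — the construction of `blockIterator` is assumed to have happened elsewhere, e.g. in the query executor:

import java.util.{List => JList}
import org.apache.carbondata.core.scan.processor.DataBlockIterator

// counts rows; each next() call returns at most batchSize rows
def drain(blockIterator: DataBlockIterator): Long = {
  var rows = 0L
  while (blockIterator.hasNext) {
    val batch: JList[Array[AnyRef]] = blockIterator.next()
    rows += batch.size()
  }
  rows
}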

[44/54] [abbrv] carbondata git commit: [CARBONDATA-2206] support lucene index datamap

2018-03-08 Thread ravipesala
[CARBONDATA-2206] support lucene index datamap

This PR is an initial effort to integrate Lucene as an index datamap into
CarbonData.
A new module called carbondata-lucene is added to support the Lucene datamap:

1. Add LuceneFineGrainDataMap, implementing the FineGrainDataMap interface.
2. Add LuceneCoarseGrainDataMap, implementing the CoarseGrainDataMap interface.
3. Support writing the Lucene index via LuceneDataMapWriter.
4. Implement LuceneDataMapFactory.
5. Add a UDF called TEXT_MATCH; use it to filter on a string column via Lucene
(see the sketch below).

This closes #2003
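
As a rough illustration of the new surface area (in Scala; the table, column, and DMPROPERTIES names are hypothetical, and the property key is an assumption rather than verified syntax — only TEXT_MATCH and the factory class names come from this commit):

// assumes a SparkSession `spark` with CarbonData extensions enabled
spark.sql("CREATE TABLE main (id INT, name STRING) STORED BY 'carbondata'")

spark.sql(
  """CREATE DATAMAP dm ON TABLE main
    |USING 'org.apache.carbondata.datamap.lucene.LuceneFineGrainDataMapFactory'
    |DMPROPERTIES('TEXT_COLUMNS'='name')""".stripMargin)

// TEXT_MATCH pushes a Lucene query string down to the index datamap
spark.sql("SELECT * FROM main WHERE TEXT_MATCH('name:carbon*')").show()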


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/bbb10922
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/bbb10922
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/bbb10922

Branch: refs/heads/master
Commit: bbb10922b3ed7ddf12881dae339848f1d88b0984
Parents: a80db8e
Author: Jacky Li 
Authored: Mon Feb 26 16:30:38 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:11 2018 +0530

--
 .../carbondata/core/datamap/DataMapChooser.java |   4 +
 .../core/datamap/DataMapStoreManager.java   |   5 +-
 .../carbondata/core/datamap/dev/DataMap.java|   2 +-
 .../core/datamap/dev/DataMapFactory.java|   2 +-
 .../core/datamap/dev/DataMapWriter.java |   7 +-
 .../cgdatamap/CoarseGrainDataMapFactory.java|   1 +
 .../core/scan/filter/intf/ExpressionType.java   |   3 +-
 datamap/lucene/pom.xml  | 149 +
 .../lucene/LuceneCoarseGrainDataMap.java| 232 +
 .../lucene/LuceneCoarseGrainDataMapFactory.java |  72 
 .../lucene/LuceneDataMapDistributable.java  |  36 ++
 .../lucene/LuceneDataMapFactoryBase.java| 180 ++
 .../datamap/lucene/LuceneDataMapWriter.java | 328 +++
 .../datamap/lucene/LuceneFineGrainDataMap.java  | 280 
 .../lucene/LuceneFineGrainDataMapFactory.java   |  68 
 .../lucene/LuceneCoarseGrainDataMapSuite.scala  |  73 +
 .../lucene/LuceneFineGrainDataMapSuite.scala|  98 ++
 integration/spark-common-test/pom.xml   |   6 +
 .../testsuite/datamap/FGDataMapTestCase.scala   |   2 +-
 .../carbondata/datamap/DataMapProvider.java |   4 +-
 .../datamap/IndexDataMapProvider.java   |   4 +-
 .../datamap/expression/MatchExpression.java |  56 
 .../carbondata/datamap/TextMatchUDF.scala   |  34 ++
 .../scala/org/apache/spark/sql/CarbonEnv.scala  |   5 +
 .../strategy/CarbonLateDecodeStrategy.scala |   9 +
 .../spark/sql/optimizer/CarbonFilters.scala |   4 +
 pom.xml |   3 +
 .../datamap/DataMapWriterListener.java  |   6 +-
 .../store/writer/AbstractFactDataWriter.java|  12 +-
 .../writer/v3/CarbonFactDataWriterImplV3.java   |   4 +-
 30 files changed, 1671 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/bbb10922/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
index 94b48c6..c8c971d 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
@@ -228,6 +228,10 @@ public class DataMapChooser {
 
  private boolean contains(DataMapMeta mapMeta, List<ColumnExpression> columnExpressions,
  Set<ExpressionType> expressionTypes) {
+if (mapMeta.getOptimizedOperation().contains(ExpressionType.TEXT_MATCH)) {
+  // TODO: fix it with right logic
+  return true;
+}
 if (mapMeta.getIndexedColumns().size() == 0 || columnExpressions.size() == 
0) {
   return false;
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/bbb10922/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index e57a841..ab339e8 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -16,6 +16,7 @@
  */
 package org.apache.carbondata.core.datamap;
 
+import java.io.IOException;
 import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.List;
@@ -144,7 +145,7 @@ public final class DataMapStoreManager {
* The datamap is created using datamap name, datamap factory class and 
table identifier.
   

[20/54] [abbrv] carbondata git commit: [CARBONDATA-2023][DataLoad] Add size based block allocation in data loading

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/8d8b589e/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparator.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparator.java
 
b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparator.java
new file mode 100644
index 000..0ae0b93
--- /dev/null
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparator.java
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.sort.sortdata;
+
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+import org.apache.carbondata.core.datastore.row.WriteStepRowUtil;
+import org.apache.carbondata.core.util.ByteUtil.UnsafeComparer;
+import org.apache.carbondata.core.util.NonDictionaryUtil;
+
+public class RowComparator implements Comparator<Object[]> {
+  /**
+   * noDictionaryCount represents the number of no-dictionary columns
+   */
+  private int noDictionaryCount;
+
+  /**
+   * mapping that marks which sort columns are no-dictionary dimensions
+   */
+  private boolean[] noDictionarySortColumnMaping;
+
+  /**
+   * @param noDictionarySortColumnMaping
+   * @param noDictionaryCount
+   */
+  public RowComparator(boolean[] noDictionarySortColumnMaping, int 
noDictionaryCount) {
+this.noDictionaryCount = noDictionaryCount;
+this.noDictionarySortColumnMaping = noDictionarySortColumnMaping;
+  }
+
+  /**
+   * Below method will be used to compare two rows on their mdkey and
+   * no-dictionary column bytes
+   */
+  public int compare(Object[] rowA, Object[] rowB) {
+int diff = 0;
+
+int normalIndex = 0;
+int noDictionaryindex = 0;
+
+for (boolean isNoDictionary : noDictionarySortColumnMaping) {
+
+  if (isNoDictionary) {
+byte[] byteArr1 = (byte[]) 
rowA[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX];
+
+ByteBuffer buff1 = ByteBuffer.wrap(byteArr1);
+
+// extract one high-cardinality dimension from the complete byte[].
+NonDictionaryUtil
+.extractSingleHighCardDims(byteArr1, noDictionaryindex, 
noDictionaryCount, buff1);
+
+byte[] byteArr2 = (byte[]) 
rowB[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX];
+
+ByteBuffer buff2 = ByteBuffer.wrap(byteArr2);
+
+// extract one high-cardinality dimension from the complete byte[].
+NonDictionaryUtil
+.extractSingleHighCardDims(byteArr2, noDictionaryindex, 
noDictionaryCount, buff2);
+
+int difference = UnsafeComparer.INSTANCE.compareTo(buff1, buff2);
+if (difference != 0) {
+  return difference;
+}
+noDictionaryindex++;
+  } else {
+int dimFieldA = NonDictionaryUtil.getDimension(normalIndex, rowA);
+int dimFieldB = NonDictionaryUtil.getDimension(normalIndex, rowB);
+diff = dimFieldA - dimFieldB;
+if (diff != 0) {
+  return diff;
+}
+normalIndex++;
+  }
+
+}
+
+return diff;
+  }
+}
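
The interesting part of this comparator is the mixed traversal: dictionary-encoded sort columns compare as integer surrogates, while no-dictionary columns compare lexicographically on their bytes, with two independent cursors walking the same column mapping. A simplified, self-contained Scala sketch of that strategy (not the CarbonData row layout; rows here are just a pair of arrays):

import java.util.Comparator

object MixedRowComparatorSketch {
  // per sort column: true -> compare raw bytes, false -> compare dictionary ints
  val noDictionarySortColumnMapping: Array[Boolean] = Array(false, true)

  // a row is (dictionary surrogate values, no-dictionary byte arrays)
  type Row = (Array[Int], Array[Array[Byte]])

  // unsigned lexicographic byte[] comparison, in the spirit of UnsafeComparer
  private def compareBytes(x: Array[Byte], y: Array[Byte]): Int = {
    var i = 0
    val len = math.min(x.length, y.length)
    while (i < len) {
      val c = (x(i) & 0xFF) - (y(i) & 0xFF)
      if (c != 0) return c
      i += 1
    }
    x.length - y.length
  }

  val comparator: Comparator[Row] = new Comparator[Row] {
    override def compare(a: Row, b: Row): Int = {
      var dictIdx = 0
      var rawIdx = 0
      var i = 0
      var result = 0
      while (i < noDictionarySortColumnMapping.length && result == 0) {
        if (noDictionarySortColumnMapping(i)) {
          result = compareBytes(a._2(rawIdx), b._2(rawIdx)); rawIdx += 1
        } else {
          result = a._1(dictIdx) - b._1(dictIdx); dictIdx += 1
        }
        i += 1
      }
      result
    }
  }

  def main(args: Array[String]): Unit = {
    val r1: Row = (Array(1), Array("apple".getBytes("UTF-8")))
    val r2: Row = (Array(1), Array("banana".getBytes("UTF-8")))
    println(comparator.compare(r1, r2)) // negative: r1 sorts before r2
  }
}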

http://git-wip-us.apache.org/repos/asf/carbondata/blob/8d8b589e/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparatorForNormalDims.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparatorForNormalDims.java
 
b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparatorForNormalDims.java
new file mode 100644
index 000..0883ae1
--- /dev/null
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/sort/sortdata/RowComparatorForNormalDims.java
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless 

[29/54] [abbrv] carbondata git commit: [CARBONDATA-1997] Add CarbonWriter SDK API

2018-03-08 Thread ravipesala
[CARBONDATA-1997] Add CarbonWriter SDK API

Added a new module called store-sdk, along with a CarbonWriter API. It can be
used to write CarbonData files to a specified folder without any Spark or
Hadoop dependency, so users can call this API in any environment.

This closes #1967
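
A hedged Scala sketch of what using the new API could look like. The class names (CarbonWriter, Schema, Field, CSVCarbonWriter) come from the file listing below; the exact builder method names and the write() input format are assumptions, not verified signatures:

import org.apache.carbondata.core.metadata.datatype.DataTypes
import org.apache.carbondata.sdk.file.{CarbonWriter, Field, Schema}

object SdkWriteSketch {
  def main(args: Array[String]): Unit = {
    val schema = new Schema(Array(
      new Field("name", DataTypes.STRING),
      new Field("age", DataTypes.INT)))
    val writer = CarbonWriter.builder()   // assumed builder entry point
      .withSchema(schema)                 // assumption: schema is set on the builder
      .outputPath("/tmp/carbon-sdk-out")  // target folder for CarbonData files
      .buildWriterForCSVInput()           // yields the CSVCarbonWriter variant
    writer.write(Array("amy", "1"))       // one CSV row
    writer.write(Array("bob", "2"))
    writer.close()
  }
}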


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/1d827c7b
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/1d827c7b
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/1d827c7b

Branch: refs/heads/master
Commit: 1d827c7b515a21d936769424c282d2aafb1ef6b7
Parents: ce6e71c
Author: Jacky Li 
Authored: Sat Feb 10 19:44:23 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../org/apache/carbondata/common/Strings.java   |  40 
 .../apache/carbondata/common/StringsSuite.java  |  53 +
 .../core/metadata/schema/table/CarbonTable.java |   7 +
 .../schema/table/CarbonTableBuilder.java|  72 +++
 .../core/metadata/schema/table/TableSchema.java |   7 +
 .../schema/table/TableSchemaBuilder.java| 107 ++
 .../schema/table/CarbonTableBuilderSuite.java   |  86 
 .../metadata/schema/table/CarbonTableTest.java  |  12 +-
 .../schema/table/TableSchemaBuilderSuite.java   |  56 ++
 .../carbondata/spark/util/DataLoadingUtil.scala |  45 +
 pom.xml |   7 +
 store/sdk/pom.xml   | 130 +
 .../carbondata/sdk/file/CSVCarbonWriter.java|  89 +
 .../carbondata/sdk/file/CarbonWriter.java   |  51 +
 .../sdk/file/CarbonWriterBuilder.java   | 194 +++
 .../org/apache/carbondata/sdk/file/Field.java   |  74 +++
 .../org/apache/carbondata/sdk/file/Schema.java  |  74 +++
 .../sdk/file/CSVCarbonWriterSuite.java  | 127 
 18 files changed, 1225 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/1d827c7b/common/src/main/java/org/apache/carbondata/common/Strings.java
--
diff --git a/common/src/main/java/org/apache/carbondata/common/Strings.java 
b/common/src/main/java/org/apache/carbondata/common/Strings.java
new file mode 100644
index 000..23288dd
--- /dev/null
+++ b/common/src/main/java/org/apache/carbondata/common/Strings.java
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.common;
+
+import java.util.Objects;
+
+public class Strings {
+
+  /**
+   * Provides the same function as mkString in Scala.
+   * This is added to avoid JDK 8 dependency.
+   */
+  public static String mkString(String[] strings, String delimeter) {
+Objects.requireNonNull(strings);
+Objects.requireNonNull(delimeter);
+StringBuilder builder = new StringBuilder();
+for (int i = 0; i < strings.length; i++) {
+  builder.append(strings[i]);
+  if (i != strings.length - 1) {
+builder.append(delimeter);
+  }
+}
+return builder.toString();
+  }
+}
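
For reference, a quick usage example of the helper above, called from Scala with hypothetical values:

import org.apache.carbondata.common.Strings

val header = Strings.mkString(Array("id", "name", "age"), ",")
println(header) // prints: id,name,age  (no trailing delimiter)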

http://git-wip-us.apache.org/repos/asf/carbondata/blob/1d827c7b/common/src/test/java/org/apache/carbondata/common/StringsSuite.java
--
diff --git 
a/common/src/test/java/org/apache/carbondata/common/StringsSuite.java 
b/common/src/test/java/org/apache/carbondata/common/StringsSuite.java
new file mode 100644
index 000..65da32b
--- /dev/null
+++ b/common/src/test/java/org/apache/carbondata/common/StringsSuite.java
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at

[39/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/docs/datamap-developer-guide.md
--
diff --git a/docs/datamap-developer-guide.md b/docs/datamap-developer-guide.md
new file mode 100644
index 000..31afd34
--- /dev/null
+++ b/docs/datamap-developer-guide.md
@@ -0,0 +1,16 @@
+# DataMap Developer Guide
+
+### Introduction
+DataMap is a data structure that can be used to accelerate certain queries on a table. Different DataMaps can be implemented by developers.
+Currently, two types of DataMap are supported:
+1. IndexDataMap: a DataMap that leverages an index to accelerate filter queries
+2. MVDataMap: a DataMap that leverages a Materialized View to accelerate OLAP-style queries, such as SPJG queries (select, predicate, join, group by)
+
+### DataMap provider
+When a user issues `CREATE DATAMAP dm ON TABLE main USING 'provider'`, the corresponding DataMapProvider implementation is created and initialized.
+Currently, the provider string can be:
+1. preaggregate: a type of MVDataMap that pre-aggregates a single table
+2. timeseries: a type of MVDataMap that pre-aggregates on a time dimension of the table
+3. the class name of an IndexDataMapFactory implementation: developers can implement a new type of IndexDataMap by extending IndexDataMapFactory
+
+When a user issues `DROP DATAMAP dm ON TABLE main`, the corresponding DataMapProvider interface is called (see the sketch after this diff).
\ No newline at end of file
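
To make the provider strings concrete, a hedged Scala sketch with hypothetical table and column names (assumes a SparkSession `spark` with CarbonData support; the preaggregate syntax follows the CREATE DATAMAP form quoted above):

spark.sql("CREATE TABLE main (country STRING, quantity INT) STORED BY 'carbondata'")

// 'preaggregate' provider: an MVDataMap that pre-aggregates a single table
spark.sql(
  """CREATE DATAMAP agg_quantity ON TABLE main USING 'preaggregate' AS
    |SELECT country, SUM(quantity) FROM main GROUP BY country""".stripMargin)

// class-name provider: plugs in a developer-supplied IndexDataMapFactory
// (com.example.MyIndexDataMapFactory is a placeholder, not a real class)
spark.sql("CREATE DATAMAP custom_dm ON TABLE main USING 'com.example.MyIndexDataMapFactory'")

// DROP calls back into the corresponding DataMapProvider
spark.sql("DROP DATAMAP agg_quantity ON TABLE main")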

http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
--
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index 3bc4547..007ba2f 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -43,8 +43,7 @@ import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.exception.InvalidConfigurationException;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
 import org.apache.carbondata.core.indexstore.PartitionSpec;
-import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMap;
-import 
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory;
+import 
org.apache.carbondata.core.indexstore.blockletindex.BlockletIndexDataMapFactory;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.ColumnarFormatVersion;
 import org.apache.carbondata.core.metadata.schema.PartitionInfo;
@@ -756,7 +755,7 @@ public class CarbonTableInputFormat<T> extends FileInputFormat<Void, T> {
   DistributableDataMapFormat datamapDstr =
   new DistributableDataMapFormat(absoluteTableIdentifier, 
dataMapExprWrapper,
   segmentIds, partitionsToPrune,
-  BlockletDataMapFactory.class.getName());
+  BlockletIndexDataMapFactory.class.getName());
   prunedBlocklets = dataMapJob.execute(datamapDstr, resolver);
   // Apply expression on the blocklets.
   prunedBlocklets = dataMapExprWrapper.pruneBlocklets(prunedBlocklets);

http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
index b1962c1..f208c92 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggCreateCommand.scala
@@ -286,7 +286,7 @@ class TestPreAggCreateCommand extends QueryTest with 
BeforeAndAfterAll {
| GROUP BY dob,name
""".stripMargin)
 }
-assert(e.getMessage.contains(s"$timeSeries keyword missing"))
+assert(e.getMessage.contains("Only 'path' dmproperty is allowed for this 
datamap"))
 sql("DROP TABLE IF EXISTS maintabletime")
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateTableSelection.scala
--
diff --git 

[04/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
index 553f85e..773fbd7 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
@@ -17,20 +17,15 @@
 package org.apache.carbondata.core.scan.result.iterator;
 
 import org.apache.carbondata.common.CarbonIterator;
-import org.apache.carbondata.common.logging.LogService;
-import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.scan.result.BatchResult;
+import org.apache.carbondata.core.scan.result.RowBatch;
 
public class PartitionSpliterRawResultIterator extends CarbonIterator<Object[]> {
 
-  private CarbonIterator<BatchResult> iterator;
-  private BatchResult batch;
+  private CarbonIterator<RowBatch> iterator;
+  private RowBatch batch;
   private int counter;
 
-  private static final LogService LOGGER =
-  
LogServiceFactory.getLogService(PartitionSpliterRawResultIterator.class.getName());
-
-  public PartitionSpliterRawResultIterator(CarbonIterator<BatchResult> iterator) {
+  public PartitionSpliterRawResultIterator(CarbonIterator<RowBatch> iterator) {
 this.iterator = iterator;
   }
 
@@ -65,7 +60,7 @@ public class PartitionSpliterRawResultIterator extends CarbonIterator<Object[]>
* @param batch
* @return
*/
-  private boolean checkBatchEnd(BatchResult batch) {
+  private boolean checkBatchEnd(RowBatch batch) {
 return !(counter < batch.getSize());
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
index 70d0958..1dd1595 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
@@ -21,7 +21,7 @@ import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.keygenerator.KeyGenException;
-import org.apache.carbondata.core.scan.result.BatchResult;
+import org.apache.carbondata.core.scan.result.RowBatch;
 import org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper;
 
 /**
@@ -37,7 +37,7 @@ public class RawResultIterator extends CarbonIterator<Object[]> {
   /**
* Iterator of the Batch raw result.
*/
-  private CarbonIterator<BatchResult> detailRawQueryResultIterator;
+  private CarbonIterator<RowBatch> detailRawQueryResultIterator;
 
   /**
* Counter to maintain the row counter.
@@ -55,9 +55,9 @@ public class RawResultIterator extends CarbonIterator<Object[]> {
   /**
* batch of the result.
*/
-  private BatchResult batch;
+  private RowBatch batch;
 
-  public RawResultIterator(CarbonIterator<BatchResult> detailRawQueryResultIterator,
+  public RawResultIterator(CarbonIterator<RowBatch> detailRawQueryResultIterator,
   SegmentProperties sourceSegProperties, SegmentProperties 
destinationSegProperties) {
 this.detailRawQueryResultIterator = detailRawQueryResultIterator;
 this.sourceSegProperties = sourceSegProperties;
@@ -155,7 +155,7 @@ public class RawResultIterator extends CarbonIterator<Object[]> {
* @param batch
* @return
*/
-  private boolean checkIfBatchIsProcessedCompletely(BatchResult batch) {
+  private boolean checkIfBatchIsProcessedCompletely(RowBatch batch) {
 if (counter < batch.getSize()) {
   return false;
 } else {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/VectorDetailQueryResultIterator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/VectorDetailQueryResultIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/VectorDetailQueryResultIterator.java
index cc9710e..c7cb00d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/VectorDetailQueryResultIterator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/VectorDetailQueryResultIterator.java
@@ -35,10 +35,12 

[08/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeColGroupFilterExecuterImpl.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeColGroupFilterExecuterImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeColGroupFilterExecuterImpl.java
index 9391ebd..44f7c07 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeColGroupFilterExecuterImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeColGroupFilterExecuterImpl.java
@@ -16,20 +16,10 @@
  */
 package org.apache.carbondata.core.scan.filter.executer;
 
-import java.util.ArrayList;
 import java.util.BitSet;
-import java.util.List;
 
-import org.apache.carbondata.common.logging.LogService;
-import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.chunk.DimensionColumnDataChunk;
-import org.apache.carbondata.core.keygenerator.KeyGenException;
-import org.apache.carbondata.core.keygenerator.KeyGenerator;
-import org.apache.carbondata.core.scan.executor.infos.KeyStructureInfo;
-import org.apache.carbondata.core.scan.executor.util.QueryUtil;
 import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
-import org.apache.carbondata.core.util.ByteUtil;
 
 /**
  * It checks if filter is required on given block and if required, it does
@@ -38,12 +28,6 @@ import org.apache.carbondata.core.util.ByteUtil;
 public class ExcludeColGroupFilterExecuterImpl extends 
ExcludeFilterExecuterImpl {
 
   /**
-   * LOGGER
-   */
-  private static final LogService LOGGER =
-  
LogServiceFactory.getLogService(ExcludeColGroupFilterExecuterImpl.class.getName());
-
-  /**
* @param dimColResolvedFilterInfo
* @param segmentProperties
*/
@@ -53,54 +37,6 @@ public class ExcludeColGroupFilterExecuterImpl extends 
ExcludeFilterExecuterImpl
   }
 
   /**
-   * It fills BitSet with row index which matches filter key
-   */
-  protected BitSet getFilteredIndexes(DimensionColumnDataChunk 
dimensionColumnDataChunk,
-  int numerOfRows) {
-BitSet bitSet = new BitSet(numerOfRows);
-bitSet.flip(0, numerOfRows);
-try {
-  KeyStructureInfo keyStructureInfo = getKeyStructureInfo();
-  byte[][] filterValues = dimColumnExecuterInfo.getExcludeFilterKeys();
-  for (int i = 0; i < filterValues.length; i++) {
-byte[] filterVal = filterValues[i];
-for (int rowId = 0; rowId < numerOfRows; rowId++) {
-  byte[] colData = new 
byte[keyStructureInfo.getMaskByteRanges().length];
-  dimensionColumnDataChunk.fillChunkData(colData, 0, rowId, 
keyStructureInfo);
-  if (ByteUtil.UnsafeComparer.INSTANCE.compareTo(filterVal, colData) 
== 0) {
-bitSet.flip(rowId);
-  }
-}
-  }
-
-} catch (Exception e) {
-  LOGGER.error(e);
-}
-
-return bitSet;
-  }
-
-  /**
-   * It is required for extracting column data from columngroup chunk
-   *
-   * @return
-   * @throws KeyGenException
-   */
-  private KeyStructureInfo getKeyStructureInfo() throws KeyGenException {
-int colGrpId = getColumnGroupId(dimColEvaluatorInfo.getColumnIndex());
-KeyGenerator keyGenerator = 
segmentProperties.getColumnGroupAndItsKeygenartor().get(colGrpId);
-List mdKeyOrdinal = new ArrayList();
-mdKeyOrdinal.add(getMdkeyOrdinal(dimColEvaluatorInfo.getColumnIndex(), 
colGrpId));
-int[] maskByteRanges = 
QueryUtil.getMaskedByteRangeBasedOrdinal(mdKeyOrdinal, keyGenerator);
-byte[] maxKey = QueryUtil.getMaxKeyBasedOnOrinal(mdKeyOrdinal, 
keyGenerator);
-KeyStructureInfo restructureInfos = new KeyStructureInfo();
-restructureInfos.setKeyGenerator(keyGenerator);
-restructureInfos.setMaskByteRanges(maskByteRanges);
-restructureInfos.setMaxKey(maxKey);
-return restructureInfos;
-  }
-
-  /**
* Check if scan is required on given block based on min and max value
*/
   public BitSet isScanRequired(byte[][] blkMaxVal, byte[][] blkMinVal) {
@@ -109,25 +45,4 @@ public class ExcludeColGroupFilterExecuterImpl extends 
ExcludeFilterExecuterImpl
 return bitSet;
   }
 
-  private int getMdkeyOrdinal(int ordinal, int colGrpId) {
-return segmentProperties.getColumnGroupMdKeyOrdinal(colGrpId, ordinal);
-  }
-
-  private int getColumnGroupId(int ordinal) {
-int[][] columnGroups = segmentProperties.getColumnGroups();
-int colGrpId = -1;
-for (int i = 0; i < columnGroups.length; i++) {
-  if (columnGroups[i].length > 1) {
-colGrpId++;
-if (QueryUtil.searchInArray(columnGroups[i], ordinal)) {
-  break;
-}
-  }
-}
-return colGrpId;
-  }
-
-  public KeyGenerator getKeyGenerator(int 

[06/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanFiterExecuterImpl.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanFiterExecuterImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanFiterExecuterImpl.java
index 447ab46..547ecaa 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanFiterExecuterImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanFiterExecuterImpl.java
@@ -22,7 +22,7 @@ import java.util.List;
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.chunk.DimensionColumnDataChunk;
+import org.apache.carbondata.core.datastore.chunk.DimensionColumnPage;
 import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
 import org.apache.carbondata.core.datastore.chunk.impl.MeasureRawColumnChunk;
 import org.apache.carbondata.core.datastore.page.ColumnPage;
@@ -35,12 +35,11 @@ import org.apache.carbondata.core.metadata.encoder.Encoding;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.scan.expression.Expression;
-import 
org.apache.carbondata.core.scan.expression.exception.FilterUnsupportedException;
 import org.apache.carbondata.core.scan.filter.FilterUtil;
 import org.apache.carbondata.core.scan.filter.intf.RowIntf;
 import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
 import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
-import org.apache.carbondata.core.scan.processor.BlocksChunkHolder;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
 import org.apache.carbondata.core.util.BitSetGroup;
 import org.apache.carbondata.core.util.ByteUtil;
 import org.apache.carbondata.core.util.CarbonUtil;
@@ -73,7 +72,7 @@ public class RowLevelRangeLessThanFiterExecuterImpl extends 
RowLevelFilterExecut
   comparator = 
Comparator.getComparatorByDataTypeForMeasure(measure.getDataType());
 }
 ifDefaultValueMatchesFilter();
-if (isDimensionPresentInCurrentBlock[0] == true) {
+if (isDimensionPresentInCurrentBlock[0]) {
   isNaturalSorted = 
dimColEvaluatorInfoList.get(0).getDimension().isUseInvertedIndex()
   && dimColEvaluatorInfoList.get(0).getDimension().isSortColumn();
 }
@@ -120,11 +119,11 @@ public class RowLevelRangeLessThanFiterExecuterImpl 
extends RowLevelFilterExecut
 boolean isScanRequired = false;
 if (isMeasurePresentInCurrentBlock[0] || 
isDimensionPresentInCurrentBlock[0]) {
   if (isMeasurePresentInCurrentBlock[0]) {
-minValue = blockMinValue[measureBlocksIndex[0] + 
lastDimensionColOrdinal];
+minValue = blockMinValue[measureChunkIndex[0] + 
lastDimensionColOrdinal];
 isScanRequired =
 isScanRequired(minValue, msrFilterRangeValues, 
msrColEvalutorInfoList.get(0).getType());
   } else {
-minValue = blockMinValue[dimensionBlocksIndex[0]];
+minValue = blockMinValue[dimensionChunkIndex[0]];
 isScanRequired = isScanRequired(minValue, filterRangeValues);
   }
 } else {
@@ -170,67 +169,69 @@ public class RowLevelRangeLessThanFiterExecuterImpl 
extends RowLevelFilterExecut
   }
 
   @Override
-  public BitSetGroup applyFilter(BlocksChunkHolder blockChunkHolder, boolean 
useBitsetPipeLine)
-  throws FilterUnsupportedException, IOException {
+  public BitSetGroup applyFilter(RawBlockletColumnChunks 
rawBlockletColumnChunks,
+  boolean useBitsetPipeLine) throws IOException {
 // select all rows if dimension does not exists in the current block
 if (!isDimensionPresentInCurrentBlock[0] && 
!isMeasurePresentInCurrentBlock[0]) {
-  int numberOfRows = blockChunkHolder.getDataBlock().nodeSize();
+  int numberOfRows = rawBlockletColumnChunks.getDataBlock().numRows();
   return FilterUtil
-  
.createBitSetGroupWithDefaultValue(blockChunkHolder.getDataBlock().numberOfPages(),
+  
.createBitSetGroupWithDefaultValue(rawBlockletColumnChunks.getDataBlock().numberOfPages(),
   numberOfRows, true);
 }
 if (isDimensionPresentInCurrentBlock[0]) {
-  int blockIndex =
-  
segmentProperties.getDimensionOrdinalToBlockMapping().get(dimensionBlocksIndex[0]);
-  if (null == blockChunkHolder.getDimensionRawDataChunk()[blockIndex]) {
-blockChunkHolder.getDimensionRawDataChunk()[blockIndex] = 
blockChunkHolder.getDataBlock()
-

[26/54] [abbrv] carbondata git commit: [CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89cfd8e0/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
 
b/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
new file mode 100644
index 000..fbb93b6
--- /dev/null
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.loading.model;
+
+import java.io.IOException;
+import java.text.SimpleDateFormat;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.Maps;
+import org.apache.carbondata.common.Strings;
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.constants.LoggerAction;
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonUtil;
+import 
org.apache.carbondata.processing.loading.constants.DataLoadProcessorConstants;
+import org.apache.carbondata.processing.loading.csvinput.CSVInputFormat;
+import org.apache.carbondata.processing.loading.sort.SortScopeOptions;
+import org.apache.carbondata.processing.util.TableOptionConstant;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.hadoop.conf.Configuration;
+
+/**
+ * Builder for {@link CarbonLoadModel}
+ */
+@InterfaceAudience.Developer
+public class CarbonLoadModelBuilder {
+
+  private CarbonTable table;
+
+  public CarbonLoadModelBuilder(CarbonTable table) {
+this.table = table;
+  }
+
+  /**
+   * build CarbonLoadModel for data loading
+   * @param options Load options from user input
+   * @return a new CarbonLoadModel instance
+   */
+  public CarbonLoadModel build(
+  Map<String, String> options) throws InvalidLoadOptionException, IOException {
+Map<String, String> optionsFinal = LoadOption.fillOptionWithDefaultValue(options);
+optionsFinal.put("sort_scope", "no_sort");
+if (!options.containsKey("fileheader")) {
+  List<CarbonColumn> csvHeader = table.getCreateOrderColumn(table.getTableName());
+  String[] columns = new String[csvHeader.size()];
+  for (int i = 0; i < columns.length; i++) {
+columns[i] = csvHeader.get(i).getColName();
+  }
+  optionsFinal.put("fileheader", Strings.mkString(columns, ","));
+}
+CarbonLoadModel model = new CarbonLoadModel();
+
+// we have provided 'fileheader', so hadoopConf can be null
+build(options, optionsFinal, model, null);
+
+// set default values
+
model.setTimestampformat(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+model.setDateFormat(CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT);
+model.setUseOnePass(Boolean.parseBoolean(Maps.getOrDefault(options, 
"onepass", "false")));
+model.setDictionaryServerHost(Maps.getOrDefault(options, "dicthost", 
null));
+try {
+  
model.setDictionaryServerPort(Integer.parseInt(Maps.getOrDefault(options, 
"dictport", "-1")));
+} catch (NumberFormatException e) {
+  throw new InvalidLoadOptionException(e.getMessage());
+}
+return model;
+  }
+
+  /**
+   * build CarbonLoadModel for data loading
+   * @param options Load options from user input
+   * @param optionsFinal Load options populated with default values for optional options
+   * @param carbonLoadModel The output load model
+   * @param hadoopConf hadoopConf is needed to read the CSV header when 'fileheader'
+   *   is not set in the user-provided load options
+   */
+  public void build(
+  Map<String, String> options,
+  Map<String, String> optionsFinal,
+  CarbonLoadModel 
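
(The diff is truncated here by the mail archive.) Based on the single-argument build(Map) method shown above, a hedged usage sketch from Scala — `table` is assumed to be an already-loaded CarbonTable instance:

import java.util.{HashMap => JHashMap}
import org.apache.carbondata.core.metadata.schema.table.CarbonTable
import org.apache.carbondata.processing.loading.model.{CarbonLoadModel, CarbonLoadModelBuilder}

// may throw InvalidLoadOptionException for bad option values (e.g. dictport)
def buildModel(table: CarbonTable): CarbonLoadModel = {
  val options = new JHashMap[String, String]()
  options.put("fileheader", "name,age") // skips CSV header inference, as noted in the code
  options.put("onepass", "false")
  new CarbonLoadModelBuilder(table).build(options)
}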

[40/54] [abbrv] carbondata git commit: [CARBONDATA-2189] Add DataMapProvider developer interface

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/89a12af5/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
new file mode 100644
index 000..34e11ac
--- /dev/null
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletIndexDataMap.java
@@ -0,0 +1,971 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.indexstore.blockletindex;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataInputStream;
+import java.io.DataOutput;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.UnsupportedEncodingException;
+import java.math.BigDecimal;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.BitSet;
+import java.util.Comparator;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.cache.Cacheable;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datamap.dev.DataMapModel;
+import 
org.apache.carbondata.core.datamap.dev.cgdatamap.AbstractCoarseGrainIndexDataMap;
+import org.apache.carbondata.core.datastore.IndexKey;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.block.TableBlockInfo;
+import org.apache.carbondata.core.indexstore.BlockMetaInfo;
+import org.apache.carbondata.core.indexstore.Blocklet;
+import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
+import org.apache.carbondata.core.indexstore.PartitionSpec;
+import org.apache.carbondata.core.indexstore.UnsafeMemoryDMStore;
+import org.apache.carbondata.core.indexstore.row.DataMapRow;
+import org.apache.carbondata.core.indexstore.row.DataMapRowImpl;
+import org.apache.carbondata.core.indexstore.schema.CarbonRowSchema;
+import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.metadata.blocklet.BlockletInfo;
+import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
+import org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex;
+import org.apache.carbondata.core.metadata.blocklet.index.BlockletMinMaxIndex;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.scan.filter.FilterExpressionProcessor;
+import org.apache.carbondata.core.scan.filter.FilterUtil;
+import org.apache.carbondata.core.scan.filter.executer.FilterExecuter;
+import 
org.apache.carbondata.core.scan.filter.executer.ImplicitColumnFilterExecutor;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.util.ByteUtil;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.DataFileFooterConverter;
+import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.xerial.snappy.Snappy;
+
+/**
+ * Datamap implementation for blocklet.
+ */
+public class BlockletIndexDataMap extends AbstractCoarseGrainIndexDataMap 
implements Cacheable {
+
+  private static final LogService LOGGER =
+  LogServiceFactory.getLogService(BlockletIndexDataMap.class.getName());
+
+  private static int KEY_INDEX = 0;
+
+  private static int MIN_VALUES_INDEX = 1;
+
+  private static int MAX_VALUES_INDEX = 2;
+
+  private static int ROW_COUNT_INDEX = 3;
+
+  private static int FILE_PATH_INDEX = 4;
+
+  private static 

[32/54] [abbrv] carbondata git commit: [CARBONDATA-1114][Tests] Fix bugs in tests in windows env

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/859d71c1/processing/src/test/java/org/apache/carbondata/lcm/locks/LocalFileLockTest.java
--
diff --git 
a/processing/src/test/java/org/apache/carbondata/lcm/locks/LocalFileLockTest.java
 
b/processing/src/test/java/org/apache/carbondata/lcm/locks/LocalFileLockTest.java
index 812db5c..8d5f3d4 100644
--- 
a/processing/src/test/java/org/apache/carbondata/lcm/locks/LocalFileLockTest.java
+++ 
b/processing/src/test/java/org/apache/carbondata/lcm/locks/LocalFileLockTest.java
@@ -65,7 +65,7 @@ public class LocalFileLockTest {
 
 Assert.assertTrue(localLock1.unlock());
 Assert.assertTrue(localLock2.lock());
-
+Assert.assertTrue(localLock2.unlock());
   }
 
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/859d71c1/processing/src/test/java/org/apache/carbondata/processing/loading/csvinput/CSVInputFormatTest.java
--
diff --git 
a/processing/src/test/java/org/apache/carbondata/processing/loading/csvinput/CSVInputFormatTest.java
 
b/processing/src/test/java/org/apache/carbondata/processing/loading/csvinput/CSVInputFormatTest.java
index d89f10d..30d9da4 100644
--- 
a/processing/src/test/java/org/apache/carbondata/processing/loading/csvinput/CSVInputFormatTest.java
+++ 
b/processing/src/test/java/org/apache/carbondata/processing/loading/csvinput/CSVInputFormatTest.java
@@ -176,6 +176,7 @@ public class CSVInputFormatTest extends TestCase {
 FileOutputFormat.setOutputPath(job, new Path(output.getCanonicalPath()));
 
 Assert.assertTrue(job.waitForCompletion(true));
+deleteOutput(output);
   }
 
   private void prepareConf(Configuration conf) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/859d71c1/store/sdk/pom.xml
--
diff --git a/store/sdk/pom.xml b/store/sdk/pom.xml
index 54fba55..b3dd464 100644
--- a/store/sdk/pom.xml
+++ b/store/sdk/pom.xml
@@ -7,7 +7,7 @@
   
 org.apache.carbondata
 carbondata-parent
-1.3.0-SNAPSHOT
+1.4.0-SNAPSHOT
 ../../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/859d71c1/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
--
diff --git 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
index dc5696a..df6afc6 100644
--- 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
+++ 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
@@ -23,7 +23,7 @@ import java.util.UUID;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
 import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
-import org.apache.carbondata.processing.loading.csvinput.StringArrayWritable;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
 import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
 
 import org.apache.hadoop.conf.Configuration;
@@ -42,9 +42,9 @@ import 
org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
 @InterfaceAudience.Internal
 class CSVCarbonWriter extends CarbonWriter {
 
-  private RecordWriter<NullWritable, StringArrayWritable> recordWriter;
+  private RecordWriter<NullWritable, ObjectArrayWritable> recordWriter;
   private TaskAttemptContext context;
-  private StringArrayWritable writable;
+  private ObjectArrayWritable writable;
 
   CSVCarbonWriter(CarbonLoadModel loadModel) throws IOException {
 Configuration hadoopConf = new Configuration();
@@ -57,7 +57,7 @@ class CSVCarbonWriter extends CarbonWriter {
 TaskAttemptContextImpl context = new TaskAttemptContextImpl(hadoopConf, 
attemptID);
 this.recordWriter = format.getRecordWriter(context);
 this.context = context;
-this.writable = new StringArrayWritable();
+this.writable = new ObjectArrayWritable();
   }
 
   /**



[10/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/SegmentInfo.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/SegmentInfo.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/SegmentInfo.java
index 0cb2918..099fffd 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/SegmentInfo.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/blocklet/SegmentInfo.java
@@ -29,31 +29,12 @@ public class SegmentInfo implements Serializable {
   private static final long serialVersionUID = -174987462709431L;
 
   /**
-   * number of column in the segment
-   */
-  private int numberOfColumns;
-
-  /**
* cardinality of each columns
* column which is not participating in the multidimensional key cardinality 
will be -1;
*/
   private int[] columnCardinality;
 
   /**
-   * @return the numberOfColumns
-   */
-  public int getNumberOfColumns() {
-return numberOfColumns;
-  }
-
-  /**
-   * @param numberOfColumns the numberOfColumns to set
-   */
-  public void setNumberOfColumns(int numberOfColumns) {
-this.numberOfColumns = numberOfColumns;
-  }
-
-  /**
* @return the columnCardinality
*/
   public int[] getColumnCardinality() {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index 6036569..d17d865 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -19,7 +19,13 @@ package org.apache.carbondata.core.metadata.schema.table;
 
 import java.io.IOException;
 import java.io.Serializable;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
@@ -33,7 +39,10 @@ import 
org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
 import 
org.apache.carbondata.core.metadata.schema.table.column.CarbonImplicitDimension;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.core.scan.model.QueryProjection;
 import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.DataTypeConverter;
 import org.apache.carbondata.core.util.DataTypeUtil;
 import org.apache.carbondata.core.util.path.CarbonTablePath;
 
@@ -136,10 +145,7 @@ public class CarbonTable implements Serializable {
   /**
* During creation of TableInfo from hivemetastore the DataMapSchemas and 
the columns
* DataTypes are not converted to the appropriate child classes.
-   *
* This method will cast the same to the appropriate classes
-   *
-   * @param tableInfo
*/
   public static void updateTableInfo(TableInfo tableInfo) {
 List dataMapSchemas = new ArrayList<>();
@@ -153,8 +159,9 @@ public class CarbonTable implements Serializable {
 }
 tableInfo.setDataMapSchemaList(dataMapSchemas);
 for (ColumnSchema columnSchema : 
tableInfo.getFactTable().getListOfColumns()) {
-  columnSchema.setDataType(DataTypeUtil.valueOf(columnSchema.getDataType(),
-  columnSchema.getPrecision(), columnSchema.getScale()));
+  columnSchema.setDataType(
+  DataTypeUtil.valueOf(
+  columnSchema.getDataType(), columnSchema.getPrecision(), 
columnSchema.getScale()));
 }
 List childSchema = tableInfo.getDataMapSchemaList();
 for (DataMapSchema dataMapSchema : childSchema) {
@@ -168,10 +175,11 @@ public class CarbonTable implements Serializable {
   }
 }
 if (tableInfo.getFactTable().getBucketingInfo() != null) {
-  for (ColumnSchema columnSchema : tableInfo.getFactTable()
-  .getBucketingInfo().getListOfColumns()) {
-
columnSchema.setDataType(DataTypeUtil.valueOf(columnSchema.getDataType(),
-columnSchema.getPrecision(), columnSchema.getScale()));
+  for (ColumnSchema columnSchema :
+  tableInfo.getFactTable().getBucketingInfo().getListOfColumns()) {
+columnSchema.setDataType(
+DataTypeUtil.valueOf(
+columnSchema.getDataType(), columnSchema.getPrecision(), 

[02/54] [abbrv] carbondata git commit: [CARBONDATA-2099] Refactor query scan process to improve readability

2018-03-08 Thread ravipesala
http://git-wip-us.apache.org/repos/asf/carbondata/blob/daa64650/core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
--
diff --git 
a/core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java 
b/core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
index f4450e3..5f8d199 100644
--- a/core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
+++ b/core/src/test/java/org/apache/carbondata/core/util/CarbonUtilTest.java
@@ -31,7 +31,7 @@ import java.util.Map;
 
 import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.block.TableBlockInfo;
-import 
org.apache.carbondata.core.datastore.chunk.impl.FixedLengthDimensionDataChunk;
+import 
org.apache.carbondata.core.datastore.chunk.impl.FixedLengthDimensionColumnPage;
 import org.apache.carbondata.core.datastore.columnar.ColumnGroupModel;
 import org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
@@ -45,7 +45,7 @@ import org.apache.carbondata.core.metadata.encoder.Encoding;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
-import org.apache.carbondata.core.scan.model.QueryDimension;
+import org.apache.carbondata.core.scan.model.ProjectionDimension;
 
 import mockit.Mock;
 import mockit.MockUp;
@@ -267,8 +267,8 @@ public class CarbonUtilTest {
   @Test public void testToGetNextLesserValue() {
 byte[] dataChunks = { 5, 6, 7, 8, 9 };
 byte[] compareValues = { 7 };
-FixedLengthDimensionDataChunk fixedLengthDataChunk =
-new FixedLengthDimensionDataChunk(dataChunks, null, null, 5, 1);
+FixedLengthDimensionColumnPage fixedLengthDataChunk =
+new FixedLengthDimensionColumnPage(dataChunks, null, null, 5, 1);
 int result = CarbonUtil.nextLesserValueToTarget(2, fixedLengthDataChunk, 
compareValues);
 assertEquals(result, 1);
   }
@@ -276,8 +276,8 @@ public class CarbonUtilTest {
   @Test public void testToGetNextLesserValueToTarget() {
 byte[] dataChunks = { 7, 7, 7, 8, 9 };
 byte[] compareValues = { 7 };
-FixedLengthDimensionDataChunk fixedLengthDataChunk =
-new FixedLengthDimensionDataChunk(dataChunks, null, null, 5, 1);
+FixedLengthDimensionColumnPage fixedLengthDataChunk =
+new FixedLengthDimensionColumnPage(dataChunks, null, null, 5, 1);
 int result = CarbonUtil.nextLesserValueToTarget(2, fixedLengthDataChunk, 
compareValues);
 assertEquals(result, -1);
   }
@@ -285,8 +285,8 @@ public class CarbonUtilTest {
   @Test public void testToGetnextGreaterValue() {
 byte[] dataChunks = { 5, 6, 7, 8, 9 };
 byte[] compareValues = { 7 };
-FixedLengthDimensionDataChunk fixedLengthDataChunk =
-new FixedLengthDimensionDataChunk(dataChunks, null, null, 5, 1);
+FixedLengthDimensionColumnPage fixedLengthDataChunk =
+new FixedLengthDimensionColumnPage(dataChunks, null, null, 5, 1);
 int result = CarbonUtil.nextGreaterValueToTarget(2, fixedLengthDataChunk, compareValues, 5);
 assertEquals(result, 3);
   }
@@ -302,8 +302,8 @@ public class CarbonUtilTest {
   @Test public void testToGetnextGreaterValueToTarget() {
 byte[] dataChunks = { 5, 6, 7, 7, 7 };
 byte[] compareValues = { 7 };
-FixedLengthDimensionDataChunk fixedLengthDataChunk =
-new FixedLengthDimensionDataChunk(dataChunks, null, null, 5, 1);
+FixedLengthDimensionColumnPage fixedLengthDataChunk =
+new FixedLengthDimensionColumnPage(dataChunks, null, null, 5, 1);
 int result = CarbonUtil.nextGreaterValueToTarget(2, fixedLengthDataChunk, compareValues, 5);
 assertEquals(result, 5);
   }
@@ -525,23 +525,23 @@ public class CarbonUtilTest {
   }
 
   @Test public void testToGetDictionaryEncodingArray() {
-QueryDimension column1 = new QueryDimension("Column1");
-QueryDimension column2 = new QueryDimension("Column2");
 ColumnSchema column1Schema = new ColumnSchema();
 ColumnSchema column2Schema = new ColumnSchema();
 column1Schema.setColumnName("Column1");
 List<Encoding> encoding = new ArrayList<>();
 encoding.add(Encoding.DICTIONARY);
 column1Schema.setEncodingList(encoding);
-column1.setDimension(new CarbonDimension(column1Schema, 1, 1, 1, 1));
+ProjectionDimension column1 =
+new ProjectionDimension(new CarbonDimension(column1Schema, 1, 1, 1, 1));
 
 column2Schema.setColumnName("Column2");
 List<Encoding> encoding2 = new ArrayList<>();
 encoding2.add(Encoding.DELTA);
 column2Schema.setEncodingList(encoding2);
-column2.setDimension(new CarbonDimension(column2Schema, 1, 1, 1, 1));
+ProjectionDimension column2 =
+new ProjectionDimension(new CarbonDimension(column2Schema, 1, 1, 1, 1));
 
-
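
Summarized, the hunks above amount to two renames in the scan API: FixedLengthDimensionDataChunk becomes FixedLengthDimensionColumnPage, and the mutable QueryDimension (constructed from a column name, then completed via setDimension) becomes ProjectionDimension, constructed directly from a CarbonDimension. A minimal before/after sketch using only the constructors visible in this diff:

  // Before (removed lines): a name-keyed holder completed by a setter.
  //   QueryDimension column1 = new QueryDimension("Column1");
  //   column1.setDimension(new CarbonDimension(column1Schema, 1, 1, 1, 1));
  //   FixedLengthDimensionDataChunk chunk =
  //       new FixedLengthDimensionDataChunk(dataChunks, null, null, 5, 1);

  // After (added lines): the dimension is supplied up front, and the page
  // class name says what the object actually holds.
  ProjectionDimension column1 =
      new ProjectionDimension(new CarbonDimension(column1Schema, 1, 1, 1, 1));
  FixedLengthDimensionColumnPage page =
      new FixedLengthDimensionColumnPage(dataChunks, null, null, 5, 1);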

[28/54] [abbrv] carbondata git commit: [CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module

2018-03-08 Thread ravipesala
[CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module

To allow assembling the store-sdk module into its own JAR, it should not 
depend on the carbon-spark module

This closes #1970
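
The diffstat below shows the mechanics: user-facing exceptions formerly defined under integration/spark-common (org.apache.carbondata.spark.exception) move into the dependency-light common module (org.apache.carbondata.common.exceptions.sql), so store-sdk code can reference them without a carbon-spark dependency. A minimal sketch of what the move looks like to a caller; the SdkCaller class and its validate method are hypothetical, only the import paths come from the file list, and the usual single-String constructor is assumed:

  // Hypothetical caller, shown only to illustrate the package move; the old
  // import is what such code would have used before this patch.
  //   import org.apache.carbondata.spark.exception.MalformedCarbonCommandException;
  import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException;

  public class SdkCaller {
    // store-sdk code can now throw this type without pulling in carbon-spark
    void validate(String loadOption) throws MalformedCarbonCommandException {
      if (loadOption == null) {
        throw new MalformedCarbonCommandException("load option must not be null");
      }
    }
  }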


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/89cfd8e0
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/89cfd8e0
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/89cfd8e0

Branch: refs/heads/master
Commit: 89cfd8e06e783c610ff8cf28e86c8501b450da37
Parents: 1d827c7
Author: Jacky Li 
Authored: Sun Feb 11 21:37:04 2018 +0800
Committer: ravipesala 
Committed: Thu Mar 8 22:21:10 2018 +0530

--
 .../java/org/apache/carbondata/common/Maps.java |  39 ++
 .../org/apache/carbondata/common/Strings.java   |   3 +
 .../exceptions/TableStatusLockException.java|  34 ++
 .../sql/InvalidLoadOptionException.java |  33 +
 .../sql/MalformedCarbonCommandException.java|  75 +++
 .../sql/MalformedDataMapCommandException.java   |  37 ++
 .../exceptions/sql/NoSuchDataMapException.java  |  39 ++
 .../carbondata/core/datamap/TableDataMap.java   |   5 +-
 .../exception/ConcurrentOperationException.java |  50 ++
 .../statusmanager/SegmentStatusManager.java | 124 
 .../carbondata/core/util/DeleteLoadFolders.java | 210 +++
 .../preaggregate/TestPreAggCreateCommand.scala  |   2 +-
 .../preaggregate/TestPreAggregateDrop.scala |   2 +-
 .../timeseries/TestTimeSeriesCreateTable.scala  |   2 +-
 .../timeseries/TestTimeSeriesDropSuite.scala|   2 +-
 .../TestTimeseriesTableSelection.scala  |   2 +-
 .../TestDataLoadWithColumnsMoreThanSchema.scala |   3 +-
 .../dataload/TestGlobalSortDataLoad.scala   |   2 +-
 .../TestLoadDataWithDiffTimestampFormat.scala   |   2 +-
 .../TestLoadDataWithFileHeaderException.scala   |  11 +-
 ...ataWithMalformedCarbonCommandException.scala |   3 +-
 .../testsuite/dataload/TestLoadOptions.scala|   2 +-
 .../dataload/TestTableLevelBlockSize.scala  |   4 +-
 .../testsuite/datamap/TestDataMapCommand.scala  |   2 +-
 .../dataretention/DataRetentionTestCase.scala   |   2 +-
 .../spark/testsuite/datetype/DateTypeTest.scala |   2 +-
 .../testsuite/sortcolumns/TestSortColumns.scala |   3 +-
 integration/spark-common/pom.xml|   5 -
 .../exception/ConcurrentOperationException.java |  38 --
 .../MalformedCarbonCommandException.java|  69 ---
 .../MalformedDataMapCommandException.java   |  32 -
 .../spark/exception/NoSuchDataMapException.java |  33 -
 .../org/apache/carbondata/api/CarbonStore.scala |   3 +-
 .../spark/CarbonColumnValidator.scala   |   8 +-
 .../carbondata/spark/load/ValidateUtil.scala|  72 ---
 .../carbondata/spark/rdd/CarbonMergerRDD.scala  |   6 +-
 .../carbondata/spark/util/CommonUtil.scala  |  70 +--
 .../carbondata/spark/util/DataLoadingUtil.scala | 610 ---
 .../spark/util/GlobalDictionaryUtil.scala   |   2 +-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala |   2 +-
 .../spark/rdd/CarbonDataRDDFactory.scala|   4 +-
 .../spark/rdd/CarbonTableCompactor.scala|   2 +-
 .../org/apache/spark/sql/CarbonSource.scala |   2 +-
 .../datamap/CarbonCreateDataMapCommand.scala|   2 +-
 .../datamap/CarbonDropDataMapCommand.scala  |   2 +-
 .../CarbonAlterTableCompactionCommand.scala |  13 +-
 .../management/CarbonLoadDataCommand.scala  |  17 +-
 .../CarbonProjectForDeleteCommand.scala |   2 +-
 .../CarbonProjectForUpdateCommand.scala |   2 +-
 .../command/mutation/IUDCommonUtil.scala|   2 +-
 .../CreatePreAggregateTableCommand.scala|   7 +-
 .../preaaggregate/PreAggregateUtil.scala|   2 +-
 .../schema/CarbonAlterTableRenameCommand.scala  |   3 +-
 .../command/timeseries/TimeSeriesUtil.scala |   2 +-
 .../datasources/CarbonFileFormat.scala  |  14 +-
 .../sql/execution/strategy/DDLStrategy.scala|   2 +-
 .../strategy/StreamingTableStrategy.scala   |   2 +-
 .../execution/command/CarbonHiveCommands.scala  |   2 +-
 .../sql/parser/CarbonSpark2SqlParser.scala  |   2 +-
 .../spark/sql/parser/CarbonSparkSqlParser.scala |   2 +-
 .../org/apache/spark/util/AlterTableUtil.scala  |   2 +-
 .../org/apache/spark/util/TableAPIUtil.scala|   2 +-
 .../spark/sql/hive/CarbonSessionState.scala |   7 +-
 .../segmentreading/TestSegmentReading.scala |   2 +-
 .../spark/util/AllDictionaryTestCase.scala  |   4 +-
 .../util/ExternalColumnDictionaryTestCase.scala |   6 +-
 .../TestStreamingTableOperation.scala   |   4 +-
 .../bucketing/TableBucketingTestCase.scala  |   2 +-
 .../vectorreader/AddColumnTestCases.scala   |   2 +-
 .../loading/model/CarbonLoadModel.java  |  14 +-
 .../loading/model/CarbonLoadModelBuilder.java   | 322 ++
 .../processing/loading/model/LoadOption.java| 251