Re: Unable to perform compaction

2016-10-26 Thread prabhatkashyap
Hi Liang,

Sorry for the late reply.

*For auto compaction:*

I've set my default threshold for compaction and set 


and for forced compaction I used the *ALTER* query



But in both cases it shows me an error






--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Unable-to-perform-compaction-tp2099p2349.html
Sent from the Apache CarbonData Mailing List archive at Nabble.com.


[jira] [Created] (CARBONDATA-338) Remove the method arguments as they are never used inside the method

2016-10-26 Thread Shivansh (JIRA)
Shivansh created CARBONDATA-338:
---

 Summary: Remove the method arguments as they are never used inside 
the method
 Key: CARBONDATA-338
 URL: https://issues.apache.org/jira/browse/CARBONDATA-338
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Reporter: Shivansh






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-carbondata pull request #208: [CARBONDATA-284][WIP] Abstracting in...

2016-10-26 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/208#discussion_r85061025
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/internal/index/memory/InMemoryBTreeIndex.java
 ---
@@ -0,0 +1,214 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.carbondata.hadoop.internal.index.memory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.carbon.datastore.DataRefNode;
+import org.apache.carbondata.core.carbon.datastore.DataRefNodeFinder;
+import org.apache.carbondata.core.carbon.datastore.IndexKey;
+import org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore;
+import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex;
+import org.apache.carbondata.core.carbon.datastore.block.BlockletInfos;
+import org.apache.carbondata.core.carbon.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException;
+import 
org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder;
+import 
org.apache.carbondata.core.carbon.datastore.impl.btree.BlockBTreeLeafNode;
+import org.apache.carbondata.core.carbon.querystatistics.QueryStatistic;
+import 
org.apache.carbondata.core.carbon.querystatistics.QueryStatisticsConstants;
+import 
org.apache.carbondata.core.carbon.querystatistics.QueryStatisticsRecorder;
+import org.apache.carbondata.core.keygenerator.KeyGenException;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.internal.index.Index;
+import org.apache.carbondata.hadoop.internal.segment.Segment;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+import 
org.apache.carbondata.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.scan.filter.FilterExpressionProcessor;
+import org.apache.carbondata.scan.filter.FilterUtil;
+import org.apache.carbondata.scan.filter.resolver.FilterResolverIntf;
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.JobContext;
+
+class InMemoryBTreeIndex implements Index {
--- End diff --

As I understand it, InMemoryBTreeIndex is a segment-level index.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #208: [CARBONDATA-284][WIP] Abstracting in...

2016-10-26 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/208#discussion_r85061184
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/internal/index/memory/InMemoryBTreeIndex.java
 ---
@@ -0,0 +1,220 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.carbondata.hadoop.internal.index.memory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.carbon.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.carbon.datastore.DataRefNode;
+import org.apache.carbondata.core.carbon.datastore.DataRefNodeFinder;
+import org.apache.carbondata.core.carbon.datastore.IndexKey;
+import org.apache.carbondata.core.carbon.datastore.SegmentTaskIndexStore;
+import org.apache.carbondata.core.carbon.datastore.block.AbstractIndex;
+import org.apache.carbondata.core.carbon.datastore.block.BlockletInfos;
+import org.apache.carbondata.core.carbon.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.carbon.datastore.exception.IndexBuilderException;
+import 
org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder;
+import 
org.apache.carbondata.core.carbon.datastore.impl.btree.BlockBTreeLeafNode;
+import org.apache.carbondata.core.carbon.querystatistics.QueryStatistic;
+import 
org.apache.carbondata.core.carbon.querystatistics.QueryStatisticsConstants;
+import 
org.apache.carbondata.core.carbon.querystatistics.QueryStatisticsRecorder;
+import org.apache.carbondata.core.keygenerator.KeyGenException;
+import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.internal.index.Index;
+import org.apache.carbondata.hadoop.internal.segment.Segment;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+import 
org.apache.carbondata.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.scan.filter.FilterExpressionProcessor;
+import org.apache.carbondata.scan.filter.FilterUtil;
+import org.apache.carbondata.scan.filter.resolver.FilterResolverIntf;
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.JobContext;
+
+class InMemoryBTreeIndex implements Index {
+
+  private static final Log LOG = 
LogFactory.getLog(InMemoryBTreeIndex.class);
+  private Segment segment;
+
+  InMemoryBTreeIndex(Segment segment) {
+this.segment = segment;
+  }
+
+  @Override
+  public String getName() {
+return null;
+  }
+
+  @Override
+  public List filter(JobContext job, FilterResolverIntf filter)
+  throws IOException {
+
+List result = new LinkedList();
+
+FilterExpressionProcessor filterExpressionProcessor = new 
FilterExpressionProcessor();
+
+AbsoluteTableIdentifier absoluteTableIdentifier = null;
+
//CarbonInputFormatUtil.getAbsoluteTableIdentifier(job.getConfiguration());
+
+//for this segment fetch blocks matching filter in BTree
+List dataRefNodes = null;
+try {
+  dataRefNodes = getDataBlocksOfSegment(job, 
filterExpressionProcessor, absoluteTableIdentifier,
+  filter, segment.getId());
+} catch (IndexBuilderException e) {
+  throw new IOException(e.getMessage());
+}
+for (DataRefNode dataRefNode : dataRefNodes) {
+  BlockBTreeLeafNode leafNode = (BlockBTreeLeafNode) dataRefNode;
+  TableBlockInfo tableBlockInfo = leafNode.getTableBlockInfo();
+  result.add(new CarbonInputSplit(segme

[GitHub] incubator-carbondata pull request #258: [CARBONDATA-338] Removed the unused ...

2016-10-26 Thread shiv4nsh
GitHub user shiv4nsh opened a pull request:

https://github.com/apache/incubator-carbondata/pull/258

[CARBONDATA-338] Removed the unused value inside the method

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[CARBONDATA-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).
 - [ ] Testing done
 
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- What manual testing you have done?
- Any additional information to help reviewers in testing this 
change.
 
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
 
---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shiv4nsh/incubator-carbondata 
improvement/CARBONDATA-338

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/258.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #258


commit 97cdfdc6bd4fc112253437628683d8fbdaab8c6f
Author: Knoldus 
Date:   2016-10-26T08:01:35Z

Removed the unused value inside the method






[GitHub] incubator-carbondata pull request #250: CARBONDATA-330: Fix compiler warning...

2016-10-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/250




[GitHub] incubator-carbondata pull request #258: [CARBONDATA-338] Removed the unused ...

2016-10-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/258




[GitHub] incubator-carbondata pull request #257: [CARBONDATA-337] Inverted Index Spel...

2016-10-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/257




[jira] [Created] (CARBONDATA-339) Align storePath name in generateGlobalDictionary() of GlobalDictionaryUtil.scala

2016-10-26 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-339:
-

 Summary: Align storePath name in generateGlobalDictionary() of 
GlobalDictionaryUtil.scala
 Key: CARBONDATA-339
 URL: https://issues.apache.org/jira/browse/CARBONDATA-339
 Project: CarbonData
  Issue Type: Bug
Reporter: Liang Chen
Assignee: Liang Chen
Priority: Trivial


Align storePath name in generateGlobalDictionary() of 
GlobalDictionaryUtil.scala: Change all "hdfsLocation" to "storePath".

It can support any path, not only an HDFS path, so the name needs to change.





B-Tree LRU cache (New Feature)

2016-10-26 Thread Mohammad shahid khan
Hi All,
Please find the problem and proposed solution.

*B-Tree LRU Cache:*

Problem:

CarbonData maintains two levels of B-Tree cache, one at the driver level
and another at the executor level. Currently CarbonData has a mechanism to
invalidate the segment and block caches for invalid table segments, but
there is no eviction policy for unused cached objects. So once memory is
fully utilized, the system will not be able to process any new requests.

*Solution:*

The caches maintained at the driver level and at the executor level are
bound to contain objects that are not currently in use. Therefore the
system should have the following mechanism:

1.   Set a maximum memory limit up to which objects can be held in
memory.

2.   When the configured memory limit is reached, identify the cached
objects that are not currently in use, so that the required memory can be
freed without impacting running processes.

3.   Evict only until the required memory is available.
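
The three points above can be sketched roughly as follows. This is only an illustrative sketch, not the design from the attachments: the class name `LruCacheSketch`, the `acquire`/`release` methods, and the caller-supplied `sizeInBytes` estimate are all assumptions made for the example and are not CarbonData APIs.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a memory-bounded LRU cache that never evicts in-use entries. */
class LruCacheSketch<K, V> {

  /** A cached value, its estimated size, and how many callers currently use it. */
  private static final class Entry<V> {
    final V value;
    final long sizeInBytes;
    int accessCount;

    Entry(V value, long sizeInBytes) {
      this.value = value;
      this.sizeInBytes = sizeInBytes;
    }
  }

  private final long maxBytes; // point 1: the configured memory limit
  private long usedBytes;
  // accessOrder = true: iteration goes from least to most recently used.
  private final LinkedHashMap<K, Entry<V>> entries = new LinkedHashMap<>(16, 0.75f, true);

  LruCacheSketch(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /** Caches the value; returns false if enough memory cannot be freed. */
  synchronized boolean put(K key, V value, long sizeInBytes) {
    if (entries.containsKey(key)) {
      return true; // already cached, keep the existing entry
    }
    if (!evictUntilFits(sizeInBytes)) {
      return false;
    }
    entries.put(key, new Entry<>(value, sizeInBytes));
    usedBytes += sizeInBytes;
    return true;
  }

  /** Returns the value and marks it in use, or null if absent. Pair with release(). */
  synchronized V acquire(K key) {
    Entry<V> entry = entries.get(key); // get() also marks the entry as recently used
    if (entry == null) {
      return null;
    }
    entry.accessCount++;
    return entry.value;
  }

  /** Marks one use as finished; the entry becomes evictable once nobody uses it. */
  synchronized void release(K key) {
    Entry<V> entry = entries.get(key);
    if (entry != null && entry.accessCount > 0) {
      entry.accessCount--;
    }
  }

  synchronized boolean contains(K key) {
    return entries.containsKey(key);
  }

  synchronized long usedBytes() {
    return usedBytes;
  }

  /** Points 2 and 3: evict idle entries, oldest first, only until 'required' bytes fit. */
  private boolean evictUntilFits(long required) {
    long freeable = 0;
    for (Entry<V> entry : entries.values()) {
      if (entry.accessCount == 0) {
        freeable += entry.sizeInBytes;
      }
    }
    if (usedBytes - freeable + required > maxBytes) {
      return false; // not enough idle entries, so evict nothing
    }
    Iterator<Map.Entry<K, Entry<V>>> it = entries.entrySet().iterator();
    while (usedBytes + required > maxBytes && it.hasNext()) {
      Entry<V> candidate = it.next().getValue();
      if (candidate.accessCount == 0) {
        it.remove();
        usedBytes -= candidate.sizeInBytes;
      }
    }
    return true;
  }
}
```

In this sketch a query would call `acquire` before reading a cached B-Tree and `release` afterwards, so that eviction skips anything a running query still holds. Checking the freeable total before evicting means a request that cannot be satisfied leaves the cache untouched.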

For details please refer to attachments.


Regards.

Shahid


[GitHub] incubator-carbondata pull request #259: Fix constants and method names

2016-10-26 Thread Zhangshunyu
GitHub user Zhangshunyu opened a pull request:

https://github.com/apache/incubator-carbondata/pull/259

Fix constants and method names

## Why raise this PR?
To rename some constants and method names whose purpose is unclear. For example:
It is hard to tell what the parameter 'carbon.number.of.cores' is used 
for: cores for what?
It is hard to tell what the method 'getNumberOfCores' is used for: query 
cores or load cores?
etc.
## How to test?
Pass all the test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Zhangshunyu/incubator-carbondata constants

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #259


commit 8a3c1b4758a93d7e5b7c1d983f9a9309995f4c79
Author: Zhangshunyu 
Date:   2016-10-26T13:53:21Z

Fix constans






[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85157225
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/writer/DataWriterProcessorStepImpl.java
 ---
@@ -0,0 +1,360 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.carbondata.processing.newflow.steps.writer;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.carbon.CarbonTableIdentifier;
+import org.apache.carbondata.core.carbon.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.carbon.metadata.CarbonMetadata;
+import org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable;
+import 
org.apache.carbondata.core.carbon.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.carbon.path.CarbonStorePath;
+import org.apache.carbondata.core.carbon.path.CarbonTablePath;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.IgnoreDictionary;
+import org.apache.carbondata.core.keygenerator.KeyGenerator;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.datatypes.GenericDataType;
+import 
org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep;
+import 
org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration;
+import org.apache.carbondata.processing.newflow.DataField;
+import 
org.apache.carbondata.processing.newflow.constants.DataLoadProcessorConstants;
+import 
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.newflow.row.CarbonRow;
+import org.apache.carbondata.processing.newflow.row.CarbonRowBatch;
+import org.apache.carbondata.processing.store.CarbonDataFileAttributes;
+import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel;
+import org.apache.carbondata.processing.store.CarbonFactHandler;
+import org.apache.carbondata.processing.store.CarbonFactHandlerFactory;
+import 
org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+
+/**
+ * It reads data from sorted files which are generated in previous sort 
step.
+ * And it writes data to carbondata file. It also generates mdk key while 
writing to carbondata file
+ */
+public class DataWriterProcessorStepImpl extends 
AbstractDataLoadProcessorStep {
+
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(DataWriterProcessorStepImpl.class.getName());
+
+  private String storeLocation;
+
+  private boolean[] isUseInvertedIndex;
+
+  private int[] dimLens;
+
+  private int dimensionCount;
+
+  private List wrapperColumnSchema;
+
+  private int[] colCardinality;
+
+  private SegmentProperties segmentProperties;
+
+  private KeyGenerator keyGenerator;
+
+  private CarbonFactHandler dataHandler;
+
+  private Map complexIndexMap;
+
+  private int noDictionaryCount;
+
+  private int complexDimensionCount;
+
+  private int measureCount;
+
+  private long readCounter;
+
+  private long writeCounter;
+
+  private int measureIndex = 
IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex();
+
+  private int noDimByteArrayIndex = 
IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex();
+
+  private int dimsArrayIndex = 
IgnoreDictionary.DIMENSION_INDEX_I

[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85159146
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactHandlerFactory.java
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.carbondata.processing.store;
+
+/**
+ * Factory class for CarbonFactHandler.
+ */
+public final class CarbonFactHandlerFactory {
+
+  /**
+   * Creating fact handler to write data.
+   * @param model
+   * @param handlerType
+   * @return
+   */
+  public static CarbonFactHandler 
createCarbonFactHandler(CarbonFactDataHandlerModel model,
--- End diff --

One question: why is a semaphore needed in 
`CarbonFactDataHandlerColumnar.addDataToStore`?




[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85159483
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -304,4 +311,92 @@ public static String getLocalDataFolderLocation(String 
databaseName, String tabl
 return ArrayUtils
 .toPrimitive(noDictionaryMapping.toArray(new 
Boolean[noDictionaryMapping.size()]));
   }
+
+  /**
+   * Preparing the boolean [] to map whether the dimension use inverted 
index or not.
+   */
+  public static boolean[] getIsUseInvertedIndex(DataField[] fields) {
+List isUseInvertedIndexList = new ArrayList();
+for (DataField field : fields) {
+  if (field.getColumn().isUseInvertedIndnex() && 
field.getColumn().isDimesion()) {
+isUseInvertedIndexList.add(true);
+  } else if(field.getColumn().isDimesion()){
+isUseInvertedIndexList.add(false);
+  }
+}
+return ArrayUtils
+.toPrimitive(isUseInvertedIndexList.toArray(new 
Boolean[isUseInvertedIndexList.size()]));
+  }
+
+  private static String getComplexTypeString(DataField[] dataFields) {
+StringBuilder dimString = new StringBuilder();
+for (int i = 0; i < dataFields.length; i++) {
+  DataField dataField = dataFields[i];
+  if (dataField.getColumn().getDataType().equals(DataType.ARRAY) || 
dataField.getColumn()
+  .getDataType().equals(DataType.STRUCT)) {
+addAllComplexTypeChildren((CarbonDimension) dataField.getColumn(), 
dimString, "");
+dimString.append(CarbonCommonConstants.SEMICOLON_SPC_CHARACTER);
+  }
+}
+return dimString.toString();
+  }
+
+  /**
+   * This method will return all the child dimensions under complex 
dimension
+   *
+   */
+  private static void addAllComplexTypeChildren(CarbonDimension dimension, 
StringBuilder dimString,
+  String parent) {
+dimString.append(
+dimension.getColName() + CarbonCommonConstants.COLON_SPC_CHARACTER 
+ dimension.getDataType()
--- End diff --

Please change the `+` concatenation to chained `append` calls.
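
A minimal sketch of the suggested change, using a hypothetical helper (`AppendSketch.describe` and its parameters are made up for illustration and are not part of the PR):

```java
/** Hypothetical helper showing chained append() calls instead of '+' inside append(). */
class AppendSketch {

  static String describe(String colName, String separator, String dataType) {
    StringBuilder dimString = new StringBuilder();
    // Instead of: dimString.append(colName + separator + dataType);
    // which builds a temporary String before appending it.
    dimString.append(colName).append(separator).append(dataType);
    return dimString.toString();
  }
}
```

Both forms produce the same text; the chained form simply avoids the intermediate concatenation.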




[GitHub] incubator-carbondata pull request #260: Add one FQA in Readme

2016-10-26 Thread bill1208
GitHub user bill1208 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/260

Add one FQA in Readme


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bill1208/incubator-carbondata master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/260.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #260


commit 4ba5bf16383bbca15d7273457c083b3c65137d34
Author: 周广成 
Date:   2016-10-15T16:10:04Z

Create test1

commit cedf7c30dd3a9fec7b9a92ff9d4fac180da73b74
Author: 周广成 
Date:   2016-10-15T16:10:33Z

Create test2

commit d7dfa6ee53b89bee65768333ff1a9e622a35eab5
Author: 周广成 
Date:   2016-10-15T16:11:18Z

Create parameter01

commit 47b4c09e643a41fbb910bf826d8e129844795a46
Author: 周广成 
Date:   2016-10-15T16:11:51Z

Create parameter02

commit f6e778ad3681321b08fbe9d4e160a2933916d245
Author: 周广成 
Date:   2016-10-15T16:13:01Z

Update test1

commit 15f3a9988a3d44efe3e640c6ec295b5fef6cfa64
Author: 周广成 
Date:   2016-10-16T11:58:15Z

Delete parameter01

commit d53a0db63014c85d248dfc3647fb3a96a81e1b72
Author: 周广成 
Date:   2016-10-16T11:58:24Z

Delete parameter02

commit a963ff4552855605ec851e9b8c9750a9b45a6f49
Author: 周广成 
Date:   2016-10-16T11:58:33Z

Delete test1

commit aa256de5ed4e4efbbb654bf0c9e81cc92db7ced5
Author: 周广成 
Date:   2016-10-16T11:58:42Z

Delete test2

commit 0becfb6ce038fecb220c5ce270843bddfa9acfc1
Author: 周广成 
Date:   2016-10-15T16:10:04Z

Create test1

commit 351bc337402d62da0252ade3c6b060fbff1e0631
Author: 周广成 
Date:   2016-10-15T16:10:33Z

Create test2

commit 8aca24399b04e2fb695c91a0ff801a3d5814904a
Author: 周广成 
Date:   2016-10-15T16:11:18Z

Create parameter01

commit 6d7cf10a986680309cb271e059cf71a190892157
Author: 周广成 
Date:   2016-10-15T16:11:51Z

Create parameter02

commit e0eb15f70689cbbcce4e267f6f7f104f0d349874
Author: 周广成 
Date:   2016-10-15T16:13:01Z

Update test1

commit 4a194152d56f917ca972d93396c62efb76ae9c93
Author: 周广成 
Date:   2016-10-16T11:58:15Z

Delete parameter01

commit eb035a2ee99bb84c63736a00d4da44598fe55477
Author: 周广成 
Date:   2016-10-16T11:58:24Z

Delete parameter02

commit f6eb83092c27c26eec79ceb98c76c0db1e0035fc
Author: 周广成 
Date:   2016-10-16T11:58:33Z

Delete test1

commit 1eca56cfdb3300f043fec8759431ab2adc3f65ff
Author: 周广成 
Date:   2016-10-16T11:58:42Z

Delete test2

commit 94b1d55abaf1659ff15cc54232fb2da5877604e4
Author: bill1208 
Date:   2016-10-18T17:04:42Z

add the doc suggestion to create the carbondata table

commit 914512e9b6069c1138af41de1956cac59d86853f
Author: bill1208 
Date:   2016-10-18T17:05:17Z

merge branch 'master' of https://github.com/bill1208/incubator-carbondata

2016-10-19 01:10:35: I need merger the code from githup bill1208 to my
local master

commit 514f09c0e56a38a087becfb992ed264d2f5450a1
Author: bill1208 
Date:   2016-10-18T17:19:44Z

add the doc suggestion to create the carbondata table

commit bb2048e01562aeb0103cfa3bdfa390911d478d4f
Author: bill1208 
Date:   2016-10-22T17:13:58Z

add the suggestion file

commit 1e8a72ce3b1c14c3269243a51e452a890b39208b
Author: bill1208 
Date:   2016-10-22T17:18:55Z

modify the ruby

commit be031df64b82b4ea900c3a5773a96383ba767a47
Author: bill1208 
Date:   2016-10-22T17:21:19Z

finish all the ruby formatted

commit 6b3eab2dd9ed06512daaee62eee1854d3b72a7da
Author: bill1208 
Date:   2016-10-22T17:23:26Z

modify the last sentense

commit 254d2d20ce85a7b71482435674a241a838d38470
Author: 周广成 
Date:   2016-10-22T17:29:00Z

Delete Suggestion-To-Create-Carbon-Table.md

commit 298171c8304f5f

Podling Report Reminder - November 2016

2016-10-26 Thread johndament
Dear podling,

This email was sent by an automated system on behalf of the Apache
Incubator PMC. It is an initial reminder to give you plenty of time to
prepare your quarterly board report.

The board meeting is scheduled for Wed, 16 November 2016, 10:30 am PDT.
The report for your podling will form a part of the Incubator PMC
report. The Incubator PMC requires your report to be submitted 2 weeks
before the board meeting, to allow sufficient time for review and
submission (Wed, November 02).

Please submit your report with sufficient time to allow the Incubator
PMC, and subsequently board members to review and digest. Again, the
very latest you should submit your report is 2 weeks prior to the board
meeting.

Thanks,

The Apache Incubator PMC

Submitting your Report

--

Your report should contain the following:

*   Your project name
*   A brief description of your project, which assumes no knowledge of
the project or necessarily of its field
*   A list of the three most important issues to address in the move
towards graduation.
*   Any issues that the Incubator PMC or ASF Board might wish/need to be
aware of
*   How has the community developed since the last report
*   How has the project developed since the last report.

This should be appended to the Incubator Wiki page at:

http://wiki.apache.org/incubator/November2016

Note: This is manually populated. You may need to wait a little before
this page is created from a template.

Mentors
---

Mentors should review reports for their project(s) and sign them off on
the Incubator wiki page. Signing off reports shows that you are
following the project - projects that are not signed may raise alarms
for the Incubator PMC.

Incubator PMC


[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-26 Thread lion-x
Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r85248921
  
--- Diff: 
processing/src/test/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/TimeStampDirectDictionaryGeneratorTest.java
 ---
@@ -37,7 +37,7 @@
   private int surrogateKey = -1;
 
   @Before public void setUp() throws Exception {
-TimeStampDirectDictionaryGenerator generator = TimeStampDirectDictionaryGenerator.instance;
+TimeStampDirectDictionaryGenerator generator = new TimeStampDirectDictionaryGenerator(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
--- End diff --

This is a test file, so I think TimeStampDirectDictionaryGenerator 
should be constructed with 'CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT' 
for testing. Please check again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-26 Thread lion-x
Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r85249559
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/TimeStampDirectDictionaryGenerator.java
 ---
@@ -39,37 +39,32 @@
  */
 public class TimeStampDirectDictionaryGenerator implements 
DirectDictionaryGenerator {
 
-  private TimeStampDirectDictionaryGenerator() {
+  private ThreadLocal threadLocal = new ThreadLocal<>();
 
-  }
-
-  public static TimeStampDirectDictionaryGenerator instance =
-  new TimeStampDirectDictionaryGenerator();
+  private String dateFormat;
 
   /**
* The value of 1 unit of the SECOND, MINUTE, HOUR, or DAY in millis.
*/
-  public static final long granularityFactor;
+  public  long granularityFactor;
   /**
* The date timestamp to be considered as start date for calculating the 
timestamp
* java counts the number of milliseconds from  start of "January 1, 
1970", this property is
* customized the start of position. for example "January 1, 2000"
*/
-  public static final long cutOffTimeStamp;
+  public  long cutOffTimeStamp;
   /**
* Logger instance
*/
+
   private static final LogService LOGGER =
-  
LogServiceFactory.getLogService(TimeStampDirectDictionaryGenerator.class.getName());
+  
LogServiceFactory.getLogService(TimeStampDirectDictionaryGenerator.class.getName());
--- End diff --

done
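
The diff above drops the shared singleton in favor of one generator per date format, with a ThreadLocal field alongside it. The pattern might look roughly like the following sketch. TimestampKeySketch is a hypothetical stand-in, not the CarbonData class, and it assumes the ThreadLocal holds a SimpleDateFormat, which the quoted diff does not show. UTC is fixed only so the result does not depend on the machine's time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical stand-in for the generator in the diff; not the CarbonData class.
final class TimestampKeySketch {
  private final String dateFormat;
  // SimpleDateFormat is not thread-safe, so each thread lazily builds its own parser.
  private final ThreadLocal<SimpleDateFormat> parser = new ThreadLocal<>();

  TimestampKeySketch(String dateFormat) {
    this.dateFormat = dateFormat;
  }

  /** Parses text to epoch millis, or returns -1 when it does not match the format. */
  long toMillis(String text) {
    SimpleDateFormat format = parser.get();
    if (format == null) {
      format = new SimpleDateFormat(dateFormat);
      format.setLenient(false);
      format.setTimeZone(TimeZone.getTimeZone("UTC"));
      parser.set(format);
    }
    try {
      return format.parse(text).getTime();
    } catch (ParseException e) {
      return -1L;
    }
  }

  public static void main(String[] args) {
    // Two instances with different formats can coexist, unlike with a singleton.
    TimestampKeySketch iso = new TimestampKeySketch("yyyy-MM-dd HH:mm:ss");
    TimestampKeySketch slashed = new TimestampKeySketch("dd/MM/yyyy");
    System.out.println(iso.toMillis("2016-10-26 00:00:00"));
    System.out.println(slashed.toMillis("26/10/2016"));
  }
}
```

Because the format is an instance field rather than static state, a test can construct the generator with whatever format it needs and does not have to rely on global properties.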




[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-26 Thread lion-x
Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r85250472
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenMeta.java
 ---
@@ -651,6 +654,7 @@ public void setDefault() {
 columnSchemaDetails = "";
 columnsDataTypeString="";
 tableOption = "";
+dateFormat = CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT;
--- End diff --

ok




[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-26 Thread lion-x
Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r85255469
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenStep.java
 ---
@@ -470,6 +474,36 @@ public boolean processRow(StepMetaInterface smi, 
StepDataInterface sdi) throws K
   break;
   }
 }
+HashMap dateformatsHashMap = new HashMap();
+if (meta.dateFormat != null) {
+  String[] dateformats = meta.dateFormat.split(",");
+  for (String dateFormat:dateformats) {
+String[] dateFormatSplits = dateFormat.split(":", 2);
+
dateformatsHashMap.put(dateFormatSplits[0],dateFormatSplits[1]);
+// TODO  verify the dateFormatSplits is valid or not
+  }
+}
+directDictionaryGenerators =
+new 
DirectDictionaryGenerator[meta.getDimensionColumnIds().length];
+for (int i = 0; i < meta.getDimensionColumnIds().length; i++) {
+  ColumnSchemaDetails columnSchemaDetails = 
columnSchemaDetailsWrapper.get(
+  meta.getDimensionColumnIds()[i]);
+  if (columnSchemaDetails.isDirectDictionary()) {
+if 
(dateformatsHashMap.containsKey(columnSchemaDetails.getColumnName())) {
+  directDictionaryGenerators[i] =
+  
DirectDictionaryKeyGeneratorFactory.getDirectDictionaryGenerator(
+  columnSchemaDetails.getColumnType(),
+  
dateformatsHashMap.get(columnSchemaDetails.getColumnName()));
+} else {
+  String dateFormat = CarbonProperties.getInstance()
+  
.getProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+  
CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+  directDictionaryGenerators[i] =
+  
DirectDictionaryKeyGeneratorFactory.getDirectDictionaryGenerator(
+  columnSchemaDetails.getColumnType(), 
dateFormat);
--- End diff --

OK
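
The loop in the diff above splits a "column:format,column:format" option and leaves validation as a TODO. One way that validation could be done is sketched below. DateFormatOptionSketch and its parse method are hypothetical names, not part of CarbonData, and the check relies on the SimpleDateFormat constructor rejecting illegal pattern letters. Splitting on "," means a pattern that itself contains a comma is not supported in this sketch.

```java
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating validation of a per-column date format option.
final class DateFormatOptionSketch {

  /** Parses "column:pattern,column:pattern" into a map, rejecting malformed entries. */
  static Map<String, String> parse(String option) {
    Map<String, String> formats = new HashMap<>();
    if (option == null || option.trim().isEmpty()) {
      return formats;
    }
    for (String entry : option.split(",")) {
      // Limit of 2 keeps colons inside the pattern (e.g. HH:mm:ss) intact.
      String[] parts = entry.split(":", 2);
      if (parts.length != 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Expected column:pattern but found: " + entry);
      }
      String pattern = parts[1].trim();
      // The constructor throws IllegalArgumentException for an invalid pattern.
      new SimpleDateFormat(pattern);
      formats.put(parts[0].trim(), pattern);
    }
    return formats;
  }

  public static void main(String[] args) {
    System.out.println(parse("date1:yyyy-MM-dd HH:mm:ss,date2:dd/MM/yyyy"));
  }
}
```

Failing fast here avoids the ArrayIndexOutOfBoundsException that dateFormatSplits[1] would raise for an entry without a colon.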




[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-26 Thread lion-x
Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r85256460
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenMeta.java
 ---
@@ -111,7 +110,7 @@
   /**
* timeFormat
*/
-  protected SimpleDateFormat timeFormat;
+  protected String dateFormat;
--- End diff --

ok





[GitHub] incubator-carbondata pull request #261: fix issue carbondata-339

2016-10-26 Thread hseagle
GitHub user hseagle opened a pull request:

https://github.com/apache/incubator-carbondata/pull/261

fix issue carbondata-339

Fix JIRA issue CARBONDATA-339: replace hdfsLocation with storePath in the 
function generateGlobalDictionary.


https://issues.apache.org/jira/browse/CARBONDATA-339

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hseagle/incubator-carbondata carbondata-339

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #261


commit 64d4d6daaf6e8adede6cfffe94221d20f365631c
Author: hseagle 
Date:   2016-10-27T02:55:53Z

fix issue carbondata-339






[GitHub] incubator-carbondata pull request #262: [CARBONDATA-308] [WIP] Use CarbonInp...

2016-10-26 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/incubator-carbondata/pull/262

[CARBONDATA-308] [WIP] Use CarbonInputFormat in CarbonScanRDD compute

Use CarbonInputFormat in CarbonScanRDD compute function

1. On the driver side only getSplits is called, and it needs only the 
filter condition. There is no need to build a full QueryModel object there, 
so creation of the QueryModel moves from the driver to the executor.
2. CarbonScanRDD.compute now calls CarbonInputFormat.createRecordReader 
instead of using QueryExecutor directly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata scanrdd

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit ef4a889db9b86653c273794c9a810a9cd9683437
Author: jackylk 
Date:   2016-10-22T18:43:53Z

use CarbonInputFormat in executor

commit a5c17f523c7127b538cc2d384cbff4fa454a007a
Author: jackylk 
Date:   2016-10-27T04:01:36Z

modify getPartition






[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85267443
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactHandlerFactory.java
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.carbondata.processing.store;
+
+/**
+ * Factory class for CarbonFactHandler.
+ */
+public final class CarbonFactHandlerFactory {
+
+  /**
+   * Creating fact handler to write data.
+   * @param model
+   * @param handlerType
+   * @return
+   */
+  public static CarbonFactHandler 
createCarbonFactHandler(CarbonFactDataHandlerModel model,
--- End diff --

Yes, I don't see the advantage of using a semaphore here because we 
already use a fixed thread pool to control the number of threads. I will 
discuss with the team and confirm whether it is needed.
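
To illustrate the point about the fixed thread pool: the pool size alone caps how many tasks run at once, so a semaphore with the same limit adds no further bound. The sketch below measures this. FixedPoolBoundSketch and peakConcurrency are hypothetical names and the code is unrelated to the actual CarbonFactHandler implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: measures the peak number of tasks running at the same time.
final class FixedPoolBoundSketch {

  /** Runs `tasks` short jobs on a pool of `threads` workers and returns peak concurrency. */
  static int peakConcurrency(int threads, int tasks) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int i = 0; i < tasks; i++) {
        futures.add(pool.submit(() -> {
          // Record how many tasks are active right now and keep the maximum seen.
          int now = running.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          try {
            Thread.sleep(5);
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            running.decrementAndGet();
          }
        }));
      }
      // Wait for every task so the peak value is final.
      for (Future<?> future : futures) {
        future.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    // The printed peak never exceeds the pool size of 3, even with 20 queued tasks.
    System.out.println(peakConcurrency(3, 20));
  }
}
```

A semaphore would still be useful if the goal were to limit something other than thread count, for example the number of submitted but unfinished tasks held in the queue.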




[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85267495
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/writer/DataWriterProcessorStepImpl.java
 ---
@@ -0,0 +1,360 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.carbondata.processing.newflow.steps.writer;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.carbon.CarbonTableIdentifier;
+import org.apache.carbondata.core.carbon.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.carbon.metadata.CarbonMetadata;
+import org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable;
+import 
org.apache.carbondata.core.carbon.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.carbon.path.CarbonStorePath;
+import org.apache.carbondata.core.carbon.path.CarbonTablePath;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.IgnoreDictionary;
+import org.apache.carbondata.core.keygenerator.KeyGenerator;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.datatypes.GenericDataType;
+import 
org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep;
+import 
org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration;
+import org.apache.carbondata.processing.newflow.DataField;
+import 
org.apache.carbondata.processing.newflow.constants.DataLoadProcessorConstants;
+import 
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.newflow.row.CarbonRow;
+import org.apache.carbondata.processing.newflow.row.CarbonRowBatch;
+import org.apache.carbondata.processing.store.CarbonDataFileAttributes;
+import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel;
+import org.apache.carbondata.processing.store.CarbonFactHandler;
+import org.apache.carbondata.processing.store.CarbonFactHandlerFactory;
+import 
org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+
+/**
+ * It reads data from sorted files which are generated in previous sort 
step.
+ * And it writes data to carbondata file. It also generates mdk key while 
writing to carbondata file
+ */
+public class DataWriterProcessorStepImpl extends 
AbstractDataLoadProcessorStep {
+
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(DataWriterProcessorStepImpl.class.getName());
+
+  private String storeLocation;
+
+  private boolean[] isUseInvertedIndex;
+
+  private int[] dimLens;
+
+  private int dimensionCount;
+
+  private List wrapperColumnSchema;
+
+  private int[] colCardinality;
+
+  private SegmentProperties segmentProperties;
+
+  private KeyGenerator keyGenerator;
+
+  private CarbonFactHandler dataHandler;
+
+  private Map complexIndexMap;
+
+  private int noDictionaryCount;
+
+  private int complexDimensionCount;
+
+  private int measureCount;
+
+  private long readCounter;
+
+  private long writeCounter;
+
+  private int measureIndex = 
IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex();
+
+  private int noDimByteArrayIndex = 
IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex();
+
+  private int dimsArrayIndex = 
IgnoreDictionary.DIMENSION_INDE

[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85270229
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -304,4 +311,92 @@ public static String getLocalDataFolderLocation(String 
databaseName, String tabl
 return ArrayUtils
 .toPrimitive(noDictionaryMapping.toArray(new 
Boolean[noDictionaryMapping.size()]));
   }
+
+  /**
+   * Preparing the boolean [] to map whether the dimension use inverted 
index or not.
+   */
+  public static boolean[] getIsUseInvertedIndex(DataField[] fields) {
+List isUseInvertedIndexList = new ArrayList();
+for (DataField field : fields) {
+  if (field.getColumn().isUseInvertedIndnex() && 
field.getColumn().isDimesion()) {
+isUseInvertedIndexList.add(true);
+  } else if(field.getColumn().isDimesion()){
+isUseInvertedIndexList.add(false);
+  }
+}
+return ArrayUtils
+.toPrimitive(isUseInvertedIndexList.toArray(new 
Boolean[isUseInvertedIndexList.size()]));
+  }
+
+  private static String getComplexTypeString(DataField[] dataFields) {
+StringBuilder dimString = new StringBuilder();
+for (int i = 0; i < dataFields.length; i++) {
+  DataField dataField = dataFields[i];
+  if (dataField.getColumn().getDataType().equals(DataType.ARRAY) || 
dataField.getColumn()
+  .getDataType().equals(DataType.STRUCT)) {
+addAllComplexTypeChildren((CarbonDimension) dataField.getColumn(), 
dimString, "");
+dimString.append(CarbonCommonConstants.SEMICOLON_SPC_CHARACTER);
+  }
+}
+return dimString.toString();
+  }
+
+  /**
+   * This method will return all the child dimensions under complex 
dimension
+   *
+   */
+  private static void addAllComplexTypeChildren(CarbonDimension dimension, 
StringBuilder dimString,
+  String parent) {
+dimString.append(
+dimension.getColName() + CarbonCommonConstants.COLON_SPC_CHARACTER 
+ dimension.getDataType()
--- End diff --

ok




[GitHub] incubator-carbondata pull request #251: [CARBONDATA-302]Added Writer process...

2016-10-26 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/251#discussion_r85270264
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/writer/DataWriterProcessorStepImpl.java
 ---
@@ -0,0 +1,360 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.carbondata.processing.newflow.steps.writer;
+
+import java.io.File;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.carbon.CarbonTableIdentifier;
+import org.apache.carbondata.core.carbon.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.carbon.metadata.CarbonMetadata;
+import org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable;
+import 
org.apache.carbondata.core.carbon.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.carbon.path.CarbonStorePath;
+import org.apache.carbondata.core.carbon.path.CarbonTablePath;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.IgnoreDictionary;
+import org.apache.carbondata.core.keygenerator.KeyGenerator;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.datatypes.GenericDataType;
+import 
org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep;
+import 
org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration;
+import org.apache.carbondata.processing.newflow.DataField;
+import 
org.apache.carbondata.processing.newflow.constants.DataLoadProcessorConstants;
+import 
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.newflow.row.CarbonRow;
+import org.apache.carbondata.processing.newflow.row.CarbonRowBatch;
+import org.apache.carbondata.processing.store.CarbonDataFileAttributes;
+import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel;
+import org.apache.carbondata.processing.store.CarbonFactHandler;
+import org.apache.carbondata.processing.store.CarbonFactHandlerFactory;
+import 
org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+
+/**
+ * It reads data from sorted files which are generated in previous sort 
step.
+ * And it writes data to carbondata file. It also generates mdk key while 
writing to carbondata file
+ */
+public class DataWriterProcessorStepImpl extends 
AbstractDataLoadProcessorStep {
+
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(DataWriterProcessorStepImpl.class.getName());
+
+  private String storeLocation;
+
+  private boolean[] isUseInvertedIndex;
+
+  private int[] dimLens;
+
+  private int dimensionCount;
+
+  private List wrapperColumnSchema;
+
+  private int[] colCardinality;
+
+  private SegmentProperties segmentProperties;
+
+  private KeyGenerator keyGenerator;
+
+  private CarbonFactHandler dataHandler;
+
+  private Map complexIndexMap;
+
+  private int noDictionaryCount;
+
+  private int complexDimensionCount;
+
+  private int measureCount;
+
+  private long readCounter;
+
+  private long writeCounter;
+
+  private int measureIndex = 
IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex();
+
+  private int noDimByteArrayIndex = 
IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex();
+
+  private int dimsArrayIndex = 
IgnoreDictionary.DIMENSION_INDE