[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat edited a comment on pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#issuecomment-669844507


   @akkio-97: please state the limitations and TODOs clearly:
   a. arrays with local dictionary cannot be read yet
   b. arrays containing other complex types are not supported yet
   c. currently, arrays are filled row by row, which is not really vector processing; an offset vector could be used, as in ORC:
   
https://github.com/prestosql/presto/blob/master/presto-orc/src/main/java/io/prestosql/orc/reader/ListColumnReader.java
   
   I also feel ArrayStreamReader and some interfaces need to be cleaned up [I will 
do it with struct support]
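
   To illustrate the offset-vector idea in point (c): the sketch below is a hypothetical, self-contained example, not code from CarbonData or Presto. It assumes an array column can be held as one flat child vector plus an offsets array with one more entry than there are rows, so that row i covers the child positions from offsets[i] up to, but not including, offsets[i + 1]. The class and method names (OffsetVectorSketch, buildOffsets, flatten, rowAt) are made up for this example.

```java
import java.util.Arrays;

/** Hypothetical illustration of an offsets-based array layout; not CarbonData or Presto code. */
final class OffsetVectorSketch {

  /** Builds an offsets array of length rows + 1; row i spans [offsets[i], offsets[i + 1]). */
  static int[] buildOffsets(int[][] rows) {
    int[] offsets = new int[rows.length + 1];
    for (int i = 0; i < rows.length; i++) {
      offsets[i + 1] = offsets[i] + rows[i].length;
    }
    return offsets;
  }

  /** Copies every row into a single flat child vector, using one bulk copy per row. */
  static int[] flatten(int[][] rows, int[] offsets) {
    int[] values = new int[offsets[rows.length]];
    for (int i = 0; i < rows.length; i++) {
      System.arraycopy(rows[i], 0, values, offsets[i], rows[i].length);
    }
    return values;
  }

  /** Recovers one row from the flat child vector and the offsets. */
  static int[] rowAt(int[] values, int[] offsets, int row) {
    return Arrays.copyOfRange(values, offsets[row], offsets[row + 1]);
  }

  public static void main(String[] args) {
    int[][] rows = {{1, 2, 3}, {}, {4, 5}};
    int[] offsets = buildOffsets(rows);
    int[] values = flatten(rows, offsets);
    System.out.println(Arrays.toString(offsets));
    System.out.println(Arrays.toString(values));
    System.out.println(Arrays.toString(rowAt(values, offsets, 2)));
  }
}
```

   With this layout the child values can be filled in bulk and the array boundaries are carried by the offsets alone, rather than building one array object per row. Whether it maps cleanly onto the existing stream readers is an open question for the PR.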



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819#issuecomment-670327844


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3649/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819#issuecomment-670327499


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1910/
   







[GitHub] [carbondata] QiangCai commented on a change in pull request #3883: [CARBONDATA-3940] CommitTask fails due to Rename IOException during L…

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3883:
URL: https://github.com/apache/carbondata/pull/3883#discussion_r466786922



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoHadoopFsRelationCommand.scala
##
@@ -104,11 +104,13 @@ case class CarbonInsertIntoHadoopFsRelationCommand(
 val dynamicPartitionOverwrite = enableDynamicOverwrite && mode == SaveMode.Overwrite &&
 staticPartitions.size < partitionColumns.length
 
-val committer = FileCommitProtocol.instantiate(
-  sparkSession.sessionState.conf.fileCommitProtocolClass,
-  jobId = java.util.UUID.randomUUID().toString,
-  outputPath = outputPath.toString,
-  dynamicPartitionOverwrite = dynamicPartitionOverwrite)
+val committer = fileFormat match {

Review comment:
   It is better to check in DDLStrategy whether the table is a carbondata or a carbonfile table.
   
   If it is a carbonfile table, it should not go through the CarbonInsertIntoHadoopFsRelationCommand flow.









[GitHub] [carbondata] QiangCai commented on a change in pull request #3874: [CARBONDATA-3931]Fix Secondary index with index column as DateType giving wrong results

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3874:
URL: https://github.com/apache/carbondata/pull/3874#discussion_r466781511



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/query/SecondaryIndexQueryResultProcessor.java
##
@@ -249,10 +249,17 @@ private void processResult(List> 
detailQueryResultItera
   private Object[] prepareRowObjectForSorting(Object[] row) {
 ByteArrayWrapper wrapper = (ByteArrayWrapper) row[0];
 // ByteBuffer[] noDictionaryBuffer = new ByteBuffer[noDictionaryCount];
-
 List dimensions = segmentProperties.getDimensions();
 Object[] preparedRow = new Object[dimensions.size() + measureCount];
 
+// get dictionary values for date type
+byte[] dictionaryKey = wrapper.getDictionaryKey();
+int[] keyArray = ByteUtil.convertBytesToIntArray(dictionaryKey);
+Object[] dictionaryValues = new Object[dimensionColumnCount + measureCount];
+for (int i = 0; i < keyArray.length; i++) {
+  dictionaryValues[i] = keyArray[i];

Review comment:
   Why is this copy needed?









[GitHub] [carbondata] QiangCai commented on pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


QiangCai commented on pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#issuecomment-670270759


   Please create an issue in JIRA describing the problem.







[GitHub] [carbondata] QiangCai commented on a change in pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#discussion_r466769303



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonRelation.scala
##
@@ -207,7 +207,7 @@ case class CarbonRelation(
   null != validSeg.getLoadMetadataDetails.getIndexSize) {
 size = size + validSeg.getLoadMetadataDetails.getDataSize.toLong +
 validSeg.getLoadMetadataDetails.getIndexSize.toLong
-  } else {
+  } else if (!carbonTable.isHivePartitionTable) {

Review comment:
   It is better to find the root cause and fix it. LoadMetadataDetails should have the data/index size, except for an old store.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670150976


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3647/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670145319


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1908/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [WIP] Support Presto with IndexSserver

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670143028


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3646/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [WIP] Support Presto with IndexSserver

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-670095485


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1907/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-670082507


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3643/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-670079892


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1904/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#issuecomment-670072418


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3645/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#issuecomment-670071253


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1906/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3877: [CARBONDATA-3889] Cleanup duplicated code in carbondata-spark module

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3877:
URL: https://github.com/apache/carbondata/pull/3877#issuecomment-670062362


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3641/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670056335


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670055404


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1901/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3877: [CARBONDATA-3889] Cleanup duplicated code in carbondata-spark module

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3877:
URL: https://github.com/apache/carbondata/pull/3877#issuecomment-670053317


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1902/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-670051650


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3640/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [WIP] Handling the addition of geo column to hive at the time of table creation.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3879:
URL: https://github.com/apache/carbondata/pull/3879#issuecomment-670005097


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3638/
   







[jira] [Created] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation

2020-08-06 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3943:
-

 Summary:  Handling the addition of geo column to hive at the time 
of table creation
 Key: CARBONDATA-3943
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


 Handling the addition of geo column to hive at the time of table creation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation

2020-08-06 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3943:
--
Priority: Minor  (was: Major)

>  Handling the addition of geo column to hive at the time of table creation
> --
>
> Key: CARBONDATA-3943
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>
>  Handling the addition of geo column to hive at the time of table creation





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [WIP] Handling the addition of geo column to hive at the time of table creation.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3879:
URL: https://github.com/apache/carbondata/pull/3879#issuecomment-669984068


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1899/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [WIP] Support Presto with IndexSserver

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-669968186


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1898/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [WIP] Support Presto with IndexSserver

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-669965675


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3637/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-669964959


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1897/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-669960797


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3636/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669948010


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669945651


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3635/
   







[GitHub] [carbondata] dependabot[bot] commented on pull request #3456: Bump solr.version from 6.3.0 to 8.3.0 in /datamap/lucene

2020-08-06 Thread GitBox


dependabot[bot] commented on pull request #3456:
URL: https://github.com/apache/carbondata/pull/3456#issuecomment-669932089


   Dependabot tried to update this pull request, but something went wrong. 
We're looking into it, but in the meantime you can retry the update by 
commenting `@dependabot rebase`.







[GitHub] [carbondata] dependabot[bot] commented on pull request #3447: Bump dep.jackson.version from 2.6.5 to 2.10.1 in /store/sdk

2020-08-06 Thread GitBox


dependabot[bot] commented on pull request #3447:
URL: https://github.com/apache/carbondata/pull/3447#issuecomment-669932215


   Dependabot tried to update this pull request, but something went wrong. 
We're looking into it, but in the meantime you can retry the update by 
commenting `@dependabot rebase`.







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466417493



##
File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/readers/ArrayStreamReader.java
##
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import io.prestosql.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+
+/**
+ * Class to read the Array Stream
+ */
+
+public class ArrayStreamReader extends CarbonColumnVectorImpl implements PrestoVectorBlockBuilder {
+
+  protected int batchSize;
+
+  protected Type type;
+  protected BlockBuilder builder;
+  Block childBlock = null;
+  private int index = 0;
+
+  public ArrayStreamReader(int batchSize, DataType dataType, StructField field) {
+super(batchSize, dataType);
+this.batchSize = batchSize;
+this.type = getArrayOfType(field, dataType);
+ArrayList childrenList= new ArrayList<>();
+childrenList.add(CarbonVectorBatch.createDirectStreamReader(this.batchSize, field.getDataType(), field));
+setChildrenVector(childrenList);
+this.builder = type.createBlockBuilder(null, batchSize);
+  }
+
+  public int getIndex() {
+return index;
+  }
+
+  public void setIndex(int index) {
+this.index = index;
+  }
+
+  public String getDataTypeName() {
+return "ARRAY";
+  }
+
+  Type getArrayOfType(StructField field, DataType dataType) {
+if (dataType == DataTypes.STRING) {
+  return new ArrayType(VarcharType.VARCHAR);
+} else if (dataType == DataTypes.BYTE) {
+  return new ArrayType(TinyintType.TINYINT);
+} else if (dataType == DataTypes.SHORT) {
+  return new ArrayType(SmallintType.SMALLINT);
+} else if (dataType == DataTypes.INT) {
+  return new ArrayType(IntegerType.INTEGER);
+} else if (dataType == DataTypes.LONG) {
+  return new ArrayType(BigintType.BIGINT);
+} else if (dataType == DataTypes.DOUBLE) {
+  return new ArrayType(DoubleType.DOUBLE);
+} else if (dataType == DataTypes.FLOAT) {
+  return new ArrayType(RealType.REAL);
+} else if (dataType == DataTypes.BOOLEAN) {
+  return new ArrayType(BooleanType.BOOLEAN);
+} else if (dataType == DataTypes.TIMESTAMP) {
+  return new ArrayType(TimestampType.TIMESTAMP);
+} else if (DataTypes.isArrayType(dataType)) {
+  StructField childField = field.getChildren().get(0);
+  return new ArrayType(getArrayOfType(childField, childField.getDataType()));
+} else {
+  throw new UnsupportedOperationException("Unsupported type: " + dataType);
+}
+  }
+
+  @Override
+  public Block buildBlock() {
+return builder.build();
+  }
+
+  public boolean isComplex() {
+return true;
+  }
+
+  @Override
+  public void setBatchSize(int batchSize) {
+this.batchSize = batchSize;
+  }
+
+  @Override
+  public void putObject(int rowId, Object value) {
+if (value == null) {
+  putNull(rowId);
+} else {
+  getChildrenVector().get(0).putObject(rowId, value);
+}
+  }
+
+  public void putArrayObject() {
+if (DataTypes.isArrayType(this.getType())) {
+  childBlock = ((ArrayStreamReader) getChildrenVector().get(0)).buildBlock();
+} else if (this.getType() == DataTypes.STRING) {
+  childBlock = ((SliceStreamReader) getChildrenVector().get(0)).buildBlock();
+} else if (this.getType() == DataTypes.INT) {
+  childBlock = ((IntegerStreamReader) getChildrenVector().get(0)).buildBlock();
+} else if (this.getType() == DataTypes.LONG) {
+  childBlock = ((LongStreamReader) getChildrenVector().get(0)).buildBlock();
+} else if (this.getType() == DataTypes.DOUBLE) {
+  childBlock = ((DoubleStreamReader) 

[GitHub] [carbondata] asfgit closed pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


asfgit closed pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872


   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669915582


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1896/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#issuecomment-66990


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3634/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#issuecomment-669907190


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1895/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3883: [CARBONDATA-3940] CommitTask fails due to Rename IOException during L…

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3883:
URL: https://github.com/apache/carbondata/pull/3883#issuecomment-669894078


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3631/
   







[GitHub] [carbondata] Indhumathi27 opened a new pull request #3885: [WIP] Support Presto with IndexSserver

2020-08-06 Thread GitBox


Indhumathi27 opened a new pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3884: [CARBONDATA-3942] Fix type cast when loading data into partitioned table

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3884:
URL: https://github.com/apache/carbondata/pull/3884#issuecomment-669887552


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3633/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3884: [CARBONDATA-3942] Fix type cast when loading data into partitioned table

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3884:
URL: https://github.com/apache/carbondata/pull/3884#issuecomment-669886202


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1894/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#issuecomment-669885501


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3632/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#issuecomment-669880833


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1893/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-669860500


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3630/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3883: [CARBONDATA-3940] CommitTask fails due to Rename IOException during L…

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3883:
URL: https://github.com/apache/carbondata/pull/3883#issuecomment-669858937


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1892/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-669856755


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1891/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#issuecomment-669844507


   @akkio-97: please state the limitations and TODOs clearly:
   a. arrays with local dictionary cannot be read yet
   b. arrays containing other complex types are not supported yet
   
   I also feel that ArrayStreamReader and some interfaces need to be cleaned up [I will 
do it with struct support]







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466313217



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/GenerateFiles.scala
##
@@ -0,0 +1,667 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, File, InputStream}
+import java.util
+
+import scala.collection.JavaConverters._
+
+import org.apache.avro
+import org.apache.avro.file.DataFileWriter
+import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter, GenericRecord}
+import org.apache.avro.io.{DecoderFactory, Encoder}
+import org.junit.Assert
+
+import org.apache.carbondata.core.cache.dictionary.DictionaryByteArrayWrapper
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.block.TableBlockInfo
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk
+import org.apache.carbondata.core.datastore.chunk.reader.CarbonDataReaderFactory
+import org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.DimensionChunkReaderV3
+import org.apache.carbondata.core.datastore.compression.CompressorFactory
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.datastore.page.encoding.DefaultEncodingFactory
+import org.apache.carbondata.core.metadata.ColumnarFormatVersion
+import org.apache.carbondata.core.util.{CarbonMetadataUtil, DataFileFooterConverterV3}
+import org.apache.carbondata.sdk.file.CarbonWriter
+
+class GenerateFiles {
+
+  def singleLevelArrayFile() = {
+val json1: String =
+  """ {"stringCol": "bob","intCol": 14,"doubleCol": 10.5,"realCol": 12.7,
+|"boolCol": true,"arrayStringCol1":["Street1"],"arrayStringCol2": ["India", "Egypt"],
+|"arrayIntCol": [1,2,3],"arrayBigIntCol":[7,6],"arrayRealCol":[1.111,2.2],
+|"arrayDoubleCol":[1.1,2.2,3.3], "arrayBooleanCol": [true, false, true]} """.stripMargin
+val json2: String =
+  """ {"stringCol": "Alex","intCol": 15,"doubleCol": 11.5,"realCol": 13.7,
+|"boolCol": true, "arrayStringCol1": ["Street1", "Street2"],"arrayStringCol2": ["Japan",
+|"China", "India"],"arrayIntCol": [1,2,3,4],"arrayBigIntCol":[7,6,8000],
+|"arrayRealCol":[1.1,2.2,3.3],"arrayDoubleCol":[1.1,2.2,4.45,3.3],
+|"arrayBooleanCol": [true, true, true]} """.stripMargin
+val json3: String =
+  """ {"stringCol": "Rio","intCol": 16,"doubleCol": 12.5,"realCol": 14.7,
+|"boolCol": true, "arrayStringCol1": ["Street1", "Street2","Street3"],
+|"arrayStringCol2": ["China", "Brazil", "Paris", "France"],"arrayIntCol": [1,2,3,4,5],
+|"arrayBigIntCol":[7,6,8000,91],"arrayRealCol":[1.1,2.2,3.3,4.45],
+|"arrayDoubleCol":[1.1,2.2,4.45,5.5,3.3], "arrayBooleanCol": [true, false, true]} """
+.stripMargin
+val json4: String =
+  """ {"stringCol": "bob","intCol": 14,"doubleCol": 10.5,"realCol": 12.7,
+|"boolCol": true, "arrayStringCol1":["Street1"],"arrayStringCol2": ["India", "Egypt"],
+|"arrayIntCol": [1,2,3],"arrayBigIntCol":[7,6],"arrayRealCol":[1.1,2.2],
+|"arrayDoubleCol":[1.1,2.2,3.3], "arrayBooleanCol": [true, false, true]} """.stripMargin
+val json5: String =
+  """ {"stringCol": "Alex","intCol": 15,"doubleCol": 11.5,"realCol": 13.7,
+|"boolCol": true, "arrayStringCol1": ["Street1", "Street2"],"arrayStringCol2": ["Japan",
+|"China", "India"],"arrayIntCol": [1,2,3,4],"arrayBigIntCol":[7,6,8000],
+|"arrayRealCol":[1.1,2.2,3.3],"arrayDoubleCol":[4,1,21.222,15.231],
+|"arrayBooleanCol": [false, false, false]} """.stripMargin
+
+
+val mySchema =
+  """ {
+|  "name": "address",
+|  "type": "record",
+|  "fields": [
+|  {
+|  "name": "stringCol",
+|  "type": 

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466313146



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/GenerateFiles.scala
##

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466312957



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/GenerateFiles.scala
##

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466312027



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/GenerateFiles.scala
##

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466311639



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/GenerateFiles.scala
##

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466310201



##
File path: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorBatch.java
##
@@ -102,6 +89,12 @@ public static CarbonColumnVectorImpl createDirectStreamReader(int batchSize, Dat
   } else {
 return null;
   }
+} else if (DataTypes.isArrayType(field.getDataType())) {
+  if (field.getChildren().size() > 1) {

Review comment:
   Remove this assertion; an array can never have more than one child.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r466309537



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
##
@@ -102,6 +126,58 @@ public CarbonColumnVectorImpl(int batchSize, DataType dataType) {
 
   }
 
+  @Override
+  public CarbonColumnVector getColumnVector() {
+return null;
+  }
+
+  @Override
+  public List getChildrenVector() {
+return childrenVector;
+  }
+
+  @Override
+  public void putArrayObject() {
+return;
+  }
+
+  public void setChildrenVector(ArrayList childrenVector) {
+this.childrenVector = childrenVector;
+  }
+
+  public ArrayList getChildrenElements() {
+return childrenElements;
+  }
+
+  public void setChildrenElements(ArrayList childrenElements) {
+this.childrenElements = childrenElements;
+  }
+
+  public ArrayList getChildrenOffset() {
+return childrenOffset;
+  }
+
+  public void setChildrenOffset(ArrayList childrenOffset) {
+this.childrenOffset = childrenOffset;
+  }
+
+  public void setChildrenElementsAndOffset(byte[] childPageData) {
+ByteBuffer childInfoBuffer = ByteBuffer.wrap(childPageData);
+ArrayList childElements = new ArrayList<>();
+ArrayList childOffset = new ArrayList<>();

Review comment:
   The offset is not required, even for struct types, so please remove it.
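
   One way to see why a stored offset can be redundant: if the number of child
   elements per parent row is known, the start offset of each row in the flat
   child vector is just a running sum of those counts. The snippet below is only
   a minimal sketch of that idea, not code from this PR; the class name
   ChildElementCounts and the method offsetsFromCounts are hypothetical.

```java
final class ChildElementCounts {

  // Hypothetical helper (not part of CarbonData): given the number of child
  // elements for each parent row, compute where each row starts in the flat
  // child vector. Row i starts at the sum of the counts of rows 0..i-1.
  static int[] offsetsFromCounts(int[] counts) {
    int[] offsets = new int[counts.length];
    int running = 0;
    for (int i = 0; i < counts.length; i++) {
      offsets[i] = running;
      running += counts[i];
    }
    return offsets;
  }
}
```

   For example, counts of 3, 0 and 2 give start offsets 0, 3 and 3, so nothing
   beyond the counts has to be kept on the vector.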









[GitHub] [carbondata] marchpure commented on a change in pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


marchpure commented on a change in pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#discussion_r466303536



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonRelation.scala
##
@@ -207,7 +207,7 @@ case class CarbonRelation(
   null != validSeg.getLoadMetadataDetails.getIndexSize) {
 size = size + 
validSeg.getLoadMetadataDetails.getDataSize.toLong +
validSeg.getLoadMetadataDetails.getIndexSize.toLong
-  } else {
+  } else if (!carbonTable.isHivePartitionTable) {

Review comment:
   Here the code collects the data size from the segment path, but the generated 
segment path has the format "Fact/Part0/Segment_0".
   For a partition table, this throws a FileNotFound exception.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669832416


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1888/
   







[GitHub] [carbondata] QiangCai commented on a change in pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#discussion_r466295972



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonRelation.scala
##
@@ -207,7 +207,7 @@ case class CarbonRelation(
   null != validSeg.getLoadMetadataDetails.getIndexSize) {
 size = size + 
validSeg.getLoadMetadataDetails.getDataSize.toLong +
validSeg.getLoadMetadataDetails.getIndexSize.toLong
-  } else {
+  } else if (!carbonTable.isHivePartitionTable) {

Review comment:
   why add this check?

##
File path: 
core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
##
@@ -87,7 +87,9 @@ public TableStatusReadCommittedScope(AbsoluteTableIdentifier 
identifier,
   SegmentFileStore fileStore =
   new SegmentFileStore(identifier.getTablePath(), 
segment.getSegmentFileName());
   indexFiles = fileStore.getIndexOrMergeFiles();
-  
segment.setSegmentMetaDataInfo(fileStore.getSegmentFile().getSegmentMetaDataInfo());
+  if (fileStore != null && fileStore.getSegmentFile() != null) {

Review comment:
   no need to check "fileStore != null"









[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


Karan980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-669829422


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#issuecomment-669828122


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3627/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3881: [HOTFIX] NPE While Data Loading

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3881:
URL: https://github.com/apache/carbondata/pull/3881#issuecomment-66982











[GitHub] [carbondata] asfgit closed pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


asfgit closed pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880


   







[jira] [Resolved] (CARBONDATA-3879) Filtering Segmets Optimazation

2020-08-06 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3879.
--
Fix Version/s: (was: 2.0.2)
   2.1.0
   Resolution: Fixed

> Filtering Segmets Optimazation
> --
>
> Key: CARBONDATA-3879
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3879
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-query
>Affects Versions: 2.0.0
>Reporter: Xingjun Hao
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> During the filter segments flow, there are many LIST.CONTAINS calls, which have 
> heavy time overhead when there are tens of thousands of segments.
> For example, if there are 5 segments, LIST.CONTAINS is triggered for 
> each segment, and the LIST also has about 5 elements, so the time complexity 
> will be O(5 * 5).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-669820094


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3625/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#issuecomment-669819275


   LGTM







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#discussion_r466277616



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##
@@ -389,14 +389,16 @@ protected FileSplit makeSplit(String segmentId, String 
filePath, long start, lon
 
   public void updateLoadMetaDataDetailsToSegments(List validSegments,
   List prunedSplits) {
+Map validSegmentsMap = validSegments.stream()

Review comment:
   Oh, I see what you mean: we still need to get the element from the valid 
segments to read its LoadMetadataDetails. So a SET helps only for contains, 
not for getting the valid segment.









[GitHub] [carbondata] marchpure commented on a change in pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


marchpure commented on a change in pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#discussion_r466275337



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##
@@ -389,14 +389,16 @@ protected FileSplit makeSplit(String segmentId, String 
filePath, long start, lon
 
   public void updateLoadMetaDataDetailsToSegments(List validSegments,
   List prunedSplits) {
+Map validSegmentsMap = validSegments.stream()

Review comment:
   Yeah. But with a SET, it's hard to read an element with a specified 
segmentno. 
   In this code, we need to read a segment with a specified segmentno from 
validSegments.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#discussion_r466270144



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##
@@ -389,14 +389,16 @@ protected FileSplit makeSplit(String segmentId, String 
filePath, long start, lon
 
   public void updateLoadMetaDataDetailsToSegments(List validSegments,
   List prunedSplits) {
+Map validSegmentsMap = validSegments.stream()

Review comment:
   If you look at the `equals` implementation of `Segment.java`, it is based 
only on segment number comparison, so I think you can still use a `SET`. 









[GitHub] [carbondata] marchpure commented on a change in pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


marchpure commented on a change in pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#discussion_r466265360



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##
@@ -389,14 +389,16 @@ protected FileSplit makeSplit(String segmentId, String 
filePath, long start, lon
 
   public void updateLoadMetaDataDetailsToSegments(List validSegments,
   List prunedSplits) {
+Map validSegmentsMap = validSegments.stream()

Review comment:
   Thanks for the good suggestion, but here we can only use a Map 
instead of a Set. 
   
   Reason: the pseudo-code for this function is shown below.
   
   // **0. Segment's hashCode is based on segmentno, so when we compare 2 
segments, only segmentno is compared**
   if (validSegments.contains(segmentInSplit)) {
      // **1. fetch the segment from validSegments** 
      Segment segmentInValidSegment <- fetch the segment from validSegments
      // 2. copy its LoadMetadataDetails to the segment in the split
      segmentInSplit.setLoadMetadataDetails(segmentInValidSegment.getLoadMetadataDetails);
   }
   
   With a SET, it's hard to fetch the segment from validSegments.
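
The point above can be sketched in plain Java. This is only an illustration under stated assumptions: `Seg` is a hypothetical stand-in for `Segment` (equality on segment number only), a plain `String` stands in for `LoadMetadataDetails`, and `detailsFor` is an invented helper, not CarbonData code. `Set.contains` answers only yes or no, while a `Map` keyed by segment number hands back the stored element.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SegmentLookupSketch {

  // Hypothetical stand-in for Segment: equals/hashCode use only the segment number.
  static final class Seg {
    final String segmentNo;
    final String loadMetadataDetails; // stand-in for LoadMetadataDetails

    Seg(String segmentNo, String loadMetadataDetails) {
      this.segmentNo = segmentNo;
      this.loadMetadataDetails = loadMetadataDetails;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Seg && segmentNo.equals(((Seg) o).segmentNo);
    }

    @Override
    public int hashCode() {
      return segmentNo.hashCode();
    }
  }

  // Returns the details stored on the matching valid segment, or null if absent.
  static String detailsFor(Map<String, Seg> validBySegmentNo, Seg segmentInSplit) {
    Seg valid = validBySegmentNo.get(segmentInSplit.segmentNo);
    return valid == null ? null : valid.loadMetadataDetails;
  }

  public static void main(String[] args) {
    List<Seg> validSegments = new ArrayList<>();
    validSegments.add(new Seg("0", "details-0"));
    validSegments.add(new Seg("1", "details-1"));

    Set<Seg> validSet = new HashSet<>(validSegments);
    Map<String, Seg> validMap = new HashMap<>();
    for (Seg s : validSegments) {
      validMap.put(s.segmentNo, s);
    }

    Seg segmentInSplit = new Seg("1", null);
    // The set can confirm membership, but it does not return the stored element.
    System.out.println(validSet.contains(segmentInSplit));
    // The map returns the stored element, so its details can be copied over.
    System.out.println(detailsFor(validMap, segmentInSplit));
  }
}
```

A `Set` could still be used together with a separate lookup, but the map covers both the membership test and the fetch in one structure.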









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-669812497


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1886/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466254229



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
 try {
   val builder = CarbonWriter.builder()
   val writer =
 builder.outputPath(writerPath)
   
.uniqueIdentifier(System.currentTimeMillis()).withBlockSize(2).sortBy(sortColumns)
   .withCsvInput(new 
Schema(fields)).writtenBy("TestNonTransactionalCarbonTable").build()
   var i = 0
+  val bis = new BufferedInputStream(new FileInputStream(imagePath))
+  var hexValue: Array[Char] = null
+  val originBinary = new Array[Byte](bis.available)
+  while (bis.read(originBinary) != -1) {
+hexValue = Hex.encodeHex(originBinary)
+  }
+  bis.close()
+  val binaryValue = String.valueOf(hexValue)
+

Review comment:
   ok

##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
   ok, used root path









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466252790



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
   > another question: why presto module need scala?









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466252790



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
   > another question: why presto module need scala?
   we can add spark as test dependency and create store from spark and query 
from presto










[GitHub] [carbondata] QiangCai commented on pull request #3877: [CARBONDATA-3889] Cleanup duplicated code in carbondata-spark module

2020-08-06 Thread GitBox


QiangCai commented on pull request #3877:
URL: https://github.com/apache/carbondata/pull/3877#issuecomment-669804784


   @kevinjmh @ajantha-bhat 
   please help to review this PR.







[GitHub] [carbondata] QiangCai commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466249308



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
 try {
   val builder = CarbonWriter.builder()
   val writer =
 builder.outputPath(writerPath)
   
.uniqueIdentifier(System.currentTimeMillis()).withBlockSize(2).sortBy(sortColumns)
   .withCsvInput(new 
Schema(fields)).writtenBy("TestNonTransactionalCarbonTable").build()
   var i = 0
+  val bis = new BufferedInputStream(new FileInputStream(imagePath))
+  var hexValue: Array[Char] = null
+  val originBinary = new Array[Byte](bis.available)
+  while (bis.read(originBinary) != -1) {
+hexValue = Hex.encodeHex(originBinary)
+  }
+  bis.close()
+  val binaryValue = String.valueOf(hexValue)
+

Review comment:
   keep only one blank line.
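
For reference, the read-and-hex-encode step in the hunk above can be sketched without the `available()`/`read` loop. This is a rough sketch in plain Java, not the test's actual code: `HexEncodeSketch`, `toHex` and `hexOfFile` are hypothetical names, and it formats bytes by hand instead of using the `Hex.encodeHex` call from the diff. It assumes lowercase hex output is acceptable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HexEncodeSketch {

  // Encodes each byte as two lowercase hex digits.
  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  // Reads the whole file in one call, then hex-encodes its content.
  static String hexOfFile(Path path) throws IOException {
    return toHex(Files.readAllBytes(path));
  }

  public static void main(String[] args) throws IOException {
    if (args.length == 1) {
      System.out.println(hexOfFile(Paths.get(args[0])));
    } else {
      System.out.println(toHex(new byte[] {0, 15, (byte) 255}));
    }
  }
}
```

Reading the file with `Files.readAllBytes` avoids sizing the buffer from `available()`, which is only an estimate of the bytes readable without blocking.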









[GitHub] [carbondata] QiangCai commented on a change in pull request #3882: [CARBONDATA-3941] Support binary data type reading from presto

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882#discussion_r466249621



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -176,24 +188,35 @@ class PrestoTestNonTransactionalTableFiles extends 
FunSuiteLike with BeforeAndAf
   def buildTestDataOtherDataType(rows: Int, sortColumns: Array[String]): Any = 
{
 val fields: Array[Field] = new Array[Field](5)
 // same column name, but name as boolean type
-fields(0) = new Field("name", DataTypes.BOOLEAN)
+fields(0) = new Field("name", DataTypes.VARCHAR)
 fields(1) = new Field("age", DataTypes.INT)
-fields(2) = new Field("id", DataTypes.BYTE)
+fields(2) = new Field("id", DataTypes.BINARY)
 fields(3) = new Field("height", DataTypes.DOUBLE)
 fields(4) = new Field("salary", DataTypes.FLOAT)
 
+val imagePath = "../../sdk/sdk/src/test/resources/image/carbondatalogo.jpg"

Review comment:
   Better to base it on rootPath:
   val imagePath = 
"$rootPath/sdk/sdk/src/test/resources/image/carbondatalogo.jpg"
   
   another question: why presto module need scala?









[GitHub] [carbondata] ajantha-bhat commented on pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#issuecomment-669801705


   LGTM. Can merge once the build passes.







[GitHub] [carbondata] IceMimosa commented on pull request #3884: [CARBONDATA-3942] Fix type cast when loading data into partitioned table

2020-08-06 Thread GitBox


IceMimosa commented on pull request #3884:
URL: https://github.com/apache/carbondata/pull/3884#issuecomment-669798694


   retest this please







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3880: [CARBONDATA-3879] Filtering Segmets Optimazation

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3880:
URL: https://github.com/apache/carbondata/pull/3880#discussion_r466243016



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
##
@@ -389,14 +389,16 @@ protected FileSplit makeSplit(String segmentId, String 
filePath, long start, lon
 
   public void updateLoadMetaDataDetailsToSegments(List validSegments,
   List prunedSplits) {
+Map validSegmentsMap = validSegments.stream()

Review comment:
   Creating a map for every query is also an overhead when the valid 
segments number in the thousands. I suggest using a `SET` for `validSegments` 
when it is originally formed, instead of a `LIST`.
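
To illustrate the cost being discussed, here is a minimal sketch. It assumes segments are identified by plain string segment numbers, which is a simplification, and `ContainsCostSketch` and its methods are hypothetical names rather than CarbonData APIs. The list version scans the list on every `contains` call; the set version pays once to build the hash set and then does expected constant-time lookups.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsCostSketch {

  // O(n * m): every List.contains call scans the valid list linearly.
  static int countMatchesWithList(List<String> validSegmentNos, List<String> splitSegmentNos) {
    int matches = 0;
    for (String segmentNo : splitSegmentNos) {
      if (validSegmentNos.contains(segmentNo)) {
        matches++;
      }
    }
    return matches;
  }

  // O(n + m) expected: build the hash set once, then each lookup is expected O(1).
  static int countMatchesWithSet(List<String> validSegmentNos, List<String> splitSegmentNos) {
    Set<String> valid = new HashSet<>(validSegmentNos);
    int matches = 0;
    for (String segmentNo : splitSegmentNos) {
      if (valid.contains(segmentNo)) {
        matches++;
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<String> valid = Arrays.asList("0", "1", "2");
    List<String> splits = Arrays.asList("1", "2", "7");
    System.out.println(countMatchesWithList(valid, splits));
    System.out.println(countMatchesWithSet(valid, splits));
  }
}
```

If the collection is hash-based from the moment it is formed, the per-query build step in `countMatchesWithSet` disappears as well, which is the overhead this comment is concerned with.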









[jira] [Updated] (CARBONDATA-3942) Fix type cast when loading data into partitioned table

2020-08-06 Thread ChenKai (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenKai updated CARBONDATA-3942:

Summary: Fix type cast when loading data into partitioned table  (was: Fix 
type cast when doing data load into partitioned table)

> Fix type cast when loading data into partitioned table
> --
>
> Key: CARBONDATA-3942
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3942
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.1.0
>Reporter: ChenKai
>Priority: Major
>
> Loading Int type data into a carbondata double type column corrupts the value, 
> like this:
> +--------+----+----+
> |cnt     |name|time|
> +--------+----+----+
> |4.9E-323|a   |2020|
> |1.0E-322|b   |2020|
> +--------+----+----+
> The original cnt values are: 10, 20
>  





[GitHub] [carbondata] IceMimosa opened a new pull request #3884: [CARBONDATA-3942] Fix type cast when loading data into partitioned table

2020-08-06 Thread GitBox


IceMimosa opened a new pull request #3884:
URL: https://github.com/apache/carbondata/pull/3884


### Why is this PR needed?
Loading Int type data into a carbondata double type column corrupts the value, 
like this:
   
   +--------+----+
   |cnt     |time|
   +--------+----+
   |4.9E-323|2020|
   |1.0E-322|2020|
   +--------+----+
   
   The original cnt values are: 10, 20
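
A minimal sketch of one possible explanation for values of this magnitude. It assumes, and this is not stated in the PR, that the integer's raw bits end up being read as an IEEE 754 double instead of being numerically converted. The class and method names are hypothetical; `Double.longBitsToDouble` is the only JDK call it relies on.

```java
public class IntAsDoubleBitsSketch {

  // Reinterprets the 64-bit pattern of the integer as a double (no numeric conversion).
  static double reinterpret(long value) {
    return Double.longBitsToDouble(value);
  }

  // Numeric conversion, which is what a type cast is expected to do.
  static double convert(long value) {
    return (double) value;
  }

  public static void main(String[] args) {
    for (long cnt : new long[] {10L, 20L}) {
      // reinterpret(...) yields a tiny subnormal number; convert(...) yields 10.0 / 20.0.
      System.out.println(cnt + " -> reinterpreted: " + reinterpret(cnt)
          + ", converted: " + convert(cnt));
    }
  }
}
```

Under this assumption, small integers map to subnormal doubles (integer multiples of `Double.MIN_VALUE`), which would be in line with the tiny values in the table.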
   
### What changes were proposed in this PR?
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- Yes
   







[GitHub] [carbondata] QiangCai commented on a change in pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#discussion_r466236037



##
File path: 
integration/spark/src/main/java/org/apache/spark/sql/CarbonVectorProxy.java
##
@@ -0,0 +1,554 @@
+/*

Review comment:
   @ajantha-bhat 
   I tried the following two approaches in one commit, but got the same result.
   1. git mv, then modify the new file 
   2. modify the old file, then git mv  
   
   Finally, I raised two commits; you can review them one by one in this PR >_<









[GitHub] [carbondata] marchpure opened a new pull request #3883: [CARBONDATA-3940] CommitTask fails due to Rename IOException during L…

2020-08-06 Thread GitBox


marchpure opened a new pull request #3883:
URL: https://github.com/apache/carbondata/pull/3883


   …oading
   
### Why is this PR needed?
During the load process, commitTask fails with high probability. The 
exception stack shows that it was thrown by HadoopMapReduceCommitProtocol, not 
CarbonSQLHadoopMapMapReduceCommitProtocol, implying that there is a class 
initialization error when initializing the "Committer": it should have been 
initialized as CarbonSQLHadoopMapMapReduceCommitProtocol, but was incorrectly 
initialized as HadoopMapReduceCommitProtocol.

### What changes were proposed in this PR?
   Initialize the committer as CarbonSQLHadoopMapMapReduceCommitProtocol directly
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No
   
   
   







[jira] [Created] (CARBONDATA-3942) Fix type cast when doing data load into partitioned table

2020-08-06 Thread ChenKai (Jira)
ChenKai created CARBONDATA-3942:
---

 Summary: Fix type cast when doing data load into partitioned table
 Key: CARBONDATA-3942
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3942
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 2.1.0
Reporter: ChenKai


Loading Int type data into a carbondata double type column corrupts the value, 
like this:

+--------+----+----+
|cnt     |name|time|
+--------+----+----+
|4.9E-323|a   |2020|
|1.0E-322|b   |2020|
+--------+----+----+

The original cnt values are: 10, 20

 





[GitHub] [carbondata] QiangCai commented on a change in pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


QiangCai commented on a change in pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#discussion_r466218295



##
File path: 
integration/spark/src/main/java/org/apache/spark/sql/CarbonVectorProxy.java
##
@@ -0,0 +1,554 @@
+/*

Review comment:
   CarbonVectorProxy.java has an indentation issue, which led to many changes 
(about 400 lines).
   I will try again using "git mv".









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-669767240


   Build failed with Spark 2.3.4; please check CI: 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3628/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.

2020-08-06 Thread GitBox


CarbonDataQA1 commented on pull request #3834:
URL: https://github.com/apache/carbondata/pull/3834#issuecomment-669766599


   Build failed with Spark 2.4.5; please check CI: 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1889/
   







[jira] [Created] (CARBONDATA-3941) Presto cannot query binary datatype store

2020-08-06 Thread Ajantha Bhat (Jira)
Ajantha Bhat created CARBONDATA-3941:


 Summary: Presto cannot query binary datatype store
 Key: CARBONDATA-3941
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3941
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


Presto cannot query binary datatype store





[jira] [Resolved] (CARBONDATA-3930) MVExample is throwing DataLoadingException

2020-08-06 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3930.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> MVExample is throwing DataLoadingException
> --
>
> Key: CARBONDATA-3930
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3930
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: David Cai
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [Reproduce]
> Run 
> examples/spark/src/main/scala/org/apache/carbondata/examples/MVExample.scala 
> in IDEA
> [LOG]
> Exception in thread "main" org.apache.carbondata.processing.exception.DataLoadingException: The input file does not exist: /***/carbondata/integration/spark-common-test/src/test/resources/sample.csv
> Exception in thread "main" org.apache.carbondata.processing.exception.DataLoadingException: The input file does not exist: /home/david/Documents/code/carbondata/integration/spark-common-test/src/test/resources/sample.csv
>  at org.apache.spark.util.FileUtils$$anonfun$getPaths$1.apply$mcVI$sp(FileUtils.scala:81)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>  at org.apache.spark.util.FileUtils$.getPaths(FileUtils.scala:77)
>  at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:97)
>  at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
>  at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
>  at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
>  at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3265)
>  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3264)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at org.apache.carbondata.examples.MVExample$.exampleBody(MVExample.scala:67)
>  at org.apache.carbondata.examples.MVExample$.main(MVExample.scala:37)
>  at org.apache.carbondata.examples.MVExample.main(MVExample.scala)





[GitHub] [carbondata] asfgit closed pull request #3870: [CARBONDATA-3930] Fix DataLoadingException in MVExample

2020-08-06 Thread GitBox


asfgit closed pull request #3870:
URL: https://github.com/apache/carbondata/pull/3870


   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3872: [CARBONDATA-3889] Enable java check style for all java modules

2020-08-06 Thread GitBox


ajantha-bhat commented on a change in pull request #3872:
URL: https://github.com/apache/carbondata/pull/3872#discussion_r466201420



##
File path: 
integration/spark/src/main/java/org/apache/spark/sql/CarbonVectorProxy.java
##
@@ -0,0 +1,554 @@
+/*

Review comment:
   I can see that the other files are moved, but why does this file show as 
added instead of moved?









[GitHub] [carbondata] ajantha-bhat commented on pull request #3870: [CARBONDATA-3930] Fix DataLoadingException in MVExample

2020-08-06 Thread GitBox


ajantha-bhat commented on pull request #3870:
URL: https://github.com/apache/carbondata/pull/3870#issuecomment-669755870


   LGTM







[GitHub] [carbondata] ajantha-bhat opened a new pull request #3882: [WIP] Support binary data type reading from presto

2020-08-06 Thread GitBox


ajantha-bhat opened a new pull request #3882:
URL: https://github.com/apache/carbondata/pull/3882


### Why is this PR needed?
When a binary store is queried from Presto, Presto currently returns 0 rows.

### What changes were proposed in this PR?
   Presto can support reading the binary (varbinary) data type by using the 
SliceStreamReader,
   which can store a binary byte[] through its putByteArray() method.
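
A minimal sketch of the general idea, assuming nothing about the real SliceStreamReader beyond what is stated above: each row's byte[] is appended to one contiguous buffer, and a list of end offsets records where every row stops, with nulls tracked separately. All names here (VarBinaryColumnSketch, putNull, get, rowCount) are hypothetical; putByteArray only mirrors the method name mentioned in the description and is not the Presto or CarbonData API.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a variable-width binary column reader.
final class VarBinaryColumnSketch {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private final List<Integer> endOffsets = new ArrayList<>(); // end offset of each row
    private final List<Boolean> nulls = new ArrayList<>();      // null flag of each row

    // Appends one row's bytes and records where the row ends.
    void putByteArray(byte[] value) {
        data.write(value, 0, value.length);
        endOffsets.add(data.size());
        nulls.add(false);
    }

    // Records a null row: no bytes are written, the offset is repeated.
    void putNull() {
        endOffsets.add(data.size());
        nulls.add(true);
    }

    int rowCount() {
        return endOffsets.size();
    }

    // Returns a copy of the bytes of the given row, or null for a null row.
    // Copying the whole buffer per call is wasteful; it keeps the sketch short.
    byte[] get(int row) {
        if (nulls.get(row)) {
            return null;
        }
        int start = row == 0 ? 0 : endOffsets.get(row - 1);
        int end = endOffsets.get(row);
        return Arrays.copyOfRange(data.toByteArray(), start, end);
    }

    public static void main(String[] args) {
        VarBinaryColumnSketch column = new VarBinaryColumnSketch();
        column.putByteArray(new byte[] {1, 2, 3});
        column.putNull();
        column.putByteArray(new byte[] {9});
        for (int row = 0; row < column.rowCount(); row++) {
            System.out.println(row + ": " + Arrays.toString(column.get(row)));
        }
    }
}
```

Keeping null rows and empty byte arrays distinct matters for a query engine, since an empty value and a missing value should not be reported as the same thing.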
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- Yes
   
   
   


