[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r469026786



##
File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/readers/ArrayStreamReader.java
##
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import io.prestosql.spi.type.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.StructField;
+import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.block.BlockBuilder;
+
+import org.apache.carbondata.presto.CarbonVectorBatch;
+
+/**
+ * Class to read the Array Stream
+ */
+
+public class ArrayStreamReader extends CarbonColumnVectorImpl implements PrestoVectorBlockBuilder {
+
+  protected int batchSize;
+
+  protected Type type;
+  protected BlockBuilder builder;
+  Block childBlock = null;
+  private int index = 0;
+
+  public ArrayStreamReader(int batchSize, DataType dataType, StructField field) {
+    super(batchSize, dataType);
+    this.batchSize = batchSize;
+    this.type = getArrayOfType(field, dataType);
+    ArrayList childrenList = new ArrayList<>();
+    childrenList.add(CarbonVectorBatch.createDirectStreamReader(this.batchSize, field.getDataType(), field));
+    setChildrenVector(childrenList);
+    this.builder = type.createBlockBuilder(null, batchSize);
+  }
+
+  public int getIndex() {
+    return index;
+  }
+
+  public void setIndex(int index) {
+    this.index = index;
+  }
+
+  public String getDataTypeName() {
+    return "ARRAY";
+  }
+
+  Type getArrayOfType(StructField field, DataType dataType) {
+    if (dataType == DataTypes.STRING) {
+      return new ArrayType(VarcharType.VARCHAR);
+    } else if (dataType == DataTypes.BYTE) {
+      return new ArrayType(TinyintType.TINYINT);
+    } else if (dataType == DataTypes.SHORT) {
+      return new ArrayType(SmallintType.SMALLINT);
+    } else if (dataType == DataTypes.INT) {

Review comment:
   decimal datatype handling is also missing 
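The missing branch the reviewer points at would map Carbon's decimal type to a Presto `ArrayType` whose element type preserves precision and scale (in the Presto SPI, via `DecimalType.createDecimalType(precision, scale)`). Since the Presto and Carbon classes are not available here, the dispatch can only be sketched with a self-contained stand-in; all names and string forms below are an illustrative model, not the PR's actual code.

```java
// Self-contained stand-in for getArrayOfType's type dispatch, showing where a
// decimal branch would fit. The real reader would presumably return
// new ArrayType(DecimalType.createDecimalType(precision, scale)); the
// string-based mapping below is purely a model.
public class ArrayTypeMapping {
    static String arrayTypeFor(String carbonType, int precision, int scale) {
        switch (carbonType) {
            case "STRING": return "array(varchar)";
            case "BYTE":   return "array(tinyint)";
            case "SHORT":  return "array(smallint)";
            case "INT":    return "array(integer)";
            // The branch the review asks for: decimal element types must carry
            // the column's precision and scale.
            case "DECIMAL": return "array(decimal(" + precision + "," + scale + "))";
            default: throw new IllegalArgumentException("unsupported: " + carbonType);
        }
    }
}
```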





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (CARBONDATA-3944) Delete stage files was interrupted when IOException happen

2020-08-11 Thread Bo Xu (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bo Xu closed CARBONDATA-3944.
-
Resolution: Fixed

> Delete stage files was interrupted when IOException happen
> --
>
> Key: CARBONDATA-3944
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3944
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Xingjun Hao
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> In the insert-stage flow, the stage files are deleted with a retry
> mechanism. But when an IOException happens (due to a network abnormality,
> etc.), the delete-stage flow is interrupted, which is unexpected.
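The fix amounts to catching the IOException per file so that one transient failure does not abort deletion of the remaining stage files. A minimal self-contained sketch of that retry pattern (names are illustrative, not CarbonData's actual code):

```java
import java.io.IOException;
import java.util.List;

// Illustrative per-file delete with retry: an IOException on one stage file
// is retried and, if it keeps failing, logged and skipped, so deletion of
// the remaining files still proceeds instead of interrupting the whole flow.
public class StageFileCleaner {
    interface Deleter { void delete(String path) throws IOException; }

    static int deleteWithRetry(List<String> stageFiles, Deleter deleter, int maxRetries) {
        int deleted = 0;
        for (String file : stageFiles) {
            for (int attempt = 1; attempt <= maxRetries; attempt++) {
                try {
                    deleter.delete(file);
                    deleted++;
                    break;
                } catch (IOException e) {
                    // Transient failure (e.g. network abnormality): retry this
                    // file; never let the exception abort the outer loop.
                    if (attempt == maxRetries) {
                        System.err.println("Giving up on " + file + ": " + e.getMessage());
                    }
                }
            }
        }
        return deleted;
    }
}
```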



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3886: [CARBONDATA-3944] Insertstage interrupted when IOException happen

2020-08-11 Thread GitBox


asfgit closed pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886


   







[GitHub] [carbondata] xubo245 commented on pull request #3886: [CARBONDATA-3944] Insertstage interrupted when IOException happen

2020-08-11 Thread GitBox


xubo245 commented on pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#issuecomment-672545887


   Thanks for your contribution!







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [WIP] Handling the addition of geo column to hive at the time of table creation.

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3879:
URL: https://github.com/apache/carbondata/pull/3879#issuecomment-672245218


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1961/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [WIP] Handling the addition of geo column to hive at the time of table creation.

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3879:
URL: https://github.com/apache/carbondata/pull/3879#issuecomment-672244044


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3700/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-672030736


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1960/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-672027937


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3699/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671967929


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1959/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#issuecomment-671966844


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3698/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-671947660


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3697/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-671947164


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1958/
   







[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


Karan980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-671943473


   retest this please







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


Indhumathi27 commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468518151



##
File path: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
##
@@ -127,7 +128,8 @@ public CarbonLocalInputSplit(@JsonProperty("segmentId") String segmentId,
   @JsonProperty("deleteDeltaFiles") String[] deleteDeltaFiles,
   @JsonProperty("blockletId") String blockletId,
   @JsonProperty("detailInfo") String detailInfo,
-  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal) {
+  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
+  boolean isDistributedPruningEnabled) {

Review comment:
   done









[jira] [Updated] (CARBONDATA-3938) In Hive read table, we are unable to read a projection column or read a full scan - select * query. But the aggregate queries are working fine.

2020-08-11 Thread Prasanna Ravichandran (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanna Ravichandran updated CARBONDATA-3938:
--
Description: 
In a Hive read table, we are unable to read a projection column or run a full scan
query. But aggregate queries are working fine.

 

Test query:

 

--spark beeline;

drop table if exists uniqdata;

drop table if exists uniqdata1;

CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata ;

LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

CREATE TABLE IF NOT EXISTS uniqdata1 (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH 
SERDEPROPERTIES 
('mapreduce.input.carboninputformat.databaseName'='default','mapreduce.input.carboninputformat.tableName'='uniqdata')
 STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' 
OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 
'hdfs://hacluster/user/hive/warehouse/uniqdata';

select  count(*)  from uniqdata1;

 

 

--Hive Beeline;

select count(*) from uniqdata1; --Returns 2000;

select count(*) from uniqdata; --Returns 2000 - working fine;

select * from uniqdata1; --Returns no rows; --Issue 1 on Hive read format table;

select * from uniqdata; --Returns no rows; --Issue 2 while reading a normal carbon table created in spark;

select cust_id from uniqdata1 limit 5; --Returns no rows;

 Attached the logs for your reference. With the Hive write table this issue is 
not seen. Issue is only seen in Hive read format table.

This issue also exists when a normal carbon table is created in Spark and read 
through Hive beeline.

  was:
In Hive read table, we are unable to read a projection column or full scan 
query. But the aggregate queries are working fine.

 

Test query:

 

--spark beeline;

drop table if exists uniqdata;

drop table if exists uniqdata1;

CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata ;

LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

CREATE TABLE IF NOT EXISTS uniqdata1 (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH 
SERDEPROPERTIES 
('mapreduce.input.carboninputformat.databaseName'='default','mapreduce.input.carboninputformat.tableName'='uniqdata')
 STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' 
OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 
'hdfs://hacluster/user/hive/warehouse/uniqdata';

select  count(*)  from uniqdata1;

 

 

--Hive Beeline;

select count(*) from uniqdata1; --Returns 2000;

select * from uniqdata1; --Return no rows;

select cust_id from uniqdata1 limit 5;--Return no rows;

 Attached the logs for your reference. With the Hive write table this issue is 
not seen. Issue is only seen in Hive read format table.


> In Hive read table, we are unable to read a projection column or read a full 
> scan - select * query. But the aggregate queries are working fine.
> ---
>
> Key: CARBONDATA-3938
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3938
> Project: CarbonData
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 2.0.0
>Reporter: Prasanna Ravichandran

[GitHub] [carbondata] xubo245 commented on pull request #3886: [CARBONDATA-3944] Insertstage interrupted when IOException happen

2020-08-11 Thread GitBox


xubo245 commented on pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#issuecomment-671892488


   LGTM







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-67134


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1957/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-671888447


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3696/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468503482



##
File path: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
##
@@ -127,7 +128,8 @@ public CarbonLocalInputSplit(@JsonProperty("segmentId") String segmentId,
   @JsonProperty("deleteDeltaFiles") String[] deleteDeltaFiles,
   @JsonProperty("blockletId") String blockletId,
   @JsonProperty("detailInfo") String detailInfo,
-  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal) {
+  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
+  boolean isDistributedPruningEnabled) {

Review comment:
   please also keep @JsonProperty for this









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499772



##
File path: docs/prestodb-guide.md
##
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer
+
+### Dependency jars
+After copying all the jars from ../integration/presto/target/carbondata-presto-X.Y.Z-SNAPSHOT
+to `plugin/carbondata` directory on all nodes, ensure copying the following jars as well.
+1. Copy ../integration/spark/target/carbondata-spark_X.Y.Z-SNAPSHOT.jar
+2. Copy corresponding Spark dependency jars to the location.
+
+### Configure properties
+Configure IndexServer configurations in carbon.properties file. Refer
+[Configuring IndexServer](https://github.com/apache/carbondata/blob/master/docs/index-server.md#Configurations) for more info.
+Add `-Dcarbon.properties.filepath=/carbon.properties` in jvm.config file.
+
+### Presto with IndexServer
+Start distributed index server. Launch presto CLI and fire SELECT query and check if the corresponding job
+is triggered in the index server application.

Review comment:
   Also mention that one can use spark to see the cache loaded, by using the show metacache command
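For reference, the configuration the doc change above describes might look like the following; all paths and host/port values here are site-specific placeholders, not values from the PR.

```properties
# etc/jvm.config (one JVM option per line) -- path is an example only
-Dcarbon.properties.filepath=/opt/carbondata/conf/carbon.properties

# carbon.properties -- IndexServer client settings; see index-server.md for the full list
carbon.enable.index.server=true
carbon.index.server.ip=<index-server-host>
carbon.index.server.port=<index-server-port>
```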









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499341



##
File path: docs/prestodb-guide.md
##
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer

Review comment:
   As `prestosql` is the default profile, add this doc in `prestosql-guide.md`, and in the prestodb doc give a link to this section of the prestosql guide, as it is common to both presto versions









[GitHub] [carbondata] MarvinLitt commented on a change in pull request #3855: [CARBONDATA-3863], after using index service clean the temp data

2020-08-11 Thread GitBox


MarvinLitt commented on a change in pull request #3855:
URL: https://github.com/apache/carbondata/pull/3855#discussion_r468484427



##
File path: integration/spark/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala
##
@@ -316,4 +324,17 @@ object IndexServer extends ServerInterface {
       Array(new Service("security.indexserver.protocol.acl", classOf[ServerInterface]))
     }
   }
+
+  def startAgingFolders(): Unit = {
+    val runnable = new Runnable() {
+      def run() {
+        val age = System.currentTimeMillis() - agePeriod.toLong
+        CarbonUtil.agingTempFolderForIndexServer(age)
+        LOGGER.info(s"Complete age temp folder ${CarbonUtil.getIndexServerTempPath}")
+      }
+    }
+    val ags: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor
+    ags.scheduleAtFixedRate(runnable, 1000, 360, TimeUnit.MICROSECONDS)

Review comment:
   oh sorry, will change it to milliseconds. How about keeping it as 3 hours?
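The unit mix-up is easy to quantify: 360 MICROSECONDS is well under one millisecond, so the aging task would fire almost continuously instead of every few hours. A small self-contained check:

```java
import java.util.concurrent.TimeUnit;

// Demonstrates why TimeUnit.MICROSECONDS is wrong for the aging period:
// the buggy interval (360 microseconds) versus the intended one (3 hours),
// both expressed in milliseconds.
public class AgingPeriod {
    static long asMillis(long period, TimeUnit unit) {
        return unit.toMillis(period);
    }

    public static void main(String[] args) {
        // Buggy: 360 microseconds truncates to 0 ms between runs.
        System.out.println(asMillis(360, TimeUnit.MICROSECONDS)); // 0
        // Intended: 3 hours between runs.
        System.out.println(asMillis(3, TimeUnit.HOURS)); // 10800000
    }
}
```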









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3874: [CARBONDATA-3931]Fix Secondary index with index column as DateType giving wrong results

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3874:
URL: https://github.com/apache/carbondata/pull/3874#issuecomment-671852762


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1956/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3874: [CARBONDATA-3931]Fix Secondary index with index column as DateType giving wrong results

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3874:
URL: https://github.com/apache/carbondata/pull/3874#issuecomment-671851196


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3695/
   







[jira] [Updated] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table

2020-08-11 Thread Chetan Bhat (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-3949:

Affects Version/s: (was: 2.0.0)
   2.0.1

> Select filter query fails from presto-cli on MV table
> -
>
> Key: CARBONDATA-3949
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3949
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 2.0.1
> Environment: Spark 2.4.5. PrestoSQL 316
>Reporter: Chetan Bhat
>Priority: Major
>
> From sparksql create table , load data and create MV
> spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED as carbondata 
> TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
> Time taken: 0.753 seconds
> spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> OK
> OK
> Time taken: 1.992 seconds
> spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, 
> count(cust_id) from uniqdata group by cust_id, cust_name;
> OK
> Time taken: 4.336 seconds
>  
> From presto cli select filter query on table with MV fails.
> presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 
> =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 
> = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
> Query 20200804_092703_00253_ed34h failed: Unable to get file status:
> *Log-*
> 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 
> stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception 
> occurred: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
> java.io.FileNotFoundException: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
>  at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128)
>  at 
> org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145)
>  at 
> io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50)
>  at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149)
>  at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96)
>  at 
> io.prest

[GitHub] [carbondata] marchpure commented on pull request #3886: [CARBONDATA-3944] Insertstage interrupted when IOException happen

2020-08-11 Thread GitBox


marchpure commented on pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#issuecomment-671849088


   > Please optimize the PR title and description(insertstage).
   
   the PR title is modified to "Insertstage interrupted when IOException happen"







[jira] [Updated] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mu

2020-08-11 Thread Chetan Bhat (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-3932:

Affects Version/s: (was: 2.0.0)
   2.0.1

> need to change discovery.uri and add  
> hive.metastore.uri,hive.config.resources  in 
> https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
> -
>
> Key: CARBONDATA-3932
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3932
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs, presto-integration
>Affects Versions: 2.0.1
> Environment: Documentation
>Reporter: Chetan Bhat
>Priority: Minor
>
> Need to change discovery.uri=:8086 to discovery.uri=http://:8086 in 
> [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata]
> The following configurations also need to be added in carbondata.properties 
> and documented in the carbondata-presto open-source doc:
> 1. hive.metastore.uri
> 2. hive.config.resources
> Ex : -
> connector.name=carbondata
> hive.metastore.uri=thrift://10.21.18.106:9083
> hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
>  
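
Taken together, the issue is asking for a catalog properties file along these lines (a sketch assembled from the values quoted above; the metastore host/port and Hadoop config paths are environment-specific examples, not fixed values):

```properties
# carbondata.properties — Presto catalog configuration for CarbonData
connector.name=carbondata
# Thrift URI of the Hive metastore (example host/port from the issue)
hive.metastore.uri=thrift://10.21.18.106:9083
# Hadoop client configs so the connector can reach HDFS
hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
```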



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3950) Alter table drop column for non partition column throws error

2020-08-11 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3950:
---

 Summary: Alter table drop column for non partition column throws 
error
 Key: CARBONDATA-3950
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3950
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.1
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


From spark-sql the queries are executed as mentioned below:

drop table if exists uniqdata_int;
CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES 
("TABLE_BLOCKSIZE"= "256 MB");

LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME 
,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
show partitions uniqdata_int;
select * from uniqdata_int order by cust_id;

alter table uniqdata_int add columns(id int);
 desc uniqdata_int;
 *alter table uniqdata_int drop columns(CUST_NAME);*
 desc uniqdata_int;

Issue: Alter table drop column for a non-partition column throws an error even 
though the operation succeeds.

org.apache.carbondata.spark.exception.ProcessMetaDataException: operation 
failed for priyesh.uniqdata_int: Alter table drop column operation failed: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The 
following columns have types incompatible with the existing columns in their 
respective positions :
col;
 at org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package.
 at org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120)
 at org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(Carboand.scala:201)
 at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
 at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
 at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95
 at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
 at org.apache.spark.sql.Dataset.(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:87
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:164)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:187)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
 at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmi

[GitHub] [carbondata] marchpure commented on a change in pull request #3886: [CARBONDATA-3944] Delete stage files was interrupted when IOException…

2020-08-11 Thread GitBox


marchpure commented on a change in pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#discussion_r468461356



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -499,25 +499,31 @@ case class CarbonInsertFromStageCommand(
* return the loading files failed to create
*/
   private def createStageLoadingFiles(
+  stagePath: String,
   executorService: ExecutorService,
   stageFiles: Array[(CarbonFile, CarbonFile)]): Array[(CarbonFile, 
CarbonFile)] = {
 stageFiles.map { files =>
   executorService.submit(new Callable[(CarbonFile, CarbonFile, Boolean)] {
 override def call(): (CarbonFile, CarbonFile, Boolean) = {
-  // Get the loading files path
-  val stageLoadingFile =
-FileFactory.getCarbonFile(files._1.getAbsolutePath +
-  CarbonTablePath.LOADING_FILE_SUFFIX);
-  // Try to create loading files
-  // make isFailed to be true if createNewFile return false.
-  // the reason can be file exists or exceptions.
-  var isFailed = !stageLoadingFile.createNewFile()
-  // if file exists, modify the lastModifiedTime of the file.
-  if (isFailed) {
-// make isFailed to be true if setLastModifiedTime return false.
-isFailed = 
!stageLoadingFile.setLastModifiedTime(System.currentTimeMillis());
+  try {
+// Get the loading files path
+val stageLoadingFile =
+  FileFactory.getCarbonFile(stagePath +
+CarbonCommonConstants.FILE_SEPARATOR +
+files._1.getName + CarbonTablePath.LOADING_FILE_SUFFIX);
+// Try to create loading files
+// make isFailed to be true if createNewFile return false.
+// the reason can be file exists or exceptions.
+var isFailed = !stageLoadingFile.createNewFile()
+// if file exists, modify the lastmodifiedtime of the file.
+if (isFailed) {
+  // make isFailed to be true if setLastModifiedTime return false.
+  isFailed = 
!stageLoadingFile.setLastModifiedTime(System.currentTimeMillis());
+}
+(files._1, files._2, isFailed)
+  } catch {
+case _ : Exception => (files._1, files._2, true)

Review comment:
   The third parameter is 'isFailed'. When isFailed is 'true', we can retry 
deleting the files. 
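
The create-and-touch pattern under discussion can be sketched outside CarbonData with plain `java.io.File` (a hedged re-creation of the logic in the diff above; `CarbonFile`/`FileFactory` are replaced by standard library calls and the class and method names here are illustrative, not CarbonData APIs):

```java
import java.io.File;
import java.io.IOException;

public class StageLoadingFileDemo {
  /**
   * Tries to create the ".loading" marker file; if it already exists,
   * refreshes its timestamp instead. Returns true on failure so the
   * caller can collect the failed pairs and retry later, instead of
   * letting an IOException interrupt the whole insert-stage batch.
   */
  public static boolean createOrTouchLoadingFile(File stageLoadingFile) {
    try {
      // createNewFile() returns false when the file already exists
      boolean isFailed = !stageLoadingFile.createNewFile();
      if (isFailed) {
        // file exists: still failed only if the timestamp refresh also fails
        isFailed = !stageLoadingFile.setLastModified(System.currentTimeMillis());
      }
      return isFailed;
    } catch (IOException e) {
      // any I/O error marks this file as failed-and-retryable
      return true;
    }
  }
}
```

The design choice being reviewed is exactly this: report failure through the return value rather than propagating the exception, so one bad file does not abort the other Callables in the pool.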

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -557,25 +564,30 @@ case class CarbonInsertFromStageCommand(
* Return the files failed to delete
*/
   private def deleteStageFiles(
+  stagePath: String,
   executorService: ExecutorService,
   stageFiles: Array[(CarbonFile, CarbonFile)]): Array[(CarbonFile, 
CarbonFile)] = {
 stageFiles.map { files =>
   executorService.submit(new Callable[(CarbonFile, CarbonFile, Boolean)] {
 override def call(): (CarbonFile, CarbonFile, Boolean) = {
   // Delete three types of file: stage|.success|.loading
-  val stageLoadingFile =
-FileFactory.getCarbonFile(files._1.getAbsolutePath
-  + CarbonTablePath.LOADING_FILE_SUFFIX);
-  var isFailed = false
-  // If delete() return false, maybe the reason is FileNotFount or 
FileFailedClean.
-  // Considering FileNotFound means file clean successfully.
-  // We need double check the file exists or not when delete() return 
false.
-  if (!(files._1.delete() && files._2.delete() && 
stageLoadingFile.delete())) {
-// If the file still exists,  make isFailed to be true
-// So we can retry to delete this file.
-isFailed = files._1.exists() || files._1.exists() || 
stageLoadingFile.exists()
+  try {
+val stageLoadingFile = FileFactory.getCarbonFile(stagePath +
+  CarbonCommonConstants.FILE_SEPARATOR +
+  files._1.getName + CarbonTablePath.LOADING_FILE_SUFFIX);
+var isFailed = false
+// If delete() return false, maybe the reason is FileNotFount or 
FileFailedClean.
+// Considering FileNotFound means FileCleanSucessfully.
+// We need double check the file exists or not when delete() 
return false.
+if (!files._1.delete() || !files._2.delete() || 
!stageLoadingFile.delete()) {
+  // If the file still exists,  make isFailed to be true
+  // So we can retry to delete this file.
+  isFailed = files._1.exists() || files._1.exists() || 
stageLoadingFile.exists()
+}
+(files._1, files._2, isFailed)
+  } catch {
+case _: Exception => (files._1, files._2, true)

Review comment:
   The third parameter is 'isFailed'. When isFailed is 'true', we can retry 
deleting the files.
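
The delete side of the same pattern can be sketched with `java.io.File` under the same assumptions (the three-file layout — stage file, `.success` marker, `.loading` marker — comes from the diff above; the class and parameter names are illustrative, not CarbonData APIs):

```java
import java.io.File;

public class StageFileCleanupDemo {
  /**
   * Deletes a stage file plus its ".success" and ".loading" markers.
   * delete() returning false can mean FileNotFound (already clean) as
   * well as a real failure, so existence is double-checked before the
   * files are flagged for a retry. Non-short-circuiting '|' is used so
   * one failed delete does not skip the remaining two.
   */
  public static boolean deleteStageFiles(File stageFile, File successFile,
      File loadingFile) {
    try {
      boolean isFailed = false;
      if (!stageFile.delete() | !successFile.delete() | !loadingFile.delete()) {
        // only a file that still exists counts as a real failure
        isFailed = stageFile.exists() || successFile.exists()
            || loadingFile.exists();
      }
      return isFailed;
    } catch (SecurityException e) {
      return true; // flag for retry rather than aborting the batch
    }
  }
}
```

One detail worth noting: the existence check here covers both the stage file and the success marker — the diff above tests `files._1.exists()` twice, which looks like it was meant to be `files._1` and `files._2`.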

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3886: [CARBONDATA-3944] Delete stage files was interrupted when IOException…

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#issuecomment-671845557


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3694/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3886: [CARBONDATA-3944] Delete stage files was interrupted when IOException…

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#issuecomment-671844968


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1955/
   







[GitHub] [carbondata] marchpure commented on a change in pull request #3886: [CARBONDATA-3944] Delete stage files was interrupted when IOException…

2020-08-11 Thread GitBox


marchpure commented on a change in pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#discussion_r468453764



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -499,25 +499,31 @@ case class CarbonInsertFromStageCommand(
* return the loading files failed to create
*/
   private def createStageLoadingFiles(
+  stagePath: String,
   executorService: ExecutorService,
   stageFiles: Array[(CarbonFile, CarbonFile)]): Array[(CarbonFile, 
CarbonFile)] = {
 stageFiles.map { files =>
   executorService.submit(new Callable[(CarbonFile, CarbonFile, Boolean)] {
 override def call(): (CarbonFile, CarbonFile, Boolean) = {
-  // Get the loading files path
-  val stageLoadingFile =
-FileFactory.getCarbonFile(files._1.getAbsolutePath +
-  CarbonTablePath.LOADING_FILE_SUFFIX);
-  // Try to create loading files
-  // make isFailed to be true if createNewFile return false.
-  // the reason can be file exists or exceptions.
-  var isFailed = !stageLoadingFile.createNewFile()
-  // if file exists, modify the lastModifiedTime of the file.
-  if (isFailed) {
-// make isFailed to be true if setLastModifiedTime return false.
-isFailed = 
!stageLoadingFile.setLastModifiedTime(System.currentTimeMillis());
+  try {
+// Get the loading files path
+val stageLoadingFile =
+  FileFactory.getCarbonFile(stagePath +
+CarbonCommonConstants.FILE_SEPARATOR +

Review comment:
   modified









[GitHub] [carbondata] marchpure commented on a change in pull request #3886: [CARBONDATA-3944] Delete stage files was interrupted when IOException…

2020-08-11 Thread GitBox


marchpure commented on a change in pull request #3886:
URL: https://github.com/apache/carbondata/pull/3886#discussion_r468453680



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertFromStageCommand.scala
##
@@ -557,25 +564,30 @@ case class CarbonInsertFromStageCommand(
* Return the files failed to delete
*/
   private def deleteStageFiles(
+  stagePath: String,
   executorService: ExecutorService,
   stageFiles: Array[(CarbonFile, CarbonFile)]): Array[(CarbonFile, 
CarbonFile)] = {
 stageFiles.map { files =>
   executorService.submit(new Callable[(CarbonFile, CarbonFile, Boolean)] {
 override def call(): (CarbonFile, CarbonFile, Boolean) = {
   // Delete three types of file: stage|.success|.loading
-  val stageLoadingFile =
-FileFactory.getCarbonFile(files._1.getAbsolutePath
-  + CarbonTablePath.LOADING_FILE_SUFFIX);
-  var isFailed = false
-  // If delete() return false, maybe the reason is FileNotFount or 
FileFailedClean.
-  // Considering FileNotFound means file clean successfully.
-  // We need double check the file exists or not when delete() return 
false.
-  if (!(files._1.delete() && files._2.delete() && 
stageLoadingFile.delete())) {
-// If the file still exists,  make isFailed to be true
-// So we can retry to delete this file.
-isFailed = files._1.exists() || files._1.exists() || 
stageLoadingFile.exists()
+  try {
+val stageLoadingFile = FileFactory.getCarbonFile(stagePath +
+  CarbonCommonConstants.FILE_SEPARATOR +

Review comment:
   modified









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3878: [CARBONDATA-3947]Fixed Hive read/write operation for Insert into Select operation.

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3878:
URL: https://github.com/apache/carbondata/pull/3878#issuecomment-671828688


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1954/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3878: [CARBONDATA-3947]Fixed Hive read/write operation for Insert into Select operation.

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3878:
URL: https://github.com/apache/carbondata/pull/3878#issuecomment-671826659


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3693/
   







[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


Karan980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-671820358


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3889: [CARBONDATA-3889]Fix a spelling mistake

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3889:
URL: https://github.com/apache/carbondata/pull/3889#issuecomment-671809826


   Can one of the admins verify this patch?







[GitHub] [carbondata] Kejian-Li opened a new pull request #3889: Fix a spelling mistake

2020-08-11 Thread GitBox


Kejian-Li opened a new pull request #3889:
URL: https://github.com/apache/carbondata/pull/3889


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3819:
URL: https://github.com/apache/carbondata/pull/3819#issuecomment-671797534


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3691/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-671786799


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1953/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-08-11 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-671786550


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3692/
   


