[GitHub] [carbondata] Karan980 commented on pull request #3923: [CARBONDATA-3984] 1. Fix compaction issue. 2. Fix longStringColumn validation issue

2020-09-15 Thread GitBox


Karan980 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-693189676


   > @Karan980 the PR title shouldn't be so long, give a small but descriptive 
title and details you can explain in PR description.
   
   Done



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Karan980 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Fix compaction issue. 2. Fix longStringColumn validation issue

2020-09-15 Thread GitBox


Karan980 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r489180454



##
File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
##
@@ -3344,6 +3344,9 @@ public int compare(org.apache.carbondata.format.ColumnSchema o1,
 // for setting long string columns
 if (!longStringColumnsString.isEmpty()) {
   String[] inputColumns = longStringColumnsString.split(",");
+  for (int i = 0; i < inputColumns.length; i++) {

Review comment:
   Done

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {
+val exceptionCaught = intercept[RuntimeException] {
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING, address STRING, note STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+   |""".
+  stripMargin)
+  sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+}
+assert(exceptionCaught.getMessage.contains("LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to 
longStringColumn") {
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, 
"2,2")
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('RANGE_COLUMN'='Name')
+   |""".
+  stripMargin)
+sql("ALTER TABLE long_string_table SET 
TBLPROPERTIES('long_String_columns'='NAME')")
+sql("insert into long_string_table select 1, 'ab', 'cool1'")
+sql("insert into long_string_table select 2, 'abc', 'cool2'")
+sql("ALTER TABLE long_string_table compact 'minor'")
+
+val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
+  CarbonCommonConstants.DATABASE_DEFAULT_NAME,
+  "long_string_table"
+)
+val absoluteTableIdentifier = carbonTable
+  .getAbsoluteTableIdentifier

Review comment:
   Done

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {
+val exceptionCaught = intercept[RuntimeException] {
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING, address STRING, note STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+   |""".
+  stripMargin)
+  sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+}
+assert(exceptionCaught.getMessage.contains("LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to longStringColumn") {
+CarbonProperties.getInstance()

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693180722


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4091/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693178308


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2350/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r488807941



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -29,7 +31,9 @@ import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandExcepti
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.metadata.CarbonMetadata
 import org.apache.carbondata.core.metadata.datatype.DataTypes
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
 import org.apache.carbondata.core.util.CarbonProperties
+

Review comment:
   avoid unnecessary changes





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#discussion_r489158514



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java
##
@@ -78,16 +78,22 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle)
 HiveType colType = columnHandle.getHiveType();
 if (colType.equals(HiveType.HIVE_BOOLEAN)) {
   return DataTypes.BOOLEAN;
+} else if (colType.equals(HiveType.HIVE_BINARY)) {
+  return DataTypes.BINARY;
 } else if (colType.equals(HiveType.HIVE_SHORT)) {
   return DataTypes.SHORT;
 } else if (colType.equals(HiveType.HIVE_INT)) {
   return DataTypes.INT;
+} else if (colType.equals(HiveType.HIVE_FLOAT)) {

Review comment:
   also manually run new testcases once in prestodb profile
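
   For readers of the diff above, a hedged sketch of how the mapper might read once the new branches are in place; the HIVE_BYTE branch, the FLOAT return, and the fallback are assumptions, since the quoted hunk is truncated at the HIVE_FLOAT line:

```java
// Assumed package layout for prestosql; only branches relevant to this PR are shown.
import io.prestosql.plugin.hive.HiveColumnHandle;
import io.prestosql.plugin.hive.HiveType;
import org.apache.carbondata.core.metadata.datatype.DataType;
import org.apache.carbondata.core.metadata.datatype.DataTypes;

final class Spi2CarbondataTypeMapperSketch {
  private Spi2CarbondataTypeMapperSketch() { }

  static DataType map(HiveColumnHandle columnHandle) {
    HiveType colType = columnHandle.getHiveType();
    if (colType.equals(HiveType.HIVE_BOOLEAN)) {
      return DataTypes.BOOLEAN;
    } else if (colType.equals(HiveType.HIVE_BINARY)) {
      return DataTypes.BINARY;
    } else if (colType.equals(HiveType.HIVE_BYTE)) {
      return DataTypes.BYTE;
    } else if (colType.equals(HiveType.HIVE_SHORT)) {
      return DataTypes.SHORT;
    } else if (colType.equals(HiveType.HIVE_INT)) {
      return DataTypes.INT;
    } else if (colType.equals(HiveType.HIVE_FLOAT)) {
      return DataTypes.FLOAT;
    } else {
      // Remaining branches (LONG, DOUBLE, STRING, DATE, TIMESTAMP, DECIMAL, ...) elided.
      return DataTypes.STRING;
    }
  }
}
```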





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#discussion_r489158120



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java
##
@@ -78,16 +78,22 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle)
 HiveType colType = columnHandle.getHiveType();
 if (colType.equals(HiveType.HIVE_BOOLEAN)) {
   return DataTypes.BOOLEAN;
+} else if (colType.equals(HiveType.HIVE_BINARY)) {
+  return DataTypes.BINARY;
 } else if (colType.equals(HiveType.HIVE_SHORT)) {
   return DataTypes.SHORT;
 } else if (colType.equals(HiveType.HIVE_INT)) {
   return DataTypes.INT;
+} else if (colType.equals(HiveType.HIVE_FLOAT)) {

Review comment:
   please handle the same file changes in prestodb also and compile 
prestodb profile once with latest changes





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693160345


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4090/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3925: [CARBONDATA-3985] Optimize the segment-timestamp file clean up

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3925:
URL: https://github.com/apache/carbondata/pull/3925#issuecomment-693154382


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2349/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3925: [CARBONDATA-3985] Optimize the segment-timestamp file clean up

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3925:
URL: https://github.com/apache/carbondata/pull/3925#issuecomment-693152789


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4089/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] QiangCai commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-15 Thread GitBox


QiangCai commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693148903


   add to whitelist



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693134156


   Can one of the admins verify this patch?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Klaus-xjp opened a new pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-15 Thread GitBox


Klaus-xjp opened a new pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930


   … file system
   
### Why is this PR needed?  
   If a table is updated or a materialized view is created on S3 or Alluxio, the operation does not take effect: other tenants cannot see the change, which causes a data consistency problem.
   
### What changes were proposed in this PR?  
   In this PR, we add two functions to fix this problem.
   1. Create a new file on S3 and Alluxio as a tag marking the update (see the sketch below).
   2. Add a switch-case-like function to select the appropriate scenario, update or create.  
   
### Does this PR introduce any user interface change?  
   No  
   
### Is any new testcase added?  
   Yes  
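
   A minimal sketch of the marker-file idea described above, assuming the Hadoop FileSystem API; the class name, marker file name, and scheme checks are illustrative only, not the PR's actual code:

```java
// Illustrative only: emulate "set modified time" on object stores that ignore
// FileSystem.setTimes(), by touching a marker file that other tenants can observe.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class ModifiedTimeUpdater {
  // Marker file name is an assumption made for this example.
  private static final String TOUCH_FILE = ".modifiedTime";

  private ModifiedTimeUpdater() { }

  public static void setLastModified(Path path, long timestamp, Configuration conf)
      throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    String scheme = path.toUri().getScheme();
    if ("s3a".equals(scheme) || "alluxio".equals(scheme)) {
      // Object stores may silently ignore setTimes(); create (or overwrite) a small
      // marker file next to the target so the change becomes visible to others.
      Path marker = new Path(path.getParent(), TOUCH_FILE);
      fs.create(marker, true).close();
    } else {
      // HDFS and local file systems support setting the modification time directly.
      fs.setTimes(path, timestamp, -1);
    }
  }
}
```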
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Klaus-xjp closed pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system

2020-09-15 Thread GitBox


Klaus-xjp closed pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] QiangCai commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system

2020-09-15 Thread GitBox


QiangCai commented on pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693131340


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] QiangCai commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system

2020-09-15 Thread GitBox


QiangCai commented on pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693131254


   if the old file exists, how does it work?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Klaus-xjp opened a new pull request #3929: [CARBONDATA-3991] Fix the modified time failed problem

2020-09-15 Thread GitBox


Klaus-xjp opened a new pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929


   Why is this PR needed?
   If a table is updated or a materialized view is created on S3 or Alluxio, the operation does not take effect: other tenants cannot see the change, which causes a data consistency problem.
   
   What changes were proposed in this PR?
   In this PR, we add two functions to fix this problem.
   1. Create a new file on S3 and Alluxio as a tag marking the update.
   2. Add a switch-case-like function to select the appropriate scenario, update or create.
   
   Does this PR introduce any user interface change?
   No
   
   Is any new testcase added?
   Yes
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3929: [CARBONDATA-3991] Fix the modified time failed problem

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3929:
URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693124094


   Can one of the admins verify this patch?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3991) File system could not set modified time because don't override the settime function

2020-09-15 Thread jingpan xiong (Jira)
jingpan xiong created CARBONDATA-3991:
-

 Summary: File system could not set modified time because don't 
override the settime function
 Key: CARBONDATA-3991
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3991
 Project: CarbonData
  Issue Type: Bug
  Components: core
Affects Versions: 2.0.1
Reporter: jingpan xiong
 Fix For: 2.0.1


File systems such as S3 and Alluxio do not override the set-modified-time function, which causes problems for update and create-materialized-view operations. The bug does not raise an exception when the modified time is set, and may leave a null value as the modified time, which can lead to multi-tenancy and data consistency problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692853189


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2348/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692849005


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4088/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


akashrn5 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692834883


   @Karan980 the PR title shouldn't be so long, give a small but descriptive 
title and details you can explain in PR description.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r488807808



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {
+val exceptionCaught = intercept[RuntimeException] {
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING, address STRING, note STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+   |""".
+  stripMargin)
+  sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+}
+assert(exceptionCaught.getMessage.contains("LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to 
longStringColumn") {
+CarbonProperties.getInstance()

Review comment:
   this property value should be reset after this test case, else it will impact later test cases. Take the existing value, set the new value, then reset it to the old value
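
   A minimal sketch of the save/set/restore pattern being asked for, using the CarbonProperties calls quoted in the diff; the helper class and the literal default value "4,3" are assumptions for illustration:

```java
import org.apache.carbondata.core.constants.CarbonCommonConstants;
import org.apache.carbondata.core.util.CarbonProperties;

public final class CompactionThresholdFixture {
  private CompactionThresholdFixture() { }

  // Runs the test body with a temporary compaction threshold and always restores
  // the previous value, so later test cases are not affected.
  public static void withCompactionThreshold(String threshold, Runnable testBody) {
    CarbonProperties props = CarbonProperties.getInstance();
    // "4,3" is assumed here as the fallback default for the threshold property.
    String previous = props.getProperty(
        CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "4,3");
    props.addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, threshold);
    try {
      testBody.run();
    } finally {
      props.addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, previous);
    }
  }
}
```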

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -29,7 +31,9 @@ import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandExcepti
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.metadata.CarbonMetadata
 import org.apache.carbondata.core.metadata.datatype.DataTypes
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
 import org.apache.carbondata.core.util.CarbonProperties
+

Review comment:
   avoid unnecessary changes

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {
+val exceptionCaught = intercept[RuntimeException] {
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING, address STRING, note STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+   |""".
+  stripMargin)
+  sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+}
+assert(exceptionCaught.getMessage.contains("LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to longStringColumn") {
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | id INT, NAME STRING, description STRING
+   | ) STORED AS carbondata
+   | TBLPROPERTIES('RANGE_COLUMN'='Name')
+   |""".
+  stripMargin)
+sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+sql("insert into long_string_table select 1, 'ab', 'cool1'")
+sql("insert into long_string_table select 2, 'abc', 'cool2'")
+sql("ALTER TABLE long_string_table compact 'minor'")
+
+val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
+  CarbonCommonConstants.DATABASE_DEFAULT_NAME,
+  "long_string_table"
+)
+val absoluteTableIdentifier = carbonTable
+  .getAbsoluteTableIdentifier

Review comment:
   move this line above
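
   One hedged way the test could then assert that minor compaction produced a merged segment, using only APIs that appear elsewhere in this thread (SegmentStatusManager.readLoadMetadata, LoadMetadataDetails.getLoadName); the merged segment name "0.1" and the CarbonTablePath.getMetadataPath helper are assumptions:

```java
import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
import org.apache.carbondata.core.util.path.CarbonTablePath;

public final class CompactionAssertions {
  private CompactionAssertions() { }

  // Returns true if the table status file lists the merged segment (assumed name "0.1").
  public static boolean hasMergedSegment(CarbonTable carbonTable) {
    String metadataPath = CarbonTablePath.getMetadataPath(carbonTable.getTablePath());
    LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metadataPath);
    for (LoadMetadataDetails detail : details) {
      if ("0.1".equals(detail.getLoadName())) {
        return true;
      }
    }
    return false;
  }
}
```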





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692832532


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4086/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (CARBONDATA-3986) multiple issues during compaction and concurrent scenarios

2020-09-15 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3986.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> multiple issues during compaction and concurrent scenarios
> --
>
> Key: CARBONDATA-3986
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3986
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> multiple issues during compaction and concurrent scenarios
> a) When auto compaction or minor compaction is called multiple times, already-compacted segments were considered and compacted again, overwriting the existing files and segments
> b) Minor/auto compaction should skip segments of compaction level >= 2; currently only level = 2 segments are skipped
> c) When compaction fails, there is no need to call merge index
> d) At the executor, when the segment file or table status file fails to be written during the merge index event, the stale files need to be removed
> e) During partial load cleanup, segment folders are removed but the segment metadata files were not removed
> f) Some table status retry issues



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


asfgit closed pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692828938


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2346/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


akashrn5 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692826931


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692818581


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2345/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692817569


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4085/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692774538


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2347/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akkio-97 commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype

2020-09-15 Thread GitBox


akkio-97 commented on a change in pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#discussion_r488734849



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala
##
@@ -230,6 +230,37 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf
 }
   }
 
+  def buildOnlyBinary(rows: Int, sortColumns: Array[String], path : String): Any = {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akkio-97 commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype

2020-09-15 Thread GitBox


akkio-97 commented on a change in pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#discussion_r488734274



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java
##
@@ -78,6 +78,8 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle)
 HiveType colType = columnHandle.getHiveType();
 if (colType.equals(HiveType.HIVE_BOOLEAN)) {
   return DataTypes.BOOLEAN;
+} else if (colType.equals(HiveType.HIVE_BINARY)) {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r488723181



##
File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
##
@@ -3344,6 +3344,9 @@ public int compare(org.apache.carbondata.format.ColumnSchema o1,
 // for setting long string columns
 if (!longStringColumnsString.isEmpty()) {
   String[] inputColumns = longStringColumnsString.split(",");
+  for (int i = 0; i < inputColumns.length; i++) {

Review comment:
   instead of a for loop, you can use `Arrays.stream(inputColumns).map(longColumn => longColumn.trim().toLowerCase()).toArray(String[]::new)` in a more functional manner
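
   Since CarbonUtil.java is Java, the lambda arrow would be `->`; a self-contained sketch of that suggestion (the wrapping class and method name are made up for illustration):

```java
import java.util.Arrays;

public final class LongStringColumnNames {
  private LongStringColumnNames() { }

  // Trims and lower-cases each configured long string column name,
  // mirroring what the for loop in the diff is doing.
  public static String[] normalize(String longStringColumnsString) {
    String[] inputColumns = longStringColumnsString.split(",");
    return Arrays.stream(inputColumns)
        .map(longColumn -> longColumn.trim().toLowerCase())
        .toArray(String[]::new);
  }
}
```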





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Karan980 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


Karan980 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r488712088



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +110,21 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


ajantha-bhat commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692740141


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692739041


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4084/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (CARBONDATA-3983) SI compatability issue

2020-09-15 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3983.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> SI compatability issue
> --
>
> Key: CARBONDATA-3983
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3983
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Reading from a main table that has an SI returns an empty result set when the SI is stored in the old tuple id storage format.
> Bug id: BUG2020090205414
> PR link: https://github.com/apache/carbondata/pull/3922



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


asfgit closed pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


ShreelekhyaG commented on a change in pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#discussion_r488687445



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
##
@@ -59,13 +61,23 @@ public ImplicitExpression(Map<String, Set<Integer>> blockIdToBlockletIdMapping)
 
   private void addBlockEntry(String blockletPath) {
 String blockId =
-blockletPath.substring(0, blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR));
+blockletPath.substring(0, blockletPath.lastIndexOf(File.separator));
+// Check if blockletPath contains old tuple id format, and convert it to compatible format.

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


akashrn5 commented on pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692730001


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692728976


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2343/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692728352


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4083/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3990) Fix DropCache log error when indexmap is null

2020-09-15 Thread Indhumathi Muthumurugesh (Jira)
Indhumathi Muthumurugesh created CARBONDATA-3990:


 Summary: Fix DropCache log error  when indexmap is null
 Key: CARBONDATA-3990
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3990
 Project: CarbonData
  Issue Type: Bug
Reporter: Indhumathi Muthumurugesh






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3980) Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3980.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Load fails with aborted exception when Bad records action is unspecified
> 
>
> Key: CARBONDATA-3980
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3980
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When the partition column is loaded with a bad record value, the load fails with a 'Job aborted' message in the cluster. However, the complete stack trace does show the actual error message ('Data load failed due to bad record: The value with column name projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type').
> Bug id: BUG2020082802430
> PR link: https://github.com/apache/carbondata/pull/3919
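
The kind of unwrapping that surfaces the underlying bad-record message instead of the generic 'Job aborted' can be sketched as below; this is an illustration only, not the fix in PR #3919:

```java
public final class ErrorMessages {
  private ErrorMessages() { }

  // Walks the cause chain and returns the innermost (most specific) message.
  public static String rootMessage(Throwable error) {
    Throwable current = error;
    while (current.getCause() != null && current.getCause() != current) {
      current = current.getCause();
    }
    return current.getMessage();
  }
}
```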



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


asfgit closed pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] asfgit closed pull request #3921: [CARBONDATA-3928] Removed records from exception message.

2020-09-15 Thread GitBox


asfgit closed pull request #3921:
URL: https://github.com/apache/carbondata/pull/3921


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ajantha-bhat commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692670191


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#discussion_r488608473



##
File path: 
core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java
##
@@ -233,15 +234,24 @@ public String writeMergeIndexFileBasedOnSegmentFile(String segmentId,
 }
 if (FileFactory.getCarbonFile(entry.getKey()).equals(FileFactory.getCarbonFile(location))) {
   segment.getValue().setMergeFileName(mergeIndexFile);
-  segment.getValue().setFiles(new HashSet());
+  mergeIndexFiles
+  .add(entry.getKey() + CarbonCommonConstants.FILE_SEPARATOR + mergeIndexFile);
+  segment.getValue().setFiles(new HashSet<>());
   break;
 }
   }
   if (table.isHivePartitionTable()) {
 for (PartitionSpec partitionSpec : partitionSpecs) {
   if (partitionSpec.getLocation().toString().equals(partitionPath)) {
-SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath,
-segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true);
+try {
+  SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath,
+  segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true);
+} catch (Exception ex) {
+  // delete merge index file if created,
+  // keep only index files as segment file writing is failed

Review comment:
   ok

##
File path: 
core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java
##
@@ -251,9 +261,29 @@ public String writeMergeIndexFileBasedOnSegmentFile(String segmentId,
 String path = CarbonTablePath.getSegmentFilesLocation(table.getTablePath())
 + CarbonCommonConstants.FILE_SEPARATOR + newSegmentFileName;
 if (!table.isHivePartitionTable()) {
-  SegmentFileStore.writeSegmentFile(segmentFileStore.getSegmentFile(), path);
-  SegmentFileStore.updateTableStatusFile(table, segmentId, newSegmentFileName,
+  String content = SegmentStatusManager.readFileAsString(path);
+  try {
+SegmentFileStore.writeSegmentFile(segmentFileStore.getSegmentFile(), path);
+  } catch (Exception ex) {
+// delete merge index file if created,

Review comment:
   ok
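
   The pattern in these hunks (read the old table status content, attempt the write, restore on failure) can be sketched generically as below; this uses plain java.nio for illustration and is not the actual CarbonData file API:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public final class SafeOverwrite {
  private SafeOverwrite() { }

  // Overwrites the file, restoring the previous content if the write fails.
  public static void writeWithRollback(Path file, String newContent) throws IOException {
    // Back up the existing content before touching the file.
    String previous = Files.exists(file)
        ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
        : null;
    try {
      Files.write(file, newContent.getBytes(StandardCharsets.UTF_8));
    } catch (IOException ex) {
      // Restore the backup so readers never see a half-written status file.
      if (previous != null) {
        Files.write(file, previous.getBytes(StandardCharsets.UTF_8));
      }
      throw ex;
    }
  }
}
```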





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3928:
URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692663663


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2341/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3928:
URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692661529


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4081/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488595453



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter
 }
   }
 
+  test("test load with partition column having bad record value") {
+sql("drop table if exists dataloadOptionTests")
+sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " +
+  "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " +
+  "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " +
+  "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+val csvFilePath = s"$resourcesPath/data.csv"
+val ex = intercept[Exception] {
+  sql("LOAD DATA local inpath '" + csvFilePath +
+  "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," +
+  "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-','timestampformat'='DD-MM-')");
+}
+assert(ex.getMessage.contains(
+  "DataLoad failure: Data load failed due to bad record: The value with column name " +
+  "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " +
+  "enable bad record logger to know the detail reason."))
+  }

Review comment:
   Ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692654382


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2340/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692645252


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4080/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692642655


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4079/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#discussion_r488574809



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
##
@@ -110,6 +110,21 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("cannot alter sort_columns dataType to long_string_columns") {

Review comment:
   please add test case for compaction failure case





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692638683


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2339/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692631719


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4078/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3920:
URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692631316


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2338/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692629619


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4077/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692594093


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2336/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3923:
URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692593886


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4076/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692593687


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2337/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692584299


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2335/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#discussion_r488511466



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
##
@@ -64,45 +68,85 @@ public static void deletePartialLoadDataIfExist(CarbonTable 
carbonTable,
   if (allSegments == null || allSegments.length == 0) {
 return;
   }
-  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
-  // there is no segment or failed to read tablestatus file.
-  // so it should stop immediately.
-  if (details == null || details.length == 0) {
-return;
-  }
-  Set<String> metadataSet = new HashSet<>(details.length);
-  for (LoadMetadataDetails detail : details) {
-metadataSet.add(detail.getLoadName());
-  }
-  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
-  for (CarbonFile segment : allSegments) {
-String segmentName = segment.getName();
-// check segment folder pattern
-if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) {
-  String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE);
-  if (parts.length == 2) {
-boolean isOriginal = !parts[1].contains(".");
-if (isCompactionFlow) {
-  // in compaction flow, it should be big segment and segment 
metadata is not exists
-  if (!isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
-  }
-} else {
-  // in loading flow, it should be original segment and segment 
metadata is not exists
-  if (isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
+  int retryCount = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK,
+  
CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT);
+  int maxTimeout = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK,
+  CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT);
+  ICarbonLock carbonTableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), 
LockUsage.TABLE_STATUS_LOCK);
+  try {
+if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) {
+  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
+  // there is no segment or failed to read tablestatus file.
+  // so it should stop immediately.
+  if (details == null || details.length == 0) {
+return;
+  }
+  Set<String> metadataSet = new HashSet<>(details.length);
+  for (LoadMetadataDetails detail : details) {
+metadataSet.add(detail.getLoadName());
+  }
+  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
+  Set<String> staleSegmentsId = new HashSet<>(allSegments.length);
+  for (CarbonFile segment : allSegments) {
+String segmentName = segment.getName();
+// check segment folder pattern
+if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) {
+  String[] parts = 
segmentName.split(CarbonCommonConstants.UNDERSCORE);
+  if (parts.length == 2) {
+boolean isOriginal = !parts[1].contains(".");
+if (isCompactionFlow) {
+  // in compaction flow,
+  // it should be merged segment and segment metadata doesn't 
exists
+  if (!isOriginal && !metadataSet.contains(parts[1])) {
+staleSegments.add(segment);
+staleSegmentsId.add(parts[1]);
+  }
+} else {
+  // in loading flow,
+  // it should be original segment and segment metadata 
doesn't exists
+  if (isOriginal && !metadataSet.contains(parts[1])) {
+staleSegments.add(segment);
+staleSegmentsId.add(parts[1]);
+  }
+}
   }
 }
   }
+  // delete segment folders one by one
+  for (CarbonFile staleSegment : staleSegments) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(staleSegment);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error("Unable to delete the given path :: " + 
e.getMessage(), e);
+}
+  }
+  if (staleSegments.size() > 0) {
+// get the segment metadata path
+String segmentFilesLocation =
+
CarbonTablePath.getSegmentFilesLocation(carbonTable.getTablePath());
+// delete the segment metadata files also
+
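
   The pattern introduced in the diff above, reduced to its essentials: take the
table status lock with a bounded number of retries, do the stale-segment cleanup,
and always release the lock. The RetryingLock interface and the names in this
sketch are illustrative stand-ins, not CarbonData's ICarbonLock API:

   interface RetryingLock {
     boolean lockWithRetries(int retryCount, int maxTimeoutSeconds);
     boolean unlock();
   }

   final class StaleSegmentCleanupSketch {
     private static final java.util.logging.Logger LOGGER =
         java.util.logging.Logger.getLogger(StaleSegmentCleanupSketch.class.getName());

     static void cleanWithTableStatusLock(RetryingLock tableStatusLock, Runnable cleanAction) {
       boolean locked = false;
       try {
         // Bounded retries so concurrent loads/compactions are not blocked forever.
         locked = tableStatusLock.lockWithRetries(3, 5);
         if (!locked) {
           LOGGER.info("Could not acquire table status lock; skipping stale segment cleanup.");
           return;
         }
         cleanAction.run();
       } finally {
         if (locked) {
           tableStatusLock.unlock();
         }
       }
     }
   }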

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692581967


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4075/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


ShreelekhyaG commented on a change in pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#discussion_r488507475



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
##
@@ -59,13 +60,22 @@ public ImplicitExpression(Map<String, Set<Integer>> blockIdToBlockletIdMapping)
 
   private void addBlockEntry(String blockletPath) {
 String blockId =
-blockletPath.substring(0, 
blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR));
+blockletPath.substring(0, blockletPath.lastIndexOf(File.separator));
+// Check if blockletPath contains old tuple id format, and convert it to 
compatible format.
+if (blockId.contains("batchno")) {

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#discussion_r488507700



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
##
@@ -59,13 +61,23 @@ public ImplicitExpression(Map<String, Set<Integer>> blockIdToBlockletIdMapping)
 
   private void addBlockEntry(String blockletPath) {
 String blockId =
-blockletPath.substring(0, 
blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR));
+blockletPath.substring(0, blockletPath.lastIndexOf(File.separator));
+// Check if blockletPath contains old tuple id format, and convert it to 
compatible format.

Review comment:
   @ShreelekhyaG can you please mention the old and new tuple ID formats here in 
the comment, so that it is easier for developers and reviewers to understand.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


akashrn5 commented on a change in pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#discussion_r488503684



##
File path: 
core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java
##
@@ -233,15 +234,24 @@ public String 
writeMergeIndexFileBasedOnSegmentFile(String segmentId,
 }
 if 
(FileFactory.getCarbonFile(entry.getKey()).equals(FileFactory.getCarbonFile(location)))
 {
   segment.getValue().setMergeFileName(mergeIndexFile);
-  segment.getValue().setFiles(new HashSet<String>());
+  mergeIndexFiles
+  .add(entry.getKey() + CarbonCommonConstants.FILE_SEPARATOR + 
mergeIndexFile);
+  segment.getValue().setFiles(new HashSet<>());
   break;
 }
   }
   if (table.isHivePartitionTable()) {
 for (PartitionSpec partitionSpec : partitionSpecs) {
   if (partitionSpec.getLocation().toString().equals(partitionPath)) {
-SegmentFileStore.writeSegmentFile(table.getTablePath(), 
mergeIndexFile, partitionPath,
-segmentId + "_" + uuid + "", partitionSpec.getPartitions(), 
true);
+try {
+  SegmentFileStore.writeSegmentFile(table.getTablePath(), 
mergeIndexFile, partitionPath,
+  segmentId + "_" + uuid + "", partitionSpec.getPartitions(), 
true);
+} catch (Exception ex) {
+  // delete merge index file if created,
+  // keep only index files as segment file writing is failed

Review comment:
   What I meant to say is: here we can log an error saying that writing the 
segment file failed, and then throw the exception. We only get the plain IO 
exception here, not an exception with a custom message, so adding that log line 
will make any future analysis easier.
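
   A minimal sketch of the log-then-rethrow pattern being asked for here; the
class, logger, and message below are placeholders, not the actual
CarbonIndexFileMergeWriter code:

   import java.io.IOException;
   import java.util.logging.Level;
   import java.util.logging.Logger;

   final class SegmentFileWriteSketch {
     private static final Logger LOGGER = Logger.getLogger(SegmentFileWriteSketch.class.getName());

     static void writeSegmentFileOrFail(Runnable writeSegmentFile) throws IOException {
       try {
         writeSegmentFile.run();
       } catch (RuntimeException ex) {
         // Log which step failed before rethrowing, since the raw exception alone
         // carries no custom message about segment file writing.
         LOGGER.log(Level.SEVERE, "Writing segment file failed after merge index creation", ex);
         throw new IOException("Writing segment file failed after merge index creation", ex);
       }
     }
   }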

##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
##
@@ -64,45 +68,85 @@ public static void deletePartialLoadDataIfExist(CarbonTable 
carbonTable,
   if (allSegments == null || allSegments.length == 0) {
 return;
   }
-  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
-  // there is no segment or failed to read tablestatus file.
-  // so it should stop immediately.
-  if (details == null || details.length == 0) {
-return;
-  }
-  Set<String> metadataSet = new HashSet<>(details.length);
-  for (LoadMetadataDetails detail : details) {
-metadataSet.add(detail.getLoadName());
-  }
-  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
-  for (CarbonFile segment : allSegments) {
-String segmentName = segment.getName();
-// check segment folder pattern
-if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) {
-  String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE);
-  if (parts.length == 2) {
-boolean isOriginal = !parts[1].contains(".");
-if (isCompactionFlow) {
-  // in compaction flow, it should be big segment and segment 
metadata is not exists
-  if (!isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
-  }
-} else {
-  // in loading flow, it should be original segment and segment 
metadata is not exists
-  if (isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
+  int retryCount = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK,
+  
CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT);
+  int maxTimeout = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK,
+  CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT);
+  ICarbonLock carbonTableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), 
LockUsage.TABLE_STATUS_LOCK);
+  try {
+if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) {
+  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
+  // there is no segment or failed to read tablestatus file.
+  // so it should stop immediately.
+  if (details == null || details.length == 0) {
+return;
+  }
+  Set<String> metadataSet = new HashSet<>(details.length);
+  for (LoadMetadataDetails detail : details) {
+metadataSet.add(detail.getLoadName());
+  }
+  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
+  Set<String> staleSegmentsId = new HashSet<>(allSegments.length);
+  for (CarbonFile segment : allSegments) {
+String segmentName = segment.getName();
+// check segment folder pattern
+if 

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692563259







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3928:
URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692562871


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2333/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null

2020-09-15 Thread GitBox


CarbonDataQA1 commented on pull request #3928:
URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692562664


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4072/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] marchpure commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied t

2020-09-15 Thread GitBox


marchpure commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r488464250



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, 
LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, 
SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object CleanFilesUtil {
+  private val LOGGER = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableCLean <true> and lean garbage segment
+   * (MARKED_FOR_DELETE state) if forceTableCLean <false>
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable Object <null> in case of force clean
+   * @param forceTableClean: <true> for force clean it will delete all data
+   *<false> it will clean garbage segment (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive Partitions  details
+   */
+  def cleanFiles(
+  dbName: String,
+  tableName: String,
+  tablePath: String,
+  carbonTable: CarbonTable,
+  forceTableClean: Boolean,
+  currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+  truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files 
" +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {

Review comment:
   forceTableClean is too drastic, please delete it.

##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, 

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3921:
URL: https://github.com/apache/carbondata/pull/3921#discussion_r488463308



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
##
@@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws 
CarbonDataLoadingException {
 .getTableProperties();
 String spatialProperty = 
properties.get(CarbonCommonConstants.SPATIAL_INDEX);
 boolean isSpatialColumn = false;
-Object[] rawData = row.getRawData();
-if (rawData == null) {
-  rawData = row.getData() == null ? null : row.getData().clone();

Review comment:
   Ok then.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3921: [CARBONDATA-3928] Removed records from exception message.

2020-09-15 Thread GitBox


ajantha-bhat commented on pull request #3921:
URL: https://github.com/apache/carbondata/pull/3921#issuecomment-692538365


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] nihal0107 commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.

2020-09-15 Thread GitBox


nihal0107 commented on a change in pull request #3921:
URL: https://github.com/apache/carbondata/pull/3921#discussion_r488461592



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
##
@@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws 
CarbonDataLoadingException {
 .getTableProperties();
 String spatialProperty = 
properties.get(CarbonCommonConstants.SPATIAL_INDEX);
 boolean isSpatialColumn = false;
-Object[] rawData = row.getRawData();
-if (rawData == null) {
-  rawData = row.getData() == null ? null : row.getData().clone();

Review comment:
   I had added this only because at that time we wanted rawData for every bad 
record action. Now we need it only when the bad record logger is enabled or the 
action is REDIRECT, and in those cases rawData will always be available.
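
   Roughly, the condition described above, with illustrative names rather than
the actual RowConverterImpl fields:

   // Raw data is needed only when the bad-record logger is enabled or the
   // bad-records action is REDIRECT.
   enum BadRecordsAction { FORCE, REDIRECT, IGNORE, FAIL }

   final class RawDataNeededSketch {
     static boolean isRawDataNeeded(boolean badRecordLoggerEnabled, BadRecordsAction action) {
       return badRecordLoggerEnabled || action == BadRecordsAction.REDIRECT;
     }
   }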





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3921:
URL: https://github.com/apache/carbondata/pull/3921#discussion_r488454458



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java
##
@@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws 
CarbonDataLoadingException {
 .getTableProperties();
 String spatialProperty = 
properties.get(CarbonCommonConstants.SPATIAL_INDEX);
 boolean isSpatialColumn = false;
-Object[] rawData = row.getRawData();
-if (rawData == null) {
-  rawData = row.getData() == null ? null : row.getData().clone();

Review comment:
   Please revert this change. In some cases rawData will not be set, which is 
why it is set here before converting the row. With the current code, null would 
be used instead of row.getData().clone() in that scenario.
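
   For reference, the removed fallback quoted in the diff above boils down to
the following; this sketch uses plain Object[] arguments in place of CarbonRow
so it stands on its own:

   final class RawDataFallbackSketch {
     // Prefer rawData when it was set; otherwise fall back to a copy of the converted data.
     static Object[] rawDataOrCopy(Object[] rawData, Object[] data) {
       if (rawData == null) {
         return (data == null) ? null : data.clone();
       }
       return rawData;
     }
   }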





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449347



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends 
QueryTest with BeforeAndAfter
 }
   }
 
+  test("test load with partition column having bad record value") {
+sql("drop table if exists dataloadOptionTests")
+sql("CREATE TABLE dataloadOptionTests (empno int, empname String, 
designation String, " +
+  "workgroupcategory int, workgroupcategoryname String, deptno int, 
projectjoindate " +
+  "Timestamp, projectenddate Date,attendance int,utilization int,salary 
int) PARTITIONED BY " +
+  "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+val csvFilePath = s"$resourcesPath/data.csv"
+val ex = intercept[Exception] {

Review comment:
   please intercept RuntimeException only





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449175



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala
##
@@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends 
QueryTest with BeforeAndAfter
 }
   }
 
+  test("test load with partition column having bad record value") {
+sql("drop table if exists dataloadOptionTests")
+sql("CREATE TABLE dataloadOptionTests (empno int, empname String, 
designation String, " +
+  "workgroupcategory int, workgroupcategoryname String, deptno int, 
projectjoindate " +
+  "Timestamp, projectenddate Date,attendance int,utilization int,salary 
int) PARTITIONED BY " +
+  "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ")
+val csvFilePath = s"$resourcesPath/data.csv"
+val ex = intercept[Exception] {
+  sql("LOAD DATA local inpath '" + csvFilePath +
+  "' INTO TABLE dataloadOptionTests OPTIONS 
('bad_records_action'='FAIL', 'DELIMITER'= '," +
+  "', 'QUOTECHAR'= '\"', 
'dateformat'='DD-MM-','timestampformat'='DD-MM-')");
+}
+assert(ex.getMessage.contains(
+  "DataLoad failure: Data load failed due to bad record: The value with 
column name " +
+  "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP 
type.Please " +
+  "enable bad record logger to know the detail reason."))
+  }

Review comment:
   please drop the table here





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jxxfxkp commented on pull request #3662: [TEMP] support prestosql 330 in carbon

2020-09-15 Thread GitBox


jxxfxkp commented on pull request #3662:
URL: https://github.com/apache/carbondata/pull/3662#issuecomment-692525522


   I made sure to cherry-pick this PR. Some classes cannot find symbols 
(OrcFileWriterConfig -> OrcWriterConfig, io.prestosql.plugin.hive -> 
io.prestosql.plugin.hive.orc, ...). Although I can change them, I still get some 
presto-server start errors (Presto 330).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios

2020-09-15 Thread GitBox


ajantha-bhat commented on a change in pull request #3871:
URL: https://github.com/apache/carbondata/pull/3871#discussion_r488444308



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
##
@@ -64,45 +68,89 @@ public static void deletePartialLoadDataIfExist(CarbonTable 
carbonTable,
   if (allSegments == null || allSegments.length == 0) {
 return;
   }
-  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
-  // there is no segment or failed to read tablestatus file.
-  // so it should stop immediately.
-  if (details == null || details.length == 0) {
-return;
-  }
-  Set<String> metadataSet = new HashSet<>(details.length);
-  for (LoadMetadataDetails detail : details) {
-metadataSet.add(detail.getLoadName());
-  }
-  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
-  for (CarbonFile segment : allSegments) {
-String segmentName = segment.getName();
-// check segment folder pattern
-if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) {
-  String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE);
-  if (parts.length == 2) {
-boolean isOriginal = !parts[1].contains(".");
-if (isCompactionFlow) {
-  // in compaction flow, it should be big segment and segment 
metadata is not exists
-  if (!isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
+  int retryCount = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK,
+  
CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT);
+  int maxTimeout = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK,
+  CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT);
+  ICarbonLock carbonTableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), 
LockUsage.TABLE_STATUS_LOCK);
+  try {
+if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) {
+  LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation);
+  // there is no segment or failed to read tablestatus file.
+  // so it should stop immediately.
+  if (details == null || details.length == 0) {
+return;
+  }
+  Set<String> metadataSet = new HashSet<>(details.length);
+  for (LoadMetadataDetails detail : details) {
+metadataSet.add(detail.getLoadName());
+  }
+  List<CarbonFile> staleSegments = new ArrayList<>(allSegments.length);
+  Set<String> staleSegmentsId = new HashSet<>(allSegments.length);
+  for (CarbonFile segment : allSegments) {
+String segmentName = segment.getName();
+// check segment folder pattern
+if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) {
+  String[] parts = 
segmentName.split(CarbonCommonConstants.UNDERSCORE);
+  if (parts.length == 2) {
+boolean isOriginal = !parts[1].contains(".");
+if (isCompactionFlow) {
+  // in compaction flow, it should be big segment and segment 
metadata is not exists
+  if (!isOriginal && !metadataSet.contains(parts[1])) {
+staleSegments.add(segment);
+staleSegmentsId.add(parts[1]);
+  }
+} else {
+  // in loading flow,
+  // it should be original segment and segment metadata is not 
exists
+  if (isOriginal && !metadataSet.contains(parts[1])) {
+staleSegments.add(segment);
+staleSegmentsId.add(parts[1]);
+  }
+}
   }
-} else {
-  // in loading flow, it should be original segment and segment 
metadata is not exists
-  if (isOriginal && !metadataSet.contains(parts[1])) {
-staleSegments.add(segment);
+}
+  }
+  // delete segment folders one by one
+  for (CarbonFile staleSegment : staleSegments) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(staleSegment);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error("Unable to delete the given path :: " + 
e.getMessage(), e);
+}
+  }
+  if (staleSegments.size() > 0) {
+// collect the segment metadata path
+String segmentFilesLocation =
+
CarbonTablePath.getSegmentFilesLocation(carbonTable.getTablePath());
+CarbonFile[] allSegmentMetadataFiles =
+

[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI

2020-09-15 Thread GitBox


Karan980 commented on pull request #3876:
URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692516342


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488423451



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
##
@@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: 
Option[String],
 if (isUpdateTableStatusRequired) {
   CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid)
 }
-throw ex
+val errorMessage = operationContext.getProperty("Error message")
+if (errorMessage != null) {
+  throw new Exception(errorMessage.toString, ex.getCause)

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified

2020-09-15 Thread GitBox


ShreelekhyaG commented on a change in pull request #3919:
URL: https://github.com/apache/carbondata/pull/3919#discussion_r488422763



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala
##
@@ -1064,6 +1064,7 @@ object CommonLoadUtils {
 if (loadParams.updateModel.isDefined) {
   CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, 
executorMessage)
 }

Review comment:
   Added testcase 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 opened a new pull request #3928: [WIP]Fix DropCache log error when indexmap is null

2020-09-15 Thread GitBox


Indhumathi27 opened a new pull request #3928:
URL: https://github.com/apache/carbondata/pull/3928


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] nihal0107 commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue

2020-09-15 Thread GitBox


nihal0107 commented on a change in pull request #3922:
URL: https://github.com/apache/carbondata/pull/3922#discussion_r488414094



##
File path: 
core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
##
@@ -59,13 +60,22 @@ public ImplicitExpression(Map<String, Set<Integer>> blockIdToBlockletIdMapping)
 
   private void addBlockEntry(String blockletPath) {
 String blockId =
-blockletPath.substring(0, 
blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR));
+blockletPath.substring(0, blockletPath.lastIndexOf(File.separator));
+// Check if blockletPath contains old tuple id format, and convert it to 
compatible format.
+if (blockId.contains("batchno")) {

Review comment:
   Please put the strings in the constants file and then use them.
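
   In outline, that suggestion looks like the sketch below. The constant holder
and its name are hypothetical; the real change would add the constant to an
existing constants class:

   // Hypothetical constant holder; in the real fix the constant would live in
   // an existing constants class rather than a new one.
   final class TupleIdConstantsSketch {
     static final String BATCH_NO_IDENTIFIER = "batchno";
     private TupleIdConstantsSketch() { }
   }

   final class TupleIdFormatCheckSketch {
     // Old-format tuple ids carry the batch number marker inside the block id.
     static boolean isOldTupleIdFormat(String blockId) {
       return blockId.contains(TupleIdConstantsSketch.BATCH_NO_IDENTIFIER);
     }
   }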





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (CARBONDATA-3952) After reset query not hitting MV

2020-09-15 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA closed CARBONDATA-3952.
-
Resolution: Fixed

> After reset query not hitting MV
> 
>
> Key: CARBONDATA-3952
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3952
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> After reset query not hitting MV.
> With the reset, spark.sql.warehouse.dir and carbonStorePath don't match and 
> the databaseLocation will change to old table path format. So, new tables 
> that are created after reset, take a different path incase of default.
> Closing this , as it is identified as spark bug. More details can be found at 
> https://issues.apache.org/jira/browse/SPARK-31234



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] ShreelekhyaG closed pull request #3890: [CARBONDATA-3952] After reset query not hitting MV

2020-09-15 Thread GitBox


ShreelekhyaG closed pull request #3890:
URL: https://github.com/apache/carbondata/pull/3890


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org