[GitHub] [carbondata] Karan980 commented on pull request #3923: [CARBONDATA-3984] 1. Fix compaction issue. 2. Fix longStringColumn validation issue
Karan980 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-693189676 > @Karan980 the PR title shouldn't be so long, give a small but descriptive title and details you can explain in PR description. Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] Karan980 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Fix compaction issue. 2. Fix longStringColumn validation issue
Karan980 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r489180454

## File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java ## @@ -3344,6 +3344,9 @@ public int compare(org.apache.carbondata.format.ColumnSchema o1,

      // for setting long string columns
      if (!longStringColumnsString.isEmpty()) {
        String[] inputColumns = longStringColumnsString.split(",");
+       for (int i = 0; i < inputColumns.length; i++) {

Review comment: Done

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi

    assert(exceptionCaught.getMessage.contains("its data type is not string"))
  }

+  test("cannot alter sort_columns dataType to long_string_columns") {
+    val exceptionCaught = intercept[RuntimeException] {
+      sql(
+        s"""
+           | CREATE TABLE if not exists $longStringTable(
+           | id INT, NAME STRING, description STRING, address STRING, note STRING
+           | ) STORED AS carbondata
+           | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+           |""".stripMargin)
+      sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+    }
+    assert(exceptionCaught.getMessage.contains(
+      "LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to longStringColumn") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+    sql(
+      s"""
+         | CREATE TABLE if not exists $longStringTable(
+         | id INT, NAME STRING, description STRING
+         | ) STORED AS carbondata
+         | TBLPROPERTIES('RANGE_COLUMN'='Name')
+         |""".stripMargin)
+    sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+    sql("insert into long_string_table select 1, 'ab', 'cool1'")
+    sql("insert into long_string_table select 2, 'abc', 'cool2'")
+    sql("ALTER TABLE long_string_table compact 'minor'")
+
+    val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
+      CarbonCommonConstants.DATABASE_DEFAULT_NAME,
+      "long_string_table"
+    )
+    val absoluteTableIdentifier = carbonTable
+      .getAbsoluteTableIdentifier

Review comment: Done

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +114,51 @@ (same hunk as above; this review thread is anchored at the `CarbonProperties.getInstance()` line)

Review comment: Done
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
CarbonDataQA1 commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693180722 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4091/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
CarbonDataQA1 commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693178308 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2350/
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
akashrn5 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r488807941

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -29,7 +31,9 @@

  import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandExcepti
  import org.apache.carbondata.core.constants.CarbonCommonConstants
  import org.apache.carbondata.core.metadata.CarbonMetadata
  import org.apache.carbondata.core.metadata.datatype.DataTypes
+ import org.apache.carbondata.core.statusmanager.SegmentStatusManager
  import org.apache.carbondata.core.util.CarbonProperties
+

Review comment: avoid unnecessary changes
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes
ajantha-bhat commented on a change in pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#discussion_r489158514

## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java ## @@ -78,16 +78,22 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle)

    HiveType colType = columnHandle.getHiveType();
    if (colType.equals(HiveType.HIVE_BOOLEAN)) {
      return DataTypes.BOOLEAN;
+   } else if (colType.equals(HiveType.HIVE_BINARY)) {
+     return DataTypes.BINARY;
    } else if (colType.equals(HiveType.HIVE_SHORT)) {
      return DataTypes.SHORT;
    } else if (colType.equals(HiveType.HIVE_INT)) {
      return DataTypes.INT;
+   } else if (colType.equals(HiveType.HIVE_FLOAT)) {

Review comment: also manually run new testcases once in prestodb profile
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes
ajantha-bhat commented on a change in pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#discussion_r489158120

## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java ## @@ -78,16 +78,22 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle)

    HiveType colType = columnHandle.getHiveType();
    if (colType.equals(HiveType.HIVE_BOOLEAN)) {
      return DataTypes.BOOLEAN;
+   } else if (colType.equals(HiveType.HIVE_BINARY)) {
+     return DataTypes.BINARY;
    } else if (colType.equals(HiveType.HIVE_SHORT)) {
      return DataTypes.SHORT;
    } else if (colType.equals(HiveType.HIVE_INT)) {
      return DataTypes.INT;
+   } else if (colType.equals(HiveType.HIVE_FLOAT)) {

Review comment: please handle the same file changes in prestodb also and compile prestodb profile once with latest changes
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system
CarbonDataQA1 commented on pull request #3929: URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693160345 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4090/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3925: [CARBONDATA-3985] Optimize the segment-timestamp file clean up
CarbonDataQA1 commented on pull request #3925: URL: https://github.com/apache/carbondata/pull/3925#issuecomment-693154382 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2349/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3925: [CARBONDATA-3985] Optimize the segment-timestamp file clean up
CarbonDataQA1 commented on pull request #3925: URL: https://github.com/apache/carbondata/pull/3925#issuecomment-693152789 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4089/
[GitHub] [carbondata] QiangCai commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
QiangCai commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693148903 add to whitelist
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
CarbonDataQA1 commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-693134156 Can one of the admins verify this patch?
[GitHub] [carbondata] Klaus-xjp opened a new pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
Klaus-xjp opened a new pull request #3930: URL: https://github.com/apache/carbondata/pull/3930 … file system

### Why is this PR needed?
If an update or a create MV operation runs on the S3 or Alluxio file system, the operation does not take effect, and other tenants cannot see the change, which causes a data consistency problem.

### What changes were proposed in this PR?
This PR adds two functions to fix the problem:
1. Create a new file on S3 and Alluxio that serves as a tag marking the update.
2. Add a switch-case-like function that selects the appropriate scenario for update or create.

### Does this PR introduce any user interface change?
No

### Is any new testcase added?
Yes
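The fallback described in point 1 (writing a tag file when the underlying store cannot set a modified time directly) can be sketched roughly as follows. This is an illustrative sketch only: the `touch` helper, the marker-file naming, and the use of `java.nio` in place of CarbonData's own file abstraction are all assumptions for illustration, not the actual PR code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class ModifiedTimeFallback {
    // Try to set the modified time directly; on stores that don't support it
    // (e.g. S3, Alluxio), write a small marker file whose content records the
    // intended timestamp instead, so other tenants can detect the update.
    static void touch(Path target, long millis) throws IOException {
        try {
            Files.setLastModifiedTime(target, FileTime.fromMillis(millis));
        } catch (UnsupportedOperationException | IOException e) {
            // Hypothetical marker-file convention: "<file>.modifiedTime"
            Path marker = target.resolveSibling(target.getFileName() + ".modifiedTime");
            Files.write(marker, Long.toString(millis).getBytes());
        }
    }

    public static void main(String[] args) throws IOException {
        // On a local file system the direct path succeeds.
        Path f = Files.createTempFile("carbon", ".status");
        touch(f, 1_600_000_000_000L);
        System.out.println(Files.getLastModifiedTime(f).toMillis());
    }
}
```

On a local file system the direct `setLastModifiedTime` call succeeds; only on object stores without mutable timestamps would the marker-file branch run.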
[GitHub] [carbondata] Klaus-xjp closed pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system
Klaus-xjp closed pull request #3929: URL: https://github.com/apache/carbondata/pull/3929
[GitHub] [carbondata] QiangCai commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system
QiangCai commented on pull request #3929: URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693131340 retest this please
[GitHub] [carbondata] QiangCai commented on pull request #3929: [CARBONDATA-3991] Fix the set modified time function on S3 and Alluxio file system
QiangCai commented on pull request #3929: URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693131254 if the old file exists, how does it work?
[GitHub] [carbondata] Klaus-xjp opened a new pull request #3929: [CARBONDATA-3991] Fix the modified time failed problem
Klaus-xjp opened a new pull request #3929: URL: https://github.com/apache/carbondata/pull/3929

Why is this PR needed?
If an update or a create MV operation runs on the S3 or Alluxio file system, the operation does not take effect, and other tenants cannot see the change, which causes a data consistency problem.

What changes were proposed in this PR?
This PR adds two functions to fix the problem:
1. Create a new file on S3 and Alluxio that serves as a tag marking the update.
2. Add a switch-case-like function that selects the appropriate scenario for update or create.

Does this PR introduce any user interface change?
No

Is any new testcase added?
Yes
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3929: [CARBONDATA-3991] Fix the modified time failed problem
CarbonDataQA1 commented on pull request #3929: URL: https://github.com/apache/carbondata/pull/3929#issuecomment-693124094 Can one of the admins verify this patch?
[jira] [Created] (CARBONDATA-3991) File system could not set modified time because don't override the settime function
jingpan xiong created CARBONDATA-3991: - Summary: File system could not set modified time because don't override the settime function Key: CARBONDATA-3991 URL: https://issues.apache.org/jira/browse/CARBONDATA-3991 Project: CarbonData Issue Type: Bug Components: core Affects Versions: 2.0.1 Reporter: jingpan xiong Fix For: 2.0.1

File systems such as S3 and Alluxio do not override the settime function, which causes problems for update and create MV operations. The bug does not raise an exception when setting the modified time and may leave a null modified time, which can lead to multi-tenant and data consistency problems.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes
CarbonDataQA1 commented on pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692853189 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2348/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes
CarbonDataQA1 commented on pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692849005 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4088/
[GitHub] [carbondata] akashrn5 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
akashrn5 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692834883 @Karan980 the PR title shouldn't be so long, give a small but descriptive title and details you can explain in PR description.
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
akashrn5 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r488807808

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +114,51 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi

    assert(exceptionCaught.getMessage.contains("its data type is not string"))
  }

+  test("cannot alter sort_columns dataType to long_string_columns") {
+    val exceptionCaught = intercept[RuntimeException] {
+      sql(
+        s"""
+           | CREATE TABLE if not exists $longStringTable(
+           | id INT, NAME STRING, description STRING, address STRING, note STRING
+           | ) STORED AS carbondata
+           | TBLPROPERTIES('SORT_COLUMNS'='name, address')
+           |""".stripMargin)
+      sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+    }
+    assert(exceptionCaught.getMessage.contains(
+      "LONG_STRING_COLUMNS cannot be present in sort columns: name"))
+  }
+
+  test("check compaction after altering range column dataType to longStringColumn") {
+    CarbonProperties.getInstance()

Review comment: this property value should be reset after this test case, else it will impact later test cases. Take the existing value, set the new value, then reset to the old value.

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -29,7 +31,9 @@

  import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandExcepti
  import org.apache.carbondata.core.constants.CarbonCommonConstants
  import org.apache.carbondata.core.metadata.CarbonMetadata
  import org.apache.carbondata.core.metadata.datatype.DataTypes
+ import org.apache.carbondata.core.statusmanager.SegmentStatusManager
  import org.apache.carbondata.core.util.CarbonProperties
+

Review comment: avoid unnecessary changes

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +114,51 @@ (same hunk as the first thread above, continuing into the body of the compaction test:)

+      .addProperty(CarbonCommonConstants.COMPACTION_SEGMENT_LEVEL_THRESHOLD, "2,2")
+    sql(
+      s"""
+         | CREATE TABLE if not exists $longStringTable(
+         | id INT, NAME STRING, description STRING
+         | ) STORED AS carbondata
+         | TBLPROPERTIES('RANGE_COLUMN'='Name')
+         |""".stripMargin)
+    sql("ALTER TABLE long_string_table SET TBLPROPERTIES('long_String_columns'='NAME')")
+    sql("insert into long_string_table select 1, 'ab', 'cool1'")
+    sql("insert into long_string_table select 2, 'abc', 'cool2'")
+    sql("ALTER TABLE long_string_table compact 'minor'")
+
+    val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
+      CarbonCommonConstants.DATABASE_DEFAULT_NAME,
+      "long_string_table"
+    )
+    val absoluteTableIdentifier = carbonTable
+      .getAbsoluteTableIdentifier

Review comment: move this line above
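The reviewer's save/set/restore advice for test-scoped configuration is a standard pattern. A minimal sketch, using `java.util.Properties` as a stand-in for `CarbonProperties` and a hypothetical property key, could look like:

```java
import java.util.Properties;

public class PropertyReset {
    public static void main(String[] args) {
        // Stand-in for the global CarbonProperties singleton.
        Properties props = new Properties();
        props.setProperty("carbon.compaction.level.threshold", "4,3");

        // 1. Take the existing value, 2. set the test value, 3. restore it
        // in a finally block so later tests are unaffected even on failure.
        String old = props.getProperty("carbon.compaction.level.threshold");
        props.setProperty("carbon.compaction.level.threshold", "2,2");
        try {
            // ... run the compaction test with the overridden threshold ...
        } finally {
            props.setProperty("carbon.compaction.level.threshold", old);
        }
        System.out.println(props.getProperty("carbon.compaction.level.threshold"));
    }
}
```

The `finally` block is the important part: it guarantees the old value is restored even if the test body throws, so shared configuration never leaks between test cases.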
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
CarbonDataQA1 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692832532 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4086/
[jira] [Resolved] (CARBONDATA-3986) multiple issues during compaction and concurrent scenarios
[ https://issues.apache.org/jira/browse/CARBONDATA-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3986. - Fix Version/s: 2.1.0 Resolution: Fixed

> multiple issues during compaction and concurrent scenarios
> Key: CARBONDATA-3986
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3986
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ajantha Bhat
> Assignee: Ajantha Bhat
> Priority: Major
> Fix For: 2.1.0
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> multiple issues during compaction and concurrent scenarios
> a) When auto compaction or minor compaction is called multiple times, it was considering already compacted segments and compacting them again, overwriting the files and segments
> b) Minor/auto compaction should skip >=2 level segments; currently only =2 level segments are skipped
> c) When compaction fails, there is no need to call merge index
> d) At the executor, when the segment file or table status file fails to write during the merge index event, the stale files need to be removed
> e) During partial load cleanup, segment folders are removed but segment metadata files were not removed
> f) Some table status retry issues
[GitHub] [carbondata] asfgit closed pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
asfgit closed pull request #3871: URL: https://github.com/apache/carbondata/pull/3871
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
CarbonDataQA1 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692828938 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2346/
[GitHub] [carbondata] akashrn5 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
akashrn5 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692826931 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
CarbonDataQA1 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692818581 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2345/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
CarbonDataQA1 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692817569 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4085/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary, byte and float datatypes
CarbonDataQA1 commented on pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692774538 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2347/
[GitHub] [carbondata] akkio-97 commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype
akkio-97 commented on a change in pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#discussion_r488734849 ## File path: integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoTestNonTransactionalTableFiles.scala ## @@ -230,6 +230,37 @@ class PrestoTestNonTransactionalTableFiles extends FunSuiteLike with BeforeAndAf } } + def buildOnlyBinary(rows: Int, sortColumns: Array[String], path : String): Any = { Review comment: done
[GitHub] [carbondata] akkio-97 commented on a change in pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype
akkio-97 commented on a change in pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#discussion_r488734274 ## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/PrestoFilterUtil.java ## @@ -78,6 +78,8 @@ private static DataType spi2CarbondataTypeMapper(HiveColumnHandle columnHandle) HiveType colType = columnHandle.getHiveType(); if (colType.equals(HiveType.HIVE_BOOLEAN)) { return DataTypes.BOOLEAN; +} else if (colType.equals(HiveType.HIVE_BINARY)) { Review comment: done
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
akashrn5 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r488723181 ## File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java ## @@ -3344,6 +3344,9 @@ public int compare(org.apache.carbondata.format.ColumnSchema o1, // for setting long string columns if (!longStringColumnsString.isEmpty()) { String[] inputColumns = longStringColumnsString.split(","); + for (int i = 0; i < inputColumns.length; i++) { Review comment: instead of forloop, you can use `Arrays.stream(inputColumns).map(longColumn => longColumn.trim().toLowerCase()).toArray(String[]::new)` in a more functional manner
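The reviewer's snippet mixes a Scala-style lambda (`longColumn => ...`) into a Java file; a compiling Java version of the same stream-based idea might look like this. The `normalize` helper is a self-contained illustration, not the actual `CarbonUtil` method:

```java
import java.util.Arrays;

public class TrimColumns {
    // Split the configured long_string_columns value on commas, then trim
    // and lower-case each column name, replacing the explicit for loop
    // with a stream pipeline.
    static String[] normalize(String longStringColumnsString) {
        return Arrays.stream(longStringColumnsString.split(","))
                .map(col -> col.trim().toLowerCase())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] cols = normalize(" NAME , Description");
        System.out.println(String.join("|", cols)); // name|description
    }
}
```

In Java the lambda arrow is `->`, and `toArray(String[]::new)` collects the stream back into a `String[]`.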
[GitHub] [carbondata] Karan980 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
Karan980 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r488712088 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +110,21 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi assert(exceptionCaught.getMessage.contains("its data type is not string")) } + test("cannot alter sort_columns dataType to long_string_columns") { Review comment: Done
[GitHub] [carbondata] ajantha-bhat commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
ajantha-bhat commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692740141 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
CarbonDataQA1 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692739041 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4084/
[jira] [Resolved] (CARBONDATA-3983) SI compatability issue
[ https://issues.apache.org/jira/browse/CARBONDATA-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3983. - Fix Version/s: 2.1.0 Resolution: Fixed > SI compatability issue > -- > > Key: CARBONDATA-3983 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3983 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Read from maintable having SI returns empty resultset when SI is stored with > old tuple id storage format. > Bug id: BUG2020090205414 > PR link: https://github.com/apache/carbondata/pull/3922 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] asfgit closed pull request #3922: [CARBONDATA-3983] SI compatability issue
asfgit closed pull request #3922: URL: https://github.com/apache/carbondata/pull/3922
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue
ShreelekhyaG commented on a change in pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#discussion_r488687445 ## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java ## @@ -59,13 +61,23 @@ public ImplicitExpression(Map> blockIdToBlockletIdMapping) private void addBlockEntry(String blockletPath) { String blockId = -blockletPath.substring(0, blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR)); +blockletPath.substring(0, blockletPath.lastIndexOf(File.separator)); +// Check if blockletPath contains old tuple id format, and convert it to compatible format. Review comment: Done
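The one-line change in the diff above (switching the separator constant used to derive the block id) can be illustrated in isolation. This is a minimal sketch, not the actual `ImplicitExpression` class; the path layout is illustrative:

```java
import java.io.File;

public class BlockIdExtract {
  // The block id is the blocklet path minus its last component. The diff
  // above splits on the platform separator (File.separator) instead of the
  // previously hard-coded CarbonCommonConstants.FILE_SEPARATOR.
  static String blockIdOf(String blockletPath) {
    return blockletPath.substring(0, blockletPath.lastIndexOf(File.separator));
  }
}
```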
[GitHub] [carbondata] akashrn5 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue
akashrn5 commented on pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692730001 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue
CarbonDataQA1 commented on pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692728976 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2343/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue
CarbonDataQA1 commented on pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692728352 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4083/
[jira] [Created] (CARBONDATA-3990) Fix DropCache log error when indexmap is null
Indhumathi Muthumurugesh created CARBONDATA-3990: Summary: Fix DropCache log error when indexmap is null Key: CARBONDATA-3990 URL: https://issues.apache.org/jira/browse/CARBONDATA-3990 Project: CarbonData Issue Type: Bug Reporter: Indhumathi Muthumurugesh
[jira] [Resolved] (CARBONDATA-3980) Load fails with aborted exception when Bad records action is unspecified
[ https://issues.apache.org/jira/browse/CARBONDATA-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajantha Bhat resolved CARBONDATA-3980. -- Fix Version/s: 2.1.0 Resolution: Fixed > Load fails with aborted exception when Bad records action is unspecified > > > Key: CARBONDATA-3980 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3980 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > When the partition column is loaded with a bad record value, load fails with > 'Job aborted' message in cluster. However in complete stack trace we can see > the actual error message. ('Data load failed due to bad record: The value > with column name projectjoindate and column data type TIMESTAMP is not a > valid TIMESTAMP type') > Bug id: BUG2020082802430 > PR link: https://github.com/apache/carbondata/pull/3919
[GitHub] [carbondata] asfgit closed pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
asfgit closed pull request #3919: URL: https://github.com/apache/carbondata/pull/3919
[GitHub] [carbondata] asfgit closed pull request #3921: [CARBONDATA-3928] Removed records from exception message.
asfgit closed pull request #3921: URL: https://github.com/apache/carbondata/pull/3921
[GitHub] [carbondata] ajantha-bhat commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ajantha-bhat commented on pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692670191 LGTM
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
ajantha-bhat commented on a change in pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#discussion_r488608473 ## File path: core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java ## @@ -233,15 +234,24 @@ public String writeMergeIndexFileBasedOnSegmentFile(String segmentId, } if (FileFactory.getCarbonFile(entry.getKey()).equals(FileFactory.getCarbonFile(location))) { segment.getValue().setMergeFileName(mergeIndexFile); - segment.getValue().setFiles(new HashSet()); + mergeIndexFiles + .add(entry.getKey() + CarbonCommonConstants.FILE_SEPARATOR + mergeIndexFile); + segment.getValue().setFiles(new HashSet<>()); break; } } if (table.isHivePartitionTable()) { for (PartitionSpec partitionSpec : partitionSpecs) { if (partitionSpec.getLocation().toString().equals(partitionPath)) { -SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath, -segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true); +try { + SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath, + segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true); +} catch (Exception ex) { + // delete merge index file if created, + // keep only index files as segment file writing is failed Review comment: ok ## File path: core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java ## @@ -251,9 +261,29 @@ public String writeMergeIndexFileBasedOnSegmentFile(String segmentId, String path = CarbonTablePath.getSegmentFilesLocation(table.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR + newSegmentFileName; if (!table.isHivePartitionTable()) { - SegmentFileStore.writeSegmentFile(segmentFileStore.getSegmentFile(), path); - SegmentFileStore.updateTableStatusFile(table, segmentId, newSegmentFileName, + String content = SegmentStatusManager.readFileAsString(path); + try { +SegmentFileStore.writeSegmentFile(segmentFileStore.getSegmentFile(), path); + } catch (Exception ex) { +// delete merge index file if created, Review comment: ok
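The failure-handling pattern discussed in the review above can be sketched in isolation: if writing the segment file throws after a merge index file was already created, the merge index file is deleted so that only the plain index files remain, and the exception is rethrown. All names here are illustrative, not CarbonData's actual API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MergeIndexCleanup {
  // Run the segment-file write; on failure, roll back the partially-created
  // merge index file before propagating the error.
  static void writeSegmentFileOrCleanUp(Path mergeIndexFile, Runnable writeSegmentFile)
      throws IOException {
    try {
      writeSegmentFile.run();
    } catch (RuntimeException ex) {
      // keep only the index files, as segment file writing failed
      Files.deleteIfExists(mergeIndexFile);
      throw ex;
    }
  }
}
```

This mirrors the try/catch added around `SegmentFileStore.writeSegmentFile` in the diff, where the catch block is expected to remove the merge index file and log the cause.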
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null
CarbonDataQA1 commented on pull request #3928: URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692663663 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2341/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null
CarbonDataQA1 commented on pull request #3928: URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692661529 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4081/
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ShreelekhyaG commented on a change in pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#discussion_r488595453 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala ## @@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter } } + test("test load with partition column having bad record value") { +sql("drop table if exists dataloadOptionTests") +sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " + + "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " + + "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " + + "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ") +val csvFilePath = s"$resourcesPath/data.csv" +val ex = intercept[Exception] { + sql("LOAD DATA local inpath '" + csvFilePath + + "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," + + "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-','timestampformat'='DD-MM-')"); +} +assert(ex.getMessage.contains( + "DataLoad failure: Data load failed due to bad record: The value with column name " + + "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " + + "enable bad record logger to know the detail reason.")) + } Review comment: Ok
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
CarbonDataQA1 commented on pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692654382 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2340/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
CarbonDataQA1 commented on pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692645252 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4080/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue
CarbonDataQA1 commented on pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692642655 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4079/
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
akashrn5 commented on a change in pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#discussion_r488574809 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala ## @@ -110,6 +110,21 @@ class VarcharDataTypesBasicTestCase extends QueryTest with BeforeAndAfterEach wi assert(exceptionCaught.getMessage.contains("its data type is not string")) } + test("cannot alter sort_columns dataType to long_string_columns") { Review comment: please add test case for compaction failure case
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3922: [CARBONDATA-3983] SI compatability issue
CarbonDataQA1 commented on pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#issuecomment-692638683 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2339/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype
CarbonDataQA1 commented on pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692631719 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4078/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3920: [CARBONDATA-3981] Presto filter check on binary datatype
CarbonDataQA1 commented on pull request #3920: URL: https://github.com/apache/carbondata/pull/3920#issuecomment-692631316 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2338/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
CarbonDataQA1 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692629619 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4077/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
CarbonDataQA1 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692594093 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2336/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3923: [CARBONDATA-3984] 1. Compaction on table having range column after altering datatype from string to long string fails.
CarbonDataQA1 commented on pull request #3923: URL: https://github.com/apache/carbondata/pull/3923#issuecomment-692593886 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4076/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
CarbonDataQA1 commented on pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#issuecomment-692593687 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2337/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692584299 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2335/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
ajantha-bhat commented on a change in pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#discussion_r488511466 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java ## @@ -64,45 +68,85 @@ public static void deletePartialLoadDataIfExist(CarbonTable carbonTable, if (allSegments == null || allSegments.length == 0) { return; } - LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation); - // there is no segment or failed to read tablestatus file. - // so it should stop immediately. - if (details == null || details.length == 0) { -return; - } - Set metadataSet = new HashSet<>(details.length); - for (LoadMetadataDetails detail : details) { -metadataSet.add(detail.getLoadName()); - } - List staleSegments = new ArrayList<>(allSegments.length); - for (CarbonFile segment : allSegments) { -String segmentName = segment.getName(); -// check segment folder pattern -if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) { - String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE); - if (parts.length == 2) { -boolean isOriginal = !parts[1].contains("."); -if (isCompactionFlow) { - // in compaction flow, it should be big segment and segment metadata is not exists - if (!isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); - } -} else { - // in loading flow, it should be original segment and segment metadata is not exists - if (isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); + int retryCount = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT); + int maxTimeout = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT); + ICarbonLock carbonTableStatusLock = CarbonLockFactory + 
.getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), LockUsage.TABLE_STATUS_LOCK); + try { +if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) { + LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation); + // there is no segment or failed to read tablestatus file. + // so it should stop immediately. + if (details == null || details.length == 0) { +return; + } + Set metadataSet = new HashSet<>(details.length); + for (LoadMetadataDetails detail : details) { +metadataSet.add(detail.getLoadName()); + } + List staleSegments = new ArrayList<>(allSegments.length); + Set staleSegmentsId = new HashSet<>(allSegments.length); + for (CarbonFile segment : allSegments) { +String segmentName = segment.getName(); +// check segment folder pattern +if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) { + String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE); + if (parts.length == 2) { +boolean isOriginal = !parts[1].contains("."); +if (isCompactionFlow) { + // in compaction flow, + // it should be merged segment and segment metadata doesn't exists + if (!isOriginal && !metadataSet.contains(parts[1])) { +staleSegments.add(segment); +staleSegmentsId.add(parts[1]); + } +} else { + // in loading flow, + // it should be original segment and segment metadata doesn't exists + if (isOriginal && !metadataSet.contains(parts[1])) { +staleSegments.add(segment); +staleSegmentsId.add(parts[1]); + } +} } } } + // delete segment folders one by one + for (CarbonFile staleSegment : staleSegments) { +try { + CarbonUtil.deleteFoldersAndFiles(staleSegment); +} catch (IOException | InterruptedException e) { + LOGGER.error("Unable to delete the given path :: " + e.getMessage(), e); +} + } + if (staleSegments.size() > 0) { +// get the segment metadata path +String segmentFilesLocation = + CarbonTablePath.getSegmentFilesLocation(carbonTable.getTablePath()); +// delete the segment metadata files also +
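The stale-segment test that the diff above applies to each segment folder can be sketched on its own. This is a hypothetical standalone helper, with the folder-name convention taken from the diff (folders named `Segment_<id>`, where a merged segment's id contains a `.`), not CarbonData's actual code:

```java
import java.util.Set;

public class StaleSegmentCheck {
  // A segment folder is stale when its id is absent from the tablestatus
  // metadata. The compaction flow cleans up only merged segments (id contains
  // a '.'); the load flow cleans up only original segments.
  static boolean isStale(String segmentFolderName, Set<String> metadataIds,
      boolean isCompactionFlow) {
    if (!segmentFolderName.startsWith("Segment_")) {
      return false; // not a segment folder at all
    }
    String[] parts = segmentFolderName.split("_");
    if (parts.length != 2) {
      return false; // unexpected folder pattern
    }
    boolean isOriginal = !parts[1].contains(".");
    if (isCompactionFlow) {
      return !isOriginal && !metadataIds.contains(parts[1]);
    }
    return isOriginal && !metadataIds.contains(parts[1]);
  }
}
```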
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692581967 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4075/
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue
ShreelekhyaG commented on a change in pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#discussion_r488507475 ## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java ## @@ -59,13 +60,22 @@ public ImplicitExpression(Map> blockIdToBlockletIdMapping) private void addBlockEntry(String blockletPath) { String blockId = -blockletPath.substring(0, blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR)); +blockletPath.substring(0, blockletPath.lastIndexOf(File.separator)); +// Check if blockletPath contains old tuple id format, and convert it to compatible format. +if (blockId.contains("batchno")) { Review comment: Done
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue
akashrn5 commented on a change in pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#discussion_r488507700 ## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java ## @@ -59,13 +61,23 @@ public ImplicitExpression(Map> blockIdToBlockletIdMapping) private void addBlockEntry(String blockletPath) { String blockId = -blockletPath.substring(0, blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR)); +blockletPath.substring(0, blockletPath.lastIndexOf(File.separator)); +// Check if blockletPath contains old tuple id format, and convert it to compatible format. Review comment: @ShreelekhyaG can you please mention the old and new tupleID format here in comment, so that it will be easier for developer and reviewer for understanding.
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
akashrn5 commented on a change in pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#discussion_r488503684 ## File path: core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java ## @@ -233,15 +234,24 @@ public String writeMergeIndexFileBasedOnSegmentFile(String segmentId, } if (FileFactory.getCarbonFile(entry.getKey()).equals(FileFactory.getCarbonFile(location))) { segment.getValue().setMergeFileName(mergeIndexFile); - segment.getValue().setFiles(new HashSet()); + mergeIndexFiles + .add(entry.getKey() + CarbonCommonConstants.FILE_SEPARATOR + mergeIndexFile); + segment.getValue().setFiles(new HashSet<>()); break; } } if (table.isHivePartitionTable()) { for (PartitionSpec partitionSpec : partitionSpecs) { if (partitionSpec.getLocation().toString().equals(partitionPath)) { -SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath, -segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true); +try { + SegmentFileStore.writeSegmentFile(table.getTablePath(), mergeIndexFile, partitionPath, + segmentId + "_" + uuid + "", partitionSpec.getPartitions(), true); +} catch (Exception ex) { + // delete merge index file if created, + // keep only index files as segment file writing is failed Review comment: what i meant to say is, here we can error as writing segment file failed, and then throw exception, because we just get the IO exception here and not any custom message exception, so if we add here, it will be easy for any future analysis. ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java ## @@ -64,45 +68,85 @@ public static void deletePartialLoadDataIfExist(CarbonTable carbonTable, if (allSegments == null || allSegments.length == 0) { return; } - LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation); - // there is no segment or failed to read tablestatus file. - // so it should stop immediately. 
- if (details == null || details.length == 0) { -return; - } - Set metadataSet = new HashSet<>(details.length); - for (LoadMetadataDetails detail : details) { -metadataSet.add(detail.getLoadName()); - } - List staleSegments = new ArrayList<>(allSegments.length); - for (CarbonFile segment : allSegments) { -String segmentName = segment.getName(); -// check segment folder pattern -if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) { - String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE); - if (parts.length == 2) { -boolean isOriginal = !parts[1].contains("."); -if (isCompactionFlow) { - // in compaction flow, it should be big segment and segment metadata is not exists - if (!isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); - } -} else { - // in loading flow, it should be original segment and segment metadata is not exists - if (isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); + int retryCount = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT); + int maxTimeout = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT); + ICarbonLock carbonTableStatusLock = CarbonLockFactory + .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), LockUsage.TABLE_STATUS_LOCK); + try { +if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) { + LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation); + // there is no segment or failed to read tablestatus file. + // so it should stop immediately. 
+ if (details == null || details.length == 0) { +return; + } + Set metadataSet = new HashSet<>(details.length); + for (LoadMetadataDetails detail : details) { +metadataSet.add(detail.getLoadName()); + } + List staleSegments = new ArrayList<>(allSegments.length); + Set staleSegmentsId = new HashSet<>(allSegments.length); + for (CarbonFile segment : allSegments) { +String segmentName = segment.getName(); +// check segment folder pattern +if
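The refactor quoted above moves the stale-segment scan inside a table-status lock. Its overall shape — acquire the lock with retries, do the cleanup, always release in a finally block — can be sketched standalone like this (simplified stand-ins, not CarbonData's actual lock classes):

```java
public class Main {
    // Simplified stand-in for ICarbonLock; the real code obtains it via
    // CarbonLockFactory.getCarbonLockObj(identifier, LockUsage.TABLE_STATUS_LOCK).
    public interface CarbonLock {
        boolean lockWithRetries(int retryCount, int maxTimeoutSeconds);
        boolean unlock();
    }

    // Acquire with retries; if another operation holds the lock, skip the
    // cleanup instead of racing it. Always release the lock in finally.
    public static String cleanStaleSegments(CarbonLock lock, int retries, int timeoutSeconds) {
        if (!lock.lockWithRetries(retries, timeoutSeconds)) {
            return "skipped";
        }
        try {
            // read tablestatus, diff against segment folders, delete stale ones...
            return "cleaned";
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        CarbonLock alwaysFree = new CarbonLock() {
            public boolean lockWithRetries(int r, int t) { return true; }
            public boolean unlock() { return true; }
        };
        System.out.println(cleanStaleSegments(alwaysFree, 3, 5));
    }
}
```

The try/finally guarantees the table-status lock is released even if the deletion throws, which is the main point of the reviewed change.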
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
CarbonDataQA1 commented on pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#issuecomment-692563259 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null
CarbonDataQA1 commented on pull request #3928: URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692562871 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2333/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3928: [WIP]Fix DropCache log error when indexmap is null
CarbonDataQA1 commented on pull request #3928: URL: https://github.com/apache/carbondata/pull/3928#issuecomment-692562664 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4072/
[GitHub] [carbondata] marchpure commented on a change in pull request #3917: [CARBONDATA-3978] Clean files refactor and added support for a trash folder where all the carbondata files will be copied t
marchpure commented on a change in pull request #3917: URL: https://github.com/apache/carbondata/pull/3917#discussion_r488464250 ## File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala ## @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.cleanfiles + +import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit} + +import scala.collection.JavaConverters._ + +import org.apache.hadoop.fs.permission.{FsAction, FsPermission} + +import org.apache.carbondata.common.logging.LogServiceFactory +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.filesystem.CarbonFile +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.indexstore.PartitionSpec +import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage} +import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore} +import org.apache.carbondata.core.metadata.schema.table.CarbonTable +import org.apache.carbondata.core.mutate.CarbonUpdateUtil +import org.apache.carbondata.core.statusmanager.{SegmentStatus, SegmentStatusManager} +import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil} +import org.apache.carbondata.core.util.path.CarbonTablePath + +object CleanFilesUtil { + private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName) + + /** + * The method deletes all data if forceTableCLean and lean garbage segment + * (MARKED_FOR_DELETE state) if forceTableCLean + * + * @param dbName : Database name + * @param tableName : Table name + * @param tablePath : Table path + * @param carbonTable: CarbonTable Object in case of force clean + * @param forceTableClean: for force clean it will delete all data + *it will clean garbage segment (MARKED_FOR_DELETE state) + * @param currentTablePartitions : Hive Partitions details + */ + def cleanFiles( + dbName: String, + tableName: String, + tablePath: String, + carbonTable: CarbonTable, + forceTableClean: Boolean, + currentTablePartitions: Option[Seq[PartitionSpec]] = None, + truncateTable: Boolean = false): Unit = { +var carbonCleanFilesLock: ICarbonLock = null +val 
absoluteTableIdentifier = if (forceTableClean) { + AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName) +} else { + carbonTable.getAbsoluteTableIdentifier +} +try { + val errorMsg = "Clean files request is failed for " + +s"$dbName.$tableName" + +". Not able to acquire the clean files lock due to another clean files " + +"operation is running in the background." + // in case of force clean the lock is not required + if (forceTableClean) { Review comment: forceTableClean is too violence, please delete it. ## File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala ## @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.cleanfiles + +import java.util.concurrent.{Executors,
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.
ajantha-bhat commented on a change in pull request #3921: URL: https://github.com/apache/carbondata/pull/3921#discussion_r488463308 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java ## @@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws CarbonDataLoadingException { .getTableProperties(); String spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX); boolean isSpatialColumn = false; -Object[] rawData = row.getRawData(); -if (rawData == null) { - rawData = row.getData() == null ? null : row.getData().clone(); Review comment: Ok then.
[GitHub] [carbondata] ajantha-bhat commented on pull request #3921: [CARBONDATA-3928] Removed records from exception message.
ajantha-bhat commented on pull request #3921: URL: https://github.com/apache/carbondata/pull/3921#issuecomment-692538365 LGTM
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.
nihal0107 commented on a change in pull request #3921: URL: https://github.com/apache/carbondata/pull/3921#discussion_r488461592 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java ## @@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws CarbonDataLoadingException { .getTableProperties(); String spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX); boolean isSpatialColumn = false; -Object[] rawData = row.getRawData(); -if (rawData == null) { - rawData = row.getData() == null ? null : row.getData().clone(); Review comment: I only added this scenario last time because at that time we wanted rawData for every bad-record action. Now we need it only when the bad record logger is enabled or the action is REDIRECT, and in those cases rawData is always available.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3921: [CARBONDATA-3928] Removed records from exception message.
ajantha-bhat commented on a change in pull request #3921: URL: https://github.com/apache/carbondata/pull/3921#discussion_r488454458 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/RowConverterImpl.java ## @@ -119,10 +118,6 @@ public CarbonRow convert(CarbonRow row) throws CarbonDataLoadingException { .getTableProperties(); String spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX); boolean isSpatialColumn = false; -Object[] rawData = row.getRawData(); -if (rawData == null) { - rawData = row.getData() == null ? null : row.getData().clone(); Review comment: Please revert this change; in some cases rawData will not be set, hence it is set here before converting the row. With your current code, null is used instead of row.getData().clone() in that scenario.
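The guard being discussed — fall back to a defensive copy of the row data when rawData was never set — looks roughly like this (a standalone sketch; `Row` is a minimal stand-in for CarbonRow, not the real class):

```java
public class Main {
    // Minimal stand-in for org.apache.carbondata.core.datastore.row.CarbonRow.
    public static class Row {
        public Object[] data;
        public Object[] rawData;
        public Row(Object[] data) { this.data = data; }
    }

    // If rawData was never set, keep a defensive copy of data so bad-record
    // handling can still report the original values after conversion mutates data.
    public static void ensureRawData(Row row) {
        if (row.rawData == null) {
            row.rawData = (row.data == null) ? null : row.data.clone();
        }
    }

    public static void main(String[] args) {
        Row row = new Row(new Object[]{"a", 1});
        ensureRawData(row);
        System.out.println(row.rawData.length);
    }
}
```

The clone matters: converting the row rewrites `data` in place, so without the copy there would be nothing original left to log or redirect as a bad record.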
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ajantha-bhat commented on a change in pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449347 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala ## @@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter } } + test("test load with partition column having bad record value") { +sql("drop table if exists dataloadOptionTests") +sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " + + "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " + + "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " + + "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ") +val csvFilePath = s"$resourcesPath/data.csv" +val ex = intercept[Exception] { Review comment: please intercept RuntimeException only
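The review point — intercept the specific exception type rather than the broad `Exception` — matters because a test that catches any `Exception` can pass on an unrelated failure. ScalaTest's `intercept[RuntimeException]` can be approximated in plain Java like this (a hand-rolled sketch, not ScalaTest's implementation):

```java
public class Main {
    // Returns the thrown exception iff it is of the expected type;
    // fails otherwise, including when nothing is thrown at all.
    public static <T extends Throwable> T intercept(Class<T> expected, Runnable body) {
        try {
            body.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("expected " + expected.getSimpleName()
                    + " but got " + t.getClass().getSimpleName(), t);
        }
        throw new AssertionError("expected " + expected.getSimpleName()
                + " but nothing was thrown");
    }

    public static void main(String[] args) {
        RuntimeException ex = intercept(RuntimeException.class,
                () -> { throw new RuntimeException("bad record"); });
        System.out.println(ex.getMessage());
    }
}
```

Returning the caught exception lets the test go on to assert on its message, exactly as the quoted Scala test does with `ex.getMessage.contains(...)`.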
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ajantha-bhat commented on a change in pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#discussion_r488449175 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionBadRecordLoggerTest.scala ## @@ -219,6 +219,24 @@ class StandardPartitionBadRecordLoggerTest extends QueryTest with BeforeAndAfter } } + test("test load with partition column having bad record value") { +sql("drop table if exists dataloadOptionTests") +sql("CREATE TABLE dataloadOptionTests (empno int, empname String, designation String, " + + "workgroupcategory int, workgroupcategoryname String, deptno int, projectjoindate " + + "Timestamp, projectenddate Date,attendance int,utilization int,salary int) PARTITIONED BY " + + "(deptname String,doj Timestamp,projectcode int) STORED AS carbondata ") +val csvFilePath = s"$resourcesPath/data.csv" +val ex = intercept[Exception] { + sql("LOAD DATA local inpath '" + csvFilePath + + "' INTO TABLE dataloadOptionTests OPTIONS ('bad_records_action'='FAIL', 'DELIMITER'= '," + + "', 'QUOTECHAR'= '\"', 'dateformat'='DD-MM-','timestampformat'='DD-MM-')"); +} +assert(ex.getMessage.contains( + "DataLoad failure: Data load failed due to bad record: The value with column name " + + "projectjoindate and column data type TIMESTAMP is not a valid TIMESTAMP type.Please " + + "enable bad record logger to know the detail reason.")) + } Review comment: please drop the table here
[GitHub] [carbondata] jxxfxkp commented on pull request #3662: [TEMP] support prestosql 330 in carbon
jxxfxkp commented on pull request #3662: URL: https://github.com/apache/carbondata/pull/3662#issuecomment-692525522 I did cherry-pick this PR. Some classes cannot find symbols (OrcFileWriterConfig -> OrcWriterConfig, io.prestosql.plugin.hive -> io.prestosql.plugin.hive.orc, ...). Although I can change them, I still get some PrestoServer start errors (Presto 330).
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3871: [CARBONDATA-3986] Fix multiple issues during compaction and concurrent scenarios
ajantha-bhat commented on a change in pull request #3871: URL: https://github.com/apache/carbondata/pull/3871#discussion_r488444308 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java ## @@ -64,45 +68,89 @@ public static void deletePartialLoadDataIfExist(CarbonTable carbonTable, if (allSegments == null || allSegments.length == 0) { return; } - LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation); - // there is no segment or failed to read tablestatus file. - // so it should stop immediately. - if (details == null || details.length == 0) { -return; - } - Set metadataSet = new HashSet<>(details.length); - for (LoadMetadataDetails detail : details) { -metadataSet.add(detail.getLoadName()); - } - List staleSegments = new ArrayList<>(allSegments.length); - for (CarbonFile segment : allSegments) { -String segmentName = segment.getName(); -// check segment folder pattern -if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) { - String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE); - if (parts.length == 2) { -boolean isOriginal = !parts[1].contains("."); -if (isCompactionFlow) { - // in compaction flow, it should be big segment and segment metadata is not exists - if (!isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); + int retryCount = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT); + int maxTimeout = CarbonLockUtil + .getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK, + CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT); + ICarbonLock carbonTableStatusLock = CarbonLockFactory + .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), LockUsage.TABLE_STATUS_LOCK); + try { +if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) { + LoadMetadataDetails[] details = 
SegmentStatusManager.readLoadMetadata(metaDataLocation); + // there is no segment or failed to read tablestatus file. + // so it should stop immediately. + if (details == null || details.length == 0) { +return; + } + Set metadataSet = new HashSet<>(details.length); + for (LoadMetadataDetails detail : details) { +metadataSet.add(detail.getLoadName()); + } + List staleSegments = new ArrayList<>(allSegments.length); + Set staleSegmentsId = new HashSet<>(allSegments.length); + for (CarbonFile segment : allSegments) { +String segmentName = segment.getName(); +// check segment folder pattern +if (segmentName.startsWith(CarbonTablePath.SEGMENT_PREFIX)) { + String[] parts = segmentName.split(CarbonCommonConstants.UNDERSCORE); + if (parts.length == 2) { +boolean isOriginal = !parts[1].contains("."); +if (isCompactionFlow) { + // in compaction flow, it should be big segment and segment metadata is not exists + if (!isOriginal && !metadataSet.contains(parts[1])) { +staleSegments.add(segment); +staleSegmentsId.add(parts[1]); + } +} else { + // in loading flow, + // it should be original segment and segment metadata is not exists + if (isOriginal && !metadataSet.contains(parts[1])) { +staleSegments.add(segment); +staleSegmentsId.add(parts[1]); + } +} } -} else { - // in loading flow, it should be original segment and segment metadata is not exists - if (isOriginal && !metadataSet.contains(parts[1])) { -staleSegments.add(segment); +} + } + // delete segment folders one by one + for (CarbonFile staleSegment : staleSegments) { +try { + CarbonUtil.deleteFoldersAndFiles(staleSegment); +} catch (IOException | InterruptedException e) { + LOGGER.error("Unable to delete the given path :: " + e.getMessage(), e); +} + } + if (staleSegments.size() > 0) { +// collect the segment metadata path +String segmentFilesLocation = + CarbonTablePath.getSegmentFilesLocation(carbonTable.getTablePath()); +CarbonFile[] allSegmentMetadataFiles = +
[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI
Karan980 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-692516342 retest this please
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ShreelekhyaG commented on a change in pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#discussion_r488423451 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ## @@ -191,7 +191,12 @@ case class CarbonLoadDataCommand(databaseNameOp: Option[String], if (isUpdateTableStatusRequired) { CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel, uuid) } -throw ex +val errorMessage = operationContext.getProperty("Error message") +if (errorMessage != null) { + throw new Exception(errorMessage.toString, ex.getCause) Review comment: Done
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3919: [CARBONDATA-3980] Load fails with aborted exception when Bad records action is unspecified
ShreelekhyaG commented on a change in pull request #3919: URL: https://github.com/apache/carbondata/pull/3919#discussion_r488422763 ## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CommonLoadUtils.scala ## @@ -1064,6 +1064,7 @@ object CommonLoadUtils { if (loadParams.updateModel.isDefined) { CarbonScalaUtil.updateErrorInUpdateModel(loadParams.updateModel.get, executorMessage) } Review comment: Added testcase
[GitHub] [carbondata] Indhumathi27 opened a new pull request #3928: [WIP]Fix DropCache log error when indexmap is null
Indhumathi27 opened a new pull request #3928: URL: https://github.com/apache/carbondata/pull/3928 ### Why is this PR needed? ### What changes were proposed in this PR? ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - No - Yes
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3922: [CARBONDATA-3983] SI compatability issue
nihal0107 commented on a change in pull request #3922: URL: https://github.com/apache/carbondata/pull/3922#discussion_r488414094 ## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java ## @@ -59,13 +60,22 @@ public ImplicitExpression(Map> blockIdToBlockletIdMapping) private void addBlockEntry(String blockletPath) { String blockId = -blockletPath.substring(0, blockletPath.lastIndexOf(CarbonCommonConstants.FILE_SEPARATOR)); +blockletPath.substring(0, blockletPath.lastIndexOf(File.separator)); +// Check if blockletPath contains old tuple id format, and convert it to compatible format. +if (blockId.contains("batchno")) { Review comment: Please put the strings in the constants file and then use them from there.
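The suggestion above — hoist the inline "batchno" literal into a constants class — can be sketched like this (`BATCH_NO` is a hypothetical constant name for illustration; CarbonData's real home for such strings is CarbonCommonConstants):

```java
public class Main {
    // Hypothetical constants holder standing in for CarbonCommonConstants.
    public static final class Constants {
        public static final String BATCH_NO = "batchno";
        private Constants() { }
    }

    // The old tuple-id format embeds "batchno" in the block id; referencing
    // the constant keeps this check and any future converter logic in sync.
    public static boolean isOldTupleIdFormat(String blockId) {
        return blockId.contains(Constants.BATCH_NO);
    }

    public static void main(String[] args) {
        System.out.println(isOldTupleIdFormat("part-0-batchno0-0-1"));
    }
}
```

A single named constant means a future format change touches one line instead of every call site that repeats the magic string.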
[jira] [Closed] (CARBONDATA-3952) After reset query not hitting MV
[ https://issues.apache.org/jira/browse/CARBONDATA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SHREELEKHYA GAMPA closed CARBONDATA-3952. - Resolution: Fixed > After reset query not hitting MV > > > Key: CARBONDATA-3952 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3952 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > After reset query not hitting MV. > With the reset, spark.sql.warehouse.dir and carbonStorePath don't match and > the databaseLocation will change to old table path format. So, new tables > that are created after reset, take a different path incase of default. > Closing this , as it is identified as spark bug. More details can be found at > https://issues.apache.org/jira/browse/SPARK-31234 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] ShreelekhyaG closed pull request #3890: [CARBONDATA-3952] After reset query not hitting MV
ShreelekhyaG closed pull request #3890: URL: https://github.com/apache/carbondata/pull/3890