[GitHub] [carbondata] akkio-97 commented on a change in pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
akkio-97 commented on a change in pull request #3906: URL: https://github.com/apache/carbondata/pull/3906#discussion_r484698276 ## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/DataTypeUtil.java ## @@ -21,25 +21,31 @@ import java.util.ArrayList; import java.util.List; +import org.apache.carbondata.core.constants.CarbonCommonConstants; import org.apache.carbondata.core.metadata.datatype.DataType; import org.apache.carbondata.core.metadata.datatype.DataTypes; import org.apache.carbondata.core.metadata.datatype.StructField; +import org.apache.commons.lang.ArrayUtils; + public class DataTypeUtil { public static DataType convertHiveTypeToCarbon(String type) throws SQLException { if ("string".equalsIgnoreCase(type) || type.startsWith("char")) { return DataTypes.STRING; -} else if ("varchar".equalsIgnoreCase(type)) { +} else if (!type.startsWith("map<") && !type.startsWith("array<") && !type.startsWith("struct<") Review comment: made required changes, they are not required This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
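The diff above adds prefix checks so that complex Hive types (map<…>, array<…>, struct<…>) are excluded before the simple-name comparisons run. A minimal standalone sketch of that ordering (illustrative only, not the actual CarbonData `DataTypeUtil` implementation; the returned strings are placeholders, not `DataTypes` constants):

```java
import java.sql.SQLException;
import java.util.Locale;

// Sketch of mapping Hive type strings to logical type names. The key point
// from the quoted change: complex types must be recognised by prefix before
// any simple-name checks such as "char" or "varchar" are applied.
public class HiveTypeMapping {
    public static String toCarbonType(String type) throws SQLException {
        String t = type.toLowerCase(Locale.ROOT);
        if (t.startsWith("map<") || t.startsWith("array<") || t.startsWith("struct<")) {
            // complex type: report the container kind (map/array/struct)
            return "COMPLEX:" + t.substring(0, t.indexOf('<'));
        } else if (t.equals("string") || t.startsWith("char")) {
            return "STRING";
        } else if (t.startsWith("varchar")) {
            return "VARCHAR";
        }
        throw new SQLException("Unrecognised type: " + type);
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(toCarbonType("map<int,string>")); // COMPLEX:map
        System.out.println(toCarbonType("char(10)"));        // STRING
        System.out.println(toCarbonType("varchar(20)"));     // VARCHAR
    }
}
```

Without the prefix guard, a type like `map<char(5),int>` could fall through to the wrong branch, which is the class of bug the review thread is addressing.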
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
ShreelekhyaG commented on a change in pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#discussion_r484709973 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -318,48 +318,47 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA test("test load, update data with setlenient carbon property for daylight " + "saving time from different timezone") { CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ") Review comment: ok done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
ShreelekhyaG commented on a change in pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#discussion_r484710316 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA test("test load, update data with setlenient carbon property for daylight " + "saving time from different timezone") { CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ") +sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'") +// Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests) +// Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and +// clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST. 
+checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" Review comment: added
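The DST behaviour discussed in this review thread can be reproduced with plain java.util.SimpleDateFormat (a standalone illustration, not CarbonData code):

```java
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// In America/Los_Angeles, 2019-03-10 02:00:00 does not exist: clocks jump
// from 02:00 directly to 03:00 when DST starts. With a lenient formatter,
// the nonexistent wall-clock time is rolled forward one hour instead of
// failing to parse, which is the behaviour the test above asserts.
public class DstGapDemo {
    public static void main(String[] args) throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"));
        fmt.setLenient(true);
        // Parses the instant one hour later, then formats it back.
        System.out.println(fmt.format(fmt.parse("2019-03-10 02:00:00")));
    }
}
```

This prints 2019-03-10 03:00:00, matching the expected values in the updated checkAnswer calls.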
[GitHub] [carbondata] kunal642 commented on a change in pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
kunal642 commented on a change in pull request #3906: URL: https://github.com/apache/carbondata/pull/3906#discussion_r484718004 ## File path: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/MapParserImpl.java ## @@ -73,9 +73,12 @@ public ArrayObject parse(Object data) { @Override public ArrayObject parseRaw(Object data) { -Object keyArray = ((Object[]) data)[0]; -Object valueArray = ((Object[]) data)[1]; -return new ArrayObject(new Object[]{child.parseRaw(keyArray), child.parseRaw(valueArray)}); +Object[] keyValuePairs = ((Object[]) data); Review comment: okay
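The quoted diff replaces code that assumed data held exactly one key array and one value array with code that treats data as a sequence of key/value pairs. A hedged sketch of that idea (illustrative only, not the actual MapParserImpl code, which wraps results in ArrayObject):

```java
import java.util.ArrayList;
import java.util.List;

// Splits an array of {key, value} pairs into separate key and value lists,
// instead of indexing data[0]/data[1] directly. This handles maps with any
// number of entries, including a single-entry map of a primitive type.
public class MapPairSplitter {
    public static List<Object>[] split(Object[] keyValuePairs) {
        List<Object> keys = new ArrayList<>();
        List<Object> values = new ArrayList<>();
        for (Object pairObj : keyValuePairs) {
            Object[] pair = (Object[]) pairObj;  // each entry is {key, value}
            keys.add(pair[0]);
            values.add(pair[1]);
        }
        @SuppressWarnings("unchecked")
        List<Object>[] result = new List[]{keys, values};
        return result;
    }

    public static void main(String[] args) {
        Object[] data = {new Object[]{"a", 1}, new Object[]{"b", 2}};
        List<Object>[] split = split(data);
        System.out.println(split[0] + " " + split[1]); // [a, b] [1, 2]
    }
}
```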
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
CarbonDataQA1 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-688694994 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2257/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
CarbonDataQA1 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-688698823 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3997/
[jira] [Created] (CARBONDATA-3976) CarbonData Update operation enhancement
TangLin created CARBONDATA-3976: --- Summary: CarbonData Update operation enhancement Key: CARBONDATA-3976 URL: https://issues.apache.org/jira/browse/CARBONDATA-3976 Project: CarbonData Issue Type: Improvement Components: data-load Reporter: TangLin

*Background* The update operation cleans up delta files before updating (see cleanUpDeltaFiles(carbonTable, false)). This loops over the metadata path and the segment path many times; when there are many files, the overhead grows and the update takes longer.

*Motivation & Goal* During the update process, reduce the repeated traversals, or move cleanUpDeltaFiles into a separate task.

*Modification* There are several possible solutions.

Solution 1: cleanUpDeltaFiles calls the same file-listing method several times, e.g. updateStatusManager.getUpdateDeltaFilesList(segment, false, CarbonCommonConstants.UPDATE_DELTA_FILE_EXT, true, allSegmentFiles, true) and updateStatusManager.getUpdateDeltaFilesList(segment, false, CarbonCommonConstants.UPDATE_INDEX_FILE_EXT, true, allSegmentFiles, true). These differ only in the file type, yet each traverses the segment path, so the same path is walked twice; we can merge them into one traversal.

Solution 2: Based on solution 1, use Spark or MapReduce to hand the cleanup tasks over to other nodes.

Solution 3: Submit cleanUpDeltaFiles as a separate task, and run it in the early morning or when the cluster is not busy.

Solution 4: Establish a garbage collection bin that provides interfaces for the program to decide when files enter the bin and how they are handled.

Please vote for all solutions. Best Regards, LinWood -- This message was sent by Atlassian Jira (v8.3.4#803005)
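Solution 1 above (merging per-extension listings into a single directory traversal) can be sketched as follows. This is a hypothetical illustration, not CarbonData code: the extension strings are placeholders, not the real CarbonCommonConstants values, and the real getUpdateDeltaFilesList applies additional segment-level filtering.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// List the segment directory once and group files by extension, instead of
// calling a per-extension listing method that re-traverses the same path
// for each file type.
public class DeltaFileScan {
    public static Map<String, List<File>> groupByExtension(File segmentDir, String... extensions) {
        Map<String, List<File>> grouped = new HashMap<>();
        for (String ext : extensions) {
            grouped.put(ext, new ArrayList<>());
        }
        File[] files = segmentDir.listFiles();
        if (files == null) {
            return grouped;  // not a directory, or I/O error
        }
        for (File f : files) {               // single traversal
            for (String ext : extensions) {
                if (f.getName().endsWith(ext)) {
                    grouped.get(ext).add(f);
                }
            }
        }
        return grouped;
    }

    public static void main(String[] args) throws Exception {
        File dir = new File(System.getProperty("java.io.tmpdir"), "segment_demo");
        dir.mkdirs();
        new File(dir, "part-0.deltameta").createNewFile();   // placeholder extensions
        new File(dir, "part-0.carbonindex").createNewFile();
        Map<String, List<File>> grouped = groupByExtension(dir, ".deltameta", ".carbonindex");
        System.out.println(grouped.get(".deltameta").size() + " " + grouped.get(".carbonindex").size());
    }
}
```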
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
CarbonDataQA1 commented on pull request #3906: URL: https://github.com/apache/carbondata/pull/3906#issuecomment-688701556 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3998/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
CarbonDataQA1 commented on pull request #3906: URL: https://github.com/apache/carbondata/pull/3906#issuecomment-688703629 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2258/
[GitHub] [carbondata] akashrn5 commented on pull request #3909: [CARBONDATA-3972] Date/timestamp compatability between hive and carbon
akashrn5 commented on pull request #3909: URL: https://github.com/apache/carbondata/pull/3909#issuecomment-688713873 @ShreelekhyaG please add test cases.
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
akashrn5 commented on a change in pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#discussion_r484752262 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -318,48 +318,65 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA test("test load, update data with setlenient carbon property for daylight " + "saving time from different timezone") { CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") +sql("DROP TABLE IF EXISTS testhivetable") +// Create test_time and hive table sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") -sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ") +// load data into test_time and hive table and validate query result +sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')") +sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable") +checkAnswer(sql("select * 
from test_time"), sql("select * from testhivetable")) +sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ") +sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'") +// Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests) +// Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and +// clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST. +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" sql("DROP TABLE test_time") CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE) } test("test load, update data with setlenient session level property for daylight " + "saving time from different timezone") { sql("set carbon.load.dateformat.setlenient.enable = true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") -sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + +sql("DROP TABLE IF EXISTS testhivetable") +// Create test_time and hive table +sql("CREATE TABLE test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") -sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" 
-checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ") +// load data into test_time and hive table and validate query result +sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')") +sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable") +checkAnswer(sql("select * from
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
CarbonDataQA1 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688747092 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2259/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
CarbonDataQA1 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688748633 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3999/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal
CarbonDataQA1 commented on pull request #3902: URL: https://github.com/apache/carbondata/pull/3902#issuecomment-688760004 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4000/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal
CarbonDataQA1 commented on pull request #3902: URL: https://github.com/apache/carbondata/pull/3902#issuecomment-688761061 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2260/
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
ShreelekhyaG commented on a change in pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#discussion_r484802588 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA test("test load, update data with setlenient carbon property for daylight " + "saving time from different timezone") { CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ") +sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'") +// Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests) +// Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and +// clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST. 
+checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" sql("DROP TABLE test_time") CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE) } test("test load, update data with setlenient session level property for daylight " + "saving time from different timezone") { sql("set carbon.load.dateformat.setlenient.enable = true") -TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) sql("DROP TABLE IF EXISTS test_time") sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " + "TBLPROPERTIES('dateformat'='-MM-dd', 'timestampformat'='-MM-dd HH:mm:ss') ") sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time") -sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ") -sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'") -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" -checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00" +sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ") +sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'") +// Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests) +// Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and +// clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST. 
+checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" +checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00" sql("DROP TABLE test_time") defaultConfig() } def generateCSVFile(): Unit = { val rows = new ListBuffer[Array[String]] rows += Array("ID", "date", "time") -rows += Array("1", "1941-3-15", "1941-3-15 00:00:00") +rows += Array("1", "1941-3-15", "2019-3-10 02:00:00") rows += Array("2", "2016-7-24", "2016-7-24 01:02:30") BadRecordUtil.createCSV(rows, csvPath) } override def afterAll {
[GitHub] [carbondata] kunal642 commented on a change in pull request #3902: [CARBONDATA-3961] reorder filter expression based on storage ordinal
kunal642 commented on a change in pull request #3902: URL: https://github.com/apache/carbondata/pull/3902#discussion_r484804408 ## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ## @@ -2512,4 +2512,10 @@ private CarbonCommonConstants() { * property which defines the presto query default value */ public static final String IS_QUERY_FROM_PRESTO_DEFAULT = "false"; + + @CarbonProperty(dynamicConfigurable = true) Review comment: done
[GitHub] [carbondata] akashrn5 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.
akashrn5 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688769415 @nihal0107 this PR contains some test case fix too, please add the changes in PR description and title, you can brief the title, no need to keep so long.
[GitHub] [carbondata] kunal642 commented on pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
kunal642 commented on pull request #3906: URL: https://github.com/apache/carbondata/pull/3906#issuecomment-688795296 LGTM
[GitHub] [carbondata] asfgit closed pull request #3906: [CARBONDATA-3968]Added test cases for hive read complex types and handled other issues
asfgit closed pull request #3906: URL: https://github.com/apache/carbondata/pull/3906
[jira] [Resolved] (CARBONDATA-3968) Hive read complex types issues
[ https://issues.apache.org/jira/browse/CARBONDATA-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3968. -- Fix Version/s: 2.1.0 Resolution: Fixed > Hive read complex types issues > -- > > Key: CARBONDATA-3968 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3968 > Project: CarbonData > Issue Type: Bug > Components: hive-integration >Reporter: Akshay >Priority: Major > Fix For: 2.1.0 > > Time Spent: 3h > Remaining Estimate: 0h > > # Issues in reading array/map/struct of byte, varchar and decimal types. > # Map of primitive type with only one row inserted has issues.
[GitHub] [carbondata] kunal642 commented on a change in pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning
kunal642 commented on a change in pull request #3908: URL: https://github.com/apache/carbondata/pull/3908#discussion_r484838229 ## File path: integration/spark/src/main/scala/org/apache/spark/util/PartitionCacheManger.scala ## @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.util + +import java.net.URI +import java.util + +import scala.collection.JavaConverters._ + +import org.apache.log4j.Logger +import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTablePartition} + +import org.apache.carbondata.common.logging.LogServiceFactory +import org.apache.carbondata.core.cache.{Cache, Cacheable, CarbonLRUCache} +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.metadata.SegmentFileStore +import org.apache.carbondata.core.statusmanager.SegmentStatusManager +import org.apache.carbondata.core.util.path.CarbonTablePath + +object PartitionCacheManager extends Cache[PartitionCacheKey, CacheablePartitionSpec] { + + private val CACHE = new CarbonLRUCache( +CarbonCommonConstants.CARBON_PARTITION_MAX_DRIVER_LRU_CACHE_SIZE, +CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT) + + val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getName) + + def get(identifier: PartitionCacheKey): CacheablePartitionSpec = { +val cacheablePartitionSpec = + CACHE.get(identifier.tableId).asInstanceOf[CacheablePartitionSpec] +val tableStatusModifiedTime = FileFactory + .getCarbonFile(CarbonTablePath.getTableStatusFilePath(identifier.tablePath)) + .getLastModifiedTime +if (cacheablePartitionSpec != null) { + if (tableStatusModifiedTime > cacheablePartitionSpec.timestamp) { +readPartitions(identifier, tableStatusModifiedTime) + } else { +cacheablePartitionSpec + } +} else { + readPartitions(identifier, tableStatusModifiedTime) +} + } + + override def getAll(keys: util.List[PartitionCacheKey]): + util.List[CacheablePartitionSpec] = { +keys.asScala.map(get).toList.asJava + } + + override def getIfPresent(key: PartitionCacheKey): CacheablePartitionSpec = { +CACHE.get(key.tableId).asInstanceOf[CacheablePartitionSpec] + } + + override def invalidate(partitionCacheKey: PartitionCacheKey): Unit = 
{ +CACHE.remove(partitionCacheKey.tableId) + } + + private def readPartitions(identifier: PartitionCacheKey, tableStatusModifiedTime: Long) = { Review comment: added per segment modification check...now only the updated/new segments would be loaded
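The freshness check in PartitionCacheManager.get above (reuse the cached value only while the table status file's last-modified time is not newer than the cached timestamp) can be sketched generically. This is an illustrative Java sketch, not CarbonData code; names like TimestampedCache are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// A cached entry carries the source's modification timestamp at load time.
// On each lookup, the current modification time is compared against the
// stamp: if the source is newer, the entry is reloaded, otherwise the
// cached value is returned without touching the file system again.
public class TimestampedCache<V> {
    private static final class Entry<V> {
        final V value; final long timestamp;
        Entry(V value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    private final Map<String, Entry<V>> cache = new ConcurrentHashMap<>();

    public V get(String key, LongSupplier lastModified, Supplier<V> loader) {
        long modified = lastModified.getAsLong();
        Entry<V> cached = cache.get(key);
        if (cached == null || modified > cached.timestamp) {
            cached = new Entry<>(loader.get(), modified);  // (re)load and stamp
            cache.put(key, cached);
        }
        return cached.value;
    }

    public static void main(String[] args) {
        TimestampedCache<String> c = new TimestampedCache<>();
        System.out.println(c.get("t1", () -> 100L, () -> "v1")); // miss: loads v1
        System.out.println(c.get("t1", () -> 100L, () -> "v2")); // fresh: keeps v1
        System.out.println(c.get("t1", () -> 200L, () -> "v3")); // stale: reloads v3
    }
}
```

The reviewer's follow-up (per-segment modification checks) refines this so that a changed table status triggers reloading only the updated or new segments rather than all partitions.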
[GitHub] [carbondata] nihal0107 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.
nihal0107 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688802174 > @nihal0107 this PR contains some test case fix too, please add the changes in PR description and title, you can brief the title, no need to keep so long. Updated the PR description and title.
[GitHub] [carbondata] marchpure commented on pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
marchpure commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688802921 retest this please
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
Indhumathi27 commented on a change in pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#discussion_r484845416

## File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java

@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List filteredPartitions = new ArrayList<>();

Review comment: Can you add a testcase with partition filter?
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
ajantha-bhat commented on a change in pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#discussion_r484845973

## File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/CarbondataSplitManager.java

@@ -117,6 +122,16 @@ public ConnectorSplitSource getSplits(ConnectorTransactionHandle transactionHand
   // file metastore case tablePath can be null, so get from location
   location = table.getStorage().getLocation();
 }
+List filteredPartitions = new ArrayList<>();

Review comment: please read the description, I have mentioned why UT cannot be added now
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
Indhumathi27 commented on a change in pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#discussion_r484849645

## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/impl/CarbonTableReader.java

@@ -245,16 +242,14 @@ private CarbonTableCacheModel getValidCacheBySchemaTableName(SchemaTableName sch
  *
  * @param tableCacheModel cached table
  * @param filters carbonData filters
- * @param constraints presto filters
+ * @param filteredPartitions matched partitionSpec for the filter
  * @param config hadoop conf
  * @return list of multiblock split
  * @throws IOException
  */
- public List getInputSplits(
-     CarbonTableCacheModel tableCacheModel,
-     Expression filters,
-     TupleDomain constraints,
-     Configuration config) throws IOException {
+ public List getInputSplits(CarbonTableCacheModel tableCacheModel,
+     Expression filters, List filteredPartitions, Configuration config)

Review comment: Can revert to old style
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
Indhumathi27 commented on a change in pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#discussion_r484851607

## File path: integration/presto/src/test/prestodb/org/apache/carbondata/presto/server/PrestoTestUtil.scala

@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.presto.server
+
+import com.facebook.presto.jdbc.PrestoArray
+

Review comment: Remove extra lines
[GitHub] [carbondata] Indhumathi27 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
Indhumathi27 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-688813601 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.
CarbonDataQA1 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688819393 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4001/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.
CarbonDataQA1 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688820071 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2261/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning
CarbonDataQA1 commented on pull request #3908: URL: https://github.com/apache/carbondata/pull/3908#issuecomment-688848438 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4002/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning
CarbonDataQA1 commented on pull request #3908: URL: https://github.com/apache/carbondata/pull/3908#issuecomment-688852427 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2262/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688858320 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4003/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
CarbonDataQA1 commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-688859734 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2263/
[GitHub] [carbondata] akashrn5 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.
akashrn5 commented on pull request #3905: URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688866461 LGTM
[GitHub] [carbondata] asfgit closed pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.
asfgit closed pull request #3905: URL: https://github.com/apache/carbondata/pull/3905
[jira] [Resolved] (CARBONDATA-3964) Select * from table or select count(*) without filter is throwing null pointer exception.
[ https://issues.apache.org/jira/browse/CARBONDATA-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-3964.
-----------------------------------------
Fix Version/s: 2.1.0
Resolution: Fixed

> Select * from table or select count(*) without filter is throwing null pointer exception.
> ------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-3964
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3964
> Project: CarbonData
> Issue Type: Bug
> Reporter: Nihal kumar ojha
> Priority: Minor
> Fix For: 2.1.0
>
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. Create a table.
> 2. Load around 500 segments and more than 1 million records.
> 3. Run select * or select count(*) without a filter; it throws a null pointer exception.
>
> File: TableIndex.java
> Method: pruneWithMultiThread
> Line: 447
> Reason: filter.getResolver() is null.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
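The failure described in this issue is a pruning path that dereferences the filter resolver even when the query carries no filter at all. A minimal illustration of the null guard that avoids the NPE, using hypothetical simplified types rather than the actual `TableIndex` code:

```java
import java.util.List;

// Sketch of the CARBONDATA-3964 failure mode and its fix (hypothetical,
// simplified types): "select *" / "select count(*)" queries have no filter,
// so the pruning path must not dereference the filter resolver unconditionally.
public class PruneSketch {
  interface FilterResolver {
    boolean matches(String blocklet);
  }

  /** Counts blocklets surviving pruning; a null resolver means no filter, so keep all. */
  public static int prune(List<String> blocklets, FilterResolver resolver) {
    if (resolver == null) {
      return blocklets.size(); // no filter: every blocklet is selected
    }
    int selected = 0;
    for (String blocklet : blocklets) {
      if (resolver.matches(blocklet)) {
        selected++;
      }
    }
    return selected;
  }
}
```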
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
CarbonDataQA1 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-65143 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2265/
[GitHub] [carbondata] kunal642 commented on a change in pull request #3908: [CARBONDATA-3967] cache partition on select to enable faster pruning
kunal642 commented on a change in pull request #3908: URL: https://github.com/apache/carbondata/pull/3908#discussion_r484941895

## File path: integration/spark/src/main/scala/org/apache/spark/util/PartitionCacheManger.scala

@@ -0,0 +1,143 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.util

import java.net.URI
import java.util

import scala.collection.JavaConverters._

import org.apache.log4j.Logger
import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTablePartition}

import org.apache.carbondata.common.logging.LogServiceFactory
import org.apache.carbondata.core.cache.{Cache, Cacheable, CarbonLRUCache}
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.datastore.impl.FileFactory
import org.apache.carbondata.core.metadata.SegmentFileStore
import org.apache.carbondata.core.statusmanager.SegmentStatusManager
import org.apache.carbondata.core.util.path.CarbonTablePath

object PartitionCacheManager extends Cache[PartitionCacheKey, CacheablePartitionSpec] {

  private val CACHE = new CarbonLRUCache(
    CarbonCommonConstants.CARBON_PARTITION_MAX_DRIVER_LRU_CACHE_SIZE,
    CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT)

  val LOGGER: Logger = LogServiceFactory.getLogService(this.getClass.getName)

  def get(identifier: PartitionCacheKey): CacheablePartitionSpec = {
    val cacheablePartitionSpec =
      CACHE.get(identifier.tableId).asInstanceOf[CacheablePartitionSpec]
    val tableStatusModifiedTime = FileFactory
      .getCarbonFile(CarbonTablePath.getTableStatusFilePath(identifier.tablePath))
      .getLastModifiedTime
    if (cacheablePartitionSpec != null) {
      if (tableStatusModifiedTime > cacheablePartitionSpec.timestamp) {
        readPartitions(identifier, tableStatusModifiedTime)
      } else {
        cacheablePartitionSpec
      }
    } else {
      readPartitions(identifier, tableStatusModifiedTime)
    }
  }

  override def getAll(keys: util.List[PartitionCacheKey]): util.List[CacheablePartitionSpec] = {
    keys.asScala.map(get).toList.asJava
  }

  override def getIfPresent(key: PartitionCacheKey): CacheablePartitionSpec = {
    CACHE.get(key.tableId).asInstanceOf[CacheablePartitionSpec]
  }

  override def invalidate(partitionCacheKey: PartitionCacheKey): Unit = {
    CACHE.remove(partitionCacheKey.tableId)
  }

  private def readPartitions(identifier: PartitionCacheKey, tableStatusModifiedTime: Long) = {

Review comment: @QiangCai each load or query would load only the already-successful segments, so it now solves the problem you mentioned.
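The reply above points at incremental refresh: rather than re-reading every partition when the cache goes stale, only segments that are not already cached get loaded, and segments that disappeared get dropped. A small sketch of that idea under assumed, simplified types; none of these names are CarbonData's actual API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Incremental partition-cache refresh sketch (hypothetical names): partitions
// are cached per segment, so a refresh only stores segments seen for the first
// time and forgets segments that no longer exist, instead of re-reading all.
public class IncrementalSegmentCache {
  private final Map<String, List<String>> partitionsBySegment = new HashMap<>();

  /** Merges newly seen segments into the cache and returns all known partitions, sorted. */
  public List<String> refresh(Map<String, List<String>> currentSegments) {
    for (Map.Entry<String, List<String>> segment : currentSegments.entrySet()) {
      // Already-successful segments stay cached; only new segment ids are stored.
      partitionsBySegment.putIfAbsent(segment.getKey(), segment.getValue());
    }
    // Forget segments removed by delete/compaction.
    partitionsBySegment.keySet().retainAll(currentSegments.keySet());
    List<String> all = new ArrayList<>();
    for (List<String> partitions : partitionsBySegment.values()) {
      all.addAll(partitions);
    }
    Collections.sort(all);
    return all;
  }
}
```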
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
CarbonDataQA1 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-688898175 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4005/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3912: [WIP] Global sort partitions should be determined dynamically
CarbonDataQA1 commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-688912501 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4004/
[GitHub] [carbondata] Indhumathi27 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
Indhumathi27 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-688914579 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3912: [WIP] Global sort partitions should be determined dynamically
CarbonDataQA1 commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-688914915 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2264/
[GitHub] [carbondata] akashrn5 opened a new pull request #3916: [CARBONDATA-3935]Support partition table transactional write in presto
akashrn5 opened a new pull request #3916: URL: https://github.com/apache/carbondata/pull/3916

### Why is this PR needed?
Currently, we support only reading the tables created in spark in presto. This is a bottleneck, and transactional write support is required in presto for easy write and read via presto.

### What changes were proposed in this PR?
This PR is on top of #3875. It supports writing partition transactional data in presto, and it supports multiple partition columns too.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
CarbonDataQA1 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-688991758 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4006/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3910: [CARBONDATA-3969] Fix Deserialization issue with DataType class
CarbonDataQA1 commented on pull request #3910: URL: https://github.com/apache/carbondata/pull/3910#issuecomment-688993910 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2266/
[GitHub] [carbondata] marchpure commented on pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
marchpure commented on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-689003409 I just tested. With this PR, querying a non-partition table returns an EMPTY RESULT; querying a partition table works well.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization
CarbonDataQA1 commented on pull request #3789: URL: https://github.com/apache/carbondata/pull/3789#issuecomment-689021438 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4007/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization
CarbonDataQA1 commented on pull request #3789: URL: https://github.com/apache/carbondata/pull/3789#issuecomment-689024935 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2267/
[GitHub] [carbondata] Karan-c980 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
Karan-c980 commented on a change in pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#discussion_r485093341

## File path: README.md

@@ -100,3 +100,4 @@ To get involved in CarbonData:
 ## About
 Apache CarbonData is an open source project of The Apache Software Foundation (ASF).
+## PR

Review comment: Removed
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.
CarbonDataQA1 commented on pull request #3875: URL: https://github.com/apache/carbondata/pull/3875#issuecomment-689036860 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2268/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.
CarbonDataQA1 commented on pull request #3875: URL: https://github.com/apache/carbondata/pull/3875#issuecomment-689040369 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4008/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI
CarbonDataQA1 commented on pull request #3787: URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689041516 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4009/
[jira] [Created] (CARBONDATA-3977) Global sort partitions should be determined dynamically
Mahesh Raju Somalaraju created CARBONDATA-3977:
-----------------------------------------------
Summary: Global sort partitions should be determined dynamically
Key: CARBONDATA-3977
URL: https://issues.apache.org/jira/browse/CARBONDATA-3977
Project: CarbonData
Issue Type: New Feature
Reporter: Mahesh Raju Somalaraju

Global sort: if the user does not specify a number of partitions in the table properties and has not configured the property "carbon.load.global.sort.partitions", the partition count needs to be calculated dynamically based on the dataframe size:

number of partitions = dataframe size in MB / partition size
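The rule proposed in the issue, number of partitions = dataframe size in MB / partition size, can be sketched as follows. The method name and the 512 MB fallback are illustrative assumptions, not CarbonData's actual defaults:

```java
// Sketch of the dynamic global-sort partition rule from CARBONDATA-3977:
// when neither the table property nor carbon.load.global.sort.partitions is
// set, derive the partition count from the dataframe size. The 512 MB fallback
// is an assumed value for illustration only.
public class GlobalSortPartitionSketch {
  static final long ASSUMED_DEFAULT_PARTITION_SIZE_MB = 512;

  /** number of partitions = ceil(dataframe size in MB / partition size in MB), at least 1. */
  public static int dynamicPartitionCount(long dataFrameSizeInMB, long partitionSizeInMB) {
    if (partitionSizeInMB <= 0) {
      partitionSizeInMB = ASSUMED_DEFAULT_PARTITION_SIZE_MB;
    }
    // Round up so a small load still gets one partition.
    long partitions = (dataFrameSizeInMB + partitionSizeInMB - 1) / partitionSizeInMB;
    return (int) Math.max(1L, partitions);
  }
}
```

Rounding up (rather than truncating) matters here: a 10 MB load divided by a 512 MB partition size would otherwise yield zero partitions.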
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3916: [CARBONDATA-3935]Support partition table transactional write in presto
CarbonDataQA1 commented on pull request #3916: URL: https://github.com/apache/carbondata/pull/3916#issuecomment-689046280 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2270/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3787: [WIP][CARBONDATA-3923] support global sort for SI
CarbonDataQA1 commented on pull request #3787: URL: https://github.com/apache/carbondata/pull/3787#issuecomment-689047351 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2269/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3916: [CARBONDATA-3935]Support partition table transactional write in presto
CarbonDataQA1 commented on pull request #3916: URL: https://github.com/apache/carbondata/pull/3916#issuecomment-689055659 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4010/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
CarbonDataQA1 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689092982 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4011/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
CarbonDataQA1 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689096325 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2272/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically
CarbonDataQA1 commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-689101595 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4012/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3909: [CARBONDATA-3972] Date/timestamp compatability between hive and carbon
CarbonDataQA1 commented on pull request #3909: URL: https://github.com/apache/carbondata/pull/3909#issuecomment-689102959 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4013/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3909: [CARBONDATA-3972] Date/timestamp compatability between hive and carbon
CarbonDataQA1 commented on pull request #3909: URL: https://github.com/apache/carbondata/pull/3909#issuecomment-689103877 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2273/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically
CarbonDataQA1 commented on pull request #3912: URL: https://github.com/apache/carbondata/pull/3912#issuecomment-689105224 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2274/
[GitHub] [carbondata] kunal642 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689111283 LGTM
[GitHub] [carbondata] kunal642 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689111838 @Karan-c980 Please rebase, don't use merge to pull the new code
[GitHub] [carbondata] kunal642 edited a comment on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 edited a comment on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689111838 @Karan-c980 Please rebase, don't use merge to pull the new code. Merge commit should not be there
[GitHub] [carbondata] marchpure removed a comment on pull request #3913: [CARBONDATA-3974] Improve partition purning performance in presto carbon integration
marchpure removed a comment on pull request #3913: URL: https://github.com/apache/carbondata/pull/3913#issuecomment-689003409 I just tested with this PR: querying a non-partition table returns an EMPTY RESULT, while querying a partition table works well.
[GitHub] [carbondata] Karan-c980 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
Karan-c980 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689330508 done
[GitHub] [carbondata] Karan-c980 commented on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
Karan-c980 commented on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689330741

> @Karan-c980 Please rebase, don't use merge to pull the new code.
> Merge commit should not be there

Done
[GitHub] [carbondata] Karan-c980 removed a comment on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
Karan-c980 removed a comment on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689330508 done
[GitHub] [carbondata] kunal642 removed a comment on pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 removed a comment on pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#issuecomment-689111283 LGTM
[GitHub] [carbondata] kunal642 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 commented on a change in pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#discussion_r485374393

## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
## @@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;
+
+public class CarbonIUD {
+
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForDelete;
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForUpdate;
+  private final Map<String, Map<String, String>> updateColumnToValueMapping;
+
+  private CarbonIUD() {
+    filterColumnToValueMappingForDelete = new HashMap<>();
+    filterColumnToValueMappingForUpdate = new HashMap<>();
+    updateColumnToValueMapping = new HashMap<>();
+  }
+
+  /**
+   * @return CarbonIUD object
+   */
+  public static CarbonIUD getInstance() {
+    return new CarbonIUD();
+  }
+
+  /**
+   * @param path is the table path on which delete is performed
+   * @param column is the columnName on which records have to be deleted
+   * @param value of column on which the records have to be deleted
+   * @return CarbonIUD object
+   */
+  public CarbonIUD delete(String path, String column, String value) {
+    prepareDelete(path, column, value, filterColumnToValueMappingForDelete);
+    return this;
+  }
+
+  /**
+   * This method deletes the rows at given path by applying the filterExpression
+   *
+   * @param path is the table path on which delete is performed
+   * @param filterExpression is the expression to delete the records
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void delete(String path, Expression filterExpression)
+      throws IOException, InterruptedException {
+    CarbonReader reader = CarbonReader.builder(path)
+        .projection(new String[] { CarbonCommonConstants.CARBON_IMPLICIT_COLUMN_TUPLEID })
+        .filter(filterExpression).build();
+
+    RecordWriter<NullWritable, ObjectArrayWritable> deleteDeltaWriter =
+        CarbonTableOutputFormat.getDeleteDeltaRecordWriter(path);
+    ObjectArrayWritable writable = new ObjectArrayWritable();
+    while (reader.hasNext()) {
+      Object[] row = (Object[]) reader.readNextRow();
+      writable.set(row);
+      deleteDeltaWriter.write(NullWritable.get(), writable);
+    }
+    deleteDeltaWriter.close(null);
+    reader.close();
+  }
+
+  /**
+   * Calling this method will start the execution of delete process
+   *
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public void closeDelete() throws IOException, InterruptedException {
+    for (Map.Entry<String, Map<String, Set<String>>> path : this.filterColumnToValueMappingForDelete
+        .entrySet()) {
+      deleteExecution(path.getKey());
+    }
+  }
+
+  /**
+   * @param path is the table path on which update is performed
+   * @param column is the columnName on which records have to be updated
+   * @param value of column on which the records have to be updated
+   * @param updColumn is the name of updatedColumn
+   * @param updValue is the value of updatedCo
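The CarbonIUD diff quoted above accumulates delete filters in a nested mapping (table path → column → values) and only executes them when closeDelete() is called. A minimal standalone sketch of that accumulation pattern, assuming the generics eaten by the mail formatting were Map<String, Map<String, Set<String>>>; the class and method bodies here are illustrative, not the SDK implementation, and sorted collections are used only so the printed output is deterministic:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class DeleteFilterSketch {
  // path -> (column -> values); mirrors filterColumnToValueMappingForDelete
  private final Map<String, Map<String, Set<String>>> deleteFilters = new TreeMap<>();

  // Record "delete rows at `path` where `column` = `value`"; repeated calls
  // for the same path and column accumulate into one value set.
  public DeleteFilterSketch delete(String path, String column, String value) {
    deleteFilters
        .computeIfAbsent(path, p -> new TreeMap<>())
        .computeIfAbsent(column, c -> new TreeSet<>())
        .add(value);
    return this;
  }

  public Map<String, Map<String, Set<String>>> filters() {
    return deleteFilters;
  }

  public static void main(String[] args) {
    DeleteFilterSketch iud = new DeleteFilterSketch();
    iud.delete("/tables/t1", "age", "20")
        .delete("/tables/t1", "age", "30")
        .delete("/tables/t1", "name", "karan");
    // prints {/tables/t1={age=[20, 30], name=[karan]}}
    System.out.println(iud.filters());
  }
}
```

Because repeated delete() calls merge into one nested map, the fluent chaining in the SDK API costs a single deferred execution per table path rather than one scan per call.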
[GitHub] [carbondata] kunal642 commented on a change in pull request #3834: [CARBONDATA-3865] Implementation of delete/update feature in carbondata SDK.
kunal642 commented on a change in pull request #3834: URL: https://github.com/apache/carbondata/pull/3834#discussion_r48538

## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonIUD.java
## @@ -0,0 +1,376 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.File;
+import java.io.FilenameFilter;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.Field;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hadoop.internal.ObjectArrayWritable;
+
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.mapreduce.RecordWriter;
+
+public class CarbonIUD {
+
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForDelete;
+  private final Map<String, Map<String, Set<String>>> filterColumnToValueMappingForUpdate;
+  private final Map<String, Map<String, String>> updateColumnToValueMapping;
+
+  private CarbonIUD() {
+    filterColumnToValueMappingForDelete = new HashMap<>();
+    filterColumnToValueMappingForUpdate = new HashMap<>();
+    updateColumnToValueMapping = new HashMap<>();
+  }
+
+  /**
+   * @return CarbonIUD object
+   */
+  public static CarbonIUD getInstance() {

Review comment: take hadoop configuration object here to enable IUD on S3 also. You should pass the configuration to Write and Reader API internally
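The review suggestion above — accept a Hadoop Configuration in getInstance() and thread it through to the internal reader and writer — is a plain dependency-injection pattern. A hedged sketch of it, using java.util.Properties as a stand-in for org.apache.hadoop.conf.Configuration so the snippet is self-contained; all class names and property keys here are illustrative, not the actual CarbonData SDK API:

```java
import java.util.Properties;

public class ConfiguredIud {
  // Stand-in for org.apache.hadoop.conf.Configuration, which would carry
  // S3 endpoint/credential settings in the real SDK.
  private final Properties conf;

  private ConfiguredIud(Properties conf) {
    this.conf = conf;
  }

  // The factory takes the configuration once, as the reviewer suggests,
  // so every internal component sees the same filesystem settings.
  public static ConfiguredIud getInstance(Properties conf) {
    return new ConfiguredIud(conf);
  }

  // Internally, the same conf is handed to the read path ...
  public String buildReader(String path) {
    return "reader(" + path + ", endpoint=" + conf.getProperty("fs.s3a.endpoint") + ")";
  }

  // ... and to the write path, instead of each creating a default config.
  public String buildWriter(String path) {
    return "writer(" + path + ", endpoint=" + conf.getProperty("fs.s3a.endpoint") + ")";
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("fs.s3a.endpoint", "s3.example.com"); // hypothetical value
    ConfiguredIud iud = ConfiguredIud.getInstance(conf);
    System.out.println(iud.buildReader("s3a://bucket/table"));
    System.out.println(iud.buildWriter("s3a://bucket/table"));
  }
}
```

Without this injection, a reader or writer built inside CarbonIUD would fall back to defaults and fail against object stores that need explicit endpoint and credential settings.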