[GitHub] [carbondata] vikramahuja1001 opened a new pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
vikramahuja1001 opened a new pull request #3894: URL: https://github.com/apache/carbondata/pull/3894 …rty to limit number of segments

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3855: [CARBONDATA-3863], after using index service clean the temp data
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-675280659 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2011/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3855: [CARBONDATA-3863], after using index service clean the temp data
CarbonDataQA1 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-675279974 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3752/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3837: [CARBONDATA-3927]Remove unwanted fields from tupleID to make it short and to improve store size and performance.
CarbonDataQA1 commented on pull request #3837: URL: https://github.com/apache/carbondata/pull/3837#issuecomment-675279405 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3751/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3837: [CARBONDATA-3927]Remove unwanted fields from tupleID to make it short and to improve store size and performance.
CarbonDataQA1 commented on pull request #3837: URL: https://github.com/apache/carbondata/pull/3837#issuecomment-675278720 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2010/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
ajantha-bhat commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675277686 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
CarbonDataQA1 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675276272 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3755/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
ajantha-bhat commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675272382 @akashrn5 : The 2.4.5 build has a random failure, observed in other PRs also. You can merge this.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
CarbonDataQA1 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675270450 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3750/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
CarbonDataQA1 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675268681 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2009/
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
akashrn5 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r471918176

## File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
## @@ -145,47 +150,162 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {

     sql("drop table if exists carbon_table")
   }

-  test("test insert / update with data more than 32000 characters") {
+  test("test load / insert / update with data more than 32000 characters and bad record action as Redirect") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='REDIRECT','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    var redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "0")
+    assert(checkRedirectedCsvContentAvailableInSource(testdata, redirectCsvPath))
+    val longChar: String = RandomStringUtils.randomAlphabetic(33000)
+
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "true")
-    val testdata = s"$resourcesPath/32000char.csv"
-    sql("drop table if exists load32000chardata")
-    sql("drop table if exists load32000chardata_dup")
-    sql("CREATE TABLE load32000chardata(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql("CREATE TABLE load32000chardata_dup(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "REDIRECT");
+    sql(s"insert into longerthan32kchar values('33000', '$longChar', 4)")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "1", "0")
+    var redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    var iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("33000," + longChar + ",4"))
+    }
+
+    // Update strings of length greater than 32000
+    sql(s"update longerthan32kchar set(longerthan32kchar.dim2)=('$longChar') " +
+      "where longerthan32kchar.mes1=1").show()
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "1")
+    redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("ok," + longChar + ",1"))
+    }
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "false")
+
+    // Insert longer string without converter step will throw exception
     intercept[Exception] {
-      sql("insert into load32000chardata_dup select dim1,concat(load32000chardata.dim2,''),mes1 from load32000chardata").show()
+      sql(s"insert into longerthan32kchar values('32000', '$longChar', 3)")
     }
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata_dup OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+  }
+
+  test("test load / insert / update with data more than 32000 characters and bad record action as Force") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2), Row("32123", null, 3)))

Review comment: Move `testdata` and the create/load commands to a method and pass the bad record action as a parameter; it is common code between the test cases, and the code will be cleaner.

## File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
## @@ -145,47 +150,162 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {

     sql("drop table if exists carbon_table")
   }

-  test("test insert /
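The rule these tests exercise can be sketched in isolation: string values longer than 32000 characters become bad records, and the configured action decides whether the whole row is redirected (dropped and written to a bad-records CSV) or forced (kept, with the oversized value nulled out). The following is a minimal standalone sketch of that rule only; the class and method names are illustrative, not CarbonData's actual API.

```java
// Hypothetical sketch of CarbonData's bad-record handling for oversized strings.
public class BadRecordSketch {
    static final int MAX_STRING_CHARS = 32000;

    /** REDIRECT: the whole row is dropped; returning null signals "redirected". */
    public static String[] applyRedirect(String[] row) {
        for (String col : row) {
            if (col != null && col.length() > MAX_STRING_CHARS) {
                return null;
            }
        }
        return row;
    }

    /** FORCE: the row is kept, but oversized values are replaced with null. */
    public static String[] applyForce(String[] row) {
        String[] out = row.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] != null && out[i].length() > MAX_STRING_CHARS) {
                out[i] = null;
            }
        }
        return out;
    }
}
```

Under REDIRECT the row disappears from the table (matching the `checkAnswer` calls above that expect only the short rows), while under FORCE the row survives with a null in the oversized column.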
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
Indhumathi27 commented on a change in pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#discussion_r471912427

## File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/impl/CarbonTableReader.java
## @@ -281,7 +287,11 @@ private CarbonTableCacheModel getValidCacheBySchemaTableName(SchemaTableName sch

         createInputFormat(jobConf, carbonTable.getAbsoluteTableIdentifier(),
             new IndexFilter(carbonTable, filters, true), filteredPartitions);
     Job job = Job.getInstance(jobConf);
+    CarbonProperties.getInstance()
+        .addProperty(CarbonCommonConstants.IS_QUERY_FROM_PRESTO, "true");

Review comment: I think we can add it only in the connector classes. Moved to carbonDataModule. Please check if it is ok @ajantha-bhat @kunal642
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
akashrn5 commented on a change in pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#discussion_r471903933

## File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
## @@ -281,10 +281,22 @@ object CarbonSource {

       isExternal)
     val updatedFormat = CarbonToSparkAdapter
       .getUpdatedStorageFormat(storageFormat, updatedTableProperties, tableInfo.getTablePath)

Review comment: If it's handled, please add some validations, maybe in `GeoTableExampleWithCarbonSession.scala`, just to check that the geo hash column is added to the schema of the Hive table.
[GitHub] [carbondata] kunal642 commented on pull request #3855: [CARBONDATA-3863], after using index service clean the temp data
kunal642 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-675237567 retest this please
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
akashrn5 commented on a change in pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#discussion_r471902010

## File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
## @@ -281,10 +281,22 @@ object CarbonSource {

       isExternal)
     val updatedFormat = CarbonToSparkAdapter
       .getUpdatedStorageFormat(storageFormat, updatedTableProperties, tableInfo.getTablePath)

Review comment: Here I can see the handling for `createCatalogTableForCarbonExtension` only; is it handled for `createCatalogTableForCarbonSession`?
[GitHub] [carbondata] kunal642 commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
kunal642 commented on a change in pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#discussion_r471901716

## File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/impl/CarbonTableReader.java
## @@ -281,7 +287,11 @@ private CarbonTableCacheModel getValidCacheBySchemaTableName(SchemaTableName sch

         createInputFormat(jobConf, carbonTable.getAbsoluteTableIdentifier(),
             new IndexFilter(carbonTable, filters, true), filteredPartitions);
     Job job = Job.getInstance(jobConf);
+    CarbonProperties.getInstance()
+        .addProperty(CarbonCommonConstants.IS_QUERY_FROM_PRESTO, "true");

Review comment: @ajantha-bhat If we have a common place, better to move it there.
[GitHub] [carbondata] kunal642 commented on pull request #3837: [CARBONDATA-3927]Remove unwanted fields from tupleID to make it short and to improve store size and performance.
kunal642 commented on pull request #3837: URL: https://github.com/apache/carbondata/pull/3837#issuecomment-675236391 retest this please
[GitHub] [carbondata] akashrn5 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
akashrn5 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675231491 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-675020057 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2008/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-67500 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3749/
[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3773: [CARBONDATA-3830]Presto array columns read support
ajantha-bhat edited a comment on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-674940989 @akkio-97 : Thanks for working on this. Based on the above comments (the array design is not vectorized, the fill vector needs to be dissolved and kept as original, the interface needs a default implementation, not all data types are handled, delta flows are not handled, null values are not handled), **it will be very hard to stabilize this based on the current design**. **I have analyzed and reworked the design in #3887** and **will add you as co-author to it.** **You can close this PR** and later raise PRs for local dictionary support, multi-level array & struct support, and map support.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3893: Added new property to set the value of executor LRU cache size to 70% of the total executor memory in IndexServer, if executor LRU
CarbonDataQA1 commented on pull request #3893: URL: https://github.com/apache/carbondata/pull/3893#issuecomment-674955113 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2007/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3893: Added new property to set the value of executor LRU cache size to 70% of the total executor memory in IndexServer, if executor LRU
CarbonDataQA1 commented on pull request #3893: URL: https://github.com/apache/carbondata/pull/3893#issuecomment-674954792 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3748/
[GitHub] [carbondata] Indhumathi27 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
Indhumathi27 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-674941930 retest this please
[GitHub] [carbondata] ajantha-bhat commented on pull request #3773: [CARBONDATA-3830]Presto array columns read support
ajantha-bhat commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-674940989 @akkio-97 : Thanks for working on this. Based on the above comments (the array design is not vectorized, the fill vector needs to be dissolved and kept as original, the interface needs a default implementation, not all data types are handled, delta flows are not handled), **it will be very hard to stabilize this based on the current design**. **I have analyzed and reworked the design in #3887** and **will add you as co-author to it.** **You can close this PR** and later raise PRs for local dictionary support, multi-level array & struct support, and map support.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-674907332 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2006/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-674898812 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3747/
[GitHub] [carbondata] Karan980 opened a new pull request #3893: Added new property to set the value of executor LRU cache size to 70% of the total executor memory in IndexServer, if executor LRU cache
Karan980 opened a new pull request #3893: URL: https://github.com/apache/carbondata/pull/3893

### Why is this PR needed?
This PR will set executor LRU cache memory to 70% of executor memory size, if it is not configured.

### What changes were proposed in this PR?
Added new property to set executor LRU cache size to 70%

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No
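The fallback described in this PR is simple arithmetic and can be sketched as follows. This is a hedged illustration only; the class and method names are hypothetical, not the actual IndexServer code.

```java
// Hypothetical sketch: resolve the executor LRU cache size, falling back
// to 70% of the executor's total memory when nothing is configured.
public class LruCacheDefaults {
    // 70% fallback fraction, as stated in the PR description
    static final double DEFAULT_EXECUTOR_CACHE_FRACTION = 0.70;

    /**
     * Returns the configured cache size when set (> 0); otherwise
     * computes 70% of the executor memory, truncated to whole MB.
     */
    public static long resolveCacheSizeMb(long configuredMb, long executorMemoryMb) {
        if (configuredMb > 0) {
            return configuredMb;
        }
        return (long) (executorMemoryMb * DEFAULT_EXECUTOR_CACHE_FRACTION);
    }
}
```

For a 10240 MB executor with no explicit setting, this yields a 7168 MB cache; an explicit setting always wins.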
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3778: [CARBONDATA-3916] Support array complex type with SI
Indhumathi27 commented on a change in pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#discussion_r471459506

## File path: core/src/main/java/org/apache/carbondata/core/scan/complextypes/ArrayQueryType.java
## @@ -97,21 +97,31 @@ public void fillRequiredBlockData(RawBlockletColumnChunks blockChunkHolder)

   @Override
   public Object getDataBasedOnDataType(ByteBuffer dataBuffer) {
-    Object[] data = fillData(dataBuffer);
+    return getDataBasedOnDataType(dataBuffer, false);
+  }
+
+  @Override
+  public Object getDataBasedOnDataType(ByteBuffer dataBuffer, boolean getBytesData) {

Review comment: handled
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3778: [CARBONDATA-3916] Support array complex type with SI
CarbonDataQA1 commented on pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#issuecomment-674861462 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3746/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3778: [CARBONDATA-3916] Support array complex type with SI
CarbonDataQA1 commented on pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#issuecomment-674854838 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2005/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3778: [CARBONDATA-3916] Support array complex type with SI
CarbonDataQA1 commented on pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#issuecomment-674787907 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2004/
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3778: [CARBONDATA-3916] Support array complex type with SI
Indhumathi27 commented on a change in pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#discussion_r471366000

## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
## @@ -41,39 +44,62 @@

   * map that contains the mapping of block id to the valid blocklets in that block which contain
   * the data as per the applied filter
   */
-  private Map> blockIdToBlockletIdMapping;
+  private final Map> blockIdToBlockletIdMapping;
+
+  /**
+   * checks if implicit filter exceeds complex filter threshold
+   */
+  private boolean isThresholdReached;

   public ImplicitExpression(List implicitFilterList) {
+    final Logger LOGGER = LogServiceFactory.getLogService(getClass().getName());
     // initialize map with half the size of filter list as one block id can contain
     // multiple blocklets
     blockIdToBlockletIdMapping = new HashMap<>(implicitFilterList.size() / 2);
     for (Expression value : implicitFilterList) {
       String blockletPath = ((LiteralExpression) value).getLiteralExpValue().toString();
       addBlockEntry(blockletPath);
     }
+    int complexFilterThreshold = CarbonProperties.getInstance().getComplexFilterThresholdForSI();
+    isThresholdReached = implicitFilterList.size() > complexFilterThreshold;
+    if (isThresholdReached) {
+      LOGGER.info("Implicit Filter Size: " + implicitFilterList.size() + ", Threshold is: "
+          + complexFilterThreshold);
+    }
   }

-  public ImplicitExpression(Map> blockIdToBlockletIdMapping) {
+  public ImplicitExpression(Map> blockIdToBlockletIdMapping) {
     this.blockIdToBlockletIdMapping = blockIdToBlockletIdMapping;
   }

   private void addBlockEntry(String blockletPath) {

Review comment: handled
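The diff under review combines two pieces of logic: grouping blocklet paths by block id (the map is sized at half the filter list because one block can hold several blocklets) and flagging when the implicit filter list exceeds `carbon.si.complex.filter.threshold`. A standalone sketch of both, with illustrative class and method names (the path format `blockId/blockletId` is assumed for the example):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ImplicitFilterSketch {
    /**
     * Group blocklet paths of the form "blockId/blockletId" by block id,
     * mirroring the blockIdToBlockletIdMapping built in ImplicitExpression.
     */
    public static Map<String, Set<String>> groupByBlock(List<String> blockletPaths) {
        // one block id can map to multiple blocklets, so start at half the list size
        Map<String, Set<String>> mapping = new HashMap<>(blockletPaths.size() / 2);
        for (String path : blockletPaths) {
            int sep = path.lastIndexOf('/');
            String blockId = path.substring(0, sep);
            String blockletId = path.substring(sep + 1);
            mapping.computeIfAbsent(blockId, k -> new HashSet<>()).add(blockletId);
        }
        return mapping;
    }

    /** Threshold check from the diff: reached once the filter list is strictly larger. */
    public static boolean thresholdReached(int implicitFilterSize, int threshold) {
        return implicitFilterSize > threshold;
    }
}
```

When the threshold is reached, the diff logs the filter size and threshold instead of silently continuing, which is what the `isThresholdReached` flag feeds into during SI pruning.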
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3778: [CARBONDATA-3916] Support array complex type with SI
Indhumathi27 commented on a change in pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#discussion_r471365947

## File path: core/src/main/java/org/apache/carbondata/core/scan/expression/conditional/ImplicitExpression.java
## @@ -41,39 +44,62 @@

   * map that contains the mapping of block id to the valid blocklets in that block which contain
   * the data as per the applied filter
   */
-  private Map> blockIdToBlockletIdMapping;
+  private final Map> blockIdToBlockletIdMapping;
+
+  /**
+   * checks if implicit filter exceeds complex filter threshold
+   */
+  private boolean isThresholdReached;

   public ImplicitExpression(List implicitFilterList) {
+    final Logger LOGGER = LogServiceFactory.getLogService(getClass().getName());

Review comment: moved
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3778: [CARBONDATA-3916] Support array complex type with SI
Indhumathi27 commented on a change in pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#discussion_r471365673

## File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataRefNode.java
## @@ -221,4 +221,9 @@ public int numberOfNodes()

   public List getBlockInfos() {

Review comment: removed getBlockInfos method
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3778: [CARBONDATA-3916] Support array complex type with SI
Indhumathi27 commented on a change in pull request #3778: URL: https://github.com/apache/carbondata/pull/3778#discussion_r471365424

## File path: core/src/main/java/org/apache/carbondata/core/scan/complextypes/ArrayQueryType.java
## @@ -39,7 +39,7 @@ public ArrayQueryType(String name, String parentName, int columnIndex)

   @Override
   public void addChildren(GenericQueryType children) {
-    if (this.getName().equals(children.getParentName())) {
+    if (null == this.getName() || this.getName().equals(children.getParentName())) {

Review comment: removed this check

## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
## @@ -2456,4 +2456,15 @@ private CarbonCommonConstants()

   /**
    * property which defines the insert stage flow
    */
   public static final String IS_INSERT_STAGE = "is_insert_stage";
+
+  /**
+   * Until the threshold for complex filter is reached, row id will be set to the bitset in
+   * implicit filter during secondary index pruning
+   */
+  public static final String SI_COMPLEX_FILTER_THRESHOLD = "carbon.si.complex.filter.threshold";

Review comment: handled
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-674764749 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3743/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-674756309 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2003/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3892: flink write carbon file to hdfs when file size is less than 1M,can't write
ajantha-bhat commented on a change in pull request #3892: URL: https://github.com/apache/carbondata/pull/3892#discussion_r471286842

## File path: integration/flink/src/main/java/org/apache/carbon/core/metadata/StageManager.java

@@ -81,7 +81,7 @@ public static void writeStageInput(final String stageInputPath, final StageInput
 private static void writeSuccessFile(final String successFilePath) throws IOException {
   final DataOutputStream segmentStatusSuccessOutputStream =
       FileFactory.getDataOutputStream(successFilePath,
-          CarbonCommonConstants.BYTEBUFFER_SIZE, 1024);
+          CarbonCommonConstants.BYTEBUFFER_SIZE, 1024 * 1024 * 2);

Review comment: What if the file size is greater than 2 MB? Why was 2 MB chosen? Perhaps the actual file size needs to be passed instead.
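The reviewer's suggestion of deriving the size from the actual content rather than hard-coding 2 MB can be sketched as follows. This is a hypothetical helper, not the CarbonData implementation; the method name `chooseBlockSize`, the `MIN_BLOCK_SIZE` floor, and its 1 MB value are all invented for illustration.

```java
// Hypothetical sketch: pick a block size based on the actual content length,
// instead of a fixed 2 MB constant, so files of any size can be written.
public class BlockSizeSketch {

    // Assumed floor so that tiny files still get a reasonable block size.
    static final long MIN_BLOCK_SIZE = 1024L * 1024L; // 1 MB

    // Return a block size at least as large as the content to be written:
    // small payloads fall back to the floor, large payloads are never truncated.
    static long chooseBlockSize(long contentLength) {
        return Math.max(contentLength, MIN_BLOCK_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(chooseBlockSize(500_000L));   // floor applies
        System.out.println(chooseBlockSize(5_000_000L)); // content length wins
    }
}
```

With such a helper, the caller would pass the serialized payload's length to the output-stream factory instead of a constant, which addresses the reviewer's "what if the file is greater than 2 MB" concern directly.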
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-674707292 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3742/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-674707040 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2002/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3892: flink write carbon file to hdfs when file size is less than 1M,can't write
ajantha-bhat commented on pull request #3892: URL: https://github.com/apache/carbondata/pull/3892#issuecomment-674706635

@yutaoChina: Thanks for working on this.
a) Please fix the compilation error.
b) Please create a JIRA issue and add it to the issue header.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3892: flink write carbon file to hdfs when file size is less than 1M,can't write
CarbonDataQA1 commented on pull request #3892: URL: https://github.com/apache/carbondata/pull/3892#issuecomment-674700072 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2001/