[jira] [Resolved] (CARBONDATA-3311) Support latest presto [0.217] in carbon

2020-07-22 Thread Manhua Jiang (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manhua Jiang resolved CARBONDATA-3311.
--
Fix Version/s: 1.6.0
   Resolution: Fixed

> Support latest presto [0.217] in carbon
> ---
>
> Key: CARBONDATA-3311
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3311
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Support the latest version of Presto. Please refer to the Presto release doc for more details.
> There is a change in the presto-hive interfaces, and a Hive analyzer has been added.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3335) Fixed load and compaction failure after alter done in older version

2020-07-22 Thread Manhua Jiang (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manhua Jiang resolved CARBONDATA-3335.
--
Fix Version/s: 1.6.0
   Resolution: Fixed

> Fixed load and compaction failure after alter done in older version
> ---
>
> Key: CARBONDATA-3335
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3335
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kumar Vishal
>Assignee: Kumar Vishal
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *No-sort load/compaction fails in the latest version when an alter was done in an older version.*





[jira] [Resolved] (CARBONDATA-3342) It throws IllegalArgumentException when using filter

2020-07-22 Thread Manhua Jiang (Jira)


 [ https://issues.apache.org/jira/browse/CARBONDATA-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manhua Jiang resolved CARBONDATA-3342.
--
Fix Version/s: 2.0.0
   Resolution: Fixed

> It throws IllegalArgumentException when using filter
> 
>
> Key: CARBONDATA-3342
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3342
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Bo Xu
>Assignee: Bo Xu
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> {code:java}
>   public void testReadWithFilterOfNonTransactional2() throws IOException, InterruptedException {
>     String path = "./testWriteFiles";
>     FileUtils.deleteDirectory(new File(path));
>     DataMapStoreManager.getInstance()
>         .clearDataMaps(AbsoluteTableIdentifier.from(path));
>     Field[] fields = new Field[2];
>     fields[0] = new Field("name", DataTypes.STRING);
>     fields[1] = new Field("age", DataTypes.INT);
>     TestUtil.writeFilesAndVerify(200, new Schema(fields), path);
>     ColumnExpression columnExpression = new ColumnExpression("age", DataTypes.INT);
>     EqualToExpression equalToExpression = new EqualToExpression(columnExpression,
>         new LiteralExpression("-11", DataTypes.INT));
>     CarbonReader reader = CarbonReader
>         .builder(path, "_temp")
>         .projection(new String[]{"name", "age"})
>         .filter(equalToExpression)
>         .build();
>     int i = 0;
>     while (reader.hasNext()) {
>       Object[] row = (Object[]) reader.readNextRow();
>       // Default sort column is applied for dimensions. So, need to validate accordingly
>       assert (((String) row[0]).contains("robot"));
>       assert (1 == (int) (row[1]));
>       i++;
>     }
>     Assert.assertEquals(i, 1);
>     reader.close();
>     FileUtils.deleteDirectory(new File(path));
>   }
> {code}
> Exception:
> {code:java}
> 2019-04-04 18:15:23 INFO  CarbonLRUCache:163 - Removed entry from InMemory lru cache :: /Users/xubo/Desktop/xubo/git/carbondata2/store/sdk/testWriteFiles/63862773138004_batchno0-0-null-63862150454623.carbonindex
> java.lang.IllegalArgumentException: no reader
>   at org.apache.carbondata.sdk.file.CarbonReader.<init>(CarbonReader.java:60)
>   at org.apache.carbondata.sdk.file.CarbonReaderBuilder.build(CarbonReaderBuilder.java:222)
>   at org.apache.carbondata.sdk.file.CarbonReaderTest.testReadWithFilterOfNonTransactional2(CarbonReaderTest.java:221)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at junit.framework.TestResult.runProtected(TestResult.java:142)
>   at junit.framework.TestResult.run(TestResult.java:125)
>   at junit.framework.TestCase.run(TestCase.java:129)
>   at junit.framework.TestSuite.runTest(TestSuite.java:255)
>   at junit.framework.TestSuite.run(TestSuite.java:250)
>   at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>   at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}





[GitHub] [carbondata] MarvinLitt commented on pull request #3855: [CARBONDATA-3863], after using index service clean the temp data

2020-07-22 Thread GitBox


MarvinLitt commented on pull request #3855:
URL: https://github.com/apache/carbondata/pull/3855#issuecomment-662774014


   > @MarvinLitt , please add proper description as to why this PR is needed and what changes are proposed
   
   done



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


kunal642 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662696746


   LGTM







[GitHub] [carbondata] VenuReddy2103 edited a comment on pull request #3788: [CARBONDATA-3844]Fix scan the relevant database instead of scanning all

2020-07-22 Thread GitBox


VenuReddy2103 edited a comment on pull request #3788:
URL: https://github.com/apache/carbondata/pull/3788#issuecomment-662573078


   If the MV is in a different database than its source tables, then we will not be able to rewrite the plan. Consider the inner join example below.
   `spark.sql("create database db1")`
   `spark.sql("create database db2")`
   `spark.sql("create database db3")`
   `spark.sql("use db1")`
   `spark.sql("create table db1_table(a int, b int) stored as carbondata")`
   `spark.sql("insert into db1_table select 1, 2")`
   `spark.sql("use db2")`
   `spark.sql("create table db2_table(i int, j int) stored as carbondata")`
   `spark.sql("insert into db2_table select 1, 4")`
   `spark.sql("use db3")`
   `spark.sql("create materialized view db3_mv as select t1.a,t2.i from db1.db1_table t1,db2.db2_table t2 where t1.a=t2.i")`
   `spark.sql("explain select t1.a, t2.i from db1.db1_table t1, db2.db2_table t2 where t1.a=t2.i").show(100, false)`
   
   If we get `getValidSchemas` only from `db1` and `db2` in `hasSuitableMV()`, we don't find one and do not rewrite the plan.







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-662577815


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1731/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-662577478


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3473/
   








[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3854: [CARBONDATA-3920]Fix compaction failure issue for SI table and metadata mismatch in concurrency

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3854:
URL: https://github.com/apache/carbondata/pull/3854#issuecomment-662520344


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3471/
   







[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-22 Thread GitBox


ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458871503



##
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.
   def get(session: SparkSession): MVManagerInSpark = {
-    var viewManager = MANAGER_MAP_BY_SESSION.get(session)
     if (viewManager == null) {
-      MANAGER_MAP_BY_SESSION.synchronized {
-        viewManager = MANAGER_MAP_BY_SESSION.get(session)
+      this.synchronized {
         if (viewManager == null) {
           viewManager = new MVManagerInSpark(session)
-          MANAGER_MAP_BY_SESSION.put(session, viewManager)
           session.sparkContext.addSparkListener(new SparkListener {
Review comment:
   This is not needed for MV, removing it.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3854: [CARBONDATA-3920]Fix compaction failure issue for SI table and metadata mismatch in concurrency

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3854:
URL: https://github.com/apache/carbondata/pull/3854#issuecomment-662510411


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1729/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-662504822


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3470/
   







[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-22 Thread GitBox


ShreelekhyaG commented on a change in pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#discussion_r458854322



##
File path: integration/spark/src/main/scala/org/apache/carbondata/view/MVManagerInSpark.scala
##
@@ -48,17 +48,14 @@ class MVManagerInSpark(session: SparkSession) extends MVManager {
 
 object MVManagerInSpark {
 
-  private val MANAGER_MAP_BY_SESSION =
-    new util.HashMap[SparkSession, MVManagerInSpark]()
+  private var viewManager: MVManagerInSpark = null
 
+  // returns single MVManager instance for all the current sessions.

Review comment:
   There should be only one MVManagerInSpark instance shared across sessions on the same SparkContext. Otherwise, queries from different Spark sessions will not hit the MV.
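   The change under review replaces the per-session map with one process-wide instance created lazily under a lock. A minimal, self-contained Java sketch of that double-checked locking pattern (class names here are illustrative stand-ins, not CarbonData's actual code); note that in plain Java the field should be `volatile` so the unsynchronized first check is safe under the Java memory model:

```java
// Double-checked locking: lazy, thread-safe creation of a single shared
// manager instance. ViewManager stands in for MVManagerInSpark.
public class ViewManagerHolder {
  static class ViewManager {}

  // volatile ensures a fully constructed instance is visible to threads
  // that take the unsynchronized fast path below.
  private static volatile ViewManager instance;

  public static ViewManager get() {
    ViewManager local = instance;       // fast path: no lock once created
    if (local == null) {
      synchronized (ViewManagerHolder.class) {
        local = instance;               // re-check under the lock
        if (local == null) {
          local = new ViewManager();
          instance = local;
        }
      }
    }
    return local;
  }

  public static void main(String[] args) {
    // Every caller, from any thread, observes the same instance.
    System.out.println(ViewManagerHolder.get() == ViewManagerHolder.get());
  }
}
```

   The first check avoids taking the lock on every call; the second check, performed while holding the lock, prevents two racing threads from each constructing an instance.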









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3858:
URL: https://github.com/apache/carbondata/pull/3858#issuecomment-662495412


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3469/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-662484694


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1728/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3858:
URL: https://github.com/apache/carbondata/pull/3858#issuecomment-662485004


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1727/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3852: [CARBONDATA-3912] Fixed concurrent locking issue for clean files after load

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3852:
URL: https://github.com/apache/carbondata/pull/3852#issuecomment-662482569


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1726/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3852: [CARBONDATA-3912] Fixed concurrent locking issue for clean files after load

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3852:
URL: https://github.com/apache/carbondata/pull/3852#issuecomment-662457454


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3468/
   







[GitHub] [carbondata] ShreelekhyaG commented on pull request #3841: [CARBONDATA-3899] Drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-22 Thread GitBox


ShreelekhyaG commented on pull request #3841:
URL: https://github.com/apache/carbondata/pull/3841#issuecomment-662451331


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3857: [CARBONDATA-3914] Fixed issue on reading data from carbon table through hive beeline when no data is present in table.

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3857:
URL: https://github.com/apache/carbondata/pull/3857#issuecomment-662443929


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1725/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3857: [CARBONDATA-3914] Fixed issue on reading data from carbon table through hive beeline when no data is present in table.

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3857:
URL: https://github.com/apache/carbondata/pull/3857#issuecomment-662442472


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3467/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662434419


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1724/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662432614


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3466/
   







[jira] [Created] (CARBONDATA-3920) compaction failure issue for SI table and metadata mismatch in concurrency

2020-07-22 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3920:
---

 Summary: compaction failure issue for SI table and metadata 
mismatch in concurrency
 Key: CARBONDATA-3920
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3920
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


when load and compaction are run concurrently, sometimes the data files or 
segment folders go missing, and due to the concurrency the SI metadata is 
sometimes overwritten, which leads to metadata inconsistency



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-662423572


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3465/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3847: [CARBONDATA-3906] Optimize sort performance in writing file

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3847:
URL: https://github.com/apache/carbondata/pull/3847#issuecomment-662414323


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1722/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3847: [CARBONDATA-3906] Optimize sort performance in writing file

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3847:
URL: https://github.com/apache/carbondata/pull/3847#issuecomment-662414760


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3464/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3858: [CARBONDATA-3919] Improve concurrent query performance

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3858:
URL: https://github.com/apache/carbondata/pull/3858#discussion_r458722832



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
##
@@ -472,9 +471,6 @@ public IndexFilter getFilterPredicates(Configuration 
configuration) {
 QueryStatisticsRecorder recorder = 
CarbonTimeStatisticsFactory.createDriverRecorder();
 QueryStatistic statistic = new QueryStatistic();
 
-// get tokens for all the required FileSystem for table path

Review comment:
   @ Reviewers: let me know if there is any impact from removing this call. I 
removed it and tested it in a cluster, and didn't find any problem. 
   So, is it required? Is this API call responsible for renewing tokens?










[GitHub] [carbondata] ajantha-bhat opened a new pull request #3858: [CARBONDATA-3919] Improve concurrent query performance

2020-07-22 Thread GitBox


ajantha-bhat opened a new pull request #3858:
URL: https://github.com/apache/carbondata/pull/3858


### Why is this PR needed?
problem1: when 500 queries were executed concurrently, the 
checkIfRefreshIsNeeded method was synchronized, so only one thread was 
working at a time.
But synchronization is actually required only when the schema is modified to 
drop tables, not for the whole function.
   
   problem2:  
   TokenCache.obtainTokensForNamenodes was causing a performance bottleneck for 
concurrent queries, 
so it was removed.

### What changes were proposed in this PR?
   for problem1: synchronize only the remove-table part. Observed the total 
performance of 500 queries improve from 10 seconds to 3 seconds in the cluster.
   
   for problem2: 
avoid calling the API.
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No
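
   The fix described for problem1, narrowing a method-level lock down to just 
the mutation of shared state, can be sketched as follows (a minimal Java 
sketch with hypothetical class and field names, not CarbonData's actual code):

   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;

   // Hypothetical sketch of narrowing a coarse method-level lock.
   class TableCache {
       private final Map<String, Long> tableModifiedTimes = new ConcurrentHashMap<>();

       // Before: the whole method is synchronized, so all concurrent
       // queries serialize even on the read-only fast path.
       synchronized boolean checkIfRefreshIsNeededCoarse(String table, long mdtOnDisk) {
           Long cached = tableModifiedTimes.get(table);
           if (cached != null && cached == mdtOnDisk) {
               return false;
           }
           tableModifiedTimes.put(table, mdtOnDisk); // only this mutation needs the lock
           return true;
       }

       // After: only the mutation is synchronized; the common read path
       // (cached timestamp matches) takes no lock at all.
       boolean checkIfRefreshIsNeeded(String table, long mdtOnDisk) {
           Long cached = tableModifiedTimes.get(table);
           if (cached != null && cached == mdtOnDisk) {
               return false;
           }
           synchronized (this) {
               tableModifiedTimes.put(table, mdtOnDisk);
           }
           return true;
       }
   }
   ```

   With 500 threads calling the fine-grained variant, only the rare 
refresh path contends for the monitor, which matches the reported 
improvement for concurrent queries.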
   
   
   
   







[GitHub] [carbondata] nihal0107 commented on pull request #3852: [CARBONDATA-3912] Fixed concurrent locking issue for clean files after load

2020-07-22 Thread GitBox


nihal0107 commented on pull request #3852:
URL: https://github.com/apache/carbondata/pull/3852#issuecomment-662396337


   > @nihal0107 please resolve conflicts
   
   Resolved.







[jira] [Created] (CARBONDATA-3919) Improve concurrent query performance

2020-07-22 Thread Ajantha Bhat (Jira)
Ajantha Bhat created CARBONDATA-3919:


 Summary: Improve concurrent query performance
 Key: CARBONDATA-3919
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3919
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


problem1: when 500 queries were executed concurrently, the 
checkIfRefreshIsNeeded method was synchronized, so only one thread was working 
at a time.

But synchronization is actually required only when the schema is modified to 
drop tables, not for the whole function.

 

solution: synchronize only the remove-table part. Observed the total 
performance of 500 queries improve from 10 seconds to 3 seconds in the cluster.

 

problem2:  

TokenCache.obtainTokensForNamenodes was causing a performance bottleneck for 
concurrent queries, so it was removed.

 

 





[GitHub] [carbondata] akashrn5 commented on pull request #3852: [CARBONDATA-3912] Fixed concurrent locking issue for clean files after load

2020-07-22 Thread GitBox


akashrn5 commented on pull request #3852:
URL: https://github.com/apache/carbondata/pull/3852#issuecomment-662386500


   @nihal0107 please resolve conflicts







[GitHub] [carbondata] akashrn5 commented on pull request #3857: [CARBONDATA-3914] Fixed issue on reading data from carbon table through hive beeline when no data is present in table.

2020-07-22 Thread GitBox


akashrn5 commented on pull request #3857:
URL: https://github.com/apache/carbondata/pull/3857#issuecomment-662384120


   add to whitelist







[jira] [Resolved] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS

2020-07-22 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3853.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Dataload fails for date column configured as BUCKET_COLUMNS
> ---
>
> Key: CARBONDATA-3853
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3853
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.0.0
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Steps and Issue
> 0: jdbc:hive2://10.20.255.171:23040/> create table if not exists 
> all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
> int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
> float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
> varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
> smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
> float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
> char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
> ('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*);
>  +--+-+
> |Result|
> +--+-+
>  +--+-+
>  No rows selected (0.494 seconds)
>  0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
> ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
> ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
> ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
> ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
> ,emptyvarwords');
>  *Error: java.lang.Exception: DataLoad failure (state=,code=0)*
>  
> *Log-*
> java.lang.Exception: DataLoad failure
>  at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> 

[GitHub] [carbondata] asfgit closed pull request #3830: [CARBONDATA-3853] Data load failure when loading with bucket column as DATE data type

2020-07-22 Thread GitBox


asfgit closed pull request #3830:
URL: https://github.com/apache/carbondata/pull/3830


   







[jira] [Updated] (CARBONDATA-3880) How to start JDBC service in distributed index

2020-07-22 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor updated CARBONDATA-3880:
-
Fix Version/s: (was: 2.0.1)
   2.1.0

>  How to start JDBC service in distributed index
> ---
>
> Key: CARBONDATA-3880
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3880
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: li
>Priority: Major
> Fix For: 2.1.0
>
>
> How to start JDBC service in distributed index





[jira] [Resolved] (CARBONDATA-3910) load fails when csv file present in local and loading to cluster

2020-07-22 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3910.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> load fails when csv file present in local and loading to cluster
> 
>
> Key: CARBONDATA-3910
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3910
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> load fails when csv file present in local and loading to cluster





[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458697336



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/command/SICreationCommand.scala
##
@@ -209,6 +209,11 @@ private[sql] case class CarbonCreateSecondaryIndexCommand(
 .get
 }")
   }
+  val isSpatialColPresent = dims.find(x => 
x.getColumnSchema.isSpatialColumn)

Review comment:
   Please check, I think we need to remove isSpatialColumn from 
columnSchema. 
   We used that mainly for making it invisible. Now that it is visible, it is 
just another plain column. Instead we can check whether the column name is in 
the table property or not.
   
   @VenuReddy2103 @ShreelekhyaG 









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458696449



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##
@@ -82,6 +81,33 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with 
BeforeAndAfterEach {
   case None => assert(false)
 }
   }
+  test("test geo table drop spatial index column") {
+createTable()
+loadData()
+val exception = intercept[MalformedCarbonCommandException](sql(s"alter 
table $table1 drop columns(mygeohash)"))
+assert(exception.getMessage.contains(

Review comment:
   Some more test cases can be added with mixed-case column names for the other 
table property validations added, for example range column, bucket column, etc.









[GitHub] [carbondata] asfgit closed pull request #3838: [CARBONDATA-3910]Fix load failure in cluster when csv present in local file system in case of global sort

2020-07-22 Thread GitBox


asfgit closed pull request #3838:
URL: https://github.com/apache/carbondata/pull/3838


   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458695151



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##
@@ -228,8 +230,15 @@ class CarbonFileMetastore extends CarbonMetaStore {
 
c.getClass.getName.equals("org.apache.spark.sql.catalyst.catalog.HiveTableRelation")
 ||
 c.getClass.getName.equals(
   
"org.apache.spark.sql.catalyst.catalog.UnresolvedCatalogRelation")) =>
-val catalogTable =
+var catalogTable =
   CarbonReflectionUtils.getFieldOfCatalogTable("tableMeta", 
c).asInstanceOf[CatalogTable]
+// remove spatial column from schema

Review comment:
   same as above, we can register this column as it is a visible column









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458694598



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
##
@@ -266,16 +266,24 @@ case class CarbonPreInsertionCasts(sparkSession: 
SparkSession) extends Rule[Logi
   relation: LogicalRelation,
   child: LogicalPlan): LogicalPlan = {
 val carbonDSRelation = 
relation.relation.asInstanceOf[CarbonDatasourceHadoopRelation]
-if (carbonDSRelation.carbonRelation.output.size > CarbonCommonConstants
+val carbonTable = carbonDSRelation.carbonRelation.carbonTable
+val properties = 
carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+var expectedOutput = carbonDSRelation.carbonRelation.output
+// have to remove geo column to support insert with original schema

Review comment:
   I think there is no need to remove this column from the original schema, as 
it is a visible column, for the same reason as I mentioned for `insertIntoCommand`









[GitHub] [carbondata] kunal642 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


kunal642 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662375543


   retest this please







[GitHub] [carbondata] kunal642 commented on pull request #3838: [CARBONDATA-3910]Fix load failure in cluster when csv present in local file system in case of global sort

2020-07-22 Thread GitBox


kunal642 commented on pull request #3838:
URL: https://github.com/apache/carbondata/pull/3838#issuecomment-662375498


   LGTM







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458690774



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala
##
@@ -70,12 +71,20 @@ private[sql] case class CarbonProjectForUpdateCommand(
 val carbonTable = CarbonEnv.getCarbonTable(databaseNameOp, 
tableName)(sparkSession)
 setAuditTable(carbonTable)
 setAuditInfo(Map("plan" -> plan.simpleString))
+val properties = 
carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
+val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
 columns.foreach { col =>
   val dataType = 
carbonTable.getColumnByName(col).getColumnSchema.getDataType
   if (dataType.isComplexType) {
 throw new UnsupportedOperationException("Unsupported operation on 
Complex data type")
   }
-
+  if (spatialProperty.isDefined) {
+if (col.contains(spatialProperty.get.trim)) {

Review comment:
   why `contains` here? It's a column name; it is supposed to be 
`equalsIgnoreCase`?
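
   A tiny sketch of the pitfall flagged above (the column and property names 
here are illustrative, not CarbonData's actual code): `String.contains` does a 
substring match, so any column whose name merely contains the spatial column's 
name would be wrongly treated as the spatial column, while an exact 
case-insensitive comparison avoids the false positive.

   ```java
   // Hypothetical helper: exact, case-insensitive column-name comparison
   // instead of substring containment.
   class ColumnNameMatch {
       static boolean isSpatialColumn(String columnName, String spatialProperty) {
           // Trim because table-property values may carry whitespace.
           return columnName.trim().equalsIgnoreCase(spatialProperty.trim());
       }
   }
   ```

   For example, `"mygeohash_backup".contains("mygeohash")` is true (a false 
positive), while the exact comparison correctly rejects it and still matches 
`"MyGeoHash"` regardless of case.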









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458689351



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp: 
Option[String],
   convertedStaticPartition)
 scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
 if (logicalPartitionRelation != null) {
-  if (selectedColumnSchema.length != 
logicalPartitionRelation.output.length) {
+  val properties = 
table.getTableInfo.getFactTable.getTableProperties.asScala
+  val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+  var expectedOutput = logicalPartitionRelation.output
+  if (spatialProperty.isDefined && selectedColumnSchema.size + 1 == 
expectedOutput.length) {

Review comment:
   why the changes in this function?
   
   As the user wanted to create a geospatial column, we created an extra column; 
select * from table should include all the columns.
   
   If the target table doesn't want the geo column, the user can specify projections.
   
   **we should not skip the spatial column while doing insert into**
   
   @VenuReddy2103 , @ShreelekhyaG 









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r458686578



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##
@@ -677,6 +677,8 @@ object CarbonParserUtil {
 val errorMsg = "range_column not support multiple columns"
 throw new MalformedCarbonCommandException(errorMsg)
   }
+  CommonUtil.validateForSpatialTypeColumn(tableProperties, rangeColumn,

Review comment:
   Instead of checking property by property, once all the properties are 
filled, isn't it better to validate in one place?
   
   Also I see that it is not handled for sortcolumns, column_metacache, 
no_inverted_index.
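
   The one-place validation suggested above could be sketched as follows (a 
minimal Java sketch; the class name, property keys, and error message are 
hypothetical, not CarbonData's actual code):

   ```java
   import java.util.Arrays;
   import java.util.List;
   import java.util.Map;

   // Hypothetical sketch: after all table properties are parsed, check the
   // spatial index column against every column-list property in one pass.
   class SpatialPropertyValidator {
       static final List<String> PROPERTIES_TO_CHECK = Arrays.asList(
           "range_column", "bucket_columns", "sort_columns",
           "column_meta_cache", "no_inverted_index");

       static void validate(Map<String, String> tableProperties) {
           String spatial = tableProperties.get("spatial_index");
           if (spatial == null) {
               return; // not a geo table, nothing to validate
           }
           for (String property : PROPERTIES_TO_CHECK) {
               String value = tableProperties.get(property);
               if (value == null) {
                   continue;
               }
               // These properties hold comma-separated column lists;
               // compare each entry exactly, ignoring case and whitespace.
               for (String column : value.split(",")) {
                   if (column.trim().equalsIgnoreCase(spatial.trim())) {
                       throw new IllegalArgumentException(
                           "spatial index column '" + spatial
                               + "' is not allowed in " + property);
                   }
               }
           }
       }
   }
   ```

   Calling this once after property parsing covers range column, bucket 
columns, sort columns, column_meta_cache, and no_inverted_index together, 
instead of repeating the check in each property's parser.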









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3838: [CARBONDATA-3910]Fix load failure in cluster when csv present in local file system in case of global sort

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3838:
URL: https://github.com/apache/carbondata/pull/3838#issuecomment-662365766


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1721/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3838: [CARBONDATA-3910]Fix load failure in cluster when csv present in local file system in case of global sort

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3838:
URL: https://github.com/apache/carbondata/pull/3838#issuecomment-662366174


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3463/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662366027


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1720/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662363363


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3462/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3830: [CARBONDATA-3853] Data load failure when loading with bucket column as DATE data type

2020-07-22 Thread GitBox


ajantha-bhat commented on pull request #3830:
URL: https://github.com/apache/carbondata/pull/3830#issuecomment-662362251


   LGTM







[GitHub] [carbondata] shunlean commented on pull request #3847: [CARBONDATA-3906] Optimize sort performance in writing file

2020-07-22 Thread GitBox


shunlean commented on pull request #3847:
URL: https://github.com/apache/carbondata/pull/3847#issuecomment-662356431


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3849: [CARBONDATA-3913] Table level dateformat, timestampformat support

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3849:
URL: https://github.com/apache/carbondata/pull/3849#issuecomment-662337358


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1719/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3849: [CARBONDATA-3913] Table level dateformat, timestampformat support

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3849:
URL: https://github.com/apache/carbondata/pull/3849#issuecomment-662336254


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3461/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3847: [CARBONDATA-3906] Optimize sort performance in writting file

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3847:
URL: https://github.com/apache/carbondata/pull/3847#issuecomment-662325092


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3460/







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3847: [CARBONDATA-3906] Optimize sort performance in writting file

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3847:
URL: https://github.com/apache/carbondata/pull/3847#issuecomment-662324632


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1718/







[GitHub] [carbondata] kunal642 commented on pull request #3838: [CARBONDATA-3910]Fix load failure in cluster when csv present in local file system in case of global sort

2020-07-22 Thread GitBox


kunal642 commented on pull request #3838:
URL: https://github.com/apache/carbondata/pull/3838#issuecomment-662298190


   retest this please







[GitHub] [carbondata] kunal642 commented on pull request #3843: [CARBONDATA-3911]Fix NullPointerException in case of multiple updates and clean files

2020-07-22 Thread GitBox


kunal642 commented on pull request #3843:
URL: https://github.com/apache/carbondata/pull/3843#issuecomment-662297938


   retest this please







[jira] [Created] (CARBONDATA-3918) Select count(*) gives extra data after multiple updates with index server running

2020-07-22 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3918:
---

 Summary: Select count(*) gives extra data after multiple updates with index server running
 Key: CARBONDATA-3918
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3918
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal
 Fix For: 2.1.0


Select count(*) gives extra data after multiple updates while the index server is running.

Steps to reproduce: start the index server, create a table and load data, perform two updates, then run select count(*), which returns extra rows.
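The reproduction above can be sketched as follows (hypothetical table name, columns, and CSV path; assumes a CarbonData SQL session with the index server enabled):

```sql
CREATE TABLE t1 (id INT, name STRING) STORED AS carbondata;

LOAD DATA INPATH 'hdfs://hacluster/data/t1.csv' INTO TABLE t1;

-- two successive updates on the same table
UPDATE t1 SET (name) = ('aa') WHERE id = 1;
UPDATE t1 SET (name) = ('bb') WHERE id = 2;

-- expected: the number of loaded rows; observed: extra rows
SELECT COUNT(*) FROM t1;
```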



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-662284841


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3459/







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3774: [CARBONDATA-3833] Make geoID visible

2020-07-22 Thread GitBox


CarbonDataQA1 commented on pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#issuecomment-662284603


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1717/







[GitHub] [carbondata] ShreelekhyaG commented on pull request #3849: [CARBONDATA-3913] Table level dateformat, timestampformat support

2020-07-22 Thread GitBox


ShreelekhyaG commented on pull request #3849:
URL: https://github.com/apache/carbondata/pull/3849#issuecomment-662279167


   checked and updated.







[GitHub] [carbondata] QiangCai commented on pull request #3778: [CARBONDATA-3916] Support array with SI

2020-07-22 Thread GitBox


QiangCai commented on pull request #3778:
URL: https://github.com/apache/carbondata/pull/3778#issuecomment-662267686


   Agree with @ajantha-bhat 







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3778: [CARBONDATA-3916] Support array with SI

2020-07-22 Thread GitBox


ajantha-bhat commented on a change in pull request #3778:
URL: https://github.com/apache/carbondata/pull/3778#discussion_r458561948



##
File path: index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithComplexArrayType.scala
##
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.secondaryindex
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterEach
+
+import org.apache.carbondata.spark.testsuite.secondaryindex.TestSecondaryIndexUtils.isFilterPushedDownToSI
+
+class TestSIWithComplexArrayType extends QueryTest with BeforeAndAfterEach {
+
+  override def beforeEach(): Unit = {
+    sql("drop table if exists complextable")
+  }
+
+  override def afterEach(): Unit = {
+    sql("drop index if exists index_1 on complextable")
+    sql("drop table if exists complextable")
+  }
+
+  test("test array on secondary index") {

Review comment:
   also add a test case for c)









[GitHub] [carbondata] QiangCai commented on a change in pull request #3778: [CARBONDATA-3916] Support array with SI

2020-07-22 Thread GitBox


QiangCai commented on a change in pull request #3778:
URL: https://github.com/apache/carbondata/pull/3778#discussion_r458509321



##
File path: core/src/main/java/org/apache/carbondata/core/scan/complextypes/ArrayQueryType.java
##
@@ -97,21 +97,35 @@ public void fillRequiredBlockData(RawBlockletColumnChunks blockChunkHolder)
 
   @Override
   public Object getDataBasedOnDataType(ByteBuffer dataBuffer) {
-    Object[] data = fillData(dataBuffer);
+    return getDataBasedOnDataType(dataBuffer, false);
+  }
+
+  @Override
+  public Object getDataBasedOnDataType(ByteBuffer dataBuffer, boolean getBytesData) {
+    Object[] data = fillData(dataBuffer, false);
     if (data == null) {
       return null;
     }
     return DataTypeUtil.getDataTypeConverter().wrapWithGenericArrayData(data);
   }
 
-  protected Object[] fillData(ByteBuffer dataBuffer) {
+  @Override
+  public Object[] getObjectArrayDataBasedOnDataType(ByteBuffer dataBuffer) {
+    Object[] data = fillData(dataBuffer, true);
+    if (data == null) {
+      return null;
+    }
+    return data;

Review comment:
   return fillData(dataBuffer, true);

##
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/query/SecondaryIndexQueryResultProcessor.java
##
@@ -281,12 +310,45 @@ private void processResult(List<CarbonIterator<RowBatch>> detailQueryResultItera
       }
     }
 
+    if (!complexDimensionInfoMap.isEmpty() && complexColumnParentBlockIndexes.length > 0) {
+      // In case of complex array type, flatten the data and add for sorting
+      // TODO: Handle for nested array and other complex types
+      for (int k = 0; k < wrapper.getComplexTypesKeys().length; k++) {
+        byte[] complexKeyByIndex = wrapper.getComplexKeyByIndex(k);
+        ByteBuffer byteArrayInput = ByteBuffer.wrap(complexKeyByIndex);
+        GenericQueryType genericQueryType =
+            complexDimensionInfoMap.get(complexColumnParentBlockIndexes[k]);
+        short length = byteArrayInput.getShort(2);
+        // get flattened array data
+        Object[] data = genericQueryType.getObjectArrayDataBasedOnDataType(byteArrayInput);
+        if (length != 1) {
+          for (int j = 1; j < length; j++) {
+            preparedRow[i] = getData(data, j);
+            preparedRow[i + 1] = implicitColumnByteArray;
+            addRowForSorting(preparedRow.clone());
+          }
+          // add first row
+          preparedRow[i] = getData(data, 0);
+        } else {
+          preparedRow[i] = getData(data, 0);
+        }

Review comment:
   }
   preparedRow[i] = getData(data, 0);

##
File path: integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/query/SecondaryIndexQueryResultProcessor.java
##
@@ -226,8 +239,16 @@ private void processResult(List<CarbonIterator<RowBatch>> detailQueryResultItera
     for (CarbonIterator<RowBatch> detailQueryIterator : detailQueryResultIteratorList) {
       while (detailQueryIterator.hasNext()) {
         RowBatch batchResult = detailQueryIterator.next();
+        DetailQueryResultIterator queryIterator = (DetailQueryResultIterator) detailQueryIterator;
+        BlockExecutionInfo blockExecutionInfo = queryIterator.getBlockExecutionInfo();
+        // get complex dimension info map from block execution info
+        Map<Integer, GenericQueryType> complexDimensionInfoMap =
+            blockExecutionInfo.getComplexDimensionInfoMap();
+        int[] complexColumnParentBlockIndexes =
+            blockExecutionInfo.getComplexColumnParentBlockIndexes();

Review comment:
   move to outside of while loop
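The two suggestions above amount to hoisting loop-invariant lookups out of the while loop and the duplicated `preparedRow[i] = getData(data, 0)` tail out of the if/else branches. Below is a minimal, self-contained sketch of the resulting flattening shape — `flatten`, `implicitColumn`, and the element ordering are illustrative assumptions, not CarbonData's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenSketch {

  // Flattens one array cell into one sort row per element, pairing each
  // element with the row's implicit (row-position) column value.
  static List<Object[]> flatten(Object[] data, Object implicitColumn) {
    List<Object[]> sortRows = new ArrayList<>();
    // rows for elements 1..n-1 are added inside the loop ...
    for (int j = 1; j < data.length; j++) {
      sortRows.add(new Object[] { data[j], implicitColumn });
    }
    // ... and the first element is added unconditionally afterwards, so the
    // length == 1 special case from the reviewed code disappears
    sortRows.add(new Object[] { data[0], implicitColumn });
    return sortRows;
  }

  public static void main(String[] args) {
    List<Object[]> rows = flatten(new Object[] { "a", "b" }, "pos#0");
    System.out.println(rows.size());    // 2: one sort row per array element
    System.out.println(rows.get(0)[0]); // b  (elements 1..n-1 come first)
    System.out.println(rows.get(1)[0]); // a
  }
}
```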




