[jira] [Created] (CARBONDATA-3309) MV datamap adapt to spark 2.1 version
Chenjian Qiu created CARBONDATA-3309:

Summary: MV datamap adapt to spark 2.1 version
Key: CARBONDATA-3309
URL: https://issues.apache.org/jira/browse/CARBONDATA-3309
Project: CarbonData
Issue Type: Improvement
Reporter: Chenjian Qiu

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-3304) Distinguish the thread names created by thread pool of CarbonThreadFactory
Chenjian Qiu created CARBONDATA-3304:

Summary: Distinguish the thread names created by thread pool of CarbonThreadFactory
Key: CARBONDATA-3304
URL: https://issues.apache.org/jira/browse/CARBONDATA-3304
Project: CarbonData
Issue Type: Improvement
Reporter: Chenjian Qiu

To make problems easier to diagnose from the logs, the threads created by the thread pools of CarbonThreadFactory should have distinguishable names.
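As a rough illustration of the request above, here is a minimal Java sketch of a thread factory that prefixes each thread name with its pool name. The class `NamedThreadFactory`, the pool name "DataLoadPool" and the "-thread-N" suffix are assumptions made for this example; they are not taken from CarbonThreadFactory.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory (not CarbonThreadFactory itself) that names threads per pool. */
final class NamedThreadFactory implements ThreadFactory {
  private final String poolName;
  private final AtomicInteger threadCounter = new AtomicInteger(1);

  NamedThreadFactory(String poolName) {
    this.poolName = poolName;
  }

  @Override
  public Thread newThread(Runnable task) {
    // Produces names such as "DataLoadPool-thread-1", "DataLoadPool-thread-2", ...
    Thread thread = new Thread(task, poolName + "-thread-" + threadCounter.getAndIncrement());
    thread.setDaemon(true);
    return thread;
  }
}

final class NamedThreadFactoryDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService pool =
        Executors.newFixedThreadPool(2, new NamedThreadFactory("DataLoadPool"));
    // Each log line written by this task can now be traced back to its pool.
    pool.submit(() -> System.out.println(Thread.currentThread().getName())).get();
    pool.shutdown();
  }
}
```

With one factory instance per pool, a log line carrying the thread name identifies both the pool and the individual worker.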
[jira] [Created] (CARBONDATA-3303) MV datamap return wrong results when using coalesce and less groupby columns
Chenjian Qiu created CARBONDATA-3303:

Summary: MV datamap return wrong results when using coalesce and less groupby columns
Key: CARBONDATA-3303
URL: https://issues.apache.org/jira/browse/CARBONDATA-3303
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

*SQL:*
create table coalesce_test_main(id int,name string,height int,weight int) using carbondata
insert into coalesce_test_main select 1,'tom',170,130
insert into coalesce_test_main select 2,'tom',170,120
insert into coalesce_test_main select 3,'lily',160,100
create datamap coalesce_test_main_mv using 'mv' as select coalesce(sum(id),0) as sum_id,name as myname,weight from coalesce_test_main group by name,weight
select coalesce(sum(id),0) as sumid,name from coalesce_test_main group by name

*Result:*
1 tom
2 tom
3 lily
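For comparison, the inserted rows imply that the final query should return 3 for tom (1 + 2) and 3 for lily. The MV holds one row per (name, weight), so answering a query grouped only by name requires summing the MV rows again per name. The Java sketch below shows that roll-up on the three rows from the report; the class and method names are hypothetical and this is not CarbonData's rewrite code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MvRollupSketch {
  /** One row of the MV as defined in the report: sum_id, myname, weight. */
  static final class MvRow {
    final long sumId;
    final String name;
    final int weight;

    MvRow(long sumId, String name, int weight) {
      this.sumId = sumId;
      this.name = name;
      this.weight = weight;
    }
  }

  /** Re-aggregates sum_id over the MV rows when the query groups by name only. */
  static Map<String, Long> rollUpByName(List<MvRow> mvRows) {
    Map<String, Long> sumIdByName = new LinkedHashMap<>();
    for (MvRow row : mvRows) {
      sumIdByName.merge(row.name, row.sumId, Long::sum);
    }
    return sumIdByName;
  }

  public static void main(String[] args) {
    List<MvRow> mvRows = Arrays.asList(
        new MvRow(1, "tom", 130),
        new MvRow(2, "tom", 120),
        new MvRow(3, "lily", 100));
    System.out.println(rollUpByName(mvRows)); // prints {tom=3, lily=3}
  }
}
```

The reported result matches the MV rows returned without this second aggregation step.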
[jira] [Created] (CARBONDATA-3297) Throw IndexOutOfBoundsException when creating table and drop table at the same time
Chenjian Qiu created CARBONDATA-3297:

Summary: Throw IndexOutOfBoundsException when creating table and drop table at the same time
Key: CARBONDATA-3297
URL: https://issues.apache.org/jira/browse/CARBONDATA-3297
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

java.lang.IndexOutOfBoundsException: 179
  at scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
  at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:48)
  at scala.collection.IndexedSeqOptimized$class.segmentLength(IndexedSeqOptimized.scala:195)
  at scala.collection.mutable.ArrayBuffer.segmentLength(ArrayBuffer.scala:48)
  at scala.collection.GenSeqLike$class.prefixLength(GenSeqLike.scala:93)
  at scala.collection.AbstractSeq.prefixLength(Seq.scala:41)
  at scala.collection.IndexedSeqOptimized$class.find(IndexedSeqOptimized.scala:50)
  at scala.collection.mutable.ArrayBuffer.find(ArrayBuffer.scala:48)
  at org.apache.spark.sql.hive.CarbonFileMetastore.getTableFromMetadataCache(CarbonFileMetastore.scala:203)
  at org.apache.spark.sql.CarbonEnv$.getCarbonTable(CarbonEnv.scala:203)
  at org.apache.spark.sql.CarbonEnv$.getTablePath(CarbonEnv.scala:288)
  at org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand$$anonfun$1.apply(CarbonCreateTableCommand.scala:74)
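The trace shows a `find` over a mutable ArrayBuffer failing at an index, which is consistent with the buffer shrinking while another thread scans it. One common way to make such a lookup safe is to iterate over a snapshot. The Java sketch below assumes a hypothetical `TableCacheSketch` class backed by `CopyOnWriteArrayList`; it illustrates the general technique and is not the fix applied in CarbonFileMetastore.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical metadata cache; names are illustrative only. */
final class TableCacheSketch {
  // Every mutation copies the backing array, so readers never see a half-updated list.
  private final List<String> tableNames = new CopyOnWriteArrayList<>();

  void register(String tableName) {
    tableNames.add(tableName);
  }

  void drop(String tableName) {
    tableNames.remove(tableName);
  }

  Optional<String> find(String tableName) {
    // The for-each loop iterates over the snapshot taken when the iterator was created.
    for (String candidate : tableNames) {
      if (candidate.equalsIgnoreCase(tableName)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) throws InterruptedException {
    TableCacheSketch cache = new TableCacheSketch();
    // One thread keeps creating and dropping tables while the main thread looks them up.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 10_000; i++) {
        cache.register("t" + i);
        cache.drop("t" + i);
      }
    });
    writer.start();
    for (int i = 0; i < 10_000; i++) {
      cache.find("t" + i);
    }
    writer.join();
    System.out.println("lookups completed without IndexOutOfBoundsException");
  }
}
```

Copy-on-write suits a cache that is read far more often than it is modified; a read-write lock around the existing buffer would be another option.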
[jira] [Created] (CARBONDATA-3295) MV datamap throw exception because its rewrite algorithm when multiply subquery
Chenjian Qiu created CARBONDATA-3295:

Summary: MV datamap throw exception because its rewrite algorithm when multiply subquery
Key: CARBONDATA-3295
URL: https://issues.apache.org/jira/browse/CARBONDATA-3295
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

error:
java.lang.UnsupportedOperationException was thrown.
java.lang.UnsupportedOperationException
  at org.apache.carbondata.mv.plans.util.SQLBuildDSL$Fragment.productArity(SQLBuildDSL.scala:36)
  at scala.runtime.ScalaRunTime$$anon$1.<init>(ScalaRunTime.scala:174)
  at scala.runtime.ScalaRunTime$.typedProductIterator(ScalaRunTime.scala:172)

mv sql:
sql(s"""create datamap data_table_mv using 'mv' as
  | SELECT STARTTIME,LAYER4ID,
  | COALESCE (SUM(seq),0) AS seq_c,
  | COALESCE (SUM(succ),0) AS succ_c
  | FROM data_table
  | GROUP BY STARTTIME,LAYER4ID""".stripMargin)

Query sql:
sql(s"""SELECT MT.`3600` AS `3600`,
  | MT.`2250410101` AS `2250410101`,
  | (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
  | ELSE
  | (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) = 0 THEN 0
  | ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
  | / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
  | END) * 100
  | END) AS rate
  | FROM (
  | SELECT sum_result.*, H_REGION.`2250410101` FROM
  | (SELECT cast(floor((starttime + 28800) / 3600) * 3600 - 28800 as int) AS `3600`,
  | LAYER4ID,
  | COALESCE(SUM(seq), 0) AS seq_c,
  | COALESCE(SUM(succ), 0) AS succ_c
  | FROM data_table
  | WHERE STARTTIME >= 1549866600 AND STARTTIME < 1549899900
  | GROUP BY cast(floor((STARTTIME + 28800) / 3600) * 3600 - 28800 as int),LAYER4ID
  | )sum_result
  | LEFT JOIN
  | (SELECT l4id AS `225040101`,
  | l4name AS `2250410101`,
  | l4name AS NAME_2250410101
  | FROM region
  | GROUP BY l4id, l4name) H_REGION
  | ON sum_result.LAYER4ID = H_REGION.`225040101`
  | WHERE H_REGION.NAME_2250410101 IS NOT NULL
  | ) MT
  | GROUP BY MT.`3600`, MT.`2250410101`
  | ORDER BY `3600` ASC LIMIT 5000""".stripMargin)
[jira] [Created] (CARBONDATA-3294) MV datamap throw error when using count(1) and case when expression
Chenjian Qiu created CARBONDATA-3294:

Summary: MV datamap throw error when using count(1) and case when expression
Key: CARBONDATA-3294
URL: https://issues.apache.org/jira/browse/CARBONDATA-3294
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

Query SQL
```
sql(s"""SELECT MT.`3600` AS `3600`,
  MT.`2250410101` AS `2250410101`,
  count(1) over() as countNum,
  (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
  ELSE
  (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) = 0 THEN 0
  ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
  / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
  END) * 100
  END) AS rate
  FROM (
  SELECT sum_result.*, H_REGION.`2250410101` FROM
  (SELECT cast(floor((starttime + 28800) / 3600) * 3600 - 28800 as int) AS `3600`,
  LAYER4ID,
  COALESCE(SUM(seq), 0) AS seq_c,
  COALESCE(SUM(succ), 0) AS succ_c
  FROM data_table
  WHERE STARTTIME >= 1549866600 AND STARTTIME < 1549899900
  GROUP BY cast(floor((STARTTIME + 28800) / 3600) * 3600 - 28800 as int),LAYER4ID
  )sum_result
  LEFT JOIN
  (SELECT l4id AS `225040101`,
  l4name AS `2250410101`,
  l4name AS NAME_2250410101
  FROM region
  GROUP BY l4id, l4name) H_REGION
  ON sum_result.LAYER4ID = H_REGION.`225040101`
  WHERE H_REGION.NAME_2250410101 IS NOT NULL
  ) MT
  GROUP BY MT.`3600`, MT.`2250410101`
  ORDER BY `3600` ASC LIMIT 5000""".stripMargin)
```

ERROR:
mismatched input 'FROM' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER', 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 2, pos 0)

== SQL ==
SELECT MT.`3600`, MT.`2250410101`, `countNum`, `rate`
FROM
^^^
[jira] [Created] (CARBONDATA-3291) MV datamap doesn't take affect when the same table join
Chenjian Qiu created CARBONDATA-3291:

Summary: MV datamap doesn't take affect when the same table join
Key: CARBONDATA-3291
URL: https://issues.apache.org/jira/browse/CARBONDATA-3291
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu
[jira] [Created] (CARBONDATA-3289) MV datamap doesn't take effect when having clause use alias
Chenjian Qiu created CARBONDATA-3289:

Summary: MV datamap doesn't take effect when having clause use alias
Key: CARBONDATA-3289
URL: https://issues.apache.org/jira/browse/CARBONDATA-3289
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

create table test_main(id int,name string,height int) using carbondata
create datamap test_main_mv using 'mv' as select cast(id + 1 as bigint) as cast_id ,count(name) from test_main group by cast_id
select cast(id + 1 as bigint) as cast_id,count(name) from test_main group by cast_id having cast_id < 3
select cast(id + 1 as bigint) as cast_id,count(name) from test_main where cast(id + 1 as bigint) < 3 group by cast_id
[jira] [Created] (CARBONDATA-3286) MV datamap doesn't take effect when query SQL has coalesce with all projections and no filter condition
Chenjian Qiu created CARBONDATA-3286:

Summary: MV datamap doesn't take effect when query SQL has coalesce with all projections and no filter condition
Key: CARBONDATA-3286
URL: https://issues.apache.org/jira/browse/CARBONDATA-3286
Project: CarbonData
Issue Type: Bug
Components: core
Affects Versions: 1.5.1
Reporter: Chenjian Qiu

MV datamap doesn't take effect when query SQL has coalesce with all projections and no filter condition.

Test case:
create table test_main(id int,name string,height int,weight int) using carbondata
create datamap test_main_mv using 'mv' as select sum(id) as sum_id, sum(height) as sum_height,name as myname from test_main group by name
select coalesce(sum(id),12) as sumid, sum(height) as sum_height from test_main group by name
[jira] [Created] (CARBONDATA-3276) Compacting table that do not exist should throw NoSuchTableException instead of MalformedCarbonCommandException
Chenjian Qiu created CARBONDATA-3276:

Summary: Compacting table that do not exist should throw NoSuchTableException instead of MalformedCarbonCommandException
Key: CARBONDATA-3276
URL: https://issues.apache.org/jira/browse/CARBONDATA-3276
Project: CarbonData
Issue Type: Improvement
Reporter: Chenjian Qiu

Compacting a table that does not exist should throw NoSuchTableException instead of MalformedCarbonCommandException("Operation not allowed : ALTER TABLE table_name COMPACT 'MAJOR'"). The current error message is confusing.
[jira] [Created] (CARBONDATA-3270) MV support that group by columns doesn't need be existed in the projection
Chenjian Qiu created CARBONDATA-3270:

Summary: MV support that group by columns doesn't need be existed in the projection
Key: CARBONDATA-3270
URL: https://issues.apache.org/jira/browse/CARBONDATA-3270
Project: CarbonData
Issue Type: Bug
Reporter: Chenjian Qiu

When creating an MV, the group by columns should not need to appear in the projection.

The modification made in [JIRA-CARBONDATA-2533|https://issues.apache.org/jira/browse/CARBONDATA-2533] is not suitable: the following SQL throws UnsupportedOperationException("Group by columns must be present in project columns").

{code:java}
create table mv_groupby_main(name string,height int,age int) stored by 'carbondata'
create datamap mv_groupby_main_mv using 'mv' as select sum(height) from mv_groupby_main group by age
{code}
[jira] [Created] (CARBONDATA-3258) Add more test case for mv datamap
Chenjian Qiu created CARBONDATA-3258:

Summary: Add more test case for mv datamap
Key: CARBONDATA-3258
URL: https://issues.apache.org/jira/browse/CARBONDATA-3258
Project: CarbonData
Issue Type: Test
Components: data-query
Reporter: Chenjian Qiu

Add more test cases for the MV datamap.
[jira] [Updated] (CARBONDATA-3256) MV datamap doesn't affect using avg expression and count expression
[ https://issues.apache.org/jira/browse/CARBONDATA-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3256:

Description:
test case:
create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
explain select avg(age),name from test_table group by name

was:
test case:
create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
explain select avg(age),name from test_table group by name
error:
It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
Aggregate [name#267], [(sum((avg(age)#266 * cast(sum(count(age)#265L) as double))) / cast(sum(count(age)#265L) as double)) AS avg(age)#268, name#267]
+- SubqueryAlias gen_subsumer_0
   +- Project [sum_height#208L AS sum(height)#264L, count_age#209L AS count(age)#265L, avg_age#210 AS avg(age)#266, test_table_name#211 AS name#267]
      +- SubqueryAlias test_table_mv_table
         +- Project [sum_height#208L, count_age#209L, avg_age#210, test_table_name#211]
            +- SubqueryAlias test_table_mv_table
               +- Relation[sum_height#208L,count_age#209L,avg_age#210,test_table_name#211] CarbonDatasourceHadoopRelation [ Database name :default, Table name :test_table_mv_table, Schema :Some(StructType(StructField(sum_height,LongType,true), StructField(count_age,LongType,true), StructField(avg_age,DoubleType,true), StructField(test_table_name,StringType,true))) ]

> MV datamap doesn't affect using avg expression and count expression
> ---
>
> Key: CARBONDATA-3256
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3256
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Reporter: Chenjian Qiu
> Priority: Blocker
[jira] [Updated] (CARBONDATA-3256) MV datamap doesn't affect using avg expression and count expression
[ https://issues.apache.org/jira/browse/CARBONDATA-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3256:

Description:
test case:
create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
explain select avg(age),name from test_table group by name
[== Physical Plan ==
*HashAggregate(keys=[name#33], functions=[count(age#34), avg(cast(age#34 as bigint))])
+- Exchange hashpartitioning(name#33, 200)
   +- *HashAggregate(keys=[name#33], functions=[partial_count(age#34), partial_avg(cast(age#34 as bigint))])
      +- *FileScan carbondata default.test_table[name#33,age#34]]

was:
test case:
create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
explain select avg(age),name from test_table group by name

> MV datamap doesn't affect using avg expression and count expression
> ---
>
> Key: CARBONDATA-3256
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3256
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Reporter: Chenjian Qiu
> Priority: Blocker
[jira] [Updated] (CARBONDATA-3256) MV datamap doesn't affect using avg expression and count expression
[ https://issues.apache.org/jira/browse/CARBONDATA-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3256:

Summary: MV datamap doesn't affect using avg expression and count expression (was: MV datamap throw error using avg expression and count expression)

> MV datamap doesn't affect using avg expression and count expression
> ---
>
> Key: CARBONDATA-3256
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3256
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Reporter: Chenjian Qiu
> Priority: Blocker
>
> test case:
> create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
> create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
> explain select avg(age),name from test_table group by name
> error:
> It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
> Aggregate [name#267], [(sum((avg(age)#266 * cast(sum(count(age)#265L) as double))) / cast(sum(count(age)#265L) as double)) AS avg(age)#268, name#267]
> +- SubqueryAlias gen_subsumer_0
>    +- Project [sum_height#208L AS sum(height)#264L, count_age#209L AS count(age)#265L, avg_age#210 AS avg(age)#266, test_table_name#211 AS name#267]
>       +- SubqueryAlias test_table_mv_table
>          +- Project [sum_height#208L, count_age#209L, avg_age#210, test_table_name#211]
>             +- SubqueryAlias test_table_mv_table
>                +- Relation[sum_height#208L,count_age#209L,avg_age#210,test_table_name#211] CarbonDatasourceHadoopRelation [ Database name :default, Table name :test_table_mv_table, Schema :Some(StructType(StructField(sum_height,LongType,true), StructField(count_age,LongType,true), StructField(avg_age,DoubleType,true), StructField(test_table_name,StringType,true))) ]
[jira] [Created] (CARBONDATA-3256) MV datamap throw error using avg expression and count expression
Chenjian Qiu created CARBONDATA-3256:

Summary: MV datamap throw error using avg expression and count expression
Key: CARBONDATA-3256
URL: https://issues.apache.org/jira/browse/CARBONDATA-3256
Project: CarbonData
Issue Type: Bug
Components: sql
Reporter: Chenjian Qiu

test case:
create table test_table(name string, age int, height int,weight int) stored by 'carbondata'
create datamap test_table_mv using 'mv' as select sum(height),count(age),avg(age),name from test_table group by name
explain select avg(age),name from test_table group by name

error:
It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.;;
Aggregate [name#267], [(sum((avg(age)#266 * cast(sum(count(age)#265L) as double))) / cast(sum(count(age)#265L) as double)) AS avg(age)#268, name#267]
+- SubqueryAlias gen_subsumer_0
   +- Project [sum_height#208L AS sum(height)#264L, count_age#209L AS count(age)#265L, avg_age#210 AS avg(age)#266, test_table_name#211 AS name#267]
      +- SubqueryAlias test_table_mv_table
         +- Project [sum_height#208L, count_age#209L, avg_age#210, test_table_name#211]
            +- SubqueryAlias test_table_mv_table
               +- Relation[sum_height#208L,count_age#209L,avg_age#210,test_table_name#211] CarbonDatasourceHadoopRelation [ Database name :default, Table name :test_table_mv_table, Schema :Some(StructType(StructField(sum_height,LongType,true), StructField(count_age,LongType,true), StructField(avg_age,DoubleType,true), StructField(test_table_name,StringType,true))) ]
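The rewritten plan in the error derives avg(age) from the MV's avg and count columns, but it places sum(count(age)) inside another sum, which the analyzer rejects. The arithmetic it appears to be aiming for is a count-weighted average, which needs only one level of aggregation: sum(avg_i * count_i) / sum(count_i). The Java sketch below works through that formula; `combineAverages` is a hypothetical helper for illustration, not part of CarbonData.

```java
final class AvgRollupSketch {
  /**
   * Combines per-group averages into one overall average, weighting each by its row count.
   * Returns null when there are no rows, mirroring SQL avg over an empty input.
   */
  static Double combineAverages(double[] partialAvgs, long[] partialCounts) {
    if (partialAvgs.length != partialCounts.length) {
      throw new IllegalArgumentException("each average needs a matching count");
    }
    double weightedSum = 0.0;
    long totalCount = 0;
    for (int i = 0; i < partialAvgs.length; i++) {
      weightedSum += partialAvgs[i] * partialCounts[i]; // avg_i * count_i
      totalCount += partialCounts[i];
    }
    return totalCount == 0 ? null : weightedSum / totalCount;
  }

  public static void main(String[] args) {
    // Two MV rows: avg(age) = 20 over 1 row, avg(age) = 30 over 3 rows.
    System.out.println(combineAverages(new double[] {20.0, 30.0}, new long[] {1, 3}));
    // prints 27.5, i.e. (20*1 + 30*3) / 4
  }
}
```

Averaging the two stored averages directly would give 25.0, which is why the count column has to take part in the roll-up.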
[jira] [Created] (CARBONDATA-3247) Support to select all columns when creating MV datamap
Chenjian Qiu created CARBONDATA-3247:

Summary: Support to select all columns when creating MV datamap
Key: CARBONDATA-3247
URL: https://issues.apache.org/jira/browse/CARBONDATA-3247
Project: CarbonData
Issue Type: Bug
Components: sql
Affects Versions: 1.5.1
Reporter: Chenjian Qiu

create table all_table(name string, age int, height int) stored by 'carbondata'
create datamap all_table_mv on table all_table using 'mv' as select avg(age),avg(height),name from all_table group by name

This throws UnsupportedOperationException("MV is not supported for this query").
[jira] [Updated] (CARBONDATA-3238) Throw StackOverflowError exception using MV datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3238:

Description:
Exception:
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.get(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.contains(AttributeMap.scala:36)

TestCase:
sql("drop datamap if exists all_table_mv")
sql("drop table if exists all_table")
sql("create table all_table(x1 bigint,x2 bigint,x3 string,x4 bigint,x5 bigint,x6 int,x7 string,x8 int, x9 int,x10 bigint," +
  "x11 bigint, x12 bigint,x13 bigint,x14 bigint,x15 bigint,x16 bigint,x17 bigint,x18 bigint,x19 bigint) stored by 'carbondata'")
sql("insert into all_table select 1,1,null,1,1,1,null,1,1,1,1,1,1,1,1,1,1,1,1")
sql("create datamap all_table_mv on table all_table using 'mv' " +
  "as select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")
sql("rebuild datamap all_table_mv")
sql("explain select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")

was:
Exception:
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.get(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.contains(AttributeMap.scala:36)

TestCase:
test("select mv stack exception") {
  sql("drop datamap if exists all_table_mv")
  sql("drop table if exists all_table")
  sql("create table all_table(x1 bigint,x2 bigint,x3 string,x4 bigint,x5 bigint,x6 int,x7 string,x8 int, x9 int,x10 bigint," +
    "x11 bigint, x12 bigint,x13 bigint,x14 bigint,x15 bigint,x16 bigint,x17 bigint,x18 bigint,x19 bigint) stored by 'carbondata'")
  sql("insert into all_table select 1,1,null,1,1,1,null,1,1,1,1,1,1,1,1,1,1,1,1")
  sql("create datamap all_table_mv on table all_table using 'mv' " +
    "as select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")
  sql("rebuild datamap all_table_mv")
  sql("explain select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2").collect().foreach(println)
}

> Throw StackOverflowError exception using MV datamap
> ---
>
> Key: CARBONDATA-3238
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3238
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.5.1
> Reporter: Chenjian Qiu
> Priority: Blocker
[jira] [Updated] (CARBONDATA-3238) Throw StackOverflowError exception using MV datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3238:

Description:
Exception:
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.get(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.contains(AttributeMap.scala:36)

TestCase:
test("select mv stack exception") {
  sql("drop datamap if exists all_table_mv")
  sql("drop table if exists all_table")
  sql("create table all_table(x1 bigint,x2 bigint,x3 string,x4 bigint,x5 bigint,x6 int,x7 string,x8 int, x9 int,x10 bigint," +
    "x11 bigint, x12 bigint,x13 bigint,x14 bigint,x15 bigint,x16 bigint,x17 bigint,x18 bigint,x19 bigint) stored by 'carbondata'")
  sql("insert into all_table select 1,1,null,1,1,1,null,1,1,1,1,1,1,1,1,1,1,1,1")
  sql("create datamap all_table_mv on table all_table using 'mv' " +
    "as select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")
  sql("rebuild datamap all_table_mv")
  sql("explain select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2").collect().foreach(println)
}

was:
Exception:
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.get(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.contains(AttributeMap.scala:36)

TestCase:
test("select mv stack exception") {
  sql("drop datamap if exists all_table_mv")
  sql("drop table if exists all_table")
  sql("create table all_table(x1 bigint,x2 bigint,x3 string,x4 bigint,x5 bigint,x6 int,x7 string,x8 int, x9 int,x10 bigint," +
    "x11 bigint, x12 bigint,x13 bigint,x14 bigint,x15 bigint,x16 bigint,x17 bigint,x18 bigint,x19 bigint) stored by 'carbondata'")
  sql("insert into all_table select 1,1,null,1,1,1,null,1,1,1,1,1,1,1,1,1,1,1,1")
  sql("create datamap all_table_mv on table all_table using 'mv' " +
    "as select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")
  sql("rebuild datamap all_table_mv")
  sql("explain select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2").collect().foreach(println)
}

> Throw StackOverflowError exception using MV datamap
> ---
>
> Key: CARBONDATA-3238
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3238
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.5.1
> Reporter: Chenjian Qiu
> Priority: Blocker
[jira] [Created] (CARBONDATA-3238) Throw StackOverflowError exception using MV datamap
Chenjian Qiu created CARBONDATA-3238:

Summary: Throw StackOverflowError exception using MV datamap
Key: CARBONDATA-3238
URL: https://issues.apache.org/jira/browse/CARBONDATA-3238
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 1.5.1
Reporter: Chenjian Qiu

Exception:
java.lang.StackOverflowError
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap$$anonfun$get$1.apply(AttributeMap.scala:34)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.get(AttributeMap.scala:34)
  at org.apache.spark.sql.catalyst.expressions.AttributeMap.contains(AttributeMap.scala:36)

TestCase:
test("select mv stack exception") {
  sql("drop datamap if exists all_table_mv")
  sql("drop table if exists all_table")
  sql("create table all_table(x1 bigint,x2 bigint,x3 string,x4 bigint,x5 bigint,x6 int,x7 string,x8 int, x9 int,x10 bigint," +
    "x11 bigint, x12 bigint,x13 bigint,x14 bigint,x15 bigint,x16 bigint,x17 bigint,x18 bigint,x19 bigint) stored by 'carbondata'")
  sql("insert into all_table select 1,1,null,1,1,1,null,1,1,1,1,1,1,1,1,1,1,1,1")
  sql("create datamap all_table_mv on table all_table using 'mv' " +
    "as select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2")
  sql("rebuild datamap all_table_mv")
  sql("explain select sum(x12) as y1, sum(x13) as y2, sum(x14) as y3,sum(x15) as y4,X8,x9,x2 from all_table group by X8,x9,x2").collect().foreach(println)
}
[jira] [Updated] (CARBONDATA-3199) "show datamap" command doesn't return preaggregate datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chenjian Qiu updated CARBONDATA-3199:

Description:
The "show datamap" command doesn't return the preaggregate datamap; it only returns the bloom datamap. Our system has only a bloom datamap and a preaggregate datamap.

The 1.5.1 documentation describes the command in CarbonData DataMap Management:

"Show DataMap
There is a SHOW DATAMAPS command, when this is issued, system will read all datamap from system folder and print all information on screen. The current information includes:
DataMapName
DataMapProviderName like mv, preaggreagte, timeseries, etc
Associated Table"

was:
"show datamap" command doesn't return preaggregate datamap, only return bloom datamap;our system only has bloom datamap and preaggregate datamap

> "show datamap" command doesn't return preaggregate datamap
> ---
>
> Key: CARBONDATA-3199
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3199
> Project: CarbonData
> Issue Type: Bug
> Components: other
> Affects Versions: 1.5.1
> Reporter: Chenjian Qiu
> Priority: Minor
> Time Spent: 2h 10m
> Remaining Estimate: 0h
[jira] [Created] (CARBONDATA-3199) "show datamap" command doesn't return preaggregate datamap
Chenjian Qiu created CARBONDATA-3199: Summary: "show datamap" command doesn't return preaggregate datamap Key: CARBONDATA-3199 URL: https://issues.apache.org/jira/browse/CARBONDATA-3199 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 1.5.1 Reporter: Chenjian Qiu

The "show datamap" command doesn't return the preaggregate datamap; it returns only the bloom datamap. Our system has only a bloom datamap and a preaggregate datamap.
[jira] [Created] (CARBONDATA-3165) Query of BloomFilter java.lang.NullPointerException
Chenjian Qiu created CARBONDATA-3165: Summary: Query of BloomFilter java.lang.NullPointerException Key: CARBONDATA-3165 URL: https://issues.apache.org/jira/browse/CARBONDATA-3165 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.5.1 Reporter: Chenjian Qiu

With carbon.enable.distributed.datamap set to true and the service running for a long time, a query that uses the bloomfilter datamap throws the following exception:

24274.0 (TID 664711) | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NullPointerException
    at java.util.ArrayList.<init>(ArrayList.java:177)
    at org.apache.carbondata.datamap.bloom.BloomCoarseGrainDataMap.prune(BloomCoarseGrainDataMap.java:230)
    at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:379)
    at org.apache.carbondata.core.datamap.DistributableDataMapFormat$1.initialize(DistributableDataMapFormat.java:108)
    at org.apache.carbondata.spark.rdd.DataMapPruneRDD.internalCompute(SparkDataMapJob.scala:77)
    at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
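The top frame of the trace above is an ArrayList constructor. Of ArrayList's constructors, only the one that copies a collection dereferences its argument, so the trace is consistent with a null collection reaching the copy at BloomCoarseGrainDataMap.prune. The sketch below reproduces that failure mode in isolation and shows one possible guard. It is not the CarbonData code or its fix; `NullSafeCopySketch` and `copyOrEmpty` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NullSafeCopySketch {

    // Hypothetical helper (not part of CarbonData): copy a list that may be null.
    public static <T> List<T> copyOrEmpty(List<T> source) {
        if (source == null) {
            // Treat "nothing cached" as an empty result instead of failing.
            return new ArrayList<>();
        }
        return new ArrayList<>(source);
    }

    // Returns true when the ArrayList copy constructor rejects a null argument.
    public static boolean copyConstructorThrowsOnNull() {
        List<String> missing = null;
        try {
            new ArrayList<>(missing);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("copy constructor throws on null: " + copyConstructorThrowsOnNull());
        System.out.println("guarded copy size: " + copyOrEmpty(null).size());
    }
}
```

Whether an empty result is the right fallback depends on why the collection is null in the first place; the report does not say, so the guard is only one option.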
[jira] [Created] (CARBONDATA-3142) The names of threads created by CarbonThreadFactory are all the same
Chenjian Qiu created CARBONDATA-3142: Summary: The names of threads created by CarbonThreadFactory are all the same Key: CARBONDATA-3142 URL: https://issues.apache.org/jira/browse/CARBONDATA-3142 Project: CarbonData Issue Type: Improvement Components: data-load Affects Versions: 1.5.0 Reporter: Chenjian Qiu

The threads created by CarbonThreadFactory all share the same name, such as "ProducerPool_", which makes them hard to tell apart.
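One common way to give pooled threads distinguishable names is to append a per-pool number and a per-thread sequence number to the pool name. The sketch below shows that pattern with plain java.util.concurrent types. It is an illustration only, not the actual CarbonThreadFactory; the class name `NamedThreadFactorySketch` and the "-pool-N-thread-M" naming scheme are assumptions.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class NamedThreadFactorySketch implements ThreadFactory {

    // Shared by all factory instances so each pool gets its own number.
    private static final AtomicInteger POOL_COUNTER = new AtomicInteger(1);

    // Per-factory counter so each thread in a pool gets its own number.
    private final AtomicInteger threadCounter = new AtomicInteger(1);
    private final String prefix;

    public NamedThreadFactorySketch(String poolName) {
        this.prefix = poolName + "-pool-" + POOL_COUNTER.getAndIncrement() + "-thread-";
    }

    @Override
    public Thread newThread(Runnable task) {
        // The name combines pool name, pool number, and thread number.
        return new Thread(task, prefix + threadCounter.getAndIncrement());
    }

    public static void main(String[] args) {
        ThreadFactory factory = new NamedThreadFactorySketch("ProducerPool");
        System.out.println(factory.newThread(() -> { }).getName());
        System.out.println(factory.newThread(() -> { }).getName());
    }
}
```

With names of this shape, a log line or thread dump identifies both the pool and the individual worker, which is the goal the report describes.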
[jira] [Created] (CARBONDATA-2908) the option of sort_scope don't effects while creating table by data frame
Chenjian Qiu created CARBONDATA-2908: Summary: the option of sort_scope don't effects while creating table by data frame Key: CARBONDATA-2908 URL: https://issues.apache.org/jira/browse/CARBONDATA-2908 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.4.0 Reporter: Chenjian Qiu
[jira] [Closed] (CARBONDATA-2109) config of dataframe load with tempCSV is invalid ,such as QUOTECHAR
[ https://issues.apache.org/jira/browse/CARBONDATA-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chenjian Qiu closed CARBONDATA-2109. Resolution: Fixed Fix Version/s: 1.3.0

discard tempCSV option

> config of dataframe load with tempCSV is invalid ,such as QUOTECHAR
>
> Key: CARBONDATA-2109
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2109
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load
> Affects Versions: 1.2.0
> Reporter: Chenjian Qiu
> Priority: Minor
> Fix For: 1.3.0
> Original Estimate: 2h
> Time Spent: 2h
> Remaining Estimate: 0h
>
> When a DataFrame is used to load data with tempCSV set to true
> (dataset.writer..format("carbondata").save), CarbonData generates a SQL
> statement to load the data. The generated statement's options include only
> SINGLE_PASS; other options, such as QUOTECHAR, are ignored.
> The code is in CarbonDataFrameWriter:
>
> private def makeLoadString(csvFolder: String, options: CarbonOption): String = {
>   val dbName = CarbonEnv.getDatabaseName(options.dbName)(sqlContext.sparkSession)
>   s"""
>      | LOAD DATA INPATH '$csvFolder'
>      | INTO TABLE $dbName.${options.tableName}
>      | OPTIONS ('FILEHEADER' = '${dataFrame.columns.mkString(",")}',
>      | 'SINGLE_PASS' = '${options.singlePass}')
>    """.stripMargin
> }
[jira] [Created] (CARBONDATA-2109) config of dataframe load with tempCSV is invalid ,such as QUOTECHAR
Chenjian Qiu created CARBONDATA-2109: Summary: config of dataframe load with tempCSV is invalid ,such as QUOTECHAR Key: CARBONDATA-2109 URL: https://issues.apache.org/jira/browse/CARBONDATA-2109 Project: CarbonData Issue Type: Improvement Components: data-load Affects Versions: 1.2.0 Reporter: Chenjian Qiu

When a DataFrame is used to load data with tempCSV set to true (dataset.writer..format("carbondata").save), CarbonData generates a SQL statement to load the data. The generated statement's options include only SINGLE_PASS; other options, such as QUOTECHAR, are ignored.

The code is in CarbonDataFrameWriter:

private def makeLoadString(csvFolder: String, options: CarbonOption): String = {
  val dbName = CarbonEnv.getDatabaseName(options.dbName)(sqlContext.sparkSession)
  s"""
     | LOAD DATA INPATH '$csvFolder'
     | INTO TABLE $dbName.${options.tableName}
     | OPTIONS ('FILEHEADER' = '${dataFrame.columns.mkString(",")}',
     | 'SINGLE_PASS' = '${options.singlePass}')
   """.stripMargin
}
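The report points at a statement builder that hard-codes two options and so drops everything else the caller supplied. The issue was closed with the note "discard tempCSV option", so the sketch below is not the fix that was applied. It only illustrates, in plain Java, the alternative of building the OPTIONS clause from every supplied option. `LoadOptionsSketch`, its method signatures, and the backslash escaping of quotes are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

class LoadOptionsSketch {

    // Assumed escaping rule: backslash-escape backslashes and single quotes
    // so a value cannot terminate the SQL string literal early.
    public static String escape(String value) {
        return value.replace("\\", "\\\\").replace("'", "\\'");
    }

    // Build the LOAD DATA statement from all options instead of a fixed subset.
    public static String makeLoadString(String csvFolder, String table, Map<String, String> options) {
        StringJoiner clause = new StringJoiner(", ", "OPTIONS (", ")");
        for (Map.Entry<String, String> option : options.entrySet()) {
            String key = option.getKey().toUpperCase(Locale.ROOT);
            clause.add("'" + key + "' = '" + escape(option.getValue()) + "'");
        }
        return "LOAD DATA INPATH '" + csvFolder + "' INTO TABLE " + table + " " + clause;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("FILEHEADER", "id,name");
        options.put("SINGLE_PASS", "false");
        options.put("QUOTECHAR", "\"");
        System.out.println(makeLoadString("/tmp/csv", "db.t", options));
    }
}
```

Iterating over the option map means a newly supported load option needs no change to the builder, at the cost of having to validate or escape every value that reaches the statement.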