[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399971835 ## File path: integration/spark/src/test/scala/org/apache/carbondata/view/MaterializedViewTest.scala ## @@ -0,0 +1,194 @@ +package org.apache.carbondata.view + +import java.io.File + +import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException +import org.apache.spark.sql.catalyst.catalog.HiveTableRelation +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.execution.datasources.LogicalRelation +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.spark.exception.ProcessMetaDataException + +class MaterializedViewTest extends QueryTest with BeforeAndAfterAll { + + override def beforeAll { +drop() +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") +val projectPath = new File(this.getClass.getResource("/").getPath + "../../../../") + .getCanonicalPath.replaceAll("", "/") +val integrationPath = s"$projectPath/integration" +val resourcesPath = s"$integrationPath/spark/src/test/resources" +sql( + """ +| CREATE TABLE fact_table (empname String, designation String, doj Timestamp, +| workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, +| projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, +| utilization int,salary int) +| STORED AS carbondata + """.stripMargin) +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") + } + + test("test create mv on hive table") { +sql("drop materialized view if exists mv1") +sql("drop table if exists source") +sql("create table source as select * from fact_table") +sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname") Review comment: Now that hive mv and our mv syntax is same. what is the impact ? On hive table even if we want to create hive mv. Does it create carbon mv table ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399972261 ## File path: integration/spark/src/test/scala/org/apache/carbondata/view/MaterializedViewTest.scala ## @@ -0,0 +1,194 @@ +package org.apache.carbondata.view + +import java.io.File + +import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException +import org.apache.spark.sql.catalyst.catalog.HiveTableRelation +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.execution.datasources.LogicalRelation +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.spark.exception.ProcessMetaDataException + +class MaterializedViewTest extends QueryTest with BeforeAndAfterAll { + + override def beforeAll { +drop() +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") +val projectPath = new File(this.getClass.getResource("/").getPath + "../../../../") + .getCanonicalPath.replaceAll("", "/") +val integrationPath = s"$projectPath/integration" +val resourcesPath = s"$integrationPath/spark/src/test/resources" +sql( + """ +| CREATE TABLE fact_table (empname String, designation String, doj Timestamp, +| workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, +| projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, +| utilization int,salary int) +| STORED AS carbondata + """.stripMargin) +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") + } + + test("test create mv on hive table") { +sql("drop materialized view if exists mv1") +sql("drop table if exists source") +sql("create table source as select * from fact_table") +sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname") Review comment: do we support all the feature supported in hive mv ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399971835 ## File path: integration/spark/src/test/scala/org/apache/carbondata/view/MaterializedViewTest.scala ## @@ -0,0 +1,194 @@ +package org.apache.carbondata.view + +import java.io.File + +import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException +import org.apache.spark.sql.catalyst.catalog.HiveTableRelation +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.execution.datasources.LogicalRelation +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.spark.exception.ProcessMetaDataException + +class MaterializedViewTest extends QueryTest with BeforeAndAfterAll { + + override def beforeAll { +drop() +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") +val projectPath = new File(this.getClass.getResource("/").getPath + "../../../../") + .getCanonicalPath.replaceAll("", "/") +val integrationPath = s"$projectPath/integration" +val resourcesPath = s"$integrationPath/spark/src/test/resources" +sql( + """ +| CREATE TABLE fact_table (empname String, designation String, doj Timestamp, +| workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, +| projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, +| utilization int,salary int) +| STORED AS carbondata + """.stripMargin) +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") +sql(s"""LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE fact_table OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"')""") + } + + test("test create mv on hive table") { +sql("drop materialized view if exists mv1") +sql("drop table if exists source") +sql("create table source as select * from fact_table") +sql("create materialized view mv1 as select empname, deptname, avg(salary) from source group by empname, deptname") Review comment: Now that hive mv and our mv syntax is same. what is the impcat ? On hive table even if we want to create hive mv. Does it create carbon mv table ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399970121 ## File path: docs/index-developer-guide.md ## @@ -24,11 +24,11 @@ Currently, there are two types of DataMap supported: 2. MVDataMap: DataMap that leverages Materialized View to accelerate olap style query, like SPJG query (select, predicate, join, groupby). Preaggregate, timeseries and mv DataMap belong to this type of DataMaps. Review comment: Also documents [*.md] files need to be updated for all the changes of this PR Example, index-developer-guide.md has many datamap words in it still This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399968880 ## File path: dev/findbugs-exclude.xml ## @@ -59,7 +59,7 @@ - + Review comment: Also documents [*.md] files need to be updated for all the changes of this PR This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399969838 ## File path: dev/findbugs-exclude.xml ## @@ -59,7 +59,7 @@ - + Review comment: Example, index-developer-guide.md has many datamap words in it still This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399969838 ## File path: dev/findbugs-exclude.xml ## @@ -59,7 +59,7 @@ - + Review comment: Example, index-developer-guide.md has many datamap words in it still This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399968880 ## File path: dev/findbugs-exclude.xml ## @@ -59,7 +59,7 @@ - + Review comment: Also documents [*.md] files need to be updated for all the changes of this PR This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399968880 ## File path: dev/findbugs-exclude.xml ## @@ -59,7 +59,7 @@ - + Review comment: Also documents [*.md] files need to be updated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399965545 ## File path: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapSyncStatus.java ## @@ -36,7 +36,7 @@ /** * Interface to check whether datamap can be enabled */ -@InterfaceAudience.Developer("DataMap") +@InterfaceAudience.Developer("Index") Review comment: please delete old mv also in this PR This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399963875 ## File path: core/src/main/java/org/apache/carbondata/core/datamap/IndexChooser.java ## @@ -59,76 +59,76 @@ * the datamap which has fewer columns that is the first datamap. */ @InterfaceAudience.Internal -public class DataMapChooser { +public class IndexChooser { private CarbonTable carbonTable; - private List cgDataMaps; - private List fgDataMaps; + private List cgIndexes; + private List fgIndexes; - public DataMapChooser(CarbonTable carbonTable) throws IOException { + public IndexChooser(CarbonTable carbonTable) throws IOException { this.carbonTable = carbonTable; -// read all datamaps for this table and populate CG and FG datamap list -List visibleDataMaps = -DataMapStoreManager.getInstance().getAllVisibleDataMap(carbonTable); +// read all indexes for this table and populate CG and FG index list +List visibleIndexes = +DataMapStoreManager.getInstance().getAllVisibleIndexes(carbonTable); Review comment: DataMapStoreManager need to be changed to IndexStoreManager ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399963004 ## File path: core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java ## @@ -163,6 +165,17 @@ private DataMapStoreManager() { return provider.retrieveAllSchemas(); } + /** + * Return first match of the specified index name in table + * + */ + public Optional getIndexInTable(CarbonTable carbonTable, String indexName) Review comment: DataMapSchema need to change to IndexSchema ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399962377 ## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ## @@ -2265,6 +2265,11 @@ private CarbonCommonConstants() { */ public static final String PARENT_TABLES = "parent_tables"; + /** + * This property will be used to store table name's associated with mv + */ Review comment: why not clean in this PR itself as it is meant for MV refactoring ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399961269 ## File path: common/src/main/java/org/apache/carbondata/common/exceptions/sql/MalformedIndexCommandException.java ## @@ -21,21 +21,21 @@ import org.apache.carbondata.common.annotations.InterfaceStability; /** - * This exception will be thrown when Datamap related SQL statement is invalid + * This exception will be thrown when index related SQL statement is invalid Review comment: @niuge01 and @jackylk : As per my understanding. After this PR. There won't be datamap word in the code. It is either index or materialized view. But **many package names still has datamap** [exmaple CgDatamap, FgDatamap] and **files also exist with datamap name**. [Example DataMapStatus, DataMapStatusManager] Do we need to completly remove if weare doing this change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r393458926 ## File path: core/src/main/java/org/apache/carbondata/core/datamap/status/DataMapStatus.java ## @@ -18,7 +18,7 @@ package org.apache.carbondata.core.datamap.status; /** - * DataMap status + * Index status */ public enum DataMapStatus { Review comment: better to rename the file itself This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view
ajantha-bhat commented on a change in pull request #3661: [WIP] Support materialized view URL: https://github.com/apache/carbondata/pull/3661#discussion_r399961269 ## File path: common/src/main/java/org/apache/carbondata/common/exceptions/sql/MalformedIndexCommandException.java ## @@ -21,21 +21,21 @@ import org.apache.carbondata.common.annotations.InterfaceStability; /** - * This exception will be thrown when Datamap related SQL statement is invalid + * This exception will be thrown when index related SQL statement is invalid Review comment: @niuge01 and @jackylk : As per my understanding. After this PR. There won't be datamap word in the code. It is either index or materialized view. But **many package names still has datamap** [exmaple CgDatamap, FgDatamap] and files also exist with datamap name. [Example DataMapStatus, DataMapStatusManager] Do we need to completly remove if weare doing this change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services