[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106192438 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -313,6 +307,100 @@ private[sql] case class AlterTableAddColumns( } } +private[sql] case class AlterTableRenameTable(alterTableRenameModel: AlterTableRenameModel) --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106189693 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -747,14 +835,14 @@ case class LoadTable( true } else { LOGGER.error("Can't use single_pass, because SINGLE_PASS and ALL_DICTIONARY_PATH" + - "can not be used together, and USE_KETTLE must be set as false") + "can not be used together, and USE_KETTLE must be set as false") --- End diff -- wrong indentation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106174923 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -313,6 +307,100 @@ private[sql] case class AlterTableAddColumns( } } +private[sql] case class AlterTableRenameTable(alterTableRenameModel: AlterTableRenameModel) --- End diff -- Move these commands and case class AlterTableRenameTable to AlterTableCommands.scala --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106131227 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) +val alterTableChangeDataTypeModel = + AlterTableDataTypeChangeModel(parseDataType(dataType.toLowerCase, values), +convertDbNameToLowerCase(dbName), +table.toLowerCase, +columnName.toLowerCase, +columnNameCopy.toLowerCase) +AlterTableDataTypeChange(alterTableChangeDataTypeModel) +} + + protected lazy val alterTableAddColumns: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ +(ADD ~> COLUMNS ~> "(" ~> repsep(anyFieldDef, ",") <~ ")") ~ +(TBLPROPERTIES ~> "(" ~> repsep(loadOptions, ",") <~ ")").? <~ opt(";") ^^ { + case dbName ~ table ~ fields ~ tblProp => +fields.foreach{ f => + if (isComplexDimDictionaryExclude(f.dataType.get)) { +throw new MalformedCarbonCommandException( + s"Add column is unsupported for complex datatype column: ${f.column}") + } +} +val tableProps = if (tblProp.isDefined) { + // default value should not be converted to lower case + val tblProps = tblProp.get.map(f => if (f._1.toLowerCase.startsWith("default.value.")) { +f._1 -> f._2 + } else { +f._1 -> f._2.toLowerCase + }) + scala.collection.mutable.Map(tblProps: _*) +} else { + scala.collection.mutable.Map.empty[String, String] +} + +val tableModel = prepareTableModel (false, + convertDbNameToLowerCase(dbName), + table.toLowerCase, + fields.map(convertFieldNamesToLowercase), + Seq.empty, + tableProps, + None, + true) + +val alterTableAddColumnsModel = AlterTableAddColumnsModel(convertDbNameToLowerCase(dbName), + table, + tableProps, + tableModel.dimCols, + tableModel.msrCols, + tableModel.highcardinalitydims.getOrElse(Seq.empty)) +AlterTableAddColumns(alterTableAddColumnsModel) +} + + private def convertFieldNamesToLowercase(field: Field): Field = { +val name = field.column.toLowerCase +field.copy(column = name, name = Some(name)) + } + protected lazy val alterTableDropColumn: Parser[LogicalPlan] = --- End diff -- Ok --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106129635 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) --- End diff -- Ok --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106125641 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala --- @@ -304,38 +341,76 @@ class CarbonMetastore(conf: RuntimeConfig, val storePath: String) { if (tableExists(tableName, Some(dbName))(sparkSession)) { sys.error(s"Table [$tableName] already exists under Database [$dbName]") } +val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) val schemaConverter = new ThriftWrapperSchemaConverterImpl val thriftTableInfo = schemaConverter .fromWrapperToExternalTableInfo(tableInfo, dbName, tableName) -val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) thriftTableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history .add(schemaEvolutionEntry) +val carbonTablePath = createSchemaThriftFile(tableInfo, + thriftTableInfo, + dbName, + tableName)(sparkSession) +updateSchemasUpdatedTime(touchSchemaFileSystemTime(dbName, tableName)) +LOGGER.info(s"Table $tableName for Database $dbName created successfully.") +carbonTablePath + } + /** + * This method will write the schema thrift file in carbon store and load table metadata + * + * @param tableInfo + * @param thriftTableInfo + * @param dbName + * @param tableName + * @param sparkSession + * @return + */ + private def createSchemaThriftFile( + tableInfo: org.apache.carbondata.core.metadata.schema.table.TableInfo, + thriftTableInfo: org.apache.carbondata.format.TableInfo, + dbName: String, tableName: String) +(sparkSession: SparkSession): String = { val carbonTableIdentifier = new CarbonTableIdentifier(dbName, tableName, tableInfo.getFactTable.getTableId) val carbonTablePath = CarbonStorePath.getCarbonTablePath(storePath, carbonTableIdentifier) val schemaFilePath = carbonTablePath.getSchemaFilePath val schemaMetadataPath = CarbonTablePath.getFolderContainingFile(schemaFilePath) tableInfo.setMetaDataFilepath(schemaMetadataPath) tableInfo.setStorePath(storePath) -CarbonMetadata.getInstance().loadTableMetadata(tableInfo) -val tableMeta = new TableMeta(carbonTableIdentifier, storePath, - CarbonMetadata.getInstance().getCarbonTable(dbName + "_" + tableName)) - val fileType = FileFactory.getFileType(schemaMetadataPath) if (!FileFactory.isFileExist(schemaMetadataPath, fileType)) { FileFactory.mkdirs(schemaMetadataPath, fileType) } val thriftWriter = new ThriftWriter(schemaFilePath, false) -thriftWriter.open() +thriftWriter.open(FileWriteOperation.OVERWRITE) thriftWriter.write(thriftTableInfo) thriftWriter.close() +removeTableFromMetadata(dbName, tableName) --- End diff -- As per old code, when new table is created, table info is added into CarbonMetastore and modified.mdt file is timestamp is updated to refresh in other sessions. Those changes are extracted to a new method and used in CreateTable and AlterTable flow, updating mdt file code is missing, which i added --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106112796 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -194,4 +196,102 @@ object CarbonScalaUtil { } } } + + /** + * This method will validate a column for its data type and check whether the column data type + * can be modified and update if conditions are met + * + * @param dataTypeInfo + * @param carbonColumn + */ + def validateColumnDataType(dataTypeInfo: DataTypeInfo, carbonColumn: CarbonColumn): Unit = { +carbonColumn.getDataType.getName match { + case "INT" => +if (!dataTypeInfo.dataType.equals("bigint")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn +.getDataType.getName +} cannot be modified. Int can only be changed to bigInt") +} + case "DECIMAL" => +if (!dataTypeInfo.dataType.equals("decimal")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn.getDataType.getName +} cannot be modified. Decimal can be only be changed to Decimal of higher precision") +} +if (dataTypeInfo.precision <= carbonColumn.getColumnSchema.getPrecision) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. Specified precision value ${ + dataTypeInfo +.precision +} should be greater or equal to current precision value ${ + carbonColumn.getColumnSchema +.getPrecision +}") +} else if (dataTypeInfo.scale <= carbonColumn.getColumnSchema.getScale) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. Specified scale value ${ + dataTypeInfo +.scale +} should be greater or equal to current scale value ${ + carbonColumn.getColumnSchema +.getScale +}") +} else { + // difference of precision and scale specified by user should not be less than the + // difference of already existing precision and scale else it will result in data loss + val carbonColumnPrecisionScaleDiff = carbonColumn.getColumnSchema.getPrecision - + carbonColumn.getColumnSchema.getScale + val dataInfoPrecisionScaleDiff = dataTypeInfo.precision - dataTypeInfo.scale + if (dataInfoPrecisionScaleDiff < carbonColumnPrecisionScaleDiff) { +sys + .error(s"Given column ${ +carbonColumn + .getColName + } cannot be modified. Specified precision and scale values will lead to data loss") + } +} + case _ => +sys + .error(s"Given column ${ carbonColumn.getColName } with data type ${ +carbonColumn + .getDataType.getName + } cannot be modified. Only Int and Decimal data types are allowed for modification") +} + } + + /** + * This method will create a copy of the same object + * + * @param thriftColumnSchema object to be cloned + * @return + */ + def createColumnSchemaCopyObject(thriftColumnSchema: org.apache.carbondata.format.ColumnSchema) --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106112761 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataTypeConverterUtil.scala --- @@ -77,4 +77,40 @@ object DataTypeConverterUtil { case DataType.STRUCT => "struct" } } + + /** + * convert from wrapper to external data type + * + * @param dataType + * @return + */ + def convertToThriftDataType(dataType: String): org.apache.carbondata.format.DataType = { --- End diff -- There is no direct string to thrift datatype conversion. When creating table, we first convert string to Wrapper datatype and from wrapper datatype, we convert into thrift datatype. This method is required for alter table, as we r directly converting string to thrift type in alter table change datatype --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106095253 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -136,6 +140,298 @@ case class AlterTableCompaction(alterTableModel: AlterTableModel) extends Runnab } } +private[sql] case class AlterTableDataTypeChange( +alterTableDataTypeChangeModel: AlterTableDataTypeChangeModel) extends RunnableCommand { + + val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName) + + def run(sparkSession: SparkSession): Seq[Row] = { +val tableName = alterTableDataTypeChangeModel.tableName +val dbName = alterTableDataTypeChangeModel.databaseName + .getOrElse(sparkSession.catalog.currentDatabase) +LOGGER.audit(s"Alter table change data type request has been received for $dbName.$tableName") +val relation = + CarbonEnv.get.carbonMetastore +.lookupRelation(Option(dbName), tableName)(sparkSession) +.asInstanceOf[CarbonRelation] +if (relation == null) { + LOGGER.audit(s"Alter table change data type request has failed. " + + s"Table $dbName.$tableName does not exist") + sys.error(s"Table $dbName.$tableName does not exist") +} +// acquire the lock first +val table = relation.tableMeta.carbonTable +val carbonLock = CarbonLockFactory + .getCarbonLockObj(table.getAbsoluteTableIdentifier.getCarbonTableIdentifier, +LockUsage.METADATA_LOCK) +try { + // get the latest carbon table and check for column existence + val carbonTable = CarbonMetadata.getInstance.getCarbonTable(dbName + "_" + tableName) + val columnName = alterTableDataTypeChangeModel.columnName + var carbonColumnToBeModified: CarbonColumn = null + val carbonColumns = carbonTable.getCreateOrderColumn(tableName).asScala + // read the latest schema file + val carbonTablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getStorePath, +carbonTable.getCarbonTableIdentifier) + val tableMetadataFile = carbonTablePath.getSchemaFilePath + val tableInfo: org.apache.carbondata.format.TableInfo = CarbonEnv.get.carbonMetastore +.readSchemaFile(tableMetadataFile) + // maintain the added column for schema evolution history + var addColumnSchema: org.apache.carbondata.format.ColumnSchema = null + var deletedColumnSchema: org.apache.carbondata.format.ColumnSchema = null + val columnSchemaList = tableInfo.fact_table.table_columns.asScala + columnSchemaList.foreach { columnSchema => +if (columnSchema.column_name.equalsIgnoreCase(columnName)) { + deletedColumnSchema = CarbonScalaUtil.createColumnSchemaCopyObject(columnSchema) + columnSchema.setData_type(DataTypeConverterUtil + .convertToThriftDataType(alterTableDataTypeChangeModel.dataTypeInfo.dataType)) + columnSchema.setPrecision(alterTableDataTypeChangeModel.dataTypeInfo.precision) + columnSchema.setScale(alterTableDataTypeChangeModel.dataTypeInfo.scale) + addColumnSchema = columnSchema +} + } + val schemaEvolutionEntry = new SchemaEvolutionEntry(System.currentTimeMillis) + schemaEvolutionEntry.setAdded(List(addColumnSchema).asJava) + schemaEvolutionEntry.setRemoved(List(deletedColumnSchema).asJava) + tableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history.get(0) +.setTime_stamp(System.currentTimeMillis) + CarbonEnv.get.carbonMetastore +.updateTableSchema(carbonTable.getCarbonTableIdentifier, + tableInfo, + schemaEvolutionEntry, + carbonTable.getStorePath)(sparkSession) + + val tableIdentifier = TableIdentifier(tableName, Some(dbName)) + val schema = CarbonEnv.get.carbonMetastore +.lookupRelation(tableIdentifier)(sparkSession).schema.json + sparkSession.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client.runSqlHive( +s"ALTER TABLE $dbName.$tableName SET TBLPROPERTIES('spark.sql.sources.schema'='$schema')") + sparkSession.catalog.refreshTable(tableIdentifier.quotedString) + LOGGER.info(s"Alter table for data type change is successful for table $dbName.$tableName") + LOGGER.audit(s"Alter table for data type change is successful for table $dbName.$tableName") +} catch { + case e: Exception => +LOGGER.error("Alter table change datatype failed : " + e.getMessage) +throw e +} finally { + // release lock after command execution
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106093450 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala --- @@ -304,38 +341,76 @@ class CarbonMetastore(conf: RuntimeConfig, val storePath: String) { if (tableExists(tableName, Some(dbName))(sparkSession)) { sys.error(s"Table [$tableName] already exists under Database [$dbName]") } +val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) val schemaConverter = new ThriftWrapperSchemaConverterImpl val thriftTableInfo = schemaConverter .fromWrapperToExternalTableInfo(tableInfo, dbName, tableName) -val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) thriftTableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history .add(schemaEvolutionEntry) +val carbonTablePath = createSchemaThriftFile(tableInfo, + thriftTableInfo, + dbName, + tableName)(sparkSession) +updateSchemasUpdatedTime(touchSchemaFileSystemTime(dbName, tableName)) +LOGGER.info(s"Table $tableName for Database $dbName created successfully.") +carbonTablePath + } + /** + * This method will write the schema thrift file in carbon store and load table metadata + * + * @param tableInfo + * @param thriftTableInfo + * @param dbName + * @param tableName + * @param sparkSession + * @return + */ + private def createSchemaThriftFile( + tableInfo: org.apache.carbondata.core.metadata.schema.table.TableInfo, + thriftTableInfo: org.apache.carbondata.format.TableInfo, + dbName: String, tableName: String) +(sparkSession: SparkSession): String = { val carbonTableIdentifier = new CarbonTableIdentifier(dbName, tableName, tableInfo.getFactTable.getTableId) val carbonTablePath = CarbonStorePath.getCarbonTablePath(storePath, carbonTableIdentifier) val schemaFilePath = carbonTablePath.getSchemaFilePath val schemaMetadataPath = CarbonTablePath.getFolderContainingFile(schemaFilePath) tableInfo.setMetaDataFilepath(schemaMetadataPath) tableInfo.setStorePath(storePath) -CarbonMetadata.getInstance().loadTableMetadata(tableInfo) -val tableMeta = new TableMeta(carbonTableIdentifier, storePath, - CarbonMetadata.getInstance().getCarbonTable(dbName + "_" + tableName)) - val fileType = FileFactory.getFileType(schemaMetadataPath) if (!FileFactory.isFileExist(schemaMetadataPath, fileType)) { FileFactory.mkdirs(schemaMetadataPath, fileType) } val thriftWriter = new ThriftWriter(schemaFilePath, false) -thriftWriter.open() +thriftWriter.open(FileWriteOperation.OVERWRITE) thriftWriter.write(thriftTableInfo) thriftWriter.close() +removeTableFromMetadata(dbName, tableName) --- End diff -- directly call refresh flow after updating alter schema --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106095089 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -136,6 +140,298 @@ case class AlterTableCompaction(alterTableModel: AlterTableModel) extends Runnab } } +private[sql] case class AlterTableDataTypeChange( +alterTableDataTypeChangeModel: AlterTableDataTypeChangeModel) extends RunnableCommand { + + val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName) + + def run(sparkSession: SparkSession): Seq[Row] = { +val tableName = alterTableDataTypeChangeModel.tableName +val dbName = alterTableDataTypeChangeModel.databaseName + .getOrElse(sparkSession.catalog.currentDatabase) +LOGGER.audit(s"Alter table change data type request has been received for $dbName.$tableName") +val relation = + CarbonEnv.get.carbonMetastore +.lookupRelation(Option(dbName), tableName)(sparkSession) +.asInstanceOf[CarbonRelation] +if (relation == null) { + LOGGER.audit(s"Alter table change data type request has failed. " + + s"Table $dbName.$tableName does not exist") + sys.error(s"Table $dbName.$tableName does not exist") +} +// acquire the lock first +val table = relation.tableMeta.carbonTable +val carbonLock = CarbonLockFactory + .getCarbonLockObj(table.getAbsoluteTableIdentifier.getCarbonTableIdentifier, +LockUsage.METADATA_LOCK) +try { + // get the latest carbon table and check for column existence + val carbonTable = CarbonMetadata.getInstance.getCarbonTable(dbName + "_" + tableName) + val columnName = alterTableDataTypeChangeModel.columnName + var carbonColumnToBeModified: CarbonColumn = null + val carbonColumns = carbonTable.getCreateOrderColumn(tableName).asScala + // read the latest schema file + val carbonTablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getStorePath, +carbonTable.getCarbonTableIdentifier) + val tableMetadataFile = carbonTablePath.getSchemaFilePath + val tableInfo: org.apache.carbondata.format.TableInfo = CarbonEnv.get.carbonMetastore +.readSchemaFile(tableMetadataFile) + // maintain the added column for schema evolution history + var addColumnSchema: org.apache.carbondata.format.ColumnSchema = null + var deletedColumnSchema: org.apache.carbondata.format.ColumnSchema = null + val columnSchemaList = tableInfo.fact_table.table_columns.asScala + columnSchemaList.foreach { columnSchema => +if (columnSchema.column_name.equalsIgnoreCase(columnName)) { + deletedColumnSchema = CarbonScalaUtil.createColumnSchemaCopyObject(columnSchema) + columnSchema.setData_type(DataTypeConverterUtil + .convertToThriftDataType(alterTableDataTypeChangeModel.dataTypeInfo.dataType)) + columnSchema.setPrecision(alterTableDataTypeChangeModel.dataTypeInfo.precision) + columnSchema.setScale(alterTableDataTypeChangeModel.dataTypeInfo.scale) + addColumnSchema = columnSchema +} + } + val schemaEvolutionEntry = new SchemaEvolutionEntry(System.currentTimeMillis) + schemaEvolutionEntry.setAdded(List(addColumnSchema).asJava) + schemaEvolutionEntry.setRemoved(List(deletedColumnSchema).asJava) + tableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history.get(0) +.setTime_stamp(System.currentTimeMillis) + CarbonEnv.get.carbonMetastore +.updateTableSchema(carbonTable.getCarbonTableIdentifier, + tableInfo, + schemaEvolutionEntry, + carbonTable.getStorePath)(sparkSession) + + val tableIdentifier = TableIdentifier(tableName, Some(dbName)) + val schema = CarbonEnv.get.carbonMetastore +.lookupRelation(tableIdentifier)(sparkSession).schema.json + sparkSession.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client.runSqlHive( +s"ALTER TABLE $dbName.$tableName SET TBLPROPERTIES('spark.sql.sources.schema'='$schema')") + sparkSession.catalog.refreshTable(tableIdentifier.quotedString) + LOGGER.info(s"Alter table for data type change is successful for table $dbName.$tableName") + LOGGER.audit(s"Alter table for data type change is successful for table $dbName.$tableName") +} catch { + case e: Exception => +LOGGER.error("Alter table change datatype failed : " + e.getMessage) +throw e +} finally { + // release lock after command execution
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106091809 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) --- End diff -- Directly check equalsIgnoreCase, no separate util function required --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106091632 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { --- End diff -- Use datatype rule of existing parser rules, I think rule name is PrimitiveType. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106094811 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -136,6 +140,298 @@ case class AlterTableCompaction(alterTableModel: AlterTableModel) extends Runnab } } +private[sql] case class AlterTableDataTypeChange( --- End diff -- Alter table related move to AlterTableCommands file --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106090264 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala --- @@ -784,4 +786,105 @@ object GlobalDictionaryUtil { throw ex } } + + def loadDefaultDictionaryValueForNewColumn(carbonTablePath: CarbonTablePath, --- End diff -- Write comment --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106090223 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataTypeConverterUtil.scala --- @@ -77,4 +77,40 @@ object DataTypeConverterUtil { case DataType.STRUCT => "struct" } } + + /** + * convert from wrapper to external data type + * + * @param dataType + * @return + */ + def convertToThriftDataType(dataType: String): org.apache.carbondata.format.DataType = { --- End diff -- Similar conversion function already would have exist, please try to reuse --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106090749 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala --- @@ -784,4 +786,105 @@ object GlobalDictionaryUtil { throw ex } } + + def loadDefaultDictionaryValueForNewColumn(carbonTablePath: CarbonTablePath, --- End diff -- Modify the previous flow to use common function --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106092997 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) +val alterTableChangeDataTypeModel = + AlterTableDataTypeChangeModel(parseDataType(dataType.toLowerCase, values), +convertDbNameToLowerCase(dbName), +table.toLowerCase, +columnName.toLowerCase, +columnNameCopy.toLowerCase) +AlterTableDataTypeChange(alterTableChangeDataTypeModel) +} + + protected lazy val alterTableAddColumns: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ +(ADD ~> COLUMNS ~> "(" ~> repsep(anyFieldDef, ",") <~ ")") ~ +(TBLPROPERTIES ~> "(" ~> repsep(loadOptions, ",") <~ ")").? <~ opt(";") ^^ { + case dbName ~ table ~ fields ~ tblProp => +fields.foreach{ f => + if (isComplexDimDictionaryExclude(f.dataType.get)) { +throw new MalformedCarbonCommandException( + s"Add column is unsupported for complex datatype column: ${f.column}") + } +} +val tableProps = if (tblProp.isDefined) { + // default value should not be converted to lower case + val tblProps = tblProp.get.map(f => if (f._1.toLowerCase.startsWith("default.value.")) { +f._1 -> f._2 + } else { +f._1 -> f._2.toLowerCase + }) + scala.collection.mutable.Map(tblProps: _*) +} else { + scala.collection.mutable.Map.empty[String, String] +} + +val tableModel = prepareTableModel (false, + convertDbNameToLowerCase(dbName), + table.toLowerCase, + fields.map(convertFieldNamesToLowercase), + Seq.empty, + tableProps, + None, + true) + +val alterTableAddColumnsModel = AlterTableAddColumnsModel(convertDbNameToLowerCase(dbName), + table, + tableProps, + tableModel.dimCols, + tableModel.msrCols, + tableModel.highcardinalitydims.getOrElse(Seq.empty)) +AlterTableAddColumns(alterTableAddColumnsModel) +} + + private def convertFieldNamesToLowercase(field: Field): Field = { +val name = field.column.toLowerCase +field.copy(column = name, name = Some(name)) + } + protected lazy val alterTableDropColumn: Parser[LogicalPlan] = --- End diff -- Add rule name columnlist, which can check duplicates and related validation --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106092052 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { --- End diff -- Move this to SparkDDLParser --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106090051 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -194,4 +196,102 @@ object CarbonScalaUtil { } } } + + /** + * This method will validate a column for its data type and check whether the column data type + * can be modified and update if conditions are met + * + * @param dataTypeInfo + * @param carbonColumn + */ + def validateColumnDataType(dataTypeInfo: DataTypeInfo, carbonColumn: CarbonColumn): Unit = { +carbonColumn.getDataType.getName match { + case "INT" => +if (!dataTypeInfo.dataType.equals("bigint")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn +.getDataType.getName +} cannot be modified. Int can only be changed to bigInt") +} + case "DECIMAL" => +if (!dataTypeInfo.dataType.equals("decimal")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn.getDataType.getName +} cannot be modified. Decimal can be only be changed to Decimal of higher precision") +} +if (dataTypeInfo.precision <= carbonColumn.getColumnSchema.getPrecision) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. Specified precision value ${ + dataTypeInfo +.precision +} should be greater or equal to current precision value ${ + carbonColumn.getColumnSchema +.getPrecision +}") +} else if (dataTypeInfo.scale <= carbonColumn.getColumnSchema.getScale) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. Specified scale value ${ + dataTypeInfo +.scale +} should be greater or equal to current scale value ${ + carbonColumn.getColumnSchema +.getScale +}") +} else { + // difference of precision and scale specified by user should not be less than the + // difference of already existing precision and scale else it will result in data loss + val carbonColumnPrecisionScaleDiff = carbonColumn.getColumnSchema.getPrecision - + carbonColumn.getColumnSchema.getScale + val dataInfoPrecisionScaleDiff = dataTypeInfo.precision - dataTypeInfo.scale + if (dataInfoPrecisionScaleDiff < carbonColumnPrecisionScaleDiff) { +sys + .error(s"Given column ${ +carbonColumn + .getColName + } cannot be modified. Specified precision and scale values will lead to data loss") + } +} + case _ => +sys + .error(s"Given column ${ carbonColumn.getColName } with data type ${ +carbonColumn + .getDataType.getName + } cannot be modified. Only Int and Decimal data types are allowed for modification") +} + } + + /** + * This method will create a copy of the same object + * + * @param thriftColumnSchema object to be cloned + * @return + */ + def createColumnSchemaCopyObject(thriftColumnSchema: org.apache.carbondata.format.ColumnSchema) --- End diff -- Use schema copy will work, instead of writing new copy method --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105936938 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeGrtThanFiterExecuterImpl.java --- @@ -46,44 +52,71 @@ public RowLevelRangeGrtThanFiterExecuterImpl( super(dimColEvaluatorInfoList, msrColEvalutorInfoList, exp, tableIdentifier, segmentProperties, null); this.filterRangeValues = filterRangeValues; +checkIfDefaultValueIsPresentInFilterList(); + } + + /** + * This method will check whether default value is present in the given filter values + */ + private void checkIfDefaultValueIsPresentInFilterList() { --- End diff -- Rename to if default value matches filter --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105962095 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/resolver/resolverinfo/visitor/NoDictionaryTypeVisitor.java --- @@ -45,7 +47,15 @@ public void populateFilterResolvedInfo(DimColumnResolvedFilterInfo visitableObj, DimColumnFilterInfo resolvedFilterObject = null; List evaluateResultListFinal; try { - evaluateResultListFinal = metadata.getExpression().evaluate(null).getListAsString(); + ExpressionResult result = metadata.getExpression().evaluate(null); --- End diff -- if need to check only is null scenario, use expression.isInstanceOf[EqualToExpresson].isNull --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105972588 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -1520,6 +1572,69 @@ public static String getFormatFromProperty(DataType dataType) { } /** + * This method will delete the dictionary files for the given column IDs and + * clear the dictionary cache + * + * @param dictionaryColumns + * @param carbonTable + */ + public static void deleteDictionaryFileAndCache(List dictionaryColumns, --- End diff -- Create a class to manage Dictionary and move this method into that --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105969726 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -966,20 +974,22 @@ public static void clearBlockCache(List dataBlocks) { List isDictionaryDimensions = new ArrayList(); Set processedColumnGroup = new HashSet(); for (CarbonDimension carbonDimension : tableDimensionList) { - List childs = carbonDimension.getListOfChildDimensions(); - //assuming complex dimensions will always be atlast - if (null != childs && childs.size() > 0) { -break; - } - if (carbonDimension.isColumnar() && hasEncoding(carbonDimension.getEncoder(), - Encoding.DICTIONARY)) { -isDictionaryDimensions.add(true); - } else if (!carbonDimension.isColumnar()) { -if (processedColumnGroup.add(carbonDimension.columnGroupId())) { + if (!carbonDimension.isInvisible()) { --- End diff -- The list which is being created, should have two getters. one to get With invisible and other to get without invisible. Use required function accordingly --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105913851 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java --- @@ -38,37 +52,174 @@ * table blocks in that case we need to select only those dimension out of * query dimension which is present in the current table block * + * @param blockExecutionInfo * @param queryDimensions * @param tableBlockDimensions + * @param tableComplexDimension * @return list of query dimension which is present in the table block */ - public static List getUpdatedQueryDimension(List queryDimensions, + public static List createDimensionInfoAndGetUpdatedQueryDimension( + BlockExecutionInfo blockExecutionInfo, List queryDimensions, List tableBlockDimensions, List tableComplexDimension) { List presentDimension = new ArrayList(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); +boolean[] isDimensionExists = new boolean[queryDimensions.size()]; +Object[] defaultValues = new Object[queryDimensions.size()]; // selecting only those dimension which is present in the query +int dimIndex = 0; for (QueryDimension queryDimension : queryDimensions) { if (queryDimension.getDimension().hasEncoding(Encoding.IMPLICIT)) { presentDimension.add(queryDimension); +isDimensionExists[dimIndex] = true; } else { for (CarbonDimension tableDimension : tableBlockDimensions) { - if (tableDimension.equals(queryDimension.getDimension())) { -presentDimension.add(queryDimension); + if (tableDimension.getColumnId().equals(queryDimension.getDimension().getColumnId())) { +QueryDimension currentBlockDimension = new QueryDimension(tableDimension.getColName()); +tableDimension.getColumnSchema() +.setDataType(queryDimension.getDimension().getDataType()); +tableDimension.getColumnSchema() + .setPrecision(queryDimension.getDimension().getColumnSchema().getPrecision()); +tableDimension.getColumnSchema() + .setScale(queryDimension.getDimension().getColumnSchema().getScale()); +tableDimension.getColumnSchema() + .setDefaultValue(queryDimension.getDimension().getDefaultValue()); +currentBlockDimension.setDimension(tableDimension); + currentBlockDimension.setQueryOrder(queryDimension.getQueryOrder()); +presentDimension.add(currentBlockDimension); +isDimensionExists[dimIndex] = true; +break; } } +// add default value only in case query dimension is not found in the current block +if (!isDimensionExists[dimIndex]) { + defaultValues[dimIndex] = validateAndGetDefaultValue(queryDimension.getDimension()); + blockExecutionInfo.setRestructuredBlock(true); --- End diff -- This has to be set only if complex dimension also does not match --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105922838 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/FilterUtil.java --- @@ -571,6 +603,53 @@ public static DimColumnFilterInfo getFilterListForAllMembersRS(Expression expres } /** + * This method will check whether a default value for the non-existing column is present + * in the filter values list + * + * @param dimColumnEvaluatorInfo + * @return + */ + public static boolean isDimensionDefaultValuePresentInFilterValues( --- End diff -- Move this to RestructureFilterEvaluator abstract class --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105974874 --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java --- @@ -245,6 +248,36 @@ public static DataType getDataType(String dataTypeStr) { } /** + * This method will convert the data according to its data type and perform a + * special handling for decimal data types + * + * @param dataInBytes + * @param dimension + * @return + */ + public static Object getDataBasedOnDataType(byte[] dataInBytes, CarbonDimension dimension) { --- End diff -- change the previous function to handle the logic getDataBasedOnDataType(dataInBytes, dimension.getDataType()); --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105852481 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java --- @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.scan.collector; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector; +import org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector; +import org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector; +import org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector; +import org.apache.carbondata.core.scan.collector.impl.RestructureBasedDictionaryResultCollector; +import org.apache.carbondata.core.scan.collector.impl.RestructureBasedRawResultCollector; +import org.apache.carbondata.core.scan.collector.impl.RestructureBasedVectorResultCollector; +import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo; + +/** + * This class will provide the result collector instance based on the required type + */ +public class ResultCollectorFactory { + + /** + * logger of result collector factory + */ + private static final LogService LOGGER = + LogServiceFactory.getLogService(ResultCollectorFactory.class.getName()); + + /** + * This method will create result collector based on the given type + * + * @param blockExecutionInfo + * @return + */ + public static AbstractScannedResultCollector getScannedResultCollector( + BlockExecutionInfo blockExecutionInfo) { +AbstractScannedResultCollector scannerResultAggregator = null; +if (blockExecutionInfo.isRawRecordDetailQuery()) { + if (blockExecutionInfo.isRestructuredBlock()) { +LOGGER.info("Restructure based raw collector is used to scan and collect the data"); +scannerResultAggregator = new RestructureBasedRawResultCollector(blockExecutionInfo); + } else { +LOGGER.info("Row based raw collector is used to scan and collect the data"); +scannerResultAggregator = new RawBasedResultCollector(blockExecutionInfo); + } +} else if (blockExecutionInfo.isVectorBatchCollector()) { + if (blockExecutionInfo.isRestructuredBlock()) { +LOGGER.info("Restructure dictionary vector collector is used to scan and collect the data"); +scannerResultAggregator = new RestructureBasedVectorResultCollector(blockExecutionInfo); --- End diff -- RestructureBasedVectorResultCollector should be derived from DictionaryBasedVectorResultCollector and RestructureBasedDictionaryResultCollector should be derived from DictionaryBasedResultCollector RestructureBasedRawResultCollector should be derived from RawBasedResultCollector --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105868970 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/RestructureBasedRawResultCollector.java --- @@ -0,0 +1,265 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.scan.collector.impl; + +import java.nio.charset.Charset; +import java.util.ArrayList; +import java.util.List; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.update.BlockletLevelDeleteDeltaDataCache; +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.keygenerator.KeyGenException; +import org.apache.carbondata.core.keygenerator.KeyGenerator; +import org.apache.carbondata.core.keygenerator.mdkey.MultiDimKeyVarLengthGenerator; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo; +import org.apache.carbondata.core.scan.model.QueryDimension; +import org.apache.carbondata.core.scan.model.QueryMeasure; +import org.apache.carbondata.core.scan.result.AbstractScannedResult; +import org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper; +import org.apache.carbondata.core.util.CarbonUtil; + +import org.apache.commons.lang3.ArrayUtils; +import org.apache.spark.unsafe.types.UTF8String; + +/** + * It is not a collector it is just a scanned result holder. + */ +public class RestructureBasedRawResultCollector extends AbstractScannedResultCollector { + + /** + * logger + */ + private static final LogService LOGGER = + LogServiceFactory.getLogService(RestructureBasedRawResultCollector.class.getName()); + + /** + * Key generator which will form the mdKey according to latest schema + */ + private KeyGenerator restructuredKeyGenerator; + + /** + * Key generator for uncompressing current block values + */ + private KeyGenerator updatedCurrentBlockKeyGenerator; + + public RestructureBasedRawResultCollector(BlockExecutionInfo blockExecutionInfos) { +super(blockExecutionInfos); +initRestructuredKeyGenerator(); +initCurrentBlockKeyGenerator(); + } + + /** + * This method will create a new key generator for generating mdKey according to latest schema + */ + private void initRestructuredKeyGenerator() { +SegmentProperties segmentProperties = +tableBlockExecutionInfos.getDataBlock().getSegmentProperties(); +QueryDimension[] queryDimensions = tableBlockExecutionInfos.getActualQueryDimensions(); +List updatedColumnCardinality = new ArrayList<>(queryDimensions.length); +List updatedDimensionPartitioner = new ArrayList<>(queryDimensions.length); +int[] dictionaryColumnBlockIndex = tableBlockExecutionInfos.getDictionaryColumnBlockIndex(); +int dimCounterInCurrentBlock = 0; +for (int i = 0; i < queryDimensions.length; i++) { + if (queryDimensions[i].getDimension().hasEncoding(Encoding.DICTIONARY)) { +if (tableBlockExecutionInfos.getDimensionInfo().getDimensionExists()[i]) { + // get the dictionary key ordinal as column cardinality in segment properties + // will only be for dictionary encoded columns + CarbonDimension currentBlockDimension = segmentProperties.getDimensions() + .get(dictionaryColumnBlockIndex[dimCounterInCurrentBlock]); + updatedColumnCardinality.add(segmentProperties + .getDimColumnsCardinality()[currentBlockDimension.getKeyOrdinal()]); +
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105867937 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/RestructureBasedRawResultCollector.java --- @@ -0,0 +1,265 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.scan.collector.impl; + +import java.nio.charset.Charset; +import java.util.ArrayList; +import java.util.List; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.update.BlockletLevelDeleteDeltaDataCache; +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.keygenerator.KeyGenException; +import org.apache.carbondata.core.keygenerator.KeyGenerator; +import org.apache.carbondata.core.keygenerator.mdkey.MultiDimKeyVarLengthGenerator; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo; +import org.apache.carbondata.core.scan.model.QueryDimension; +import org.apache.carbondata.core.scan.model.QueryMeasure; +import org.apache.carbondata.core.scan.result.AbstractScannedResult; +import org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper; +import org.apache.carbondata.core.util.CarbonUtil; + +import org.apache.commons.lang3.ArrayUtils; +import org.apache.spark.unsafe.types.UTF8String; + +/** + * It is not a collector it is just a scanned result holder. + */ +public class RestructureBasedRawResultCollector extends AbstractScannedResultCollector { + + /** + * logger + */ + private static final LogService LOGGER = + LogServiceFactory.getLogService(RestructureBasedRawResultCollector.class.getName()); + + /** + * Key generator which will form the mdKey according to latest schema + */ + private KeyGenerator restructuredKeyGenerator; + + /** + * Key generator for uncompressing current block values + */ + private KeyGenerator updatedCurrentBlockKeyGenerator; + + public RestructureBasedRawResultCollector(BlockExecutionInfo blockExecutionInfos) { +super(blockExecutionInfos); +initRestructuredKeyGenerator(); +initCurrentBlockKeyGenerator(); + } + + /** + * This method will create a new key generator for generating mdKey according to latest schema + */ + private void initRestructuredKeyGenerator() { +SegmentProperties segmentProperties = +tableBlockExecutionInfos.getDataBlock().getSegmentProperties(); +QueryDimension[] queryDimensions = tableBlockExecutionInfos.getActualQueryDimensions(); +List updatedColumnCardinality = new ArrayList<>(queryDimensions.length); +List updatedDimensionPartitioner = new ArrayList<>(queryDimensions.length); +int[] dictionaryColumnBlockIndex = tableBlockExecutionInfos.getDictionaryColumnBlockIndex(); +int dimCounterInCurrentBlock = 0; +for (int i = 0; i < queryDimensions.length; i++) { + if (queryDimensions[i].getDimension().hasEncoding(Encoding.DICTIONARY)) { +if (tableBlockExecutionInfos.getDimensionInfo().getDimensionExists()[i]) { + // get the dictionary key ordinal as column cardinality in segment properties + // will only be for dictionary encoded columns + CarbonDimension currentBlockDimension = segmentProperties.getDimensions() + .get(dictionaryColumnBlockIndex[dimCounterInCurrentBlock]); + updatedColumnCardinality.add(segmentProperties + .getDimColumnsCardinality()[currentBlockDimension.getKeyOrdinal()]); +
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105883296 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java --- @@ -353,12 +312,15 @@ protected BlockExecutionInfo getBlockExecutionInfoForBlock(QueryModel queryModel } else { blockExecutionInfo.setAllSelectedDimensionBlocksIndexes(new int[0][0]); } +// get the list of updated filter measures present in the current block +Set updatedFilterMeasures = QueryUtil +.getUpdatedFilterMeasures(queryProperties.filterMeasures, segmentProperties.getMeasures()); // list of measures to be projected -List allProjectionListMeasureIdexes = new ArrayList<>(); +List allProjectionListMeasureIndexes = new ArrayList<>(); int[] measureBlockIndexes = QueryUtil -.getMeasureBlockIndexes(queryModel.getQueryMeasures(), expressionMeasures, -segmentProperties.getMeasuresOrdinalToBlockMapping(), queryProperties.filterMeasures, -allProjectionListMeasureIdexes); +.getMeasureBlockIndexes(updatedQueryMeasures, expressionMeasures, --- End diff -- Change name to currentBlockMeasures and Dimentions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105895041 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/executor/util/QueryUtil.java --- @@ -420,6 +438,54 @@ private static void getChildDimensionDictionaryDetail(CarbonDimension queryDimen } /** + * This method will create the updated list of filter measures present in the current block + * + * @param queryFilterMeasures + * @param currentBlockMeasures + * @return + */ + public static Set getUpdatedFilterMeasures(Set queryFilterMeasures, --- End diff -- Move this function to Abstract Query Executor --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105879703 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/RestructureBasedVectorResultCollector.java --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.scan.collector.impl; + +import java.math.BigDecimal; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryKeyGeneratorFactory; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure; +import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo; +import org.apache.carbondata.core.scan.model.QueryDimension; +import org.apache.carbondata.core.scan.model.QueryMeasure; +import org.apache.carbondata.core.scan.result.AbstractScannedResult; +import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector; +import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch; +import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo; +import org.apache.carbondata.core.scan.result.vector.MeasureDataVectorProcessor; + +import org.apache.spark.sql.types.Decimal; + +/** + * It is not a collector it is just a scanned result holder. + */ +public class RestructureBasedVectorResultCollector extends AbstractScannedResultCollector { + + private ColumnVectorInfo[] dictionaryInfo; + + private ColumnVectorInfo[] noDictionaryInfo; + + private ColumnVectorInfo[] complexInfo; + + private ColumnVectorInfo[] measureColumnInfo; + + private ColumnVectorInfo[] allColumnInfo; + + public RestructureBasedVectorResultCollector(BlockExecutionInfo blockExecutionInfos) { +super(blockExecutionInfos); +QueryDimension[] queryDimensions = tableBlockExecutionInfos.getActualQueryDimensions(); +QueryMeasure[] queryMeasures = tableBlockExecutionInfos.getActualQueryMeasures(); +measureColumnInfo = new ColumnVectorInfo[queryMeasures.length]; +allColumnInfo = new ColumnVectorInfo[queryDimensions.length + queryMeasures.length]; +List dictInfoList = new ArrayList<>(); +List noDictInfoList = new ArrayList<>(); +List complexList = new ArrayList<>(); +int dimensionExistIndex = 0; +for (int i = 0; i < queryDimensions.length; i++) { + if (!dimensionInfo.getDimensionExists()[i]) { +// add a dummy column vector result collector object +ColumnVectorInfo columnVectorInfo = new ColumnVectorInfo(); +allColumnInfo[queryDimensions[i].getQueryOrder()] = columnVectorInfo; +continue; + } + // get the current block dimension and fetch the required information from it + QueryDimension currentBlockDimension = + tableBlockExecutionInfos.getQueryDimensions()[dimensionExistIndex++]; + if (!queryDimensions[i].getDimension().hasEncoding(Encoding.DICTIONARY)) { +ColumnVectorInfo columnVectorInfo = new ColumnVectorInfo(); +noDictInfoList.add(columnVectorInfo); +columnVectorInfo.dimension = currentBlockDimension; +columnVectorInfo.ordinal = currentBlockDimension.getDimension().getOrdinal(); +allColumnInfo[queryDimensions[i].getQueryOrder()] = columnVectorInfo; + } else if (queryDimensions[i].getDimension().hasEncoding(Encoding.DIRECT_DICTIONARY)) { +ColumnVectorInfo columnVectorInfo = new ColumnVectorInfo(); +dictInfoList.add(columnVectorInfo); +columnVectorInfo.dimension = currentBlockDimension; +columnVectorInfo.directDictionaryGenerator = DirectDictionaryKeyGeneratorFactory +
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105887002 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/executor/util/QueryUtil.java --- @@ -249,6 +248,25 @@ } /** + * This method will return the key ordinal of the query dimension from the current block + * + * @param blockDimensions + * @param queryDimension + * @return + */ + public static int getKeyOrdinalOfDimensionFromCurrentBlock(List blockDimensions, --- End diff -- Use CarbonUtil.getDimentionFromCurrentBlock, add getter to SegmentProperties --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r105883959 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java --- @@ -429,9 +396,10 @@ protected BlockExecutionInfo getBlockExecutionInfoForBlock(QueryModel queryModel * @param blockMetadataInfo block metadata info * @return key size */ - private int getKeySize(List queryDimension, SegmentProperties blockMetadataInfo) { -List fixedLengthDimensionOrdinal = -new ArrayList(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE); + private int getKeySize(List queryDimension, + SegmentProperties blockMetadataInfo) { +Set fixedLengthDimensionOrdinal = --- End diff -- Comment above to tell its purpose --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---