VenuReddy2103 commented on a change in pull request #3774:
URL: https://github.com/apache/carbondata/pull/3774#discussion_r437199601
##########
File path:
integration/spark/src/test/scala/org/apache/carbondata/geo/GeoTest.scala
##########
@@ -122,7 +161,8 @@ class GeoTest extends QueryTest with BeforeAndAfterAll with
BeforeAndAfterEach {
createTable(sourceTable)
loadData(sourceTable)
createTable(targetTable, "'SORT_SCOPE'='GLOBAL_SORT',")
- sql(s"insert into $targetTable select * from $sourceTable")
+ sql(s"insert into $targetTable select timevalue, longitude," +
Review comment:
why select * is changed to individual columns ? Probably because you get
geospatial column as well when you give select *. But, we need to handle this
insert into select *.
##########
File path:
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -653,6 +653,21 @@ object CommonUtil {
storeLocation
}
+ def validateForSpatialTypeColumn(properties: Map[String, String],
Review comment:
`columnPropert` string is just to use that in exception message. we
alreday have a similar method `validateSpatialIndexColumn()` for it. Suggest to
reuse.
##########
File path:
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -653,6 +653,21 @@ object CommonUtil {
storeLocation
}
+ def validateForSpatialTypeColumn(properties: Map[String, String],
Review comment:
`columnPropert` string is just to use that in exception message. we
alreday have a similar method `validateSpatialIndexColumn()` for it. Suggest to
modify existing and reuse instead of complete new method for the same
validation.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
##########
@@ -379,6 +380,17 @@ object CarbonSource {
if (isCreatedByCarbonExtension(properties)) {
// Table is created by SparkSession with CarbonExtension,
// There is no TableInfo yet, so create it from CatalogTable
+ val columnSchema = updateTable.schema
+ val spatialProperty = properties.get(CarbonCommonConstants.SPATIAL_INDEX)
+ // validate for spatial index column
+ columnSchema.foreach(x => {
+ if (spatialProperty.isDefined &&
Review comment:
instead of `spatialProperty.isDefined`check inside the loop it should be
outside. We can avoid unnecessary loop traversal if the property itself is not
defined.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
##########
@@ -379,6 +380,17 @@ object CarbonSource {
if (isCreatedByCarbonExtension(properties)) {
// Table is created by SparkSession with CarbonExtension,
// There is no TableInfo yet, so create it from CatalogTable
+ val columnSchema = updateTable.schema
Review comment:
why moved this exception case alone here from
`processSpatialIndexProperty` ? Probably we have missed something here. I
think, we have many other invalid property exceptions in
`processSpatialIndexProperty` and even for other properties in
`prepareTableModel()`.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -213,8 +207,8 @@ object CarbonParserUtil {
// Process spatial index property
val indexFields = processSpatialIndexProperty(tableProperties, fields)
- val allFields = fields ++ indexFields
+ val allFields = (fields ++ indexFields).distinct
Review comment:
why distinct ? have already checked that indexFields are not in actual
fields. right ? If required please add the comments here.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -213,8 +207,8 @@ object CarbonParserUtil {
// Process spatial index property
val indexFields = processSpatialIndexProperty(tableProperties, fields)
- val allFields = fields ++ indexFields
+ val allFields = (fields ++ indexFields).distinct
Review comment:
why distinct ? have already checked that indexFields are not in actual
fields. right ? If required please add the comments here. Same applies for
another instance below.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -337,7 +331,7 @@ object CarbonParserUtil {
if
(tableProperties.get(CarbonCommonConstants.COLUMN_META_CACHE).isDefined) {
// validate the column_meta_cache option
- val tableColumns = dims.view.filterNot(_.spatialIndex).map(x =>
x.name.get) ++
+ val tableColumns = dims.view.map(x => x.name.get) ++
Review comment:
if we had not removed this filter, then you wouldn't require
modification in `validateColumnMetaCacheFields` or
`validateColumnMetaCacheOption` i guess.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -337,7 +331,7 @@ object CarbonParserUtil {
if
(tableProperties.get(CarbonCommonConstants.COLUMN_META_CACHE).isDefined) {
// validate the column_meta_cache option
- val tableColumns = dims.view.filterNot(_.spatialIndex).map(x =>
x.name.get) ++
+ val tableColumns = dims.view.map(x => x.name.get) ++
Review comment:
if we had not removed this filter, then you wouldn't require
modification in `validateColumnMetaCacheFields`
`validateColumnMetaCacheOption`, `validateColumnMetaCacheAndCacheLevel` i guess.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp:
Option[String],
convertedStaticPartition)
scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
if (logicalPartitionRelation != null) {
- if (selectedColumnSchema.length !=
logicalPartitionRelation.output.length) {
+ val properties =
table.getTableInfo.getFactTable.getTableProperties.asScala
Review comment:
In `getReArrangedIndexAndSelectedSchema()` above, we make
`selectedColumnSchema` and return. In that method, we are excluding
SpatialColumn column. That is the reason why the `selectedColumnSchema` &
`logicalPartitionRelation.output` do not match here. I think, you need to fix
there. then this change wouldn't be required anymore.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp:
Option[String],
convertedStaticPartition)
scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
if (logicalPartitionRelation != null) {
- if (selectedColumnSchema.length !=
logicalPartitionRelation.output.length) {
+ val properties =
table.getTableInfo.getFactTable.getTableProperties.asScala
Review comment:
In `getReArrangedIndexAndSelectedSchema()` above, we make
`selectedColumnSchema` and return. In that method, we are excluding
SpatialColumn column. That is the reason why the `selectedColumnSchema` &
`logicalPartitionRelation.output` do not match here. Probably you need to fix
there. then this change wouldn't be required anymore.
##########
File path:
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##########
@@ -170,11 +171,18 @@ case class CarbonInsertIntoCommand(databaseNameOp:
Option[String],
convertedStaticPartition)
scanResultRdd = sparkSession.sessionState.executePlan(newLogicalPlan).toRdd
if (logicalPartitionRelation != null) {
- if (selectedColumnSchema.length !=
logicalPartitionRelation.output.length) {
+ val properties =
table.getTableInfo.getFactTable.getTableProperties.asScala
Review comment:
In `getReArrangedIndexAndSelectedSchema()` above, we make
`selectedColumnSchema` and return. In that method, we are excluding
SpatialColumn column. That is the reason why the `selectedColumnSchema` &
`logicalPartitionRelation.output` do not match here. Probably you need to fix
there. then this change wouldn't be required anymore ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]