jackylk commented on a change in pull request #3581: [CARBONDATA-3666] Avoided
listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368263868
##########
File path:
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
##########
@@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
// then do the below steps
// 2.2.1 validate that all the aggregate tables are copied at the store location.
// 2.2.2 Register the aggregate tables
- val tablePath = CarbonEnv.getTablePath(databaseNameOp, tableName.toLowerCase)(sparkSession)
- val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, tableName.toLowerCase)
// 2.1 check if the table already register with hive then ignore and continue with the next
// schema
- if (!sparkSession.sessionState.catalog.listTables(databaseName)
- .exists(_.table.equalsIgnoreCase(tableName))) {
+ val provider = try {
+ sparkSession.sessionState.catalog
+ .getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+ } catch {
+ case _: NoSuchTableException =>
+ None
+ }
+ if (provider.isEmpty ||
+ provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||
Review comment:
This provider check is repeated in many places, which is not clean. Can you
extract it into a util function and use that function everywhere?
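
A minimal sketch of what such a helper could look like, kept free of Spark dependencies so it can be reasoned about in isolation. The object name `ProviderCheck` and the method name `isUnregisteredOrCarbon` are hypothetical and not part of the PR; the provider string is the one visible in the diff. The condition in the diff continues past the quoted line, so this sketch covers only the part shown here and is not a drop-in replacement.

```scala
// Hypothetical helper, not the PR's actual code.
object ProviderCheck {
  // Provider class name taken from the quoted diff.
  val CarbonSourceProvider: String = "org.apache.spark.sql.CarbonSource"

  // True when the table has no registered provider (for example, the catalog
  // lookup failed and the caller mapped that to None) or when the provider is
  // the Carbon source. Equivalent to:
  //   provider.isEmpty || provider.get.equalsIgnoreCase(CarbonSourceProvider)
  // Any further conditions from the original expression are NOT covered.
  def isUnregisteredOrCarbon(provider: Option[String]): Boolean =
    provider.forall(_.equalsIgnoreCase(CarbonSourceProvider))
}
```

Callers would pass the `provider` value obtained from the catalog lookup, so the try/catch around `getTableMetadata` could also move into the same util if that suits the code base.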