[GitHub] [spark] srowen commented on a change in pull request #26569: [SPARK-29938] [SQL] Add batching support in Alter table add partition flow

GitBox Thu, 12 Dec 2019 06:50:36 -0800

srowen commented on a change in pull request #26569: [SPARK-29938] [SQL] Add batching support in Alter table add partition flow
URL: https://github.com/apache/spark/pull/26569#discussion_r357185812


 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
 ##########
 @@ -470,14 +470,36 @@ case class AlterTableAddPartitionCommand(
       CatalogTablePartition(normalizedSpec, table.storage.copy(
         locationUri = location.map(CatalogUtils.stringToURI)))
     }
-    catalog.createPartitions(table.identifier, parts, ignoreIfExists = ifNotExists)
+
+    // Hive metastore may not have enough memory to handle millions of partitions in single RPC.
+    // Also the request to metastore times out when adding lot of partitions in one shot.
+    // we should split them into smaller batches
+    val batchSize = 100
 
 Review comment:
   Yeah, I don't think it's worth adding yet another config if setting it isn't going to have much meaning.
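
   For context, the quoted hunk is cut off right after `val batchSize = 100`, so the actual batching code is not shown here. A minimal sketch of the idea, assuming the partitions are simply chunked with the standard library's `grouped` and each chunk is passed to the catalog in its own call, might look like the following. `createInBatches` and the `create` callback are hypothetical names standing in for `catalog.createPartitions`; they are not taken from the PR.

```scala
object BatchingSketch {
  // Hypothetical helper: split `parts` into chunks of at most `batchSize`
  // and hand each chunk to `create` (a stand-in for catalog.createPartitions).
  def createInBatches[A](parts: Seq[A], batchSize: Int)(create: Seq[A] => Unit): Unit = {
    require(batchSize > 0, "batchSize must be positive")
    parts.grouped(batchSize).foreach(create)
  }

  def main(args: Array[String]): Unit = {
    // Record the size of every simulated metastore call.
    val calls = scala.collection.mutable.ArrayBuffer.empty[Int]
    val parts = (1 to 250).map(i => s"part=$i")
    createInBatches(parts, batchSize = 100) { batch =>
      calls += batch.size
    }
    println(calls.mkString(", ")) // prints "100, 100, 50"
  }
}
```

   With a fixed chunk size like this, each call stays small regardless of how many partitions the statement adds, which is why a hard-coded value can be enough without a separate config.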

