xupefei commented on code in PR #48320:
URL: https://github.com/apache/spark/pull/48320#discussion_r1803475560
##########
sql/core/src/main/scala/org/apache/spark/sql/artifact/ArtifactManager.scala:
##########
@@ -279,6 +296,49 @@ class ArtifactManager(session: SparkSession) extends Logging {
    loader
  }
+  private[sql] def clone(newSession: SparkSession): ArtifactManager = {
+    val sparkContext = session.sparkContext
+    sparkContext.synchronized {
+      val newArtifactManager = new ArtifactManager(newSession)
+      if (artifactPath.toFile.exists()) {
+        FileUtils.copyDirectory(artifactPath.toFile, newArtifactManager.artifactPath.toFile)
+      }
+      val blockManager = sparkContext.env.blockManager
+      val newBlockIds = cachedBlockIdList.asScala.map { blockId =>
+        val newBlockId = blockId.copy(sessionUUID = newSession.sessionUUID)
+        copyBlock(blockId, newBlockId, blockManager)
+      }
+
+      // Re-register resources to SparkContext
Review Comment:
All files are fetched to the local artifact dir and copied over to the new
instance's folder at this stage. We could add new `register{Jar, File,
Archive}` methods to SparkContext, but I don't see much benefit beyond
avoiding some scheme checks, as we would still need to update
`sparkContext.added{Jars, Files, Archives}` with a timestamp and do the
FileServer work.
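
For illustration, a minimal sketch (not the PR's code) of what that re-registration could look like if it went through the existing public SparkContext API; the per-type subdirectory layout (`jars/`, `files/`, `archives/`) and the helper name are assumptions:

```scala
import java.io.File

import org.apache.spark.SparkContext

// Hypothetical helper: re-register copied artifacts with the new session's
// SparkContext using the existing public add* methods.
def reRegisterArtifacts(sc: SparkContext, artifactDir: File): Unit = {
  def filesUnder(sub: String): Seq[File] = {
    val dir = new File(artifactDir, sub)
    if (dir.isDirectory) dir.listFiles().toSeq else Seq.empty
  }
  // Each add* call records the path in sc.added{Jars,Files,Archives} with a
  // timestamp and publishes it via the driver's file server, which is
  // exactly the work that dedicated register* methods would still have to do.
  filesUnder("jars").foreach(f => sc.addJar(f.getAbsolutePath))
  filesUnder("files").foreach(f => sc.addFile(f.getAbsolutePath))
  filesUnder("archives").foreach(f => sc.addArchive(f.getAbsolutePath))
}
```

The only thing new `register*` methods could skip is the scheme check on paths we already know are local files.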
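
Separately, since `copyBlock` is not shown in the hunk above, here is a rough sketch of what a block copy via the internal `BlockManager` API might look like; the storage level, lock handling, and method name are assumptions rather than the PR's actual implementation:

```scala
import java.nio.ByteBuffer

import org.apache.spark.storage.{BlockId, BlockManager, StorageLevel}

// Hypothetical sketch: read the block's bytes from local storage and
// re-insert them under the new session-scoped block id.
def copyBlockSketch(fromId: BlockId, toId: BlockId, bm: BlockManager): BlockId = {
  val data = bm.getLocalBytes(fromId).getOrElse {
    throw new IllegalStateException(s"Block $fromId not found locally")
  }
  try {
    // putBytes stores the bytes under the new id and reports the new block
    // to the block manager master.
    bm.putBytes(toId, data.toChunkedByteBuffer(ByteBuffer.allocate), StorageLevel.MEMORY_AND_DISK)
  } finally {
    // getLocalBytes pins the block with a read lock; release it when done.
    bm.releaseLock(fromId)
  }
  toId
}
```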