[ 
https://issues.apache.org/jira/browse/SPARK-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152228#comment-14152228
 ] 

Andrew Or commented on SPARK-3685:
----------------------------------

Not sure if I fully understand what you mean. If I'm running an executor and I 
request 30G from the beginning, my application uses all of it to do computation 
and all is good. After I decommission the executor, I would like to keep 1G 
just to serve the shuffle files, but this can't be done easily because we need 
to start a smaller JVM and a smaller container. (Yarn does not yet support 
resizing a container while it is still running.) Either way we need to transfer 
some state from the bigger JVM to the smaller JVM, and that adds some 
complexity to the design. The simplest alternative then would be to write 
whatever state is needed to an external location, terminate the executor JVM / 
container without starting a smaller one, and have a long-running external 
service serve these files.

One proposal here then is to write these shuffle files to a special location 
and have the Yarn NM shuffle service serve the files. This is an alternative to 
DFS shuffle that is, however, highly specific to Yarn. I am doing some initial 
prototyping of this (the Yarn shuffle) approach to see how this will pan out.

> Spark's local dir should accept only local paths
> ------------------------------------------------
>
>                 Key: SPARK-3685
>                 URL: https://issues.apache.org/jira/browse/SPARK-3685
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 1.1.0
>            Reporter: Andrew Or
>
> When you try to set local dirs to "hdfs:/tmp/foo" it doesn't work. What it 
> will try to do is create a folder called "hdfs:" and put "tmp" inside it. 
> This is because in Util#getOrCreateLocalRootDirs we use java.io.File instead 
> of Hadoop's file system to parse this path. We also need to resolve the path 
> appropriately.
> This may not have an urgent use case, but it fails silently and does what is 
> least expected.
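
To illustrate the behavior described in the issue, here is a minimal sketch in 
plain Java. It is not the code in Spark. The class name LocalDirCheck and its 
methods are hypothetical, and it uses java.net.URI rather than Hadoop's file 
system classes so that it runs without extra dependencies. It shows that 
java.io.File treats "hdfs:/tmp/foo" as a relative path whose top-level 
component is a directory named "hdfs:", and one possible way to detect a 
non-local scheme up front instead of failing silently.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of Spark.
class LocalDirCheck {

    // Returns the URI scheme of a configured directory, or null if the
    // string has no scheme or is not a syntactically valid URI.
    static String schemeOf(String dir) {
        try {
            return new URI(dir).getScheme();
        } catch (URISyntaxException e) {
            return null; // e.g. a path with spaces or backslashes
        }
    }

    // Assumption: a dir counts as local if it has no scheme, the "file"
    // scheme, or a single-letter "scheme" (treated as a Windows drive letter).
    static boolean isLocalPath(String dir) {
        String scheme = schemeOf(dir);
        return scheme == null || scheme.equals("file") || scheme.length() == 1;
    }

    // Walks up the parent chain the way java.io.File sees it and returns
    // the top-most component, i.e. the first directory mkdirs() would create
    // for a relative path.
    static String firstComponent(String dir) {
        File f = new File(dir);
        while (f.getParentFile() != null) {
            f = f.getParentFile();
        }
        return f.getPath();
    }

    public static void main(String[] args) {
        String dir = "hdfs:/tmp/foo";
        // java.io.File does not understand the scheme, so the path is
        // relative to the working directory.
        System.out.println("absolute?       " + new File(dir).isAbsolute());
        System.out.println("first component " + firstComponent(dir));
        System.out.println("scheme          " + schemeOf(dir));
        System.out.println("local?          " + isLocalPath(dir));
    }
}
```

With a check like isLocalPath, a non-local value could be rejected with an 
explicit error. Whether Spark should reject such paths or resolve them through 
Hadoop's file system is the open question in this issue.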


