[ https://issues.apache.org/jira/browse/SPARK-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288341#comment-14288341 ]
Kannan Rajah commented on SPARK-1529:
-------------------------------------

[~pwendell] I would like us to consider reusing the write code path of the existing shuffle implementation instead of implementing it from scratch. This would let us take advantage of all the optimizations that have already been done, as well as those made in the future. Only the read code path needs to be fully reimplemented, since we don't need any of the shuffle server logic. To achieve this, a handful of shuffle classes would need to use the HDFS abstractions. I have attached a high-level proposal. Let me know your thoughts.

Write: IndexShuffleBlockManager, SortShuffleWriter, ExternalSorter, BlockObjectWriter
Read: BlockStoreShuffleFetcher, HashShuffleReader

> Support setting spark.local.dirs to a hadoop FileSystem
> --------------------------------------------------------
>
>                 Key: SPARK-1529
>                 URL: https://issues.apache.org/jira/browse/SPARK-1529
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Patrick Wendell
>            Assignee: Cheng Lian
>
> In some environments, like with MapR, local volumes are accessed through the
> Hadoop filesystem interface. We should allow setting spark.local.dir to a
> Hadoop filesystem location.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
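To illustrate the shape of the proposal, here is a minimal sketch of how the write path could depend on a storage interface rather than on local disk directly. All names here (ShuffleStorage, LocalShuffleStorage) are hypothetical and not Spark's actual API; the point is only that classes like SortShuffleWriter and BlockObjectWriter would write through such an interface, so a Hadoop FileSystem-backed implementation could be swapped in for MapR-style volumes without reimplementing the write path.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical abstraction over shuffle block storage. A Hadoop
// FileSystem-backed implementation would implement this same interface.
interface ShuffleStorage {
    OutputStream openForWrite(String blockId) throws IOException;
    byte[] readFully(String blockId) throws IOException;
}

// Default local-disk implementation, mirroring today's behavior of
// writing shuffle files under spark.local.dir.
class LocalShuffleStorage implements ShuffleStorage {
    private final Path root;

    LocalShuffleStorage(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    public OutputStream openForWrite(String blockId) throws IOException {
        return Files.newOutputStream(root.resolve(blockId));
    }

    public byte[] readFully(String blockId) throws IOException {
        return Files.readAllBytes(root.resolve(blockId));
    }
}

public class ShuffleStorageDemo {
    public static void main(String[] args) throws IOException {
        // The writer side only sees the interface, never the concrete store.
        ShuffleStorage storage =
            new LocalShuffleStorage(Files.createTempDirectory("shuffle"));
        try (OutputStream out = storage.openForWrite("shuffle_0_0_0.data")) {
            out.write("partition-bytes".getBytes("UTF-8"));
        }
        System.out.println(
            new String(storage.readFully("shuffle_0_0_0.data"), "UTF-8"));
    }
}
```

Under this kind of split, only the read side (BlockStoreShuffleFetcher, HashShuffleReader) would need a genuinely new implementation, since remote fetches over HDFS would bypass the shuffle server entirely.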