mridulm commented on a change in pull request #31876:
URL: https://github.com/apache/spark/pull/31876#discussion_r617752756



##########
File path: core/src/main/scala/org/apache/spark/TaskEndReason.scala
##########
@@ -81,7 +81,7 @@ case object Resubmitted extends TaskFailedReason {
  */
 @DeveloperApi
 case class FetchFailed(
-    bmAddress: BlockManagerId,  // Note that bmAddress can be null
+    bmAddress: Location,  // Note that bmAddress can be null

Review comment:
       > And I have a new idea that we can introduce a new fetch failed class 
for the custom location and leave this one unchanged. For example, we can have 
CustomStorageFetchFailed. Thus, when the location is BlockManagerId we use 
FetchFailed; otherwise, we use CustomStorageFetchFailed. WDYT?
   
   `CustomStorageFetchFailed` looks like a promising approach. We will need to 
think through what its implications would be, but on the face of it, it should 
address the immediate concerns IMO.
   Thoughts @attilapiros, @tgravescs ?
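
   To make the proposal concrete, here is a rough sketch of how the dispatch 
could look. It is only an illustration: `Location`, `BlockManagerId` and 
`FetchFailed` below are simplified stand-ins rather than the real Spark classes 
(the real `FetchFailed` has more fields), and `RemoteStoreLocation` and the 
`fetchFailure` helper are hypothetical names.

```scala
// Simplified stand-ins for illustration only; NOT the real Spark classes.
trait Location
final case class BlockManagerId(executorId: String, host: String, port: Int)
  extends Location
// Hypothetical custom location backed by some external shuffle storage.
final case class RemoteStoreLocation(uri: String) extends Location

sealed trait TaskFailedReason

// Existing reason keeps its current shape: bmAddress stays a BlockManagerId
// (and can still be null).
final case class FetchFailed(
    bmAddress: BlockManagerId,
    shuffleId: Int,
    mapId: Long,
    reduceId: Int,
    message: String) extends TaskFailedReason

// Hypothetical new reason that carries any non-BlockManagerId location.
final case class CustomStorageFetchFailed(
    location: Location,
    shuffleId: Int,
    mapId: Long,
    reduceId: Int,
    message: String) extends TaskFailedReason

object FetchFailures {
  // Picks the failure class based on the runtime type of the location.
  def fetchFailure(
      location: Location,
      shuffleId: Int,
      mapId: Long,
      reduceId: Int,
      message: String): TaskFailedReason = location match {
    // Preserve today's behaviour for a missing address.
    case null => FetchFailed(null, shuffleId, mapId, reduceId, message)
    case bm: BlockManagerId => FetchFailed(bm, shuffleId, mapId, reduceId, message)
    case other => CustomStorageFetchFailed(other, shuffleId, mapId, reduceId, message)
  }
}
```

   With this shape, existing consumers of `FetchFailed` would keep seeing a 
`BlockManagerId`, and only code that opts into custom storage would need to 
handle the new class.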
   
   > The only problem is the custom location. It's new data, e.g., 
("XXXLocation" -> XXXLocationJson). So it can be a problem if users use an old 
version of Spark to load event files. Although, I think this is really an 
unexpected usage.
   
   There are a couple of issues here:
   * A simpler question of how to handle a custom location, from both a 
programmatic and a data point of view.
   * How to handle different shuffle implementations being in play for the 
same event directory.
     * Deployments may have multiple shuffle infrastructures in use over the 
course of time (or different clusters with different configs and a shared 
history event dir), each with its own `Location` classes.
     * How will the SHS, the REST API, etc. understand which location class is 
being used and how to parse it?
   
   I actually don't have a good solution for this, other than adding some 
metadata per location record to indicate the 'type'.
   Any other thoughts?
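
   As a strawman for the "type metadata" idea, something along these lines 
might work. Everything here is hypothetical: the names (`LocationCodec`, 
`UnknownLocation`, the `"Type"` key) are made up for illustration, and a plain 
`Map[String, String]` stands in for the JSON record that would actually be 
written to the event log.

```scala
// Hypothetical sketch; none of these names exist in Spark.
trait Location {
  def typeName: String            // tag written next to the record
  def fields: Map[String, String] // payload, standing in for a JSON object
}

final case class BlockManagerLocation(executorId: String, host: String, port: Int)
  extends Location {
  def typeName: String = "BlockManagerId"
  def fields: Map[String, String] =
    Map("Executor ID" -> executorId, "Host" -> host, "Port" -> port.toString)
}

final case class RemoteStoreLocation(uri: String) extends Location {
  def typeName: String = "RemoteStore"
  def fields: Map[String, String] = Map("URI" -> uri)
}

// Used when the reader has no parser for the recorded type, so the data can
// still be carried through instead of failing the whole event log.
final case class UnknownLocation(typeName: String, fields: Map[String, String])
  extends Location

object LocationCodec {
  type Record = Map[String, String]
  private val TypeKey = "Type"

  // Registry from type tag to parser; a deployment would register its own.
  private val parsers: Map[String, Record => Location] = Map(
    "BlockManagerId" -> ((r: Record) =>
      BlockManagerLocation(r("Executor ID"), r("Host"), r("Port").toInt)),
    "RemoteStore" -> ((r: Record) => RemoteStoreLocation(r("URI")))
  )

  def toRecord(loc: Location): Record = loc.fields + (TypeKey -> loc.typeName)

  def fromRecord(record: Record): Location = {
    // Assumption: records written before the tag existed are BlockManagerIds.
    val typeName = record.getOrElse(TypeKey, "BlockManagerId")
    parsers.get(typeName) match {
      case Some(parse) => parse(record)
      case None => UnknownLocation(typeName, record - TypeKey)
    }
  }
}
```

   The tag lets a reader that sees event logs from several shuffle 
implementations pick the right parser per record, and the `UnknownLocation` 
fallback is one possible way to degrade when no parser is available. It does 
not help a Spark version that predates the tag itself.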






