[jira] [Created] (SPARK-47702) Shuffle service endpoint is not removed from the locations list when RDD block is removed from a node.

2024-04-02 Thread mahesh kumar behera (Jira)
mahesh kumar behera created SPARK-47702:
---

 Summary: Shuffle service endpoint is not removed from the 
locations list when RDD block is removed from a node.
 Key: SPARK-47702
 URL: https://issues.apache.org/jira/browse/SPARK-47702
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.5.1
Reporter: mahesh kumar behera


If SHUFFLE_SERVICE_FETCH_RDD_ENABLED is set to true, the driver stores both the 
executor endpoint and the external shuffle service endpoint for an RDD block. 
When the RDD block is migrated, the location info is updated: the endpoint 
corresponding to the new location is added and the old endpoint is removed. But 
currently only the executor endpoint is removed; the shuffle service endpoint 
is not. This causes failures during RDD reads if the stale shuffle service 
endpoint is chosen due to task locality.
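
A minimal sketch of the intended behaviour (hypothetical types and names, not 
the actual BlockManagerMasterEndpoint code): when an RDD block leaves a node, 
the shuffle service endpoint for that host should be dropped from the locations 
list along with the executor endpoint.

{code:scala}
// Simplified sketch only; illustrates dropping both endpoints for a host.
import scala.collection.mutable

// Hypothetical stand-in for BlockManagerId: the executor and the external
// shuffle service on the same host are tracked as separate endpoints.
case class Endpoint(host: String, port: Int, isShuffleService: Boolean)

class BlockLocations {
  private val locations = mutable.Map[String, mutable.Set[Endpoint]]()

  def add(blockId: String, ep: Endpoint): Unit =
    locations.getOrElseUpdate(blockId, mutable.Set.empty) += ep

  // Current (buggy) behaviour: only the executor endpoint is removed,
  // leaving a stale shuffle service endpoint behind.
  def removeExecutorOnly(blockId: String, executor: Endpoint): Unit =
    locations.get(blockId).foreach(_ -= executor)

  // Intended behaviour: also drop the shuffle service endpoint on that host.
  def removeNode(blockId: String, executor: Endpoint): Unit =
    locations.get(blockId).foreach { eps =>
      eps -= executor
      val stale = eps.filter(ep => ep.isShuffleService && ep.host == executor.host)
      eps --= stale
    }
}
{code}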





[jira] [Created] (SPARK-47521) Fix file handle leakage during shuffle data read from external storage.

2024-03-22 Thread mahesh kumar behera (Jira)
mahesh kumar behera created SPARK-47521:
---

 Summary: Fix file handle leakage during shuffle data read from 
external storage.
 Key: SPARK-47521
 URL: https://issues.apache.org/jira/browse/SPARK-47521
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.4, 3.5.0, 3.4.1
Reporter: mahesh kumar behera


In the FallbackStorage.read method, the file handle is not closed if there is a 
failure during the read operation.
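
A minimal sketch of the usual fix pattern (generic code, not the actual 
FallbackStorage.read implementation): wrap the read in try/finally so the 
handle is closed on both success and failure.

{code:scala}
// Illustrative only; the helper name and signature are made up.
import java.io.{EOFException, InputStream}

def readFully(open: () => InputStream, size: Int): Array[Byte] = {
  val in = open()
  try {
    val buf = new Array[Byte](size)
    var offset = 0
    while (offset < size) {
      val n = in.read(buf, offset, size - offset)
      if (n < 0) throw new EOFException(s"Unexpected EOF at offset $offset")
      offset += n
    }
    buf
  } finally {
    in.close() // runs even when the read throws, so the handle never leaks
  }
}
{code}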

 





[jira] [Updated] (SPARK-47141) Support enabling migration of shuffle data directly to external storage using config parameter

2024-03-22 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated SPARK-47141:

Summary: Support enabling migration of shuffle data directly to external 
storage using config parameter  (was: Support shuffle migration to external 
storage)

> Support enabling migration of shuffle data directly to external storage using 
> config parameter
> --
>
> Key: SPARK-47141
> URL: https://issues.apache.org/jira/browse/SPARK-47141
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Currently, Spark supports migrating shuffle data to peer nodes during node 
> decommissioning. If peer nodes are not accessible, Spark falls back to 
> external storage; the user needs to provide the storage location path. There 
> are scenarios where a user may want to migrate to external storage instead of 
> peer nodes, for example because of unstable nodes or the need for aggressive 
> scale-down. So the user should be able to configure Spark to migrate the 
> shuffle data directly to external storage if the use case permits.





[jira] [Created] (SPARK-47141) Support shuffle migration to external storage

2024-02-22 Thread mahesh kumar behera (Jira)
mahesh kumar behera created SPARK-47141:
---

 Summary: Support shuffle migration to external storage
 Key: SPARK-47141
 URL: https://issues.apache.org/jira/browse/SPARK-47141
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: mahesh kumar behera
 Fix For: 4.0.0


Currently, Spark supports migrating shuffle data to peer nodes during node 
decommissioning. If peer nodes are not accessible, Spark falls back to external 
storage; the user needs to provide the storage location path. There are 
scenarios where a user may want to migrate to external storage instead of peer 
nodes, for example because of unstable nodes or the need for aggressive 
scale-down. So the user should be able to configure Spark to migrate the 
shuffle data directly to external storage if the use case permits.
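
For illustration, a hedged sketch of how such a switch could be used; the first 
three keys are existing decommission settings, while the last key is purely 
hypothetical and only stands in for whatever config name the change introduces.

{code:scala}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.decommission.enabled", "true")
  .set("spark.storage.decommission.shuffleBlocks.enabled", "true")
  // Existing behaviour: fallback storage is used only when no peer is reachable.
  .set("spark.storage.decommission.fallbackStorage.path", "s3a://my-bucket/spark-fallback/")
  // Hypothetical flag for this proposal: skip peers and migrate shuffle data
  // directly to the fallback storage during decommissioning.
  .set("spark.storage.decommission.shuffleBlocks.migrateToFallbackOnly", "true")
{code}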





[jira] [Commented] (SPARK-33545) Support Fallback Storage during Worker decommission

2024-01-16 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807166#comment-17807166
 ] 

mahesh kumar behera commented on SPARK-33545:
-

[~dongjoon] 

As per this PR, the shuffle read from fallback storage is done along with the 
read of local blocks. Is there any reason for this? The local reads are done in 
a single thread; is there any obvious issue you see if we read shuffle data 
from fallback storage in multiple threads?
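
To make the question concrete, a rough sketch of the kind of parallel read 
being asked about (purely illustrative; readBlock is a hypothetical stand-in 
for the fallback-storage read path):

{code:scala}
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

// Fetch several fallback-storage blocks concurrently instead of one at a time.
def readBlocksInParallel(blockIds: Seq[String],
                         readBlock: String => Array[Byte],
                         numThreads: Int = 4): Seq[Array[Byte]] = {
  val pool = Executors.newFixedThreadPool(numThreads)
  implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
  try {
    Await.result(Future.sequence(blockIds.map(id => Future(readBlock(id)))), 10.minutes)
  } finally {
    pool.shutdown()
  }
}
{code}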

> Support Fallback Storage during Worker decommission
> ---
>
> Key: SPARK-33545
> URL: https://issues.apache.org/jira/browse/SPARK-33545
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.1.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.1.0
>
>






[jira] [Created] (SPARK-44307) Bloom filter is not added for left outer join if the left side table is smaller than broadcast threshold.

2023-07-04 Thread mahesh kumar behera (Jira)
mahesh kumar behera created SPARK-44307:
---

 Summary: Bloom filter is not added for left outer join if the left 
side table is smaller than broadcast threshold.
 Key: SPARK-44307
 URL: https://issues.apache.org/jira/browse/SPARK-44307
 Project: Spark
  Issue Type: Bug
  Components: Optimizer
Affects Versions: 3.4.1
Reporter: mahesh kumar behera
 Fix For: 3.5.0


In the case of a left outer join, a shuffle join is used even if the left-side 
table is small enough to be broadcast. This is a property of the left outer 
join: if the left side were broadcast, the generated result would be wrong. But 
this is not taken into account during bloom filter injection. While injecting 
the bloom filter, if the left side is smaller than the broadcast threshold, the 
bloom filter is not added, on the assumption that the left side will be 
broadcast and no bloom filter is needed. This causes the bloom filter 
optimization to be missed for a left outer join with a small left-side table 
and a huge right-side table.
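
A small example of the affected query shape (table and column names are made up 
for illustration):

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bloom-filter-left-outer").getOrCreate()

// A left outer join cannot broadcast the small LEFT side (that would produce
// wrong results), so a shuffle join is used. A runtime bloom filter built from
// the small left side would still prune the scan of the huge right side, but
// the injection rule skips it because the left side is under the broadcast
// threshold.
val result = spark.sql(
  """SELECT d.id, d.name, f.metric
    |FROM dim_small d            -- small table, under the broadcast threshold
    |LEFT OUTER JOIN fact_huge f -- huge table
    |  ON d.id = f.dim_id
    |""".stripMargin)
{code}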


