[jira] [Commented] (SPARK-40082) DAGScheduler may not schedule new stage in condition of push-based shuffle enabled

2022-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580070#comment-17580070 ] Min Shen commented on SPARK-40082: -- [~csingh] [~mridul]  Want to bring your attention to this ticket.

[jira] [Comment Edited] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2022-04-02 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516298#comment-17516298 ] Min Shen edited comment on SPARK-30602 at 4/2/22 1:02 PM: -- [~pan3793] , I guess

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2022-04-02 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17516298#comment-17516298 ] Min Shen commented on SPARK-30602: -- [~pan3793] , I guess you were referring to the earlier screenshot

[jira] [Commented] (SPARK-36892) Disable batch fetch for a shuffle when push based shuffle is enabled

2021-09-30 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422800#comment-17422800 ] Min Shen commented on SPARK-36892: -- [~Gengliang.Wang] This issue and the ones fixed earlier are

[jira] [Commented] (SPARK-36558) Stage has all tasks finished but with ongoing finalization can cause job hang

2021-08-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402944#comment-17402944 ] Min Shen commented on SPARK-36558: -- This is an issue we previously resolved internally. Seems that

[jira] [Updated] (SPARK-33701) Adaptive shuffle merge finalization for push-based shuffle

2021-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-33701: - Description: SPARK-32920 implements a simple approach for shuffle merge finalization, which

[jira] [Commented] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400122#comment-17400122 ] Min Shen commented on SPARK-36530: -- Expanded the description of SPARK-33701 to include this scenario as

[jira] [Commented] (SPARK-36530) Avoid finalizing when there's no push at all in a shuffle

2021-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400100#comment-17400100 ] Min Shen commented on SPARK-36530: -- You mean every shuffle partition block is larger than that

[jira] [Commented] (SPARK-33331) Limit the number of pending blocks in memory and store blocks that collide

2021-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399845#comment-17399845 ] Min Shen commented on SPARK-33331: -- With the change in SPARK-36423, I think this issue is further

[jira] [Commented] (SPARK-35036) Improve push based shuffle to work with AQE by fetching partial map indexes for a reduce partition

2021-08-16 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399843#comment-17399843 ] Min Shen commented on SPARK-35036: -- Not sure if this is fixable, given the reasons you already

[jira] [Updated] (SPARK-36483) Fix intermittent test failure due to netty dependency version bump

2021-08-11 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-36483: - Description: In SPARK-35132, Spark's netty dependency version was bumped from 4.1.51 to 4.1.63. Since

[jira] [Created] (SPARK-36483) Fix intermittent test failure due to netty dependency version bump

2021-08-11 Thread Min Shen (Jira)
Min Shen created SPARK-36483: Summary: Fix intermittent test failure due to netty dependency version bump Key: SPARK-36483 URL: https://issues.apache.org/jira/browse/SPARK-36483 Project: Spark

[jira] [Created] (SPARK-36423) Randomize blocks within a push request before pushing to improve block merge ratio

2021-08-05 Thread Min Shen (Jira)
Min Shen created SPARK-36423: Summary: Randomize blocks within a push request before pushing to improve block merge ratio Key: SPARK-36423 URL: https://issues.apache.org/jira/browse/SPARK-36423 Project:

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-08-02 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391745#comment-17391745 ] Min Shen commented on SPARK-30602: -- [~mridulm80], thanks for shepherding this work and your reviews on

[jira] [Reopened] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen reopened SPARK-36378: -- > Minor changes to address a few identified server side inefficiencies >

[jira] [Updated] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-36378: - Parent: SPARK-33235 Issue Type: Sub-task (was: Bug) > Minor changes to address a few

[jira] [Updated] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-36378: - Parent: (was: SPARK-30602) Issue Type: Bug (was: Sub-task) > Minor changes to address a

[jira] [Commented] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391322#comment-17391322 ] Min Shen commented on SPARK-36378: -- If moving this outside of the SPIP is preferred, then will move

[jira] [Commented] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17391320#comment-17391320 ] Min Shen commented on SPARK-36378: -- Would prefer to merge this in if possible. > Minor changes to

[jira] [Updated] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-36378: - Description: With the SPIP ticket close to being finished, we have done some performance evaluations

[jira] [Created] (SPARK-36378) Minor changes to address a few identified server side inefficiencies

2021-08-01 Thread Min Shen (Jira)
Min Shen created SPARK-36378: Summary: Minor changes to address a few identified server side inefficiencies Key: SPARK-36378 URL: https://issues.apache.org/jira/browse/SPARK-36378 Project: Spark

[jira] [Created] (SPARK-36266) Rename classes in shuffle RPC used for block push operations

2021-07-22 Thread Min Shen (Jira)
Min Shen created SPARK-36266: Summary: Rename classes in shuffle RPC used for block push operations Key: SPARK-36266 URL: https://issues.apache.org/jira/browse/SPARK-36266 Project: Spark Issue

[jira] [Commented] (SPARK-35426) When addMergerLocation exceed the maxRetainedMergerLocations , we should remove the merger based on merged shuffle data size.

2021-06-14 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363231#comment-17363231 ] Min Shen commented on SPARK-35426: -- When a merger is removed from the retained list, it only prevents

[jira] [Commented] (SPARK-35426) When addMergerLocation exceed the maxRetainedMergerLocations , we should remove the merger based on merged shuffle data size.

2021-06-08 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359673#comment-17359673 ] Min Shen commented on SPARK-35426: -- [~zhuqi], the retained mergers are meant for choosing merger

[jira] [Commented] (SPARK-35549) Register merge status even after shuffle dependency is merge finalized

2021-06-08 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359671#comment-17359671 ] Min Shen commented on SPARK-35549: -- It might not be straightforward to register later MergeStatus.

[jira] [Updated] (SPARK-35546) Properly handle race conditions in RemoteBlockPushResolver for access to the internal ConcurrentHashMaps

2021-05-28 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-35546: - Summary: Properly handle race conditions in RemoteBlockPushResolver for access to the internal

[jira] [Updated] (SPARK-35036) Improve push based shuffle to work with AQE by fetching partial map indexes for a reduce partition

2021-05-02 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-35036: - Parent: SPARK-33235 Issue Type: Sub-task (was: New Feature) > Improve push based shuffle to

[jira] [Updated] (SPARK-35036) Improve push based shuffle to work with AQE by fetching partial map indexes for a reduce partition

2021-05-02 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-35036: - Parent: (was: SPARK-30602) Issue Type: New Feature (was: Sub-task) > Improve push based

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-04-15 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17322476#comment-17322476 ] Min Shen commented on SPARK-30602: -- We have published the production results of push-based shuffle on

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2021-03-21 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305738#comment-17305738 ] Min Shen commented on SPARK-30602: -- Just an update for where we are: The team at LinkedIn has been

[jira] [Created] (SPARK-33781) Improve caching of MergeStatus on the executor side to save memory

2020-12-14 Thread Min Shen (Jira)
Min Shen created SPARK-33781: Summary: Improve caching of MergeStatus on the executor side to save memory Key: SPARK-33781 URL: https://issues.apache.org/jira/browse/SPARK-33781 Project: Spark

[jira] [Created] (SPARK-33701) Adaptive shuffle merge finalization for push-based shuffle

2020-12-07 Thread Min Shen (Jira)
Min Shen created SPARK-33701: Summary: Adaptive shuffle merge finalization for push-based shuffle Key: SPARK-33701 URL: https://issues.apache.org/jira/browse/SPARK-33701 Project: Spark Issue

[jira] [Created] (SPARK-33574) Improve locality for push-based shuffle especially for join like operations

2020-11-26 Thread Min Shen (Jira)
Min Shen created SPARK-33574: Summary: Improve locality for push-based shuffle especially for join like operations Key: SPARK-33574 URL: https://issues.apache.org/jira/browse/SPARK-33574 Project: Spark

[jira] [Created] (SPARK-33573) Server and client side metrics related to push-based shuffle

2020-11-26 Thread Min Shen (Jira)
Min Shen created SPARK-33573: Summary: Server and client side metrics related to push-based shuffle Key: SPARK-33573 URL: https://issues.apache.org/jira/browse/SPARK-33573 Project: Spark Issue

[jira] [Updated] (SPARK-33235) Push-based Shuffle Improvement Tasks

2020-11-26 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-33235: - Description: This is the parent jira for follow-up improvement tasks for supporting Push-based shuffle.

[jira] [Updated] (SPARK-33235) Push-based Shuffle Improvement Tasks

2020-11-26 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-33235: - Summary: Push-based Shuffle Improvement Tasks (was: Push-based Shuffle Phase 2 Tasks) > Push-based

[jira] [Updated] (SPARK-33329) Pluggable API to fetch shuffle merger locations with Push based shuffle

2020-11-09 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-33329: - Parent: SPARK-33235 Issue Type: Sub-task (was: New Feature) > Pluggable API to fetch shuffle

[jira] [Updated] (SPARK-33329) Pluggable API to fetch shuffle merger locations with Push based shuffle

2020-11-09 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-33329: - Parent: (was: SPARK-30602) Issue Type: New Feature (was: Sub-task) > Pluggable API to

[jira] [Commented] (SPARK-32925) Support push-based shuffle in multiple deployment environments

2020-09-17 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197893#comment-17197893 ] Min Shen commented on SPARK-32925: -- cc [~dongjoon], [~holden], [~dbtsai] for comments on k8s shuffle

[jira] [Created] (SPARK-32925) Support push-based shuffle in multiple deployment environments

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32925: Summary: Support push-based shuffle in multiple deployment environments Key: SPARK-32925 URL: https://issues.apache.org/jira/browse/SPARK-32925 Project: Spark

[jira] [Created] (SPARK-32923) Add support to properly handle different type of stage retries

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32923: Summary: Add support to properly handle different type of stage retries Key: SPARK-32923 URL: https://issues.apache.org/jira/browse/SPARK-32923 Project: Spark

[jira] [Created] (SPARK-32922) Add support for ShuffleBlockFetcherIterator to read from merged shuffle partitions and to fallback to original shuffle blocks if encountering failures

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32922: Summary: Add support for ShuffleBlockFetcherIterator to read from merged shuffle partitions and to fallback to original shuffle blocks if encountering failures Key: SPARK-32922 URL:

[jira] [Created] (SPARK-32921) Extend MapOutputTracker to support tracking and serving the metadata about each merged shuffle partitions for a given shuffle in push-based shuffle scenario

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32921: Summary: Extend MapOutputTracker to support tracking and serving the metadata about each merged shuffle partitions for a given shuffle in push-based shuffle scenario Key: SPARK-32921

[jira] [Created] (SPARK-32920) Add support in Spark driver to coordinate the finalization of the push/merge phase in push-based shuffle for a given shuffle and the initiation of the reduce stage

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32920: Summary: Add support in Spark driver to coordinate the finalization of the push/merge phase in push-based shuffle for a given shuffle and the initiation of the reduce stage Key: SPARK-32920

[jira] [Created] (SPARK-32919) Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32919: Summary: Add support in Spark driver to coordinate the shuffle map stage in push-based shuffle by selecting external shuffle services for merging shuffle partitions Key: SPARK-32919

[jira] [Created] (SPARK-32918) RPC implementation to support control plane coordination for push-based shuffle

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32918: Summary: RPC implementation to support control plane coordination for push-based shuffle Key: SPARK-32918 URL: https://issues.apache.org/jira/browse/SPARK-32918 Project:

[jira] [Created] (SPARK-32917) Add support for executors to push shuffle blocks after successful map task completion

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32917: Summary: Add support for executors to push shuffle blocks after successful map task completion Key: SPARK-32917 URL: https://issues.apache.org/jira/browse/SPARK-32917

[jira] [Created] (SPARK-32916) Add support for external shuffle service in YARN deployment mode to leverage push-based shuffle

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32916: Summary: Add support for external shuffle service in YARN deployment mode to leverage push-based shuffle Key: SPARK-32916 URL: https://issues.apache.org/jira/browse/SPARK-32916

[jira] [Created] (SPARK-32915) RPC implementation to support pushing and merging shuffle blocks

2020-09-17 Thread Min Shen (Jira)
Min Shen created SPARK-32915: Summary: RPC implementation to support pushing and merging shuffle blocks Key: SPARK-32915 URL: https://issues.apache.org/jira/browse/SPARK-32915 Project: Spark

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-08-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: (was: p887-shen.pdf) > SPIP: Support push-based shuffle to improve shuffle efficiency >

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-08-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: vldb_magnet_final.pdf > SPIP: Support push-based shuffle to improve shuffle efficiency >

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-08-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: p887-shen.pdf > SPIP: Support push-based shuffle to improve shuffle efficiency >

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-08-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: (was: vldb_2020_magnet_shuffle.pdf) > SPIP: Support push-based shuffle to improve

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144060#comment-17144060 ] Min Shen commented on SPARK-30602: -- Also want to share the production results we have so far. We have

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: Screen Shot 2020-06-23 at 11.31.22 AM.jpg > SPIP: Support push-based shuffle to improve

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: (was: Screen Shot 2020-06-17 at 7.01.32 PM.jpg) > SPIP: Support push-based shuffle to

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: Screen Shot 2020-06-17 at 7.01.32 PM.jpg > SPIP: Support push-based shuffle to improve

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144010#comment-17144010 ] Min Shen commented on SPARK-30602: -- Our paper summarizing the work on this new push-based shuffle was

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: vldb_2020_magnet_shuffle.pdf > SPIP: Support push-based shuffle to improve shuffle

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: (was: magnet_shuffle.pdf) > SPIP: Support push-based shuffle to improve shuffle

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-06-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Attachment: magnet_shuffle.pdf > SPIP: Support push-based shuffle to improve shuffle efficiency >

[jira] [Commented] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-02-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043829#comment-17043829 ] Min Shen commented on SPARK-30602: -- [~shanyu], I have listed a few key differences between Riffle and

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2020-01-23 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022342#comment-17022342 ] Min Shen commented on SPARK-29206: -- With more investigation into the Netty side issues, we are

[jira] [Updated] (SPARK-30602) SPIP: Support push-based shuffle to improve shuffle efficiency

2020-01-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-30602: - Description: In a large deployment of a Spark compute infrastructure, Spark shuffle is becoming a

[jira] [Created] (SPARK-30602) Support push-based shuffle to improve shuffle efficiency

2020-01-21 Thread Min Shen (Jira)
Min Shen created SPARK-30602: Summary: Support push-based shuffle to improve shuffle efficiency Key: SPARK-30602 URL: https://issues.apache.org/jira/browse/SPARK-30602 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-21492) Memory leak in SortMergeJoin

2019-10-14 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951165#comment-16951165 ] Min Shen edited comment on SPARK-21492 at 10/14/19 5:13 PM: Want to further

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2019-10-14 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951165#comment-16951165 ] Min Shen commented on SPARK-21492: -- Want to further clarify the scope of the fix in PR 

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2019-10-14 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951154#comment-16951154 ] Min Shen commented on SPARK-21492: -- We have deployed the latest version of the PR in 

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-10-14 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951128#comment-16951128 ] Min Shen commented on SPARK-29206: -- It appears that simply by making sure the number of Netty server

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-24 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937219#comment-16937219 ] Min Shen commented on SPARK-29206: -- [~irashid], Would appreciate your thoughts on this ticket as well.

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-23 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936219#comment-16936219 ] Min Shen commented on SPARK-29206: -- [~tgraves], A PR is put up for this. The actual fix itself is

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935556#comment-16935556 ] Min Shen commented on SPARK-29206: -- We initially tried an alternative approach to resolve this issue by

[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-22 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935552#comment-16935552 ] Min Shen commented on SPARK-29206: -- [~redsanket], [~tgraves], Since you worked on committing the

[jira] [Created] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-22 Thread Min Shen (Jira)
Min Shen created SPARK-29206: Summary: Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads Key: SPARK-29206 URL: https://issues.apache.org/jira/browse/SPARK-29206

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2019-09-21 Thread Min Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935176#comment-16935176 ] Min Shen commented on SPARK-21492: -- We also saw this issue happening in our cluster. Based on the

[jira] [Commented] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-06-01 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498228#comment-16498228 ] Min Shen commented on SPARK-24355: -- [~jerryshao], We built a stress testing tool for Spark shuffle

[jira] [Commented] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484650#comment-16484650 ] Min Shen commented on SPARK-24355: -- [~felixcheung]  [~jinxing6...@126.com] [~cloud_fan] Could you

[jira] [Created] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Min Shen (JIRA)
Min Shen created SPARK-24355: Summary: Improve Spark shuffle server responsiveness to non-ChunkFetch requests Key: SPARK-24355 URL: https://issues.apache.org/jira/browse/SPARK-24355 Project: Spark

[jira] [Commented] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-28 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270055#comment-16270055 ] Min Shen commented on SPARK-22373: -- Created PR https://github.com/apache/spark/pull/19839 [~sowen]

[jira] [Updated] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-28 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-22373: - Attachment: generated.java CodeGeneratorTester.scala Attach the standalone testing

[jira] [Commented] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-28 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269924#comment-16269924 ] Min Shen commented on SPARK-22373: -- [~leigjklotz], I think bumping up Janino version to 3.0.7

[jira] [Commented] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-28 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269128#comment-16269128 ] Min Shen commented on SPARK-22373: -- Tried running the test application using 10 concurrent threads to

[jira] [Commented] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-27 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267948#comment-16267948 ] Min Shen commented on SPARK-22373: -- The code that would throw this NPE looks like the following:

[jira] [Commented] (SPARK-22373) Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom

2017-11-27 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267723#comment-16267723 ] Min Shen commented on SPARK-22373: -- Also facing this issue. From our experience, it is more likely to

[jira] [Commented] (SPARK-10878) Race condition when resolving Maven coordinates via Ivy

2017-08-01 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109778#comment-16109778 ] Min Shen commented on SPARK-10878: -- We hit the same issue in our infrastructure where concurrent Livy

[jira] [Commented] (SPARK-11597) improve performance of array and map encoder

2017-05-17 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014993#comment-16014993 ] Min Shen commented on SPARK-11597: -- Is there any further update on this ticket? We have recently seen a

[jira] [Commented] (SPARK-19810) Remove support for Scala 2.10

2017-03-07 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900548#comment-15900548 ] Min Shen commented on SPARK-19810: -- [~srowen], Want to get an idea regarding the timeline for removing

[jira] [Commented] (SPARK-19380) YARN - Dynamic allocation should use configured number of executors as max number of executors

2017-01-27 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843340#comment-15843340 ] Min Shen commented on SPARK-19380: -- [~srowen], What we want is to be able to also cap the number of

[jira] [Updated] (SPARK-18646) ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst

2016-11-30 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-18646: - Component/s: (was: Spark Core) Spark Shell > ExecutorClassLoader for spark-shell

[jira] [Created] (SPARK-18646) ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst

2016-11-30 Thread Min Shen (JIRA)
Min Shen created SPARK-18646: Summary: ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst Key: SPARK-18646 URL: https://issues.apache.org/jira/browse/SPARK-18646

[jira] [Updated] (SPARK-10172) History Server web UI gets messed up when sorting on any column

2015-08-23 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-10172: - Attachment: screen-shot.png [~srowen], Screen shot attached. When Attempt ID column is displayed, after

[jira] [Created] (SPARK-10172) History Server web UI gets messed up when sorting on any column

2015-08-22 Thread Min Shen (JIRA)
Min Shen created SPARK-10172: Summary: History Server web UI gets messed up when sorting on any column Key: SPARK-10172 URL: https://issues.apache.org/jira/browse/SPARK-10172 Project: Spark