[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840930#comment-17840930 ]

Dongjoon Hyun commented on SPARK-27780:
---------------------------------------

I removed the outdated target version from this issue.

> Shuffle server & client should be versioned to enable smoother upgrade
> ----------------------------------------------------------------------
>
>                 Key: SPARK-27780
>                 URL: https://issues.apache.org/jira/browse/SPARK-27780
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle, Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> The external shuffle service is often upgraded at a different time than spark
> itself. However, this causes problems when the protocol changes between the
> shuffle service and the spark runtime -- this forces users to upgrade
> everything simultaneously.
> We should add versioning to the shuffle client & server, so they know what
> messages the other will support. This would allow better handling of mixed
> versions, from better error msgs to allowing some mismatched versions (with
> reduced capabilities).
> This originally came up in a discussion here:
> https://github.com/apache/spark/pull/24565#issuecomment-493496466
> There are a few ways we could do the versioning which we still need to
> discuss:
> 1) Version specified by config. This allows for mixed versions across the
> cluster and rolling upgrades. It also will let a spark 3.0 client talk to a
> 2.4 shuffle service. But, may be a nuisance for users to get this right.
> 2) Auto-detection during registration with local shuffle service. This makes
> the versioning easy for the end user, and can even handle a 2.4 shuffle
> service though it does not support the new versioning. However, it will not
> handle a rolling upgrade correctly -- if the local shuffle service has been
> upgraded, but other nodes in the cluster have not, it will get the version
> wrong.
> 3) Exchange versions per-connection. When a connection is opened, the server
> & client could first exchange messages with their versions, so they know how
> to continue communication after that.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
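
As an illustration of option 3 in the issue description (exchanging versions per-connection), here is a minimal Python sketch of how the two sides might settle on a protocol version. Nothing in it is Spark's actual API: the `Hello` message, the `negotiate` function, the `VersionMismatch` error and the `LEGACY_VERSION` constant are all hypothetical names, and treating an unanswered hello as "the peer predates versioning" is an assumption made only for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumption for this sketch: a peer that predates versioning speaks "version 0".
LEGACY_VERSION = 0


@dataclass(frozen=True)
class Hello:
    """Hypothetical first message each side sends when a connection opens."""
    role: str                  # "client" or "server"
    supported: FrozenSet[int]  # protocol versions this side can speak


class VersionMismatch(Exception):
    """Raised when the two sides share no protocol version."""


def negotiate(client: Hello, server: Optional[Hello]) -> int:
    """Return the protocol version both sides should use on this connection.

    `server` is None when the peer never answered the hello, which is how
    this sketch models an old shuffle service without versioning support.
    """
    if server is None:
        if LEGACY_VERSION in client.supported:
            return LEGACY_VERSION
        raise VersionMismatch(
            "peer does not support versioning and the client cannot "
            "speak the legacy protocol"
        )
    common = client.supported & server.supported
    if not common:
        raise VersionMismatch(
            f"client supports {sorted(client.supported)}, "
            f"server supports {sorted(server.supported)}"
        )
    # Pick the newest version both sides understand; an older common version
    # corresponds to the "reduced capabilities" case in the description.
    return max(common)
```

Because the negotiation happens on every connection, a client in a partially upgraded cluster can end up with different versions for different shuffle services, which is the rolling-upgrade case that option 2 is said to handle incorrectly. The explicit `VersionMismatch` error stands in for the "better error msgs" the description asks for.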
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419135#comment-17419135 ]

sooyeon shin commented on SPARK-27780:
--------------------------------------

[~shyamabigd...@gmail.com] I meant that you could use shuffle file tracking instead of the external shuffle service. Shuffle file tracking is a new feature in Spark 3.0.

Ref: https://www.waitingforcode.com/apache-spark/what-new-apache-spark-3-shuffle-service-changes/read

The following options are needed to enable shuffle file tracking:

spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=false
spark.dynamicAllocation.shuffleTracking.enabled=true
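
The three options in the comment above can be passed to `spark-submit` as `--conf` flags. The following Python sketch only assembles that command line; the helper `to_submit_args` and the application name `app.py` are hypothetical and used purely for illustration, while the option names and values are the ones given in the comment.

```python
# Options quoted in the comment above; the values are kept as strings
# because that is how spark-submit receives them.
SHUFFLE_TRACKING_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "false",
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
}


def to_submit_args(conf):
    """Render a dict of Spark options as a flat list of --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "app.py" is a placeholder for the real application.
    print(" ".join(["spark-submit", *to_submit_args(SHUFFLE_TRACKING_CONF), "app.py"]))
```

Keeping the options in one dict makes it easy to see that the external shuffle service is switched off while shuffle tracking is switched on, which is the combination the comment recommends.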
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418909#comment-17418909 ]

Shyam commented on SPARK-27780:
-------------------------------

[~comma337] Could you please let me know where the shuffle service information is in the links you mentioned?
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243841#comment-17243841 ]

Hyukjin Kwon commented on SPARK-27780:
--------------------------------------

I will switch the target version to 3.2.0. The branch will be cut soon.
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17234930#comment-17234930 ]

Dongjoon Hyun commented on SPARK-27780:
---------------------------------------

Gentle ping, [~irashid], because this has `Target Version: 3.1.0`. We can switch to 3.2.0 or drop the target version for now.
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163355#comment-17163355 ]

sooyeon shin commented on SPARK-27780:
--------------------------------------

Hi there. I had the same issue. In 3.0, dynamic allocation can run without the external shuffle service. Please see below.

--conf spark.dynamicAllocation.shuffleTracking.enabled=true

[http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation]

I hope it helps.
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149083#comment-17149083 ]

Jason Moore commented on SPARK-27780:
-------------------------------------

I encountered this on 3.0.0 running with a much older shuffle service (2.1.1, I think). Is there any documentation on which shuffle service versions 3.0.0 will work with?
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060768#comment-17060768 ]

Dongjoon Hyun commented on SPARK-27780:
---------------------------------------

Hi, [~irashid]. Is there any update?
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983536#comment-16983536 ]

deshanxiao commented on SPARK-27780:
------------------------------------

I couldn't agree more. Adding a shuffle service version is very necessary.

> Shuffle server & client should be versioned to enable smoother upgrade
> ----------------------------------------------------------------------
>
>                 Key: SPARK-27780
>                 URL: https://issues.apache.org/jira/browse/SPARK-27780
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Priority: Major
[jira] [Commented] (SPARK-27780) Shuffle server & client should be versioned to enable smoother upgrade
[ https://issues.apache.org/jira/browse/SPARK-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875858#comment-16875858 ]

koert kuipers commented on SPARK-27780:
---------------------------------------

fwiw i just ran into this since i am building and deploying spark from master (which includes SPARK-27665) but my shuffle service is spark 2.4.x and i cannot easily upgrade it