[jira] [Created] (SPARK-32504) Shuffle Storage API: Dynamic updates of shuffle metadata
Matt Cheah created SPARK-32504: -- Summary: Shuffle Storage API: Dynamic updates of shuffle metadata Key: SPARK-32504 URL: https://issues.apache.org/jira/browse/SPARK-32504 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah When using external storage for shuffles via the shuffle storage API mechanism, it is often desirable to update the metadata associated with shuffles, which plugin systems can implement via https://issues.apache.org/jira/browse/SPARK-31801. For example: # If data is stored in some replicated manner and the number of replicas is updated, then the metadata stored on the driver should reflect the new number of replicas and where they are located. # If data is stored on the mapper's local disk but is asynchronously backed up to some external storage medium, then we want to know when a backup becomes available externally. To achieve this, we would need to pass a hook for updating shuffle metadata to the shuffle executor components at the root of the plugin tree on the executor side. The executor would establish an RPC connection with the driver and send messages to update shuffle metadata accordingly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
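The flow described above can be illustrated with a small sketch. Every name and signature here is hypothetical (this API does not exist in Spark as written): a driver-side tracker receives metadata updates that executors would, in the real proposal, send over an RPC connection rather than call directly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hook for updating shuffle metadata; in the proposal this would
// be exposed to the shuffle executor components at the root of the plugin tree.
interface ShuffleMetadataUpdater {
    void updateMetadata(int shuffleId, long mapId, String metadata);
}

// Minimal in-memory stand-in for the driver-side endpoint that would receive
// these updates over RPC. Metadata is modeled as an opaque string.
final class InMemoryMetadataTracker implements ShuffleMetadataUpdater {
    private final Map<Integer, Map<Long, String>> byShuffle = new HashMap<>();

    @Override
    public synchronized void updateMetadata(int shuffleId, long mapId, String metadata) {
        byShuffle.computeIfAbsent(shuffleId, k -> new HashMap<>()).put(mapId, metadata);
    }

    public synchronized String currentMetadata(int shuffleId, long mapId) {
        return byShuffle.getOrDefault(shuffleId, new HashMap<>()).get(mapId);
    }
}
```

For instance, an executor that finished an asynchronous backup could send an update such as `updateMetadata(shuffleId, mapId, "backedUp=true")`, and subsequently scheduled reads would see the new location.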
[jira] [Commented] (SPARK-28210) Shuffle Storage API: Reads
[ https://issues.apache.org/jira/browse/SPARK-28210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168287#comment-17168287 ] Matt Cheah commented on SPARK-28210: [~devaraj] [~tianczha] Thanks for expressing interest in this! This patch blocks on the shuffle metadata APIs patch, which can be found here: [https://github.com/apache/spark/pull/28618|https://github.com/apache/spark/pull/28618] I think after merging the shuffle metadata API change, we can provide the appropriate reader APIs and then integrate the usage of shuffle metadata accordingly. I originally had a diff here: [https://github.com/mccheah/spark/pull/12], but it has fallen far out of sync with the patches proposed against upstream Spark. The proposed reader API can be found in the shuffle API design document: [https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit#heading=h.cli6x4fsunz6|https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit#heading=h.cli6x4fsunz6] But we really cannot make progress here until the shuffle metadata storage APIs and their integration are complete. Could you review the Apache patch listed above and give your +1 or feedback on how it can be improved? Then we can go from there. > Shuffle Storage API: Reads > -- > > Key: SPARK-28210 > URL: https://issues.apache.org/jira/browse/SPARK-28210 > Project: Spark > Issue Type: Sub-task > Components: Shuffle, Spark Core >Affects Versions: 3.1.0 >Reporter: Matt Cheah >Priority: Major > > As part of the effort to store shuffle data in arbitrary places, this issue > tracks implementing an API for reading the shuffle data stored by the write > API. Also ensure that the existing shuffle implementation is refactored to > use the API.
[jira] [Created] (SPARK-31801) Register shuffle map output metadata with a shuffle output tracker
Matt Cheah created SPARK-31801: -- Summary: Register shuffle map output metadata with a shuffle output tracker Key: SPARK-31801 URL: https://issues.apache.org/jira/browse/SPARK-31801 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.1.0 Reporter: Matt Cheah Part of the design as discussed in [this document|https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit#]. Establish a {{ShuffleOutputTracker}} API that resides on the driver, accept the map output metadata returned by the map output writers, and send that metadata to the output tracker module accordingly. Requires https://issues.apache.org/jira/browse/SPARK-31798.
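As a rough illustration, a driver-side tracker along these lines would accept the metadata blobs that map output writers return. All names and signatures below are hypothetical sketches, not the API actually proposed in the linked design document, and an opaque byte[] stands in for the plugin-defined metadata type:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a driver-resident shuffle output tracker that
// collects per-map-task metadata produced by the shuffle writer plugin.
final class SketchShuffleOutputTracker {
    private final Map<Integer, List<byte[]>> outputsByShuffle = new HashMap<>();

    // Called when a shuffle is registered with the driver.
    synchronized void registerShuffle(int shuffleId) {
        outputsByShuffle.putIfAbsent(shuffleId, new ArrayList<>());
    }

    // Called with the metadata a map output writer returned on commit.
    synchronized void registerMapOutput(int shuffleId, byte[] metadata) {
        outputsByShuffle.get(shuffleId).add(metadata);
    }

    synchronized int numMapOutputs(int shuffleId) {
        return outputsByShuffle.getOrDefault(shuffleId, new ArrayList<>()).size();
    }
}
```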
[jira] [Updated] (SPARK-31798) Return map output metadata from shuffle writers
[ https://issues.apache.org/jira/browse/SPARK-31798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-31798: --- Description: Part of the overall sub-push for shuffle metadata management as proposed in [this document|https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]. The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. was: Part of the overall sub-push for shuffle metadata management as proposed in [this document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]] The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. > Return map output metadata from shuffle writers > --- > > Key: SPARK-31798 > URL: https://issues.apache.org/jira/browse/SPARK-31798 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > Part of the overall sub-push for shuffle metadata management as proposed in > [this > document|https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]. > The shuffle plugin system needs a means for storing custom metadata for > shuffle readers to eventually use. This task tracks the API changes required > for the map output writer to return some such metadata back to the driver. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31798) Return map output metadata from shuffle writers
[ https://issues.apache.org/jira/browse/SPARK-31798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-31798: --- Description: Part of the overall sub-push for shuffle metadata management as proposed in [this document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]] The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. was: Part of the overall sub-push for shuffle metadata management as proposed in [this document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]]. The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. > Return map output metadata from shuffle writers > --- > > Key: SPARK-31798 > URL: https://issues.apache.org/jira/browse/SPARK-31798 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > Part of the overall sub-push for shuffle metadata management as proposed in > [this > document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]] > The shuffle plugin system needs a means for storing custom metadata for > shuffle readers to eventually use. This task tracks the API changes required > for the map output writer to return some such metadata back to the driver. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-31798) Return map output metadata from shuffle writers
[ https://issues.apache.org/jira/browse/SPARK-31798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-31798: --- Description: Part of the overall sub-push for shuffle metadata management as proposed in [this document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]]. The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. was: Part of the overall sub-push for shuffle metadata management as proposed in[ this document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]]. The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver. > Return map output metadata from shuffle writers > --- > > Key: SPARK-31798 > URL: https://issues.apache.org/jira/browse/SPARK-31798 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > Part of the overall sub-push for shuffle metadata management as proposed in > [this > document|[https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]]. > The shuffle plugin system needs a means for storing custom metadata for > shuffle readers to eventually use. This task tracks the API changes required > for the map output writer to return some such metadata back to the driver. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-31798) Return map output metadata from shuffle writers
Matt Cheah created SPARK-31798: -- Summary: Return map output metadata from shuffle writers Key: SPARK-31798 URL: https://issues.apache.org/jira/browse/SPARK-31798 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah Part of the overall sub-push for shuffle metadata management as proposed in [this document|https://docs.google.com/document/d/1Aj6IyMsbS2sdIfHxLvIbHUNjHIWHTabfknIPoxOrTjk/edit]. The shuffle plugin system needs a means for storing custom metadata for shuffle readers to eventually use. This task tracks the API changes required for the map output writer to return some such metadata back to the driver.
[jira] [Updated] (SPARK-29072) Properly track shuffle write time with refactor
[ https://issues.apache.org/jira/browse/SPARK-29072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-29072: --- Description: From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle writer plugin API across all the shuffle writers. However, we accidentally lost time tracking metrics for shuffle writes in the process, particularly for UnsafeShuffleWriter when writing with streams (without transferTo), as well as the SortShuffleWriter. (was: From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle writer plugin API across all the shuffle writers. However, some mistakes were made in the process: * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer implementations * There was a duplicate implementation of ShufflePartitionPairsWriter introduced This issue tracks rectifying both situations; the second is related to the first.) > Properly track shuffle write time with refactor > --- > > Key: SPARK-29072 > URL: https://issues.apache.org/jira/browse/SPARK-29072 > Project: Spark > Issue Type: Bug > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle > writer plugin API across all the shuffle writers. However, we accidentally > lost time tracking metrics for shuffle writes in the process, particularly > for UnsafeShuffleWriter when writing with streams (without transferTo), as > well as the SortShuffleWriter. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-29072) Properly track shuffle write time with refactor
[ https://issues.apache.org/jira/browse/SPARK-29072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-29072: --- Description: From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle writer plugin API across all the shuffle writers. However, some mistakes were made in the process: * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer implementations * There was a duplicate implementation of ShufflePartitionPairsWriter introduced This issue tracks rectifying both situations; the second is related to the first. was: From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle writer plugin API across all the shuffle writers. However, some mistakes were made in the process: * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer implementations * There was a duplicate implementation of ShufflePartitionPairsWriter introduced > Properly track shuffle write time with refactor > --- > > Key: SPARK-29072 > URL: https://issues.apache.org/jira/browse/SPARK-29072 > Project: Spark > Issue Type: Bug > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle > writer plugin API across all the shuffle writers. However, some mistakes were > made in the process: > * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer > implementations > * There was a duplicate implementation of ShufflePartitionPairsWriter > introduced > This issue tracks rectifying both situations; the second is related to the > first.
[jira] [Updated] (SPARK-29072) Properly track shuffle write time with refactor
[ https://issues.apache.org/jira/browse/SPARK-29072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-29072: --- Description: From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle writer plugin API across all the shuffle writers. However, some mistakes were made in the process: * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer implementations * There was a duplicate implementation of ShufflePartitionPairsWriter introduced > Properly track shuffle write time with refactor > --- > > Key: SPARK-29072 > URL: https://issues.apache.org/jira/browse/SPARK-29072 > Project: Spark > Issue Type: Bug > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > From SPARK-28209, SPARK-28570, and SPARK-28571, we used the new shuffle > writer plugin API across all the shuffle writers. However, some mistakes were > made in the process: > * We lost shuffle write time metrics for the Unsafe and Sort shuffle writer > implementations > * There was a duplicate implementation of ShufflePartitionPairsWriter > introduced
[jira] [Created] (SPARK-29072) Properly track shuffle write time with refactor
Matt Cheah created SPARK-29072: -- Summary: Properly track shuffle write time with refactor Key: SPARK-29072 URL: https://issues.apache.org/jira/browse/SPARK-29072 Project: Spark Issue Type: Bug Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah
[jira] [Created] (SPARK-28764) Remove unnecessary writePartitionedFile method from ExternalSorter
Matt Cheah created SPARK-28764: -- Summary: Remove unnecessary writePartitionedFile method from ExternalSorter Key: SPARK-28764 URL: https://issues.apache.org/jira/browse/SPARK-28764 Project: Spark Issue Type: Task Components: Shuffle, Tests Affects Versions: 3.0.0 Reporter: Matt Cheah Following SPARK-28571, we now use {{ExternalSorter#writePartitionedData}} in {{SortShuffleWriter}} when persisting the shuffle data via the shuffle writer plugin. However, we left the {{writePartitionedFile}} method on {{ExternalSorter}} strictly for tests. We should figure out how to refactor those tests to use {{writePartitionedData}} instead of {{writePartitionedFile}}.
[jira] [Updated] (SPARK-28607) Don't hold a reference to two partitionLengths arrays
[ https://issues.apache.org/jira/browse/SPARK-28607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-28607: --- Description: SPARK-28209 introduced the new shuffle writer API and its usage in BypassMergeSortShuffleWriter. However, the design of the API forces the partition lengths to be tracked both in the implementation of the plugin and also by the higher-level writer. This leads to redundant memory usage. We should only track the lengths of the partitions in the implementation of the plugin and propagate this information back up to the writer as the return value of {{commitAllPartitions}}. > Don't hold a reference to two partitionLengths arrays > - > > Key: SPARK-28607 > URL: https://issues.apache.org/jira/browse/SPARK-28607 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Priority: Major > > SPARK-28209 introduced the new shuffle writer API and its usage in > BypassMergeSortShuffleWriter. However, the design of the API forces the > partition lengths to be tracked both in the implementation of the plugin and > also by the higher-level writer. This leads to redundant memory usage. We > should only track the lengths of the partitions in the implementation of the > plugin and propagate this information back up to the writer as the return > value of {{commitAllPartitions}}.
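The proposed ownership model can be sketched with a toy writer: the plugin implementation owns the single partitionLengths array and hands it back from commitAllPartitions, so the higher-level writer never keeps a second copy. The class and method names below are simplified stand-ins for the real API:

```java
// Toy sketch of the proposed fix for SPARK-28607: exactly one long[] lives in
// the plugin implementation, and the caller receives it only on commit.
final class SketchMapOutputWriter {
    private final long[] partitionLengths;

    SketchMapOutputWriter(int numPartitions) {
        this.partitionLengths = new long[numPartitions];
    }

    // Per-partition writers report their byte counts into the shared array.
    void recordBytesWritten(int partitionId, long numBytes) {
        partitionLengths[partitionId] += numBytes;
    }

    // Returning the array propagates it up to the writer; no duplicate
    // tracking is needed on the caller's side.
    long[] commitAllPartitions() {
        return partitionLengths;
    }
}
```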
[jira] [Created] (SPARK-28607) Don't hold a reference to two partitionLengths arrays
Matt Cheah created SPARK-28607: -- Summary: Don't hold a reference to two partitionLengths arrays Key: SPARK-28607 URL: https://issues.apache.org/jira/browse/SPARK-28607 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah
[jira] [Updated] (SPARK-28209) Shuffle Storage API: Writer API and usage in BypassMergeSortShuffleWriter
[ https://issues.apache.org/jira/browse/SPARK-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-28209: --- Description: Adds the write-side API for storing shuffle data in arbitrary storage systems. Also refactor the BypassMergeSortShuffleWriter so that it uses this API, demonstrating the API's viability. (was: Adds the write-side API for storing shuffle data in arbitrary storage systems. Also refactor the existing shuffle write code so that it uses this API.) > Shuffle Storage API: Writer API and usage in BypassMergeSortShuffleWriter > - > > Key: SPARK-28209 > URL: https://issues.apache.org/jira/browse/SPARK-28209 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Assignee: Matt Cheah >Priority: Major > Fix For: 3.0.0 > > > Adds the write-side API for storing shuffle data in arbitrary storage > systems. Also refactor the BypassMergeSortShuffleWriter so that it uses this > API, demonstrating the API's viability.
[jira] [Created] (SPARK-28571) Shuffle storage API: Use API in SortShuffleWriter
Matt Cheah created SPARK-28571: -- Summary: Shuffle storage API: Use API in SortShuffleWriter Key: SPARK-28571 URL: https://issues.apache.org/jira/browse/SPARK-28571 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah Use the APIs introduced in SPARK-28209 in the SortShuffleWriter.
[jira] [Created] (SPARK-28570) Shuffle Storage API: Use writer API in UnsafeShuffleWriter
Matt Cheah created SPARK-28570: -- Summary: Shuffle Storage API: Use writer API in UnsafeShuffleWriter Key: SPARK-28570 URL: https://issues.apache.org/jira/browse/SPARK-28570 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah Use the APIs introduced in SPARK-28209 in the UnsafeShuffleWriter.
[jira] [Updated] (SPARK-28209) Shuffle Storage API: Writer API and usage in BypassMergeSortShuffleWriter
[ https://issues.apache.org/jira/browse/SPARK-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-28209: --- Summary: Shuffle Storage API: Writer API and usage in BypassMergeSortShuffleWriter (was: Shuffle Storage API: Writer API) > Shuffle Storage API: Writer API and usage in BypassMergeSortShuffleWriter > - > > Key: SPARK-28209 > URL: https://issues.apache.org/jira/browse/SPARK-28209 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Assignee: Matt Cheah >Priority: Major > Fix For: 3.0.0 > > > Adds the write-side API for storing shuffle data in arbitrary storage > systems. Also refactor the existing shuffle write code so that it uses this > API.
[jira] [Updated] (SPARK-28209) Shuffle Storage API: Writer API
[ https://issues.apache.org/jira/browse/SPARK-28209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-28209: --- Summary: Shuffle Storage API: Writer API (was: Shuffle Storage API: Writes) > Shuffle Storage API: Writer API > --- > > Key: SPARK-28209 > URL: https://issues.apache.org/jira/browse/SPARK-28209 > Project: Spark > Issue Type: Sub-task > Components: Shuffle >Affects Versions: 3.0.0 >Reporter: Matt Cheah >Assignee: Matt Cheah >Priority: Major > Fix For: 3.0.0 > > > Adds the write-side API for storing shuffle data in arbitrary storage > systems. Also refactor the existing shuffle write code so that it uses this > API.
[jira] [Created] (SPARK-28568) Make Javadoc in org.apache.spark.shuffle.api visible
Matt Cheah created SPARK-28568: -- Summary: Make Javadoc in org.apache.spark.shuffle.api visible Key: SPARK-28568 URL: https://issues.apache.org/jira/browse/SPARK-28568 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah
[jira] [Created] (SPARK-28238) DESCRIBE TABLE for Data Source V2 tables
Matt Cheah created SPARK-28238: -- Summary: DESCRIBE TABLE for Data Source V2 tables Key: SPARK-28238 URL: https://issues.apache.org/jira/browse/SPARK-28238 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 3.0.0 Reporter: Matt Cheah Implement the {{DESCRIBE TABLE}} DDL command for tables that are loaded by V2 catalogs.
[jira] [Created] (SPARK-28212) Shuffle Storage API: Shuffle Cleanup
Matt Cheah created SPARK-28212: -- Summary: Shuffle Storage API: Shuffle Cleanup Key: SPARK-28212 URL: https://issues.apache.org/jira/browse/SPARK-28212 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah In the shuffle storage API, there should be a plugin point for removing shuffles that are no longer used.
[jira] [Created] (SPARK-28211) Shuffle Storage API: Driver Lifecycle
Matt Cheah created SPARK-28211: -- Summary: Shuffle Storage API: Driver Lifecycle Key: SPARK-28211 URL: https://issues.apache.org/jira/browse/SPARK-28211 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah As part of the shuffle storage API, allow users to hook in application-wide startup and shutdown methods. These hooks can do things like create tables in the shuffle storage database, or register and unregister against file servers.
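The shape of such application-wide lifecycle hooks might resemble the following sketch. The interface and class names are hypothetical, not the actual API introduced by this ticket:

```java
// Hypothetical driver-side lifecycle hooks for a shuffle storage plugin.
interface SketchShuffleDriverComponents {
    // Called once when the application starts, e.g. to create tables in the
    // shuffle storage database or register against a file server.
    void initializeApplication();

    // Called once at shutdown, e.g. to unregister or drop per-app state.
    void cleanupApplication();
}

// A no-frills implementation that just records whether each hook ran, the
// kind of stub a plugin author might start from.
final class RecordingDriverComponents implements SketchShuffleDriverComponents {
    boolean initialized = false;
    boolean cleanedUp = false;

    @Override public void initializeApplication() { initialized = true; }
    @Override public void cleanupApplication() { cleanedUp = true; }
}
```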
[jira] [Created] (SPARK-28210) Shuffle Storage API: Reads
Matt Cheah created SPARK-28210: -- Summary: Shuffle Storage API: Reads Key: SPARK-28210 URL: https://issues.apache.org/jira/browse/SPARK-28210 Project: Spark Issue Type: Sub-task Components: Shuffle Affects Versions: 3.0.0 Reporter: Matt Cheah As part of the effort to store shuffle data in arbitrary places, this issue tracks implementing an API for reading the shuffle data stored by the write API. Also ensure that the existing shuffle implementation is refactored to use the API.
[jira] [Updated] (SPARK-25299) Use remote storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-25299: --- Description: In Spark, the shuffle primitive requires Spark executors to persist data to the local disk of the worker nodes. If executors crash, the external shuffle service can continue to serve the shuffle data that was written beyond the lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the external shuffle service is deployed on every worker node. The shuffle service shares local disk with the executors that run on its node. There are some shortcomings with the way shuffle is fundamentally implemented right now. Particularly: * If any external shuffle service process or node becomes unavailable, all applications that had an executor that ran on that node must recompute the shuffle blocks that were lost. * Similarly to the above, the external shuffle service must be kept running at all times, which may waste resources when no applications are using that shuffle service node. * Mounting local storage can prevent users from taking advantage of desirable isolation benefits from using containerized environments, like Kubernetes. We had an external shuffle service implementation in an early prototype of the Kubernetes backend, but it was rejected due to its strict requirement to be able to mount hostPath volumes or other persistent volume setups. In the following [architecture discussion document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] (note: _not_ an SPIP), we brainstorm various high level architectures for improving the external shuffle service in a way that addresses the above problems. The purpose of this umbrella JIRA is to promote additional discussion on how we can approach these problems, both at the architecture level and the implementation level. 
We anticipate filing sub-issues that break down the tasks that must be completed to achieve this goal. Edit June 28 2019: Our SPIP is here: [https://docs.google.com/document/d/1d6egnL6WHOwWZe8MWv3m8n4PToNacdx7n_0iMSWwhCQ/edit] was: In Spark, the shuffle primitive requires Spark executors to persist data to the local disk of the worker nodes. If executors crash, the external shuffle service can continue to serve the shuffle data that was written beyond the lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the external shuffle service is deployed on every worker node. The shuffle service shares local disk with the executors that run on its node. There are some shortcomings with the way shuffle is fundamentally implemented right now. Particularly: * If any external shuffle service process or node becomes unavailable, all applications that had an executor that ran on that node must recompute the shuffle blocks that were lost. * Similarly to the above, the external shuffle service must be kept running at all times, which may waste resources when no applications are using that shuffle service node. * Mounting local storage can prevent users from taking advantage of desirable isolation benefits from using containerized environments, like Kubernetes. We had an external shuffle service implementation in an early prototype of the Kubernetes backend, but it was rejected due to its strict requirement to be able to mount hostPath volumes or other persistent volume setups. In the following [architecture discussion document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] (note: _not_ an SPIP), we brainstorm various high level architectures for improving the external shuffle service in a way that addresses the above problems. The purpose of this umbrella JIRA is to promote additional discussion on how we can approach these problems, both at the architecture level and the implementation level. 
We anticipate filing sub-issues that break down the tasks that must be completed to achieve this goal. > Use remote storage for persisting shuffle data > -- > > Key: SPARK-25299 > URL: https://issues.apache.org/jira/browse/SPARK-25299 > Project: Spark > Issue Type: New Feature > Components: Shuffle >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > Labels: SPIP > > In Spark, the shuffle primitive requires Spark executors to persist data to > the local disk of the worker nodes. If executors crash, the external shuffle > service can continue to serve the shuffle data that was written beyond the > lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the > external shuffle service is deployed on every worker node. The shuffle > service shares local disk with the executors that run on
[jira] [Commented] (SPARK-25299) Use remote storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875106#comment-16875106 ] Matt Cheah commented on SPARK-25299:

I also noticed the SPIP document wasn't ever posted on this ticket, so sorry about that! Here's the link for everyone who wasn't following along on the mailing list: [https://docs.google.com/document/d/1d6egnL6WHOwWZe8MWv3m8n4PToNacdx7n_0iMSWwhCQ/edit]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28209) Shuffle Storage API: Writes
Matt Cheah created SPARK-28209:
--
Summary: Shuffle Storage API: Writes
Key: SPARK-28209
URL: https://issues.apache.org/jira/browse/SPARK-28209
Project: Spark
Issue Type: Sub-task
Components: Shuffle
Affects Versions: 3.0.0
Reporter: Matt Cheah

Adds the write-side API for storing shuffle data in arbitrary storage systems, and refactors the existing shuffle write code to use this API.
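As a rough illustration of what a write-side storage API enables, here is a minimal sketch in Java. The names (`ShuffleMapWriter`, `InMemoryMapWriter`, `openPartition`, `commit`) are invented for this example and are not the API actually proposed for Spark; the real proposal is in the SPIP and the patches linked from this ticket's siblings. The idea is simply that the writer hands out one output stream per reduce partition, and the backend (local disk, HDFS, an object store, or here just memory) decides where the bytes land.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical write-side contract: one writer per map task. */
interface ShuffleMapWriter {
    OutputStream openPartition(int reduceId);
    long commit(); // finalize the map output, return total bytes written
}

/** In-memory stand-in for "arbitrary storage"; a real plugin might target S3 or HDFS. */
final class InMemoryMapWriter implements ShuffleMapWriter {
    final Map<Integer, ByteArrayOutputStream> partitions = new HashMap<>();

    @Override
    public ByteArrayOutputStream openPartition(int reduceId) {
        // Reuse the same buffer if the partition is opened twice.
        return partitions.computeIfAbsent(reduceId, r -> new ByteArrayOutputStream());
    }

    @Override
    public long commit() {
        return partitions.values().stream().mapToLong(ByteArrayOutputStream::size).sum();
    }
}

public class ShuffleWriteSketch {
    public static void main(String[] args) {
        InMemoryMapWriter writer = new InMemoryMapWriter();
        writer.openPartition(0).writeBytes(new byte[]{1, 2, 3});
        writer.openPartition(1).writeBytes(new byte[]{4, 5});
        System.out.println(writer.commit()); // prints 5
    }
}
```

The point of the indirection is that Spark's shuffle write path would talk only to the interface, so swapping storage backends needs no change to the shuffle machinery itself.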
[jira] [Commented] (SPARK-25299) Use remote storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875101#comment-16875101 ] Matt Cheah commented on SPARK-25299:

Let's start by making sub-issues. I have a patch staged for master that I intend to post by end of day.
[jira] [Resolved] (SPARK-26874) With PARQUET-1414, Spark can erroneously write empty pages
[ https://issues.apache.org/jira/browse/SPARK-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-26874.
Resolution: Not A Problem

I did some more digging with [~rdblue] and we discovered that the root problem is on the Parquet side, specifically in a feature that hasn't been released yet. We'll continue pursuing solutions there.

> With PARQUET-1414, Spark can erroneously write empty pages
> --
>
> Key: SPARK-26874
> URL: https://issues.apache.org/jira/browse/SPARK-26874
> Project: Spark
> Issue Type: Bug
> Components: Input/Output, SQL
> Affects Versions: 2.4.0
> Reporter: Matt Cheah
> Priority: Major
>
> This issue will only come up when Spark upgrades its Parquet dependency to the latest off of parquet-mr/master. This issue is being filed to proactively fix the bug before we upgrade - it's not something that would easily be found in the current unit tests, and could be missed until the community runs scale tests in, e.g., an RC phase.
> Parquet introduced a new feature to limit the number of rows written to a page in a column chunk - see PARQUET-1414. Previously, Parquet would only flush pages to the column store after the page writer had filled its buffer with a certain amount of bytes. The idea of the Parquet patch was to make page writers flush to the column store upon the writer being given a certain number of rows - the default value is 2.
> The patch makes the Spark Parquet Data Source erroneously write empty pages to column chunks, making the Parquet file ultimately unreadable with exceptions like these:
>
> {code:java}
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file file:/private/var/folders/7f/rrg37sj15r33cwlxbhg7fyg5080xxg/T/spark-2c1b1e5d-c132-42fe-98e1-b523b9baa176/parquet-data/part-2-b8b4bb3c-b49f-440b-b96c-53fcd19786ad-c000.snappy.parquet
> at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
> at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
> at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:101)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:181)
> ... 18 more
> Caused by: java.lang.IllegalArgumentException: Reading past RLE/BitPacking stream.
> at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:53)
> at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readNext(RunLengthBitPackingHybridDecoder.java:80)
> at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readInt(RunLengthBitPackingHybridDecoder.java:62)
> at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridValuesReader.readInteger(RunLengthBitPackingHybridValuesReader.java:53)
> at org.apache.parquet.column.impl.ColumnReaderBase$ValuesReaderIntIterator.nextInt(ColumnReaderBase.java:733)
> at org.apache.parquet.column.impl.ColumnReaderBase.checkRead(ColumnReaderBase.java:567)
> at org.apache.parquet.column.impl.ColumnReaderBase.consume(ColumnReaderBase.java:705)
> at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:30)
> at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:47)
> at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:84)
> at org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271)
> at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147)
> at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109)
> at org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:165)
> at org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109)
> at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
> at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:222)
> ... 22 more
> {code}
> What's happening here is that the reader is being given a page with no values, which Parquet can never handle.
> The root cause is the way Spark treats empty (null) records in optional fields.
> Concretely, in {{ParquetWriteSupport#write}}, we always indicate to the recordConsumer that we are starting a message ({{recordConsumer#startMessage}}). If Spark is given a null field in the row, it still indicates to the record consumer, after having ignored the row, that the message is finished ({{recordConsumer#endMessage}}). The ending of the message causes all column writers to increment their row count in the current page by 1, despite the fact that Spark is not necessarily sending records to the underlying page writer. Now suppose the page maximum row count is N; if Spark does the above N times in a page - particularly if Spark cuts a page boundary and is subsequently given N empty values for an optional column - the column writer will then think it needs to flush the page to the column chunk store, and will write out an empty page. This will most likely be manifested in very sparse columns.
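To make the accounting failure concrete, here is a toy simulation written for this writeup - it is plain Java, not parquet-mr code, and all names (`ToyColumnWriter`, `writeValue`, `endMessage`) are invented. It models a column page writer that increments its row count on every {{endMessage}} but only receives values for non-null fields, so N consecutive null rows flush a page containing zero values:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a column page writer with a row-count flush limit, as in PARQUET-1414. */
class ToyColumnWriter {
    final int pageRowCountLimit;
    int rowsInPage = 0;
    int valuesInPage = 0;
    final List<int[]> flushedPages = new ArrayList<>(); // each entry: {rows, values}

    ToyColumnWriter(int pageRowCountLimit) {
        this.pageRowCountLimit = pageRowCountLimit;
    }

    void writeValue() {
        valuesInPage++; // only invoked for non-null fields
    }

    void endMessage() {
        rowsInPage++; // invoked for EVERY row, null or not
        if (rowsInPage >= pageRowCountLimit) {
            flushPage();
        }
    }

    void flushPage() {
        flushedPages.add(new int[]{rowsInPage, valuesInPage});
        rowsInPage = 0;
        valuesInPage = 0;
    }
}

public class EmptyPageDemo {
    public static void main(String[] args) {
        ToyColumnWriter writer = new ToyColumnWriter(3); // N = 3
        // Three non-null rows fill and flush the first page normally...
        for (int i = 0; i < 3; i++) { writer.writeValue(); writer.endMessage(); }
        // ...then N consecutive null rows: endMessage still counts them,
        // so the second flushed page contains zero values.
        for (int i = 0; i < 3; i++) { writer.endMessage(); }
        int[] secondPage = writer.flushedPages.get(1);
        System.out.println("rows=" + secondPage[0] + " values=" + secondPage[1]); // rows=3 values=0
    }
}
```

An empty page like the second one flushed here is exactly what the reader chokes on in the stack trace above.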
[jira] [Updated] (SPARK-26874) With PARQUET-1414, Spark can erroneously write empty pages
[ https://issues.apache.org/jira/browse/SPARK-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26874:
[jira] [Updated] (SPARK-26874) When we upgrade Parquet to 1.11+, Spark can erroneously write empty pages
[ https://issues.apache.org/jira/browse/SPARK-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26874:
---
Priority: Critical (was: Major)
[jira] [Updated] (SPARK-26874) When we upgrade Parquet to 1.11+, Spark can erroneously write empty pages
[ https://issues.apache.org/jira/browse/SPARK-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26874:
[jira] [Commented] (SPARK-26874) When we upgrade Parquet to 1.11+, Spark can erroneously write empty pages
[ https://issues.apache.org/jira/browse/SPARK-26874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767736#comment-16767736 ] Matt Cheah commented on SPARK-26874:

[~rdblue] [~cloud_fan] - was wondering if you had any thoughts here, or can possibly confirm my understanding of what's going on.
[jira] [Created] (SPARK-26874) When we upgrade Parquet to 1.11+, Spark can erroneously write empty pages
Matt Cheah created SPARK-26874:
--
Summary: When we upgrade Parquet to 1.11+, Spark can erroneously write empty pages
Key: SPARK-26874
URL: https://issues.apache.org/jira/browse/SPARK-26874
Project: Spark
Issue Type: Bug
Components: Input/Output, SQL
Affects Versions: 2.4.0
Reporter: Matt Cheah

This issue will only surface when Spark upgrades its Parquet dependency to the latest release. It is being filed proactively so that the bug is fixed before we upgrade - it is not something that would easily be caught by the current unit tests, and it could go unnoticed until the community scale-tests, e.g. during an RC phase.

Parquet introduced a new feature to limit the number of rows written to a page in a column chunk - see PARQUET-1414. Previously, Parquet only flushed pages to the column store once the page writer's buffer had filled with a certain number of bytes. The idea of the Parquet patch is to make page writers also flush to the column store after being given a certain number of rows - the default limit is 20,000. With this change, the Spark Parquet data source can erroneously write empty pages to column chunks, making the Parquet file ultimately unreadable, with exceptions like these:

{code:java}
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file file:/private/var/folders/7f/rrg37sj15r33cwlxbhg7fyg5080xxg/T/spark-2c1b1e5d-c132-42fe-98e1-b523b9baa176/parquet-data/part-2-b8b4bb3c-b49f-440b-b96c-53fcd19786ad-c000.snappy.parquet
	at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
	at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
	at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:101)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:181)
	... 18 more
Caused by: java.lang.IllegalArgumentException: Reading past RLE/BitPacking stream.
	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:53)
	at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readNext(RunLengthBitPackingHybridDecoder.java:80)
	at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readInt(RunLengthBitPackingHybridDecoder.java:62)
	at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridValuesReader.readInteger(RunLengthBitPackingHybridValuesReader.java:53)
	at org.apache.parquet.column.impl.ColumnReaderBase$ValuesReaderIntIterator.nextInt(ColumnReaderBase.java:733)
	at org.apache.parquet.column.impl.ColumnReaderBase.checkRead(ColumnReaderBase.java:567)
	at org.apache.parquet.column.impl.ColumnReaderBase.consume(ColumnReaderBase.java:705)
	at org.apache.parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:30)
	at org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:47)
	at org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:84)
	at org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271)
	at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147)
	at org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109)
	at org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:165)
	at org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109)
	at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:137)
	at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:222)
	... 22 more
{code}

What's happening here is that the reader is handed a page with no values, which Parquet can never handle. The root cause is the way Spark treats empty (null) records in optional fields.
Concretely, in {{ParquetWriteSupport#write}}, we always signal to the record consumer that a message is starting ({{recordConsumer#startMessage}}). Even when Spark is given a null field in the row, it still signals to the record consumer, after writing the row, that the message is finished ({{recordConsumer#endMessage}}). Ending the message causes every column writer to increment its row count for the current page by 1, even though Spark did not necessarily send any values to the underlying page writer. Now suppose the page's maximum row count is N. If Spark does the above N times within a single page - in particular, if Spark cuts a page boundary and is then given N consecutive empty values for an optional column - the column writer will conclude that it needs to flush the page to the column chunk store, and will write out an empty page.
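The interaction described above can be sketched with a toy model. This is a hypothetical, heavily simplified stand-in for Parquet's column/page writers (the class and method names below are made up, not the real Parquet APIs); it only illustrates how counting rows on every {{endMessage}}, regardless of whether a value was buffered, lets N consecutive null rows produce a page with zero values:

```python
class ToyColumnWriter:
    """Toy model: end_message() bumps the per-page row count even when
    no value was written for this column, mirroring the bug description."""

    def __init__(self, page_row_limit):
        self.page_row_limit = page_row_limit
        self.rows_in_page = 0
        self.values_in_page = 0
        self.flushed_pages = []  # number of values in each flushed page

    def write_value(self, value):
        # Only non-null values actually reach the page writer's buffer.
        if value is not None:
            self.values_in_page += 1

    def end_message(self):
        # Called once per row (recordConsumer#endMessage), null or not.
        self.rows_in_page += 1
        if self.rows_in_page >= self.page_row_limit:
            self._flush_page()

    def _flush_page(self):
        self.flushed_pages.append(self.values_in_page)
        self.rows_in_page = 0
        self.values_in_page = 0


writer = ToyColumnWriter(page_row_limit=3)
# Three real rows fill the first page; three null rows then fill - and
# flush - a second page containing no values at all.
for value in ["x", "y", "z", None, None, None]:
    writer.write_value(value)
    writer.end_message()

print(writer.flushed_pages)  # [3, 0] - the second page is empty
```

The second flushed page holds zero values, which is exactly the kind of page the reader stack trace above chokes on.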
[jira] [Resolved] (SPARK-26625) spark.redaction.regex should include oauthToken
[ https://issues.apache.org/jira/browse/SPARK-26625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-26625. Resolution: Fixed Fix Version/s: 3.0.0 > spark.redaction.regex should include oauthToken > --- > > Key: SPARK-26625 > URL: https://issues.apache.org/jira/browse/SPARK-26625 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Spark Core >Affects Versions: 2.4.0 >Reporter: Vinoo Ganesh >Priority: Critical > Fix For: 3.0.0 > > > The regex (spark.redaction.regex) that is used to decide which config > properties or environment settings are sensitive should also include > oauthToken to match spark.kubernetes.authenticate.submission.oauthToken -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
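The effect of the fix can be illustrated with a small sketch. The patterns below are simplified stand-ins, not Spark's exact default for {{spark.redaction.regex}}; the point is that a case-insensitive pattern covering "token" catches {{spark.kubernetes.authenticate.submission.oauthToken}} while a pattern limited to "secret" and "password" does not:

```python
import re

# Illustrative redaction patterns (not Spark's exact defaults).
OLD_PATTERN = re.compile(r"(?i)secret|password")
NEW_PATTERN = re.compile(r"(?i)secret|password|token")

def redact(conf, pattern):
    """Replace the value of any config key matching the pattern."""
    return {k: ("*********(redacted)" if pattern.search(k) else v)
            for k, v in conf.items()}

conf = {
    "spark.kubernetes.authenticate.submission.oauthToken": "abc123",
    "spark.executor.memory": "4g",
}

print(redact(conf, OLD_PATTERN))  # oauthToken value leaks through
print(redact(conf, NEW_PATTERN))  # oauthToken value is redacted
```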
[jira] [Resolved] (SPARK-25877) Put all feature-related code in the feature step itself
[ https://issues.apache.org/jira/browse/SPARK-25877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25877. Resolution: Fixed Fix Version/s: 3.0.0 > Put all feature-related code in the feature step itself > --- > > Key: SPARK-25877 > URL: https://issues.apache.org/jira/browse/SPARK-25877 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Marcelo Vanzin >Priority: Major > Fix For: 3.0.0 > > > This is a child task of SPARK-25874. It covers having all the code related to > features in the feature steps themselves, including logic about whether a > step should be applied or not. > Please refer to the parent bug for further details. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-26301) Consider switching from putting secret in environment variable directly to using secret reference
Matt Cheah created SPARK-26301: -- Summary: Consider switching from putting secret in environment variable directly to using secret reference Key: SPARK-26301 URL: https://issues.apache.org/jira/browse/SPARK-26301 Project: Spark Issue Type: New Feature Components: Kubernetes Affects Versions: 3.0.0 Reporter: Matt Cheah In SPARK-26194 we proposed using an environment variable that is loaded in the executor pod spec to share the generated SASL secret key between the driver and the executors. However in practice this is very difficult to secure. Most traditional Kubernetes deployments will handle permissions by allowing wide access to viewing pod specs but restricting access to view Kubernetes secrets. Now however any user that can view the pod spec can also view the contents of the SASL secrets. An example use case where this quickly breaks down is in the case where a systems administrator is allowed to look at pods that run user code in order to debug failing infrastructure, but the cluster administrator should not be able to view contents of secrets or other sensitive data from Spark applications run by their users. We propose modifying the existing solution to instead automatically create a Kubernetes Secret object containing the SASL encryption key, then using the [secret reference feature in Kubernetes|https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-environment-variables] to store the data in the environment variable without putting the secret data in the pod spec itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
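The difference between the two approaches can be sketched with pod-spec fragments represented as Python dicts (the env-var name and Secret name below are illustrative; the {{valueFrom}}/{{secretKeyRef}} field names follow the Kubernetes schema the linked documentation describes):

```python
import json

sasl_secret = "s3cr3t-sasl-key"

# Current approach: the secret value is embedded directly in the pod spec,
# so anyone who can view pod specs can read it.
env_inline = {"name": "SPARK_SASL_SECRET", "value": sasl_secret}

# Proposed approach: the pod spec only references a Kubernetes Secret
# object; viewing the pod spec does not reveal the value.
env_secret_ref = {
    "name": "SPARK_SASL_SECRET",
    "valueFrom": {
        "secretKeyRef": {"name": "spark-sasl", "key": "sasl-key"}
    },
}

# The sensitive value appears in the first spec but not the second.
print(sasl_secret in json.dumps(env_inline))      # True
print(sasl_secret in json.dumps(env_secret_ref))  # False
```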
[jira] [Resolved] (SPARK-26194) Support automatic spark.authenticate secret in Kubernetes backend
[ https://issues.apache.org/jira/browse/SPARK-26194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-26194. Resolution: Fixed Fix Version/s: 3.0.0 > Support automatic spark.authenticate secret in Kubernetes backend > - > > Key: SPARK-26194 > URL: https://issues.apache.org/jira/browse/SPARK-26194 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.0.0 >Reporter: Marcelo Vanzin >Priority: Major > Fix For: 3.0.0 > > > Currently k8s inherits the default behavior for {{spark.authenticate}}, which > is that the user must provide an auth secret. > k8s doesn't have that requirement and could instead generate its own unique > per-app secret, and propagate it to executors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
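Generating a unique per-app secret, rather than requiring the user to supply one, can be as simple as drawing from a cryptographically secure random source. A minimal sketch (function name is illustrative, not Spark's implementation):

```python
import secrets

def generate_auth_secret(num_bytes=32):
    """Generate a fresh per-application auth secret as a hex string."""
    return secrets.token_hex(num_bytes)

app_secret = generate_auth_secret()
print(len(app_secret))  # 64 hex characters for 32 random bytes
```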
[jira] [Commented] (SPARK-26239) Add configurable auth secret source in k8s backend
[ https://issues.apache.org/jira/browse/SPARK-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707962#comment-16707962 ] Matt Cheah commented on SPARK-26239: It could work in client mode but is less useful there overall because the user has to determine how to get ahold of that secret file. Nevertheless for cluster mode users that have secret file mounting systems for the driver and executors, it would be a great start. I can start building the code for this. > Add configurable auth secret source in k8s backend > -- > > Key: SPARK-26239 > URL: https://issues.apache.org/jira/browse/SPARK-26239 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 3.0.0 >Reporter: Marcelo Vanzin >Priority: Major > > This is a follow up to SPARK-26194, which aims to add auto-generated secrets > similar to the YARN backend. > There's a desire to support different ways to generate and propagate these > auth secrets (e.g. using things like Vault). Need to investigate: > - exposing configuration to support that > - changing SecurityManager so that it can delegate some of the > secret-handling logic to custom implementations > - figuring out whether this can also be used in client-mode, where the driver > is not created by the k8s backend in Spark. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
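The file-based option discussed above amounts to each process reading the secret from a path the user has mounted into its container. A minimal sketch, assuming a hypothetical config key (the key name below is made up for illustration):

```python
import os
import tempfile

def load_auth_secret(conf):
    """Read the auth secret from a file path named in the config.
    Driver and executors would each read their own mounted copy."""
    path = conf["spark.authenticate.secret.file"]  # hypothetical key
    with open(path, "rb") as f:
        return f.read().strip()

# Simulate a secret file mounted into the container.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"mounted-secret\n")
    secret_path = f.name

conf = {"spark.authenticate.secret.file": secret_path}
print(load_auth_secret(conf))  # b'mounted-secret'
os.unlink(secret_path)
```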
[jira] [Resolved] (SPARK-25876) Simplify configuration types in k8s backend
[ https://issues.apache.org/jira/browse/SPARK-25876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25876. Resolution: Fixed Fix Version/s: 3.0.0 > Simplify configuration types in k8s backend > --- > > Key: SPARK-25876 > URL: https://issues.apache.org/jira/browse/SPARK-25876 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Marcelo Vanzin >Priority: Major > Fix For: 3.0.0 > > > This is a child of SPARK-25874 to deal with the current issues with the > different configuration objects used in the k8s backend. Please refer to the > parent for further discussion of what this means. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-26239) Add configurable auth secret source in k8s backend
[ https://issues.apache.org/jira/browse/SPARK-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705273#comment-16705273 ] Matt Cheah edited comment on SPARK-26239 at 11/30/18 8:59 PM: -- Could we add a simple version that just points to file paths for the executor and driver to load, with the secret contents being inside? The user can decide how those files are mounted into the containers. was (Author: mcheah): Would a simple addition just to point to file paths for the executor and driver to load, with the secret contents being inside? The user can decide how those files are mounted into the containers.
> Add configurable auth secret source in k8s backend
[jira] [Commented] (SPARK-26239) Add configurable auth secret source in k8s backend
[ https://issues.apache.org/jira/browse/SPARK-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705273#comment-16705273 ] Matt Cheah commented on SPARK-26239: Would a simple addition just to point to file paths for the executor and driver to load, with the secret contents being inside? The user can decide how those files are mounted into the containers.
> Add configurable auth secret source in k8s backend
[jira] [Resolved] (SPARK-25957) Skip building spark-r docker image if spark distribution does not have R support
[ https://issues.apache.org/jira/browse/SPARK-25957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25957. Resolution: Fixed Fix Version/s: 3.0.0 > Skip building spark-r docker image if spark distribution does not have R > support > > > Key: SPARK-25957 > URL: https://issues.apache.org/jira/browse/SPARK-25957 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Nagaram Prasad Addepally >Priority: Major > Fix For: 3.0.0 > > > [docker-image-tool.sh|https://github.com/apache/spark/blob/master/bin/docker-image-tool.sh] > script by default tries to build spark-r image. We may not always build > spark distribution with R support. It would be good to skip building and > publishing spark-r images if R support is not available in the spark > distribution. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26078: --- Labels: Correctness (was: )
> WHERE .. IN fails to filter rows when used in combination with UNION
>
> Key: SPARK-26078
> URL: https://issues.apache.org/jira/browse/SPARK-26078
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.1, 2.4.0
> Reporter: Arttu Voutilainen
> Priority: Blocker
> Labels: correctness
>
> Hey,
> We encountered a case where Spark SQL does not handle WHERE .. IN correctly when used in combination with UNION: it also returns rows that do not satisfy the condition. Swapping the order of the datasets in the UNION makes the problem go away. Repro below:
>
> {code}
> sql = SQLContext(sc)
> a = spark.createDataFrame([{'id': 'a', 'num': 2}, {'id': 'b', 'num': 1}])
> b = spark.createDataFrame([{'id': 'a', 'num': 2}, {'id': 'b', 'num': 1}])
> a.registerTempTable('a')
> b.registerTempTable('b')
> bug = sql.sql("""
>     SELECT id, num, source FROM
>     (
>         SELECT id, num, 'a' as source FROM a
>         UNION ALL
>         SELECT id, num, 'b' as source FROM b
>     ) AS c
>     WHERE c.id IN (SELECT id FROM b WHERE num = 2)
> """)
> no_bug = sql.sql("""
>     SELECT id, num, source FROM
>     (
>         SELECT id, num, 'b' as source FROM b
>         UNION ALL
>         SELECT id, num, 'a' as source FROM a
>     ) AS c
>     WHERE c.id IN (SELECT id FROM b WHERE num = 2)
> """)
> bug.show()
> no_bug.show()
> bug.explain(True)
> no_bug.explain(True)
> {code}
> This results in one extra row in the "bug" DF, coming from DF "b", that should not be there, as its id does not appear in the subquery result:
> {code:java}
> >>> bug.show()
> +---+---+------+
> | id|num|source|
> +---+---+------+
> |  a|  2|     a|
> |  a|  2|     b|
> |  b|  1|     b|
> +---+---+------+
> >>> no_bug.show()
> +---+---+------+
> | id|num|source|
> +---+---+------+
> |  a|  2|     b|
> |  a|  2|     a|
> +---+---+------+
> {code}
> The reason can be seen in the query plans:
> {code:java}
> >>> bug.explain(True)
> ...
> == Optimized Logical Plan ==
> Union
> :- Project [id#0, num#1L, a AS source#136]
> :  +- Join LeftSemi, (id#0 = id#4)
> :     :- LogicalRDD [id#0, num#1L], false
> :     +- Project [id#4]
> :        +- Filter (isnotnull(num#5L) && (num#5L = 2))
> :           +- LogicalRDD [id#4, num#5L], false
> +- Join LeftSemi, (id#4#172 = id#4#172)
>    :- Project [id#4, num#5L, b AS source#137]
>    :  +- LogicalRDD [id#4, num#5L], false
>    +- Project [id#4 AS id#4#172]
>       +- Filter (isnotnull(num#5L) && (num#5L = 2))
>          +- LogicalRDD [id#4, num#5L], false
> {code}
> Note the line *+- Join LeftSemi, (id#4#172 = id#4#172)* - this condition looks wrong, and I believe it causes the LeftSemi to return true for every row in the left-hand-side table, so it fails to filter as the WHERE .. IN should. Compare with the non-buggy version, where both LeftSemi joins have distinct attribute ids on both sides:
> {code:java}
> >>> no_bug.explain()
> ...
> == Optimized Logical Plan ==
> Union
> :- Project [id#4, num#5L, b AS source#142]
> :  +- Join LeftSemi, (id#4 = id#4#173)
> :     :- LogicalRDD [id#4, num#5L], false
> :     +- Project [id#4 AS id#4#173]
> :        +- Filter (isnotnull(num#5L) && (num#5L = 2))
> :           +- LogicalRDD [id#4, num#5L], false
> +- Project [id#0, num#1L, a AS source#143]
>    +- Join LeftSemi, (id#0 = id#4#173)
>       :- LogicalRDD [id#0, num#1L], false
>       +- Project [id#4 AS id#4#173]
>          +- Filter (isnotnull(num#5L) && (num#5L = 2))
>             +- LogicalRDD [id#4, num#5L], false
> {code}
>
> Best,
> -Arttu
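The effect of the trivially-true join condition can be illustrated with a toy model in plain Python (not Spark): when a semi-join condition compares an attribute to itself, it holds for every row, so nothing is filtered out.

```python
# Rows of the UNION branch that went through the buggy LeftSemi join:
# (id, num, source)
left = [("a", 2, "b"), ("b", 1, "b")]
right_ids = {"a"}  # result of: SELECT id FROM b WHERE num = 2

# Correct semi-join condition: left.id must match an id from the subquery.
correct = [row for row in left if row[0] in right_ids]

# Buggy condition (id#4#172 = id#4#172): it compares a column to itself,
# which is true for every (non-null) row, so every left row survives.
buggy = [row for row in left if any(rid == rid for rid in right_ids)]

print(correct)  # [('a', 2, 'b')]
print(buggy)    # [('a', 2, 'b'), ('b', 1, 'b')] - the spurious row remains
```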
[jira] [Updated] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26078: --- Labels: Correctness correctness (was: Correctness)
> WHERE .. IN fails to filter rows when used in combination with UNION
[jira] [Updated] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26078: --- Target Version/s: 2.4.1, 3.0.0
> WHERE .. IN fails to filter rows when used in combination with UNION
[jira] [Updated] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26078: --- Labels: correctness (was: Correctness correctness)
> WHERE .. IN fails to filter rows when used in combination with UNION
[jira] [Updated] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-26078: --- Priority: Blocker (was: Major)
> WHERE .. IN fails to filter rows when used in combination with UNION
[jira] [Commented] (SPARK-26078) WHERE .. IN fails to filter rows when used in combination with UNION
[ https://issues.apache.org/jira/browse/SPARK-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688460#comment-16688460 ] Matt Cheah commented on SPARK-26078: In line with other correctness issues we have seen lately, and with the related discussions around them, I'm elevating this to a blocker. [~cloud_fan] [~r...@databricks.com]. We should look to get this into 2.4.1 and 3.0.
> WHERE .. IN fails to filter rows when used in combination with UNION
>
> Key: SPARK-26078
> URL: https://issues.apache.org/jira/browse/SPARK-26078
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.1, 2.4.0
> Reporter: Arttu Voutilainen
> Priority: Major
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25821) Remove SQLContext methods deprecated as of Spark 1.4
[ https://issues.apache.org/jira/browse/SPARK-25821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685705#comment-16685705 ] Matt Cheah commented on SPARK-25821: Should we call this out in release notes? It seems pretty obscure and it's not clear if users would ever have called these APIs, but it might be worth a bullet point or two. > Remove SQLContext methods deprecated as of Spark 1.4 > > > Key: SPARK-25821 > URL: https://issues.apache.org/jira/browse/SPARK-25821 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 2.4.0 >Reporter: Sean Owen >Assignee: Sean Owen >Priority: Major > Fix For: 3.0.0 > > > There are several SQLContext methods that have been deprecate since Spark > 1.4, like: > {code:java} > @deprecated("Use read.parquet() instead.", "1.4.0") > @scala.annotation.varargs > def parquetFile(paths: String*): DataFrame = { > if (paths.isEmpty) { > emptyDataFrame > } else { > read.parquet(paths : _*) > } > }{code} > Let's remove them in Spark 3. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-25862) Remove rangeBetween APIs introduced in SPARK-21608
[ https://issues.apache.org/jira/browse/SPARK-25862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-25862: --- Labels: release-notes (was: ) > Remove rangeBetween APIs introduced in SPARK-21608 > -- > > Key: SPARK-25862 > URL: https://issues.apache.org/jira/browse/SPARK-25862 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.4.0 >Reporter: Reynold Xin >Assignee: Reynold Xin >Priority: Major > Labels: release-notes > Fix For: 3.0.0 > > > As a follow up to https://issues.apache.org/jira/browse/SPARK-25842, removing > the API so we can introduce a new one. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25862) Remove rangeBetween APIs introduced in SPARK-21608
[ https://issues.apache.org/jira/browse/SPARK-25862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685704#comment-16685704 ] Matt Cheah commented on SPARK-25862: Adding release-notes label because this seems like a breaking change. > Remove rangeBetween APIs introduced in SPARK-21608 > -- > > Key: SPARK-25862 > URL: https://issues.apache.org/jira/browse/SPARK-25862 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 2.4.0 >Reporter: Reynold Xin >Assignee: Reynold Xin >Priority: Major > Labels: release-notes > Fix For: 3.0.0 > > > As a follow up to https://issues.apache.org/jira/browse/SPARK-25842, removing > the API so we can introduce a new one. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-25908) Remove old deprecated items in Spark 3
[ https://issues.apache.org/jira/browse/SPARK-25908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-25908: --- Labels: release-notes (was: ) > Remove old deprecated items in Spark 3 > -- > > Key: SPARK-25908 > URL: https://issues.apache.org/jira/browse/SPARK-25908 > Project: Spark > Issue Type: Task > Components: Spark Core, SQL >Affects Versions: 3.0.0 >Reporter: Sean Owen >Assignee: Sean Owen >Priority: Major > Labels: release-notes > Fix For: 3.0.0 > > > There are many deprecated methods and classes in Spark. They _can_ be removed > in Spark 3, and for those that have been deprecated a long time (i.e. since > Spark <= 2.0), we should probably do so. This addresses most of these cases, > the easiest ones, those that are easy to remove and are old: > - Remove some AccumulableInfo .apply() methods > - Remove non-label-specific multiclass precision/recall/fScore in favor of > accuracy > - Remove toDegrees/toRadians in favor of degrees/radians (SparkR: only > deprecated) > - Remove approxCountDistinct in favor of approx_count_distinct (SparkR: only > deprecated) > - Remove unused Python StorageLevel constants > - Remove Dataset unionAll in favor of union > - Remove unused multiclass option in libsvm parsing > - Remove references to deprecated spark configs like spark.yarn.am.port > - Remove TaskContext.isRunningLocally > - Remove ShuffleMetrics.shuffle* methods > - Remove BaseReadWrite.context in favor of session > - Remove Column.!== in favor of =!= > - Remove Dataset.explode > - Remove Dataset.registerTempTable > - Remove SQLContext.getOrCreate, setActive, clearActive, constructors > Not touched yet: > - everything else in MLLib > - HiveContext > - Anything deprecated more recently than 2.0.0, generally -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24834) Utils#nanSafeCompare{Double,Float} functions do not differ from normal java double/float comparison
[ https://issues.apache.org/jira/browse/SPARK-24834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680170#comment-16680170 ] Matt Cheah commented on SPARK-24834: [~srowen] - I know this is an old ticket, but I wanted to propose re-opening it and addressing it for Spark 3.0. My understanding is that this behavior is also inconsistent with other SQL systems such as MySQL and Postgres. Even though this would be a behavioral change, one could argue it is a correctness issue, given what users would expect based on the behavior of those other systems. Would it be reasonable to make the behavior change for Spark 3.0 and call it out in the release notes?
> Utils#nanSafeCompare{Double,Float} functions do not differ from normal java double/float comparison
>
> Key: SPARK-24834
> URL: https://issues.apache.org/jira/browse/SPARK-24834
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.3.2
> Reporter: Benjamin Duffield
> Priority: Minor
>
> Utils.scala contains two functions `nanSafeCompareDoubles` and `nanSafeCompareFloats` which purport to have special handling of NaN values in comparisons.
> The handling in these functions does not appear to differ from java.lang.Double.compare and java.lang.Float.compare - they seem to produce identical output to the built-in Java comparison functions.
> I think it's clearer to not have these special Utils functions, and instead just use the standard Java comparison functions.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
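For reference, `java.lang.Double.compare` already defines a total ordering in which NaN compares equal to itself and greater than every other value, which is the behavior the `nanSafe*` helpers reimplement. A rough Python sketch of those shared NaN semantics (the function name is ours, purely for illustration; Java's special handling of -0.0 vs 0.0 is omitted):

```python
import math

def nan_safe_compare(x: float, y: float) -> int:
    """Illustrative model of the NaN ordering used by
    java.lang.Double.compare: NaN == NaN, and NaN is
    greater than any other double (including +Infinity)."""
    x_nan, y_nan = math.isnan(x), math.isnan(y)
    if x_nan and y_nan:
        return 0
    if x_nan:
        return 1
    if y_nan:
        return -1
    # Ordinary three-way comparison for non-NaN values.
    return (x > y) - (x < y)

nan = float('nan')
assert nan_safe_compare(nan, nan) == 0      # NaN equals NaN
assert nan_safe_compare(nan, 1e308) == 1    # NaN sorts above everything
assert nan_safe_compare(1.0, 2.0) == -1
```

Since the JDK comparison already behaves this way, the custom Utils helpers add no distinct behavior, which is the point of the ticket.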
[jira] [Resolved] (SPARK-25875) Merge code to set up driver features for different languages
[ https://issues.apache.org/jira/browse/SPARK-25875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25875. Resolution: Fixed > Merge code to set up driver features for different languages > > > Key: SPARK-25875 > URL: https://issues.apache.org/jira/browse/SPARK-25875 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Marcelo Vanzin >Priority: Major > > This is the first step for SPARK-25874. Please refer to the parent bug for > details. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-25809) Support additional K8S cluster types for integration tests
[ https://issues.apache.org/jira/browse/SPARK-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25809. Resolution: Fixed > Support additional K8S cluster types for integration tests > -- > > Key: SPARK-25809 > URL: https://issues.apache.org/jira/browse/SPARK-25809 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.2, 2.4.0 >Reporter: Rob Vesse >Priority: Major > > Currently the Spark on K8S integration tests are hardcoded to use a > {{minikube}} based backend. It would be nice if developers had more > flexibility in the choice of K8S cluster they wish to use for integration > testing. More specifically it would be useful to be able to use the built-in > Kubernetes support in recent Docker releases and to just use a generic K8S > cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24434) Support user-specified driver and executor pod templates
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24434. Resolution: Fixed > Support user-specified driver and executor pod templates > > > Key: SPARK-24434 > URL: https://issues.apache.org/jira/browse/SPARK-24434 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Yinan Li >Priority: Major > > With more requests for customizing the driver and executor pods coming, the > current approach of adding new Spark configuration options has some serious > drawbacks: 1) it means more Kubernetes specific configuration options to > maintain, and 2) it widens the gap between the declarative model used by > Kubernetes and the configuration model used by Spark. We should start > designing a solution that allows users to specify pod templates as central > places for all customization needs for the driver and executor pods. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18278) SPIP: Support native submission of spark jobs to a kubernetes cluster
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650524#comment-16650524 ] Matt Cheah commented on SPARK-18278: The fork is no longer being maintained, because Kubernetes support is now built into and maintained on mainline Spark. Spark 2.3 introduced basic support for Kubernetes. Spark 2.4 will have Python support. Dynamic allocation is being reworked; see [my comment above|https://issues.apache.org/jira/browse/SPARK-18278?focusedCommentId=16646282=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16646282].
> SPIP: Support native submission of spark jobs to a kubernetes cluster
>
> Key: SPARK-18278
> URL: https://issues.apache.org/jira/browse/SPARK-18278
> Project: Spark
> Issue Type: Umbrella
> Components: Build, Deploy, Documentation, Kubernetes, Scheduler, Spark Core
> Affects Versions: 2.3.0
> Reporter: Erik Erlandson
> Assignee: Anirudh Ramanathan
> Priority: Major
> Labels: SPIP
> Attachments: SPARK-18278 Spark on Kubernetes Design Proposal Revision 2 (1).pdf
>
> A new Apache Spark sub-project that enables native support for submitting Spark applications to a Kubernetes cluster. The submitted application runs in a driver executing on a Kubernetes pod, and executor lifecycles are also managed as pods.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18278) SPIP: Support native submission of spark jobs to a kubernetes cluster
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648208#comment-16648208 ] Matt Cheah commented on SPARK-18278: [~liushaohui] a significantly large prerequisite for dynamic allocation is changing the way shuffle resiliency works; see https://issues.apache.org/jira/browse/SPARK-25299. > SPIP: Support native submission of spark jobs to a kubernetes cluster > - > > Key: SPARK-18278 > URL: https://issues.apache.org/jira/browse/SPARK-18278 > Project: Spark > Issue Type: Umbrella > Components: Build, Deploy, Documentation, Kubernetes, Scheduler, > Spark Core >Affects Versions: 2.3.0 >Reporter: Erik Erlandson >Assignee: Anirudh Ramanathan >Priority: Major > Labels: SPIP > Attachments: SPARK-18278 Spark on Kubernetes Design Proposal Revision > 2 (1).pdf > > > A new Apache Spark sub-project that enables native support for submitting > Spark applications to a kubernetes cluster. The submitted application runs > in a driver executing on a kubernetes pod, and executors lifecycles are also > managed as pods. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23429) Add executor memory metrics to heartbeat and expose in executors REST API
[ https://issues.apache.org/jira/browse/SPARK-23429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-23429. Resolution: Fixed > Add executor memory metrics to heartbeat and expose in executors REST API > - > > Key: SPARK-23429 > URL: https://issues.apache.org/jira/browse/SPARK-23429 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 2.2.1 >Reporter: Edwina Lu >Priority: Major > > Add new executor level memory metrics ( jvmUsedMemory, onHeapExecutionMemory, > offHeapExecutionMemory, onHeapStorageMemory, offHeapStorageMemory, > onHeapUnifiedMemory, and offHeapUnifiedMemory), and expose these via the > executors REST API. This information will help provide insight into how > executor and driver JVM memory is used, and for the different memory regions. > It can be used to help determine good values for spark.executor.memory, > spark.driver.memory, spark.memory.fraction, and spark.memory.storageFraction. > Add an ExecutorMetrics class, with jvmUsedMemory, onHeapExecutionMemory, > offHeapExecutionMemory, onHeapStorageMemory, and offHeapStorageMemory. This > will track the memory usage at the executor level. The new ExecutorMetrics > will be sent by executors to the driver as part of the Heartbeat. A heartbeat > will be added for the driver as well, to collect these metrics for the driver. > Modify the EventLoggingListener to log ExecutorMetricsUpdate events if there > is a new peak value for one of the memory metrics for an executor and stage. > Only the ExecutorMetrics will be logged, and not the TaskMetrics, to minimize > additional logging. Analysis on a set of sample applications showed an > increase of 0.25% in the size of the Spark history log, with this approach. > Modify the AppStatusListener to collect snapshots of peak values for each > memory metric. Each snapshot has the time, jvmUsedMemory, executionMemory and > storageMemory, and list of active stages. 
> Add the new memory metrics (snapshots of peak values for each memory metric) > to the executors REST API. > This is a subtask for SPARK-23206. Please refer to the design doc for that > ticket for more details. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
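The snapshot-of-peak-values idea described above — keep the running maximum of each executor memory metric across heartbeats, and only log when a new peak appears — can be sketched in a few lines. This is an illustrative model only, not the actual AppStatusListener/EventLoggingListener code; the metric names are taken from the issue description:

```python
def update_peaks(peaks: dict, heartbeat: dict) -> bool:
    """Merge one heartbeat's metric values into the running peaks.

    Returns True if any metric reached a new peak, i.e. the case in
    which the ticket proposes logging an ExecutorMetricsUpdate event.
    """
    new_peak = False
    for metric, value in heartbeat.items():
        if value > peaks.get(metric, -1):
            peaks[metric] = value
            new_peak = True
    return new_peak

peaks = {}
# First heartbeat always sets new peaks, so it would be logged.
assert update_peaks(peaks, {'jvmUsedMemory': 512, 'onHeapStorageMemory': 128})
# A lower heartbeat changes nothing and would not be logged.
assert not update_peaks(peaks, {'jvmUsedMemory': 400, 'onHeapStorageMemory': 100})
assert peaks['jvmUsedMemory'] == 512
```

Logging only on new peaks is what keeps the history-log overhead small (the ~0.25% size increase cited above), since most heartbeats do not move any maximum.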
[jira] [Commented] (SPARK-25262) Make Spark local dir volumes configurable with Spark on Kubernetes
[ https://issues.apache.org/jira/browse/SPARK-25262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606522#comment-16606522 ] Matt Cheah commented on SPARK-25262: For [https://github.com/apache/spark/pull/22323] we allow using tmpfs but other volume types aren't allowed. Is it fine to close this issue or do we want to keep this open to track work to support other volume types there? > Make Spark local dir volumes configurable with Spark on Kubernetes > -- > > Key: SPARK-25262 > URL: https://issues.apache.org/jira/browse/SPARK-25262 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1 >Reporter: Rob Vesse >Priority: Major > > As discussed during review of the design document for SPARK-24434 while > providing pod templates will provide more in-depth customisation for Spark on > Kubernetes there are some things that cannot be modified because Spark code > generates pod specs in very specific ways. > The particular issue identified relates to handling on {{spark.local.dirs}} > which is done by {{LocalDirsFeatureStep.scala}}. For each directory > specified, or a single default if no explicit specification, it creates a > Kubernetes {{emptyDir}} volume. As noted in the Kubernetes documentation > this will be backed by the node storage > (https://kubernetes.io/docs/concepts/storage/volumes/#emptydir). In some > compute environments this may be extremely undesirable. For example with > diskless compute resources the node storage will likely be a non-performant > remote mounted disk, often with limited capacity. For such environments it > would likely be better to set {{medium: Memory}} on the volume per the K8S > documentation to use a {{tmpfs}} volume instead. > Another closely related issue is that users might want to use a different > volume type to back the local directories and there is no possibility to do > that. 
> Pod templates will not really solve either of these issues because Spark is > always going to attempt to generate a new volume for each local directory and > always going to set these as {{emptyDir}}. > Therefore the proposal is to make two changes to {{LocalDirsFeatureStep}}: > * Provide a new config setting to enable using {{tmpfs}} backed {{emptyDir}} > volumes > * Modify the logic to check if there is a volume already defined with the > name and if so skip generating a volume definition for it -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
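If the first proposed change lands as in the pull request referenced above, opting in would look roughly like the following configuration fragment. The property name `spark.kubernetes.local.dirs.tmpfs` is taken from that PR and should be treated as an assumption — verify it against the Spark on Kubernetes documentation for your version:

```properties
# Sketch (assumed property name per apache/spark#22323):
# back the emptyDir volumes generated for spark.local.dirs
# with a RAM-backed tmpfs medium instead of node disk.
spark.kubernetes.local.dirs.tmpfs=true
```

This addresses the diskless-node case described above; the second proposal (skipping volume generation when a same-named volume already exists) would instead be exercised through a pod template or volume spec.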
[jira] [Resolved] (SPARK-25222) Spark on Kubernetes Pod Watcher dumps raw container status
[ https://issues.apache.org/jira/browse/SPARK-25222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25222. Resolution: Fixed > Spark on Kubernetes Pod Watcher dumps raw container status > -- > > Key: SPARK-25222 > URL: https://issues.apache.org/jira/browse/SPARK-25222 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1 >Reporter: Rob Vesse >Priority: Minor > > Spark on Kubernetes provides logging of the pod/container status as a monitor > of the job progress. However the logger just dumps the raw container status > object leading to fairly unreadable output like so: > {noformat} > 18/08/24 09:03:27 INFO LoggingPodStatusWatcherImpl: State changed, new state: >pod name: spark-groupby-1535101393784-driver >namespace: default >labels: spark-app-selector -> spark-47f7248122b9444b8d5fd3701028a1e8, > spark-role -> driver >pod uid: 88de6467-a77c-11e8-b9da-a4bf0128b75b >creation time: 2018-08-24T09:03:14Z >service account name: spark >volumes: spark-local-dir-1, spark-conf-volume, spark-token-kjxkv >node name: tab-cmp4 >start time: 2018-08-24T09:03:14Z >container images: rvesse/spark:latest >phase: Running >status: > [ContainerStatus(containerID=docker://23ae58571f59505e837dca40455d0347fb90e9b88e2a2b145a38e2919fceb447, > image=rvesse/spark:latest, > imageID=docker-pullable://rvesse/spark@sha256:92abf0b718743d0f5a26068fc94ec42233db0493c55a8570dc8c851c62a4bc0a, > lastState=ContainerState(running=null, terminated=null, waiting=null, > additionalProperties={}), name=spark-kubernetes-driver, ready=true, > restartCount=0, > state=ContainerState(running=ContainerStateRunning(startedAt=Time(time=2018-08-24T09:03:26Z, > additionalProperties={}), additionalProperties={}), terminated=null, > waiting=null, additionalProperties={}), additionalProperties={})] > {noformat} > The {{LoggingPodStatusWatcher}} actually already includes code to nicely > format this information but only invokes it at 
the end of the job: > {noformat} > 18/08/24 09:04:07 INFO LoggingPodStatusWatcherImpl: Container final statuses: > Container name: spark-kubernetes-driver > Container image: rvesse/spark:latest > Container state: Terminated > Exit code: 0 > {noformat} > It would be nice if we continually used the nice formatting throughout the > logging. > We already have patched this on our internal fork and will upstream a fix > shortly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25299) Use remote storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603807#comment-16603807 ] Matt Cheah commented on SPARK-25299: (Changed the title to "remote storage" for a little more generalization) > Use remote storage for persisting shuffle data > -- > > Key: SPARK-25299 > URL: https://issues.apache.org/jira/browse/SPARK-25299 > Project: Spark > Issue Type: New Feature > Components: Shuffle >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > > In Spark, the shuffle primitive requires Spark executors to persist data to > the local disk of the worker nodes. If executors crash, the external shuffle > service can continue to serve the shuffle data that was written beyond the > lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the > external shuffle service is deployed on every worker node. The shuffle > service shares local disk with the executors that run on its node. > There are some shortcomings with the way shuffle is fundamentally implemented > right now. Particularly: > * If any external shuffle service process or node becomes unavailable, all > applications that had an executor that ran on that node must recompute the > shuffle blocks that were lost. > * Similarly to the above, the external shuffle service must be kept running > at all times, which may waste resources when no applications are using that > shuffle service node. > * Mounting local storage can prevent users from taking advantage of > desirable isolation benefits from using containerized environments, like > Kubernetes. We had an external shuffle service implementation in an early > prototype of the Kubernetes backend, but it was rejected due to its strict > requirement to be able to mount hostPath volumes or other persistent volume > setups. 
> In the following [architecture discussion > document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] > (note: _not_ an SPIP), we brainstorm various high level architectures for > improving the external shuffle service in a way that addresses the above > problems. The purpose of this umbrella JIRA is to promote additional > discussion on how we can approach these problems, both at the architecture > level and the implementation level. We anticipate filing sub-issues that > break down the tasks that must be completed to achieve this goal. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-25299) Use remote storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-25299: --- Summary: Use remote storage for persisting shuffle data (was: Use distributed storage for persisting shuffle data) > Use remote storage for persisting shuffle data > -- > > Key: SPARK-25299 > URL: https://issues.apache.org/jira/browse/SPARK-25299 > Project: Spark > Issue Type: New Feature > Components: Shuffle >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > > In Spark, the shuffle primitive requires Spark executors to persist data to > the local disk of the worker nodes. If executors crash, the external shuffle > service can continue to serve the shuffle data that was written beyond the > lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the > external shuffle service is deployed on every worker node. The shuffle > service shares local disk with the executors that run on its node. > There are some shortcomings with the way shuffle is fundamentally implemented > right now. Particularly: > * If any external shuffle service process or node becomes unavailable, all > applications that had an executor that ran on that node must recompute the > shuffle blocks that were lost. > * Similarly to the above, the external shuffle service must be kept running > at all times, which may waste resources when no applications are using that > shuffle service node. > * Mounting local storage can prevent users from taking advantage of > desirable isolation benefits from using containerized environments, like > Kubernetes. We had an external shuffle service implementation in an early > prototype of the Kubernetes backend, but it was rejected due to its strict > requirement to be able to mount hostPath volumes or other persistent volume > setups. 
> In the following [architecture discussion > document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] > (note: _not_ an SPIP), we brainstorm various high level architectures for > improving the external shuffle service in a way that addresses the above > problems. The purpose of this umbrella JIRA is to promote additional > discussion on how we can approach these problems, both at the architecture > level and the implementation level. We anticipate filing sub-issues that > break down the tasks that must be completed to achieve this goal. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599697#comment-16599697 ] Matt Cheah commented on SPARK-24434: Everyone, thank you for your contribution to this discussion. It is important for us to agree upon constructive next steps to avoid this kind of miscommunication in the future. I have a few notes to make from having collaborated with Yifei and Onur on this patch, and on behalf of Palantir. Firstly, we apologize that we did not communicate clearly enough on Apache communication channels that we were working on this, nor the urgency with which we needed this work done. We agree with [~felixcheung]'s assessment that notes from the weekly meetings that have bearing on Spark development should be sent back to the wider community. We are specifically sorry for not having said something to the effect of "I am taking a stab at implementing this at https://github.com/... . Stavros, are you cool with that?" Palantir and the Kubernetes Big Data group must improve our communication next time. Secondly, we would suggest that a work-in-progress patch proposed early in the feature's development would have helped users prepare to use this feature in their internal tools. It's helpful for everyone to see the API and expected behavior of a new feature so that they can plan to take advantage of that feature ahead of time. Thirdly, a small clarifying comment on timelines and urgency. While we don't see the need for this to be in Spark 2.4, we will be taking the patch ahead of time on our fork of Spark, which follows the Apache master branch (see [https://github.com/palantir/spark]). We were hoping to cherry-pick this patch soon but could have been clearer in our communication of this need.
Fourthly, we are sorry for the wording in "On 15 Aug it was discussed that as Stavros Kontopoulos was out, and was not actively working on this PR at that moment, Yifei Huang and I can take over and start working on this.": Instead of "take over", we should have said "contribute to this feature" in this comment specifically. Finally, moving forward, we are happy to collaborate on what the community believes to be the best implementation of this feature. We are happy to use Onur's, but we can also use Stavros's. Regardless of the chosen implementation, credit should be given to all parties. For example, if Onur's implementation is chosen, Stavros's design work should be called out in the pull request description. Either way, we would like to see this feature merged by Friday, September 7, though this will have to be delayed if the Spark 2.4 release branch is not cut before that time (since we don't want this going into master and ending up in Spark 2.4 as a result). We are open to feedback on any of the above points and suggestions on how we can improve the way we contribute to Spark in the future. > Support user-specified driver and executor pod templates > > > Key: SPARK-24434 > URL: https://issues.apache.org/jira/browse/SPARK-24434 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Yinan Li >Priority: Major > > With more requests for customizing the driver and executor pods coming, the > current approach of adding new Spark configuration options has some serious > drawbacks: 1) it means more Kubernetes-specific configuration options to > maintain, and 2) it widens the gap between the declarative model used by > Kubernetes and the configuration model used by Spark. We should start > designing a solution that allows users to specify pod templates as central > places for all customization needs for the driver and executor pods. 
[jira] [Commented] (SPARK-25299) Use distributed storage for persisting shuffle data
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599428#comment-16599428 ] Matt Cheah commented on SPARK-25299: Note that SPARK-1529 was a much earlier feature request that is more or less identical to this one, but the old age of SPARK-1529 led me to open this newer issue instead of re-opening the old one. If it is preferable to use the old issue we can do that as well. > Use distributed storage for persisting shuffle data > --- > > Key: SPARK-25299 > URL: https://issues.apache.org/jira/browse/SPARK-25299 > Project: Spark > Issue Type: New Feature > Components: Shuffle >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > > In Spark, the shuffle primitive requires Spark executors to persist data to > the local disk of the worker nodes. If executors crash, the external shuffle > service can continue to serve the shuffle data that was written beyond the > lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the > external shuffle service is deployed on every worker node. The shuffle > service shares local disk with the executors that run on its node. > There are some shortcomings with the way shuffle is fundamentally implemented > right now. Particularly: > * If any external shuffle service process or node becomes unavailable, all > applications that had an executor that ran on that node must recompute the > shuffle blocks that were lost. > * Similarly to the above, the external shuffle service must be kept running > at all times, which may waste resources when no applications are using that > shuffle service node. > * Mounting local storage can prevent users from taking advantage of > desirable isolation benefits from using containerized environments, like > Kubernetes. 
We had an external shuffle service implementation in an early > prototype of the Kubernetes backend, but it was rejected due to its strict > requirement to be able to mount hostPath volumes or other persistent volume > setups. > In the following [architecture discussion > document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] > (note: _not_ an SPIP), we brainstorm various high level architectures for > improving the external shuffle service in a way that addresses the above > problems. The purpose of this umbrella JIRA is to promote additional > discussion on how we can approach these problems, both at the architecture > level and the implementation level. We anticipate filing sub-issues that > break down the tasks that must be completed to achieve this goal. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-25299) Use distributed storage for persisting shuffle data
Matt Cheah created SPARK-25299: -- Summary: Use distributed storage for persisting shuffle data Key: SPARK-25299 URL: https://issues.apache.org/jira/browse/SPARK-25299 Project: Spark Issue Type: New Feature Components: Shuffle Affects Versions: 2.4.0 Reporter: Matt Cheah In Spark, the shuffle primitive requires Spark executors to persist data to the local disk of the worker nodes. If executors crash, the external shuffle service can continue to serve the shuffle data that was written beyond the lifetime of the executor itself. In YARN, Mesos, and Standalone mode, the external shuffle service is deployed on every worker node. The shuffle service shares local disk with the executors that run on its node. There are some shortcomings with the way shuffle is fundamentally implemented right now. Particularly: * If any external shuffle service process or node becomes unavailable, all applications that had an executor that ran on that node must recompute the shuffle blocks that were lost. * Similarly to the above, the external shuffle service must be kept running at all times, which may waste resources when no applications are using that shuffle service node. * Mounting local storage can prevent users from taking advantage of desirable isolation benefits from using containerized environments, like Kubernetes. We had an external shuffle service implementation in an early prototype of the Kubernetes backend, but it was rejected due to its strict requirement to be able to mount hostPath volumes or other persistent volume setups. In the following [architecture discussion document|https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit#heading=h.btqugnmt2h40] (note: _not_ an SPIP), we brainstorm various high level architectures for improving the external shuffle service in a way that addresses the above problems. 
The purpose of this umbrella JIRA is to promote additional discussion on how we can approach these problems, both at the architecture level and the implementation level. We anticipate filing sub-issues that break down the tasks that must be completed to achieve this goal. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-25264) Fix comma-delineated arguments passed into PythonRunner and RRunner
[ https://issues.apache.org/jira/browse/SPARK-25264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25264. Resolution: Fixed Fix Version/s: 2.4.0 > Fix comma-delineated arguments passed into PythonRunner and RRunner > --- > > Key: SPARK-25264 > URL: https://issues.apache.org/jira/browse/SPARK-25264 > Project: Spark > Issue Type: Bug > Components: Kubernetes, PySpark >Affects Versions: 2.4.0 >Reporter: Ilan Filonenko >Priority: Major > Fix For: 2.4.0 > > > The arguments passed into the PythonRunner and RRunner are comma-delimited. > Because the Runners do an args.slice(2, ...), the delimiter in > the entrypoint needs to be a space, as expected by the Runner > arguments. > This issue was logged here: > [https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/273] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
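The failure mode above is easy to reproduce outside of Spark. The following is a minimal sketch, assuming a simplified Python stand-in for the Scala runners' slice-based argument handling (function and file names here are hypothetical, not Spark's actual code):

```python
# Simplified stand-in for PythonRunner/RRunner argument handling: the
# runner drops its first two positional arguments and treats the rest
# as user application arguments.
def runner_app_args(argv):
    # mirrors args.slice(2, args.length) in the Scala runners
    return argv[2:]

# If the container entrypoint forwards user args as one comma-joined
# token, the runner sees a single fused argument...
fused = runner_app_args(["app.py", "py-files.zip", "a,b,c"])

# ...whereas space-delimited forwarding yields the intended three args.
split = runner_app_args(["app.py", "py-files.zip", "a", "b", "c"])
```

With comma-joined forwarding the user application receives one argument `"a,b,c"` instead of three, which is why the entrypoint delimiter must be a space.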
[jira] [Created] (SPARK-25152) Enable Spark on Kubernetes R Integration Tests
Matt Cheah created SPARK-25152: -- Summary: Enable Spark on Kubernetes R Integration Tests Key: SPARK-25152 URL: https://issues.apache.org/jira/browse/SPARK-25152 Project: Spark Issue Type: Test Components: Kubernetes, SparkR Affects Versions: 2.4.0 Reporter: Matt Cheah We merged [https://github.com/apache/spark/pull/21584] for SPARK-24433 but we had to turn off the integration tests due to issues with the Jenkins environment. Re-enable the tests after the environment is fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24433) Add Spark R support
[ https://issues.apache.org/jira/browse/SPARK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-24433: --- Fix Version/s: 2.4.0 > Add Spark R support > --- > > Key: SPARK-24433 > URL: https://issues.apache.org/jira/browse/SPARK-24433 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Yinan Li >Priority: Major > Fix For: 2.4.0 > > > This is the ticket to track work on adding support for R binding into the > Kubernetes mode. The feature is available in our fork at > github.com/apache-spark-on-k8s/spark and needs to be upstreamed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24433) Add Spark R support
[ https://issues.apache.org/jira/browse/SPARK-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24433. Resolution: Fixed > Add Spark R support > --- > > Key: SPARK-24433 > URL: https://issues.apache.org/jira/browse/SPARK-24433 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Yinan Li >Priority: Major > > This is the ticket to track work on adding support for R binding into the > Kubernetes mode. The feature is available in our fork at > github.com/apache-spark-on-k8s/spark and needs to be upstreamed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24960) k8s: explicitly expose ports on driver container
[ https://issues.apache.org/jira/browse/SPARK-24960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24960. Resolution: Fixed Fix Version/s: 2.4.0 > k8s: explicitly expose ports on driver container > > > Key: SPARK-24960 > URL: https://issues.apache.org/jira/browse/SPARK-24960 > Project: Spark > Issue Type: Improvement > Components: Deploy, Kubernetes, Scheduler >Affects Versions: 2.2.0 >Reporter: Adelbert Chang >Priority: Minor > Fix For: 2.4.0 > > > For the Kubernetes scheduler, the Driver Pod does not explicitly expose its > ports. It is possible for a Kubernetes environment to be setup such that Pod > ports are closed by default and must be opened explicitly in the Pod spec. In > such an environment without this improvement the Driver Service will be > unable to route requests (e.g. from the Executors) to the corresponding > Driver Pod, which can be observed on the Executor side with this error > message: > {noformat} > Caused by: java.io.IOException: Failed to connect to > org-apache-spark-examples-sparkpi-1519271450264-driver-svc.dev.svc.cluster.local:7078{noformat} > > For posterity, this is a copy of the [original > issue|https://github.com/apache-spark-on-k8s/spark/issues/617] filed in the > now deprecated {{apache-spark-on-k8s}} repository. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24963) Integration tests will fail if they run in a namespace not being the default
[ https://issues.apache.org/jira/browse/SPARK-24963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24963. Resolution: Fixed Fix Version/s: 2.4.0 > Integration tests will fail if they run in a namespace not being the default > > > Key: SPARK-24963 > URL: https://issues.apache.org/jira/browse/SPARK-24963 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Stavros Kontopoulos >Priority: Minor > Fix For: 2.4.0 > > > Related discussion is here: > [https://github.com/apache/spark/pull/21748#pullrequestreview-141048893] > If spark-rbac.yaml is used when the tests are run locally, client mode tests > will fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23146) Support client mode for Kubernetes cluster backend
[ https://issues.apache.org/jira/browse/SPARK-23146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-23146. Resolution: Fixed Fix Version/s: 2.4.0 > Support client mode for Kubernetes cluster backend > -- > > Key: SPARK-23146 > URL: https://issues.apache.org/jira/browse/SPARK-23146 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Anirudh Ramanathan >Priority: Major > Fix For: 2.4.0 > > > This issue tracks client mode support within Spark when running in the > Kubernetes cluster backend. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24825) [K8S][TEST] Kubernetes integration tests don't trace the maven project dependency structure
[ https://issues.apache.org/jira/browse/SPARK-24825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547137#comment-16547137 ] Matt Cheah commented on SPARK-24825: We're looking into this now; this particular phase was built out by me, [~ssuchter], [~foxish], and [~ifilonenko]. We are also consulting with some folks from RISELab - [~shaneknapp]. I think we really shouldn't have to run a maven install; the multi-module build should pick up the other modules properly. We've likely configured the maven reactor incorrectly in the integration test's pom.xml. > [K8S][TEST] Kubernetes integration tests don't trace the maven project > dependency structure > --- > > Key: SPARK-24825 > URL: https://issues.apache.org/jira/browse/SPARK-24825 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Critical > > The Kubernetes integration tests will currently fail if maven installation is > not performed first, because the integration test build believes it should be > pulling the Spark parent artifact from maven central. However, this is > incorrect because the integration test should be building the Spark parent > pom as a dependency in the multi-module build, and the integration test > should just use the dynamically built artifact. Or to put it another way, the > integration test builds should never be pulling Spark dependencies from maven > central. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24825) [K8S][TEST] Kubernetes integration tests don't trace the maven project dependency structure
[ https://issues.apache.org/jira/browse/SPARK-24825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-24825: --- Issue Type: Bug (was: Improvement) > [K8S][TEST] Kubernetes integration tests don't trace the maven project > dependency structure > --- > > Key: SPARK-24825 > URL: https://issues.apache.org/jira/browse/SPARK-24825 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > > The Kubernetes integration tests will currently fail if maven installation is > not performed first, because the integration test build believes it should be > pulling the Spark parent artifact from maven central. However, this is > incorrect because the integration test should be building the Spark parent > pom as a dependency in the multi-module build, and the integration test > should just use the dynamically built artifact. Or to put it another way, the > integration test builds should never be pulling Spark dependencies from maven > central. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-24825) [K8S][TEST] Kubernetes integration tests don't trace the maven project dependency structure
Matt Cheah created SPARK-24825: -- Summary: [K8S][TEST] Kubernetes integration tests don't trace the maven project dependency structure Key: SPARK-24825 URL: https://issues.apache.org/jira/browse/SPARK-24825 Project: Spark Issue Type: Improvement Components: Kubernetes, Tests Affects Versions: 2.4.0 Reporter: Matt Cheah The Kubernetes integration tests will currently fail if maven installation is not performed first, because the integration test build believes it should be pulling the Spark parent artifact from maven central. However, this is incorrect because the integration test should be building the Spark parent pom as a dependency in the multi-module build, and the integration test should just use the dynamically built artifact. Or to put it another way, the integration test builds should never be pulling Spark dependencies from maven central. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24825) [K8S][TEST] Kubernetes integration tests don't trace the maven project dependency structure
[ https://issues.apache.org/jira/browse/SPARK-24825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah updated SPARK-24825: --- Priority: Critical (was: Major) > [K8S][TEST] Kubernetes integration tests don't trace the maven project > dependency structure > --- > > Key: SPARK-24825 > URL: https://issues.apache.org/jira/browse/SPARK-24825 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Critical > > The Kubernetes integration tests will currently fail if maven installation is > not performed first, because the integration test build believes it should be > pulling the Spark parent artifact from maven central. However, this is > incorrect because the integration test should be building the Spark parent > pom as a dependency in the multi-module build, and the integration test > should just use the dynamically built artifact. Or to put it another way, the > integration test builds should never be pulling Spark dependencies from maven > central. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-24683) SparkLauncher.NO_RESOURCE doesn't work with Java applications
[ https://issues.apache.org/jira/browse/SPARK-24683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24683. Resolution: Fixed Fix Version/s: 2.4.0 > SparkLauncher.NO_RESOURCE doesn't work with Java applications > - > > Key: SPARK-24683 > URL: https://issues.apache.org/jira/browse/SPARK-24683 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Critical > Fix For: 2.4.0 > > > On the tip of master, after we merged the Python bindings support, Spark > Submit on Kubernetes no longer works with {{SparkLauncher.NO_RESOURCE}}. > This is because spark-submit will not set the main class as a child argument. > See here: > https://github.com/apache/spark/blob/1a644afbac35c204f9ad55f86999319a9ab458c6/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L694 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-24683) SparkLauncher.NO_RESOURCE doesn't work with Java applications
Matt Cheah created SPARK-24683: -- Summary: SparkLauncher.NO_RESOURCE doesn't work with Java applications Key: SPARK-24683 URL: https://issues.apache.org/jira/browse/SPARK-24683 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.4.0 Reporter: Matt Cheah On the tip of master, after we merged the Python bindings support, Spark Submit on Kubernetes no longer works with {{SparkLauncher.NO_RESOURCE}}. This is because spark-submit will not set the main class as a child argument. See here: https://github.com/apache/spark/blob/1a644afbac35c204f9ad55f86999319a9ab458c6/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L694 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
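The failure pattern can be sketched abstractly. This is a hypothetical, simplified model of a submit path (not Spark's actual SparkSubmit code; the function and flag names are invented for illustration) in which the main class is only forwarded alongside a concrete primary resource, so a NO_RESOURCE submission silently loses it:

```python
NO_RESOURCE = "spark-internal"  # the value of SparkLauncher.NO_RESOURCE

def build_child_args(primary_resource, main_class, app_args):
    """Hypothetical child-argument builder illustrating the bug: the
    main class is only forwarded when a concrete resource is given."""
    child_args = []
    if primary_resource != NO_RESOURCE:
        child_args += ["--primary-java-resource", primary_resource]
        child_args += ["--main-class", main_class]  # dropped for NO_RESOURCE
    return child_args + list(app_args)
```

Under this model, a Java application submitted with NO_RESOURCE produces child arguments with no main class at all, so the driver has nothing to run; the fix direction is presumably to forward the main class regardless of whether a primary resource is set.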
[jira] [Created] (SPARK-24655) [K8S] Custom Docker Image Expectations and Documentation
Matt Cheah created SPARK-24655: -- Summary: [K8S] Custom Docker Image Expectations and Documentation Key: SPARK-24655 URL: https://issues.apache.org/jira/browse/SPARK-24655 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.3.1 Reporter: Matt Cheah A common use case we want to support with Kubernetes is the usage of custom Docker images. Some examples include: * A user builds an application using Gradle or Maven, using Spark as a compile-time dependency. The application's jars (both the custom-written jars and the dependencies) need to be packaged in a docker image that can be run via spark-submit. * A user builds a PySpark or R application and desires to include custom dependencies * A user wants to switch the base image from Alpine to CentOS while using either built-in or custom jars We currently do not document how these custom Docker images are supposed to be built, nor do we guarantee stability of these Docker images with various spark-submit versions. To illustrate how this can break down, suppose for example we decide to change the names of environment variables that denote the driver/executor extra JVM options specified by {{spark.[driver|executor].extraJavaOptions}}. If we change the environment variable spark-submit provides then the user must update their custom Dockerfile and build new images. Rather than jumping to an implementation immediately though, it's worth taking a step back and considering these matters from the perspective of the end user. Towards that end, this ticket will serve as a forum where we can answer at least the following questions, and any others pertaining to the matter: # What would be the steps a user would need to take to build a custom Docker image, given their desire to customize the dependencies and the content (OS or otherwise) of said images? # How can we ensure the user does not need to rebuild the image if only the spark-submit version changes? 
The end deliverable for this ticket is a design document, and then we'll create sub-issues for the technical implementation and documentation of the contract. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516368#comment-16516368 ] Matt Cheah commented on SPARK-24248: I've summarized what we ended up going with after some iteration on the PR here: [https://docs.google.com/document/d/1BWTK76k2242spz66JOx8SKKEl5qFV6Cg9ASxCdmIWbY/edit?usp=sharing]. Recommendations are still welcome and can be worked on in follow-up patches. > [K8S] Use the Kubernetes cluster as the backing store for the state of pods > --- > > Key: SPARK-24248 > URL: https://issues.apache.org/jira/browse/SPARK-24248 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Matt Cheah >Priority: Major > Fix For: 2.4.0 > > > We have a number of places in KubernetesClusterSchedulerBackend right now > that maintain the state of pods in memory. However, the Kubernetes API can > always give us the most up-to-date and correct view of what our executors are > doing. We should consider moving away from in-memory state as much as we can in > favor of using the Kubernetes cluster as the source of truth for pod status. > Maintaining less state in memory lowers the chance that > we accidentally miss updating one of these data structures and break the > lifecycle of executors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
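The "cluster as source of truth" idea can be sketched in a few lines. This is a minimal, hypothetical model (not the actual KubernetesClusterSchedulerBackend code; names are invented for illustration): rather than mutating in-memory counters on each event, the scheduler derives every decision purely from the latest full snapshot of pod state:

```python
def executors_to_request(snapshot, desired_total):
    """Compute how many new executors to request purely from a full
    snapshot of pod phases, keeping no incremental in-memory state."""
    alive = sum(1 for phase in snapshot.values()
                if phase in ("Pending", "Running"))
    return max(0, desired_total - alive)

# A fresh snapshot from the cluster API is the single source of truth:
# a missed event cannot corrupt the decision, because the next snapshot
# reflects the true state of the cluster.
snapshot = {"exec-1": "Running", "exec-2": "Failed", "exec-3": "Pending"}
```

This is the level-triggered style: correctness depends only on the latest observed state, not on having seen every intermediate transition.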
[jira] [Commented] (SPARK-24584) [K8s] More efficient storage of executor pod state
[ https://issues.apache.org/jira/browse/SPARK-24584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516300#comment-16516300 ] Matt Cheah commented on SPARK-24584: Related to https://issues.apache.org/jira/browse/SPARK-24248 > [K8s] More efficient storage of executor pod state > -- > > Key: SPARK-24584 > URL: https://issues.apache.org/jira/browse/SPARK-24584 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Matt Cheah >Priority: Major > > Currently we store buffers of snapshots in {{ExecutorPodsSnapshotStore}}, > where the snapshots are duplicated per subscriber. With hundreds or maybe > thousands of executors, this buffering may become untenable. Investigate > storing less state while still maintaining the same level-triggered semantics. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-24584) [K8s] More efficient storage of executor pod state
Matt Cheah created SPARK-24584: -- Summary: [K8s] More efficient storage of executor pod state Key: SPARK-24584 URL: https://issues.apache.org/jira/browse/SPARK-24584 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.4.0 Reporter: Matt Cheah Currently we store buffers of snapshots in {{ExecutorPodsSnapshotStore}}, where the snapshots are duplicated per subscriber. With hundreds or maybe thousands of executors, this buffering may become untenable. Investigate storing less state while still maintaining the same level-triggered semantics. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
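One direction for reducing the duplication, sketched here with hypothetical names (this is not the actual ExecutorPodsSnapshotStore API): keep a single append-only snapshot history shared by all subscribers and track only a per-subscriber cursor, so snapshots are never copied into per-subscriber buffers:

```python
class SharedSnapshotStore:
    """Snapshots are stored once; each subscriber keeps only an index
    into the shared history instead of a private buffer of copies."""

    def __init__(self):
        self._history = []   # shared, append-only list of snapshots
        self._cursors = {}   # subscriber name -> next unseen index

    def publish(self, snapshot):
        self._history.append(snapshot)

    def poll(self, subscriber):
        """Return all snapshots this subscriber has not yet seen."""
        start = self._cursors.get(subscriber, 0)
        self._cursors[subscriber] = len(self._history)
        return self._history[start:]
```

Each subscriber still observes every snapshot in order (preserving the level-triggered semantics), but memory grows with the number of snapshots rather than snapshots times subscribers; a real implementation would also need to compact or drop history that all subscribers have consumed.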
[jira] [Resolved] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24248. Resolution: Fixed Fix Version/s: 2.4.0 > [K8S] Use the Kubernetes cluster as the backing store for the state of pods > --- > > Key: SPARK-24248 > URL: https://issues.apache.org/jira/browse/SPARK-24248 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Matt Cheah >Priority: Major > Fix For: 2.4.0 > > > We have a number of places in KubernetesClusterSchedulerBackend right now > that maintain the state of pods in memory. However, the Kubernetes API can > always give us the most up-to-date and correct view of what our executors are > doing. We should consider moving away from in-memory state as much as we can in > favor of using the Kubernetes cluster as the source of truth for pod status. > Maintaining less state in memory lowers the chance that > we accidentally miss updating one of these data structures and break the > lifecycle of executors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23010) Add integration testing for Kubernetes backend into the apache/spark repository
[ https://issues.apache.org/jira/browse/SPARK-23010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-23010. Resolution: Fixed Fix Version/s: 2.4.0 Some tests are missing but we got a basic set of the tests and the infrastructure into master. > Add integration testing for Kubernetes backend into the apache/spark > repository > --- > > Key: SPARK-23010 > URL: https://issues.apache.org/jira/browse/SPARK-23010 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Anirudh Ramanathan >Priority: Major > Fix For: 2.4.0 > > > Add tests for the scheduler backend into apache/spark > /xref: > http://apache-spark-developers-list.1001551.n3.nabble.com/Integration-testing-and-Scheduler-Backends-td23105.html
[jira] [Resolved] (SPARK-23984) PySpark Bindings for K8S
[ https://issues.apache.org/jira/browse/SPARK-23984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-23984. Resolution: Fixed Fix Version/s: 2.4.0 > PySpark Bindings for K8S > > > Key: SPARK-23984 > URL: https://issues.apache.org/jira/browse/SPARK-23984 > Project: Spark > Issue Type: New Feature > Components: Kubernetes, PySpark >Affects Versions: 2.3.0 >Reporter: Ilan Filonenko >Priority: Major > Fix For: 2.4.0 > > > This ticket is tracking the ongoing work of moving the upstream work from > [https://github.com/apache-spark-on-k8s/spark] specifically regarding Python > bindings for Spark on Kubernetes. > The points of focus are: dependency management, increased non-JVM memory > overhead default values, and modified Docker images to include Python > Support.
[jira] [Commented] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471295#comment-16471295 ] Matt Cheah commented on SPARK-24248: I see - I suppose if the watch connection drops, we should try to reestablish connection to the API server periodically, and when we can reestablish it we can do a full sync then? > [K8S] Use the Kubernetes cluster as the backing store for the state of pods
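The reconnect-then-resync idea above can be sketched minimally. This is a hypothetical illustration, not the fabric8 client API that Spark actually uses: `ResyncingWatch`, `Watch`, and the supplier/consumer hooks are all invented names, and the real implementation would also need backoff and error handling on the reconnect path.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: when the watch connection drops, periodically try to
 * reconnect, and on a successful reconnect do one full list of the pods (a
 * "full sync") before resuming incremental watch events, so nothing that
 * changed while the connection was down is missed.
 */
final class ResyncingWatch {
  interface Watch { boolean isAlive(); }

  private final Supplier<List<String>> listAllPods;  // full-sync source (assumed)
  private final Supplier<Watch> openWatch;           // (re)establishes the watch (assumed)
  private final Consumer<List<String>> onFullSync;   // applies the level-triggered state

  private Watch current;
  int fullSyncs = 0;  // exposed only for illustration

  ResyncingWatch(Supplier<List<String>> listAllPods,
                 Supplier<Watch> openWatch,
                 Consumer<List<String>> onFullSync) {
    this.listAllPods = listAllPods;
    this.openWatch = openWatch;
    this.onFullSync = onFullSync;
  }

  /** Called periodically, e.g. from a scheduled executor service. */
  void checkConnection() {
    if (current == null || !current.isAlive()) {
      current = openWatch.get();             // reconnect to the API server
      onFullSync.accept(listAllPods.get());  // reconcile any missed updates
      fullSyncs++;
    }
  }
}
```

While the watch stays alive, `checkConnection()` is a no-op; a full sync happens only on (re)connection, which keeps the steady-state cost at zero extra list calls.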
[jira] [Comment Edited] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471280#comment-16471280 ] Matt Cheah edited comment on SPARK-24248 at 5/10/18 11:18 PM: -- I thought about it a bit more, and believe that we can do most if not all of our actions in the Watcher directly. In other words, can we drive the entire lifecycle of all executors solely via watch events? This would be a pretty big rewrite, but you get a large number of benefits from this - namely, you remove any need for synchronization or local state management at all. {{KubernetesClusterSchedulerBackend}} becomes effectively stateless, apart from the parent class's fields. This would imply at least the following changes, though I'm sure I'm missing some: * When the watcher receives a modified or error event, check the status of the executor, construct the exit reason, and call {{RemoveExecutor}} directly * The watcher keeps a running count of active executors and itself triggers rounds of creating new executors (instead of the periodic polling) * {{KubernetesDriverEndpoint::onDisconnected}} is a tricky one. What I'm thinking is that we can just disable the executor but not remove it, counting on the Watch to receive an event that would actually trigger removing the executor. The idea here is that the status of the pods as reported by the Watch should be fully reliable - e.g. whenever any error occurs in the executor such that it becomes unusable, the Kubernetes API should report such state. We could perhaps make the API's representation of the world more accurate by attaching liveness probes to the executor pod. was (Author: mcheah): I thought about it a bit more, and believe that we can do most if not all of our actions in the Watcher directly. In other words, can we drive the entire lifecycle of all executors solely from the perspective of watch events? 
This would be a pretty big rewrite, but you get a large number of benefits from this - namely, you remove any need for synchronization or local state management at all. {{KubernetesClusterSchedulerBackend}} becomes effectively stateless, apart from the parent class's fields. This would imply at least the following changes, though I'm sure I'm missing some: * When the watcher receives a modified or error event, check the status of the executor, construct the exit reason, and call {{RemoveExecutor}} directly * The watcher keeps a running count of active executors and itself triggers rounds of creating new executors (instead of the periodic polling) * {{KubernetesDriverEndpoint::onDisconnected}} is a tricky one. What I'm thinking is that we can just disable the executor but not remove it, counting on the Watch to receive an event that would actually trigger removing the executor. The idea here is that the status of the pods as reported by the Watch should be fully reliable - e.g. whenever any error occurs in the executor such that it becomes unusable, the Kubernetes API should report such state. We could perhaps make the API's representation of the world more accurate by attaching liveness probes to the executor pod. > [K8S] Use the Kubernetes cluster as the backing store for the state of pods
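The event-driven lifecycle proposed in the comment above can be sketched as a single handler through which all state flows. This is a hypothetical illustration, not Spark's actual scheduler backend: `ExecutorWatchHandler`, the `Action` enum, and the return-the-shortfall convention are invented for the example, standing in for calling {{RemoveExecutor}} on the driver and triggering a new allocation round.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a purely watch-driven executor lifecycle: every
 * state change arrives as a watch event, so no separate synchronized
 * bookkeeping is needed outside this handler.
 */
final class ExecutorWatchHandler {
  enum Action { ADDED, MODIFIED, ERROR, DELETED }

  private final Set<String> running = new HashSet<>();
  private final int targetExecutors;

  ExecutorWatchHandler(int targetExecutors) {
    this.targetExecutors = targetExecutors;
  }

  /**
   * Processes one watch event and returns how many replacement executors to
   * request - the watcher itself drives allocation instead of a polling loop.
   */
  int eventReceived(Action action, String executorId, boolean failed) {
    switch (action) {
      case ADDED:
      case MODIFIED:
        if (failed) {
          running.remove(executorId);  // would call RemoveExecutor on the driver
        } else {
          running.add(executorId);
        }
        break;
      case ERROR:
      case DELETED:
        running.remove(executorId);    // would call RemoveExecutor on the driver
        break;
    }
    return Math.max(0, targetExecutors - running.size());
  }
}
```

Because the handler is the only writer of `running`, a real version could confine it to a single-threaded event loop and drop explicit locking entirely, which is the statelessness benefit the comment describes.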
[jira] [Commented] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471206#comment-16471206 ] Matt Cheah commented on SPARK-24248: [~foxish] [~liyinan926] curious as to what you think about this idea. > [K8S] Use the Kubernetes cluster as the backing store for the state of pods
[jira] [Created] (SPARK-24248) [K8S] Use the Kubernetes cluster as the backing store for the state of pods
Matt Cheah created SPARK-24248: -- Summary: [K8S] Use the Kubernetes cluster as the backing store for the state of pods Key: SPARK-24248 URL: https://issues.apache.org/jira/browse/SPARK-24248 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.3.0 Reporter: Matt Cheah We have a number of places in KubernetesClusterSchedulerBackend right now that maintain the state of pods in memory. However, the Kubernetes API can always give us the most up-to-date and correct view of what our executors are doing. We should consider moving away from in-memory state as much as we can in favor of using the Kubernetes cluster as the source of truth for pod status. Maintaining less state in memory lowers the chance that we accidentally miss updating one of these data structures and break the lifecycle of executors.
[jira] [Created] (SPARK-24247) [K8S] currentNodeToLocalTaskCount is unused in KubernetesClusterSchedulerBackend
Matt Cheah created SPARK-24247: -- Summary: [K8S] currentNodeToLocalTaskCount is unused in KubernetesClusterSchedulerBackend Key: SPARK-24247 URL: https://issues.apache.org/jira/browse/SPARK-24247 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.3.0 Reporter: Matt Cheah This variable isn't used: [https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L109] - we should either remove it or else be putting it to good use.
[jira] [Commented] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size
[ https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464467#comment-16464467 ] Matt Cheah commented on SPARK-24135: Put up the PR, see above - created a separate setting for this class of errors. > [K8s] Executors that fail to start up because of init-container errors are > not retried and limit the executor pool size > --- > > Key: SPARK-24135 > URL: https://issues.apache.org/jira/browse/SPARK-24135 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Matt Cheah >Priority: Major > > In KubernetesClusterSchedulerBackend, we detect if executors disconnect after > having been started or if executors hit the {{ERROR}} or {{DELETED}} states. > When executors fail in these ways, they are removed from the pending > executors pool and the driver should retry requesting these executors. > However, the driver does not handle a different class of error: when the pod > enters the {{Init:Error}} state. This state comes up when the executor fails > to launch because one of its init-containers fails. Spark itself doesn't > attach any init-containers to the executors. However, custom web hooks can > run on the cluster and attach init-containers to the executor pods. > Additionally, pod presets can specify init containers to run on these pods. > Therefore Spark should be handling the {{Init:Error}} cases regardless of whether > Spark itself is aware of init-containers or not. > This class of error is particularly bad because when we hit this state, the > failed executor will never start, but it's still seen as pending by the > executor allocator. The executor allocator won't request more rounds of > executors because its current batch hasn't been resolved to either running or > failed. Therefore we end up being stuck with the number of executors > that successfully started before the faulty one failed to start, potentially > creating a fake resource bottleneck.
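The classification gap described in SPARK-24135 can be sketched as follows. This is a hypothetical illustration, not the fabric8 pod model Spark reads from: `PodStateClassifier` and its string parameters are invented, but the shape mirrors the fix - a pod whose overall phase is still "Pending" must be treated as failed when an init container has already errored, so the allocator can resolve the pending slot and request a replacement.

```java
/**
 * Hypothetical sketch: classify a pod's phase plus its init container's
 * waiting reason, so that "Init:Error"-style failures count as startup
 * failures instead of leaving the executor pending forever.
 */
final class PodStateClassifier {
  enum State { PENDING, RUNNING, FAILED }

  static State classify(String phase, String initContainerWaitingReason) {
    if ("Running".equals(phase)) {
      return State.RUNNING;
    }
    if ("Failed".equals(phase)) {
      return State.FAILED;
    }
    // The pod is still "Pending" overall, but an init container has already
    // failed; without this branch the allocator never sees the failure and
    // never requests a replacement - the fake resource bottleneck above.
    if ("Error".equals(initContainerWaitingReason)
        || "CrashLoopBackOff".equals(initContainerWaitingReason)) {
      return State.FAILED;
    }
    return State.PENDING;
  }
}
```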
[jira] [Comment Edited] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size
[ https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462106#comment-16462106 ] Matt Cheah edited comment on SPARK-24135 at 5/3/18 8:35 AM: Not necessarily - if the pods fail to start up, we should retry them indefinitely as a replication controller or a deployment would. There's an argument that can be made that we should be using those higher level primitives to run executors instead of raw pods anyways, just that Spark's scheduler code would need non-trivial changes to do so right now. was (Author: mcheah): Not necessarily - if the pods fail to start up, we should retry them indefinitely as a replication controller or a deployment would. There's an argument that can be made that we should be using those lower level primitives to run executors instead of raw pods anyways, just that Spark's scheduler code would need non-trivial changes to do so right now. > [K8s] Executors that fail to start up because of init-container errors are > not retried and limit the executor pool size
[jira] [Comment Edited] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size
[ https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461188#comment-16461188 ] Matt Cheah edited comment on SPARK-24135 at 5/2/18 3:37 PM: {quote}Restarting seems like it would eventually be limited by the job failure limit that Spark already has. If pod startup failures are deterministic the job failure count will hit this limit and job will be killed that way. {quote} In the case of the executor failing to start at all, this wouldn't be caught by Spark's task failure count logic because you're never going to end up scheduling tasks on these executors that failed to start. was (Author: mcheah): > Restarting seems like it would eventually be limited by the job failure limit >that Spark already has. If pod startup failures are deterministic the job >failure count will hit this limit and job will be killed that way. In the case of the executor failing to start at all, this wouldn't be caught by Spark's task failure count logic because you're never going to end up scheduling tasks on these executors that failed to start. > [K8s] Executors that fail to start up because of init-container errors are > not retried and limit the executor pool size
[jira] [Comment Edited] (SPARK-24135) [K8s] Executors that fail to start up because of init-container errors are not retried and limit the executor pool size
[ https://issues.apache.org/jira/browse/SPARK-24135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460047#comment-16460047 ] Matt Cheah edited comment on SPARK-24135 at 5/2/18 3:37 PM: {quote}But I'm not sure how much this buys us because very likely the newly requested executors will fail to be initialized, {quote} That's entirely up to the behavior of the init container itself - there's many reasons for one to believe that a given init container's logic can be flaky. But it's not immediately obvious to me whether or not the init container's failure should count towards a job failure. Job failures shouldn't be caused by failures in the framework, and in this case, the framework has added the init-container for these pods - in other words the user's code didn't directly cause the job failure. was (Author: mcheah): _> But I'm not sure how much this buys us because very likely the newly requested executors will fail to be initialized,_ That's entirely up to the behavior of the init container itself - there's many reasons for one to believe that a given init container's logic can be flaky. But it's not immediately obvious to me whether or not the init container's failure should count towards a job failure. Job failures shouldn't be caused by failures in the framework, and in this case, the framework has added the init-container for these pods - in other words the user's code didn't directly cause the job failure. > [K8s] Executors that fail to start up because of init-container errors are > not retried and limit the executor pool size