[jira] [Resolved] (SPARK-49195) Embed script level parsing logic into SparkSubmitCommandBuilder
[ https://issues.apache.org/jira/browse/SPARK-49195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49195. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47702 [https://github.com/apache/spark/pull/47702] > Embed script level parsing logic into SparkSubmitCommandBuilder > --- > > Key: SPARK-49195 > URL: https://issues.apache.org/jira/browse/SPARK-49195 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Embed the logics in script to JVM, see > https://github.com/apache/spark/pull/47402 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49195) Embed script level parsing logic into SparkSubmitCommandBuilder
[ https://issues.apache.org/jira/browse/SPARK-49195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49195: - Assignee: Hyukjin Kwon > Embed script level parsing logic into SparkSubmitCommandBuilder > --- > > Key: SPARK-49195 > URL: https://issues.apache.org/jira/browse/SPARK-49195 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > Embed the logics in script to JVM, see > https://github.com/apache/spark/pull/47402 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49197) Redact `Spark Command` output in `launcher` module
[ https://issues.apache.org/jira/browse/SPARK-49197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49197. --- Fix Version/s: 3.4.4 4.0.0 3.5.3 Resolution: Fixed Issue resolved by pull request 47704 [https://github.com/apache/spark/pull/47704] > Redact `Spark Command` output in `launcher` module > -- > > Key: SPARK-49197 > URL: https://issues.apache.org/jira/browse/SPARK-49197 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0, 3.5.2, 3.4.3 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.4, 4.0.0, 3.5.3 > > > When `launcher` module shows `Spark Command`, there is no redaction. Although > Spark Cluster is supposed to be in a secure environment, this could be > collected by a centralized log system. We need to do a proper redaction. > {code} > $ SPARK_NO_DAEMONIZE=1 SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true > -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter > -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" > sbin/start-master.sh > starting org.apache.spark.deploy.master.Master, logging to > /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out > Spark Command: /Users/dongjoon/.jenv/versions/17/bin/java -cp > /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/conf/:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/slf4j-api-2.0.13.jar:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/* > -Dspark.master.rest.enabled=true > -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter > -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4= > -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 > --webui-port 8080 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
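For context, the kind of redaction being asked for here can be illustrated with a small sketch: mask the value of any `-D` property whose name looks secret-bearing before the command line is logged. The object and method names below are made up for illustration and are not the code from the linked pull request; the regex loosely follows the default `spark.redaction.regex` pattern used elsewhere in Spark.

{code:scala}
object CommandRedactor {
  // Property names that typically carry credentials. The pattern loosely follows the
  // default value of `spark.redaction.regex`; adjust as needed.
  private val SecretKeyPattern = "(?i)secret|password|token|access[.]key".r

  // Replace the value of any `-Dname=value` argument whose name matches the pattern.
  def redactCommand(cmd: Seq[String]): Seq[String] = cmd.map { arg =>
    if (arg.startsWith("-D") && arg.contains("=")) {
      val Array(key, _) = arg.split("=", 2)
      if (SecretKeyPattern.findFirstIn(key).isDefined) s"$key=*********(redacted)" else arg
    } else {
      arg
    }
  }
}

// The JWSFilter secret key from the report above would then be printed as
//   -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=*********(redacted)
{code}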
[jira] [Commented] (SPARK-38862) Let consumers provide their own method for Authentication for The REST Submission Server
[ https://issues.apache.org/jira/browse/SPARK-38862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17873007#comment-17873007 ] Dongjoon Hyun commented on SPARK-38862: --- According to the Apache Spark community guideline, `Target Versions` is removed again. - https://spark.apache.org/contributing.html {quote}Do not set the following fields: - Fix Version. This is assigned by committers only when resolved. - Target Version. This is assigned by committers to indicate a PR has been accepted for possible fix by the target version.{quote} > Let consumers provide their own method for Authentication for The REST > Submission Server > > > Key: SPARK-38862 > URL: https://issues.apache.org/jira/browse/SPARK-38862 > Project: Spark > Issue Type: New Feature > Components: Documentation, Spark Core, Spark Submit >Affects Versions: 3.4.0, 4.0.0 >Reporter: Jack >Priority: Major > Labels: authentication, pull-request-available, rest, spark, > spark-submit, submit > > [Spark documentation|https://spark.apache.org/docs/latest/security.html] > states that > ??The REST Submission Server and the MesosClusterDispatcher do not support > authentication. You should ensure that all network access to the REST API & > MesosClusterDispatcher (port 6066 and 7077 respectively by default) are > restricted to hosts that are trusted to submit jobs.?? > Whilst it is true that we can use network policies to restrict access to our > exposed submission endpoint, it would be preferable to at least also allow > some primitive form of authentication at a global level, whether this is by > some token provided to the runtime environment or is a "system user" using > basic authentication of a username/password combination - I am not strictly > opinionated and I think either would suffice. > Alternatively, one could implement a custom proxy to provide this > authentication check, but upon investigation this option is rejected by the > spark master as-is today. > I would imagine that whatever solution is agreed for a first phase, a custom > authenticator may be something we want a user to be able to provide so that > if an admin needed some more advanced authentication check, such as RBAC et > al, it could be facilitated without the need for writing a complete custom > proxy layer; although it could be argued just some basic built in layer being > available; eg. RestSubmissionBasicAuthenticator could be preferable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
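To make the request concrete, a pluggable check could be as small as a servlet filter that validates an `Authorization: Basic ...` header before a request reaches the submission servlet. The sketch below is purely illustrative: the class name and the `SPARK_MASTER_REST_AUTH` environment variable are invented for the example, and nothing like it ships with Spark today. (The `spark.master.rest.filters` configuration used with `JWSFilter` elsewhere in this digest does show that filter wiring for the REST endpoint exists.)

{code:scala}
import java.nio.charset.StandardCharsets
import java.util.Base64
import javax.servlet._
import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Illustrative only: checks an HTTP Basic credential taken from a (made-up)
// SPARK_MASTER_REST_AUTH environment variable of the form "user:password".
// Spark 4 builds against the jakarta.servlet packages, so the imports would change there.
class RestSubmissionBasicAuthFilter extends Filter {
  private var expected: Option[String] = None

  override def init(config: FilterConfig): Unit = {
    expected = sys.env.get("SPARK_MASTER_REST_AUTH").map { cred =>
      "Basic " + Base64.getEncoder.encodeToString(cred.getBytes(StandardCharsets.UTF_8))
    }
  }

  override def doFilter(req: ServletRequest, res: ServletResponse, chain: FilterChain): Unit = {
    val request = req.asInstanceOf[HttpServletRequest]
    val response = res.asInstanceOf[HttpServletResponse]
    val provided = Option(request.getHeader("Authorization"))
    // A real implementation should use a constant-time comparison here.
    if (expected.isEmpty || expected == provided) {
      chain.doFilter(req, res)  // no credential configured, or the credential matches
    } else {
      response.setHeader("WWW-Authenticate", "Basic realm=\"spark-rest-submission\"")
      response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "Authentication required")
    }
  }

  override def destroy(): Unit = {}
}
{code}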
[jira] [Updated] (SPARK-38862) Let consumers provide their own method for Authentication for The REST Submission Server
[ https://issues.apache.org/jira/browse/SPARK-38862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-38862: -- Target Version/s: (was: 4.0.0) > Let consumers provide their own method for Authentication for The REST > Submission Server > > > Key: SPARK-38862 > URL: https://issues.apache.org/jira/browse/SPARK-38862 > Project: Spark > Issue Type: New Feature > Components: Documentation, Spark Core, Spark Submit >Affects Versions: 3.4.0, 4.0.0 >Reporter: Jack >Priority: Major > Labels: authentication, pull-request-available, rest, spark, > spark-submit, submit > > [Spark documentation|https://spark.apache.org/docs/latest/security.html] > states that > ??The REST Submission Server and the MesosClusterDispatcher do not support > authentication. You should ensure that all network access to the REST API & > MesosClusterDispatcher (port 6066 and 7077 respectively by default) are > restricted to hosts that are trusted to submit jobs.?? > Whilst it is true that we can use network policies to restrict access to our > exposed submission endpoint, it would be preferable to at least also allow > some primitive form of authentication at a global level, whether this is by > some token provided to the runtime environment or is a "system user" using > basic authentication of a username/password combination - I am not strictly > opinionated and I think either would suffice. > Alternatively, one could implement a custom proxy to provide this > authentication check, but upon investigation this option is rejected by the > spark master as-is today. > I would imagine that whatever solution is agreed for a first phase, a custom > authenticator may be something we want a user to be able to provide so that > if an admin needed some more advanced authentication check, such as RBAC et > al, it could be facilitated without the need for writing a complete custom > proxy layer; although it could be argued just some basic built in layer being > available; eg. RestSubmissionBasicAuthenticator could be preferable. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: [VOTE] Archive Spark Documentations in Apache Archives
+1 for the proposals:
- enhancing the release process to put the docs into the `release` directory in order to archive them.
- uploading old releases to the archive manually via SVN.

Since deletion is not in the scope of this vote, I don't see any risk here.

Thank you, Kent.

Dongjoon.

On 2024/08/12 09:07:47 Kent Yao wrote:
> Archive Spark Documentations in Apache Archives
>
> Hi dev,
>
> To address the issue of the Spark website repository size
> reaching the storage limit for GitHub-hosted runners [1], I suggest
> enhancing step [2] in our release process by relocating the
> documentation releases from the dev[3] directory to the release
> directory[4]. They would then be captured by the Apache Archives
> service[5], creating permanent links that serve as alternative
> endpoints for our documentation, like
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/_site/index.html
> for
> https://spark.apache.org/docs/3.5.2/index.html
>
> Note that the previous example still uses the staging repository,
> which will become
> https://archive.apache.org/dist/spark/docs/3.5.2/index.html.
>
> For older releases hosted on the Spark website [6], we also need to
> upload them via SVN manually.
>
> After that, when we reach the threshold again, we can delete some of
> the old ones on page [6] and update their links on page [7] or use
> redirection.
>
> JIRA ticket: https://issues.apache.org/jira/browse/SPARK-49209
>
> Please vote on the idea of Archive Spark Documentations in
> Apache Archives for the next 72 hours:
>
> [ ] +1: Accept the proposal
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Bests,
> Kent Yao
>
> [1] https://lists.apache.org/thread/o0w4gqoks23xztdmjjj26jkp1yyg2bvq
> [2] https://spark.apache.org/release-process.html#upload-to-apache-release-directory
> [3] https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/
> [4] https://dist.apache.org/repos/dist/release/spark/docs/3.5.2
> [5] https://archive.apache.org/dist/spark/
> [6] https://github.com/apache/spark-website/tree/asf-site/site/docs
> [7] https://spark.apache.org/documentation.html
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
[jira] [Resolved] (SPARK-49196) Upgrade `kubernetes-client` to 6.13.2
[ https://issues.apache.org/jira/browse/SPARK-49196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49196. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47703 [https://github.com/apache/spark/pull/47703] > Upgrade `kubernetes-client` to 6.13.2 > - > > Key: SPARK-49196 > URL: https://issues.apache.org/jira/browse/SPARK-49196 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Trivial > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49206) Add `Environment Variables` table to Master `EnvironmentPage`
Dongjoon Hyun created SPARK-49206: - Summary: Add `Environment Variables` table to Master `EnvironmentPage` Key: SPARK-49206 URL: https://issues.apache.org/jira/browse/SPARK-49206 Project: Spark Issue Type: Sub-task Components: Spark Core, UI Affects Versions: 4.0.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49197) Redact `Spark Command` output in `launcher` module
[ https://issues.apache.org/jira/browse/SPARK-49197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49197: - Assignee: Dongjoon Hyun > Redact `Spark Command` output in `launcher` module > -- > > Key: SPARK-49197 > URL: https://issues.apache.org/jira/browse/SPARK-49197 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0, 3.5.2, 3.4.3 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Critical > Labels: pull-request-available > > When `launcher` module shows `Spark Command`, there is no redaction. Although > Spark Cluster is supposed to be in a secure environment, this could be > collected by a centralized log system. We need to do a proper redaction. > {code} > $ SPARK_NO_DAEMONIZE=1 SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true > -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter > -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" > sbin/start-master.sh > starting org.apache.spark.deploy.master.Master, logging to > /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out > Spark Command: /Users/dongjoon/.jenv/versions/17/bin/java -cp > /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/conf/:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/slf4j-api-2.0.13.jar:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/* > -Dspark.master.rest.enabled=true > -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter > -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4= > -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 > --webui-port 8080 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49197) Redact `Spark Command` output in `launcher` module
Dongjoon Hyun created SPARK-49197: - Summary: Redact `Spark Command` output in `launcher` module Key: SPARK-49197 URL: https://issues.apache.org/jira/browse/SPARK-49197 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 3.4.3, 4.0.0, 3.5.2 Reporter: Dongjoon Hyun When `launcher` module shows `Spark Command`, there is no redaction. Although Spark Cluster is supposed to be in a secure environment, this could be collected by a centralized log system. We need to do a proper redaction. {code} $ SPARK_NO_DAEMONIZE=1 SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" sbin/start-master.sh starting org.apache.spark.deploy.master.Master, logging to /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/logs/spark-dongjoon-org.apache.spark.deploy.master.Master-1-M3-Max.local.out Spark Command: /Users/dongjoon/.jenv/versions/17/bin/java -cp /Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/conf/:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/slf4j-api-2.0.13.jar:/Users/dongjoon/APACHE/spark-releases/spark-4.0.0-preview1-bin-hadoop3/jars/* -Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.secretKey=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4= -Xmx1g org.apache.spark.deploy.master.Master --host M3-Max.local --port 7077 --webui-port 8080 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: [VOTE] Release Apache ORC 2.0.2 (RC0)
+1 I tested the following as a release manager. - Checksum and sign. - Unit testing locally. - Integration tests with Apache Spark. Here is the release manager checklist. - https://github.com/apache/orc/issues/1992 Dongjoon On Sun, Aug 11, 2024 at 3:28 PM Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache ORC version > 2.0.2. This vote is open until August 15th 1AM (PST) and passes if a > majority +1 PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this package as Apache ORC 2.0.2 > [ ] -1 Do not release this package because ... > > TAG: > https://github.com/apache/orc/releases/tag/v2.0.2-rc0 > > RELEASE FILES: > https://dist.apache.org/repos/dist/dev/orc/v2.0.2-rc0 > > STAGING REPOSITORY: > https://repository.apache.org/content/repositories/orgapacheorc-1083/ > > LIST OF ISSUES: > https://issues.apache.org/jira/projects/ORC/versions/12354875 > https://github.com/apache/orc/milestone/32?closed=1 > > Thanks, > Dongjoon. >
[VOTE] Release Apache ORC 2.0.2 (RC0)
Please vote on releasing the following candidate as Apache ORC version 2.0.2. This vote is open until August 15th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache ORC 2.0.2 [ ] -1 Do not release this package because ... TAG: https://github.com/apache/orc/releases/tag/v2.0.2-rc0 RELEASE FILES: https://dist.apache.org/repos/dist/dev/orc/v2.0.2-rc0 STAGING REPOSITORY: https://repository.apache.org/content/repositories/orgapacheorc-1083/ LIST OF ISSUES: https://issues.apache.org/jira/projects/ORC/versions/12354875 https://github.com/apache/orc/milestone/32?closed=1 Thanks, Dongjoon.
[jira] [Assigned] (SPARK-49137) When the Boolean condition in the `if statement` is invalid, an exception should be thrown instead of returning false directly
[ https://issues.apache.org/jira/browse/SPARK-49137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49137: - Assignee: BingKun Pan > When the Boolean condition in the `if statement` is invalid, an exception > should be thrown instead of returning false directly > -- > > Key: SPARK-49137 > URL: https://issues.apache.org/jira/browse/SPARK-49137 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49137) When the Boolean condition in the `if statement` is invalid, an exception should be thrown instead of returning false directly
[ https://issues.apache.org/jira/browse/SPARK-49137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49137. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47648 [https://github.com/apache/spark/pull/47648] > When the Boolean condition in the `if statement` is invalid, an exception > should be thrown instead of returning false directly > -- > > Key: SPARK-49137 > URL: https://issues.apache.org/jira/browse/SPARK-49137 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49169) Upgrade `commons-compress` to 1.27.0
[ https://issues.apache.org/jira/browse/SPARK-49169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49169: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Improvement) > Upgrade `commons-compress` to 1.27.0 > > > Key: SPARK-49169 > URL: https://issues.apache.org/jira/browse/SPARK-49169 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49169) Upgrade `commons-compress` to 1.27.0
[ https://issues.apache.org/jira/browse/SPARK-49169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49169. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47674 [https://github.com/apache/spark/pull/47674] > Upgrade `commons-compress` to 1.27.0 > > > Key: SPARK-49169 > URL: https://issues.apache.org/jira/browse/SPARK-49169 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49171) Update Spark Shell with Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-49171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49171. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47676 [https://github.com/apache/spark/pull/47676] > Update Spark Shell with Spark Connect > - > > Key: SPARK-49171 > URL: https://issues.apache.org/jira/browse/SPARK-49171 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Documentation update by SPARK-48936 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49178) `Row#getSeq` exhibits a performance regression between master and Spark 3.5 with Scala 2.12
[ https://issues.apache.org/jira/browse/SPARK-49178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49178: -- Parent: SPARK-44111 Issue Type: Sub-task (was: Improvement) > `Row#getSeq` exhibits a performance regression between master and Spark 3.5 > with Scala 2.12 > --- > > Key: SPARK-49178 > URL: https://issues.apache.org/jira/browse/SPARK-49178 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code:java} > object GetSeqBenchmark extends SqlBasedBenchmark { > import spark.implicits._ > def testRowGetSeq(valuesPerIteration: Int, arraySize: Int): Unit = { > val data = (0 until arraySize).toArray > val row = Seq(data).toDF().collect().head > val benchmark = new Benchmark( > s"Test get seq with $arraySize from row", > valuesPerIteration, > output = output) > benchmark.addCase("Get Seq") { _: Int => > for (_ <- 0L until valuesPerIteration) { > val ret = row.getSeq(0) > } > } > benchmark.run() > } > override def runBenchmarkSuite(mainArgs: Array[String]): Unit = { > val valuesPerIteration = 10 > testRowGetSeq(valuesPerIteration, 10) > testRowGetSeq(valuesPerIteration, 100) > testRowGetSeq(valuesPerIteration, 1000) > testRowGetSeq(valuesPerIteration, 1) > testRowGetSeq(valuesPerIteration, 10) > } > } {code} > > branch-3.5 > {code:java} > OpenJDK 64-Bit Server VM 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 194.8 5.1 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 100 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1000 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 97.0 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.9 10.3 1.0X {code} > master > {code:java} > OpenJDK 64-Bit Server VM 17.0.12+7-LTS on Linux 6.5.0-1025-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 9
[jira] [Resolved] (SPARK-49178) `Row#getSeq` exhibits a performance regression between master and Spark 3.5 with Scala 2.12
[ https://issues.apache.org/jira/browse/SPARK-49178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49178. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47698 [https://github.com/apache/spark/pull/47698] > `Row#getSeq` exhibits a performance regression between master and Spark 3.5 > with Scala 2.12 > --- > > Key: SPARK-49178 > URL: https://issues.apache.org/jira/browse/SPARK-49178 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code:java} > object GetSeqBenchmark extends SqlBasedBenchmark { > import spark.implicits._ > def testRowGetSeq(valuesPerIteration: Int, arraySize: Int): Unit = { > val data = (0 until arraySize).toArray > val row = Seq(data).toDF().collect().head > val benchmark = new Benchmark( > s"Test get seq with $arraySize from row", > valuesPerIteration, > output = output) > benchmark.addCase("Get Seq") { _: Int => > for (_ <- 0L until valuesPerIteration) { > val ret = row.getSeq(0) > } > } > benchmark.run() > } > override def runBenchmarkSuite(mainArgs: Array[String]): Unit = { > val valuesPerIteration = 10 > testRowGetSeq(valuesPerIteration, 10) > testRowGetSeq(valuesPerIteration, 100) > testRowGetSeq(valuesPerIteration, 1000) > testRowGetSeq(valuesPerIteration, 1) > testRowGetSeq(valuesPerIteration, 10) > } > } {code} > > branch-3.5 > {code:java} > OpenJDK 64-Bit Server VM 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 194.8 5.1 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 100 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1000 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 97.0 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.9 10.3 1.0X {code} > master > {code:java} > OpenJDK 64-Bit Server VM 17.0.12+7-LTS on Linux 6.5.0-1025-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq
[jira] [Assigned] (SPARK-49178) `Row#getSeq` exhibits a performance regression between master and Spark 3.5 with Scala 2.12
[ https://issues.apache.org/jira/browse/SPARK-49178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49178: - Assignee: Yang Jie > `Row#getSeq` exhibits a performance regression between master and Spark 3.5 > with Scala 2.12 > --- > > Key: SPARK-49178 > URL: https://issues.apache.org/jira/browse/SPARK-49178 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > > {code:java} > object GetSeqBenchmark extends SqlBasedBenchmark { > import spark.implicits._ > def testRowGetSeq(valuesPerIteration: Int, arraySize: Int): Unit = { > val data = (0 until arraySize).toArray > val row = Seq(data).toDF().collect().head > val benchmark = new Benchmark( > s"Test get seq with $arraySize from row", > valuesPerIteration, > output = output) > benchmark.addCase("Get Seq") { _: Int => > for (_ <- 0L until valuesPerIteration) { > val ret = row.getSeq(0) > } > } > benchmark.run() > } > override def runBenchmarkSuite(mainArgs: Array[String]): Unit = { > val valuesPerIteration = 10 > testRowGetSeq(valuesPerIteration, 10) > testRowGetSeq(valuesPerIteration, 100) > testRowGetSeq(valuesPerIteration, 1000) > testRowGetSeq(valuesPerIteration, 1) > testRowGetSeq(valuesPerIteration, 10) > } > } {code} > > branch-3.5 > {code:java} > OpenJDK 64-Bit Server VM 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 194.8 5.1 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 100 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1000 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 97.0 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 1 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.8 10.3 1.0XOpenJDK 64-Bit Server VM > 1.8.0_422-b05 on Linux 5.15.0-1068-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 1 1 > 0 96.9 10.3 1.0X {code} > master > {code:java} > OpenJDK 64-Bit Server VM 17.0.12+7-LTS on Linux 6.5.0-1025-azure > AMD EPYC 7763 64-Core Processor > Test get seq with 10 from row: Best Time(ms) Avg Time(ms) > Stdev(ms) Rate(M/s) Per Row(ns) Relative > > Get Seq 9
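A common shape for this kind of regression after `Seq` became `immutable.Seq` in Scala 2.13 is an accidental array copy on every accessor call instead of a cheap wrap. The snippet below only illustrates that wrap-versus-copy distinction; it is not the change made in the linked pull request.

{code:scala}
import scala.collection.immutable.ArraySeq

val data: Array[Int] = Array.tabulate(100000)(identity)

// Copies every element into a new immutable collection: O(n) work on each call,
// which is what turns a hot accessor such as Row#getSeq into a bottleneck.
val copied: Seq[Int] = data.toIndexedSeq

// Wraps the existing array without copying: O(1), at the cost of sharing the backing array.
val wrapped: Seq[Int] = ArraySeq.unsafeWrapArray(data)
{code}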
[jira] [Updated] (SPARK-49176) Fix `spark.ui.custom.executor.log.url` docs by adding K8s
[ https://issues.apache.org/jira/browse/SPARK-49176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49176: -- Summary: Fix `spark.ui.custom.executor.log.url` docs by adding K8s (was: Fix `spark.ui.custom.executor.log.url` documentation by adding K8s) > Fix `spark.ui.custom.executor.log.url` docs by adding K8s > - > > Key: SPARK-49176 > URL: https://issues.apache.org/jira/browse/SPARK-49176 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49149) Support customized log url for Spark UI and History server in Kubernetes environment
[ https://issues.apache.org/jira/browse/SPARK-49149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872194#comment-17872194 ]

Dongjoon Hyun commented on SPARK-49149:
---

Have you tried the following in K8s? I verified that `spark.ui.custom.executor.log.url` works well, like the following. In addition, SPARK-44214 added a live `Driver Log` UI in Apache Spark 4.0.0.

{code}
bin/spark-submit \
  --master k8s://$K8S_MASTER \
  --deploy-mode cluster \
  -c spark.executor.instances=10 \
  -c spark.driver.log.localDir=/tmp \
  -c spark.ui.custom.executor.log.url='https://your-server/log?appId={{APP_ID}}&execId={{EXECUTOR_ID}}' \
  -c spark.kubernetes.driver.master=$K8S_MASTER \
  -c spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_APP_ID='$(SPARK_APPLICATION_ID)' \
  -c spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_EXECUTOR_ID='$(SPARK_EXECUTOR_ID)' \
  ...
{code}

> Support customized log url for Spark UI and History server in Kubernetes environment
>
> Key: SPARK-49149
> URL: https://issues.apache.org/jira/browse/SPARK-49149
> Project: Spark
> Issue Type: Improvement
> Components: UI
> Affects Versions: 4.0.0
> Reporter: Yichuan Huang
> Priority: Major
>
> Spark provides two configs to override the log url on the live UI and history server,
> `{{{}spark.ui.custom.executor.log.url{}}}` and `{{{}spark.history.custom.executor.log.url{}}}`.
> The configs support path variables which are replaced at runtime, but currently this only
> works on Yarn. Running on k8s with the path variables doesn't work. Here's an example:
> {code:java}
> ./bin/spark-shell --conf spark.ui.custom.executor.log.url="URL_PREFIX?appId=APP_ID&execId=EXECUTOR_ID"{code}
> The log column doesn't show up in the Spark UI, and the Spark driver prints
> {code:java}
> 24/08/07 17:23:45 INFO ExecutorLogUrlHandler: Fail to renew executor log urls: some of required
> attributes are missing in app's event log.. Required: Set(APP_ID, EXECUTOR_ID) / available: Set().
> Falling back to show app's original log urls.{code}
> The empty attribute set `{{{}Set(){}}}` in the log above indicates the feature is not supported
> in the k8s environment, hence this ticket to add the support.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
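The `{{APP_ID}}` and `{{EXECUTOR_ID}}` tokens in the URL are template variables substituted from executor attributes at runtime, which is why the K8s workaround above exports them through `spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_*`. Below is a minimal Scala sketch of that substitution with an invented helper name; the real logic lives in `ExecutorLogUrlHandler`, the class emitting the warning quoted in the ticket.

{code:scala}
import scala.util.matching.Regex

// Hypothetical helper mirroring how {{VAR}} placeholders are filled from executor attributes.
def renewLogUrl(template: String, attributes: Map[String, String]): Option[String] = {
  val placeholder: Regex = "\\{\\{([A-Za-z_]+)\\}\\}".r
  val required = placeholder.findAllMatchIn(template).map(_.group(1)).toSet
  if (required.subsetOf(attributes.keySet)) {
    Some(placeholder.replaceAllIn(template, m => Regex.quoteReplacement(attributes(m.group(1)))))
  } else {
    None  // "some of required attributes are missing", as in the warning quoted in the ticket
  }
}

renewLogUrl(
  "https://your-server/log?appId={{APP_ID}}&execId={{EXECUTOR_ID}}",
  Map("APP_ID" -> "spark-app-123", "EXECUTOR_ID" -> "7"))
// => Some(https://your-server/log?appId=spark-app-123&execId=7)
{code}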
[jira] [Resolved] (ORC-1758) Use `OpenContainers` Annotations in docker images
[ https://issues.apache.org/jira/browse/ORC-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1758. Fix Version/s: 2.1.0 Resolution: Fixed This is resolved via https://github.com/apache/orc/pull/2002 > Use `OpenContainers` Annotations in docker images > - > > Key: ORC-1758 > URL: https://issues.apache.org/jira/browse/ORC-1758 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (SPARK-49149) Support customized log url for Spark UI and History server in Kubernetes environment
[ https://issues.apache.org/jira/browse/SPARK-49149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49149: -- Fix Version/s: (was: 4.0.0) (was: 3.5.3) > Support customized log url for Spark UI and History server in Kubernetes > environment > > > Key: SPARK-49149 > URL: https://issues.apache.org/jira/browse/SPARK-49149 > Project: Spark > Issue Type: Improvement > Components: UI >Affects Versions: 4.0.0 >Reporter: Yichuan Huang >Priority: Major > > Spark provides two configs to alternate the log url on the live UI and > history server, `{{{}spark.ui.custom.executor.log.url{}}}` and > `{{{}spark.history.custom.executor.log.url{}}}`. The configs support path > variables which will be replaced at runtime, but currently this only works on > Yarn. Running on k8s with the path variable doesn't work. Here's an example > {code:java} > ./bin/spark-shell --conf > spark.ui.custom.executor.log.url="URL_PREFIX?appId=APP_ID&execId=EXECUTOR_ID"{code} > The log column doesn't show up in the Spark UI, and Spark driver printing > {code:java} > 24/08/07 17:23:45 INFO ExecutorLogUrlHandler: Fail to renew executor log > urls: some of required attributes are missing in app's event log.. Required: > Set(APP_ID, EXECUTOR_ID) / available: Set(). Falling back to s how app's > original log urls.{code} > the empty attribute `{{{}Set(){}}}` in the above log segment indicates the > feature not supported in k8s environment, thus creating this ticket to add > the support -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49149) Support customized log url for Spark UI and History server in Kubernetes environment
[ https://issues.apache.org/jira/browse/SPARK-49149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49149: -- Affects Version/s: 4.0.0 (was: 3.4.0) (was: 3.4.3) > Support customized log url for Spark UI and History server in Kubernetes > environment > > > Key: SPARK-49149 > URL: https://issues.apache.org/jira/browse/SPARK-49149 > Project: Spark > Issue Type: Improvement > Components: UI >Affects Versions: 4.0.0 >Reporter: Yichuan Huang >Priority: Major > Fix For: 4.0.0, 3.5.3 > > > Spark provides two configs to alternate the log url on the live UI and > history server, `{{{}spark.ui.custom.executor.log.url{}}}` and > `{{{}spark.history.custom.executor.log.url{}}}`. The configs support path > variables which will be replaced at runtime, but currently this only works on > Yarn. Running on k8s with the path variable doesn't work. Here's an example > {code:java} > ./bin/spark-shell --conf > spark.ui.custom.executor.log.url="URL_PREFIX?appId=APP_ID&execId=EXECUTOR_ID"{code} > The log column doesn't show up in the Spark UI, and Spark driver printing > {code:java} > 24/08/07 17:23:45 INFO ExecutorLogUrlHandler: Fail to renew executor log > urls: some of required attributes are missing in app's event log.. Required: > Set(APP_ID, EXECUTOR_ID) / available: Set(). Falling back to s how app's > original log urls.{code} > the empty attribute `{{{}Set(){}}}` in the above log segment indicates the > feature not supported in k8s environment, thus creating this ticket to add > the support -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-49149) Support customized log url for Spark UI and History server in Kubernetes environment
[ https://issues.apache.org/jira/browse/SPARK-49149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872156#comment-17872156 ] Dongjoon Hyun commented on SPARK-49149: --- Thank you for filing an issue. According to the community guidelines, - https://spark.apache.org/contributing.html We need to change `Affects Versions` to `4.0.0` because this is an improvement. In addition, `Fix Versions` should be empty until the committers merge a patch. Let me revise the fields for you, [~hycsam]. > Support customized log url for Spark UI and History server in Kubernetes > environment > > > Key: SPARK-49149 > URL: https://issues.apache.org/jira/browse/SPARK-49149 > Project: Spark > Issue Type: Improvement > Components: UI >Affects Versions: 3.4.0, 3.4.3 >Reporter: Yichuan Huang >Priority: Major > Fix For: 4.0.0, 3.5.3 > > > Spark provides two configs to alternate the log url on the live UI and > history server, `{{{}spark.ui.custom.executor.log.url{}}}` and > `{{{}spark.history.custom.executor.log.url{}}}`. The configs support path > variables which will be replaced at runtime, but currently this only works on > Yarn. Running on k8s with the path variable doesn't work. Here's an example > {code:java} > ./bin/spark-shell --conf > spark.ui.custom.executor.log.url="URL_PREFIX?appId=APP_ID&execId=EXECUTOR_ID"{code} > The log column doesn't show up in the Spark UI, and Spark driver printing > {code:java} > 24/08/07 17:23:45 INFO ExecutorLogUrlHandler: Fail to renew executor log > urls: some of required attributes are missing in app's event log.. Required: > Set(APP_ID, EXECUTOR_ID) / available: Set(). Falling back to s how app's > original log urls.{code} > the empty attribute `{{{}Set(){}}}` in the above log segment indicates the > feature not supported in k8s environment, thus creating this ticket to add > the support -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49168) Add `OpenContainers` Annotations to docker image
[ https://issues.apache.org/jira/browse/SPARK-49168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49168. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 43 [https://github.com/apache/spark-kubernetes-operator/pull/43] > Add `OpenContainers` Annotations to docker image > > > Key: SPARK-49168 > URL: https://issues.apache.org/jira/browse/SPARK-49168 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > > {code:java} > LABEL org.opencontainers.image.authors="Apache Spark project > " > LABEL org.opencontainers.image.licenses="Apache-2.0" > LABEL org.opencontainers.image.ref.name="Apache Spark Kubernetes Operator" > LABEL org.opencontainers.image.version="${APP_VERSION}" {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49168) Add `OpenContainers` Annotations to docker image
[ https://issues.apache.org/jira/browse/SPARK-49168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49168: - Assignee: Dongjoon Hyun > Add `OpenContainers` Annotations to docker image > > > Key: SPARK-49168 > URL: https://issues.apache.org/jira/browse/SPARK-49168 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > > {code:java} > LABEL org.opencontainers.image.authors="Apache Spark project > " > LABEL org.opencontainers.image.licenses="Apache-2.0" > LABEL org.opencontainers.image.ref.name="Apache Spark Kubernetes Operator" > LABEL org.opencontainers.image.version="${APP_VERSION}" {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (ORC-1758) Use `OpenContainers` Annotations in docker images
[ https://issues.apache.org/jira/browse/ORC-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1758: -- Assignee: Dongjoon Hyun > Use `OpenContainers` Annotations in docker images > - > > Key: ORC-1758 > URL: https://issues.apache.org/jira/browse/ORC-1758 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ORC-1758) Use `OpenContainers` Annotations in docker images
[ https://issues.apache.org/jira/browse/ORC-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated ORC-1758: --- Affects Version/s: 2.1.0 (was: 2.0.1) > Use `OpenContainers` Annotations in docker images > - > > Key: ORC-1758 > URL: https://issues.apache.org/jira/browse/ORC-1758 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ORC-1758) Use `OpenContainers` Annotations in docker images
Dongjoon Hyun created ORC-1758: -- Summary: Use `OpenContainers` Annotations in docker images Key: ORC-1758 URL: https://issues.apache.org/jira/browse/ORC-1758 Project: ORC Issue Type: Task Components: Infra Affects Versions: 2.0.1 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (SPARK-49168) Add `OpenContainers` Annotations to docker image
Dongjoon Hyun created SPARK-49168: - Summary: Add `OpenContainers` Annotations to docker image Key: SPARK-49168 URL: https://issues.apache.org/jira/browse/SPARK-49168 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49168) Add `OpenContainers` Annotations to docker image
[ https://issues.apache.org/jira/browse/SPARK-49168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49168: -- Description: {code:java} LABEL org.opencontainers.image.authors="Apache Spark project " LABEL org.opencontainers.image.licenses="Apache-2.0" LABEL org.opencontainers.image.ref.name="Apache Spark Kubernetes Operator" LABEL org.opencontainers.image.version="${APP_VERSION}" {code} > Add `OpenContainers` Annotations to docker image > > > Key: SPARK-49168 > URL: https://issues.apache.org/jira/browse/SPARK-49168 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Dongjoon Hyun >Priority: Major > > {code:java} > LABEL org.opencontainers.image.authors="Apache Spark project > " > LABEL org.opencontainers.image.licenses="Apache-2.0" > LABEL org.opencontainers.image.ref.name="Apache Spark Kubernetes Operator" > LABEL org.opencontainers.image.version="${APP_VERSION}" {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49167) Enforce UseUtilityClass rule
[ https://issues.apache.org/jira/browse/SPARK-49167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49167. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 42 [https://github.com/apache/spark-kubernetes-operator/pull/42] > Enforce UseUtilityClass rule > > > Key: SPARK-49167 > URL: https://issues.apache.org/jira/browse/SPARK-49167 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49167) Enforce UseUtilityClass rule
[ https://issues.apache.org/jira/browse/SPARK-49167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49167: - Assignee: William Hyun > Enforce UseUtilityClass rule > > > Key: SPARK-49167 > URL: https://issues.apache.org/jira/browse/SPARK-49167 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (ORC-1757) Bump `slf4j` to 2.0.14
[ https://issues.apache.org/jira/browse/ORC-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1757. Fix Version/s: 2.1.0 Resolution: Fixed Issue resolved by pull request 2000 [https://github.com/apache/orc/pull/2000] > Bump `slf4j` to 2.0.14 > -- > > Key: ORC-1757 > URL: https://issues.apache.org/jira/browse/ORC-1757 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ORC-1757) Bump `slf4j` to 2.0.14
[ https://issues.apache.org/jira/browse/ORC-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1757: -- Assignee: Dongjoon Hyun > Bump `slf4j` to 2.0.14 > -- > > Key: ORC-1757 > URL: https://issues.apache.org/jira/browse/ORC-1757 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ORC-1757) Bump `slf4j` to 2.0.14
Dongjoon Hyun created ORC-1757: -- Summary: Bump `slf4j` to 2.0.14 Key: ORC-1757 URL: https://issues.apache.org/jira/browse/ORC-1757 Project: ORC Issue Type: Bug Components: Java Affects Versions: 2.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ORC-1756) Bump `snappy-java` to 1.1.10.6 in `bench` module
[ https://issues.apache.org/jira/browse/ORC-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1756: -- Assignee: Dongjoon Hyun > Bump `snappy-java` to 1.1.10.6 in `bench` module > > > Key: ORC-1756 > URL: https://issues.apache.org/jira/browse/ORC-1756 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ORC-1756) Bump `snappy-java` to 1.1.10.6 in `bench` module
[ https://issues.apache.org/jira/browse/ORC-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1756. Fix Version/s: 2.1.0 Resolution: Fixed Issue resolved by pull request 2001 [https://github.com/apache/orc/pull/2001] > Bump `snappy-java` to 1.1.10.6 in `bench` module > > > Key: ORC-1756 > URL: https://issues.apache.org/jira/browse/ORC-1756 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ORC-1756) Bump `snappy-java` to 1.1.10.6 in `bench` module
Dongjoon Hyun created ORC-1756: -- Summary: Bump `snappy-java` to 1.1.10.6 in `bench` module Key: ORC-1756 URL: https://issues.apache.org/jira/browse/ORC-1756 Project: ORC Issue Type: Bug Components: Java Affects Versions: 2.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ORC-1755) Bump `commons-lang3` to 3.16.0
[ https://issues.apache.org/jira/browse/ORC-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1755: -- Assignee: Dongjoon Hyun > Bump `commons-lang3` to 3.16.0 > -- > > Key: ORC-1755 > URL: https://issues.apache.org/jira/browse/ORC-1755 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ORC-1755) Bump `commons-lang3` to 3.16.0
[ https://issues.apache.org/jira/browse/ORC-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1755. Fix Version/s: 2.1.0 Resolution: Fixed Issue resolved by pull request 1999 [https://github.com/apache/orc/pull/1999] > Bump `commons-lang3` to 3.16.0 > -- > > Key: ORC-1755 > URL: https://issues.apache.org/jira/browse/ORC-1755 > Project: ORC > Issue Type: Bug > Components: Java >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ORC-1755) Bump `commons-lang3` to 3.16.0
Dongjoon Hyun created ORC-1755: -- Summary: Bump `commons-lang3` to 3.16.0 Key: ORC-1755 URL: https://issues.apache.org/jira/browse/ORC-1755 Project: ORC Issue Type: Bug Components: Java Affects Versions: 2.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: Spark website repo size hits the storage limit of GitHub-hosted runners
Ya, I agree that we need to investigate what happened at PySpark 3.5+ docs. For old Spark docs, it seems to be negligible. - All Spark 0.x docs: 231M - All Spark 1.x docs: 1.3G - All Spark 2.x docs: 3.4G For example, the total size of above all old Spark docs is less than the following 4 releases docs. 1.1G ./3.5.0 1.2G ./3.5.1 1.2G ./3.5.2 RC2 1.1G ./4.0.0-preview1 So, if we do start something, we had better focus on the latest doc first in the reverse order. Dongjoon On Thu, Aug 8, 2024 at 11:22 AM Sean Owen wrote: > Whoa! Is there any clear reason why 3.5 docs are so big? 1GB of docs / 10x > jump seems crazy. Maybe we need to investigate and fix that also. > > I take it that the problem is the size of the repo once it's cloned into > the docker container. Removing the .html files helps that, but, then we > don't have .html docs in the published site! > We can generate them in the build process, but I presume it's waaay too > long to rebuild docs for every release every time. > > I do support at *least* tarring up old .html docs from old releases > (<3.0?) and making them available somehow on the site, so that they're > accessible if needed. > > Analytics says that page views for docs before 3.1 are quite minimal, > probably hundreds of views this year at best vs 10M total views: > > https://analytics.apache.org/index.php?module=CoreHome&action=index&date=yesterday&period=day&idSite=40#?idSite=40&period=year&date=2024-08-07&category=General_Actions&subcategory=General_Pages > > On Thu, Aug 8, 2024 at 12:42 PM Dongjoon Hyun > wrote: > >> The culprit seems to be PySpark 3.5 documentation which grows 11x times >> at 3.5+ >> >> $ du -h 3.4.3/api/python | tail -n1 >> 84M 3.4.3/api/python >> >> $ du -h 3.5.1/api/python | tail -n1 >> 943M 3.5.1/api/python >> >> Since we will generate big documents for 3.5.x, 4.0.0-preview, 4.0.x, >> 4.1.x, the proposed tarball idea sounds promising to me too. >> >> $ ls -alh 3.5.1.tgz >> -rw-r--r-- 1 dongjoon staff 103M Aug 8 10:22 3.5.1.tgz >> >> Specifically, shall we keep HTML files for only the latest version of >> live releases, e.g. 3.4.3, 3.5.1, and 4.0.0-preview1? >> >> In other words, all 0.x ~ 3.4.2 and 3.5.1 will be tarball files in the >> current status. >> >> Dongjoon. >> >> >> On Thu, Aug 8, 2024 at 10:01 AM Sean Owen wrote: >> >>> I agree with 'archiving', but what does that mean? delete from the repo >>> and site? >>> While I really doubt people are looking for docs for, say, 0.5.0, it'd >>> be a big jump to totally remove it. >>> >>> What if we made a compressed tarball of old docs and put that in the >>> repo, linked to it, and removed the docs files for many old releases? >>> It's still in the repo and will be in the container when docs are built, >>> but, compressed would be much smaller. >>> That could buy a significant amount of time. >>> >>> On Thu, Aug 8, 2024 at 7:06 AM Kent Yao wrote: >>> >>>> Hi dev, >>>> >>>> The current size of the spark-website repository is approximately 16GB, >>>> exceeding the storage limit of GitHub-hosted runners. The GitHub >>>> actions >>>> have been failing recently in the actions/checkout step caused by >>>> 'No space left on device' errors. 
>>>> >>>> Filesystem Size Used Avail Use% Mounted on >>>> overlay 73G 58G 16G 80% / >>>> tmpfs64M 0 64M 0% /dev >>>> tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup >>>> shm 64M 0 64M 0% /dev/shm >>>> /dev/root73G 58G 16G 80% /__w >>>> tmpfs 1.6G 1.2M 1.6G 1% /run/docker.sock >>>> tmpfs 7.9G 0 7.9G 0% /proc/acpi >>>> tmpfs 7.9G 0 7.9G 0% /proc/scsi >>>> tmpfs 7.9G 0 7.9G 0% /sys/firmware >>>> >>>> >>>> The documentation for each version contributes the most volume. Since >>>> version >>>> 3.5.0, the documentation size has grown 3-4 times larger than the >>>> size of 3.4.x, >>>> with more than 1GB. >>>> >>>> >>>> 9.9M ./0.6.0 >>>> 10M ./0.6.1 >>>> 10M ./0.6.2 >>>> 15M ./0.7.0 >>>> 16M ./0.7.2 >>>> 16M ./0.7.3 >>>> 20M ./0.8.0 >>>> 20M ./0.8.1 >>>> 38M ./0.9.0 >>>>
[jira] [Resolved] (SPARK-49165) Fix RestartPolicyTest to cover `SchedulingFailure`
[ https://issues.apache.org/jira/browse/SPARK-49165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49165. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 41 [https://github.com/apache/spark-kubernetes-operator/pull/41] > Fix RestartPolicyTest to cover `SchedulingFailure` > -- > > Key: SPARK-49165 > URL: https://issues.apache.org/jira/browse/SPARK-49165 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Tests >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: [VOTE] Release Spark 3.5.2 (RC5)
+1 I'm resending my vote. Dongjoon. On 2024/08/06 16:06:00 Kent Yao wrote: > Hi dev, > > Please vote on releasing the following candidate as Apache Spark version > 3.5.2. > > The vote is open until Aug 9, 17:00:00 UTC, and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this package as Apache Spark 3.5.2 > [ ] -1 Do not release this package because ... > > To learn more about Apache Spark, please see https://spark.apache.org/ > > The tag to be voted on is v3.5.2-rc5 (commit > bb7846dd487f259994fdc69e18e03382e3f64f42): > https://github.com/apache/spark/tree/v3.5.2-rc5 > > The release files, including signatures, digests, etc. can be found at: > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/ > > Signatures used for Spark RCs can be found in this file: > https://dist.apache.org/repos/dist/dev/spark/KEYS > > The staging repository for this release can be found at: > https://repository.apache.org/content/repositories/orgapachespark-1462/ > > The documentation corresponding to this release can be found at: > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/ > > The list of bug fixes going into 3.5.2 can be found at the following URL: > https://issues.apache.org/jira/projects/SPARK/versions/12353980 > > FAQ > > = > How can I help test this release? > = > > If you are a Spark user, you can help us test this release by taking > an existing Spark workload and running on this release candidate, then > reporting any regressions. > > If you're working in PySpark you can set up a virtual env and install > the current RC via "pip install > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/pyspark-3.5.2.tar.gz"; > and see if anything important breaks. > In the Java/Scala, you can add the staging repository to your projects > resolvers and test > with the RC (make sure to clean up the artifact cache before/after so > you don't end up building with an out of date RC going forward). > > === > What should happen to JIRA tickets still targeting 3.5.2? > === > > The current list of open tickets targeted at 3.5.2 can be found at: > https://issues.apache.org/jira/projects/SPARK and search for > "Target Version/s" = 3.5.2 > > Committers should look at those and triage. Extremely important bug > fixes, documentation, and API tweaks that impact compatibility should > be worked on immediately. Everything else please retarget to an > appropriate release. > > == > But my bug isn't fixed? > == > > In order to make timely releases, we will typically not hold the > release unless the bug in question is a regression from the previous > release. That being said, if there is something which is a regression > that has not been correctly targeted please ping me or a committer to > help target the issue. > > Thanks, > Kent Yao > > - > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org > > - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
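For the PySpark testing step described in the vote email above, a quick local smoke test can look roughly like the following. This is only a sketch, assuming a local JDK and a fresh virtual environment; the tarball URL is the RC artifact quoted above, and the trivial query is purely illustrative rather than part of any official checklist.

python3 -m venv /tmp/spark-3.5.2-rc5 && source /tmp/spark-3.5.2-rc5/bin/activate
pip install "https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/pyspark-3.5.2.tar.gz"
# run a tiny local job and check that nothing obviously breaks
python -c "from pyspark.sql import SparkSession; spark = SparkSession.builder.master('local[2]').appName('rc-smoke').getOrCreate(); print(spark.range(100).selectExpr('sum(id) AS total').collect()); spark.stop()"

Any regression found this way should be reported on the vote thread before the deadline.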
Re: [VOTE] Release Spark 3.5.2 (RC5)
Hi, Kent and all. It seems that the vote replies are not archived in the mailing list for some reasons. https://lists.apache.org/list.html?dev@spark.apache.org https://lists.apache.org/thread/chos58kswjg3x9cotp5rn0oc7hnfc6o4 Dongjoon/ On Wed, Aug 7, 2024 at 1:44 PM John Zhuge wrote: > +1 (non-binding) > > Thanks for the great work! > > On Wed, Aug 7, 2024 at 8:55 AM L. C. Hsieh wrote: > >> +1 >> >> Thanks Kent. >> >> On Wed, Aug 7, 2024 at 8:31 AM Dongjoon Hyun wrote: >> > >> > +1 >> > >> > Thank you, Kent. >> > >> > Dongjoon. >> > >> > On 2024/08/06 16:06:00 Kent Yao wrote: >> > > Hi dev, >> > > >> > > Please vote on releasing the following candidate as Apache Spark >> version 3.5.2. >> > > >> > > The vote is open until Aug 9, 17:00:00 UTC, and passes if a majority >> +1 >> > > PMC votes are cast, with a minimum of 3 +1 votes. >> > > >> > > [ ] +1 Release this package as Apache Spark 3.5.2 >> > > [ ] -1 Do not release this package because ... >> > > >> > > To learn more about Apache Spark, please see >> https://spark.apache.org/ >> > > >> > > The tag to be voted on is v3.5.2-rc5 (commit >> > > bb7846dd487f259994fdc69e18e03382e3f64f42): >> > > https://github.com/apache/spark/tree/v3.5.2-rc5 >> > > >> > > The release files, including signatures, digests, etc. can be found >> at: >> > > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/ >> > > >> > > Signatures used for Spark RCs can be found in this file: >> > > https://dist.apache.org/repos/dist/dev/spark/KEYS >> > > >> > > The staging repository for this release can be found at: >> > > >> https://repository.apache.org/content/repositories/orgapachespark-1462/ >> > > >> > > The documentation corresponding to this release can be found at: >> > > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/ >> > > >> > > The list of bug fixes going into 3.5.2 can be found at the following >> URL: >> > > https://issues.apache.org/jira/projects/SPARK/versions/12353980 >> > > >> > > FAQ >> > > >> > > = >> > > How can I help test this release? >> > > = >> > > >> > > If you are a Spark user, you can help us test this release by taking >> > > an existing Spark workload and running on this release candidate, then >> > > reporting any regressions. >> > > >> > > If you're working in PySpark you can set up a virtual env and install >> > > the current RC via "pip install >> > > >> https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/pyspark-3.5.2.tar.gz >> " >> > > and see if anything important breaks. >> > > In the Java/Scala, you can add the staging repository to your projects >> > > resolvers and test >> > > with the RC (make sure to clean up the artifact cache before/after so >> > > you don't end up building with an out of date RC going forward). >> > > >> > > === >> > > What should happen to JIRA tickets still targeting 3.5.2? >> > > === >> > > >> > > The current list of open tickets targeted at 3.5.2 can be found at: >> > > https://issues.apache.org/jira/projects/SPARK and search for >> > > "Target Version/s" = 3.5.2 >> > > >> > > Committers should look at those and triage. Extremely important bug >> > > fixes, documentation, and API tweaks that impact compatibility should >> > > be worked on immediately. Everything else please retarget to an >> > > appropriate release. >> > > >> > > == >> > > But my bug isn't fixed? >> > > == >> > > >> > > In order to make timely releases, we will typically not hold the >> > > release unless the bug in question is a regression from the previous >> > > release. 
That being said, if there is something which is a regression >> > > that has not been correctly targeted please ping me or a committer to >> > > help target the issue. >> > > >> > > Thanks, >> > > Kent Yao >> > > >> > > - >> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >> > > >> > > >> > >> > - >> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >> > >> >> - >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >> >> > > -- > John Zhuge >
[jira] [Assigned] (SPARK-49165) Fix RestartPolicyTest to cover `SchedulingFailure`
[ https://issues.apache.org/jira/browse/SPARK-49165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49165: - Assignee: Dongjoon Hyun > Fix RestartPolicyTest to cover `SchedulingFailure` > -- > > Key: SPARK-49165 > URL: https://issues.apache.org/jira/browse/SPARK-49165 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Tests >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: Spark website repo size hits the storage limit of GitHub-hosted runners
The culprit seems to be PySpark 3.5 documentation which grows 11x times at 3.5+ $ du -h 3.4.3/api/python | tail -n1 84M 3.4.3/api/python $ du -h 3.5.1/api/python | tail -n1 943M 3.5.1/api/python Since we will generate big documents for 3.5.x, 4.0.0-preview, 4.0.x, 4.1.x, the proposed tarball idea sounds promising to me too. $ ls -alh 3.5.1.tgz -rw-r--r-- 1 dongjoon staff 103M Aug 8 10:22 3.5.1.tgz Specifically, shall we keep HTML files for only the latest version of live releases, e.g. 3.4.3, 3.5.1, and 4.0.0-preview1? In other words, all 0.x ~ 3.4.2 and 3.5.1 will be tarball files in the current status. Dongjoon. On Thu, Aug 8, 2024 at 10:01 AM Sean Owen wrote: > I agree with 'archiving', but what does that mean? delete from the repo > and site? > While I really doubt people are looking for docs for, say, 0.5.0, it'd be > a big jump to totally remove it. > > What if we made a compressed tarball of old docs and put that in the repo, > linked to it, and removed the docs files for many old releases? > It's still in the repo and will be in the container when docs are built, > but, compressed would be much smaller. > That could buy a significant amount of time. > > On Thu, Aug 8, 2024 at 7:06 AM Kent Yao wrote: > >> Hi dev, >> >> The current size of the spark-website repository is approximately 16GB, >> exceeding the storage limit of GitHub-hosted runners. The GitHub actions >> have been failing recently in the actions/checkout step caused by >> 'No space left on device' errors. >> >> Filesystem Size Used Avail Use% Mounted on >> overlay 73G 58G 16G 80% / >> tmpfs64M 0 64M 0% /dev >> tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup >> shm 64M 0 64M 0% /dev/shm >> /dev/root73G 58G 16G 80% /__w >> tmpfs 1.6G 1.2M 1.6G 1% /run/docker.sock >> tmpfs 7.9G 0 7.9G 0% /proc/acpi >> tmpfs 7.9G 0 7.9G 0% /proc/scsi >> tmpfs 7.9G 0 7.9G 0% /sys/firmware >> >> >> The documentation for each version contributes the most volume. Since >> version >> 3.5.0, the documentation size has grown 3-4 times larger than the >> size of 3.4.x, >> with more than 1GB. >> >> >> 9.9M ./0.6.0 >> 10M ./0.6.1 >> 10M ./0.6.2 >> 15M ./0.7.0 >> 16M ./0.7.2 >> 16M ./0.7.3 >> 20M ./0.8.0 >> 20M ./0.8.1 >> 38M ./0.9.0 >> 38M ./0.9.1 >> 38M ./0.9.2 >> 36M ./1.0.0 >> 38M ./1.0.1 >> 38M ./1.0.2 >> 48M ./1.1.0 >> 48M ./1.1.1 >> 73M ./1.2.0 >> 73M ./1.2.1 >> 74M ./1.2.2 >> 69M ./1.3.0 >> 73M ./1.3.1 >> 68M ./1.4.0 >> 70M ./1.4.1 >> 80M ./1.5.0 >> 78M ./1.5.1 >> 78M ./1.5.2 >> 87M ./1.6.0 >> 87M ./1.6.1 >> 87M ./1.6.2 >> 86M ./1.6.3 >> 117M ./2.0.0 >> 119M ./2.0.0-preview >> 118M ./2.0.1 >> 118M ./2.0.2 >> 121M ./2.1.0 >> 121M ./2.1.1 >> 122M ./2.1.2 >> 122M ./2.1.3 >> 130M ./2.2.0 >> 131M ./2.2.1 >> 132M ./2.2.2 >> 131M ./2.2.3 >> 141M ./2.3.0 >> 141M ./2.3.1 >> 141M ./2.3.2 >> 142M ./2.3.3 >> 142M ./2.3.4 >> 145M ./2.4.0 >> 146M ./2.4.1 >> 145M ./2.4.2 >> 144M ./2.4.3 >> 145M ./2.4.4 >> 143M ./2.4.5 >> 143M ./2.4.6 >> 143M ./2.4.7 >> 143M ./2.4.8 >> 197M ./3.0.0 >> 185M ./3.0.0-preview >> 197M ./3.0.0-preview2 >> 198M ./3.0.1 >> 198M ./3.0.2 >> 205M ./3.0.3 >> 239M ./3.1.1 >> 239M ./3.1.2 >> 239M ./3.1.3 >> 840M ./3.2.0 >> 842M ./3.2.1 >> 282M ./3.2.2 >> 244M ./3.2.3 >> 282M ./3.2.4 >> 295M ./3.3.0 >> 297M ./3.3.1 >> 297M ./3.3.2 >> 297M ./3.3.3 >> 297M ./3.3.4 >> 314M ./3.4.0 >> 314M ./3.4.1 >> 328M ./3.4.2 >> 324M ./3.4.3 >> 1.1G ./3.5.0 >> 1.2G ./3.5.1 >> 1.1G ./4.0.0-preview1 >> >> I'm concerned about publishing the documentation for version 3.5.2 >> to the asf-site. So, I have merged PR[2] to eliminate this potential >> blocker. 
>> >> Considering that the problem still exists, should we temporarily archive >> some of the outdated version documents? For example, only keep >> the latest version for each feature release in the asf-site branch. Or, >> Do you have any other suggestions? >> >> >> Bests, >> Kent Yao >> >> >> [1] >> https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories >> [2] https://github.com/apache/spark-website/pull/543 >> >> - >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >> >>
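As a rough illustration of the tarball idea discussed in this thread, archiving the old per-version HTML trees could look like the snippet below. This is only a sketch, assuming it runs from the directory that contains the per-version folders listed above; the destination directory, the version glob, and the cut-off are placeholders, not a proposal of exactly which versions to archive.

mkdir -p archived-docs
# compress and drop the HTML trees for the older release lines (the glob is illustrative)
for v in 0.* 1.* 2.*; do
  tar -czf "archived-docs/${v}.tgz" "$v" && rm -rf "$v"
done
ls -alh archived-docs | head

Based on the 3.5.1 measurement quoted above (1.2G of HTML compressing to a 103M tarball), this keeps the content available while cutting roughly an order of magnitude of checkout size for each archived version.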
[jira] [Resolved] (SPARK-49159) Enforce `FieldDeclarationsShouldBeAtStartOfClass`, `LinguisticNaming` and `ClassWithOnlyPrivateConstructorsShouldBeFinal` rules
[ https://issues.apache.org/jira/browse/SPARK-49159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49159. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 40 [https://github.com/apache/spark-kubernetes-operator/pull/40] > Enforce `FieldDeclarationsShouldBeAtStartOfClass`, `LinguisticNaming` and > `ClassWithOnlyPrivateConstructorsShouldBeFinal` rules > --- > > Key: SPARK-49159 > URL: https://issues.apache.org/jira/browse/SPARK-49159 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49159) Enforce `FieldDeclarationsShouldBeAtStartOfClass`, `LinguisticNaming` and `ClassWithOnlyPrivateConstructorsShouldBeFinal` rules
[ https://issues.apache.org/jira/browse/SPARK-49159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49159: - Assignee: William Hyun > Enforce `FieldDeclarationsShouldBeAtStartOfClass`, `LinguisticNaming` and > `ClassWithOnlyPrivateConstructorsShouldBeFinal` rules > --- > > Key: SPARK-49159 > URL: https://issues.apache.org/jira/browse/SPARK-49159 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49165) Fix RestartPolicyTest to cover `SchedulingFailure`
Dongjoon Hyun created SPARK-49165: - Summary: Fix RestartPolicyTest to cover `SchedulingFailure` Key: SPARK-49165 URL: https://issues.apache.org/jira/browse/SPARK-49165 Project: Spark Issue Type: Sub-task Components: Kubernetes, Tests Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49158) Enforce ConfusingTernary and PrematureDeclaration rules
[ https://issues.apache.org/jira/browse/SPARK-49158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49158. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 39 [https://github.com/apache/spark-kubernetes-operator/pull/39] > Enforce ConfusingTernary and PrematureDeclaration rules > --- > > Key: SPARK-49158 > URL: https://issues.apache.org/jira/browse/SPARK-49158 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49158) Enforce ConfusingTernary and PrematureDeclaration rules
[ https://issues.apache.org/jira/browse/SPARK-49158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49158: - Assignee: William Hyun > Enforce ConfusingTernary and PrematureDeclaration rules > --- > > Key: SPARK-49158 > URL: https://issues.apache.org/jira/browse/SPARK-49158 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49157) Enforce SignatureDeclareThrowsException and AvoidThrowingRawExceptionTypes rules
[ https://issues.apache.org/jira/browse/SPARK-49157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49157. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 38 [https://github.com/apache/spark-kubernetes-operator/pull/38] > Enforce SignatureDeclareThrowsException and AvoidThrowingRawExceptionTypes > rules > > > Key: SPARK-49157 > URL: https://issues.apache.org/jira/browse/SPARK-49157 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49157) Enforce SignatureDeclareThrowsException and AvoidThrowingRawExceptionTypes rules
[ https://issues.apache.org/jira/browse/SPARK-49157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49157: - Assignee: William Hyun > Enforce SignatureDeclareThrowsException and AvoidThrowingRawExceptionTypes > rules > > > Key: SPARK-49157 > URL: https://issues.apache.org/jira/browse/SPARK-49157 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49156) Enforce ImmutableField and UselessParentheses rules
[ https://issues.apache.org/jira/browse/SPARK-49156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49156. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 37 [https://github.com/apache/spark-kubernetes-operator/pull/37] > Enforce ImmutableField and UselessParentheses rules > --- > > Key: SPARK-49156 > URL: https://issues.apache.org/jira/browse/SPARK-49156 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49156) Enforce ImmutableField and UselessParentheses rules
[ https://issues.apache.org/jira/browse/SPARK-49156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49156: - Assignee: William Hyun > Enforce ImmutableField and UselessParentheses rules > --- > > Key: SPARK-49156 > URL: https://issues.apache.org/jira/browse/SPARK-49156 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 >Reporter: William Hyun >Assignee: William Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49153) Increase `Gradle` JVM memory to `4g` like Spark repo
[ https://issues.apache.org/jira/browse/SPARK-49153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49153. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 36 [https://github.com/apache/spark-kubernetes-operator/pull/36] > Increase `Gradle` JVM memory to `4g` like Spark repo > > > Key: SPARK-49153 > URL: https://issues.apache.org/jira/browse/SPARK-49153 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > > {code:java} > > Task :spark-operator:compileTestJava > Note: > /Users/dongjoon/APACHE/spark-kubernetes-operator/spark-operator/src/test/java/org/apache/spark/k8s/operator/metrics/healthcheck/SentinelManagerTest.java > uses or overrides a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > Note: Some input files use unchecked or unsafe operations. > Note: Recompile with -Xlint:unchecked for details. > OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been > disabled. > [1.372s][warning][codecache] CodeCache is full. Compiler has been disabled. > OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using > -XX:ReservedCodeCacheSize= > [1.372s][warning][codecache] Try increasing the code cache size using > -XX:ReservedCodeCacheSize= > CodeCache: size=2944Kb used=2943Kb max_used=2943Kb free=0Kb > bounds [0x000105004000, 0x0001052e4000, 0x0001052e4000] > total_blobs=1102 nmethods=464 adapters=554 > compilation: disabled (not enough contiguous free space left) > stopped_count=1, restarted_count=0 > full_count=1 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49153) Increase `Gradle` JVM memory to `4g` like Spark repo
[ https://issues.apache.org/jira/browse/SPARK-49153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49153: - Assignee: Dongjoon Hyun > Increase `Gradle` JVM memory to `4g` like Spark repo > > > Key: SPARK-49153 > URL: https://issues.apache.org/jira/browse/SPARK-49153 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > > {code:java} > > Task :spark-operator:compileTestJava > Note: > /Users/dongjoon/APACHE/spark-kubernetes-operator/spark-operator/src/test/java/org/apache/spark/k8s/operator/metrics/healthcheck/SentinelManagerTest.java > uses or overrides a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > Note: Some input files use unchecked or unsafe operations. > Note: Recompile with -Xlint:unchecked for details. > OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been > disabled. > [1.372s][warning][codecache] CodeCache is full. Compiler has been disabled. > OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using > -XX:ReservedCodeCacheSize= > [1.372s][warning][codecache] Try increasing the code cache size using > -XX:ReservedCodeCacheSize= > CodeCache: size=2944Kb used=2943Kb max_used=2943Kb free=0Kb > bounds [0x000105004000, 0x0001052e4000, 0x0001052e4000] > total_blobs=1102 nmethods=464 adapters=554 > compilation: disabled (not enough contiguous free space left) > stopped_count=1, restarted_count=0 > full_count=1 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49150) Upgrade `commons-lang3` to 3.16.0
[ https://issues.apache.org/jira/browse/SPARK-49150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49150: - Assignee: BingKun Pan > Upgrade `commons-lang3` to 3.16.0 > - > > Key: SPARK-49150 > URL: https://issues.apache.org/jira/browse/SPARK-49150 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49150) Upgrade `commons-lang3` to 3.16.0
[ https://issues.apache.org/jira/browse/SPARK-49150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49150. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47658 [https://github.com/apache/spark/pull/47658] > Upgrade `commons-lang3` to 3.16.0 > - > > Key: SPARK-49150 > URL: https://issues.apache.org/jira/browse/SPARK-49150 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49153) Increase `Gradle` JVM memory to `4g` like Spark repo
Dongjoon Hyun created SPARK-49153: - Summary: Increase `Gradle` JVM memory to `4g` like Spark repo Key: SPARK-49153 URL: https://issues.apache.org/jira/browse/SPARK-49153 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun {code:java} > Task :spark-operator:compileTestJava Note: /Users/dongjoon/APACHE/spark-kubernetes-operator/spark-operator/src/test/java/org/apache/spark/k8s/operator/metrics/healthcheck/SentinelManagerTest.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled. [1.372s][warning][codecache] CodeCache is full. Compiler has been disabled. OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize= [1.372s][warning][codecache] Try increasing the code cache size using -XX:ReservedCodeCacheSize= CodeCache: size=2944Kb used=2943Kb max_used=2943Kb free=0Kb bounds [0x000105004000, 0x0001052e4000, 0x0001052e4000] total_blobs=1102 nmethods=464 adapters=554 compilation: disabled (not enough contiguous free space left) stopped_count=1, restarted_count=0 full_count=1 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
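For context, the "CodeCache is full" warnings above are the kind of problem usually addressed by giving the Gradle build JVM more headroom. A minimal sketch, assuming the options live in the repository's gradle.properties: the 4g heap comes from the issue title, while the ReservedCodeCacheSize value is only an illustrative guess, not necessarily what the pull request changed.

# append JVM options for the Gradle build JVM (values are illustrative)
cat >> gradle.properties <<'EOF'
org.gradle.jvmargs=-Xmx4g -XX:ReservedCodeCacheSize=512m
EOF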
[jira] [Resolved] (SPARK-49143) Update `YuniKorn` docs with v1.5.2
[ https://issues.apache.org/jira/browse/SPARK-49143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49143. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47654 [https://github.com/apache/spark/pull/47654] > Update `YuniKorn` docs with v1.5.2 > -- > > Key: SPARK-49143 > URL: https://issues.apache.org/jira/browse/SPARK-49143 > Project: Spark > Issue Type: Sub-task > Components: Documentation, Kubernetes >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49148) Use the latest PMD 6.x rules instead of the deprecated ones
[ https://issues.apache.org/jira/browse/SPARK-49148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49148. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 35 [https://github.com/apache/spark-kubernetes-operator/pull/35] > Use the latest PMD 6.x rules instead of the deprecated ones > --- > > Key: SPARK-49148 > URL: https://issues.apache.org/jira/browse/SPARK-49148 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > > There are too many warnings on PMD like the following. > {code:java} > Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the > deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will > remove support for this deprecated Rule name usage. > Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the > deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will > remove support for this deprecated Rule name usage. {code} > > {code:java} > $ ./gradlew clean build | grep PMD | wc -l > 204 {code} > > We had better fix it before we do the release. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49148) Use the latest PMD 6.x rules instead of the deprecated ones
[ https://issues.apache.org/jira/browse/SPARK-49148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49148: - Assignee: Dongjoon Hyun > Use the latest PMD 6.x rules instead of the deprecated ones > --- > > Key: SPARK-49148 > URL: https://issues.apache.org/jira/browse/SPARK-49148 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > > There are too many warnings on PMD like the following. > {code:java} > Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the > deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will > remove support for this deprecated Rule name usage. > Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the > deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will > remove support for this deprecated Rule name usage. {code} > > {code:java} > $ ./gradlew clean build | grep PMD | wc -l > 204 {code} > > We had better fix it before we do the release. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49148) Use the latest PMD 6.x rules instead of the deprecated ones
[ https://issues.apache.org/jira/browse/SPARK-49148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49148: -- Description: There are too many warnings on PMD like the following. {code:java} Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will remove support for this deprecated Rule name usage. Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will remove support for this deprecated Rule name usage. {code} {code:java} $ ./gradlew clean build | grep PMD | wc -l 204 {code} We had better fix it before we do the release. was: There are too many warnings on PMD like the following. {code:java} Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will remove support for this deprecated Rule name usage. Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will remove support for this deprecated Rule name usage. {code} We had better fix it before we do the release. > Use the latest PMD 6.x rules instead of the deprecated ones > --- > > Key: SPARK-49148 > URL: https://issues.apache.org/jira/browse/SPARK-49148 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun >Priority: Minor > > There are too many warnings on PMD like the following. > {code:java} > Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the > deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will > remove support for this deprecated Rule name usage. > Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the > deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will > remove support for this deprecated Rule name usage. {code} > > {code:java} > $ ./gradlew clean build | grep PMD | wc -l > 204 {code} > > We had better fix it before we do the release. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49148) Use the latest PMD 6.x rules instead of the deprecated ones
Dongjoon Hyun created SPARK-49148: - Summary: Use the latest PMD 6.x rules instead of the deprecated ones Key: SPARK-49148 URL: https://issues.apache.org/jira/browse/SPARK-49148 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun There are too many warnings on PMD like the following. {code:java} Use Rule name category/java/errorprone.xml/BrokenNullCheck instead of the deprecated Rule name rulesets/java/basic.xml/BrokenNullCheck. PMD 7.0.0 will remove support for this deprecated Rule name usage. Use Rule name category/java/errorprone.xml/CheckSkipResult instead of the deprecated Rule name rulesets/java/basic.xml/CheckSkipResult. PMD 7.0.0 will remove support for this deprecated Rule name usage. {code} We had better fix it before we do the release. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
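Since the warnings quoted above already name the replacement for each deprecated rule, the migration is mostly a mechanical rename inside the PMD ruleset configuration. Below is a rough sketch for the two rules shown in the description; the ruleset file path is hypothetical, and a real change would need to cover every deprecated reference reported by the build, not just these two.

# rewrite the deprecated PMD rule references named in the warnings (file path is hypothetical)
sed -i \
  -e 's#rulesets/java/basic.xml/BrokenNullCheck#category/java/errorprone.xml/BrokenNullCheck#g' \
  -e 's#rulesets/java/basic.xml/CheckSkipResult#category/java/errorprone.xml/CheckSkipResult#g' \
  config/pmd/ruleset.xml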
[jira] [Resolved] (SPARK-49144) Use the latest `setup-java` v4 with `cache` feature
[ https://issues.apache.org/jira/browse/SPARK-49144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49144. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 34 [https://github.com/apache/spark-kubernetes-operator/pull/34] > Use the latest `setup-java` v4 with `cache` feature > > > Key: SPARK-49144 > URL: https://issues.apache.org/jira/browse/SPARK-49144 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49144) Use the latest `setup-java` v4 with `cache` feature
[ https://issues.apache.org/jira/browse/SPARK-49144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49144: - Assignee: Dongjoon Hyun > Use the latest `setup-java` v4 with `cache` feature > > > Key: SPARK-49144 > URL: https://issues.apache.org/jira/browse/SPARK-49144 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-48338) Sql Scripting support for Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-48338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871752#comment-17871752 ] Dongjoon Hyun commented on SPARK-48338: --- To all, unfortunately, the JiraID (SPARK-48338) is already compromised a lot. {code:java} $ git log --oneline | grep SPARK-48338 e5b6b5ff6f5 [SPARK-48338][SQL] Improve exceptions thrown from parser/interpreter 239d77b86ca [SPARK-48338][SQL] Check variable declarations {code} I'm wondering what is your recommendation for this, [~cloud_fan] ? > Sql Scripting support for Spark SQL > --- > > Key: SPARK-48338 > URL: https://issues.apache.org/jira/browse/SPARK-48338 > Project: Spark > Issue Type: Epic > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Aleksandar Tomic >Assignee: Aleksandar Tomic >Priority: Major > Labels: pull-request-available > Attachments: Sql Scripting - OSS.odt, [Design Doc] Sql Scripting - > OSS.pdf > > > Design doc for this feature is in attachment. > High level example of Sql Script: > ``` > BEGIN > DECLARE c INT = 10; > WHILE c > 0 DO > INSERT INTO tscript VALUES (c); > SET c = c - 1; > END WHILE; > END > ``` > High level motivation behind this feature: > SQL Scripting gives customers the ability to develop complex ETL and analysis > entirely in SQL. Until now, customers have had to write verbose SQL > statements or combine SQL + Python to efficiently write business logic. > Coming from another system, customers have to choose whether or not they want > to migrate to pyspark. Some customers end up not using Spark because of this > gap. SQL Scripting is a key milestone towards enabling SQL practitioners to > write sophisticated queries, without the need to use pyspark. Further, SQL > Scripting is a necessary step towards support for SQL Stored Procedures, and > along with SQL Variables (released) and Temp Tables (in progress), will allow > for more seamless data warehouse migrations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-48338) Sql Scripting support for Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-48338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871751#comment-17871751 ] Dongjoon Hyun commented on SPARK-48338: --- According to the community guideline, I removed the `Fixed Version`. It seems that it's not recovered correctly after the previous mistake. - [https://spark.apache.org/contributing.html] {quote}Do not set the following fields: Fix Version. This is assigned by committers only when resolved. {quote} > Sql Scripting support for Spark SQL > --- > > Key: SPARK-48338 > URL: https://issues.apache.org/jira/browse/SPARK-48338 > Project: Spark > Issue Type: Epic > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Aleksandar Tomic >Assignee: Aleksandar Tomic >Priority: Major > Labels: pull-request-available > Attachments: Sql Scripting - OSS.odt, [Design Doc] Sql Scripting - > OSS.pdf > > > Design doc for this feature is in attachment. > High level example of Sql Script: > ``` > BEGIN > DECLARE c INT = 10; > WHILE c > 0 DO > INSERT INTO tscript VALUES (c); > SET c = c - 1; > END WHILE; > END > ``` > High level motivation behind this feature: > SQL Scripting gives customers the ability to develop complex ETL and analysis > entirely in SQL. Until now, customers have had to write verbose SQL > statements or combine SQL + Python to efficiently write business logic. > Coming from another system, customers have to choose whether or not they want > to migrate to pyspark. Some customers end up not using Spark because of this > gap. SQL Scripting is a key milestone towards enabling SQL practitioners to > write sophisticated queries, without the need to use pyspark. Further, SQL > Scripting is a necessary step towards support for SQL Stored Procedures, and > along with SQL Variables (released) and Temp Tables (in progress), will allow > for more seamless data warehouse migrations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-48338) Sql Scripting support for Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-48338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-48338: -- Fix Version/s: (was: 4.0.0) > Sql Scripting support for Spark SQL > --- > > Key: SPARK-48338 > URL: https://issues.apache.org/jira/browse/SPARK-48338 > Project: Spark > Issue Type: Epic > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Aleksandar Tomic >Assignee: Aleksandar Tomic >Priority: Major > Labels: pull-request-available > Attachments: Sql Scripting - OSS.odt, [Design Doc] Sql Scripting - > OSS.pdf > > > Design doc for this feature is in attachment. > High level example of Sql Script: > ``` > BEGIN > DECLARE c INT = 10; > WHILE c > 0 DO > INSERT INTO tscript VALUES (c); > SET c = c - 1; > END WHILE; > END > ``` > High level motivation behind this feature: > SQL Scripting gives customers the ability to develop complex ETL and analysis > entirely in SQL. Until now, customers have had to write verbose SQL > statements or combine SQL + Python to efficiently write business logic. > Coming from another system, customers have to choose whether or not they want > to migrate to pyspark. Some customers end up not using Spark because of this > gap. SQL Scripting is a key milestone towards enabling SQL practitioners to > write sophisticated queries, without the need to use pyspark. Further, SQL > Scripting is a necessary step towards support for SQL Stored Procedures, and > along with SQL Variables (released) and Temp Tables (in progress), will allow > for more seamless data warehouse migrations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49144) Use the latest `setup-java` v4 with `cache` feature
Dongjoon Hyun created SPARK-49144: - Summary: Use the latest `setup-java` v4 with `cache` feature Key: SPARK-49144 URL: https://issues.apache.org/jira/browse/SPARK-49144 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (ORC-1709) Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature
[ https://issues.apache.org/jira/browse/ORC-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1709: -- Assignee: (was: Dongjoon Hyun) > Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature > --- > > Key: ORC-1709 > URL: https://issues.apache.org/jira/browse/ORC-1709 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ORC-1709) Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature
[ https://issues.apache.org/jira/browse/ORC-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1709. Fix Version/s: 2.1.0 Resolution: Fixed Issue resolved by pull request 1925 [https://github.com/apache/orc/pull/1925] > Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature > --- > > Key: ORC-1709 > URL: https://issues.apache.org/jira/browse/ORC-1709 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ORC-1709) Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature
[ https://issues.apache.org/jira/browse/ORC-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned ORC-1709: -- Assignee: Dongjoon Hyun > Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature > --- > > Key: ORC-1709 > URL: https://issues.apache.org/jira/browse/ORC-1709 > Project: ORC > Issue Type: Task > Components: Infra >Affects Versions: 2.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ORC-1751) [C++] Syntax error in ThirdpartyToolchain
[ https://issues.apache.org/jira/browse/ORC-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated ORC-1751: --- Fix Version/s: 2.0.2 > [C++] Syntax error in ThirdpartyToolchain > - > > Key: ORC-1751 > URL: https://issues.apache.org/jira/browse/ORC-1751 > Project: ORC > Issue Type: Improvement > Components: C++ >Reporter: Hao Zou >Assignee: Hao Zou >Priority: Major > Fix For: 2.1.0, 2.0.2 > > > This topic has been discussed > [here|https://github.com/apache/arrow/pull/43417]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ORC-1751) [C++] Syntax error in ThirdpartyToolchain
[ https://issues.apache.org/jira/browse/ORC-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871732#comment-17871732 ] Dongjoon Hyun commented on ORC-1751: This landed to branch-2.0 via https://github.com/apache/orc/pull/1997 > [C++] Syntax error in ThirdpartyToolchain > - > > Key: ORC-1751 > URL: https://issues.apache.org/jira/browse/ORC-1751 > Project: ORC > Issue Type: Improvement > Components: C++ >Reporter: Hao Zou >Assignee: Hao Zou >Priority: Major > Fix For: 2.1.0, 2.0.2 > > > This topic has been discussed > [here|https://github.com/apache/arrow/pull/43417]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (SPARK-49141) Mark variant as hive incompatible data type
[ https://issues.apache.org/jira/browse/SPARK-49141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49141. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 47652 [https://github.com/apache/spark/pull/47652] > Mark variant as hive incompatible data type > > > Key: SPARK-49141 > URL: https://issues.apache.org/jira/browse/SPARK-49141 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49143) Update `YuniKorn` docs with v1.5.2
[ https://issues.apache.org/jira/browse/SPARK-49143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49143: - Assignee: Dongjoon Hyun > Update `YuniKorn` docs with v1.5.2 > -- > > Key: SPARK-49143 > URL: https://issues.apache.org/jira/browse/SPARK-49143 > Project: Spark > Issue Type: Sub-task > Components: Documentation, Kubernetes >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49143) Update `YuniKorn` docs with v1.5.2
Dongjoon Hyun created SPARK-49143: - Summary: Update `YuniKorn` docs with v1.5.2 Key: SPARK-49143 URL: https://issues.apache.org/jira/browse/SPARK-49143 Project: Spark Issue Type: Sub-task Components: Documentation, Kubernetes Affects Versions: 4.0.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
Re: [VOTE] Release Spark 3.5.2 (RC5)
+1 Thank you, Kent. Dongjoon. On 2024/08/06 16:06:00 Kent Yao wrote: > Hi dev, > > Please vote on releasing the following candidate as Apache Spark version > 3.5.2. > > The vote is open until Aug 9, 17:00:00 UTC, and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this package as Apache Spark 3.5.2 > [ ] -1 Do not release this package because ... > > To learn more about Apache Spark, please see https://spark.apache.org/ > > The tag to be voted on is v3.5.2-rc5 (commit > bb7846dd487f259994fdc69e18e03382e3f64f42): > https://github.com/apache/spark/tree/v3.5.2-rc5 > > The release files, including signatures, digests, etc. can be found at: > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/ > > Signatures used for Spark RCs can be found in this file: > https://dist.apache.org/repos/dist/dev/spark/KEYS > > The staging repository for this release can be found at: > https://repository.apache.org/content/repositories/orgapachespark-1462/ > > The documentation corresponding to this release can be found at: > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-docs/ > > The list of bug fixes going into 3.5.2 can be found at the following URL: > https://issues.apache.org/jira/projects/SPARK/versions/12353980 > > FAQ > > = > How can I help test this release? > = > > If you are a Spark user, you can help us test this release by taking > an existing Spark workload and running on this release candidate, then > reporting any regressions. > > If you're working in PySpark you can set up a virtual env and install > the current RC via "pip install > https://dist.apache.org/repos/dist/dev/spark/v3.5.2-rc5-bin/pyspark-3.5.2.tar.gz"; > and see if anything important breaks. > In the Java/Scala, you can add the staging repository to your projects > resolvers and test > with the RC (make sure to clean up the artifact cache before/after so > you don't end up building with an out of date RC going forward). > > === > What should happen to JIRA tickets still targeting 3.5.2? > === > > The current list of open tickets targeted at 3.5.2 can be found at: > https://issues.apache.org/jira/projects/SPARK and search for > "Target Version/s" = 3.5.2 > > Committers should look at those and triage. Extremely important bug > fixes, documentation, and API tweaks that impact compatibility should > be worked on immediately. Everything else please retarget to an > appropriate release. > > == > But my bug isn't fixed? > == > > In order to make timely releases, we will typically not hold the > release unless the bug in question is a regression from the previous > release. That being said, if there is something which is a regression > that has not been correctly targeted please ping me or a committer to > help target the issue. > > Thanks, > Kent Yao > > - > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org > > - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
[jira] [Assigned] (SPARK-49106) Documented Prometheus endpoints
[ https://issues.apache.org/jira/browse/SPARK-49106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49106: - Assignee: Jerry Zhou (was: Dongjoon Hyun) > Documented Prometheus endpoints > --- > > Key: SPARK-49106 > URL: https://issues.apache.org/jira/browse/SPARK-49106 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Jerry Zhou >Assignee: Jerry Zhou >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49106) Documented Prometheus endpoints
[ https://issues.apache.org/jira/browse/SPARK-49106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49106: - Assignee: Dongjoon Hyun > Documented Prometheus endpoints > --- > > Key: SPARK-49106 > URL: https://issues.apache.org/jira/browse/SPARK-49106 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-49106) Documented Prometheus endpoints
[ https://issues.apache.org/jira/browse/SPARK-49106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-49106: -- Reporter: Jerry Zhou (was: Dongjoon Hyun) > Documented Prometheus endpoints > --- > > Key: SPARK-49106 > URL: https://issues.apache.org/jira/browse/SPARK-49106 > Project: Spark > Issue Type: Sub-task > Components: Documentation >Affects Versions: 4.0.0 >Reporter: Jerry Zhou > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49132) Minimize docker image by removing redundant `chown` commands
[ https://issues.apache.org/jira/browse/SPARK-49132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49132. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 33 [https://github.com/apache/spark-kubernetes-operator/pull/33] > Minimize docker image by removing redundant `chown` commands > > > Key: SPARK-49132 > URL: https://issues.apache.org/jira/browse/SPARK-49132 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49132) Minimize docker image by removing redundant `chown` commands
[ https://issues.apache.org/jira/browse/SPARK-49132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49132: - Assignee: Dongjoon Hyun > Minimize docker image by removing redundant `chown` commands > > > Key: SPARK-49132 > URL: https://issues.apache.org/jira/browse/SPARK-49132 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49130) Verify built images in `build-image` CI job via `docker run` test
[ https://issues.apache.org/jira/browse/SPARK-49130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49130. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed This is resolved via https://github.com/apache/spark-kubernetes-operator/pull/32 > Verify built images in `build-image` CI job via `docker run` test > - > > Key: SPARK-49130 > URL: https://issues.apache.org/jira/browse/SPARK-49130 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Project Infra, Tests >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (ORC-1751) [C++] Syntax error in ThirdpartyToolchain
[ https://issues.apache.org/jira/browse/ORC-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved ORC-1751. Fix Version/s: 2.1.0 Resolution: Fixed This is resolved via https://github.com/apache/orc/pull/1994 > [C++] Syntax error in ThirdpartyToolchain > - > > Key: ORC-1751 > URL: https://issues.apache.org/jira/browse/ORC-1751 > Project: ORC > Issue Type: Improvement > Components: C++ >Reporter: Hao Zou >Assignee: Hao Zou >Priority: Major > Fix For: 2.1.0 > > > This topic has been discussed > [here|https://github.com/apache/arrow/pull/43417]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (SPARK-49130) Verify built images in `build-image` CI job via `docker run` test
[ https://issues.apache.org/jira/browse/SPARK-49130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49130: - Assignee: Dongjoon Hyun > Verify built images in `build-image` CI job via `docker run` test > - > > Key: SPARK-49130 > URL: https://issues.apache.org/jira/browse/SPARK-49130 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Project Infra, Tests >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49132) Minimize docker image by removing redundant `chown` commands
Dongjoon Hyun created SPARK-49132: - Summary: Minimize docker image by removing redundant `chown` commands Key: SPARK-49132 URL: https://issues.apache.org/jira/browse/SPARK-49132 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49130) Verify built images in `build-image` CI job via `docker run` test
Dongjoon Hyun created SPARK-49130: - Summary: Verify built images in `build-image` CI job via `docker run` test Key: SPARK-49130 URL: https://issues.apache.org/jira/browse/SPARK-49130 Project: Spark Issue Type: Sub-task Components: Kubernetes, Project Infra, Tests Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49129) Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh`
[ https://issues.apache.org/jira/browse/SPARK-49129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49129. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 31 [https://github.com/apache/spark-kubernetes-operator/pull/31] > Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh` > - > > Key: SPARK-49129 > URL: https://issues.apache.org/jira/browse/SPARK-49129 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49129) Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh`
[ https://issues.apache.org/jira/browse/SPARK-49129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49129: - Assignee: Dongjoon Hyun > Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh` > - > > Key: SPARK-49129 > URL: https://issues.apache.org/jira/browse/SPARK-49129 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.1.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Critical > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-49129) Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh`
Dongjoon Hyun created SPARK-49129: - Summary: Fix `ENTRYPOINT` to point `/opt/spark-operator/operator/docker-entrypoint.sh` Key: SPARK-49129 URL: https://issues.apache.org/jira/browse/SPARK-49129 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.1.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-49045) Add docker image build for operator
[ https://issues.apache.org/jira/browse/SPARK-49045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-49045. --- Fix Version/s: kubernetes-operator-0.1.0 Resolution: Fixed Issue resolved by pull request 28 [https://github.com/apache/spark-kubernetes-operator/pull/28] > Add docker image build for operator > --- > > Key: SPARK-49045 > URL: https://issues.apache.org/jira/browse/SPARK-49045 > Project: Spark > Issue Type: Sub-task > Components: k8s >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Assignee: Zhou JIANG >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-0.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49045) Add docker image build for operator
[ https://issues.apache.org/jira/browse/SPARK-49045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49045: - Assignee: Zhou JIANG > Add docker image build for operator > --- > > Key: SPARK-49045 > URL: https://issues.apache.org/jira/browse/SPARK-49045 > Project: Spark > Issue Type: Sub-task > Components: k8s >Affects Versions: kubernetes-operator-0.1.0 >Reporter: Zhou JIANG >Assignee: Zhou JIANG >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-49128) Support custom History Server UI title
[ https://issues.apache.org/jira/browse/SPARK-49128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-49128: - Assignee: Dongjoon Hyun > Support custom History Server UI title > -- > > Key: SPARK-49128 > URL: https://issues.apache.org/jira/browse/SPARK-49128 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 > Reporter: Dongjoon Hyun > Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org