[jira] [Commented] (SPARK-34193) Potential race condition during decommissioning with TorrentBroadcast

2021-01-21 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269565#comment-17269565 ] Holden Karau commented on SPARK-34193: -- Note, so far I've only triggered

[jira] [Created] (SPARK-34193) Potential race condition during decommissioning with TorrentBroadcast

2021-01-21 Thread Holden Karau (Jira)
Holden Karau created SPARK-34193: Summary: Potential race condition during decommissioning with TorrentBroadcast Key: SPARK-34193 URL: https://issues.apache.org/jira/browse/SPARK-34193 Project: Spark

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-19 Thread Holden Karau
+1, pip installs on Python 3.8 One potential thing we might want to consider if there ends up being another RC is that the error message for installing with Python2 could be clearer. Processing ./pyspark-3.1.1.tar.gz ERROR: Command errored out with exit status 1: command: /tmp/py3.1/bin/

[jira] [Created] (SPARK-34105) In addition to killing exlcuded/flakey executors which should support decommissioning

2021-01-13 Thread Holden Karau (Jira)
Holden Karau created SPARK-34105: Summary: In addition to killing exlcuded/flakey executors which should support decommissioning Key: SPARK-34105 URL: https://issues.apache.org/jira/browse/SPARK-34105

[jira] [Created] (SPARK-34104) Allow users to specify a maximum decommissioning time

2021-01-13 Thread Holden Karau (Jira)
Holden Karau created SPARK-34104: Summary: Allow users to specify a maximum decommissioning time Key: SPARK-34104 URL: https://issues.apache.org/jira/browse/SPARK-34104 Project: Spark Issue

[jira] [Assigned] (SPARK-34104) Allow users to specify a maximum decommissioning time

2021-01-13 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau reassigned SPARK-34104: Assignee: Holden Karau > Allow users to specify a maximum decommissioning t

[jira] [Commented] (SPARK-34104) Allow users to specify a maximum decommissioning time

2021-01-13 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264488#comment-17264488 ] Holden Karau commented on SPARK-34104: -- I'm working on this. > Al

[jira] [Resolved] (SPARK-33711) Race condition in Spark k8s Pod lifecycle manager that leads to shutdowns

2021-01-11 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-33711. -- Fix Version/s: 3.1.1 3.2.0 Assignee: Attila Zsolt Piros

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Holden Karau
I think that posting the 3.1.0 maven release was an accident and we're going to 3.1.1 RCs is the right step forward. I'd ask for maybe a day before cutting the 3.1.1 release, I think https://issues.apache.org/jira/browse/SPARK-34018 is also a blocker (at first I thought it was just a test issue, bu

[jira] [Updated] (SPARK-34018) NPE in ExecutorPodsSnapshot

2021-01-06 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau updated SPARK-34018: - Description: Currently the test (finishedExecutorWithRunningSidecar in test utils depends on

[jira] [Commented] (SPARK-34018) finishedExecutorWithRunningSidecar in test utils depends on nulls matching

2021-01-06 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17260232#comment-17260232 ] Holden Karau commented on SPARK-34018: -- I think this could maybe happy

[jira] [Created] (SPARK-34018) finishedExecutorWithRunningSidecar in test utils depends on nulls matching

2021-01-05 Thread Holden Karau (Jira)
Holden Karau created SPARK-34018: Summary: finishedExecutorWithRunningSidecar in test utils depends on nulls matching Key: SPARK-34018 URL: https://issues.apache.org/jira/browse/SPARK-34018 Project

[jira] [Created] (SPARK-33874) Spark may report PodRunning if there is a sidecar that has not exited

2020-12-21 Thread Holden Karau (Jira)
Holden Karau created SPARK-33874: Summary: Spark may report PodRunning if there is a sidecar that has not exited Key: SPARK-33874 URL: https://issues.apache.org/jira/browse/SPARK-33874 Project: Spark

[jira] [Resolved] (SPARK-33261) Allow people to extend the pod feature steps

2020-12-14 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-33261. -- Fix Version/s: 3.2.0 Resolution: Fixed > Allow people to extend the pod feature st

[jira] [Created] (SPARK-33763) Add metrics for better tracking of dynamic allocation

2020-12-11 Thread Holden Karau (Jira)
Holden Karau created SPARK-33763: Summary: Add metrics for better tracking of dynamic allocation Key: SPARK-33763 URL: https://issues.apache.org/jira/browse/SPARK-33763 Project: Spark Issue

[R] Github actions "Error: Error in loadNamespace(name) : there is no package called ‘devtools’ 2989"

2020-12-10 Thread Holden Karau
Hi Folks, It's been ages since I've done any R dev work, is anyone working on Spark R github actions? If not I'll dig into this but I figure someone with more context can fix it quickly. Cheers, Holden :) -- Twitter: https://twitter.com/holdenkarau Books (Learning Spark, High Performance Spark

[jira] [Created] (SPARK-33746) Minikube is failing to start on research-jenkins-worker-05

2020-12-10 Thread Holden Karau (Jira)
Holden Karau created SPARK-33746: Summary: Minikube is failing to start on research-jenkins-worker-05 Key: SPARK-33746 URL: https://issues.apache.org/jira/browse/SPARK-33746 Project: Spark

[jira] [Created] (SPARK-33745) SparkR docker install requires LaTeX packages

2020-12-10 Thread Holden Karau (Jira)
Holden Karau created SPARK-33745: Summary: SparkR docker install requires LaTeX packages Key: SPARK-33745 URL: https://issues.apache.org/jira/browse/SPARK-33745 Project: Spark Issue Type

[jira] [Commented] (SPARK-33727) `gpg: keyserver receive failed: No name` during K8s IT

2020-12-09 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246923#comment-17246923 ] Holden Karau commented on SPARK-33727: -- So I've seen the keys.gnupg.net k

[jira] [Created] (SPARK-33728) Improve error messages during K8s integration test failure

2020-12-09 Thread Holden Karau (Jira)
Holden Karau created SPARK-33728: Summary: Improve error messages during K8s integration test failure Key: SPARK-33728 URL: https://issues.apache.org/jira/browse/SPARK-33728 Project: Spark

[jira] [Created] (SPARK-33724) Allow decommissioning script location to be configured

2020-12-09 Thread Holden Karau (Jira)
Holden Karau created SPARK-33724: Summary: Allow decommissioning script location to be configured Key: SPARK-33724 URL: https://issues.apache.org/jira/browse/SPARK-33724 Project: Spark Issue

[jira] [Created] (SPARK-33716) Decommissioning Race Condition during Pod Snapshot

2020-12-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-33716: Summary: Decommissioning Race Condition during Pod Snapshot Key: SPARK-33716 URL: https://issues.apache.org/jira/browse/SPARK-33716 Project: Spark Issue

Re: I'm going to be out starting Nov 5th

2020-11-01 Thread Holden Karau
Oct 31, 2020 at 9:53 PM 郑瑞峰 wrote: > >> Take care, Holden! Best wishes! >> >> >> -- Original Message -- >> *From:* "Hyukjin Kwon" ; >> *Sent:* Sunday, November 1, 2020, 10:24 AM >> *To:* "Denny Lee"; >> *Cc:* "

I'm going to be out starting Nov 5th

2020-10-31 Thread Holden Karau
Hi Folks, Just a heads up so folks working on decommissioning or other areas I've been active in don't block on me, I'm going to be out for at least a week and possibly more starting on November 5th. If there is anything that folks want me to review before then please let me know and I'll make the

[jira] [Created] (SPARK-33262) Keep pending pods in account while scheduling new pods

2020-10-27 Thread Holden Karau (Jira)
Holden Karau created SPARK-33262: Summary: Keep pending pods in account while scheduling new pods Key: SPARK-33262 URL: https://issues.apache.org/jira/browse/SPARK-33262 Project: Spark Issue

[jira] [Created] (SPARK-33261) Allow people to extend the pod feature steps

2020-10-27 Thread Holden Karau (Jira)
Holden Karau created SPARK-33261: Summary: Allow people to extend the pod feature steps Key: SPARK-33261 URL: https://issues.apache.org/jira/browse/SPARK-33261 Project: Spark Issue Type

[jira] [Resolved] (SPARK-30821) Executor pods with multiple containers will not be rescheduled unless all containers fail

2020-10-24 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-30821. -- Fix Version/s: 3.1.0 3.0.2 Resolution: Fixed > Executor pods w

[jira] [Created] (SPARK-33231) Make podCreationTimeout configurable

2020-10-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-33231: Summary: Make podCreationTimeout configurable Key: SPARK-33231 URL: https://issues.apache.org/jira/browse/SPARK-33231 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-33200) Validate that all shuffle files are migrated during E2E testing

2020-10-20 Thread Holden Karau (Jira)
Holden Karau created SPARK-33200: Summary: Validate that all shuffle files are migrated during E2E testing Key: SPARK-33200 URL: https://issues.apache.org/jira/browse/SPARK-33200 Project: Spark

Re: Scala vs Python for ETL with Spark

2020-10-17 Thread Holden Karau
Scala and Python have their advantages and disadvantages with Spark. In my experience, when performance is super important you’ll end up needing to do some of your work in the JVM, but in many situations what matters most is what your team and company are familiar with and the ecosystem of tooling

[jira] [Created] (SPARK-33154) Handle shuffle blocks being removed during decommissioning

2020-10-14 Thread Holden Karau (Jira)
Holden Karau created SPARK-33154: Summary: Handle shuffle blocks being removed during decommissioning Key: SPARK-33154 URL: https://issues.apache.org/jira/browse/SPARK-33154 Project: Spark

[jira] [Resolved] (SPARK-33151) Jenkins PRB is not responding

2020-10-14 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-33151. -- Resolution: Fixed > Jenkins PRB is not respond

[jira] [Created] (SPARK-33151) Jenkins PRB is not responding

2020-10-14 Thread Holden Karau (Jira)
Holden Karau created SPARK-33151: Summary: Jenkins PRB is not responding Key: SPARK-33151 URL: https://issues.apache.org/jira/browse/SPARK-33151 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-33097) Add Decommissioning messages to the Spark listener bus

2020-10-08 Thread Holden Karau (Jira)
Holden Karau created SPARK-33097: Summary: Add Decommissioning messages to the Spark listener bus Key: SPARK-33097 URL: https://issues.apache.org/jira/browse/SPARK-33097 Project: Spark Issue

Re: Official support of CREATE EXTERNAL TABLE

2020-10-07 Thread Holden Karau
ine with option 2 if there are reasonable use cases. I think it's >>> always safer to keep the behavior the same as before. If we want to change >>> the behavior and follow option 2, we need use cases to justify it. >>> >>> For now, the only use case I see is

Re: Official support of CREATE EXTERNAL TABLE

2020-10-06 Thread Holden Karau
As someone who's had the job of porting different SQL dialects to Spark, I'm also very much in favor of keeping EXTERNAL, and I think Ryan's suggestion of leaving it up to the catalogs on how to handle this makes sense. On Tue, Oct 6, 2020 at 1:54 PM Ryan Blue wrote: > I would summarize both the

New ASF guidelines for docker images

2020-10-04 Thread Holden Karau
Hi Folks, It seems like there are new guidelines for incubator projects that might make it easier for us to publish Spark docker images. While we are not an incubator project, I think this is probably a sign we should revisit publishing docker images. Cheers, Holden -- Forwarded message

[jira] [Created] (SPARK-33049) Decommission Core Integration Test is flaky.

2020-10-01 Thread Holden Karau (Jira)
Holden Karau created SPARK-33049: Summary: Decommission Core Integration Test is flaky. Key: SPARK-33049 URL: https://issues.apache.org/jira/browse/SPARK-33049 Project: Spark Issue Type: Bug

Re: [VOTE][Format] Allow for 256-bit Decimal's in the Arrow specification

2020-09-29 Thread Holden Karau
+1 (non-binding) On Tue, Sep 29, 2020 at 6:08 PM Sutou Kouhei wrote: > +1 > > In > "Re: [VOTE][Format] Allow for 256-bit Decimal's in the Arrow > specification" on Tue, 29 Sep 2020 13:38:04 -0700, > Jacques Nadeau wrote: > > > +1 > > > > On Tue, Sep 29, 2020 at 11:19 AM Wes McKinney > wro

[jira] [Assigned] (SPARK-32381) Expose the ability for users to use parallel file & avoid location information discovery in RDDs

2020-09-24 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau reassigned SPARK-32381: Assignee: Chao Sun > Expose the ability for users to use parallel file & avoid l

[jira] [Resolved] (SPARK-32381) Expose the ability for users to use parallel file & avoid location information discovery in RDDs

2020-09-24 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32381. -- Fix Version/s: 3.1.0 Resolution: Fixed > Expose the ability for users to use paral

Re: Running K8s integration tests for changes in core?

2020-09-24 Thread Holden Karau
l set this up right now and keep an eye on the queue/build results > today. > > shane > > On Thu, Aug 20, 2020 at 2:28 PM Holden Karau wrote: > >> Sounds good, thanks for the heads up. I hope you get some time to relax :) >> >> On Thu, Aug 20, 2020 at 2:26 PM shane

[jira] [Commented] (SPARK-32980) Launcher Client tests flake with minikube

2020-09-23 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201073#comment-17201073 ] Holden Karau commented on SPARK-32980: -- Our method of getting the service ass

[jira] [Created] (SPARK-32980) Launcher Client tests flake with minikube

2020-09-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-32980: Summary: Launcher Client tests flake with minikube Key: SPARK-32980 URL: https://issues.apache.org/jira/browse/SPARK-32980 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-32979) Spark K8s decom test is broken

2020-09-23 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32979. -- Resolution: Duplicate Duplicate of SPARK-32937 > Spark K8s decom test is bro

[jira] [Created] (SPARK-32979) Spark K8s decom test is broken

2020-09-23 Thread Holden Karau (Jira)
Holden Karau created SPARK-32979: Summary: Spark K8s decom test is broken Key: SPARK-32979 URL: https://issues.apache.org/jira/browse/SPARK-32979 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-32937) DecomissionSuite in k8s integration tests is failing.

2020-09-19 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198867#comment-17198867 ] Holden Karau commented on SPARK-32937: -- So there are additional proposed cha

[jira] [Commented] (SPARK-32937) DecomissionSuite in k8s integration tests is failing.

2020-09-19 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198863#comment-17198863 ] Holden Karau commented on SPARK-32937: -- I want to be clear, I'm ask

[jira] [Commented] (SPARK-32937) DecomissionSuite in k8s integration tests is failing.

2020-09-19 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198860#comment-17198860 ] Holden Karau commented on SPARK-32937: -- It looks like someone changed the log

Re: How to clear spark Shuffle files

2020-09-14 Thread Holden Karau
There's a second new mechanism which uses TTL for cleanup of shuffle files. Can you share more about your use case? On Mon, Sep 14, 2020 at 1:33 PM Edward Mitchell wrote: > We've also had some similar disk fill issues. > > For Java/Scala RDDs, shuffle file cleanup is done as part of the JVM > ga

[jira] [Commented] (SPARK-32881) NoSuchElementException occurs during decommissioning

2020-09-14 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195700#comment-17195700 ] Holden Karau commented on SPARK-32881: -- Thanks for the catch, I'll take

[jira] [Created] (SPARK-32866) Docker buildx now requires --push

2020-09-12 Thread Holden Karau (Jira)
Holden Karau created SPARK-32866: Summary: Docker buildx now requires --push Key: SPARK-32866 URL: https://issues.apache.org/jira/browse/SPARK-32866 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-32745) Add an option to reject block migrations when under disk pressure

2020-08-30 Thread Holden Karau (Jira)
Holden Karau created SPARK-32745: Summary: Add an option to reject block migrations when under disk pressure Key: SPARK-32745 URL: https://issues.apache.org/jira/browse/SPARK-32745 Project: Spark

[jira] [Commented] (YUNIKORN-380) Deploying YUNIKORN 0.9 RC2 on Minikube 1.3.1 results in port conflict

2020-08-30 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/YUNIKORN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187396#comment-17187396 ] Holden Karau commented on YUNIKORN-380: --- That makes sense. I'll

[jira] [Resolved] (SPARK-32643) [Cleanup] Consolidate state kept in ExecutorDecommissionInfo with TaskSetManager.tidToExecutorKillTimeMapping

2020-08-26 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32643. -- Fix Version/s: 3.1.0 Resolution: Fixed > [Cleanup] Consolidate state kept

[jira] [Assigned] (SPARK-32643) [Cleanup] Consolidate state kept in ExecutorDecommissionInfo with TaskSetManager.tidToExecutorKillTimeMapping

2020-08-26 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau reassigned SPARK-32643: Assignee: Devesh Agrawal > [Cleanup] Consolidate state kept in ExecutorDecommissionI

[jira] [Created] (YUNIKORN-380) Deploying YUNIKORN 0.9 RC2 on Minikube 1.3.1 results in port conflict

2020-08-23 Thread Holden Karau (Jira)
Holden Karau created YUNIKORN-380: - Summary: Deploying YUNIKORN 0.9 RC2 on Minikube 1.3.1 results in port conflict Key: YUNIKORN-380 URL: https://issues.apache.org/jira/browse/YUNIKORN-380 Project

[jira] [Commented] (YUNIKORN-379) Default log level is too verbose

2020-08-23 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/YUNIKORN-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182865#comment-17182865 ] Holden Karau commented on YUNIKORN-379: --- Also related to this issue is the

[jira] [Created] (YUNIKORN-379) Default log level is too verbose

2020-08-23 Thread Holden Karau (Jira)
Holden Karau created YUNIKORN-379: - Summary: Default log level is too verbose Key: YUNIKORN-379 URL: https://issues.apache.org/jira/browse/YUNIKORN-379 Project: Apache YuniKorn Issue Type

Re: Running K8s integration tests for changes in core?

2020-08-20 Thread Holden Karau
r of ubuntu workers w/minikube and docker, >>> but it shouldn't be too bad as the full integration test takes ~45m, vs 4+ >>> hrs for the regular PRB. >>> >>> i can enable this in about 1m of time if the consensus is for us to want >>> this. >>> >

Re: Running K8s integration tests for changes in core?

2020-08-19 Thread Holden Karau
Sounds good. In the meantime would folks committing things in core run the K8s PRB or run it locally? A second change this morning was committed that broke the K8s PR tests. On Tue, Aug 18, 2020 at 9:53 PM Prashant Sharma wrote: > +1, we should enable. > > On Wed, Aug 19, 2020 at 9:18

Running K8s integration tests for changes in core?

2020-08-18 Thread Holden Karau
Hi Dev Folks, I was wondering how people feel about enabling the K8s PRB automatically for all core changes? Sometimes I forget that a change might impact one of the K8s integration tests since a bunch of them look at log messages. Would folks be OK with turning on the K8s integration PRB for all

[jira] [Created] (SPARK-32657) Decommissioning Integration Test checks for old string

2020-08-18 Thread Holden Karau (Jira)
Holden Karau created SPARK-32657: Summary: Decommissioning Integration Test checks for old string Key: SPARK-32657 URL: https://issues.apache.org/jira/browse/SPARK-32657 Project: Spark Issue

[jira] [Created] (SPARK-32642) Add support for ESS in Spark sidecar

2020-08-17 Thread Holden Karau (Jira)
Holden Karau created SPARK-32642: Summary: Add support for ESS in Spark sidecar Key: SPARK-32642 URL: https://issues.apache.org/jira/browse/SPARK-32642 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-32642) Add support for ESS in Spark sidecar

2020-08-17 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179257#comment-17179257 ] Holden Karau commented on SPARK-32642: -- I'm exploring this. > Add sup

[jira] [Resolved] (SPARK-31198) Use graceful decommissioning as part of dynamic scaling

2020-08-13 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-31198. -- Fix Version/s: 3.1.0 Resolution: Fixed > Use graceful decommissioning as part

[jira] [Commented] (SPARK-32530) SPIP: Kotlin support for Apache Spark

2020-08-11 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175547#comment-17175547 ] Holden Karau commented on SPARK-32530: -- I can understand the desire to have

Re: [VOTE] Release Spark 2.4.7 (RC1)

2020-08-08 Thread Holden Karau
I'm going to go ahead and vote -0 based on that, then. On Fri, Aug 7, 2020 at 11:36 PM Dongjoon Hyun wrote: > Hi, All. > > Unfortunately, there is an on-going discussion about the new decimal > correctness. > > Although we fixed one correctness issue at master and backported it > partially t

Re: spark-on-k8s is still experimental?

2020-08-05 Thread Holden Karau
.com/apache/spark/pull/29368. > > Anyway, many thanks! > > > On Tue, Aug 4, 2020 at 12:26 AM Holden Karau wrote: > >> There was discussion around removing the statement and declaring it GA >> but I believe it was decided to leave it in until an external shuffle >> s

[jira] [Resolved] (SPARK-31197) Exit the executor once all tasks & migrations are finished

2020-08-05 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-31197. -- Fix Version/s: 3.1.0 Target Version/s: 3.1.0 Resolution: Fixed > Exit

[jira] [Assigned] (SPARK-31197) Exit the executor once all tasks & migrations are finished

2020-08-05 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau reassigned SPARK-31197: Assignee: Holden Karau > Exit the executor once all tasks & migrations are f

Re: Someone should get started on migrating your build jobs to ci-builds.apache.org

2020-08-05 Thread Holden Karau
The Spark project is using github actions successfully. We’re currently trying to get more workers added (apparently all of the ASF repos get one pool together). But honestly it’s been faster for us even with the current number of workers so I think that should be fine for mahout. On Wed, Aug 5, 2

Re: [VOTE] Update the committer guidelines to clarify when to commit changes.

2020-08-04 Thread Holden Karau
10:18 PM Xiao Li wrote: > >> +1 >> >> Xiao >> >> On Fri, Jul 31, 2020 at 9:32 AM Mridul Muralidharan >> wrote: >> >>> >>> +1 >>> >>> Thanks, >>> Mridul >>> >>> On Thu, Jul 30, 2020 at 4:49 PM

Re: Removing references to Master

2020-08-04 Thread Holden Karau
I think this is a good idea, and yes keeping it backwards compatible initially is important since we missed the boat on Spark 3. I like the Controller/Leader one since I think that does a good job of reflecting the code's role. On Tue, Aug 4, 2020 at 7:01 AM Tom Graves wrote: > Hey everyone, > >

Re: spark-on-k8s is still experimental?

2020-08-03 Thread Holden Karau
There was discussion around removing the statement and declaring it GA but I believe it was decided to leave it in until an external shuffle service is supported on K8s. On Mon, Aug 3, 2020 at 2:45 AM JackyLee wrote: > +1. It has worked well in our company and we have used it to support > on

Re: [VOTE] Update the committer guidelines to clarify when to commit changes.

2020-07-30 Thread Holden Karau
+1 from myself :) On Thu, Jul 30, 2020 at 2:53 PM Jungtaek Lim wrote: > +1 (non-binding, I guess) > > Thanks for raising the issue and sorting it out! > > On Fri, Jul 31, 2020 at 6:47 AM Holden Karau wrote: > >> Hi Spark Developers, >> >> After the discu

[VOTE] Update the committer guidelines to clarify when to commit changes.

2020-07-30 Thread Holden Karau
Hi Spark Developers, After the discussion of the proposal to amend Spark committer guidelines, it appears folks are generally in agreement on policy clarifications. (See https://lists.apache.org/thread.html/r6706e977fda2c474a7f24775c933c2f46ea19afbfafb03c90f6972ba%40%3Cdev.spark.apache.org%3E, as

[jira] [Resolved] (SPARK-32417) Flaky test: BlockManagerDecommissionIntegrationSuite.verify that an already running task which is going to cache data succeeds on a decommissioned executor

2020-07-30 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32417. -- Fix Version/s: 3.1.0 Assignee: Devesh Agrawal Resolution: Fixed > Flaky t

[jira] [Resolved] (SPARK-32199) Clear shuffle state when decommissioned nodes/executors are finally lost

2020-07-30 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32199. -- Fix Version/s: 3.1.0 Assignee: Devesh Agrawal Resolution: Fixed > Cl

[jira] [Assigned] (SPARK-32198) Don't fail running jobs when decommissioned executors finally go away

2020-07-30 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau reassigned SPARK-32198: Assignee: Devesh Agrawal > Don't fail running jobs when decommissioned executors

[jira] [Resolved] (SPARK-32198) Don't fail running jobs when decommissioned executors finally go away

2020-07-30 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32198. -- Fix Version/s: 3.1.0 Target Version/s: 3.1.0 Resolution: Fixed > Don

Re: [DISCUSS] Amend the commiter guidelines on the subject of -1s & how we expect PR discussion to be treated.

2020-07-25 Thread Holden Karau
ote: > >> +1 >> >> Tom >> >> On Tuesday, July 21, 2020, 03:35:18 PM CDT, Holden Karau < >> hol...@pigscanfly.ca> wrote: >> >> >> Hi Spark Developers, >> >> There has been a rather active discussion regarding the s

Re: Starting work on last Scala 2.13 updates

2020-07-24 Thread Holden Karau
This is awesome progress :) On Fri, Jul 24, 2020 at 8:56 AM Sean Owen wrote: > Status update - we should have Scala 2.13 compiling, with the > exception of the REPL. > Looks like 99% or so of tests pass too, but the remaining ones might > be hard to debug. I haven't looked hard yet. > https://is

[jira] [Commented] (SPARK-32264) More resources in Github Actions

2020-07-23 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164066#comment-17164066 ] Holden Karau commented on SPARK-32264: -- It's being routed inside of Git

Re: Exposing Spark parallelized directory listing & non-locality listing in core

2020-07-23 Thread Holden Karau
Awesome that sounds great :) On Thu, Jul 23, 2020 at 3:43 AM Steve Loughran wrote: > > > On Wed, 22 Jul 2020 at 18:50, Holden Karau wrote: > >> Wonderful. To be clear the patch is more to start the discussion about >> how we want to do it and less what I think is t

[jira] [Resolved] (SPARK-32217) Track whether the worker is also being decommissioned along with an executor

2020-07-22 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-32217. -- Fix Version/s: 3.1.0 Assignee: Devesh Agrawal Resolution: Fixed > Tr

[jira] [Created] (SPARK-32397) Snapshot artifacts can have differing timestamps, making it hard to consume

2020-07-22 Thread Holden Karau (Jira)
Holden Karau created SPARK-32397: Summary: Snapshot artifacts can have differing timestamps, making it hard to consume Key: SPARK-32397 URL: https://issues.apache.org/jira/browse/SPARK-32397 Project

Re: [DISCUSS] Amend the commiter guidelines on the subject of -1s & how we expect PR discussion to be treated.

2020-07-22 Thread Holden Karau
A -1 from a non-committer can be overridden only with input from multiple committers and suitable time for any committer to raise concerns. A -1 from a committer who can not be reached requires a consensus vote of the PMC under ASF voting rules to determine the next steps within the ASF guidelines f

Re: Exposing Spark parallelized directory listing & non-locality listing in core

2020-07-22 Thread Holden Karau
Wonderful. To be clear the patch is more to start the discussion about how we want to do it and less what I think is the right way. On Wed, Jul 22, 2020 at 10:47 AM Steve Loughran wrote: > > > On Wed, 22 Jul 2020 at 00:51, Holden Karau wrote: > >> Hi Folks, >> >

Exposing Spark parallelized directory listing & non-locality listing in core

2020-07-21 Thread Holden Karau
Hi Folks, In Spark SQL there is the ability to have Spark do its partition discovery/file listing in parallel on the worker nodes and also avoid locality lookups. I'd like to expose this in core, but given the Hadoop APIs it's a bit more complicated to do right. I made a quick POC and two potenti

[jira] [Resolved] (SPARK-24266) Spark client terminates while driver is still running

2020-07-21 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-24266. -- Fix Version/s: 3.1.0 Target Version/s: 3.1.0 (was: 2.4.7, 3.1.0) Resolution

[DISCUSS] Amend the commiter guidelines on the subject of -1s & how we expect PR discussion to be treated.

2020-07-21 Thread Holden Karau
Hi Spark Developers, There has been a rather active discussion regarding the specific vetoes that occurred during Spark 3. From that I believe we are now mostly in agreement that it would be best to clarify our rules around code vetoes & merging in general. Personally I believe this change is impor

[jira] [Created] (SPARK-32381) Expose the ability for users to use parallel file & avoid location information discovery in RDDs

2020-07-21 Thread Holden Karau (Jira)
Holden Karau created SPARK-32381: Summary: Expose the ability for users to use parallel file & avoid location information discovery in RDDs Key: SPARK-32381 URL: https://issues.apache.org/jira/browse/SPARK-3

[jira] [Commented] (SPARK-26345) Parquet support Column indexes

2020-07-21 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162313#comment-17162313 ] Holden Karau commented on SPARK-26345: -- We don't assign issues norma

Re: [PySpark] Revisiting PySpark type annotations

2020-07-21 Thread Holden Karau
Yeah I think this could be a great project now that we're only Python 3.5+. One potential is making this an Outreachy project to get more folks from different backgrounds involved in Spark. On Tue, Jul 21, 2020 at 12:33 PM Driesprong, Fokko wrote: > Since we've recently dropped support for Pytho

Re: [OSS DIGEST] The major changes of Apache Spark from June 3 to June 16

2020-07-21 Thread Holden Karau
ingbo > > On Tue, Jul 21, 2020 at 11:13 AM Holden Karau > wrote: > >> I'd also add [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are >> being shutdown & >> >> [SPARK-21040][CORE] Speculate tasks which are running on decommission >> executors

Re: [OSS DIGEST] The major changes of Apache Spark from June 3 to June 16

2020-07-21 Thread Holden Karau
I'd also add [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown & [SPARK-21040][CORE] Speculate tasks which are running on decommission executors two of the PRs merged after the decommissioning SPIP. On Tue, Jul 21, 2020 at 10:53 AM Xingbo Jiang wrote: > Hi all, > > This i

[jira] [Resolved] (SPARK-20629) Copy shuffle data when nodes are being shut down

2020-07-19 Thread Holden Karau (Jira)
[ https://issues.apache.org/jira/browse/SPARK-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holden Karau resolved SPARK-20629. -- Fix Version/s: 3.1.0 Assignee: Holden Karau Resolution: Fixed > Copy shuf
