[spark] branch master updated: [SPARK-42205][CORE] Don't log accumulator values in stage / task start and getting result event logs

2023-09-29 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new bb59b489204 [SPARK-42205][CORE] Don't log

[spark] branch master updated: [SPARK-44818] Fix race for pending task kill issued before taskThread is initialized

2023-08-21 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c34ec411244 [SPARK-44818] Fix race

[spark] branch master updated: [SPARK-43300][CORE] NonFateSharingCache wrapper for Guava Cache

2023-05-15 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d53ddbe00fe [SPARK-43300][CORE

[spark] branch master updated: [SPARK-40261][CORE] Exclude DirectTaskResult metadata when calculating result size

2022-08-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5a4b075f95f [SPARK-40261][CORE] Exclude

[spark] branch master updated: [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()

2022-08-29 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 295dd57c13c [SPARK-40235][CORE] Use

[spark] branch master updated: [SPARK-40211][CORE][SQL] Allow customize initial partitions number in take() behavior

2022-08-26 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1178bcecc83 [SPARK-40211][CORE][SQL] Allow

[spark] branch master updated (50c163578cf -> 6cd9d88e237)

2022-08-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 50c163578cf Revert "[SPARK-4][SQL] Update INSERTs without user-specified fields to not automaticall

[spark] branch master updated: [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver

2022-08-10 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e17d8ecabca [SPARK-39983][CORE][SQL] Do

[spark] branch master updated: [SPARK-39636][CORE][UI] Fix multiple bugs in JsonProtocol, impacting off heap StorageLevels and Task/Executor ResourceRequests

2022-06-30 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a39fc8773b2 [SPARK-39636][CORE][UI] Fix

[spark] branch branch-3.2 updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 7fd2e967a8a [SPARK-39422][SQL

[spark] branch branch-3.3 updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new ff048f1b69e [SPARK-39422][SQL

[spark] branch master updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8765eea1c08 [SPARK-39422][SQL] Improve error

[spark] branch branch-3.3 updated: [SPARK-39361] Don't use Log4J2's extended throwable conversion pattern in default logging configurations

2022-06-02 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 4da8f3a76b1 [SPARK-39361] Don't use

[spark] branch master updated: [SPARK-39361] Don't use Log4J2's extended throwable conversion pattern in default logging configurations

2022-06-02 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fd45c3656be [SPARK-39361] Don't use Log4J2's

[spark] 01/02: [SPARK-32911][CORE] Free memory in UnsafeExternalSorter.SpillableIterator.spill() when all records have been read

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02488e0df30a25fd235e2dd25e9b1b3404150125 Author: Tom van Bussel AuthorDate: Fri Sep 18 11:49:26 2020 +

[spark] 02/02: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit c653d287b15db3e50fa206071a5435028879f15f Author: sandeepvinayak AuthorDate: Tue May 31 15:28:07 2022 -0700

[spark] branch branch-3.0 updated (9b268122f68 -> c653d287b15)

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git from 9b268122f68 [SPARK-39293][SQL] Fix the accumulator of ArrayAggregate to handle complex types properly new

[spark] branch branch-3.1 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 0908337a765 [SPARK-39283][CORE] Fix

[spark] branch branch-3.2 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 606830e9cae [SPARK-39283][CORE] Fix

[spark] branch branch-3.3 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 1ad1c18fc28 [SPARK-39283][CORE] Fix

[spark] branch master updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8d0c035f102 [SPARK-39283][CORE] Fix deadlock

[spark] branch branch-3.0 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 3aaf722 [SPARK-37784][SQL] Correctly

[spark] branch branch-3.1 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 5cc8b39 [SPARK-37784][SQL] Correctly

[spark] branch branch-3.2 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 45b7b7e [SPARK-37784][SQL] Correctly

[spark] branch master updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new eeef48fa [SPARK-37784][SQL] Correctly handle

[spark] branch master updated: [SPARK-37379][SQL] Add tree pattern pruning to CTESubstitution rule

2021-11-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3b4eb1f [SPARK-37379][SQL] Add tree pattern

[spark] 01/01: hacky wip towards python udf profiling

2021-11-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch python-udf-accumulator in repository https://gitbox.apache.org/repos/asf/spark.git commit 9213a85a40499fc7f0e24ea14c5051c45a022ef2 Author: Josh Rosen AuthorDate: Wed Oct 20 16:17:44 2021

[spark] branch python-udf-accumulator created (now 9213a85)

2021-11-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch python-udf-accumulator in repository https://gitbox.apache.org/repos/asf/spark.git. at 9213a85 hacky wip towards python udf profiling This branch includes the following new commits

[spark] branch master updated: [SPARK-36933][CORE] Clean up TaskMemoryManager.acquireExecutionMemory()

2021-10-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1ef6c13 [SPARK-36933][CORE] Clean up

[spark] branch branch-3.0 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 1709265 [SPARK-23626][CORE] Eagerly

[spark] branch branch-3.1 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new c43f355 [SPARK-23626][CORE] Eagerly

[spark] branch branch-3.2 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 01ee46e [SPARK-23626][CORE] Eagerly

[spark] branch master updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c4e975e [SPARK-23626][CORE] Eagerly compute

[spark] branch branch-3.2 updated: [SPARK-36774][CORE][TESTS] Move SparkSubmitTestUtils to core module and use it in SparkSubmitSuite

2021-09-16 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 3502fda [SPARK-36774][CORE][TESTS

[spark] branch master updated (f1f2ec3 -> 3ae6e67)

2021-09-16 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f1f2ec3 [SPARK-36735][SQL][FOLLOWUP] Fix indentation of DynamicPartitionPruningSuite add 3ae6e67 [SPARK

[spark] branch master updated (33e45ec -> 23bed0d)

2019-08-22 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 33e45ec [SPARK-28769][CORE] Improve warning message of BarrierExecutionMode when required slots > maximum sl

[spark] branch branch-2.4 updated: [SPARK-26038][BRANCH-2.4] Decimal toScalaBigInt/toJavaBigInteger for decimals not fitting in long

2019-06-21 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new a71e90a [SPARK-26038][BRANCH-2.4

[spark] branch master updated: [SPARK-28112][TEST] Fix Kryo exception perf. bottleneck in tests due to absence of ML/MLlib classes

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ec032ce [SPARK-28112][TEST] Fix Kryo

[spark] branch branch-2.4 updated: [SPARK-26555][SQL][BRANCH-2.4] make ScalaReflection subtype checking thread safe

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new ba7f61e [SPARK-26555][SQL][BRANCH

[spark] branch master updated: [SPARK-28102][CORE] Avoid performance problems when lz4-java JNI libraries fail to initialize

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6b27ad5 [SPARK-28102][CORE] Avoid

[spark] branch master updated: [SPARK-27839][SQL] Change UTF8String.replace() to operate on UTF8 bytes

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fc65e0f [SPARK-27839][SQL] Change

[spark] branch master updated: [SPARK-27684][SQL] Avoid conversion overhead for primitive types

2019-05-30 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 93db7b8 [SPARK-27684][SQL] Avoid conversion

[spark-website] branch asf-site updated: Update Josh Rosen's affiliation

2019-05-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new f90d6dd Update Josh Rosen's

spark git commit: [SPARK-22997] Add additional defenses against use of freed MemoryBlocks

2018-01-10 Thread joshrosen
ks.com> Closes #20191 from JoshRosen/SPARK-22997-add-defenses-against-use-after-free-bugs-in-memory-allocator. (cherry picked from commit f340b6b3066033d40b7e163fd5fb68e9820adfb1) Signed-off-by: Josh Rosen <joshro...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/re

spark git commit: [SPARK-22997] Add additional defenses against use of freed MemoryBlocks

2018-01-10 Thread joshrosen
Closes #20191 from JoshRosen/SPARK-22997-add-defenses-against-use-after-free-bugs-in-memory-allocator. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f340b6b3 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f340

spark git commit: [SPARK-21444] Be more defensive when removing broadcasts in MapOutputTracker

2017-07-17 Thread joshrosen
ork unreliability / fuzz testing tools. Author: Josh Rosen <joshro...@databricks.com> Closes #18662 from JoshRosen/SPARK-21444. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5952ad2b Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-20715] Store MapStatuses only in MapOutputTracker, not ShuffleMapStage

2017-06-11 Thread joshrosen
## How was this patch tested? Existing tests. I purposely avoided making interface / API which would require significant updates or modifications to test code. Author: Josh Rosen <joshro...@databricks.com> Closes #17955 from JoshRosen/map-output-tracker-rewrite. Project: http://git-wip-us.apache.org/r

spark git commit: HOTFIX: fix Scalastyle break introduced in 4d57981cfb18e7500cde6c03ae46c7c9b697d064

2017-05-30 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master de953c214 -> 798a04fd7 HOTFIX: fix Scalastyle break introduced in 4d57981cfb18e7500cde6c03ae46c7c9b697d064 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/798a04fd

spark git commit: [SPARK-20102] Fix nightly packaging and RC packaging scripts w/ two minor build fixes

2017-03-27 Thread joshrosen
ipt. ## How was this patch tested? The LFTP fix was tested by manually running the failing commands on AMPLab Jenkins against the ASF SFTP server. The PySpark fix was tested locally. Author: Josh Rosen <joshro...@databricks.com> Closes #17437 from JoshRosen/spark-20102. Project: http:

spark git commit: [SPARK-19529][BRANCH-1.6] Backport PR #16866 to branch-1.6

2017-02-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 e78138a43 -> a50ef3d9a [SPARK-19529][BRANCH-1.6] Backport PR #16866 to branch-1.6 ## What changes were proposed in this pull request? This PR backports PR #16866 to branch-1.6 ## How was this patch tested? Existing tests. Author:

spark git commit: [SPARK-18952][BACKPORT] Regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.1 80a3e13e5 -> 3b6ac323b [SPARK-18952][BACKPORT] Regex strings not properly escaped in codegen for aggregations ## What changes were proposed in this pull request? Backport for #16361 to 2.1 branch. ## How was this patch tested? Unit

spark git commit: [SPARK-18952] Regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 15c2bd01b -> faabe69cc [SPARK-18952] Regex strings not properly escaped in codegen for aggregations ## What changes were proposed in this pull request? If I use the function regexp_extract, and then in my regex string, use `\`, i.e.

spark git commit: [SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors

2016-12-20 Thread joshrosen
Josh Rosen <joshro...@databricks.com> Closes #16189 from JoshRosen/cancellation. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2971ae56 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2971ae56 Diff: http://git-wip-us

spark git commit: [SPARK-18553][CORE][BRANCH-1.6] Fix leak of TaskSetManager following executor loss

2016-12-01 Thread joshrosen
reviewed #15986. Author: Josh Rosen <joshro...@databricks.com> Closes #16070 from JoshRosen/fix-leak-following-total-executor-loss-1.6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8f25cb26 Tree: http://git-wip-us.apa

spark git commit: [SPARK-18553][CORE] Fix leak of TaskSetManager following executor loss

2016-11-29 Thread joshrosen
hedulerImpl.scala#L523) in `removeExecutor`, so I'd appreciate a very careful review of these changes. I added a new unit test to `TaskSchedulerImplSuite`. /cc kayousterhout and markhamstra, who reviewed #15986. Author: Josh Rosen <joshro...@databricks.com> Closes #16045 from JoshRo

spark git commit: [SPARK-18553][CORE] Fix leak of TaskSetManager following executor loss

2016-11-29 Thread joshrosen
Author: Josh Rosen <joshro...@databricks.com> Closes #16045 from JoshRosen/fix-leak-following-total-executor-loss-master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9a02f682 Tree: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-18553][CORE][BRANCH-2.0] Fix leak of TaskSetManager following executor loss

2016-11-28 Thread joshrosen
d and merged). ## How was this patch tested? I added a new unit test to `TaskSchedulerImplSuite`. You can check out this PR as of 25e455e711b978cd331ee0f484f70fde31307634 to see the failing test. cc kayousterhout, markhamstra, rxin for review. Author: Josh Rosen <joshro...@databricks.com> Closes #1

spark git commit: [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed

2016-11-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.1 951579382 -> 6a3cbbc03 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed ## What changes were proposed in this pull request? This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the

spark git commit: [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed

2016-11-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master bb6cdfd9a -> a36a76ac4 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed ## What changes were proposed in this pull request? This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the jars

spark git commit: [SPARK-18418] Fix flags for make_binary_release for hadoop profile

2016-11-12 Thread joshrosen
lly tested as part of https://github.com/apache/spark/pull/15659 by having the build succeed. cc joshrosen Author: Holden Karau <hol...@us.ibm.com> Closes #15860 from holdenk/minor-fix-release-build-script. (cherry picked from commit 1386fd28daf798bf152606f4da30a36223d75d18) Signed-off-by: J

spark git commit: [SPARK-18418] Fix flags for make_binary_release for hadoop profile

2016-11-12 Thread joshrosen
ted as part of https://github.com/apache/spark/pull/15659 by having the build succeed. cc joshrosen Author: Holden Karau <hol...@us.ibm.com> Closes #15860 from holdenk/minor-fix-release-build-script. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

spark git commit: [SPARK-18236] Reduce duplicate objects in Spark UI and HistoryServer

2016-11-07 Thread joshrosen
ntent.com/assets/50748/19953290/6a271290-a129-11e6-93ad-b825f1448886.png) Author: Josh Rosen <joshro...@databricks.com> Closes #15743 from JoshRosen/spark-ui-memory-usage. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3a7

spark git commit: [SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness

2016-10-21 Thread joshrosen
com/typesafehub/migration-manager/issues/115). Author: Josh Rosen <joshro...@databricks.com> Closes #15571 from JoshRosen/SPARK-18034. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a65d40ab Tree: http://git-wip-us.a

spark git commit: [SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness

2016-10-21 Thread joshrosen
com/typesafehub/migration-manager/issues/115). Author: Josh Rosen <joshro...@databricks.com> Closes #15571 from JoshRosen/SPARK-18034. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b3b4b954 Tree: http://git-wip-us.apache.

spark git commit: [SPARK-17803][TESTS] Upgrade docker-client dependency

2016-10-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 a2bf09588 -> e355ca8e8 [SPARK-17803][TESTS] Upgrade docker-client dependency [SPARK-17803: Docker integration tests don't run with "Docker for Mac"](https://issues.apache.org/jira/browse/SPARK-17803) ## What changes were proposed in

spark git commit: [SPARK-17803][TESTS] Upgrade docker-client dependency

2016-10-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 9a48e60e6 -> 49d11d499 [SPARK-17803][TESTS] Upgrade docker-client dependency [SPARK-17803: Docker integration tests don't run with "Docker for Mac"](https://issues.apache.org/jira/browse/SPARK-17803) ## What changes were proposed in this

spark git commit: [SPARK-17712][SQL] Fix invalid pushdown of data-independent filters beneath aggregates

2016-09-29 Thread joshrosen
nce any columns. ## How was this patch tested? New regression test in FilterPushdownSuite. Author: Josh Rosen <joshro...@databricks.com> Closes #15289 from JoshRosen/SPARK-17712. (cherry picked from commit 37eb9184f1e9f1c07142c66936671f4711ef407d) Signed-off-by: Josh Rosen <joshro...@da

spark git commit: [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition.

2016-09-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 ca8130050 -> 7ffafa3bf [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition. ## What changes were proposed in this pull request? Currently our Optimizer may reorder the

spark git commit: [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

2016-09-27 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 98bbc4410 -> 2cd327ef5 [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore ## What changes were proposed in this pull request? There is an assert in MemoryStore's putIteratorAsValues method which is used to

spark git commit: [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

2016-09-27 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2f84a6866 -> e7bce9e18 [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore ## What changes were proposed in this pull request? There is an assert in MemoryStore's putIteratorAsValues method which is used to

spark git commit: [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

2016-09-27 Thread joshrosen
throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`. Author: Josh Rosen <joshro...@databricks.com> Closes #15265 from JoshRosen/SPARK-17618-master. (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6) Signed-off-by: Josh Rosen <joshro...@databricks.com>

spark git commit: [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

2016-09-27 Thread joshrosen
IllegalArgumentException if it is called with an object that is not an `UnsafeRow`. Author: Josh Rosen <joshro...@databricks.com> Closes #15265 from JoshRosen/SPARK-17618-master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2f84a686 T

spark git commit: [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames

2016-09-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 f14f47f07 -> 243bdb11d [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames Consider you have a bucket as `s3a://some-bucket` and under it you have files: ``` s3a://some-bucket/file1.parquet

spark git commit: [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames

2016-09-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 9f24a17c5 -> 85d609cf2 [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames ## What changes were proposed in this pull request? Consider you have a bucket as `s3a://some-bucket` and under it you have files: ```

spark git commit: [SPARK-17485] Prevent failed remote reads of cached blocks from failing entire job (branch-1.6 backport)

2016-09-22 Thread joshrosen
5186 from JoshRosen/SPARK-17485-branch-1.6-backport. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/94524cef Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/94524cef Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published

2016-09-21 Thread joshrosen
<joshro...@databricks.com> Closes #15167 from JoshRosen/stop-publishing-kinesis-assembly. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce0a222f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ce0a222f Diff: http:

spark git commit: [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published

2016-09-21 Thread joshrosen
joshro...@databricks.com> Closes #15167 from JoshRosen/stop-publishing-kinesis-assembly. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d7ee1221 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d7ee1221 Diff: http://git-wip-us.a

spark git commit: [SPARK-17160] Properly escape field names in code-generated error messages

2016-09-19 Thread joshrosen
5156 from JoshRosen/SPARK-17160. (cherry picked from commit e719b1c045ba185d242d21bbfcdee2c84dafc587) Signed-off-by: Josh Rosen <joshro...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7026eb87 Tree:

spark git commit: [SPARK-17160] Properly escape field names in code-generated error messages

2016-09-19 Thread joshrosen
ror message string literals in generated Java code, leading to compilation errors. This patch addresses these issues by using `addReferenceObj` to store the error messages as string fields rather than inline string constants. Author: Josh Rosen <joshro...@databricks.com> Closes #15156 from

spark git commit: [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars.

2016-09-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 c4660d607 -> f56035ba6 [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars. ## What changes were proposed in this pull request? Docker tests are using older version of jersey jars (1.19), which

spark git commit: [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars.

2016-09-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d720a4019 -> cdea1d134 [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars. ## What changes were proposed in this pull request? Docker tests are using older version of jersey jars (1.19), which was

spark git commit: [SPARK-17491] Close serialization stream to fix wrong answer bug in putIteratorAsBytes()

2016-09-17 Thread joshrosen
ith `zip` but didn't first check that the lengths of the two collections were equal, causing missing records to go unnoticed. The updated test case reproduced this bug. In addition, I added a new `PartiallySerializedBlockSuite` to unit test that component. Author: Josh Rosen <joshro...@databricks.com>

spark git commit: [SPARK-17491] Close serialization stream to fix wrong answer bug in putIteratorAsBytes()

2016-09-17 Thread joshrosen
ith `zip` but didn't first check that the lengths of the two collections were equal, causing missing records to go unnoticed. The updated test case reproduced this bug. In addition, I added a new `PartiallySerializedBlockSuite` to unit test that component. Author: Josh Rosen <joshro...@databrick
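
The test pitfall mentioned in this commit message can be reproduced outside Spark. The following is a minimal sketch in Python, not the Spark test code itself; `naive_equal` and `checked_equal` are hypothetical helper names. Python's `zip`, like Scala's, stops at the shorter input, so an element-wise comparison can pass even when records are missing unless the lengths are compared first.

```python
def naive_equal(expected, actual):
    # zip() stops at the shorter input, so missing trailing records
    # are never compared and the check can pass by accident.
    return all(e == a for e, a in zip(expected, actual))


def checked_equal(expected, actual):
    # Compare lengths first so that dropped records are detected.
    return len(expected) == len(actual) and naive_equal(expected, actual)


expected = [1, 2, 3, 4]
actual = [1, 2, 3]  # one record went missing

print(naive_equal(expected, actual))    # True: the loss goes unnoticed
print(checked_equal(expected, actual))  # False
```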

spark git commit: [SPARK-17484] Prevent invalid block locations from being reported after put() exceptions

2016-09-15 Thread joshrosen
ted? Two new regression tests in BlockManagerSuite. Author: Josh Rosen <joshro...@databricks.com> Closes #15085 from JoshRosen/SPARK-17484. (cherry picked from commit 1202075c95eabba0ffebc170077df798f271a139) Signed-off-by: Josh Rosen <joshro...@databricks.com> Project:

spark git commit: [SPARK-17484] Prevent invalid block locations from being reported after put() exceptions

2016-09-15 Thread joshrosen
ted? Two new regression tests in BlockManagerSuite. Author: Josh Rosen <joshro...@databricks.com> Closes #15085 from JoshRosen/SPARK-17484. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1202075c Tree: http://git-wip-us.a

spark git commit: [SPARK-17547] Ensure temp shuffle data file is cleaned up after error

2016-09-15 Thread joshrosen
ion of the temp file. This patch avoids this potential cause of disk-space leaks by adding `finally` blocks to ensure that temp files are always deleted if they haven't been renamed. Author: Josh Rosen <joshro...@databricks.com> Closes #15104 from JoshRosen/cleanup-tmp-data-file-in-shuff

spark git commit: [SPARK-17547] Ensure temp shuffle data file is cleaned up after error

2016-09-15 Thread joshrosen
ion of the temp file. This patch avoids this potential cause of disk-space leaks by adding `finally` blocks to ensure that temp files are always deleted if they haven't been renamed. Author: Josh Rosen <joshro...@databricks.com> Closes #15104 from JoshRosen/cleanup-tmp-data-file-in-shuff

spark git commit: [SPARK-17547] Ensure temp shuffle data file is cleaned up after error

2016-09-15 Thread joshrosen
ion of the temp file. This patch avoids this potential cause of disk-space leaks by adding `finally` blocks to ensure that temp files are always deleted if they haven't been renamed. Author: Josh Rosen <joshro...@databricks.com> Closes #15104 from JoshRosen/cleanup-tmp-data-file-in-shuffle-writer.
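
The cleanup pattern this commit message describes (delete the temp file in a `finally` block unless it has already been renamed) can be sketched as follows. This is an illustrative Python version under stated assumptions, not Spark's shuffle-writer code; `write_atomically` and its `fail` flag are hypothetical names used only to simulate an error.

```python
import os
import tempfile


def write_atomically(dest, data, fail=False):
    # Write to a temp file next to the destination, then rename it into place.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(dest)))
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            if fail:
                # Hypothetical hook to simulate an error during the write.
                raise IOError("simulated write failure")
        os.replace(tmp, dest)  # after a successful rename, tmp no longer exists
    finally:
        # Runs on success and on error: remove the temp file
        # only if it was not renamed to its final location.
        if os.path.exists(tmp):
            os.remove(tmp)
```

Without the `finally` block, an exception raised between creating the temp file and renaming it would leave the file behind, which is the kind of disk-space leak the message refers to.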

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 fffcec90b -> bb2bdb440 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak The expression like `if (memoryMap(taskAttemptId) == 0)

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master dbfc7aa4d -> bb3229436 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak The expression like `if (memoryMap(taskAttemptId) == 0) memoryMap.remove(taskAttemptId)`

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 bf3f6d2f1 -> a447cd888 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak ## What changes were proposed in this pull request? The expression like `if

spark git commit: [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ff6e4cbdc -> e33bfaed3 [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely ## What changes were proposed in this pull request? Make CollectionAccumulator and SetAccumulator's value can be

spark git commit: [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 fab77dadf -> fffcec90b [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely ## What changes were proposed in this pull request? Make CollectionAccumulator and SetAccumulator's value can

spark git commit: [SPARK-17485] Prevent failed remote reads of cached blocks from failing entire job

2016-09-12 Thread joshrosen
tes` returned `None` when no remote locations for the block could be found (which could occur if an executor died and its block manager de-registered with the master). Author: Josh Rosen <joshro...@databricks.com> Closes #15037 from JoshRosen/SPARK-17485. (cherry picked fr

spark git commit: [SPARK-14818] Post-2.0 MiMa exclusion and build changes

2016-09-12 Thread joshrosen
sen <joshro...@databricks.com> Closes #15061 from JoshRosen/post-2.0-mima-changes. (cherry picked from commit 7c51b99a428a965ff7d136e1cdda20305d260453) Signed-off-by: Josh Rosen <joshro...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wi

spark git commit: [SPARK-14818] Post-2.0 MiMa exclusion and build changes

2016-09-12 Thread joshrosen
joshro...@databricks.com> Closes #15061 from JoshRosen/post-2.0-mima-changes. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7c51b99a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7c51b99a Diff: http:

spark git commit: [SPARK-17483] Refactoring in BlockManager status reporting and block removal

2016-09-12 Thread joshrosen
ere into their own separate PR in order to make them easier to review and so that the behavior-changing parts of my other patch can be isolated to their own PR. Author: Josh Rosen <joshro...@databricks.com> Closes #15036 from JoshRosen/cache-failure-race-conditions-refactorings-only.

spark git commit: [SPARK-17503][CORE] Fix memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 30521522d -> 0a36e360c [SPARK-17503][CORE] Fix memory leak in Memory store when unable to cache the whole RDD in memory ## What changes were proposed in this pull request? MemoryStore may throw OutOfMemoryError when trying to

spark git commit: [SPARK-17503][CORE] Fix memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 8087ecf8d -> 1742c3ab8 [SPARK-17503][CORE] Fix memory leak in Memory store when unable to cache the whole RDD in memory ## What changes were proposed in this pull request? MemoryStore may throw OutOfMemoryError when trying to cache a

spark git commit: [SPARK-17405] RowBasedKeyValueBatch should use default page size to prevent OOMs

2016-09-08 Thread joshrosen
ask for the first-level hash map storage, even when running in low-memory situations such as local mode. This changes it to use the memory manager default page size, which is automatically reduced from 64MB in these situations. cc ooq JoshRosen ## How was this patch tested? Tested manually with `bin/sp
