(spark) branch master updated: [SPARK-42204][CORE] Add option to disable redundant logging of TaskMetrics internal accumulators in event logs

2024-09-06 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f9a8ca54e7ad [SPARK-42204][CORE] Add option

(spark) branch master updated: [SPARK-48628][CORE] Add task peak on/off heap memory metrics

2024-08-21 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 9b9a7a7478d1 [SPARK-48628][CORE] Add task

(spark) branch master updated: [SPARK-48716] Add jobGroupId to SparkListenerSQLExecutionStart

2024-07-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4c99c4df7f9c [SPARK-48716] Add jobGroupId to

(spark) branch master updated: [SPARK-48541][CORE] Add a new exit code for executors killed by TaskReaper

2024-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 54587638685b [SPARK-48541][CORE] Add a new

(spark) branch master updated: [SPARK-48544][SQL] Reduce memory pressure of empty TreeNode BitSets

2024-06-10 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5a2f374a208f [SPARK-48544][SQL] Reduce memory

[spark] branch master updated: [SPARK-42205][CORE] Don't log accumulator values in stage / task start and getting result event logs

2023-09-29 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new bb59b489204 [SPARK-42205][CORE] Don'

[spark] branch master updated: [SPARK-44818] Fix race for pending task kill issued before taskThread is initialized

2023-08-21 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c34ec411244 [SPARK-44818] Fix race for

[spark] branch master updated: [SPARK-43300][CORE] NonFateSharingCache wrapper for Guava Cache

2023-05-15 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d53ddbe00fe [SPARK-43300][CORE

[spark] branch master updated: [SPARK-40261][CORE] Exclude DirectTaskResult metadata when calculating result size

2022-08-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5a4b075f95f [SPARK-40261][CORE] Exclude

[spark] branch master updated: [SPARK-40235][CORE] Use interruptible lock instead of synchronized in Executor.updateDependencies()

2022-08-29 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 295dd57c13c [SPARK-40235][CORE] Use

[spark] branch master updated: [SPARK-40211][CORE][SQL] Allow customize initial partitions number in take() behavior

2022-08-26 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1178bcecc83 [SPARK-40211][CORE][SQL] Allow

[spark] branch master updated (50c163578cf -> 6cd9d88e237)

2022-08-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 50c163578cf Revert "[SPARK-4][SQL] Update INSERTs without user-specified fields to not automaticall

[spark] branch master updated: [SPARK-39983][CORE][SQL] Do not cache unserialized broadcast relations on the driver

2022-08-10 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e17d8ecabca [SPARK-39983][CORE][SQL] Do not

[spark] branch master updated: [SPARK-39636][CORE][UI] Fix multiple bugs in JsonProtocol, impacting off heap StorageLevels and Task/Executor ResourceRequests

2022-06-30 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a39fc8773b2 [SPARK-39636][CORE][UI] Fix

[spark] branch branch-3.2 updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 7fd2e967a8a [SPARK-39422][SQL

[spark] branch branch-3.3 updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new ff048f1b69e [SPARK-39422][SQL

[spark] branch master updated: [SPARK-39422][SQL] Improve error message for 'SHOW CREATE TABLE' with unsupported serdes

2022-06-09 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8765eea1c08 [SPARK-39422][SQL] Improve error

[spark] branch branch-3.3 updated: [SPARK-39361] Don't use Log4J2's extended throwable conversion pattern in default logging configurations

2022-06-02 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 4da8f3a76b1 [SPARK-39361] Don'

[spark] branch master updated: [SPARK-39361] Don't use Log4J2's extended throwable conversion pattern in default logging configurations

2022-06-02 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fd45c3656be [SPARK-39361] Don't use Log

[spark] 01/02: [SPARK-32911][CORE] Free memory in UnsafeExternalSorter.SpillableIterator.spill() when all records have been read

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02488e0df30a25fd235e2dd25e9b1b3404150125 Author: Tom van Bussel AuthorDate: Fri Sep 18 11:49:26 2020 +

[spark] 02/02: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit c653d287b15db3e50fa206071a5435028879f15f Author: sandeepvinayak AuthorDate: Tue May 31 15:28:07 2022 -0700

[spark] branch branch-3.0 updated (9b268122f68 -> c653d287b15)

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git from 9b268122f68 [SPARK-39293][SQL] Fix the accumulator of ArrayAggregate to handle complex types properly new

[spark] branch branch-3.1 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 0908337a765 [SPARK-39283][CORE] Fix

[spark] branch branch-3.2 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 606830e9cae [SPARK-39283][CORE] Fix

[spark] branch branch-3.3 updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 1ad1c18fc28 [SPARK-39283][CORE] Fix

[spark] branch master updated: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-31 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8d0c035f102 [SPARK-39283][CORE] Fix deadlock

[spark] branch branch-3.0 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 3aaf722 [SPARK-37784][SQL] Correctly

[spark] branch branch-3.1 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 5cc8b39 [SPARK-37784][SQL] Correctly

[spark] branch branch-3.2 updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 45b7b7e [SPARK-37784][SQL] Correctly

[spark] branch master updated: [SPARK-37784][SQL] Correctly handle UDTs in CodeGenerator.addBufferedState()

2022-01-04 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new eeef48fa [SPARK-37784][SQL] Correctly handle

[spark] branch master updated: [SPARK-37379][SQL] Add tree pattern pruning to CTESubstitution rule

2021-11-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3b4eb1f [SPARK-37379][SQL] Add tree pattern

[spark] 01/01: hacky wip towards python udf profiling

2021-11-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch python-udf-accumulator in repository https://gitbox.apache.org/repos/asf/spark.git commit 9213a85a40499fc7f0e24ea14c5051c45a022ef2 Author: Josh Rosen AuthorDate: Wed Oct 20 16:17:44 2021

[spark] branch python-udf-accumulator created (now 9213a85)

2021-11-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch python-udf-accumulator in repository https://gitbox.apache.org/repos/asf/spark.git. at 9213a85 hacky wip towards python udf profiling This branch includes the following new commits

[spark] branch master updated: [SPARK-36933][CORE] Clean up TaskMemoryManager.acquireExecutionMemory()

2021-10-18 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1ef6c13 [SPARK-36933][CORE] Clean up

[spark] branch branch-3.0 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 1709265 [SPARK-23626][CORE] Eagerly

[spark] branch branch-3.1 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new c43f355 [SPARK-23626][CORE] Eagerly

[spark] branch branch-3.2 updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 01ee46e [SPARK-23626][CORE] Eagerly

[spark] branch master updated: [SPARK-23626][CORE] Eagerly compute RDD.partitions on entire DAG when submitting job to DAGScheduler

2021-10-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c4e975e [SPARK-23626][CORE] Eagerly compute

[spark] branch branch-3.2 updated: [SPARK-36774][CORE][TESTS] Move SparkSubmitTestUtils to core module and use it in SparkSubmitSuite

2021-09-16 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 3502fda [SPARK-36774][CORE][TESTS

[spark] branch master updated (f1f2ec3 -> 3ae6e67)

2021-09-16 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f1f2ec3 [SPARK-36735][SQL][FOLLOWUP] Fix indentation of DynamicPartitionPruningSuite add 3ae6e67 [SPARK

[spark] branch master updated (33e45ec -> 23bed0d)

2019-08-22 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 33e45ec [SPARK-28769][CORE] Improve warning message of BarrierExecutionMode when required slots > maximum sl

[spark] branch branch-2.4 updated: [SPARK-26038][BRANCH-2.4] Decimal toScalaBigInt/toJavaBigInteger for decimals not fitting in long

2019-06-21 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new a71e90a [SPARK-26038][BRANCH-2.4

[spark] branch master updated: [SPARK-28112][TEST] Fix Kryo exception perf. bottleneck in tests due to absence of ML/MLlib classes

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ec032ce [SPARK-28112][TEST] Fix Kryo

[spark] branch branch-2.4 updated: [SPARK-26555][SQL][BRANCH-2.4] make ScalaReflection subtype checking thread safe

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new ba7f61e [SPARK-26555][SQL][BRANCH

[spark] branch master updated: [SPARK-28102][CORE] Avoid performance problems when lz4-java JNI libraries fail to initialize

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6b27ad5 [SPARK-28102][CORE] Avoid

[spark] branch master updated: [SPARK-27839][SQL] Change UTF8String.replace() to operate on UTF8 bytes

2019-06-19 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fc65e0f [SPARK-27839][SQL] Change

[spark] branch master updated: [SPARK-27684][SQL] Avoid conversion overhead for primitive types

2019-05-30 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 93db7b8 [SPARK-27684][SQL] Avoid conversion

[spark-website] branch asf-site updated: Update Josh Rosen's affiliation

2019-05-14 Thread joshrosen
This is an automated email from the ASF dual-hosted git repository. joshrosen pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new f90d6dd Update Josh Rosen

spark git commit: [SPARK-22997] Add additional defenses against use of freed MemoryBlocks

2018-01-10 Thread joshrosen
sed by blind writes to freed memory blocks. ## How was this patch tested? New unit tests in `PlatformSuite`, including new tests for existing functionality because we did not have sufficient mutation coverage of the on-heap memory allocator's pooling logic. Author: Josh Rosen Closes #201

spark git commit: [SPARK-21444] Be more defensive when removing broadcasts in MapOutputTracker

2017-07-17 Thread joshrosen
njection / network unreliability / fuzz testing tools. Author: Josh Rosen Closes #18662 from JoshRosen/SPARK-21444. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5952ad2b Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5952ad2

spark git commit: [SPARK-20715] Store MapStatuses only in MapOutputTracker, not ShuffleMapStage

2017-06-11 Thread joshrosen
and kayousterhout and markhamstra (for scheduler changes). ## How was this patch tested? Existing tests. I purposely avoided making interface / API which would require significant updates or modifications to test code. Author: Josh Rosen Closes #17955 from JoshRosen/map-output-tracker-rewrite.

spark git commit: HOTFIX: fix Scalastyle break introduced in 4d57981cfb18e7500cde6c03ae46c7c9b697d064

2017-05-30 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master de953c214 -> 798a04fd7 HOTFIX: fix Scalastyle break introduced in 4d57981cfb18e7500cde6c03ae46c7c9b697d064 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/798a04fd Tre

spark git commit: [SPARK-20102] Fix nightly packaging and RC packaging scripts w/ two minor build fixes

2017-03-27 Thread joshrosen
setup script. ## How was this patch tested? The LFTP fix was tested by manually running the failing commands on AMPLab Jenkins against the ASF SFTP server. The PySpark fix was tested locally. Author: Josh Rosen Closes #17437 from JoshRosen/spark-20102. (cherry picke

spark git commit: [SPARK-20102] Fix nightly packaging and RC packaging scripts w/ two minor build fixes

2017-03-27 Thread joshrosen
setup script. ## How was this patch tested? The LFTP fix was tested by manually running the failing commands on AMPLab Jenkins against the ASF SFTP server. The PySpark fix was tested locally. Author: Josh Rosen Closes #17437 from JoshRosen/spark-20102. Project: http://git-wip-us.apache.org/re

spark git commit: [SPARK-19529][BRANCH-1.6] Backport PR #16866 to branch-1.6

2017-02-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 e78138a43 -> a50ef3d9a [SPARK-19529][BRANCH-1.6] Backport PR #16866 to branch-1.6 ## What changes were proposed in this pull request? This PR backports PR #16866 to branch-1.6 ## How was this patch tested? Existing tests. Author: Ch

spark git commit: [SPARK-18952][BACKPORT] Regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.1 80a3e13e5 -> 3b6ac323b [SPARK-18952][BACKPORT] Regex strings not properly escaped in codegen for aggregations ## What changes were proposed in this pull request? Backport for #16361 to 2.1 branch. ## How was this patch tested? Unit

spark git commit: [SPARK-18952] Regex strings not properly escaped in codegen for aggregations

2017-01-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 15c2bd01b -> faabe69cc [SPARK-18952] Regex strings not properly escaped in codegen for aggregations ## What changes were proposed in this pull request? If I use the function regexp_extract, and then in my regex string, use `\`, i.e. escap

spark git commit: [SPARK-18761][CORE] Introduce "task reaper" to oversee task killing in executors

2016-12-20 Thread joshrosen
tasks. This feature is flagged off by default and is controlled by four new configurations under the `spark.task.reaper.*` namespace. See the updated `configuration.md` doc for details. ## How was this patch tested? Tested via a new test case in `JobCancellationSuite`, plus manual testin

spark git commit: [SPARK-18553][CORE][BRANCH-1.6] Fix leak of TaskSetManager following executor loss

2016-12-01 Thread joshrosen
t and markhamstra, who reviewed #15986. Author: Josh Rosen Closes #16070 from JoshRosen/fix-leak-following-total-executor-loss-1.6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8f25cb26 Tree: http://git-wip-us.apache.org

spark git commit: [SPARK-18553][CORE] Fix leak of TaskSetManager following executor loss

2016-11-29 Thread joshrosen
ark/scheduler/TaskSchedulerImpl.scala#L523) in `removeExecutor`, so I'd appreciate a very careful review of these changes. I added a new unit test to `TaskSchedulerImplSuite`. /cc kayousterhout and markhamstra, who reviewed #15986. Author: Josh Rosen Closes #16045 from JoshRosen/fix-leak

spark git commit: [SPARK-18553][CORE] Fix leak of TaskSetManager following executor loss

2016-11-29 Thread joshrosen
o reviewed #15986. Author: Josh Rosen Closes #16045 from JoshRosen/fix-leak-following-total-executor-loss-master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9a02f682 Tree: http://git-wip-us.apache.org/repos/asf/spark/t

spark git commit: [SPARK-18553][CORE][BRANCH-2.0] Fix leak of TaskSetManager following executor loss

2016-11-28 Thread joshrosen
r this fix is reviewed and merged). ## How was this patch tested? I added a new unit test to `TaskSchedulerImplSuite`. You can check out this PR as of 25e455e711b978cd331ee0f484f70fde31307634 to see the failing test. cc kayousterhout, markhamstra, rxin for review. Author: Josh Rosen Closes #15

spark git commit: [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed

2016-11-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.1 951579382 -> 6a3cbbc03 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed ## What changes were proposed in this pull request? This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the ja

spark git commit: [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed

2016-11-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master bb6cdfd9a -> a36a76ac4 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed ## What changes were proposed in this pull request? This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the jars o

spark git commit: [SPARK-18418] Fix flags for make_binary_release for hadoop profile

2016-11-12 Thread joshrosen
lly tested as part of https://github.com/apache/spark/pull/15659 by having the build succeed. cc joshrosen Author: Holden Karau Closes #15860 from holdenk/minor-fix-release-build-script. (cherry picked from commit 1386fd28daf798bf152606f4da30a36223d75d18) Signed-off-by: Josh Rosen Project: h

spark git commit: [SPARK-18418] Fix flags for make_binary_release for hadoop profile

2016-11-12 Thread joshrosen
ted as part of https://github.com/apache/spark/pull/15659 by having the build succeed. cc joshrosen Author: Holden Karau Closes #15860 from holdenk/minor-fix-release-build-script. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-18236] Reduce duplicate objects in Spark UI and HistoryServer

2016-11-07 Thread joshrosen
ge](https://cloud.githubusercontent.com/assets/50748/19953290/6a271290-a129-11e6-93ad-b825f1448886.png) Author: Josh Rosen Closes #15743 from JoshRosen/spark-ui-memory-usage. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3a7

spark git commit: [SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness

2016-10-21 Thread joshrosen
com/typesafehub/migration-manager/issues/115). Author: Josh Rosen Closes #15571 from JoshRosen/SPARK-18034. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a65d40ab Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a65d4

spark git commit: [SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness

2016-10-21 Thread joshrosen
com/typesafehub/migration-manager/issues/115). Author: Josh Rosen Closes #15571 from JoshRosen/SPARK-18034. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b3b4b954 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b3b4b

spark git commit: [SPARK-17803][TESTS] Upgrade docker-client dependency

2016-10-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 a2bf09588 -> e355ca8e8 [SPARK-17803][TESTS] Upgrade docker-client dependency [SPARK-17803: Docker integration tests don't run with "Docker for Mac"](https://issues.apache.org/jira/browse/SPARK-17803) ## What changes were proposed in t

spark git commit: [SPARK-17803][TESTS] Upgrade docker-client dependency

2016-10-06 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 9a48e60e6 -> 49d11d499 [SPARK-17803][TESTS] Upgrade docker-client dependency [SPARK-17803: Docker integration tests don't run with "Docker for Mac"](https://issues.apache.org/jira/browse/SPARK-17803) ## What changes were proposed in this

spark git commit: [SPARK-17712][SQL] Fix invalid pushdown of data-independent filters beneath aggregates

2016-09-29 Thread joshrosen
n't reference any columns. ## How was this patch tested? New regression test in FilterPushdownSuite. Author: Josh Rosen Closes #15289 from JoshRosen/SPARK-17712. (cherry picked from commit 37eb9184f1e9f1c07142c66936671f4711ef407d) Signed-off-by: Josh Rosen Project: http://git-wip-us.ap

spark git commit: [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition.

2016-09-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 ca8130050 -> 7ffafa3bf [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition. ## What changes were proposed in this pull request? Currently our Optimizer may reorder the

spark git commit: [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

2016-09-27 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 98bbc4410 -> 2cd327ef5 [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore ## What changes were proposed in this pull request? There is an assert in MemoryStore's putIteratorAsValues method which is used to c

spark git commit: [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore

2016-09-27 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2f84a6866 -> e7bce9e18 [SPARK-17056][CORE] Fix a wrong assert regarding unroll memory in MemoryStore ## What changes were proposed in this pull request? There is an assert in MemoryStore's putIteratorAsValues method which is used to check

spark git commit: [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

2016-09-27 Thread joshrosen
throw an IllegalArgumentException if it is called with an object that is not an `UnsafeRow`. Author: Josh Rosen Closes #15265 from JoshRosen/SPARK-17618-master. (cherry picked from commit 2f84a686604b298537bfd4d087b41594d2aa7ec6) Signed-off-by: Josh Rosen Project: http://git-wip-us.apache.org/repos/asf/spark/rep

spark git commit: [SPARK-17618] Guard against invalid comparisons between UnsafeRow and other formats

2016-09-27 Thread joshrosen
ption if it is called with an object that is not an `UnsafeRow`. Author: Josh Rosen Closes #15265 from JoshRosen/SPARK-17618-master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2f84a686 Tree: http://git-wip-us.apache.o

spark git commit: [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames

2016-09-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 f14f47f07 -> 243bdb11d [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames Consider you have a bucket as `s3a://some-bucket` and under it you have files: ``` s3a://some-bucket/file1.parquet s3a://some-bucket/file

spark git commit: [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames

2016-09-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 9f24a17c5 -> 85d609cf2 [SPARK-17613] S3A base paths with no '/' at the end return empty DataFrames ## What changes were proposed in this pull request? Consider you have a bucket as `s3a://some-bucket` and under it you have files: ``` s3a:/

spark git commit: [SPARK-17485] Prevent failed remote reads of cached blocks from failing entire job (branch-1.6 backport)

2016-09-22 Thread joshrosen
se `None` branches are already exercised because the old `getRemoteBytes` returned `None` when no remote locations for the block could be found (which could occur if an executor died and its block manager de-registered with the master). Author: Josh Rosen Closes #15186 from JoshRosen/SPARK-1748

spark git commit: [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published

2016-09-21 Thread joshrosen
167 from JoshRosen/stop-publishing-kinesis-assembly. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce0a222f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ce0a222f Diff: http://git-wip-us.apache.org/repos/asf/spark/d

spark git commit: [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published

2016-09-21 Thread joshrosen
167 from JoshRosen/stop-publishing-kinesis-assembly. (cherry picked from commit d7ee12211a99efae6f7395e47089236838461d61) Signed-off-by: Josh Rosen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd0bd89d Tree: http://git-

spark git commit: [SPARK-17418] Prevent kinesis-asl-assembly artifacts from being published

2016-09-21 Thread joshrosen
rom JoshRosen/stop-publishing-kinesis-assembly. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d7ee1221 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d7ee1221 Diff: http://git-wip-us.apache.org/repos/asf/spark/d

spark git commit: [SPARK-17160] Properly escape field names in code-generated error messages

2016-09-19 Thread joshrosen
ror message string literals in generated Java code, leading to compilation errors. This patch addresses these issues by using `addReferenceObj` to store the error messages as string fields rather than inline string constants. Author: Josh Rosen Closes #15156 from JoshRosen/SPARK-17160. (che

spark git commit: [SPARK-17160] Properly escape field names in code-generated error messages

2016-09-19 Thread joshrosen
ror message string literals in generated Java code, leading to compilation errors. This patch addresses these issues by using `addReferenceObj` to store the error messages as string fields rather than inline string constants. Author: Josh Rosen Closes #15156 from JoshRosen/SPARK-17160. Proj

spark git commit: [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars.

2016-09-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 c4660d607 -> f56035ba6 [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars. ## What changes were proposed in this pull request? Docker tests are using older version of jersey jars (1.19), which

spark git commit: [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars.

2016-09-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d720a4019 -> cdea1d134 [SPARK-17473][SQL] fixing docker integration tests error due to different versions of jars. ## What changes were proposed in this pull request? Docker tests are using older version of jersey jars (1.19), which was

spark git commit: [SPARK-17491] Close serialization stream to fix wrong answer bug in putIteratorAsBytes()

2016-09-17 Thread joshrosen
with `zip` but didn't first check that the lengths of the two collections were equal, causing missing records to go unnoticed. The updated test case reproduced this bug. In addition, I added a new `PartiallySerializedBlockSuite` to unit test that component. Author: Josh Rosen Closes #15043

spark git commit: [SPARK-17491] Close serialization stream to fix wrong answer bug in putIteratorAsBytes()

2016-09-17 Thread joshrosen
with `zip` but didn't first check that the lengths of the two collections were equal, causing missing records to go unnoticed. The updated test case reproduced this bug. In addition, I added a new `PartiallySerializedBlockSuite` to unit test that component. Author: Josh Rosen Closes

spark git commit: [SPARK-17484] Prevent invalid block locations from being reported after put() exceptions

2016-09-15 Thread joshrosen
this patch tested? Two new regression tests in BlockManagerSuite. Author: Josh Rosen Closes #15085 from JoshRosen/SPARK-17484. (cherry picked from commit 1202075c95eabba0ffebc170077df798f271a139) Signed-off-by: Josh Rosen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Comm

spark git commit: [SPARK-17484] Prevent invalid block locations from being reported after put() exceptions

2016-09-15 Thread joshrosen
this patch tested? Two new regression tests in BlockManagerSuite. Author: Josh Rosen Closes #15085 from JoshRosen/SPARK-17484. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1202075c Tree: http://git-wip-us.apache.org/re

spark git commit: [SPARK-17483] Refactoring in BlockManager status reporting and block removal

2016-09-15 Thread joshrosen
them out here into their own separate PR in order to make them easier to review and so that the behavior-changing parts of my other patch can be isolated to their own PR. Author: Josh Rosen Closes #15036 from JoshRosen/cache-failure-race-conditions-refactorings-only. (cherry picked fro

spark git commit: [SPARK-17547] Ensure temp shuffle data file is cleaned up after error

2016-09-15 Thread joshrosen
letion of the temp file. This patch avoids this potential cause of disk-space leaks by adding `finally` blocks to ensure that temp files are always deleted if they haven't been renamed. Author: Josh Rosen Closes #15104 from JoshRosen/cleanup-tmp-data-file-in-shuffle-writer. (cherry

spark git commit: [SPARK-17547] Ensure temp shuffle data file is cleaned up after error

2016-09-15 Thread joshrosen
letion of the temp file. This patch avoids this potential cause of disk-space leaks by adding `finally` blocks to ensure that temp files are always deleted if they haven't been renamed. Author: Josh Rosen Closes #15104 from JoshRosen/cleanup-tmp-data-file-in-shuffle-writer. Proje
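The cleanup pattern described in the SPARK-17547 message above can be illustrated with a minimal sketch. This is not the code from the patch: the class `TempFileCleanupSketch`, the method `writeThenRename`, and the `failBeforeRename` flag are hypothetical names chosen for illustration. The sketch only assumes the general idea stated in the message, namely that a `finally` block deletes the temp file whenever it has not been renamed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not taken from the Spark code base.
final class TempFileCleanupSketch {

    /**
     * Writes data to a temp file and then renames it to its final location.
     * The failBeforeRename flag simulates an error between the write and the rename.
     */
    static void writeThenRename(Path tmp, Path dest, byte[] data, boolean failBeforeRename)
            throws IOException {
        try {
            Files.write(tmp, data);
            if (failBeforeRename) {
                throw new IOException("simulated failure before rename");
            }
            Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // After a successful rename the temp path no longer exists, so this is a no-op.
            // After a failure it removes the leftover file that would otherwise leak disk space.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shuffle-sketch");
        Path tmp = dir.resolve("data.tmp");
        Path dest = dir.resolve("data.bin");
        try {
            writeThenRename(tmp, dest, "abc".getBytes(StandardCharsets.UTF_8), true);
        } catch (IOException expected) {
            // The simulated error still propagates to the caller.
        }
        System.out.println("temp file left behind: " + Files.exists(tmp));
    }
}
```

Putting the delete in `finally` rather than in a `catch` block means the cleanup runs for any kind of failure, including unchecked exceptions, while the original error is still reported to the caller.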

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 fffcec90b -> bb2bdb440 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak The expression like `if (memoryMap(taskAttemptId) == 0) memoryMap.remove(taskAttemptId)

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master dbfc7aa4d -> bb3229436 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak The expression like `if (memoryMap(taskAttemptId) == 0) memoryMap.remove(taskAttemptId)` in

spark git commit: [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 bf3f6d2f1 -> a447cd888 [SPARK-17465][SPARK CORE] Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak ## What changes were proposed in this pull request? The expression like `if (memoryMap(

spark git commit: [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely

2016-09-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ff6e4cbdc -> e33bfaed3 [SPARK-17463][CORE] Make CollectionAccumulator and SetAccumulator's value can be read thread-safely ## What changes were proposed in this pull request? Make CollectionAccumulator and SetAccumulator's value can be re
