[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226454499 --- Diff: core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226453812 --- Diff: docs/configuration.md --- @@ -266,6 +266,37 @@ of the most common options to set are: Only has effect in Spark standalone mode or Mesos

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226395358 --- Diff: core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226400385 --- Diff: docs/configuration.md --- @@ -266,6 +266,37 @@ of the most common options to set are: Only has effect in Spark standalone mode or Mesos

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226394446 --- Diff: core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226391941 --- Diff: core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226396957 --- Diff: core/src/test/scala/org/apache/spark/util/logging/DriverLoggerSuite.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226397391 --- Diff: core/src/test/scala/org/apache/spark/util/logging/DriverLoggerSuite.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22504: [SPARK-25118][Submit] Persist Driver Logs in Clie...

2018-10-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22504#discussion_r226395070 --- Diff: core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala --- @@ -0,0 +1,241 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22751: [SPARK-20327][yarn] Follow up: fix resource request test...

2018-10-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22751 merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #22751: [SPARK-20327][yarn] Follow up: fix resource reque...

2018-10-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22751#discussion_r225707336 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala --- @@ -372,6 +358,35 @@ class ClientSuite extends

[GitHub] spark issue #22705: [SPARK-25704][CORE] Allocate a bit less than Int.MaxValu...

2018-10-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22705 Jenkins, retest this please

[GitHub] spark pull request #22733: [SPARK-25738][SQL] Fix LOAD DATA INPATH for hdfs ...

2018-10-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22733#discussion_r225319798 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -383,7 +386,7 @@ case class LoadDataCommand

[GitHub] spark pull request #22733: [SPARK-25738][SQL] Fix LOAD DATA INPATH for hdfs ...

2018-10-15 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22733 [SPARK-25738][SQL] Fix LOAD DATA INPATH for hdfs port ## What changes were proposed in this pull request? LOAD DATA INPATH didn't work if the defaultFS included a port for hdfs

[GitHub] spark pull request #22705: [SPARK-25704][CORE][WIP] Allocate a bit less than...

2018-10-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22705#discussion_r225208317 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -195,7 +196,11 @@ object ChunkedByteBuffer { val is = new

[GitHub] spark pull request #22710: DO NOT MERGE

2018-10-12 Thread squito
GitHub user squito reopened a pull request: https://github.com/apache/spark/pull/22710 DO NOT MERGE just for testing You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark blah Alternatively you can review and apply

[GitHub] spark pull request #22710: DO NOT MERGE

2018-10-12 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/22710

[GitHub] spark pull request #22710: DO NOT MERGE

2018-10-12 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22710 DO NOT MERGE just for testing You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark blah Alternatively you can review and apply

[GitHub] spark pull request #22705: [SPARK-25704][CORE][WIP] Allocate a bit less than...

2018-10-11 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22705 [SPARK-25704][CORE][WIP] Allocate a bit less than Int.MaxValue JVMs don't let you allocate arrays of length exactly Int.MaxValue, so leave a little extra room. This is necessary when reading blocks

[GitHub] spark issue #22557: [SPARK-25535][core] Work around bad error handling in co...

2018-10-09 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22557 merged to master

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222403700 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222372817 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222402317 --- Diff: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala --- @@ -59,6 +60,43 @@ case object JVMOffHeapMemory extends

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222396576 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222389408 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222398145 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222398177 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222370714 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222370522 --- Diff: core/src/test/scala/org/apache/spark/executor/ProcfsBasedSystemsSuite.scala --- @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222395115 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222404587 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222398288 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222388835 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222370195 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-03 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222373667 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,268 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-10-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22612 ok to test

[GitHub] spark issue #22557: [SPARK-25535][core] Work around bad error handling in co...

2018-10-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22557 lgtm, will leave for a day before merging

[GitHub] spark issue #22604: [SPARK-25586][MLlib][Core] Replace toString method with ...

2018-10-01 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22604 Jenkins, ok to test

[GitHub] spark pull request #22557: [SPARK-25535][core] Work around bad error handlin...

2018-09-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22557#discussion_r221056816 --- Diff: core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala --- @@ -157,6 +165,111 @@ private[spark] object CryptoStreamUtils extends

[GitHub] spark pull request #22557: [SPARK-25535][core] Work around bad error handlin...

2018-09-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22557#discussion_r221058039 --- Diff: common/network-common/src/main/java/org/apache/spark/network/crypto/AuthEngine.java --- @@ -241,29 +249,52 @@ private SecretKeySpec generateKey

[GitHub] spark pull request #22557: [SPARK-25535][core] Work around bad error handlin...

2018-09-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22557#discussion_r221056128 --- Diff: core/src/test/scala/org/apache/spark/security/CryptoStreamUtilsSuite.scala --- @@ -164,6 +167,34 @@ class CryptoStreamUtilsSuite extends

[GitHub] spark issue #22546: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-25 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22546 this is the same as https://github.com/apache/spark/pull/22511 (more discussion there) just opened against master. I opened it against 2.4 first for testing

[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-25 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22511 > This PR is directly heading to branch-2.4 by bypassing master branch Yes, good point, sorry, I opened this against 2.4 just for testing in case the errors were more likely in 2.4 for s

[GitHub] spark pull request #22511: [SPARK-25422][CORE] Don't memory map blocks strea...

2018-09-25 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/22511

[GitHub] spark pull request #22546: [SPARK-25422][CORE] Don't memory map blocks strea...

2018-09-25 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22546 [SPARK-25422][CORE] Don't memory map blocks streamed to disk. After data has been streamed to disk, the buffers are inserted into the memory store in some cases (e.g., with broadcast blocks

[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-25 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22511 > The analysis makes sense to me. The thing I'm not sure is, how can we hit it? The "fetch block to temp file" code path is only enabled for big blocks (> 2GB). The fai

[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22511 > this seems like a big change, will we hit perf regression? Not vs. 2.3. It only affects things when stream-to-disk is enabled, and when it is enabled, for reading remote cached blo

[GitHub] spark pull request #22511: [SPARK-25422][CORE] Don't memory map blocks strea...

2018-09-20 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22511 [SPARK-25422][CORE] Don't memory map blocks streamed to disk. After data has been streamed to disk, the buffers are inserted into the memory store in some cases (e.g., with broadcast blocks

[GitHub] spark pull request #22496: [SPARK-25422] DO NOT MERGE

2018-09-20 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/22496

[GitHub] spark pull request #22460: DO NOT MERGE

2018-09-20 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/22460

[GitHub] spark pull request #22496: [SPARK-25422] DO NOT MERGE

2018-09-20 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22496 [SPARK-25422] DO NOT MERGE You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark SPARK-25422 Alternatively you can review and apply

[GitHub] spark issue #22483: [MINOR][PYTHON] Use a helper in `PythonUtils` instead of...

2018-09-20 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22483 lgtm

[GitHub] spark issue #22460: DO NOT MERGE

2018-09-19 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22460 > To speed things up might be worth it to hack run-test.py to only run core tests (or DistributedSuite). when running it manually on the amplab workers, it seems to be way more common i

[GitHub] spark issue #21451: [SPARK-24296][CORE] Replicate large blocks as a stream.

2018-09-19 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21451 still looking -- will put comments on the jira so it's more visible

[GitHub] spark pull request #22460: DO NOT MERGE

2018-09-18 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22460 DO NOT MERGE You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark debugging Alternatively you can review and apply these changes

[GitHub] spark issue #22452: [SPARK-25456][SQL][TEST] Fix PythonForeachWriterSuite

2018-09-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22452 merged to master & 2.4

[GitHub] spark issue #22452: [SPARK-25456][SQL][TEST] Fix PythonForeachWriterSuite

2018-09-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22452 Jenkins, retest this please

[GitHub] spark pull request #22452: [SPARK-25456][SQL][TEST] Fix PythonForeachWriterS...

2018-09-18 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22452#discussion_r218496943 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/python/PythonForeachWriterSuite.scala --- @@ -75,15 +78,20 @@ class

[GitHub] spark issue #22444: [SPARK-25409][Core]Speed up Spark History loading via in...

2018-09-18 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22444 > history server startup needs to go through all these logs before being usable, so any server restart results in hours of downtime, just from scanning. I don't think this is true. The fi

[GitHub] spark pull request #22452: [SPARK-25456][SQL][TEST] Fix PythonForeachWriterS...

2018-09-18 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22452 [SPARK-25456][SQL][TEST] Fix PythonForeachWriterSuite PythonForeachWriterSuite was failing because RowQueue now needs to have a handle on a SparkEnv with a SerializerManager, so added a mock env

[GitHub] spark pull request #22444: [SPARK-25409][Core]Speed up Spark History loading...

2018-09-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22444#discussion_r218279175 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -465,20 +475,31 @@ private[history] class FsHistoryProvider

[GitHub] spark issue #21451: [SPARK-24296][CORE] Replicate large blocks as a stream.

2018-09-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21451 looking. so far seems unrelated to me, but as you've said it's failed in a few builds so I'm gonna keep digging. The error is occurring before any rdds are getting replicated via the new code path

[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

2018-09-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22381#discussion_r217510789 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala --- @@ -503,9 +503,12 @@ private[spark] object AppStatusStore

[GitHub] spark pull request #22404: DO NOT MERGE

2018-09-13 Thread squito
Github user squito closed the pull request at: https://github.com/apache/spark/pull/22404

[GitHub] spark issue #22404: DO NOT MERGE

2018-09-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22404 thanks Sean, all good points, just updated.

[GitHub] spark pull request #22404: DO NOT MERGE

2018-09-12 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22404 DO NOT MERGE just for testing You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark assorted_2.3_fixes Alternatively you can review

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-09-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22288 cc @jiangxb1987 @attilapiros also for thoughts

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-09-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22288 > I'm quite worried about this killing behaviour. I think we should kill an executor iff it is idle. yes, you have a good point. So the two extremes we need to consider are: 1)

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-09-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22288 > If it takes more time to acquire a new executor after killing a blacklisted one and the abort timer is up, we end up aborting the TaskSet. This was to see if we want to account for the t

[GitHub] spark issue #22288: [SPARK-22148][Scheduler] Acquire new executors to avoid ...

2018-09-11 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22288 Ok I looked at jiras, and this looks like it also covers SPARK-15815, right? you could add that to the summary too. You mention some future improvements: > Taking into account sta

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216724373 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216725731 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216723175 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216726755 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -623,8 +623,9 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216726323 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22385: [SPARK-25400][CORE][TEST] Increase test timeouts

2018-09-11 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22385 @dongjoon-hyun sure thing, done

[GitHub] spark pull request #22385: [SPARK-25400][CORE] Increase test timeouts

2018-09-10 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22385 [SPARK-25400][CORE] Increase test timeouts We've seen some flakiness in jenkins which looks like it just needs a longer timeout. You can merge this pull request into a Git repository

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-10 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22192 retest this please. I took a look at the failures, pretty certain it's unrelated, and I filed https://issues.apache.org/jira/browse/SPARK-25400 to increase the timeouts in one of those

[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

2018-09-10 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22371#discussion_r216387272 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -138,13 +154,22 @@ private[spark] class

[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

2018-09-10 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22371#discussion_r216369925 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -138,13 +154,22 @@ private[spark] class

[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-09-06 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r215659440 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1349,6 +1339,29 @@ class DAGScheduler( s"l

[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-09-05 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22209 any reason not to merge to 2.3? it's a bug in 2.3

[GitHub] spark issue #22265: [SPARK-25253][PYSPARK][FOLLOWUP] Undefined name: from py...

2018-08-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22265 ah, of course, `_exception_message` is only used in the exception handling, so we never get an error about an undefined name in any of the tests. Ok, thanks for the explanations, I appreciate

[GitHub] spark issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate ...

2018-08-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22273 I'm kinda 'meh' / -0 on this change. My issue wasn't so much not seeing something printed out, it was more (a) python output isn't integrated into jenkins test reports and (b) I'm still learning my

[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-08-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r213835409 --- Diff: python/pyspark/context.py --- @@ -494,10 +494,14 @@ def f(split, iterator): c = list(c)# Make it a list so we can compute its

[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-08-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r213815841 --- Diff: python/pyspark/context.py --- @@ -494,10 +494,14 @@ def f(split, iterator): c = list(c)# Make it a list so we can compute its

[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-08-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r213785551 --- Diff: python/pyspark/context.py --- @@ -494,10 +494,14 @@ def f(split, iterator): c = list(c)# Make it a list so we can compute its

[GitHub] spark issue #22265: [SPARK-25253][PYSPARK][FOLLOWUP] Undefined name: from py...

2018-08-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22265 thanks, lgtm btw @HyukjinKwon I had expected jenkins testing to catch this ... but are there no tests run on python 3? is python 3 testing still manual from the contributor & revi

[GitHub] spark pull request #22247: [SPARK-25253][PYSPARK] Refactor local connection ...

2018-08-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22247#discussion_r213469645 --- Diff: python/pyspark/java_gateway.py --- @@ -147,6 +147,39 @@ def do_server_auth(conn, auth_secret): raise Exception("Unexpected reply

[GitHub] spark pull request #22247: [SPARK-25253][PYSPARK] Refactor local connection ...

2018-08-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22247#discussion_r213398017 --- Diff: python/pyspark/taskcontext.py --- @@ -108,38 +108,12 @@ def _load_from_socket(port, auth_secret): """ Load da

[GitHub] spark pull request #22247: [SPARK-25253][PYSPARK] Refactor local connection ...

2018-08-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22247#discussion_r213324531 --- Diff: python/pyspark/java_gateway.py --- @@ -147,6 +147,39 @@ def do_server_auth(conn, auth_secret): raise Exception("Unexpected reply

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213319641 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1865,6 +1871,62 @@ abstract class RDD[T: ClassTag]( // RDD chain

[GitHub] spark issue #22247: [SPARK-25253][PYSPARK] Refactor local connection & auth ...

2018-08-27 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22247 @jiangxb1987 @HyukjinKwon @mengxr

[GitHub] spark pull request #22247: [SPARK-25253][PYSPARK] Refactor local connection ...

2018-08-27 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/22247 [SPARK-25253][PYSPARK] Refactor local connection & auth code This eliminates some duplication in the code to connect to a server on localhost to talk directly to the jvm. Also it gives consis

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r213060868 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +188,73 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r213043068 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +188,73 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r213032992 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +188,73 @@ private[spark] abstract class BasePythonRunner

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213009399 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -33,6 +33,9 @@ import org.apache.spark.util.random.SamplingUtils

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213017779 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1865,6 +1871,62 @@ abstract class RDD[T: ClassTag]( // RDD chain

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-27 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213010846 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1918,3 +1980,19 @@ object RDD { new DoubleRDDFunctions(rdd.map(x
