[GitHub] spark pull request: [HOTFIX] Some clean-up in shuffle code.

2015-03-31 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/5286 [HOTFIX] Some clean-up in shuffle code. Before diving into reviewing #4450, I took a look through the existing shuffle code to learn how it works. Unfortunately, there are some very confusing

[GitHub] spark pull request: SPARK-6433 hive tests to import spark-sql test...

2015-03-31 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5119#discussion_r27537871 --- Diff: pom.xml --- @@ -1472,6 +1473,25 @@ <groupId>org.scalatest</groupId> <artifactId>scalatest-maven-plugin</artifactId>

[GitHub] spark pull request: SPARK-6433 hive tests to import spark-sql test...

2015-03-31 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5119#issuecomment-88300495 The overall approach LGTM, but I would suggest adding a better comment since it's non-obvious what is going on. --- If your project is set up for it, you can reply

[GitHub] spark pull request: [SPARK-6627] Some clean-up in shuffle code.

2015-03-31 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5286#discussion_r27496946 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockManager.scala --- @@ -39,25 +41,18 @@ import org.apache.spark.storage._ // Note

[GitHub] spark pull request: [SPARK-6627] Some clean-up in shuffle code.

2015-03-31 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5286#issuecomment-88186520 Jenkins, retest this please.

[GitHub] spark pull request: SPARK-4550. In sort-based shuffle, store map o...

2015-03-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4450#discussion_r27423831 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ChainedBuffer.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-4550. In sort-based shuffle, store map o...

2015-03-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4450#discussion_r27424659 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ChainedBuffer.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [HOTFIX][SPARK-4123]: Updated to fix bug where...

2015-03-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5269#issuecomment-87804589 Thanks - pulling this in.

[GitHub] spark pull request: SPARK-4550. In sort-based shuffle, store map o...

2015-03-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4450#discussion_r27424016 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ChainedBuffer.scala --- @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-6544][build] Increment Avro version fro...

2015-03-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5193#issuecomment-87030538 It's a fair point - in some cases we have made minor (patch) release updates in our own patch versions. @srowen what do you think? It can cause some friction

[GitHub] spark pull request: [SPARK-6544][build] Increment Avro version fro...

2015-03-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5193#issuecomment-87062417 Okay after reviewing the changes to Avro I've decided to backport this. I looked a bit more and in other cases we have done fairly small dependency updates like

[GitHub] spark pull request: [SPARK-6119][SQL] DataFrame.dropna

2015-03-27 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5225#discussion_r27328262 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala --- @@ -730,6 +731,23 @@ class DataFrame private[sql]( Generate(generator

[GitHub] spark pull request: [spark] [SPARK-6168] Expose some of the collec...

2015-03-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5084#issuecomment-86835246 Yeah I agree with Sean and what pretty much everyone else said. My feeling is that we don't want typical Spark users to be relying on unstable APIs, else

[GitHub] spark pull request: [SPARK-6405] Limiting the maximum Kryo buffer ...

2015-03-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5218#issuecomment-86834844 LGTM - merged into master.

[GitHub] spark pull request: SPARK-6549 - Spark console logger logs to stde...

2015-03-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5202#issuecomment-86810509 Changing the default logging behavior like this would break compatibility for people who rely on the out-of-the-box behavior. IIRC this change was proposed

[GitHub] spark pull request: [SPARK-6544][build] Increment Avro version fro...

2015-03-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5193#issuecomment-86815673 Yeah LGTM - I think bumping maintenance releases like this has usually been fine for us with Avro. It may be best not to merge this into 1.3 though, since we typically

[GitHub] spark pull request: [SPARK-6554] [SQL] Don't push down predicates ...

2015-03-26 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5210#discussion_r27252832 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala --- @@ -435,11 +435,18 @@ private[sql] case class ParquetRelation2

[GitHub] spark pull request: [SPARK-6477][Build]: Run MIMA tests before the...

2015-03-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5145#issuecomment-85371214 LGTM too

[GitHub] spark pull request: [SPARK-4123][Project Infra][WIP]: Show new dep...

2015-03-23 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5093#discussion_r26996263 --- Diff: dev/tests/pr_new_dependencies.sh --- @@ -0,0 +1,85 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [spark] [SPARK-6168] Expose some of the collec...

2015-03-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5084#issuecomment-84447923 I agree with Sean and Sandy here. I don't think we should just expose internal utilities like this. The collections classes are simple enough that someone can just copy

[GitHub] spark pull request: [SPARK-6406] Launcher backward compatibility i...

2015-03-21 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5085#issuecomment-84443666 Okay - if there is no big performance hit in launching from 1.3-1.4, that's fine with me. I guess then the remaining question is whether to make a smaller

[GitHub] spark pull request: [SPARK-6406] Launcher backward compatibility i...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5085#issuecomment-84216181 @nishkamravi2 - It's about the startup time of the interactive shell. That's the usability trade-off. If the shell takes 300 extra ms to start up, it is not acceptable

[GitHub] spark pull request: SPARK-5134 [BUILD] Bump default Hadoop version...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5027#issuecomment-84201074 Looks good - thanks for committing this, Sean.

[GitHub] spark pull request: [SPARK-6406] Launcher backward compatibility i...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5085#issuecomment-84205836 Hey @vanzin - can you explain a bit why the performance impact isn't worrisome? I think a first-order goal was to avoid invoking multiple JVMs that have the assembly

[GitHub] spark pull request: [SPARK-6122][Core] Upgrade Tachyon client vers...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4867#issuecomment-84200925 LGTM

[GitHub] spark pull request: [SPARK-6406] Launcher backward compatibility i...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5085#issuecomment-84214306 What is the intuitive explanation for why it is faster than before? It seems like launching the assembly twice would be slower. Separately, is it worth doing

[GitHub] spark pull request: [SPARK-5847] Allow for namespacing metrics by ...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4632#issuecomment-84209577 Hey @JoshRosen can probably take a look at this. One thing though, IIRC the reason why we have a unique ID here is because some sets of users requested the exact

[GitHub] spark pull request: [SPARK-4123][Project Infra][WIP]: Show new dep...

2015-03-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/5093#discussion_r26882829 --- Diff: dev/tests/pr_new_dependencies.sh --- @@ -0,0 +1,77 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-6371] [build] Update version to 1.4.0-S...

2015-03-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/5056#issuecomment-84085364 LGTM - I did some searching around and I think this upgrades all the necessary places.

[GitHub] spark pull request: [SPARK-6275][Documentation]Miss toDF() functio...

2015-03-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4977#issuecomment-81428099 I've pushed this change to the website. http://spark.apache.org/docs/latest/sql-programming-guide.html

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-78219718 Okay cool - LGTM I will pull this in. I just did some local sanity tests, built with Maven and ran (a few) Maven tests. We'll need to keep an eye on the Maven build

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-10 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-78003405 @andrewor14 are you good with this one? I'd like to merge it soon!

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-77929101 Hey Marcelo, I think this is really close to being ready to merge. I noticed some extra output (a '0') when running usage on `spark-shell

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-09 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r26065985 --- Diff: launcher/src/main/java/org/apache/spark/launcher/Main.java --- @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-09 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r26067056 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-03-09 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r26067102 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-5134] Bump default hadoop.version to 2....

2015-03-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3917#issuecomment-5142 Hey I commented on the JIRA, but some recent changes in the way we publish artifacts actually make this a more tenable change. https://issues.apache.org

[GitHub] spark pull request: [SPARK-6200] [SQL] Add a manager for dialects

2015-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4939#issuecomment-77736075 It might be simpler to just accept a dialect via a configuration option. Is the idea here that someone might switch dialects multiple times in a single session

[GitHub] spark pull request: SPARK-6182 [BUILD] spark-parent pom needs to b...

2015-03-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4912#issuecomment-77429640 I tested this locally and it worked well. I'd like to pull this in to do further testing.

[GitHub] spark pull request: SPARK-6182 [BUILD] spark-parent pom needs to b...

2015-03-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4913#issuecomment-77429517 I don't think this works in its current form because the artifact names are laid out in the parent pom and those can still only have a single version. The only way I

[GitHub] spark pull request: [SPARK-6149] [SQL] [Build] Excludes Guava 15 r...

2015-03-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4890#issuecomment-77203459 @liancheng can you put this exclusion in the pom.xml file instead of in sbt? If you look there we already have other exclusions.

[GitHub] spark pull request: SPARK-5143 [BUILD] [WIP] spark-network-yarn 2....

2015-03-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4876#issuecomment-77246969 I commented on the JIRA. This LGTM as an immediate fix - clearly the property is not correct in the published 2.11 poms.

[GitHub] spark pull request: [SPARK-6149] [SQL] [Build] Excludes Guava 15 r...

2015-03-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4890#issuecomment-77306302 I'm going to optimistically merge this to cut another RC.

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25647202 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -184,7 +184,7 @@ private[spark] class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25646752 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -184,7 +184,7 @@ private[spark] class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: SPARK-5390 [DOCS] Encourage users to post on S...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4843#discussion_r25646936 --- Diff: docs/index.md --- @@ -115,6 +115,8 @@ options for deployment: * [Spark Homepage](http://spark.apache.org) * [Spark Wiki](https

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25648384 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -574,6 +583,11 @@ private[spark] object JsonProtocol

[GitHub] spark pull request: SPARK-5390 [DOCS] Encourage users to post on S...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4843#discussion_r25646565 --- Diff: docs/index.md --- @@ -115,6 +115,8 @@ options for deployment: * [Spark Homepage](http://spark.apache.org) * [Spark Wiki](https

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-02 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4821#issuecomment-76855580 Okay I took a close look through this and it LGTM

[GitHub] spark pull request: [SPARK-6048] SparkConf should not translate de...

2015-03-02 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4799#issuecomment-76862763 Okay - thanks everyone for helping with this. I'll pull it in.

[GitHub] spark pull request: [SPARK-6048] SparkConf should not translate de...

2015-03-02 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4799#discussion_r25582753 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -92,6 +92,16 @@ private[spark] class Executor( private val executorActor

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-02 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4821#issuecomment-76782633 @vanzin so just to be clear - you are anticipating a future case where we need to read the version to correctly parse the logs? Is that the argument for it? I am only

[GitHub] spark pull request: [SPARK-6048] SparkConf should not translate de...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4799#discussion_r25573606 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -92,6 +92,16 @@ private[spark] class Executor( private val executorActor

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25573801 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -217,57 +223,67 @@ private[spark] object EventLoggingListener

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25573931 --- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala --- @@ -23,7 +23,9 @@ private[spark] class ApplicationDescription

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25573776 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -217,57 +223,67 @@ private[spark] object EventLoggingListener

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25574579 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1074,7 +1074,7 @@ private[spark] class BlockManager( * Remove all

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25574126 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -99,6 +99,12 @@ case class SparkListenerExecutorRemoved(time: Long

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4821#issuecomment-76641955 I took a pass on this with some feedback. Overall, it would be good to really minimize the scope of the changes since this is so late in the game. There is some clean

[GitHub] spark pull request: [SPARK-6050] [yarn] Add config option to do la...

2015-03-01 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4818#issuecomment-76659139 @vanzin okay so maybe set this to true then? I don't have any opinion, but would love to get this in as it's one of the only release blockers.

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25579888 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -322,28 +322,22 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25579899 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -322,28 +322,22 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-6066] Make event log format easier to p...

2015-03-01 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4821#discussion_r25579955 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -322,28 +322,22 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-6074] [sql] Package pyspark sql binding...

2015-02-28 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4822#issuecomment-76563416 Good catch @vanzin. This LGTM. I did some testing to verify that the assembly includes all relevant python files now: ``` $ jar -tf assembly/target/scala

[GitHub] spark pull request: [SPARK-6048] SparkConf should not translate de...

2015-02-28 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4799#issuecomment-76564115 I agree @andrewor14 can you add documentation about deprecated configs? I would extend what's there now: ``` Properties set directly

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-02-28 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25563232 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -188,10 +188,10 @@ private[spark] class ContextCleaner(sc: SparkContext) extends

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-02-28 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4838#issuecomment-76563715 Chimed in at a few places. I think overall, a good goal is that when we are doing our normal GC of RDDs and broadcasts, we don't want to be so verbose. This cleaning

[GitHub] spark pull request: [SPARK-6055] [PySpark] fix incorrect DataType....

2015-02-28 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4809#issuecomment-76564137 @davies can you close this? (auto close doesn't work for the backport commits).

[GitHub] spark pull request: [SPARK-6055] [PySpark] fix incorrect DataType....

2015-02-28 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4810#issuecomment-76564149 @davies can you close this? (auto close doesn't work for the backport commits).

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-02-28 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25563229 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -371,7 +371,7 @@ private[spark] class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-02-28 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25563255 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1074,7 +1074,7 @@ private[spark] class BlockManager( * Remove all

[GitHub] spark pull request: SPARK-3357 [CORE] Internal log messages should...

2015-02-28 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4838#discussion_r25563263 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -476,16 +476,16 @@ private[spark] class BlockManagerInfo

[GitHub] spark pull request: [SPARK-6070] [yarn] Remove unneeded classes fr...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4820#issuecomment-76514114 Thanks Marcelo, pulling this in!

[GitHub] spark pull request: [SPARK-5979][SPARK-6032] Smaller safer --packa...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4802#issuecomment-76514538 Pulling this in - thanks Burak!

[GitHub] spark pull request: [SPARK-5979][SPARK-6031][SPARK-6032][SPARK-604...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4754#issuecomment-76514573 @brkyvz let's close this issue for now and keep it in our back pocket. We can use it if we decide to put this in the 1.3 branch down the line.

[GitHub] spark pull request: SPARK-6063

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4815#issuecomment-76450569 LGTM

[GitHub] spark pull request: [SPARK-6070] [yarn] Remove unneeded classes fr...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4820#issuecomment-76492395 Thanks for finding this! Either approach seems fine to me. This LGTM... it might be nice to add a comment explaining that this is needed because the root pom

[GitHub] spark pull request: [SPARK-6070] [yarn] Remove unneeded classes fr...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4820#issuecomment-76498092 Cool LGTM

[GitHub] spark pull request: [SPARK-6050] [yarn] Add config option to do la...

2015-02-27 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4818#issuecomment-76502508 It seems reasonable to me to have the default of false and make a comment in the release notes. No strong feelings here though.

[GitHub] spark pull request: [SPARK-5979][SPARK-6032] Smaller safer fix

2015-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4802#issuecomment-76352327 LGTM

[GitHub] spark pull request: SPARK-4579 [WEBUI] Scheduling Delay appears ne...

2015-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4796#issuecomment-76301218 Jenkins, test this please.

[GitHub] spark pull request: SPARK-4579 [WEBUI] Scheduling Delay appears ne...

2015-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4796#issuecomment-76301244 Yeah this would be great to pull into an RC... a very confusing presentation in the UI right now!

[GitHub] spark pull request: [SPARK-6048] SparkConf should not translate de...

2015-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4799#issuecomment-76320547 @vanzin does this look okay to you? I commented on the JIRA, but my main goal is to find a surgical patch here that can unblock the release, so just rewinding

[GitHub] spark pull request: [SPARK-5979][SPARK-6031][SPARK-6032][SPARK-604...

2015-02-26 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4754#discussion_r25482866 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitDriverBootstrapper.scala --- @@ -82,13 +85,25 @@ private[spark] object

[GitHub] spark pull request: [SPARK-5982] Remove incorrect Local Read Time ...

2015-02-25 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4749#issuecomment-76054412 LGTM - thanks Kay!

[GitHub] spark pull request: [SPARK-5914] to run spark-submit requiring onl...

2015-02-24 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4742#discussion_r25237061 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -439,7 +439,14 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request: [SPARK-5914] to run spark-submit requiring onl...

2015-02-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4742#issuecomment-75720961 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-5979] Made --package exclusions more re...

2015-02-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4754#issuecomment-75911222 LGTM

[GitHub] spark pull request: [SPARK-5158] [core] [security] Spark standalon...

2015-02-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4106#issuecomment-75861282 The model I had in mind for this patch was to support dedicated clusters/appliances based on Spark where the Spark cluster itself is fully trusted and not multi-tenant

[GitHub] spark pull request: [SPARK-5993][Streaming][Build] Fix assembly ja...

2015-02-24 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4753#issuecomment-75895962 LGTM - I tested this approach locally and it worked.

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-23 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25148046 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilder.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-22 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25145503 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java --- @@ -0,0 +1,684 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-22 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-75498418 Hey @vanzin okay took another quite long look (sorry, was delayed this week due to strata) and I have embarrassingly few useful comments given how long I looked

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-22 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25147958 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java --- @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-22 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25145313 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilder.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-22 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3916#discussion_r25145264 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilder.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4286] Integrate external shuffle servic...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3861#issuecomment-74950927 We spoke a bit offline about this, but my feeling was that the best thing here might be to add a way to launch the shuffle service as a standalone application

[GitHub] spark pull request: SPARK-5425: Use synchronised methods in system...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4221#issuecomment-74952993 Yeah our auto-close doesn't work on PR's into release branches like this.

[GitHub] spark pull request: [SPARK-4808] Configurable spillable memory thr...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4420#issuecomment-74946838 @mccheah @mingyukim yeah, there isn't an OOM proof solution at all because these are all heuristics. Even checking every element is not OOM proof since memory

[GitHub] spark pull request: [SPARK-4924] Add a library for launching Spark...

2015-02-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3916#issuecomment-75008630 @vanzin about half way through reviewing... will pick up tomorrow
